...

Grouper diagnostics provides a URL on Grouper WS that reports the health of Grouper.  The checks can include available memory on the WS server, the connection to the Grouper Registry DB, whether sources can perform queries, and whether Grouper loader jobs are executing successfully.  If everything is OK, an HTTP 200 code is returned; otherwise a 500 is returned, along with a description of the issue.  The point is that web monitoring software such as Nagios, Big Brother, BMC, etc. can be pointed at this URL.
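As an illustration of how a monitoring check might consume this URL, here is a minimal Python sketch. It assumes only what is described above (200 means healthy, 500 means a check failed, and the response body describes the issue). The function names, the OK/CRITICAL/UNKNOWN labels, and the timeout are hypothetical, and the status URL must be supplied by the caller since it depends on the deployment.

```python
import urllib.error
import urllib.request


def classify_status(http_code: int) -> str:
    """Map the diagnostics HTTP code to a monitoring state (labels are illustrative)."""
    # Per the description above: 200 means every check passed, 500 means one failed.
    if http_code == 200:
        return "OK"
    if http_code == 500:
        return "CRITICAL"
    # Any other code (e.g. 404, or 503 from a proxy) is not defined by the diagnostics.
    return "UNKNOWN"


def check_grouper(status_url: str, timeout_seconds: float = 10.0):
    """Fetch the diagnostics URL and return (state, response body)."""
    try:
        with urllib.request.urlopen(status_url, timeout=timeout_seconds) as resp:
            return classify_status(resp.status), resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the body still carries the issue description.
        return classify_status(err.code), err.read().decode("utf-8", "replace")
    except (urllib.error.URLError, OSError) as err:
        # Server unreachable or timed out: there is no HTTP code at all.
        return "UNKNOWN", str(err)
```

A monitoring wrapper would call `check_grouper` with the diagnostics URL of the node being watched and alert on any state other than `OK`.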

...

Code Block
# whether to ignore individual tests.  Note: in job names, invalid chars (e.g. colon) need to be replaced with underscore
# an invalid char is anything matching this regex: [^a-zA-Z0-9._-]
ws.diagnostic.ignore.memoryTest = false
ws.diagnostic.ignore.dbTest_grouper = false
ws.diagnostic.ignore.source_jdbc = false
ws.diagnostic.ignore.loader_CHANGE_LOG_changeLogTempToChangeLog = false
ws.diagnostic.ignore.loader_MAINTENANCE__grouperReport = false

# number of minutes that can go by without a success before an error is thrown
ws.diagnostic.minutesSinceLastSuccess.loader_SQL_GROUP_LIST__aStem_aGroup2 = 60
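The job-name rule in the comments above can be sketched in a few lines of Python. This is only an illustration of the stated regex, not Grouper's own code: the helper name `loader_property_key` is hypothetical, and the colon in the sample job name is an assumption about what the original (unsanitized) name might look like.

```python
import re

# The character class quoted in the configuration comments above.
INVALID_CHARS = re.compile(r"[^a-zA-Z0-9._-]")


def loader_property_key(prefix: str, job_name: str) -> str:
    """Build a diagnostics property key for a loader job (hypothetical helper)."""
    # Every character outside [a-zA-Z0-9._-] becomes an underscore.
    sanitized = INVALID_CHARS.sub("_", job_name)
    return f"{prefix}.loader_{sanitized}"


# Assumed raw job name containing a colon, which is not allowed in the key.
key = loader_property_key(
    "ws.diagnostic.minutesSinceLastSuccess", "SQL_GROUP_LIST__aStem:aGroup2"
)
print(key)
```

Job names that contain only letters, digits, dots, underscores, and hyphens pass through unchanged.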

Trivial option

Use this option when checks run frequently.  In a cluster, you can run the trivial check against all nodes and a deeper check against one node only.

...

Here is an example of an error response:

Code Block
 HTTP Status 500 -

type Exception report

message

description The server encountered an internal error () that prevented it from fulfilling this request.

exception

java.lang.RuntimeException:
There was an error in the diagnostic task DiagnosticLoaderJobTest, Loader job CHANGE_LOG_changeLogTempToChangeLog

:Cant find a success since: 2010/05/17 01:38:50.000, expecting one in the last 30 minutes
	edu.internet2.middleware.grouper.ws.status.GrouperStatusServlet.doGet(GrouperStatusServlet.java:191)
	javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
	javax.servlet.http.HttpServlet.service(HttpServlet.java:717)

root cause

java.lang.RuntimeException: Cant find a success since: 2010/05/17 01:38:50.000, expecting one in the last 30 minutes
	edu.internet2.middleware.grouper.ws.status.DiagnosticLoaderJobTest.doTask(DiagnosticLoaderJobTest.java:103)
	edu.internet2.middleware.grouper.ws.status.DiagnosticTask.executeTask(DiagnosticTask.java:44)
	edu.internet2.middleware.grouper.ws.status.GrouperStatusServlet.doGet(GrouperStatusServlet.java:129)
	javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
	javax.servlet.http.HttpServlet.service(HttpServlet.java:717)

note The full stack trace of the root cause is available in the Apache Tomcat/6.0.20 logs.
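A monitoring script may want to report which diagnostic failed rather than the whole error page. The Python sketch below pulls the task name and, when present, the loader job name out of a message shaped like the example above. The pattern is inferred from that single example, so treat it as an assumption; the function name `failing_task` is hypothetical, and other diagnostic tasks may word their messages differently.

```python
import re

# Pattern inferred from the example message above; the loader job part is optional.
TASK_PATTERN = re.compile(
    r"error in the diagnostic task (\w+)(?:, Loader job (\S+))?"
)


def failing_task(body: str):
    """Return (task, loader_job) from an error body, or None if nothing matches."""
    match = TASK_PATTERN.search(body)
    if match is None:
        return None
    # group(2) is None when the message does not mention a loader job.
    return match.group(1), match.group(2)


sample = (
    "There was an error in the diagnostic task DiagnosticLoaderJobTest, "
    "Loader job CHANGE_LOG_changeLogTempToChangeLog\n"
    "\n"
    ":Cant find a success since: 2010/05/17 01:38:50.000, "
    "expecting one in the last 30 minutes"
)
print(failing_task(sample))
```

If the response is delivered as HTML, strip the markup before matching, since the job name is captured up to the next whitespace.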

...