As of the Grouper 2.4 patches (API #61, UI #37), Grouper has a reporting capability. It starts simple, and more features can be added later.
You may want to check out the blog on Grouper Reporting from November 2019.
Reports need a storage location: the database (the default in 2.5.34+), a file system (only if the file system is shared among all Grouper component JVMs), or Amazon AWS S3.
grouper.properties

    ######################################
    ## Grouper Reporting
    ######################################

    # folder where system objects are for reporting config
    # {valueType: "stem"}
    reportConfig.systemFolder = $$grouper.rootStemForBuiltinObjects$$:reportConfig

    # if grouper reporting should be enabled
    # {valueType: "boolean", required: true}
    grouperReporting.enable = true

    # grouper reporting storage option. valid values are database, fileSystem or S3
    # {valueType: "string", required: true}
    reporting.storage.option = database

    # grouper reporting file system path where reports will be stored, e.g. /opt/grouper/reports
    # {valueType: "string", required: false}
    reporting.file.system.path =

    # grouper reporting s3 bucket name where the reports will be uploaded
    # {valueType: "string", required: false}
    reporting.s3.bucket.name =

    # grouper reporting s3 region where the reports will be uploaded, e.g. us-west-2
    # {valueType: "string", required: false}
    reporting.s3.region =

    # grouper reporting s3 access key
    # {valueType: "string", required: false}
    reporting.s3.access.key =

    # grouper reporting s3 secret key
    # {valueType: "string", required: false}
    reporting.s3.secret.key =

    # grouper reporting email subject
    # {valueType: "string"}
    reporting.email.subject = Report $$reportConfigName$$ generated

    # grouper reporting email body. Can use variables
    # {valueType: "string"}
    reporting.email.body = Hello $$subjectName$$, \n\n Report $$reportConfigName$$ has been generated. Download the report: $$reportLink$$ \n\n Thanks

For this example, let's use the file system. Configure the path in grouper.properties (if your version has reporting.storage.option, set it to fileSystem as well):

    # grouper reporting file system path where reports will be stored, e.g. /opt/grouper/reports
    # {valueType: "string", required: false}
    reporting.file.system.path = d:/temp/temp/grouperReports

Make sure mail is set up in the SMTP external system:

    # smtp server is a domain name or dns name. set to "testing" if you want to log instead of send (e.g. for testing)
    # {valueType: "string"}
    mail.smtp.server = localhost
    mail.smtp.from.address = noreply@whatever.edu

Make sure you have an emailAttributeName in your person subject source:

    subjectApi.source.jdbc.param.emailAttributeName.value = email

Make sure you have grouper.ui.url set in grouper.properties:

    # put the URL which will be used e.g. in emails to users. include the webappname at the end, and nothing after that.
    # e.g. https://server.school.edu/grouper/
    # {valueType: "string"}
    grouper.ui.url = http://localhost:8097/grouper/

Make sure you have an encrypt.key in morphString.properties.
If you are using the built-in subject source, you can add a user for yourself with your own email address. In GSH:

    grouperSession = GrouperSession.startRootSession();
    RegistrySubject.addOrUpdate(grouperSession, "mchyzer", "person", "Chris Hyzer", "Chris Hyzer", "mchyzer", "Chris Hyzer - IAM architect", "your@email.address");

Open a group and add a new report with these values:

Config attribute | Value |
---|---|
reportConfigName | myReport |
reportConfigDescription | my service users |
reportConfigType | SQL |
reportConfigQuery | SELECT gm.subject_id as SUBJECT_ID, gm.name as NAME, gm.description as DESCRIPTION FROM grouper_memberships_lw_v gmlv, grouper_members gm WHERE gmlv.group_name = 'testB:testGroup2' AND gmlv.member_id = gm.id AND gmlv.subject_source = 'jdbc' ORDER BY 1 |
reportConfigFormat | CSV |
reportConfigQuartzCron | 0 0 6 * * ? (run daily at 6am) |
reportConfigFilename | usersOfMyService_$$timestamp$$.csv |
reportConfigSendEmail | Yes, send email when the report is ready |
reportConfigViewersGroupId | Allowed group id: a group with your user in it |
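
The schedule in the example, `0 0 6 * * ?`, is a Quartz cron expression, which has a seconds field in front of the usual cron fields and an optional year field at the end. As a reading aid, here is a minimal sketch that labels each field. The class `QuartzCronFields` is hypothetical and not part of Grouper or Quartz; it only splits the string and does not validate or schedule anything.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: labels the fields of a Quartz cron expression.
final class QuartzCronFields {

  // Quartz field order; the seventh field (year) is optional.
  private static final String[] NAMES = {
      "seconds", "minutes", "hours", "dayOfMonth", "month", "dayOfWeek", "year"};

  static Map<String, String> label(String cron) {
    String[] parts = cron.trim().split("\\s+");
    if (parts.length < 6 || parts.length > 7) {
      throw new IllegalArgumentException("Expected 6 or 7 fields but got " + parts.length);
    }
    // LinkedHashMap keeps the fields in expression order.
    Map<String, String> fields = new LinkedHashMap<>();
    for (int i = 0; i < parts.length; i++) {
      fields.put(NAMES[i], parts[i]);
    }
    return fields;
  }

  public static void main(String[] args) {
    // prints {seconds=0, minutes=0, hours=6, dayOfMonth=*, month=*, dayOfWeek=?}
    System.out.println(label("0 0 6 * * ?"));
  }
}
```

So the example fires at second 0, minute 0, hour 6, every day of every month; the `?` means "no specific value" for the day of week.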
Once the report has run, you get an email about it.
Follow the link in the email to see the report.
The GSH report type can have an output type of either CSV or FILE. For both types, the script will use fields from the available gsh_builtin_gshReportRuntime variable to add data to the output. For CSV output, the script will set a header array and a list of data arrays. For FILE output, the script will open a Writer and write arbitrary data to the character stream.
Variables available to the GSH script
Variable | Java class | Description |
---|---|---|
gsh_builtin_grouperSession | GrouperSession | session the script runs as |
gsh_builtin_ownerStemName | String | owner stem name where template was called |
gsh_builtin_ownerGroupName | String | owner group name where template was called |
gsh_builtin_gshReportRuntime | GshReportRuntime | container to hold important information about the run |
gsh_builtin_gshReportRuntime.getOwnerGroup() | Group | owner group where template was called |
gsh_builtin_gshReportRuntime.getOwnerStem() | Stem | owner stem where template was called |
gsh_builtin_gshReportRuntime.getOwnerGroupName() | String | same as gsh_builtin_ownerGroupName |
gsh_builtin_gshReportRuntime.getOwnerStemName() | String | same as gsh_builtin_ownerStemName |
gsh_builtin_gshReportRuntime.getGrouperReportData() | GrouperReportData | container for the output file (FILE) or csv rows (CSV) |
gsh_builtin_gshReportRuntime.getGrouperReportData().getFile() | File | (FILE) file object to be written to |
gsh_builtin_gshReportRuntime.getGrouperReportData().getHeaders() | List<String> | (CSV) column names to appear in csv header row |
gsh_builtin_gshReportRuntime.getGrouperReportData().getData() | List<String[]> | (CSV) rows of data to appear in the csv; not set by default, so must be initialized with at least an empty list |
Example script for FILE output:

    Group g = gsh_builtin_gshReportRuntime.ownerGroup
    File file = gsh_builtin_gshReportRuntime.grouperReportData.file

    file.withWriter('utf-8') { writer ->
      // header line
      writer << ['Row', 'ID', 'UID', 'Name', 'Email'].join(",") << "\n"
      // one line per member of the owner group
      g.members.eachWithIndex { it, i ->
        writer << i+1 << ","
        writer << it.subject.getAttributeValue('employeenumber') << ","
        writer << it.subject.getAttributeValue('uid') << ","
        writer << it.subject.getAttributeValue('cn') << ","
        writer << it.subject.getAttributeValue('mail') << "\n"
      }
    }

Example script for CSV output:

    import edu.internet2.middleware.grouper.app.reports.GrouperReportData

    Group g = gsh_builtin_gshReportRuntime.ownerGroup
    GrouperReportData grouperReportData = gsh_builtin_gshReportRuntime.grouperReportData

    grouperReportData.headers = ['Row', 'ID', 'UID', 'Name', 'Email']
    // data is not set by default, so initialize it before adding rows
    grouperReportData.data = new ArrayList<String[]>()

    g.members.eachWithIndex { it, i ->
      String[] row = [
        i+1,
        it.subject.getAttributeValue('employeenumber'),
        it.subject.getAttributeValue('uid'),
        it.subject.getAttributeValue('cn'),
        it.subject.getAttributeValue('mail'),
      ]
      grouperReportData.data << row
    }

To develop or test a report script in a plain GSH session, simulate the built-in variables first:

    import edu.internet2.middleware.grouper.app.reports.GshReportRuntime
    import edu.internet2.middleware.grouper.app.reports.GrouperReportData

    def gs = GrouperSession.startRootSessionIfNotStarted().grouperSession
    def g = GroupFinder.findByName(gs, "test:vpn:vpn_legacy_exceptions", true)

    GshReportRuntime gshReportRuntime = new GshReportRuntime()
    gshReportRuntime.ownerGroup = g
    gshReportRuntime.ownerGroupName = g.name

    GrouperReportData grouperReportData = new GrouperReportData()
    gshReportRuntime.grouperReportData = grouperReportData

    // (next line is for FILE output only, set to an arbitrary file instead of the autogenerated one)
    grouperReportData.file = new File('/tmp/legacy_exceptions.csv')

    // simulate the built-in variables
    GrouperSession gsh_builtin_grouperSession = gs
    GshReportRuntime gsh_builtin_gshReportRuntime = gshReportRuntime
    String gsh_builtin_ownerStemName = gsh_builtin_gshReportRuntime.ownerStemName
    String gsh_builtin_ownerGroupName = gsh_builtin_gshReportRuntime.ownerGroupName

    /** continue from here with reporting script */

The configuration follows the same attribute structure as other Grouper modules such as attestation and deprovisioning.
Attribute definitions for config
Definition | Assigned To | Purpose | Value | Cardinality |
---|---|---|---|---|
reportConfigDef | folder, group | identify a report config | marker | Multi assign |
reportConfigValueDef | folder assignment, group assignment | name/value pairs | string | Single assign, single valued |
Attribute names for config
Name | Definition | Required? | Value |
---|---|---|---|
reportConfigMarker | reportConfigDef | | <none> |
reportConfigType | reportConfigValueDef | required (SQL and blank available) | Report type. Initially only SQL was available; GSH is also a report type |
reportConfigFormat | reportConfigValueDef | required (CSV and blank available) | Output format. Initially only CSV was available; GSH reports can also use FILE |
reportConfigName | reportConfigValueDef | required | Name of report. No two reports in the same owner should have the same name |
reportConfigFilename | reportConfigValueDef | required and shown for CSV type | e.g. usersOfMyService_$$timestamp$$.csv, where $$timestamp$$ translates to the current time in this format: yyyy_mm_dd_hh24_mi_ss |
reportConfigDescription | reportConfigValueDef | required | Textarea which describes the information in the report. Must be less than 4k |
reportConfigViewersGroupId | reportConfigValueDef | optional | Group ID of the people who can view this report. Grouper admins can view any report, and blank means admins only. If EveryEntity is in the group, the report is public |
reportConfigQuartzCron | reportConfigValueDef | required | Quartz cron-like schedule |
reportConfigSendEmail | reportConfigValueDef | required (default to true, no blank option available) | true/false if email should be sent |
reportConfigEmailSubject | reportConfigValueDef | optional (default to generated subject, blank means use generated) | subject for email (optional, will be generated from report name if blank) |
reportConfigEmailBody | reportConfigValueDef | optional (defaults to the generated body; blank means use the default). This should be a textarea; on submit, convert newlines (\r\n or \r) to standard \n | Body for the email; generated from a Grouper default if blank. Supports \n for newlines and substitutes $$reportConfigName$$, $$reportConfigDescription$$, $$subjectName$$ and $$reportLink$$. Note: $$reportLink$$ must be in the email template if it is not blank |
reportConfigSendEmailToViewers | reportConfigValueDef | required if reportConfigSendEmail=true, default to true, no blank option | true/false if report viewers should get email (if reportConfigSendEmail is true) |
reportConfigSendEmailToGroupId | reportConfigValueDef | required if reportConfigSendEmail=true and reportConfigSendEmailToViewers=false | If reportConfigSendEmail is true and reportConfigSendEmailToViewers is false, this is the group ID whose members are retrieved; each member whose subject email attribute is not null is sent the email |
reportConfigQuery | reportConfigValueDef | required and shown for CSV type | SQL for the report. The columns must be named in the SQL (e.g. not select *) and generally this comes from a view |
reportConfigEnabled | reportConfigValueDef | default to true (required, no blank option) | Uses the same logic as the loader's enabled setting: enables or disables this job |
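
The filename, email subject and email body above all use `$$name$$` placeholders. Here is a minimal sketch of how such substitution could work, assuming a plain name-to-value lookup. The class `ReportPlaceholders` and its methods are hypothetical and are not Grouper's implementation. The Java pattern `yyyy_MM_dd_HH_mm_ss` is an assumed equivalent of the documented yyyy_mm_dd_hh24_mi_ss format.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces $$name$$ placeholders with values from a map.
final class ReportPlaceholders {

  // Matches $$name$$ where name is made of letters, digits, dots or underscores.
  private static final Pattern PLACEHOLDER =
      Pattern.compile("\\$\\$([A-Za-z0-9_.]+)\\$\\$");

  // Assumption: Java equivalent of the documented yyyy_mm_dd_hh24_mi_ss format.
  private static final DateTimeFormatter TIMESTAMP =
      DateTimeFormatter.ofPattern("yyyy_MM_dd_HH_mm_ss");

  static String timestamp(LocalDateTime when) {
    return TIMESTAMP.format(when);
  }

  static String substitute(String template, Map<String, String> values) {
    Matcher matcher = PLACEHOLDER.matcher(template);
    StringBuffer result = new StringBuffer();
    while (matcher.find()) {
      String value = values.get(matcher.group(1));
      // Unknown placeholders are left untouched so that a typo stays visible.
      String replacement = value != null ? value : matcher.group();
      matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
    }
    matcher.appendTail(result);
    return result.toString();
  }

  public static void main(String[] args) {
    Map<String, String> values = new HashMap<>();
    values.put("timestamp", timestamp(LocalDateTime.now()));
    values.put("reportConfigName", "myReport");
    System.out.println(substitute("usersOfMyService_$$timestamp$$.csv", values));
    System.out.println(substitute("Report $$reportConfigName$$ generated", values));
  }
}
```

Leaving unknown placeholders in place is a design choice of this sketch; the source does not say what Grouper does with a name it does not recognize.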
Attribute definitions for instance (a report that was run)
This attribute is assigned to the same owner as the config attribute (e.g. the same group/folder)
Definition | Assigned To | Purpose | Value | Cardinality |
---|---|---|---|---|
reportInstanceDef | folder, group | identify a report that was run | marker | Multi assign |
reportInstanceValueDef | folder assignment, group assignment | name/value pairs | string | Single assign, single valued |
Attribute names for instance
Note: the ID is the attribute assign id of the marker (this is passed in URLs/emails etc)
Name | Definition | Value |
---|---|---|
reportInstanceMarker | reportInstanceDef | <none> |
reportInstanceStatus | reportInstanceValueDef | SUCCESS means the screen links to the report, ERROR means the report didn't execute successfully |
reportElapsedMillis | reportInstanceValueDef | number of millis it took to generate this report |
reportInstanceConfigMarkerAssignmentId | reportInstanceValueDef | Attribute assign ID of the marker attribute of the config (same owner as this attribute, but there could be many reports configured on one owner) |
reportInstanceMillisSince1970 | reportInstanceValueDef | millis since 1970 that this report was run. This must match the timestamp in the report name and storage |
reportInstanceSizeBytes | reportInstanceValueDef | number of bytes of the unencrypted report |
reportInstanceFilename | reportInstanceValueDef | filename of report |
reportInstanceFilePointer | reportInstanceValueDef | Depending on the storage type, a pointer to the report in storage, e.g. the S3 address. Note: the S3 address has the .csv suffix; change it to __metadata.json for the instance metadata |
reportInstanceDownloadCount | reportInstanceValueDef | number of times this report was downloaded (note: update this in a try/catch inside a for loop so concurrency doesn't cause problems) |
reportInstanceEncryptionKey | reportInstanceValueDef | randomly generated 16 char alphanumeric encryption key (never allow display or edit of this) |
reportInstanceRows | reportInstanceValueDef | number of rows returned in report |
reportInstanceEmailToSubjects | reportInstanceValueDef | source::::subjectId1, source2::::subjectId2 list of subjects who were emailed successfully (can't be more than 4k chars) |
reportInstanceEmailToSubjectsError | reportInstanceValueDef | source::::subjectId1, source2::::subjectId2 list of subjects who were NOT emailed successfully; don't include g:gsa groups (can't be more than 4k chars) |
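
The two email attributes above store subjects as a `source::::subjectId` list capped at 4k characters. Here is a minimal sketch of writing and reading that format. The class `EmailedSubjectsList` is hypothetical, the cap is read as 4000 characters (an assumption), and subject IDs are assumed not to contain commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for the "source::::subjectId1, source2::::subjectId2" format.
final class EmailedSubjectsList {

  // Assumption: "4k chars" is read here as 4000 characters.
  static final int MAX_CHARS = 4000;

  // Each element is a two-item array: {sourceId, subjectId}.
  static String encode(List<String[]> sourceAndSubjectIds) {
    StringBuilder out = new StringBuilder();
    for (String[] pair : sourceAndSubjectIds) {
      String entry = pair[0] + "::::" + pair[1];
      int separator = out.length() == 0 ? 0 : 2;
      // Stop before the value would exceed the cap; later entries are dropped.
      if (out.length() + separator + entry.length() > MAX_CHARS) {
        break;
      }
      if (separator > 0) {
        out.append(", ");
      }
      out.append(entry);
    }
    return out.toString();
  }

  static List<String[]> decode(String value) {
    List<String[]> result = new ArrayList<>();
    if (value == null || value.trim().isEmpty()) {
      return result;
    }
    for (String entry : value.split(",")) {
      String[] pair = entry.trim().split("::::", 2);
      // Skip malformed entries that have no "::::" separator.
      if (pair.length == 2) {
        result.add(pair);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String encoded = encode(Arrays.asList(
        new String[] {"jdbc", "mchyzer"}, new String[] {"ldap", "jsmith"}));
    System.out.println(encoded);
    System.out.println(decode(encoded).size());
  }
}
```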
Under folders or groups, the "More actions" menu should have a "Reports" item, which goes to the View reports screen. Note: we need to harmonize this with Shilen's group and folder reports. Should they share a menu item?
This is the default screen. Drop down with the following options:
Screen shows
The report takes the SQL and its columns and makes a CSV with all the results. Chris has this logic and will commit it in the branch. The CSV is delivered as a download from the browser.
If reports are being configured to be emailed, then the configured or default email will be sent. Note, the actual report will not be attached in the email for security reasons. A link to the report instance screen will be in the email.
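
The recipient rules from the config attributes (reportConfigSendEmail, reportConfigSendEmailToViewers, reportConfigSendEmailToGroupId) can be sketched roughly as follows. The class `ReportEmailRecipients` is hypothetical, and the two lists stand in for email addresses already looked up from the viewers group and the send-to group; none of this is Grouper's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: decides who gets the "report is ready" email.
final class ReportEmailRecipients {

  static List<String> resolve(boolean sendEmail, boolean sendToViewers,
      List<String> viewerEmails, List<String> sendToGroupEmails) {
    if (!sendEmail) {
      // reportConfigSendEmail=false: nobody is emailed.
      return Collections.emptyList();
    }
    // Viewers get the email unless a separate send-to group is configured.
    List<String> source = sendToViewers ? viewerEmails : sendToGroupEmails;
    List<String> recipients = new ArrayList<>();
    for (String email : source) {
      // Members without an email attribute (null or blank) are skipped.
      if (email != null && !email.trim().isEmpty()) {
        recipients.add(email.trim());
      }
    }
    return recipients;
  }

  public static void main(String[] args) {
    List<String> viewers = Arrays.asList("viewer@whatever.edu");
    List<String> group = Arrays.asList("owner@whatever.edu", null);
    System.out.println(resolve(true, false, viewers, group));
  }
}
```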
In 2.4 we don't want to add a new table to store files, so for people who want to use this feature the only options are AWS S3 buckets or the file system, with the report encrypted. We can add more storage options later.
In 2.5.34+ reports are stored in the database by default.
They are stored in the grouper_file table.
The deployer will need an AWS account; the free tier might suffice.
Need to configure the AWS creds in grouper.properties
Configure the AWS S3 bucket location
Configure the path where report files will be stored
Inside that path, Grouper will create "folders":
$base$/reports/YYYY/MM/DD/$someUniqueId$/$reportFilename$.csv.encrypt
$base$/reports/YYYY/MM/DD/$someUniqueId$/$reportFilename$__metadata.json
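
To make the layout concrete, here is a minimal sketch that builds both paths for one report instance. The class `ReportStoragePaths` is hypothetical; it assumes `$reportFilename$` is the filename without its `.csv` suffix and that `$someUniqueId$` is supplied by the caller.

```java
import java.time.LocalDate;

// Hypothetical helper: builds the storage paths for one report instance.
final class ReportStoragePaths {

  // $base$/reports/YYYY/MM/DD/$someUniqueId$
  static String folder(String base, LocalDate runDate, String uniqueId) {
    return String.format("%s/reports/%04d/%02d/%02d/%s",
        base, runDate.getYear(), runDate.getMonthValue(), runDate.getDayOfMonth(), uniqueId);
  }

  // Encrypted report data.
  static String dataPath(String folder, String reportFilename) {
    return folder + "/" + reportFilename + ".csv.encrypt";
  }

  // Instance metadata stored next to the report data.
  static String metadataPath(String folder, String reportFilename) {
    return folder + "/" + reportFilename + "__metadata.json";
  }

  public static void main(String[] args) {
    String folder = folder("d:/temp/temp/grouperReports", LocalDate.of(2019, 11, 5), "abc123");
    System.out.println(dataPath(folder, "usersOfMyService"));
    System.out.println(metadataPath(folder, "usersOfMyService"));
  }
}
```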
Report encryption
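
The instance attributes describe reportInstanceEncryptionKey as a randomly generated 16 character alphanumeric key. Here is a minimal sketch of generating such a key. The class `ReportEncryptionKey` is hypothetical, and the cipher used to encrypt the report itself is not specified on this page, so it is left out.

```java
import java.security.SecureRandom;

// Hypothetical helper: generates a random alphanumeric key.
final class ReportEncryptionKey {

  private static final String ALPHANUMERIC =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

  // SecureRandom rather than Random, since the key protects report contents.
  private static final SecureRandom RANDOM = new SecureRandom();

  static String generate(int length) {
    StringBuilder key = new StringBuilder(length);
    for (int i = 0; i < length; i++) {
      key.append(ALPHANUMERIC.charAt(RANDOM.nextInt(ALPHANUMERIC.length())));
    }
    return key.toString();
  }

  public static void main(String[] args) {
    // Deliberately prints only the length: the key should never be displayed.
    System.out.println(generate(16).length());
  }
}
```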
To delete a report instance, delete its metadata and report data from storage. If that does not happen, a cleanup daemon will delete them eventually.
When a report is deleted, delete all of its metadata and report data from storage. If that does not happen, a cleanup daemon will delete them eventually.
There are no direct links to reports, and they are encrypted anyway. The only way to download a report is through the Grouper UI (or API), as an authorized user. The Grouper UI acts as a reverse proxy to the report storage.
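
The "authorized users" rule follows from reportConfigViewersGroupId: admins can view any report, a blank group means admins only, and a group containing EveryEntity makes the report public. Here is a minimal sketch of that check. The class `ReportViewAccess` is hypothetical, the member IDs are passed in as a plain set rather than looked up through the Grouper API, and the `GrouperAll` subject ID for EveryEntity is an assumption to verify against your installation.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical helper: decides whether a subject may view or download a report.
final class ReportViewAccess {

  // Assumption: subject ID of EveryEntity; verify for your installation.
  static final String EVERY_ENTITY_ID = "GrouperAll";

  static boolean canView(boolean isGrouperAdmin, String viewersGroupId,
      Set<String> viewerMemberIds, String subjectId) {
    if (isGrouperAdmin) {
      return true; // Grouper admins can view any report
    }
    if (viewersGroupId == null || viewersGroupId.trim().isEmpty()) {
      return false; // blank viewers group means admins only
    }
    if (viewerMemberIds.contains(EVERY_ENTITY_ID)) {
      return true; // EveryEntity in the group makes the report public
    }
    return viewerMemberIds.contains(subjectId);
  }

  public static void main(String[] args) {
    System.out.println(canView(false, "someGroupId", Collections.singleton("mchyzer"), "mchyzer"));
  }
}
```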
The overall report daemon should go through storage, and
Audits should be added for report creation, editing and downloading. No audits for emails sent. These audits should be linked to the group or folder where the report is configured.
A new item "Reports" is available in More actions dropdown.
Grouper admins can add new reports as shown in the screenshot below
The screenshot below shows the existing reports.
For each report config, a few actions are available as shown in the screenshot below
A report can be downloaded from the report instance page as shown in the screenshot below
In another pass we could create a report based on loading/provisioning.
Grouper Report showing summary of your installation