Salsa Computer Security Incidents - Internet2 (CSI2) Working Group
Phillip Deneault, Worcester Polytechnic Institute
draft-internet2-salsa-csi2-renoir-overview-02.html
Copyright © 2007 by Internet2 and/or the respective authors
Comments to: salsa-csi2-comments AT internet2 DOT edu
Created: 18-Oct-2006
Last Updated: 11-Jan-2007
Draft Expires: 11-July-2007

RENOIR: Research and Educational Networking Operational Information Retrieval
=============================================================================

*Overview*

RENOIR is a reporting system for sharing information about security incidents within an inter-institutional trust community. It is intended to aid inter-institutional incident response, notification regarding compromised systems, analysis for recognition of attack behaviors and trends, and awareness for protection. RENOIR will handle security data from a variety of sources, both human and machine, and organize that data into individual high-level cases which can then be used for response, analysis, and reporting.

The system depends on a trusted third party, which for the purposes of this project is REN-ISAC (Research and Education Networking Information Sharing and Analysis Center) [1]. Through REN-ISAC, incidents can be centrally coordinated, reported, and mitigated.

*Background*

Prior to the formation of the REN-ISAC there was no private trust community dedicated to security information sharing focused on the unique needs and environments of universities. Existing communities open to university participation didn't maintain a tightly controlled private membership, were oriented to internet service provider rather than university needs, or had other requirements which prevented universal participation of institutions of higher education.
Operational security incident information sharing, if done at all, reached affected institutions incompletely, was not timely, and required a high investment of time due to the work of identifying contact information, sending multiple notifications, etc.

With the advent of REN-ISAC, institutions of higher education have the ability to participate in an organization that is designed around their needs, is a trust community composed of a tightly vetted private membership, and has the ability to aggregate data from various sources. The data will come from sensors, REN-ISAC's partnership with Internet2, REN-ISAC's partnership with the US Department of Homeland Security through the formal ISAC structure, other ISACs, REN-ISAC's own membership, as well as other groups who recognize REN-ISAC's role in organizing and mitigating security problems. This data is a mixture of machine-generated reports and events and human-generated reporting and commentary.

As identified by the Computer Security Incidents Internet2 working group (CSI2) [2], one of the most pressing problems in operational security is moving security reports to sites in a timely fashion. RENOIR was proposed as a system for accepting information in a variety of ways, transforming it into a standard, machine-readable report format, and outputting the report in a variety of ways which each receiving site could specify and integrate. REN-ISAC would be the trusted third party from which this effort was organized.

CSI2 set the following high-level goals for RENOIR:

· The system will accept human input along with structured data to form reports which are stored in an appropriate format.
· The system will allow for input from users in a variety of roles (reporting party, affected site, researchers, administrators, etc.).
· The system will use useful, widely accepted transport mechanisms (HTTP, SMTP) and use encrypted channels whenever possible, in the transport layer and/or by encrypting message content.
· The system will use a central repository of contact information in order to facilitate automated notifications of affected sites.
· The system will be extensible to include new security problems and reported incident types as they occur in the future.

*Identified Problems*

*Problem A: Human vs. Automated Reporting:*

RENOIR is intended to be an organizational tool for security data generated both by humans and by machines. This means that human-entered data needs to be organized into a form that can be read later by computer systems with minimal parsing, and machine-reported data needs to be minimized and reported in a format which describes problems succinctly (i.e. this is not a system for storing flows, but the summary results of the flows or what they represent).

*Problem B: Data Access Levels and Encryption:*

An information sharing system which uses access levels and encryption to hide data could be considered self-defeating, but these are necessary to facilitate various levels of data sharing. If handled properly, sensitive incidents can be shared via RENOIR with different levels of access and encryption to minimize the number of parties with knowledge of the report. This can help solve the problem of how to report to other sites without a high risk of negative public relations. It also solves the problem of storing potentially sensitive information on a non-university server, since a proper encryption mechanism would only allow authorized users with the proper keys and access to view the report. The problem becomes how best to implement an encryption system for N parties.
This is very closely tied to Problem C.

*Problem C: Reporting Policies and Procedures:*

RENOIR depends on the REN-ISAC membership to send data to it for analysis. This will require sites to decide how and how much to participate. This can be mitigated by adding access levels, encryption, and other features which give a greater breadth of options for information release. For example, one site might only wish to see the open-access, unencrypted reports for its own informational purposes, while another site might wish to use RENOIR's encrypted limited-access reports as a communication mechanism between many affected sites.

*Problem D: Data Input and Output:*

RENOIR will need to accept a structured data format for handling both human and machine data. There will need to be mechanisms to assist with data input, either at the client or with fields filled in by RENOIR. On output, the data might need to be converted from this structured format to a human-readable format for consumption by a non-security professional. This requires a modular interface to allow flexible access to the system.

*Problem E: Data Retention:*

Although RENOIR contains much time-sensitive data, the information that is gathered does not 'expire' once critical dates have passed. Incidents stored within RENOIR can be mined and accessed later to build an operational knowledge base and case system for detecting trends. This data could be kept indefinitely to build a security history of university space; however, data should not be kept indefinitely, for both legal and technical reasons. It will be necessary to purge data on a regular basis, but there will need to be some criteria established, or some way to summarize data for future use. This will require a data handling policy as well as a technical methodology.

*Problem F: Service Disruption:*

Any system which is relied on for security purposes needs to be robust.
This problem is rarely encountered on a single site, since disruptions can be easily found and corrected (and they usually need to be corrected before anything else can happen anyway). As an inter-school system, RENOIR will need to have redundancies built in and be engineered for fairly straightforward recovery from any type of system failure. It will also be necessary to structure the system in such a way that downtime means only a delay in reporting and not a loss of data.

*Proposed Solutions:*

*XML Data with IODEF (Incident Object Description and Exchange Format):*

The structured format of an XML document is exactly what is needed to solve the human/machine reporting problem. It is structured enough to be generated and parsed by machines with relatively little programming, and it is flexible enough to hold any format decided on.

For the purposes of RENOIR, a good XML format to use would be the Incident Object Description and Exchange Format (IODEF) [3]. IODEF is an IETF proposed standard for reporting security incidents in a standardized way. It is not a catch-all for security events (i.e. firewall logs, netflows, syslog entries, etc.) and is instead a format for describing an incident from a human perspective. It can be used as a container format for machine-generated events, but it would be ideal if only the data related to an incident were stored, both to save space and to eliminate excess processing.

IODEF is also extensible. It is flexible enough to handle most existing incidents and already has several proposed extensions, including one for phishing/spamming and another for web applications and grid computing.

*Report Types:*

A small collection of report types has been brainstormed. These all store their information in XML and are controlled in different fashions.
These are outlined below:

· Encrypted Reports: XML reports are encrypted on a per-message basis and have access controls that allow access only by involved parties that require high security.
· 'Limited Access' Reports: XML reports have no encryption but have access controls for involved parties only. REN-ISAC is also able to view these reports.
· Normal Reports: XML reports which have no encryption or access controls. Member sites can search/access the reports, limited only by client scope or view.
· Semi-Anonymous Reports: XML reports produced by REN-ISAC saying "A member institution has had problem X..." This allows for open reporting without giving away the source.
· Non-Incident Reports: XML reports produced by REN-ISAC for informational purposes, like general announcements.

*Encryption:*

All reports need various levels of encryption and verification. All reports should be signed to verify that the reports have not been modified. Strong encryption is required for Encrypted Reports. This is a problem because the typical solution of asymmetric key encryption becomes much more difficult with more than two parties involved. This is solved by using per-report symmetric keys distributed using symmetric methods between RENOIR and each site.

*Data Expiration:*

Since each report is a collection of information from both the submitter and other parties and can be used to build statistical trends of incidents, each report could be kept for a very long time to build a very useful background of historical information. Standard information handling procedures dictate that this information should be purged from the system eventually. Some middle ground must be found that saves useful content and purges excess data. There are currently several proposed mechanisms within the CSI2 group. The final solution will most likely include several of these.

· A deletion flag (which may or may not be set by default at creation time) which would mark the record as deletable after X number of days.
· Segmentation of the data into record types, giving each one a different policy.
· An archive which would not really purge the data, but just move it into longer-term storage accessible only by REN-ISAC. This could be useful to store everything except encrypted data and would possibly involve implementation of an 'archive/do not archive' flag.
· Possibly sending a monthly notice email to sites listing which reports will be purged, forcing each site to either access the reports or set them to not be deleted.
· Purging only parts of messages, like flows, evidence, etc., and maintaining the comments, dialog, etc. This can be done for all but encrypted reports.

*Key Metrics:*

*Time from Detection to Reporting:*

It is important that any system which reports data to RENOIR do so as easily and cleanly as possible in order to minimize the time from detection at a site to reporting to other sites. Much of the time sites don't report problems at all, but for a system like RENOIR to succeed, those sites must not only report, but report quickly. It is imperative that reporting be as easy as possible.

This improvement can be implemented in any number of ways. Sites could script reporting into existing incident workflow systems or intrusion detection systems. Sites can input data into REN-ISAC-based security initiatives like the shared darknet system and let REN-ISAC do the reporting. Or, sites can do their own write-ups using client interfaces to RENOIR with wizards or web interfaces.

*Time from Reporting to Remediation/Detection:*

Just as it is important to report accurately and quickly, it is also important to get that information into the hands of affected parties, either to feed that data into an intrusion detection system or for quick remediation. Data cannot be held by REN-ISAC alone, because REN-ISAC would quickly become the bottleneck for RENOIR. A possible mechanism to use would be EDDY [4] (*E*nd-to-end *D*iagnostic *D*iscover*y*).
A system using EDDY would help keep this metric low, since EDDY could automatically notify various sites via a preferred method for various types of incidents. Also, if RENOIR automatically output items of interest, like IRC Command and Control servers, to pre-existing lists that are in constant use, this would help identify problems faster and allow sites to report only once.

*Planned Project Stages:*

*Phase 1:*

*Step 1 - Building the Storage Engine:*

Before any further testing and experimentation can be done, there needs to be a storage engine in place to get reports in and out of the system.

*Step 2 - Building Input/Output Agents for Automated Reporting:*

· Input agents should be built for machine-generated events that already exist or are close to fruition. An example would be an agent that takes denial of service reports from an Arbor system and reports from the REN-ISAC Shared Darknet Project and feeds those reports into the storage engine.
· Output agents, starting with an SMTP output agent, should be built and tested using real events in a limited fashion.
· Define an API for sites to implement their own tools.

*Step 3 - Building a Routing Agent:*

· A routing agent should be built to move the messages around and send them to the output agents for reporting. EDDY could perform this task.

*Step 4 - Build a Portable Client:*

· A portable client should be built to allow flexible access to reports.

The goal for Phase 1 is to build up the system and start collecting data which is quickly replaced, so developers can see how well the system handles rapidly expanding load. Building the system in this way will also begin to automate the reporting process both to REN-ISAC and from REN-ISAC to individual sites, which is a near-term goal for REN-ISAC.

*Phase 2:*

· Work on feedback mechanisms for existing outputs to tie the user feedback into the reports which spawned them.
· Work on human-generated reports.
Get data entered by humans into normalized formats which can be used for more reporting.
· Build out visualization mechanisms, or at least agents to generate data for analysis. There should be enough data in the database after Phase 1 to generate useful information.
· Continue to build out input and output types.

The goals for Phase 2 are to use the data generated from Phase 1 and to include the more difficult to handle human-generated data.

None of these phases has a set timeline; however, the steps in Phase 1 need to happen in order. Once those major pieces are done, REN-ISAC requirements can help decide what gets built next. Some development can also be done concurrently.

*References:*

[1] http://www.ren-isac.net
[2] http://security.internet2.edu/csi2/
[3] http://www.cert.org/ietf/inch/inch.html
[4] http://middleware.internet2.edu/e2ed/

This project was supported by Grant No. 2006-DD-BX-K271 awarded by the Bureau of Justice Assistance. The Bureau of Justice Assistance is a component of the Office of Justice Programs, which also includes the Bureau of Justice Statistics, the National Institute of Justice, the Office of Juvenile Justice and Delinquency Prevention, and the Office for Victims of Crime. Points of view or opinions in this document are those of the author and do not represent the official position or policies of the United States Department of Justice.
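
*Illustration: Report Type Access Check*

As an illustration only, the five report types listed under *Report Types* could be modeled as a small access check. This is a minimal sketch and not part of the CSI2 design: the names ReportType, Report, can_view, and the "REN-ISAC" identifier are hypothetical, and the sketch assumes that Semi-Anonymous and Non-Incident reports are readable by every member site and that REN-ISAC does not hold keys for Encrypted Reports unless it is itself an involved party.

```python
from dataclasses import dataclass, field
from enum import Enum


class ReportType(Enum):
    """The five report types from the Report Types section."""
    ENCRYPTED = "encrypted"
    LIMITED_ACCESS = "limited-access"
    NORMAL = "normal"
    SEMI_ANONYMOUS = "semi-anonymous"
    NON_INCIDENT = "non-incident"


REN_ISAC = "REN-ISAC"  # hypothetical identifier for the trusted third party


@dataclass
class Report:
    report_type: ReportType
    # Sites involved in the incident (hypothetical field name).
    involved_parties: frozenset = field(default_factory=frozenset)


def can_view(report: Report, viewer: str, members: frozenset) -> bool:
    """Return True if `viewer` may read `report` under the assumed rules."""
    if viewer != REN_ISAC and viewer not in members:
        return False  # outside the trust community
    if report.report_type is ReportType.ENCRYPTED:
        # Assumption: only involved parties hold the per-report key.
        return viewer in report.involved_parties
    if report.report_type is ReportType.LIMITED_ACCESS:
        # Involved parties only, plus REN-ISAC as stated in the draft.
        return viewer in report.involved_parties or viewer == REN_ISAC
    # Normal, Semi-Anonymous, Non-Incident: assumed open to all members.
    return True


members = frozenset({"site-a.example.edu", "site-b.example.edu", "site-c.example.edu"})
involved = frozenset({"site-a.example.edu", "site-b.example.edu"})
limited = Report(ReportType.LIMITED_ACCESS, involved)
encrypted = Report(ReportType.ENCRYPTED, involved)
normal = Report(ReportType.NORMAL)

for name, report in [("limited", limited), ("encrypted", encrypted), ("normal", normal)]:
    for viewer in ["site-a.example.edu", "site-c.example.edu", REN_ISAC]:
        print(name, viewer, can_view(report, viewer, members))
```

Keeping the decision in one function means a new report type only needs a new enum member and one extra branch, which fits the stated goal that the system remain extensible as new incident types appear.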