*Proceedings: Salsa NetGuru Meeting, Feb. 14-15, 2007, Minneapolis, MN*
========================================================================

_Attending:_

Eric Brown _Virginia Tech_
Christopher Chin _UC Berkeley_
Michael Contino _PSU_
Rich Cropp _PSU_
Alan Crosswell _Columbia_
Cas D'Angelo _Georgia Tech_
David Farmer _U. Minnesota_
Clark Gaylord _Virginia Tech_
Jim Gogan _UNC-Chapel Hill_
Terry Gray _U. Wash_
Peter Gutierrez _U. Mass_
Marilyn Hay _UBC_
Roy Hockett _U. Michigan_
Shumon Huque _U. Pennsylvania_
Steven Lee _Virginia Tech_
Kevin Miller _Duke_
David Morton _U. Washington_
James Nesbitt _Duke_
Steve Olshansky _Internet2_
Mark Poepping _CMU_
Lea Roberts _Stanford_
Richard Sammis _Indiana_
Michael Sinatra _Berkeley_
David Sinn _U. Washington_
Joe St. Sauver _Internet2, U. Oregon_
Gregory Travis _Indiana_
Michael Van Norman _UCLA_
Toby Wong _UBC_
Russell J. Yount _CMU_
Tom Zeller _Indiana_

*Wed 14-Feb-2007*

*Best of 2006* / *Worst of 2006*

· Deploying Layer 3 core
· Traffic engineering
· No traffic disruption devices
· Capacity exceptions
· Load balancing
· Planning of lightweight wireless deployments
· Redundant fibers to closest POP
· Voice outage - operational issue
· Data center upgrade rather than an outage
· Upgrading all 1200 APs on campus (see also Worst...)
· Sharing responsibility with another group to manage fiber install
· DNSSEC being evaluated seriously
· Secure wireless prototype (WPA) rollout no one is using...
· Large upgrades happening across campuses
· Migration of old load balancer - caused outages
· VoIP - worked out successful business model
· Upgrading all 1200 APs on campus
· Integrating/synchronizing teams of engineers on multiple campuses
· Turning off Usenet on campus
· Hired new security director - convinced performance is as important as security
· Integrating/synchronizing teams of engineers on multiple campuses
· VoIP rollout - cost recovery/business model
· IT re-org - got rid of silos
· Merging voice/video/data orgs
· Silo disruption is getting in the way of work getting done
· Schedule failure testing of network - 95% worked as it should have and uncovered some problems...
· Security reports separately from IT infra group. They campaigned to put firewalls in front of every edge network on campus...
· Getting off state-mandated Centrex PBX
· VoIP rollout challenges
· Successfully fending off attacks
· Metro wireless challenges
· Network diagnostics rising up the priority stack
· Moving onto new phone switch...
· Fiber pathway going away...
· Unit level firewall deployment
· User expectations for networking (e.g. performance, availability)
· Engineering challenges around VoIP rollout are finally living up to what we thought they might be in years past
· Traffic disruption (middle) boxes
· Statewide fiber network
· No sign of progress against worsening complexity...
· Wireless standards
· Fresh look at options as a result of external challenges
· Layer 3 core
· Organization embraced "customer driven" approach
· User expectations that wireless capacity/availability is/should be the same as wired network.
· Systematic approach to network data export (for research)
· "selling" new cost recovery or business models for wireless/phone/data/backup services
· Blackhole router (automated) - Juniper XML or Zebra router
· New voicemail system deployment
· Changed the funding model for faculty grants
· Big firewall project
· sup720-3b table space exhaustion
· Redundancy - degraded service harder to see
· Installing new IPSEC platform instead of SSL VPN

*Timely Topics - Identification*
Peter Gutierrez, U. Mass

· DNSSEC
· VoIP
· Network Authentication
· Network Management
· BGP Management
· DNS & DHCP
· Traffic control
· IPv6
· NMS/Flow tools
· Wireless
· Authentication / 802.1x
· Mobility (embedded)
· Web middlebox
· NAC
· Remote Access
· VPN, IPSec v. SSL VPN
· RADIUS logs
· Dept. Firewalls
· User isolation?
· Documentation
· Firewalls
· Generally
· DHCP
· Flow data lifecycle
· User feedback
· Multi-site coordination
· Type of data, sensors
· Darknets
· When credentials are stolen, there is no feedback for users to determine this easily, as in the standard *nix "last login" notification
· Better info/feedback for end-users about what is going on their systems, but this could easily overload?
· log aggregation? Making sense of too much data? Splunk - text search for logs?
· isolation strategies - what do users want/need?
· VoIP
· Handset v. technology choice for users?
· Power
· QoS or 100G?

*=> create wiki page for collecting survey questions, toward the goal of surveying the community to guide and inform future NetGuru work?*

*Data Center Networks*
Steven Lee, Virginia Tech
David Sinn, U. Washington

Between data centers - Metro Ethernet MPLS (eliminates need for spanning tree, also provides more capabilities for traffic engineering) separate from campus network, runs own routing protocol. "virtual wire."

Economic pressures to virtualize and consolidate...

Issues with load balancer? Maintaining state: "Sticky sessions"...
Many departments haven't upgraded MS SQL Server to newest versions which do load balancing

Eric Brown, Virginia Tech

· Recently upgraded their data center
· Good to isolate traffic for performance reasons? E.g. putting SANs on same backplane seems to make sense...
· How to determine when backplane is oversubscribed? Can be difficult...
· Systems staff often not well suited to managing network devices.
· Separate management networks? Older slower switches and gear often well used for this, doesn't need high performance.
· Issues with blade servers:
  · Power consumption
  · Cooling
  · Cost more
· Load balancing failures or problems commonplace? Anecdotal evidence would seem to indicate that they are... many have too many features, and if you use many of them the configs become quite complex
· Need to work with apps people to try to give them the service they want, but also to educate them about the pros/cons. Some load balancers tend to overreach...
· Common theme: define "complexity" in our context. What are *we* doing that is causing complexity, and how can we work to lessen it? What is in and out of our control? Decouple some components so as to make management easier?
· DR - hotsite backup: Overbuild new data center capacity and enter into reciprocal backup agreements? Or sell space as a revenue source?
· Amazon and Google building where there is low cost power and space, and selling colo capacity...
· Some have out of state data centers as backups, but not many. Many more are interested in exploring this... Different tectonic plates, different storm zones, different power grids.
· Common constraints: cost, space, expectations

*=> Role for "matchmaking service" within Internet2, and using Internet2 network as transit?*

=> Role for Internet2 to facilitate the creation/dissemination of template contracts or SLAs for backups? NYSERNet model extensible to other state nets or RONs?
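
On the question raised in the data center discussion above of how to tell when a backplane is oversubscribed, one rough first check is to compare the total line-rate capacity of the attached ports with the capacity of the backplane they share. The sketch below is illustrative only: the function name, the port count, and the capacity figure are hypothetical and were not part of the meeting. A ratio above 1 only shows that congestion is possible, not that it is happening; confirming that would need utilization or drop counters from the device.

```python
def oversubscription_ratio(port_speeds_gbps, backplane_gbps):
    """Return total attached port capacity divided by backplane capacity."""
    if backplane_gbps <= 0:
        raise ValueError("backplane capacity must be positive")
    return sum(port_speeds_gbps) / backplane_gbps


# Hypothetical line card: 48 x 1 Gb/s ports sharing a 20 Gb/s backplane connection.
ports = [1.0] * 48
ratio = oversubscription_ratio(ports, 20.0)
print(f"{ratio:.1f}:1")  # prints 2.4:1
```

A figure like this is only a planning aid; whether a given ratio is acceptable depends on how bursty the attached hosts (e.g. SANs) actually are.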
////////////////////////////////////////////////////////////////////////////

*Thu 15-Feb-07*

*Multilayer Bypass*
*- Optical*
*- Policy*
Michael Van Norman, UCLA
David Sinn, U. Washington

Is end-to-end still valid?

· "E2E is dead" in terms of having a clear path without intermediate boxes interfering?
· Policy bypass a better term than optical bypass - more accurate
· There are places where e2e is dead but we don't care - e.g. student systems.
· But for researchers we want to pass full flows
· Network technologies need to move closer to the edge?
· Fundamental topology question: is geo declining in favor of organizational?
· How do we provide various classes of service to geo distributed constituencies? On and off campus...
· How to develop an architecture that efficiently provides different equivalence levels of connectivity to arbitrary endpoints?
· How many gated communities are viable to support? Many are asking, all want to be isolated from each other
· When departments stand up their own firewalls, NAT, they then end up complaining about the net not working, getting in the way of what they want to do
· Terminology drivers - if we talk VLAN everyone asks for VLAN. Better to talk about a service profile, explore actual requirements, let us choose how to deliver what is needed...
· E2e may be needed among communities of interest, but not between them.
· How to stay out of the way of users trying to do legitimate work that may run into barriers?
· The more net disruption devices we insert, the more we limit what we can do. Perimeter defense implies lots of maintenance burden with firewall holes, VPNs, etc... Load balancers interacting with firewalls can be a problem also. All devices are "careless" in some way requiring intervention...
· Consensus that part of our constituency wants a clear channel, and some want a gated community - thus we need to figure out how to service both? Yes...
  Some who want clear channels may not actually need them, e.g. they look at traceroute and see the delays on each hop, and think this is a bigger deal than it really is...
· Raise awareness that the network itself is among the things that need to be protected.
· Are users certain they can trust every other user/device behind their firewall?
· IP address not a great proxy for identifying who you can trust, need to move to identity/credentials.
· Gated communities create problems, many users have different roles in different communities and move frequently between them - maybe carry infections or problems along with them.
· How many using NAC/NetAuth/endpoint security currently?
· Defense-in-depth - how valid in R&E? Perimeter firewalls, e.g., generally not a good fit... DiD ignores network availability.
· Need to carefully consider your security needs before determining what devices you need, and where they should live in the network.
· Many want to move firewalls as close as possible to the resources needing protection, makes policy more flexible but may make management more of a challenge - policies a moving target per device based on the needs of the day?
· Policy v. security plan for different types of data...
· Threat vectors are changing, and many perimeter defense strategies may not be keeping pace. Web mail a common example of a significant threat vector that will enter right through a tight firewall
· Sometimes installing a TippingPoint can be politically useful, blocking certain bad traffic can appease users, and provide data about the sources.
· Having a single unified control plane a viable goal? Mapping identities practical? MPLS to the edge?
· Proposal: keep v6 unencumbered, and focus our attention on v4 traffic? But changing IP transport doesn't affect endpoint vulnerabilities.
· Defense in depth - analogy to cameras v. locks? Right protections need to be applied in the right places.
· Signature-based anti-virus being effectively compromised since sigs only updated once per day, and viruses may be updated more frequently...
· We don't have good user feedback letting them know what is going on, what is causing problems.
· Practical to put all clients behind default inbound deny policy as a baseline?

*Optical/Policy Bypass*

Researchers doing work asking for GLIF/personal lambda environment, getting bypass from the campus network. Funding/cost recovery still being worked out...
Aggregating (MUX) traffic becoming more common, making life easier - going out over common link at the edge.
Try to sell standard L3 service if possible, or if not then L2, or if not then L1.

· Fiber becoming the default media being installed, making provisioning easier.
· Researchers don't want security staff in the middle, generally.
· Monitoring at high speeds a problem?
· Can the infrastructure handle the traffic spikes coming from the researchers?
· No one-size-fits-all answer for external connectivity, need to work out what makes sense in your environment
· Whose credentials will be used to establish a dynamic wave, for how long, how will they be managed?
· Terminology - users talking about lambdas when what they are really talking about is Ethernet, often L2 Ethernet
· Virtual *Public* Network - allow users to login to get direct external access outside the security infrastructure, e.g. for CS to get to nasty sites or pretend to be something else... This has the added benefit of useful logs, allowing security staff to know who is doing what and when in this context.
· What does a user actually mean by "the network is slow"?
· Users sometimes using VPN to bypass firewall policies they find problematic or just annoying, e.g. traffic shapers.
· SSL VPN
  · newer solutions are client based
  · concern about logging into kiosk using home credentials... keyloggers?
· Why SSL VPN?
  · clients are simpler... don't need to install clients
  · VPN clients are broken by the network... blocking IPSec, UDP, or TCP; users need to try different options
· Problems with a common commercial client: group password vulnerabilities (may have to do a MITM attack)
  · some campuses using mutual group auth (MGA) to eliminate vulnerability
  · limits interoperability with clients from other vendors
· Problems in managing AuthN at multiple devices becoming an argument for using VPN to manage AuthN coming in, to the extent this works for particular apps/hosts.
· VPNs create false sense of security? Downplay the need to boost security at the hosts?
· VPNs limit exposure to outside threats, practically speaking, and this is probably a primary reason many are so sold on them...
· VPNs relatively low maintenance and low cost to operate.
· Sometimes staff-generated scans are misperceived by users, requiring explanation.
· Security staff needs to communicate clearly with users (i.e. in plain English, not tech speak) what they are doing and why. E.g. blocking a service without an announcement causes many user complaints about the network, boosts help-desk load, bad PR, etc.

=> Wiki topic: Where you should place firewalls, defense in depth, maybe leading to a white paper? Tom Zeller - editor?

=> Wiki topic: Firewall service module war-stories?

////////////////////////////////////////////////////////////////////////////

*Timely topics: Dept firewalls*

· user isolation?
· Documentation
· firewalls
· generally
· DHCP
· Flow data lifecycle
· user feedback?
· multi-site coordination?
· type of data, sensors
· Darknets
· Wireless
· Authentication / 802.1x
· Mobility (embedded)
· Web middlebox
· NAC
· Remote access
· VPN, IPSec, SSL VPN
· RADIUS logs
· Log aggregation, view
· VoIP
· power
· QoS or 100G?
· Users tend to try to social engineer around restrictions (e.g. share access credentials to defeat being shut out of a resource) if they are able, and are often successful...
· NAC for wired as well as for wireless - provides traceability with minimal impact on users.
· MAC registration that is infrequent seems fairly low impact. UMN requires users to specify their dept and support contacts as well, because the enterprise directory is not a good source for this currently - e.g. when users have multiple roles/affiliations.
· What exactly are we accomplishing with MAC registration, given the current state of affairs - multiple MAC addresses on individual machines, and how easily they are changed/spoofed? What about when machines change owners? When you can reliably match a name to a MAC address it makes support easier, and what is the harm?
· What can we do in this area that is more beneficial? 802.1x really worth deploying? Service provisioning with differential classes of service by subnets useful... You can't have unmanaged devices, the supplicant needs to talk directly to the client.
· User education important, and once they become accustomed they don't seem to mind given the benefits they understand will flow to themselves, as well as to their fellow users.
· Issue not just whether machines are managed, but WHO is responsible for managing them?
· Anyone consider universal 802.1x for wired connections a good thing? Not yet, given the current state of it...
· Some using 802.1x, but more than one flavor... Provide options for authenticating?
· If 802.1x clients were better, more reliable, would that have a big effect on uptake?
· Need to differentiate between user identity and machine identity.
· Credentials will be cached if this option is available, easier for users.
· Lack of "hub" feature in 802.1x spec needs to be addressed... requirement to use unicast frames in a hub environment problematic.
· How can we have accountability at the port level? Is this driven by billing requirements?
· Anyone looking at tying into community wireless initiatives? CALEA a hindrance to this? Work to avoid duplicating coverage, reciprocal agreements e.g. granting an SSID dedicated for the university, condo APs. Contract with external service like iPass, for users on the fringes (e.g. coffeeshops)?
· Salsa-FWNA finalizing SAML/RADIUS draft (Steve Carmody) https://wiki.internet2.edu:443/confluence/x/kSo
· What are campuses doing with federations? Want a local knob to be able to deny access locally in case of visitors doing bad things - be able to do this easily and essentially the same way as internal users would be handled.
· Some using multiple access modes - choose VPN or Shib depending on the app/user's needs (attributes required or useful?), and whether you are local or visitor.
· What are the implications of stratified service levels? Visitors are limited e.g. to port 80 or VPN. If user can show affiliation (including sponsored guests) they are granted more access. Abuse case: student sponsoring guest IDs and using them to rack up DMCA violations... What are the repercussions to students for bad behavior, aside from DMCA violations? Extreme case - banished from the network, after what process?
· What are successful strategies for dealing with high-density wireless requirements, e.g. large lecture halls? No clear consensus, several vendors being tried.
· What about VoIP over wireless - big problem coming soon? Dedicated VLAN, but what about wireless coverage/capacity?
· What about WiMAX? Limited number of carriers poised to jump in to the market so far...

=> Wiki topic: Wireless Net monitoring instrumentation strategies

· EDDY
· Log aggregation
· performance monitoring v. Nagios/OpenView
· Active v. passive monitoring
· perfSONAR
· What will work for 10G?

////////////////////////////////////////////////////////////////////////////

Organization
============

Organizational discussion about NetGuru: what are the strengths, what are the weaknesses, what's the membership, what's the leadership?
*Strengths*

· open discussion
· can explore tangents: enough time..
· appropriate technical level
· appropriate sizing - discussion is fostered by the size
· common problems and commonality of solutions
· campus level focus
· certain size and scale
· 1.5 days - a good amount of time
· can let a "jazz" go.. explore good ideas
· limited structure - gives flexibility
· enable veering off track
· relatively low overhead for organizers
· range of skills and level of responsibility
· cross-pollination of this group with other groups (working groups, etc.)
· this group has no decision-making or policy agenda
· no vendors
· the group can help inform us of what to do in certain circumstances
· "this is useful"

*Weaknesses*

· some unsure how to make the most of the netguru experience
· forgot that the mailing list existed
· communicating netguru information to the broader community
· need a new name

*NOTE WELL*: All Internet2 Activities are governed by the Internet2 Intellectual Property Framework [1].

NetGuru [2] || Internet2 Security [3] | SALSA [4]

1: http://members.internet2.edu/intellectualproperty.html
2: http://security.internet2.edu/netguru/
3: http://security.internet2.edu/
4: http://security.internet2.edu/salsa/