NTAC Call 9-3-13

v6 WG Alan:
The next WG call is set for 3:00 PM ET on Sept 19th. Generally these calls are held on the third Thursday of the month at 3:00 PM ET.

The last call included a discussion of whether IPv6 was done in the core and connectors. Consensus was yes. However, much work remains to be done in the server farm areas.
The WG was clear that there is a need for more measurements of actual v6 activity at all levels. This could include:

  • MRTG graphs that differentiate v4 and v6
  • Flow based measurements when more granularity is needed.
  • Separate measurements of TR-CPS links.
  • Continued look at R&E links

There was also some discussion about whether multicast was still relevant.
Phil Benchoff also led a discussion about the inclusion of v6 in purchasing requirements.

Peering and Routing:
Darrell gave a brief overview of TR-CPS activity.

Performance WG Ken Miller:
perfSONAR 3.3 and 3.3.1 have been released. These include updates and new features.
The WG continues to look at network stats and reporting tools and their response times.
Ken has been doing v4 and v6 reports at Penn State that are based on flow data. He will pass that information along to the v6 WG. It could be used to create a dashboard.
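
As a rough illustration of the kind of flow-based v4/v6 split mentioned above, the sketch below computes the share of bytes carried over each IP version. It is only a minimal example under assumptions: the record layout (source address, byte count), the function name `ip_version_share`, and the sample data are hypothetical and are not taken from the Penn State reports.

```python
import ipaddress
from collections import defaultdict


def ip_version_share(flows):
    """Return the fraction of bytes seen per IP version.

    `flows` is an iterable of (source_address, byte_count) tuples.
    This record layout is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    for src, nbytes in flows:
        # ipaddress.ip_address() parses both IPv4 and IPv6 literals
        # and exposes the version as 4 or 6.
        version = ipaddress.ip_address(src).version
        totals[version] += nbytes

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {4: 0.0, 6: 0.0}
    return {v: totals.get(v, 0) / grand_total for v in (4, 6)}


# Hypothetical sample records using documentation address ranges.
sample = [
    ("192.0.2.10", 3000),
    ("2001:db8::1", 1000),
    ("198.51.100.7", 1000),
]
shares = ip_version_share(sample)
print(f"IPv4: {shares[4]:.0%}  IPv6: {shares[6]:.0%}")
```

A dashboard could plot these two shares over time per link or per site; real flow exports would need their own parsing step in front of this function.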

SDN - Dan:
There has been discussion in the WG about helping the community get information on AL2S/DYNES and using these with GENI racks.
The WG also talked about OpenDaylight. The first official OpenDaylight release is set for Dec 9.

Internet2 reports Chris:
Update on the Amazon Issue:
Reports first emerged about 10:00 PM ET on Monday from LONI about traffic to Amazon.

  • Internet2 began working with Amazon.
  • This process did reveal some areas where work needs to be done.
    Amazon had done some maintenance around 3:00 AM Tuesday, as seen on the MRTG graphs.
  • Around 9:30 AM or 10:00 AM on Tuesday it was clear that the circuits had not recovered fully.
  • Sites were starting to move traffic to other paths around this time.
  • This was good for them but made it harder to track the problem since there was no traffic.
  • Worked with Amazon on traffic from Amazon to Internet2 for several hours.
  • Realized afterward that there seemed to be a problem between Ashburn and McLean.
    This was not seen until traffic levels grew.
  • Tried several approaches
  • Turned down peering.
  • Needed to find a better way to put traffic on the link.
  • Reloaded some line cards which seemed to help some.
  • Moved the Amazon peering to Cleveland at around 3:30 PM Tuesday. That seems to have resolved the problem.

Around 2:00 PM ET started to see some ASTRA issues on SIP.

  • At first thought these were unrelated.
  • Ultimately this traffic was seen to traverse McLean.
  • Reset all of the line cards at McLean.
  • Seems to have addressed the issue.

Chris indicated a more complete report would be forthcoming later.

Darrell asked whether the problems were actually fixed. Chris responded that he thought the issue was resolved for now, though perhaps not yet fully understood. Internet2 is still working with ASTRA to determine whether all of their issues have been resolved.

Interface Flapping Update:

Many changes have been made, moving LR10s of various types and LR4s around. These changes do seem to have resolved the issues.
Overall there seemed to be two separate problems: one was a faulty module in Cleveland, the other was problems with LR10s. The initial problem in Cleveland was not noticed because of the optics originally being used there. Once they were replaced and PM data became available, it was clear there was a module problem.

Work is still being done to find a reproducible cause, and this is still being watched closely. Chris will send out another update on this soon.

AL2S/AL3S:

  • Initially it was decided to wait till after the holiday to make any changes.
  • On Tuesday (3rd) the Amazon issues delayed making this change.

Darrell asked about the methodology for reporting on the availability of interfaces and ISIS uptime. Chris indicated this had been described in the initial emails but that he would resend it.

Chris Griffen asked about notification procedures and whether notifications should have been sent earlier. Chris said he agreed that more and earlier notification should have gone out. Part of the reason it did not was that the errors were not an outage and did not trigger an alarm, so the NOC did not send out notifications. This will need to be looked into.
Dave Farmer also brought up that it is important to be able to recognize when the network moves from a low-level minor issue to something that is broad and serious. Grover indicated that this particular topic was important and would fall within the scope of one of the discussion topics for the face to face meeting in late September.
Dave Diller pointed out that some problems with AMPATH and RNP traffic (likely on A-Wave) had been reported but were cleared up by the line card resets. He will send more information to Chris.

In general, Chris asked the community to send along any indications they have had of issues. This will help in determining the scope and nature of the problems.

AL2S Installs:
Jackson, MS; Portland; and Pittsburgh were done last week. Charlotte and Philadelphia are in motion. There are several (5 or 6) more that could come up when demand for them emerges.

Sounding Board:

David Crowe will send out an invitation. Volunteers are welcome over the next couple of weeks. The goal is to have this set before the Face to Face.

Elections:
Michael Lambert presented the approach listed in the agenda. It was decided that he would resend the proposal separately (done on Wed morning) and if there were no major objections over the next week it would be considered adopted.

Face to Face meetings:
The meeting will focus on several distinct issues, each taken up by a smaller group. The groups will prepare drafts on their topics, and these drafts will be sent on to the NTAC later.

  • These will hopefully become more regular events (yearly or so?).
  • There are 30 attendees and 5 staff.
  • Grover will send out a note on Wed about the topics that will be discussed.

Net+ services:
Blue Jeans Network is coming up soon. The Microsoft peering is being worked on. Discussion will continue on Wed. A quick update will be sent to the list.

In attendance (hopefully correct - there was some confusion due to the initial dialing problems):
Matt Z, Alison F., Shuman Huque, Darrell Newcomb, Hans Addleman, Andrew Lee, Linda Winkler, Jeff Ambern, Chris Robb, Paul Schopis, Mark Johnson, David Crowe, John Moore, Michael Lambert, Jay Ford, Jeff S., Brian Cashman, Joe Breen, Ryan Vaughn, Dave Diller, Tom Lehman, Quang, Chris Griffen, Robert Nordmark, Grover B., Tony, Alan Whinery, Eric Boyd, Cort Buffington, Ken Miller, Dan S, Dave Farmer et al.
