NRP Engagement Webinar
Typically on the fourth Monday of the month at 1 ET - 12 CT - 11 MT - 10 PT
Zoom: see your email for coordinates!
Internet2 Research Engagement YouTube Playlist
To hear about future activities, please join the NRP engagement email list via this link.
Feel free to share this with anyone who may be interested. Calls will typically be on the fourth Monday of the month.
If you would like to present or have a suggestion for a session, please write to Dana Brunson.
PAST CALLS:
Monday, August 24, 2020 at 1 ET - 12 CT - 11 MT - 10 PT
Title: The Pacific Research Platform, National Research Platform, Global Research Platform and Nautilus
Presenters:
Dmitry Mishin: Applications Developer at University of California San Diego
Dmitry graduated with a Master's degree from the Moscow Institute of Radio Engineering, Electronics and Automation, specializing in computing machines, complexes, systems and networks. He also holds a PhD in Geophysics from the Earth Physics Institute, Russian Academy of Sciences. He currently works as an Applications Developer at the University of California San Diego: at the San Diego Supercomputer Center on Comet supercomputer system enhancements, and at Calit2 on supporting and expanding Nautilus, a global Kubernetes cluster. His development work focuses on supporting large-scale computation, data visualization, IoT, and data storage solutions. His research interests are HPC systems, data storage and access, microservice architectures, and performance measurement and analysis.
John Graham: Senior Development Engineer at University of California San Diego
Recording: https://www.youtube.com/playlist?list=PLLIsQFBBoJG_jXSxkoGtDzcFjcrkT2y-2
Slides: https://docs.google.com/presentation/d/1Vrbf_-a9VZ1GSck9Z850BlfrlaxksvdB7oDNHca_4EA/edit#slide=id.g90f5ee2b76_0_22
Abstract:
The PRP/NRP/GRP now represents a partnership among more than 50 institutions. Most host Flash I/O Network Appliances (FIONAs), which are rack-mounted PCs. FIONAs are advanced Science DMZ Data Transfer Nodes, optimized for 10-100Gbps data transfer and sharing. Many sport two to eight GPU add-in boards.
165 of these 10-100G connected FIONAs have been joined into a "hyper-converged cluster" called Nautilus, which uses Google's open-source Kubernetes to orchestrate software containers across the distributed system. Kubernetes is a now widely adopted way to manage containerized software. In fact, more than two thirds of the Fortune 500 companies have adopted it, and it is available within all the major commercial cloud providers. Nautilus currently has 550 GPUs, 6000 CPU cores, and more than 2PB of disk, all distributed among campus Science DMZs.
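As a rough sketch of the orchestration model described above, a Kubernetes pod manifest along the following lines is how a containerized job typically requests a GPU from a cluster scheduler. The pod name, container image, and resource values here are illustrative assumptions, not Nautilus-specific documentation:

```yaml
# Hypothetical pod spec: runs one container and asks the Kubernetes
# scheduler for a single GPU plus a few CPU cores and some memory.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-example            # hypothetical pod name
spec:
  restartPolicy: Never
  containers:
  - name: compute
    image: nvidia/cuda:12.2.0-base-ubuntu22.04  # any CUDA-capable image
    command: ["nvidia-smi"]    # prints the GPU the pod was allocated
    resources:
      limits:
        nvidia.com/gpu: 1      # GPUs are requested as an extended resource
        cpu: "4"
        memory: 8Gi
```

A user with cluster access would submit this with `kubectl apply -f pod.yaml`, and the scheduler places the container on a node (e.g., a FIONA) with a free GPU.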
John Graham will talk about how you can participate in and explore Nautilus, add your own node to Nautilus to get the benefits of informal Potluck Supercomputing (PLSC), and how to build a Kubernetes cluster that will federate with Nautilus, making it a persistent experiment in community-based infrastructure. He will explain how RocketChat has helped create a community of users who need to share big data and compute on it.
Dima Mishin will then discuss Nautilus' use of Ansible to automate sysadmin tasks, Ceph storage pools, advanced measuring and monitoring, as well as Nautilus' coming federation with the new NSF SDSC Expanse supercomputer and its data-centric architecture. He will finish up with their new work with InMon and Reservoir Labs.
Monday, June 22, 2020 at 1 ET - 12 CT - 11 MT - 10 PT
Recording: The Eastern Regional Network - Simplifying Multi-Campus Research Collaborations
Abstract:
The Eastern Regional Network (ERN) was formed to address challenges related to simplifying multi-campus collaborations and partnerships in the Northeast that advance the frontiers of research, pedagogy, and innovation. The ERN is first and foremost a network of people interested in pursuing this vision, and who manage and use the campus and regional research computing, data, storage and network resources that can make it happen. We have chosen to make this a regional effort for two principal reasons: (1) face-to-face interactions are relatively straightforward and inexpensive to arrange; (2) the characteristics of our region are unique – for example, the Northeast contains eight different state university systems in a geographic area whose size is comparable to that of California. During this presentation I will give a brief overview of the ERN, starting with motivation, mission and vision, and following with a summary of the steps that we are taking to realize that vision, including accomplishments to date and ambitions for the future.
Bio:
Dr. James Barr von Oehsen is the Associate Vice President of the Rutgers University Office of Advanced Research Computing (OARC). Dr. von Oehsen is responsible for providing strategic leadership in advancing Rutgers University's research and scholarly achievements through next generation computing, networking, cloud services, and data science infrastructure. Prior to joining Rutgers, he was employed by Clemson University Computing and Information Technology (CCIT) as the Executive Director of the Cyberinfrastructure Technology Integration (CITI) group. Dr. von Oehsen has extensive experience working with diverse campus research communities throughout the nation as well as within the US industry sector. His interests are in advanced research computing, data science, machine/deep learning, cybersecurity, mathematical modeling, commercial and campus cloud solutions, and hardware architectures. In 2018 he received the NJEdge Technology Innovation Award for his work involving the convergence of software defined networking, research computing, tiered storage, commercial cloud, and federation of services. He is also a founding member of the Eastern Regional Network, a consortium of universities, regional network providers, and research centers with a vision to simplify multi-campus collaborations and partnerships that advance the frontiers of research, pedagogy, and innovation.
Monday, April 27, 2020 at 1 ET - 12 CT - 11 MT - 10 PT
Dan Schmiedt, "Life of a Packet"
Abstract:
Early in his career as a network engineer, while interviewing at a large networking company, the interview team lead told Dan, "You have an impressive-looking resume. But we don't really care about that." He pointed at a whiteboard. "Tell us exactly how a packet is built, travels across a network, and arrives at a destination host, and don't leave anything out."
Dan nervously stood up and started drawing and telling the story of the life of a packet. Heads nodded and he got the job, although he didn't stay long before he decided to go back to work at Clemson.
Today, any time someone asks Dan how a network works, he stands in front of a whiteboard and tells the story of the life of a packet.
Bio: Dan Schmiedt is the Director of Network Services and Telecommunications at Clemson University, where he has worked (with a few meaningful interruptions) since he was a student employee there in the early 1990s.
Monday, March 23, 2020 at 1 ET - 12 CT - 11 MT - 10 PT
Inter-campus collaboration in the era of big data enabled by the Pacific Research Platform with UCSC, WUSTL, and UCSF - David Parks
Abstract:
UCSC is utilizing the PRP in collaboration with both Washington University in St. Louis (WUSTL) and UCSF. These collaborations involve cross-campus work on datasets that are well into multi-terabyte sizes, and the PRP provides an enabling platform for them. The Hengen lab, led by Keith Hengen at Washington University in St. Louis, is producing terabyte- and even petabyte-scale data from novel longitudinal studies of neural activity in live mouse subjects. The PRP has enabled a collaboration between UCSC and WUSTL in which one campus provides the data, the other provides the algorithms and compute, and both meet seamlessly on the PRP. UCSF is recording organoid cell cultures of human glial neural cells, producing individual recordings that can reach terabyte scale; the PRP is enabling standard tools and analysis of the data shared between UCSC and UCSF. At UCSC we are scaling up live organoid cell culture experimentation such that hundreds or even thousands of simultaneous experiments can be undertaken in parallel, generating massive datasets, and the PRP enables scalable processing and streaming solutions. In this talk, we will introduce the science being performed at the scale of big data and how the PRP is enabling both the collaboration and the science in novel ways.
Bio:
David Parks is a graduate student researcher at UCSC pursuing his Ph.D. in Biomolecular Engineering with a focus on deep learning technologies. He works under Professor Haussler in the Braingeneers lab, a multi-disciplinary lab scaling up cell culture experimentation and bringing it into the open-source ecosystem. David has over a decade of experience in Silicon Valley working in enterprise software and deploying big data systems on platforms such as Hadoop.
Monday, February 24, 2020 at 1 ET - 12 CT - 11 MT - 10 PT
IceCube Computing Grid - Benedikt Riedel
Abstract: We present how the IceCube computing efforts have evolved over the last 15 years from mostly home-grown solutions to a globally distributed computing infrastructure. We will highlight advantages and disadvantages of such an approach for an experiment with broad science goals ranging from astrophysics to particle physics to geophysics, and what we see in our future as we engage in more external collaborations in computing.
Bio: Benedikt Riedel is the Global Computing Coordinator for the IceCube Neutrino Observatory and Computing Manager for the Wisconsin IceCube Particle Astrophysics Center. Previously he worked on the Open Science Grid at the University of Chicago. He received a Ph.D. in 2014 from the University of Wisconsin-Madison working on supernova neutrino signals in the IceCube Neutrino Observatory.
Monday, January 27, 2020 at 1 ET - 12 CT - 11 MT - 10 PT
Running a 380PFLOP32s GPU burst for Multi-Messenger Astrophysics with IceCube across all available GPUs in the Cloud
Igor Sfiligoi and Frank Würthwein
The IceCube Neutrino Observatory is the National Science Foundation's (NSF) premier facility to detect neutrinos with energies above approximately 10 GeV and a pillar of NSF's Multi-Messenger Astrophysics (MMA) program, one of NSF's 10 Big Ideas. The detector is located at the geographic South Pole and is designed to detect interactions of neutrinos of astrophysical origin by instrumenting over a gigaton of polar ice with 5160 optical sensors. The sensors are buried between 1450 and 2450 meters below the surface of the South Pole ice sheet. To understand the impact of ice properties on incoming neutrino detection and origin, photon propagation simulations on GPUs are used. We report on a few-hour GPU burst across Amazon Web Services, Microsoft Azure, and Google Cloud Platform that harvested all GPUs available for sale across the three cloud providers the weekend before SC19, reaching over 51k GPUs total and 380 PFLOP32s. The GPU types span the full range of generations from the NVIDIA GRID K520 to the most modern NVIDIA T4 and V100. We report the scale and science performance achieved across all the various GPU types, as well as the science motivation to do so.
Igor Sfiligoi is Lead Scientific Software Developer and Researcher at UCSD/SDSC. He has been active in distributed computing for over 20 years. He started in real-time systems, moved to local clusters, and worked with leadership HPC systems, but has spent most of his career in computing spanning continents. For about 10 years, he worked on one such world-wide system, called glideinWMS, which he brought from the design table to being the de-facto standard for many scientific communities. He has recently turned his attention to supporting users on top of Kubernetes clusters and Cloud resources. He has an M.S. in Computer Science equivalent from Universita degli studi di Udine, Italy. He has presented at many workshops and conferences over the years, with several published papers.
Frank Würthwein is the Executive Director of the Open Science Grid, a national cyberinfrastructure to advance the sharing of resources, software, and knowledge, and a physics professor at UC San Diego. He received his Ph.D. from Cornell in 1995. After holding appointments at Caltech and MIT, he joined the UC San Diego faculty in 2003. His research focuses on experimental particle physics and distributed high-throughput computing. His primary physics interests lie in searching for new phenomena at the high energy frontier with the CMS detector at the Large Hadron Collider. His topics of interest include, but are not limited to, the search for dark matter, supersymmetry, and electroweak symmetry breaking. As an experimentalist, he is interested in instrumentation and data analysis. In the last few years, this has meant developing, deploying, and now operating a worldwide distributed computing system for high-throughput computing with large data volumes. In 2010, "large" data volumes were measured in Petabytes; by 2025, they are expected to grow to Exabytes.
Monday, October 28, 2019 at 1 ET - 12 CT - 11 MT - 10 PT
Running Genomics Workflows on the Pacific Research Platform's Nautilus Kubernetes Cluster
Alex Feltus, Ph.D.
Professor | Genetics & Biochemistry @ Clemson University; Co-Founder | Praxis AI
Abstract: Our core biological research mission is to discover causal alleles underlying complex trait expression in plants and animals. Active projects include (A) discovery of genetic subsystems driving legume-microbe symbiosis that can be engineered into other plants so they can make their own fertilizer, (B) elucidation of gene expression pattern shifts between normal and disordered brain tissue for better diagnosis of intellectual disability, and (C) detection of tumor-specific alterations in kidney and other tumors of relevance to precision medicine. Our scientific instrument is the high performance/throughput computer, where we run bioinformatic, machine learning, and network biology workflows on tens to thousands of terabytes of in-house and open source deep DNA sequencing datasets. In recent years, we have wrapped applications in containerized NextFlow workflows and now run data-intensive experiments on Kubernetes (K8s) clusters including the PRP Nautilus cluster (we have added a node at Clemson) and the Google Cloud Platform. In this webinar we will (A) present results from a large tumor biomarker screen generated with the Nautilus cluster, (B) describe broadly useful open source genomics workflows (GEMmaker, KINC, and Gene Oracle) with Nautilus-specific usage documentation, (C) outline a grassroots strategy to add nodes to the Nautilus cluster and train people how to use that super-awesome system, and (D) discuss a business model where one can build an elastic K8s cluster for a small virtual organization that can be dynamically linked to larger national compute fabrics via aggregation or federation.
Dr. F. Alex Feltus received a B.Sc. in Biochemistry from Auburn University in 1992, served two years in the Peace Corps, and then completed advanced training in biomedical sciences at Vanderbilt and Emory. Since 2002, he has performed research in bioinformatics, high-performance computing, cyberinfrastructure, network biology, genome assembly, systems genetics, paleogenomics, and bioenergy feedstock genetics. Currently, Feltus is a Professor in Clemson University's Dept. of Genetics & Biochemistry, CEO of Allele Systems LLC, Core Faculty in the CU-MUSC Biomedical Data Science and Informatics (BDSI) program, a member of the Center for Human Genetics, and serves on the Internet2 Board of Trustees as well as various "Advanced Research Computing" engagement workgroups. Feltus has published numerous scientific articles in peer-reviewed journals and teaches undergrad and PhD students in bioinformatics, biochemistry, and genetics. At present, he is funded by multiple NSF grants and is engaged in tethering together extremely smart people from diverse technical backgrounds in an effort to propel genomics research from the Excel-scale towards the Exascale.