
Welcome to the NET+ Google Cloud Platform (GCP) wiki

Many Internet2 member institutions take advantage of this service offering. If your institution is one of them, this wiki provides details on how to make the most of your participation in the program and interact with peers across Internet2 member institutions.

This program is open to all US higher education institutions. There are additional access fees for institutions that are not members of Internet2.  For details on how to join the program, please email netplus@internet2.edu. 

You can also find out more about the Internet2 Cloud Connect offering for GCP Partner Interconnect.

Service Documentation and Resources

Accessibility:

Identity:

  • GCP identities are closely tied to the GSuite environment - schools currently use Google Cloud Directory Sync to sync users and to populate Google Groups
  • If you are interested in working on a recommendation for dynamic group population and Role mapping for authorization management - email netplus@internet2.edu

Information Security:

Contract and Pricing:

Community Resources

Participate in our Subscriber Community:

Institutions participating in the NET+ GCP program may take advantage of our email discussion list and Slack channel to receive curated program updates and participate in other activities and events.

The NET+ GCP Service Advisory Board hosts regular subscriber calls where campus cloud teams meet to discuss their challenges, share lessons learned, and collaborate to find the best answers for their institutions' GCP deployments. We regularly bring in Google engineers or product managers to discuss services and give feedback on how GCP features could best serve the unique needs of higher education institutions.

Please contact Bob Flynn (bflynn@internet2.edu) to be added.

Join the GCP Community Forum (Open to all community members):

Users of GCP are encouraged to join the #google channel in the EDUCAUSE Cloud Community Group Slack. See the Higher Ed Cloud Community page on the Cloud Wiki for instructions to join.

Collaborate on the Cloud Wiki:

Speaking of community, did you know about the Cloud Wiki? This was created specifically for YOU, members of the higher education community to collaborate with each other. Log in to see a Cloud Job descriptions page and contribute your knowledge!

Contribute Code:

Looking to share your latest Terraform config? Add it to the Cloud Wiki Helpful GitHub Repos list or email sjeanes@internet2.edu to request access and create a repo in the Community Cloud Config GitHub organization.

Questions on Offers, Distributors and Resellers, Agreement Structure:

Find answers to frequently asked questions in these Knowledge Base articles.

Key Program Updates

Subscribers may review our mailing list archives for monthly program and GCP updates.

Here at Internet2, we are fortunate to be working with a wonderful group of students from Notre Dame's Master of Science in Business Analytics program. The group is working to help us gain insight from detailed usage data we get from the NET+ AWS and GCP programs. We hope to use that data to observe emerging patterns of cloud infrastructure use in higher education and research, and to use that knowledge to help the community support effective cloud use.

In order to provide analytic access to the data, which is kept in Google Big Query tables, we wanted to provide the students with a Jupyter notebook environment where they would not need to download or store the data on their own personal laptops while they work with us. This post documents how we are providing that environment using Managed Notebooks in GCP's Vertex AI Workbench.

We have set up a Google Group for the class project, containing the members of the class as well as the Internet2 staff working on the project with them. To allow group members to create notebooks, we granted the group the Notebooks Admin role within our GCP project (as described in https://cloud.google.com/vertex-ai/docs/workbench/user-managed/iam). Open question: Would Notebooks Runner be adequate for our purposes?

Because we only have four students in our group, we used the GCP Console to create the notebooks manually. The process could be automated for larger or repeated use (or one could use Google's RAD Lab Data Science repo).

The process for creating Managed Notebooks is documented here: https://cloud.google.com/vertex-ai/docs/workbench/managed/create-managed-notebooks-instance

At present Managed Notebooks are only available for a single user, so we created an individual instance for each student, naming each notebook with the student's email identifier. Each notebook can be assigned a single owner (at the bottom of the Advanced Settings screen), which is where you assign the notebook to the student's email address.
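The naming convention above can be sketched in a few lines of Python. This is only an illustration: the function name, the `nd-capstone` default prefix (taken from the example name in the student instructions further down), and the choice to replace anything other than ASCII letters and digits with a hyphen are assumptions, not part of the process we actually followed in the console.

```python
def notebook_name(email: str, prefix: str = "nd-capstone") -> str:
    """Derive a per-student notebook name from an email address.

    Hypothetical helper: takes the part before the '@', lowercases it,
    and replaces any character that is not an ASCII letter or digit
    with a hyphen so the result is safe to use as an instance name.
    """
    local_part = email.split("@", 1)[0].lower()
    cleaned = "".join(
        ch if (ch.isascii() and ch.isalnum()) else "-" for ch in local_part
    ).strip("-")
    return f"{prefix}-{cleaned}"
```

For example, `notebook_name("jdoe@example.edu")` returns `nd-capstone-jdoe`, and the same email address would be entered as the notebook's single owner.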

To help in managing costs, we reduced the size of the instances from the default n1-standard-4 to n1-standard-2, and reduced the idle timeout period from 180 minutes to 60 minutes.

The result of creating notebooks manually in the console is a running notebook process, viewable in the Vertex AI Workbench screen in the console. We then stop those processes, as we will rely on the students to start them up when they want to use a notebook.

Giving the notebooks access to our BigQuery tables required assigning the BigQuery Read Session User role to our group. The group already had the BigQuery Data Viewer and BigQuery Job User roles assigned within our project.
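We assigned these roles by hand, but for repeated use the bindings could be scripted. The sketch below only builds the `gcloud projects add-iam-policy-binding` command lines for the four roles mentioned in this post; it does not run them. The project ID and group address in the usage note are placeholders, and the helper name is hypothetical.

```python
import shlex

# Role IDs for the roles named in this post.
GROUP_ROLES = [
    "roles/notebooks.admin",           # Notebooks Admin
    "roles/bigquery.dataViewer",       # BigQuery Data Viewer
    "roles/bigquery.jobUser",          # BigQuery Job User
    "roles/bigquery.readSessionUser",  # BigQuery Read Session User
]


def binding_commands(project_id, group_email, roles=GROUP_ROLES):
    """Return one gcloud command string per role for a Google Group."""
    commands = []
    for role in roles:
        args = [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member=group:{group_email}",
            f"--role={role}",
        ]
        # shlex.join quotes arguments where needed for a POSIX shell.
        commands.append(shlex.join(args))
    return commands
```

Calling `binding_commands("YOUR_PROJECT_ID", "capstone-group@example.edu")` yields four command strings that can be reviewed before being pasted into a shell. If Notebooks Runner turns out to be sufficient, only the first entry in the list would need to change.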

The process for accessing Big Query data from a Jupyter notebook is documented here: https://cloud.google.com/bigquery/docs/visualize-jupyter

Because we are using GCP Managed Notebooks, all the necessary pieces for accessing Big Query are pre-installed (as are the usual Python data science modules), so the notebooks are ready to go once started.

We anticipate very low costs for using this service: Managed Notebooks are currently in Preview, and there is no management fee for managed notebooks while in Preview. The instance costs for the n1-standard-2 machines are $0.10 per hour. There can be costs for queries submitted to Big Query, but we anticipate that our uses will remain well within the free tier of Big Query usage.
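As a rough worked example of the instance cost, the arithmetic is simply students × hours × hourly rate. The $0.10 rate comes from the paragraph above; the usage figures in the usage note are made-up assumptions for illustration, and BigQuery charges beyond the free tier are not included.

```python
HOURLY_RATE_USD = 0.10  # n1-standard-2 rate quoted in this post


def estimated_instance_cost(students, hours_per_week, weeks,
                            hourly_rate=HOURLY_RATE_USD):
    """Estimate total instance cost in USD for running notebooks.

    Covers only the time instances are running; it ignores any
    BigQuery query charges.
    """
    total_hours = students * hours_per_week * weeks
    return round(total_hours * hourly_rate, 2)
```

Under the assumption of four students each running their instance 10 hours a week for 12 weeks, `estimated_instance_cost(4, 10, 12)` gives 48.0, or about $48 for the whole project. Time spent idle before the timeout shuts an instance down counts as running time, which is one reason we shortened the idle timeout.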

Many thanks to Maddie Howe for helping to test and troubleshoot this process!

We sent out the following instructions to the students to let them know how to access their notebooks.

I’ve set you each up with a Jupyter environment in our GCP organization for work on the capstone project. 
To get to the environment, follow these instructions:
  1. Go to the Managed Notebooks page in the GCP console: 
    https://console.cloud.google.com/vertex-ai/workbench/list/managed
  2. You should see a notebook named with your email id – e.g. nd-capstone-jdoe
  3. Click in the checkbox next to your notebook name and then click on the Start icon up on the Workbench 
    menu line at the top of the page.
    (if you don’t see the Start icon, click on the three dots there and you will).
    It takes 5-10 minutes to spin up the instance.
  4. Once your instance is running, click on Open JupyterLab and you’ll get a new tab with 
    JupyterLab – that can also take a few minutes.
  5. You can then start a new notebook.
  6. You should be able to access our Big Query tables as documented here:

    https://cloud.google.com/bigquery/docs/visualize-jupyter
A sample query to test (run it in its own notebook cell, with %%bigquery as the first line):

%%bigquery testdf
SELECT DISTINCT Product_Name FROM `projectname.datasetname.tablename`
ORDER BY Product_Name

That will put the result of the query in a pandas dataframe called testdf. To verify, run this in a separate cell:
print(testdf)

A few notes:
- When you’re done using Jupyter, please go back into the console and stop your instance.
- The instances time out after 60 minutes of no use, so it’s not the end of the world
if you don’t stop it, but it’s a good practice to get into.
- The instances are not huge – 2 CPU, 7.5 GB of RAM, no GPU, 100 GB of disk. If you need more power, please let me know.
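The sample query in the instructions above uses the %%bigquery cell magic. For anyone who prefers plain Python, here is a minimal sketch of the same query using the google-cloud-bigquery client library, which we assume is available in the notebook environment. The helper names are hypothetical, and the table name remains a placeholder exactly as in the instructions.

```python
def distinct_products_sql(table: str) -> str:
    """Build the sample query for a fully qualified table name."""
    return (
        f"SELECT DISTINCT Product_Name FROM `{table}` "
        "ORDER BY Product_Name"
    )


def run_query(sql: str):
    """Run a query and return the result as a pandas DataFrame.

    The import is inside the function so the SQL helper above can be
    used even where google-cloud-bigquery is not installed.
    """
    from google.cloud import bigquery

    client = bigquery.Client()  # uses the notebook's default credentials
    return client.query(sql).to_dataframe()
```

In a notebook cell, `testdf = run_query(distinct_products_sql("projectname.datasetname.tablename"))` should produce the same dataframe as the cell-magic version, provided the placeholder is replaced with a real table the group can read.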

Update: March 9, 2022

Aaron Gussman from Google sent along an example of using the notebooks API to create a managed notebook instance, which doesn't appear to be in Google's documentation anywhere yet.

Here is the API example to create a Managed Notebooks runtime with Idle Shutdown settings:

BASE_ADDRESS="notebooks.googleapis.com"
LOCATION="us-central1"
PROJECT_ID="YOUR_PROJECT_ID"
AUTH_TOKEN=$(gcloud auth application-default print-access-token)
RUNTIME_ID="my-runtime"
OWNER_EMAIL="YOUR_EMAIL"

RUNTIME_BODY="{
  'access_config': {
    'access_type': 'SINGLE_USER',
    'runtime_owner': '${OWNER_EMAIL}'
  },
  'software_config': {
    'idle_shutdown': true,
    'idle_shutdown_timeout': 180
  }
}"

curl -X POST "https://${BASE_ADDRESS}/v1/projects/${PROJECT_ID}/locations/${LOCATION}/runtimes?runtime_id=${RUNTIME_ID}" \
 -d "${RUNTIME_BODY}" \
 -H "Content-Type: application/json" \
 -H "Authorization: Bearer $AUTH_TOKEN" -v
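For creating several runtimes in a loop (one per student), the same request can be prepared from Python. The sketch below mirrors the curl example field for field and uses only the standard library; it builds the request object but does not send it. The function name is hypothetical, and the access token is assumed to come from the same gcloud command used above.

```python
import json
import urllib.parse
import urllib.request

BASE_ADDRESS = "notebooks.googleapis.com"


def build_runtime_request(project_id, location, runtime_id, owner_email,
                          access_token, idle_timeout_minutes=180):
    """Build (but do not send) the POST request that creates a runtime."""
    body = {
        "access_config": {
            "access_type": "SINGLE_USER",
            "runtime_owner": owner_email,
        },
        "software_config": {
            "idle_shutdown": True,
            "idle_shutdown_timeout": idle_timeout_minutes,
        },
    }
    query = urllib.parse.urlencode({"runtime_id": runtime_id})
    url = (
        f"https://{BASE_ADDRESS}/v1/projects/{project_id}"
        f"/locations/{location}/runtimes?{query}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit it. Setting `idle_timeout_minutes=60` would match the shorter timeout we chose in the console. We have not tested this Python variant against the API ourselves, so treat it as a starting point.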






NET+ GCP Service Advisory Board Membership

  • John Bailey, Washington University in St. Louis (Chair)
  • James Bennett, Indiana University
  • Damian Doyle, University of Maryland, Baltimore County
  • Stratos Efstathiadis, New York University
  • Jay Hartley, Massachusetts Institute of Technology
  • Joshua Hyman, University of Pittsburgh
  • Atul Pala, San Jose State
  • Brian Pasquini, University of Pittsburgh
  • Sam Porter, University of Maryland
  • Rick Rhoades, Penn State University
  • Jim Thomas, Indiana University
  • Bob Flynn, Internet2, Staff Liaison

To Contact the Service Advisory Board

Questions?

Send Feedback or Submit a Feature Request:

The NET+ GCP program is managed by an Internet2 program manager with the support of the NET+ GCP Service Advisory Board. 

The NET+ GCP Service Advisory Board reviews and prioritizes community feature requests on a periodic basis. Feature requests may be submitted to netplus@internet2.edu.




 



