Medium-scale Computations

Overview

The notebook for this example runs medium-scale computations on a notebook instance in GCP Vertex AI Workbench using XArray and Dask.

Step 1: Create a notebook environment in Vertex AI

Follow the steps in the GCP documentation to create a user-managed notebook instance within Vertex AI Workbench.

From the Google Cloud Console, navigate to Vertex AI > Workbench, then click on New Notebook in the User-managed Notebooks tab. For this example you can select the Python 3 environment, give your notebook a name, and select your desired region and zone.

Creating a notebook instance in GCP Vertex AI Workbench

You can click the Create button to create your notebook with the default settings, or click the Advanced Options button to customize the machine type and other settings.

For this example, we'll use an n1-standard-32 notebook instance, which has 32 vCPUs and 120 GB of RAM, by setting the following machine configuration on the Advanced Options page:

Specifying a machine configuration in GCP Vertex AI Workbench

Once the notebook instance is running, click on the Open JupyterLab button to view the notebook environment in your browser.
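
If you want to confirm that the instance has the resources you selected, you can run a quick check from a notebook cell or terminal once JupyterLab is open (this snippet is just a sanity check and is not part of the example notebook):

import os
print(os.cpu_count())  # should report 32 on an n1-standard-32 instance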

Step 2: Create a new Python environment and notebook kernel

From within the notebook environment, open a new terminal window and create a conda environment with Python 3.8:

conda create -n py38 python=3.8

Activate the newly created environment:

conda activate py38

Install ipykernel so that a notebook kernel can be registered for this environment:

conda install ipykernel

If the new kernel does not appear in the JupyterLab launcher after a minute or two, you can register it manually:

python -m ipykernel install --user --name py38 --display-name "Python (py38)"

Step 3: Clone the repository

Clone the repository with these examples by running the following command:

git clone https://github.com/koverholt/scaling-python-on-gcp

Step 4: Install dependencies

Change to the scaling-python-on-gcp directory by running:

cd scaling-python-on-gcp

Install the Python dependencies for this repository by running:

pip install -r requirements.txt

Step 5: Run the medium-scale notebook

Open the notebook at scaling-python-on-gcp/2-medium-scale/satellite-imagery.ipynb; it contains all of the remaining code that you need to run for this medium-scale computation example.

Be sure to switch to the notebook kernel for the py38 conda environment that you created earlier.

Run through all of the notebook cells to point to the satellite image data, start a local Dask cluster on your machine, and compute and visualize the NDVI.
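
If you'd like a feel for what the notebook is doing before you run it, the sketch below shows the general shape of the workflow, assuming the imagery is opened with rioxarray as chunked Dask arrays; the file names, chunk sizes, and band layout are illustrative placeholders rather than the ones used in the repository's notebook:

import rioxarray  # reads GeoTIFFs into xarray objects via rasterio
from dask.distributed import Client, LocalCluster

# Start a local Dask cluster; by default it uses all cores on the instance.
cluster = LocalCluster()
client = Client(cluster)

# Lazily open the red and near-infrared bands as chunked Dask arrays
# (placeholder file names; the notebook points at its own satellite images).
red = rioxarray.open_rasterio("red_band.tif", chunks={"x": 2048, "y": 2048})
nir = rioxarray.open_rasterio("nir_band.tif", chunks={"x": 2048, "y": 2048})

# NDVI = (NIR - Red) / (NIR + Red); this only builds a lazy task graph.
ndvi = (nir - red) / (nir + red)

# Trigger the parallel computation and visualize the result.
ndvi = ndvi.compute()
ndvi.isel(band=0).plot(cmap="RdYlGn")

Because the arrays are chunked, the NDVI expression only builds a task graph; the actual computation runs in parallel across the local Dask workers when compute() (or a plot that needs the values) is called.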

Running the medium-scale satellite imagery notebook

Success!

Congratulations! 🎉 You've successfully run the medium-scale computation example and calculated the normalized difference vegetation index (NDVI) on 30 satellite images from a notebook instance within GCP Vertex AI Workbench.