
Using SpRIT with Google Colab

RJbalikian edited this page Jul 3, 2024 · 9 revisions

Introduction

Google Colab is a service from Google that lets you write and execute Python in your browser. It creates a temporary Linux virtual machine (i.e., a virtual computer of sorts) on a Google server that has many Python packages preinstalled. You access this machine through a Jupyter notebook, in which you can run Python code. Notebooks and their code can be shared, but the data and files on the virtual machine are deleted/recycled after each use, so any data you wish to retain must be exported first.

Google Colab can be accessed here: https://colab.research.google.com/

(After accessing Colab for the first time with your Google account, you should be able to create a Colab notebook from Google Drive itself; see image.)

image

Is Google Colab right for you?

Some of the benefits of using Google Colab are:

  • Little configuration required
  • Python and many standard data analysis packages are already installed
  • Basic access is free of charge
  • Easy sharing
  • Relatively easy integration with Google Drive
  • Can be run on any device with a modern browser (there is no mobile app)

Potential negatives:

  • A Google account is needed
  • Accessing files on a local computer is burdensome
    • (There is no direct access to local files. Data must first be uploaded to Google Drive, which must then be mounted, or uploaded directly to the runtime.)
  • Python versions may change without warning (older versions are difficult to access)
  • Must be run somewhere with internet access

Instructions

Step 1. Set up Google Colab

To set up Google Colab in your Google Drive account, first visit https://colab.research.google.com/ and sign in with your Google account. This will enable you to create Colab notebooks in your Google Drive. You can also open an existing Colab notebook.

After you do this, you can open any folder in your Google Drive and create a Colab notebook.

image

Step 2. Start a Colab Runtime

You can open a link to a Colab notebook or create your own. The first step is to create a code cell. This may have already been completed for you.

image

Step 3. Install SpRIT in Your Runtime

SpRIT is installed with pip, the Python package installer. Because pip is a terminal command and not a Python command, the line needs to be preceded with a ! (for pip, a % also works) to signify this to the notebook. The command to install packages from the Python Package Index (PyPI) is pip install <package-name>. To install SpRIT, you will need to run a code cell with the following command:

!pip install sprit

Be sure to include the exclamation point at the start of the line.

Step 4. Restart Runtime

One of the primary packages on which SpRIT depends is ObsPy, and it has a module called "signal," which is also the name of a module already installed in the Colab runtime. Because of this, you will need to restart the runtime in order to successfully use the SpRIT package. Usually, after you run the command in the previous step, a dialog will pop up asking you to restart the session:

image

Click the "Restart session" button.

If this dialog does not appear, go to the Runtime Menu and select Restart session:

image

Only after you restart the session are you ready to use SpRIT in the Colab environment.
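After the restart, a quick check can confirm that the package is visible to the new session. The following is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is illustrative and is not part of SpRIT.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("sprit")
if version is None:
    print("sprit is not installed in this runtime; rerun the pip install cell.")
else:
    print(f"sprit {version} is installed.")
```

If the message reports that sprit is missing, the runtime was probably recycled (not just restarted) and the install cell needs to be run again.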

Step 5. Use SpRIT to process HVSR data

To create a new code cell, hover your mouse over the center of the screen just below/between existing code cells. Select the "Code +" button.

image

You can now test that the installation was successful by processing some sample data:

import sprit
hvsrData = sprit.run("sample")

To process your own data, you will need to upload data to your runtime.

For data on your local machine, click the Files icon on the left side of your Colab notebook. You can then upload files to your session storage.

image
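The same upload can also be started from a code cell, and Google Drive can be mounted as an alternative source of data. The sketch below uses the google.colab helpers files.upload() and drive.mount(), which are only available inside a Colab runtime; the in_colab() function is a hypothetical guard added here so the cell does nothing harmful elsewhere.

```python
def in_colab():
    """Return True when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401  (only present inside Colab)
    except ImportError:
        return False
    return True


if in_colab():
    from google.colab import drive, files

    # Option 1: open a file picker; the chosen files are saved to the
    # current working directory of the runtime.
    uploaded = files.upload()
    print("Uploaded:", list(uploaded))

    # Option 2: mount Google Drive; your files then appear under
    # /content/drive/MyDrive.
    drive.mount("/content/drive")
else:
    print("Not running in Colab; skipping upload and Drive mount.")
```

In practice you would normally pick one of the two options. Mounting Drive asks for permission to access your Google account the first time it is run.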

The home directory is "/content". You can right click on any file to copy its filepath. If the data is formatted correctly, you can process the data using:

hvsrData = sprit.run("/content/path/to/data.mseed")
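As an alternative to copying file paths by hand, the uploaded files can be listed from code. This is a small sketch using only the standard library; the helper name find_files and the "*.mseed" pattern (taken from the example path above) are assumptions for illustration.

```python
from pathlib import Path


def find_files(root, pattern="*.mseed"):
    """Return sorted paths under root (searched recursively) that match pattern."""
    root = Path(root)
    if not root.is_dir():
        # Outside Colab, /content usually does not exist.
        return []
    return sorted(str(p) for p in root.rglob(pattern) if p.is_file())


# Print every matching file in the Colab home directory.
for path in find_files("/content"):
    print(path)
# Each printed path could then be passed to sprit.run(), e.g. sprit.run(path)
```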

See the SpRIT documentation (https://sprit.readthedocs.io/en/latest/) for more information on the SpRIT package and the sprit.run() function. You can also use the Python command help() on any function, for example help(sprit.run).

Step 6. Export Processed Data

Because the files in your Colab notebook are deleted when you close the notebook, it is important to export your results immediately after processing. For SpRIT, there are three main outputs or reports: a printed report, a plot of the data, and a table of the relevant information. These are saved as attributes of the HVSRData object that is created when you process the data. See the table below for more information.

It is recommended to save the report table as a .csv file (or copy the data into an existing Excel file), along with the plot image.

| Report | Attribute name | Data type | Description |
| --- | --- | --- | --- |
| Print report | Print_Report | String | A printout of the peak frequency and the results of the SESAME validation tests for that peak |
| Report table | CSV_Report | Pandas DataFrame | A table containing the site name, location, peak frequency, and results of the SESAME validation tests |
| Plot | HV_Plot | Matplotlib or Plotly figure | If the matplotlib engine is used (the default), contains a tuple of (matplotlib.Figure, matplotlib.Axes) objects. If plotly is used, contains a plotly figure |

If any of these reports was not generated during your processing run, you can regenerate it using the sprit.get_report() function or (if you have a variable called hvsrData) the hvsrData.get_report() method (or its alias, hvsrData.report()). These methods of the HVSRData class take the same parameters as the get_report() function, except that the hvsr_results parameter does not need to be specified, since the object calling the method is used in its place.

For example, if you save the output of sprit.run() to a variable named hvsrData (for the first sample dataset, you could do that with hvsrData = sprit.run('sample')), then the report table is available using the following line of code:

hvsrData.CSV_Report

This returns a Pandas DataFrame, so all the methods and attributes of Pandas DataFrames in general can be used with this table. For example, to export the data to a .csv file:

hvsrData.CSV_Report.to_csv("path/to/where/you/want/to/save/the/table_report.csv")
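Note that to_csv() raises an error if the destination folder does not exist yet. The sketch below creates the folder first using the standard library; the helper name make_output_path and the folder and file names are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def make_output_path(directory, filename):
    """Create directory (and any missing parents) and return the full file path."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir / filename


# Hypothetical folder and file names; in Colab a relative path like this
# is created inside the current working directory.
csv_path = make_output_path("sprit_outputs", "table_report.csv")
print(csv_path)
# With processed results in hvsrData, the table could then be written with:
# hvsrData.CSV_Report.to_csv(csv_path)
```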

If you export the CSV_Report using the .to_csv() method, this only saves a .csv file to your virtual machine in Google Colab. To retain this data, you will need to download the data from your session files in the left bar of the notebook. This can be done by right-clicking on the file and selecting Download.
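The download can also be triggered from a code cell. This is a minimal sketch using files.download() from the google.colab package, which exists only inside a Colab runtime; the wrapper name download_from_colab and the example path in the comment are assumptions for illustration.

```python
def download_from_colab(path):
    """Start a browser download of path when running in Colab.

    Returns True if the download was requested, or False when the
    google.colab package is unavailable (i.e., outside Colab).
    """
    try:
        from google.colab import files  # only available inside Colab
    except ImportError:
        return False
    files.download(path)
    return True


# Example call (hypothetical path; the file must already exist):
# download_from_colab("/content/table_report.csv")
```

The browser may ask for permission to download files from the notebook the first time this runs.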

In Jupyter notebooks, if you run a code cell containing only hvsrData.CSV_Report, it will show an interactive table. You can also copy this table and paste it into (for example) a local Excel file where you are working on all data for a particular project.

image

The print report is not exportable, but it can be copied and pasted as desired. It is accessed using hvsrData.Print_Report. For this to format correctly in a Jupyter notebook, you will need to use the command print(hvsrData.Print_Report).

image

The plot (and the print report) will appear as an output in the Jupyter notebook by default when running sprit.run(). In this case, you can simply right-click the plot and save the image to your local computer.

image
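The plot can also be saved to a file from code. The sketch below assumes the HV_Plot layout described in the table above (a (Figure, Axes) tuple for matplotlib, or a single plotly figure otherwise); the helper name save_hv_plot is hypothetical and not part of SpRIT. It relies on matplotlib's Figure.savefig() and plotly's Figure.write_html().

```python
def save_hv_plot(hv_plot, path):
    """Save an HV_Plot value to path and return the path.

    Assumes a (Figure, Axes) tuple for matplotlib output and a single
    figure object with a write_html() method for plotly output.
    """
    if isinstance(hv_plot, tuple):
        figure = hv_plot[0]  # the matplotlib Figure is the first item
        figure.savefig(path, dpi=300, bbox_inches="tight")
    else:
        hv_plot.write_html(path)  # plotly figure, saved as interactive HTML
    return path


# Example calls (hypothetical paths):
# save_hv_plot(hvsrData.HV_Plot, "/content/hv_plot.png")   # matplotlib
# save_hv_plot(hvsrData.HV_Plot, "/content/hv_plot.html")  # plotly
```

As with the .csv file, the saved image lives only on the virtual machine until it is downloaded.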

Finally, you can save the entire HVSRData object to a file using the .export() method. With your hvsrData variable, the following line of code will export a .hvsr file containing all the information mentioned above, and more (including the raw seismic data).

hvsrData.export('path/to/export/your/data.hvsr')  
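If several outputs have been saved to one folder, it can be convenient to bundle them into a single .zip file before downloading. This is a sketch using the standard library's shutil.make_archive(); the helper name zip_folder and the paths in the comment are hypothetical.

```python
import shutil
from pathlib import Path


def zip_folder(folder, archive_stem):
    """Zip the contents of folder and return the path of the created .zip file.

    archive_stem is the output path without the .zip extension. It should
    point outside folder so the archive does not try to include itself.
    """
    folder = Path(folder)
    if not folder.is_dir():
        raise NotADirectoryError(f"{folder} is not a directory")
    return shutil.make_archive(str(archive_stem), "zip", root_dir=str(folder))


# Example call (hypothetical paths): creates /content/sprit_outputs.zip
# zip_folder("/content/sprit_outputs", "/content/sprit_outputs")
```

The resulting .zip file can then be downloaded from the session files in the left bar like any other file.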

Examples

An example Google Colab notebook can be viewed here. This notebook has several examples of how to use SpRIT to process HVSR data.