Folding@home is simulating the dynamics of COVID-19 proteins to hunt for new therapeutic opportunities (recent updates from Mar 10 and Feb 27). This template makes it easy to run Folding@home on Google Cloud and helps increase the total number of simulations done. You can use this Terraform script to automatically deploy one or more Folding@home clients on GCP. The template creates an instance template with the Folding@home binaries, a managed instance group that uniformly deploys as many clients as the user specifies, network firewall rules, and a Cloud NAT gateway for internet access without requiring public IPs, all in an existing or newly created network as specified by the user.
This is not an officially supported Google product. Terraform templates for Folding@home are developer and community-supported. Please don't hesitate to open an issue or pull request.
- GCP project to deploy to.
- Sufficient CPU & GPU resource quota in your GCP project. See Review project quota below for instructions on how to determine the amount of spare quota available.
Parameter | Description | Default |
---|---|---|
project | ID of the GCP project to deploy to | |
region | Region for cloud resources | |
zones | One or more zones for cloud resources. If not set, up to three zones in the region are used to distribute instances, depending on the number of instances. Note on GPU: Not all regions and zones support all GPUs. If running with GPUs, you should specify an explicit list of zones (available to you in your region) that support your selected GPU model. Refer to the list of zones per GPU model | |
create_network | Boolean to create a new network | true |
network | Network to deploy resources into. It is either: 1. An arbitrary network name if create_network is set to true 2. An existing network name if create_network is set to false | fah-network |
subnetwork | Subnetwork to deploy resources into. It is either: 1. An arbitrary subnetwork name if create_network is set to true 2. An existing subnetwork name if create_network is set to false | fah-subnetwork |
subnetwork_cidr | CIDR range of the subnetwork | 192.168.0.0/16 |
fah_worker_image | Docker image to use for the Folding@home client | stefancrain/folding-at-home:latest |
fah_worker_count | Number of Folding@home clients or GCE instances | 3 |
fah_worker_type | Machine type to run the Folding@home client on. Note on GPU: only general-purpose N1 machine types currently support GPUs | n1-highcpu-8 |
fah_worker_gpu | GPU model to attach to each machine running the Folding@home client. Possible options: nvidia-tesla-t4, nvidia-tesla-v100, nvidia-tesla-p100, nvidia-tesla-p4, nvidia-tesla-k80. Set to empty string "" for no GPU. | nvidia-tesla-t4 |
fah_team_id | Team ID for the Folding@home client. Defaults to F@h team Google, or 446 | 446 |
fah_user_name | User name for the Folding@home client | Anonymous |
fah_passkey | User passkey for the Folding@home client. This is optional. More details here | None |
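
As an illustration of how the parameters above fit together, the following sketch writes an example `terraform.tfvars`. The project ID and the zone list are hypothetical placeholders, and the sketch assumes `zones` is passed as a list of strings; check the template's variable definitions before relying on it. Note that it overwrites any existing `terraform.tfvars` in the current directory.

```shell
# Write an example terraform.tfvars; every value below is a placeholder to adapt.
cat > terraform.tfvars <<'EOF'
project          = "my-gcp-project"               # hypothetical project ID
region           = "us-east1"
zones            = ["us-east1-c", "us-east1-d"]   # assumed to offer the chosen GPU; verify for your project
create_network   = true
network          = "fah-network"
subnetwork       = "fah-subnetwork"
subnetwork_cidr  = "192.168.0.0/16"
fah_worker_count = 3
fah_worker_type  = "n1-highcpu-8"
fah_worker_gpu   = "nvidia-tesla-t4"
fah_team_id      = 446
fah_user_name    = "Anonymous"
EOF
```

Parameters left out of the file, such as `fah_passkey`, fall back to the defaults listed in the table.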
Before proceeding, you need to ensure you have enough spare CPU & GPU quota in your project, and specifically in the region you intend to deploy to. This template will deploy a fixed-size managed instance group (MIG) of preemptible VMs with GPUs attached for workload acceleration. Note that preemptible VMs can be terminated (and then recreated by the MIG) at any time, but run at a much lower price than normal instances.
- Visit https://cloud.google.com/compute/quotas
- Under Location, search and select a location such as "us-east1"
- Under Metrics, search for "CPU" and select "CPUs" and "Preemptible CPUs". If you do not have Preemptible CPUs quota, Compute Engine will still use regular CPUs quota to launch preemptible VM instances.
- Under Metrics, search for "GPU" and select all GPUs except the "Committed..." ones and the "...Virtual Workstation GPUs". If you do not have Preemptible GPUs quota, Compute Engine will still use regular GPU quotas to add GPUs to preemptible VM instances.
- You can now determine (1) which GPU models are available and (2) how many spare CPU cores there are for your project and your target region. This gives you the maximum MIG size (i.e. the number of VMs, each running a Folding@home client) you can deploy, and whether you can attach GPUs (and which GPU device). If desired, you can request more quota (including separate and additional quota for preemptible CPUs/GPUs) by selecting the specific quota(s), clicking on 'Edit Quotas', and entering the requested 'New quota limit'.
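
The same quota information can also be read from the command line. The helper below is a minimal sketch, assuming an installed and authenticated `gcloud` CLI; the function name `show_region_quotas` is made up for this example, and the default region is only illustrative.

```shell
# Print quota metric, limit, and current usage for one region.
# Requires an authenticated gcloud CLI; the default region is only an example.
show_region_quotas() {
  region="${1:-us-east1}"
  gcloud compute regions describe "$region" \
    --flatten="quotas[]" \
    --format="table(quotas.metric, quotas.limit, quotas.usage)"
}
```

Call it as `show_region_quotas us-east1` and look for the CPU and GPU rows; spare quota is the limit minus the usage.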
As an example, consider a newly created project with a starting quota in the "us-east1" region of 72 CPU cores and 4 Preemptible Nvidia T4 GPUs. In that case, one might opt for a MIG of size 4, where each worker node is a preemptible n1-highcpu-8 with a T4 GPU attached, for a total of 4*8=32 CPUs and 4 T4 GPUs. Here are the relevant parameters in this example:
- fah_worker_count = 4
- fah_worker_type = n1-highcpu-8
- fah_worker_gpu = nvidia-tesla-t4
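
The sizing arithmetic in this example can be checked with a few lines of shell. This is a rough sketch: the helper `required_cpus` is hypothetical, and it assumes the machine type name ends in its vCPU count, as `n1-highcpu-8` does.

```shell
# Estimate the vCPUs a deployment needs: worker count x vCPUs per worker.
# Assumes the machine type name ends in its vCPU count (e.g. n1-highcpu-8).
required_cpus() {
  count="$1"
  machine_type="$2"
  vcpus="${machine_type##*-}"   # keep only the text after the last hyphen
  echo $(( count * vcpus ))
}

needed="$(required_cpus 4 n1-highcpu-8)"
quota=72   # CPU quota from the example above
if [ "$needed" -le "$quota" ]; then
  echo "OK: $needed of $quota CPUs needed"
else
  echo "Too large: $needed CPUs needed, quota is $quota"
fi
```

With the values from the example, 32 of the 72 available CPUs are needed, so the deployment fits. GPU quota has to be checked separately: 4 workers with one T4 each need 4 T4 GPUs.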
- Terraform 0.12+
- Run `git submodule init && git submodule update` after cloning to ensure the cos-gpu-installer submodule is installed.
- Copy the placeholder vars file `variables.yaml` into a new `terraform.tfvars` to hold your own settings.
- Update the placeholder values in `terraform.tfvars` to correspond to your GCP environment and desired Folding@home settings. See the list of input parameters above.
- Initialize the Terraform working directory and download plugins by running `terraform init`.
- Provide credentials to Terraform so that it can provision and manage resources in your project. See the adding credentials docs to supply a GCP service account key.
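
One common way to supply a service account key is through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable, which Terraform's Google provider reads. The sketch below assumes you have already downloaded a key file; the path shown is a hypothetical placeholder.

```shell
# Point Terraform's Google provider at a service account key file.
# The path is a hypothetical placeholder; use the location of your own key.
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/keys/fah-terraform.json"
```

The variable only applies to the current shell session, so set it in the same terminal where you run the Terraform commands.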
$ terraform plan
$ terraform apply
Once Terraform completes:
- Confirm the Folding@home instance group has been created with the correct number of instances
  - Navigate to Compute Engine -> Instance groups: https://console.cloud.google.com/compute/instanceGroups/list
  - Click on the newly created instance group to view its details
  - Confirm the number of instances created. Take note of one of the instance names and its corresponding zone
- Access one of the new instances via CLI.
  - First, make sure you have IAP SSH permissions for your instances by following these instructions
  - Type `gcloud compute ssh [INSTANCE_NAME] --zone [INSTANCE_ZONE]` to SSH to the instance you noted previously. Since instances are created without an external IP, this defaults to using an IAP tunnel.
- View Folding@home container logs
  - Once logged in, retrieve the container name via `docker ps`
  - Type `docker logs -tf [CONTAINER_NAME]` to tail the logs and confirm its operation
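
The console checks above can also be approximated from the command line. The two helpers below are a sketch with hypothetical function names; the instance group name, region, instance name, and zone are arguments you supply, and the `--region` flag assumes the template creates a regional MIG, which this document does not confirm.

```shell
# List the instances in a managed instance group.
# $1 = instance group name, $2 = region (assumes a regional MIG).
list_fah_instances() {
  gcloud compute instance-groups managed list-instances "$1" --region "$2"
}

# Run `docker ps` on one instance over an IAP tunnel.
# $1 = instance name, $2 = zone of that instance.
fah_containers() {
  gcloud compute ssh "$1" --zone "$2" --tunnel-through-iap --command "docker ps"
}
```

For example, `fah_containers [INSTANCE_NAME] [INSTANCE_ZONE]` prints the running containers without opening an interactive session; the container name it reports can then be passed to `docker logs`.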
To stop Folding@home client(s) and remove all provisioned resources, type and confirm:
$ terraform destroy
- Add GPU usage logging to Stackdriver for quick monitoring & troubleshooting
- Scale down to 1 when no jobs available
- Scale down to 0 when no jobs available for extended time. Spin back up periodically.
Copyright 2020 Google LLC
Terraform templates for Folding@home are licensed under the Apache license, v2.0. Details can be found in LICENSE file.