On Google Cloud Storage
Understanding Cromwell on Google Cloud
Cromwell is the Broad Institute's open-source workflow execution engine, designed to run Workflow Description Language (WDL) workflows on local or cloud infrastructure. Google Cloud provides the Cloud Life Sciences API, which:
Manages, processes, and transforms life sciences data
Orchestrates task execution on Google Compute Engine instances
Was formerly known as the Pipelines API
How It Works
When using Cromwell on Google Cloud with Workbench, the process flows as follows:
Workbench submits the workflow to Cromwell
Cromwell generates individual task definitions
Cromwell dispatches tasks to Google Compute Engine via the Cloud Life Sciences API
Tasks execute on Google Compute Engine instances using specified containers and resources
Tasks write their outputs to Google Cloud Storage
Cromwell returns results to Workbench
Workbench monitors and reports status throughout this process, as detailed in the User Guide.
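For readers who want a feel for what status monitoring involves, the following is a minimal sketch of a polling loop, not Workbench's actual implementation. It assumes the caller supplies a hypothetical `fetch_status` function that returns the current Cromwell workflow status as a string, and it treats `Succeeded`, `Failed`, and `Aborted` as the terminal states. The names `wait_for_workflow`, `poll_seconds`, and `max_polls` are illustrative only.

```python
import time
from typing import Callable

# Cromwell workflow statuses after which no further change is expected.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Aborted"}


def wait_for_workflow(
    fetch_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
) -> str:
    """Poll until the workflow reaches a terminal status and return it.

    fetch_status is a hypothetical callable (an assumption of this sketch)
    that returns the current status string, for example by querying Cromwell.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Non-terminal status such as "Submitted" or "Running": wait and retry.
        time.sleep(poll_seconds)
    raise TimeoutError(f"workflow not finished after {max_polls} polls")
```

Passing the status lookup in as a function keeps the loop independent of how the request is authenticated or routed, so the same loop works against any endpoint.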
Deployment
For simplified setup, we provide a Terraform-based installer script that supports both single-project and multi-project architectures.
The deployment uses Cloud Run as the ingress point for Cromwell requests, because the Cromwell server is not directly reachable from the internet. Each request must carry a GCP identity token issued to an authorized principal (a user or a service account).
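As an illustration of what an authenticated request could look like, here is a minimal sketch that builds (but does not send) a status request carrying an identity token in the `Authorization: Bearer` header, which is how Cloud Run expects identity tokens to be presented. Several things here are assumptions rather than facts about this deployment: the base URL is a placeholder, the function name `build_status_request` is hypothetical, and the sketch assumes the ingress forwards Cromwell's standard REST path `/api/workflows/v1/{id}/status` unchanged. Check the installer README for the actual endpoint.

```python
import urllib.request


def build_status_request(
    base_url: str, workflow_id: str, identity_token: str
) -> urllib.request.Request:
    """Build a GET request for a workflow's status through the ingress.

    base_url is a placeholder for the Cloud Run service URL (assumption).
    identity_token is a GCP identity token for an authorized principal.
    """
    url = f"{base_url.rstrip('/')}/api/workflows/v1/{workflow_id}/status"
    request = urllib.request.Request(url, method="GET")
    # Cloud Run reads the identity token from the Authorization header.
    request.add_header("Authorization", f"Bearer {identity_token}")
    return request
```

For a user account, a token can be obtained with `gcloud auth print-identity-token`. The resulting request can then be sent with `urllib.request.urlopen(request)`.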
For complete deployment instructions, refer to the installer README.