On Google Cloud Platform
Backend services
Cromwell is an open-source workflow execution engine developed by the Broad Institute of MIT and Harvard. It is designed to run workflows written in the Workflow Description Language (WDL, pronounced ‘widdle’) on either local or cloud infrastructure.
Cloud Life Sciences API (formerly Pipelines API) is a suite of services and tools provided by Google Cloud for managing, processing, and transforming life sciences data. It includes a workflow execution service that orchestrates the execution of individual tasks on Google Compute Engine instances.
Main operations
These are the main operations that occur when using Cromwell on GCP as the workflow execution backend for Workbench:
1. Workbench submits the workflow to Cromwell
2. Cromwell generates individual task definitions
3. Cromwell dispatches task definitions to Google Compute Engine via the Cloud Life Sciences API
4. Tasks are executed on Google Compute Engine instances using the tool containers and computing resources specified in the workflow
5. Outputs are written to Google Cloud Storage
6. Cromwell returns the result of running the workflow to Workbench
Various status monitoring operations also take place and are reported by Workbench as described in the User Guide.
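As an illustration of what status monitoring can look like from a client's point of view, here is a minimal Python sketch that polls until a workflow reaches a terminal state. It is not how Workbench implements monitoring. The function name `wait_for_workflow`, its parameters, and the `fetch_status` callable are hypothetical; the only thing taken from Cromwell is the set of terminal workflow states (`Succeeded`, `Failed`, `Aborted`) that it reports.

```python
import time
from typing import Callable

# Terminal workflow states reported by Cromwell.
TERMINAL_STATES = {"Succeeded", "Failed", "Aborted"}


def wait_for_workflow(
    fetch_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the workflow is in a terminal state or polls run out.

    fetch_status is a hypothetical callable that returns the current
    status string, for example by querying Cromwell's status endpoint.
    The last status seen is returned, terminal or not.
    """
    status = fetch_status()
    for _ in range(max_polls - 1):
        if status in TERMINAL_STATES:
            break
        sleep(poll_seconds)
        status = fetch_status()
    return status
```

Passing `sleep` as a parameter keeps the loop easy to exercise without waiting; in real use the default `time.sleep` applies, and the caller decides what to do if the returned status is not terminal.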
Deployment and configuration
For your convenience, we provide an installer script that uses Terraform to create a suitable installation of Cromwell on Google Cloud. It can be used in either a single-project or a multi-project architecture.
For complete step-by-step instructions, see the installer README.
In this deployment, the Cromwell server itself is not accessible from the internet, so a Cloud Run service acts as the ingress for all requests to Cromwell. Calling this service requires a GCP identity token issued for a principal (user or service account) that has been granted permission to invoke it.
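To make the authentication requirement concrete, the sketch below builds an authenticated request for a workflow's status using only the Python standard library. It rests on several assumptions: the service URL is a placeholder, the helper name `build_status_request` is hypothetical, and the ingress is assumed to forward paths unchanged to Cromwell's REST API (`/api/workflows/v1/{id}/status`). The identity token is passed as a bearer token in the `Authorization` header, which is how Cloud Run expects authenticated calls; one way to obtain a token for a user account is `gcloud auth print-identity-token`.

```python
import urllib.request


def build_status_request(
    service_url: str, workflow_id: str, identity_token: str
) -> urllib.request.Request:
    """Build a GET request for a workflow's status via the ingress.

    service_url is the Cloud Run ingress URL (placeholder here), and
    identity_token is a GCP identity token for a principal that is
    allowed to invoke that service.
    """
    url = f"{service_url.rstrip('/')}/api/workflows/v1/{workflow_id}/status"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {identity_token}"},
        method="GET",
    )


# Example with placeholder values; nothing is sent until the request
# is passed to urllib.request.urlopen().
request = build_status_request(
    "https://cromwell-ingress.example.com", "my-workflow-id", "my-identity-token"
)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before any network call is made.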