On Google Cloud Platform

Backend services

Cromwell is an open-source workflow execution engine developed by the Broad Institute of MIT and Harvard. It is designed to run workflows written in the Workflow Description Language (WDL, pronounced ‘widdle’) on either local or cloud infrastructure.

The Cloud Life Sciences API (formerly the Pipelines API) is a suite of Google Cloud services and tools for managing, processing, and transforming life sciences data. It includes a workflow execution service that orchestrates the execution of individual tasks on Google Compute Engine instances.

Main operations

When Cromwell on GCP is the workflow execution backend for Workbench, the following main operations occur:

  1. Workbench submits the workflow to Cromwell

  2. Cromwell generates individual task definitions

  3. Cromwell dispatches task definitions to Google Compute Engine via the Cloud Life Sciences API

  4. Tasks are executed on Google Compute Engine instances using the tool containers and computing resources specified in the workflow

  5. Outputs are written to Google Cloud Storage

  6. Cromwell returns the result of running the workflow to Workbench

Various status monitoring operations also take place and are reported by Workbench as described in the User Guide.
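Steps 1 and 6 can be sketched from a client's point of view, which may help when debugging a deployment. Workbench performs these calls for you; the sketch below only illustrates Cromwell's REST endpoints for submitting a workflow and polling its status. The base URL and the helper names are hypothetical, authentication is omitted, and this is not Workbench's actual implementation.

```python
import json
import time
import urllib.request
import uuid

# Workflow states after which Cromwell does no further work.
TERMINAL_STATES = {"Succeeded", "Failed", "Aborted"}


def build_submit_request(base_url, wdl_source, inputs):
    """Build (without sending) a multipart POST that submits a WDL workflow."""
    boundary = uuid.uuid4().hex
    fields = {
        "workflowSource": wdl_source,          # the WDL text itself
        "workflowInputs": json.dumps(inputs),  # workflow inputs as JSON
    }
    lines = []
    for name, value in fields.items():
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"',
            "",
            value,
        ]
    lines += [f"--{boundary}--", ""]
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/workflows/v1",
        data="\r\n".join(lines).encode("utf-8"),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def status_url(base_url, workflow_id):
    """URL of the status endpoint for one workflow."""
    return f"{base_url.rstrip('/')}/api/workflows/v1/{workflow_id}/status"


def get_status(base_url, workflow_id, opener=urllib.request.urlopen):
    """Return the workflow's current status string, e.g. 'Running'."""
    with opener(status_url(base_url, workflow_id)) as response:
        return json.load(response)["status"]


def wait_for_completion(base_url, workflow_id, poll_seconds=30,
                        opener=urllib.request.urlopen):
    """Poll until the workflow reaches a terminal state, then return it."""
    while True:
        status = get_status(base_url, workflow_id, opener)
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
```

The `opener` parameter is an illustrative design choice: it lets you substitute a function that adds credentials or returns canned responses, so the polling logic can be exercised without a live Cromwell server.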

Deployment and configuration

For your convenience, we provide an installer script that uses Terraform to create a suitable installation of Cromwell on Google Cloud. It can be used in either a single-project or a multi-project architecture.

For complete step-by-step instructions, see the installer README.

In this deployment, the Cromwell server itself is not accessible from the internet, so a Cloud Run service acts as the ingress for all requests to Cromwell. Calling this service requires a GCP identity token for a principal (user or service account) that has been granted permission to call it.
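As a rough illustration of such a call, the sketch below obtains an identity token from the gcloud CLI and attaches it as a bearer token to a request for Cromwell's version endpoint. It assumes gcloud is installed and authenticated as a permitted principal; the ingress URL you pass in and the helper names are hypothetical. Depending on the principal type, the token may also need an explicit audience, which this sketch does not set.

```python
import subprocess
import urllib.request


def fetch_identity_token():
    """Ask the gcloud CLI for an identity token for the active account."""
    result = subprocess.run(
        ["gcloud", "auth", "print-identity-token"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def authorized_request(url, token):
    """Build a request that carries the identity token as a bearer token."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )


def cromwell_version(ingress_url):
    """Fetch the Cromwell version through the Cloud Run ingress."""
    request = authorized_request(
        f"{ingress_url.rstrip('/')}/engine/v1/version",
        fetch_identity_token(),
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `cromwell_version` with the URL of your Cloud Run service is a quick way to confirm that the principal has been granted access: a 401 or 403 response usually indicates a missing or unauthorized token.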
