On Amazon Web Services
Backend services
Cromwell is an open-source workflow execution engine developed by the Broad Institute of MIT and Harvard. It is designed to run workflows written in the Workflow Description Language (WDL, pronounced ‘widdle’) on either local or cloud infrastructure.
Amazon Genomics CLI (AGC) is a workflow execution management package provided by Amazon Web Services that simplifies the deployment of AWS resources for running genomics workflows. It bundles the Cromwell execution engine and orchestrates the execution of individual tasks by the AWS Batch service.
Main operations
These are the main operations that occur when using Cromwell on AWS as the workflow execution backend for Workbench:
Workbench submits the workflow to Cromwell;
Cromwell generates individual task definitions;
Cromwell dispatches task definitions to AWS Batch;
Tasks are executed on AWS instances using the tool containers and computing resources specified in the workflow;
Outputs are written to Amazon S3 (Simple Storage Service);
Cromwell returns the result of running the workflow to Workbench.
Various engine and task status monitoring operations also take place and are reported by Workbench, as described in Running and Monitoring a Workflow.
Deployment and configuration
This backend requires that you deploy an AGC Context in AWS by following Amazon's Getting started and Installation guide. Once you have completed this step, collect the following information and proceed to Connecting to a workflow engine.
Access Key ID and Secret Access Key (see below)
The AWS region where AGC was deployed
The AGC project name and the context name from the agc-project.yaml file that you defined. See the placeholders below (<project-name> and <context-name>) where the corresponding values can be found in your file.
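Before registering the engine, it can help to gather these values in one place and check that none is still a placeholder. The following is a minimal sketch only: the class name AgcConnection, its field names, and all sample values are hypothetical and are not part of Workbench or AGC.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AgcConnection:
    """Hypothetical container for the values collected after deploying AGC."""

    access_key_id: str
    secret_access_key: str
    region: str         # AWS region where AGC was deployed
    project_name: str   # <project-name> from agc-project.yaml
    context_name: str   # <context-name> from agc-project.yaml

    def missing(self) -> list:
        """Return the names of fields that are empty or still '<placeholder>' text."""
        return [
            f.name
            for f in fields(self)
            if not getattr(self, f.name).strip()
            or getattr(self, f.name).startswith("<")
        ]


# All values below are illustrative placeholders, not real credentials.
conn = AgcConnection(
    access_key_id="AKIAEXAMPLE",
    secret_access_key="example-secret",
    region="us-east-1",              # assumed region
    project_name="<project-name>",   # deliberately left unfilled
    context_name="myContext",        # assumed context name
)
print(conn.missing())
```

Here the check reports project_name, because that field still holds the placeholder text rather than a value from the project file.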
Generating an Access Key and Assigning it the Required Permissions
Generate IAM Policies
Using AGC requires specific permissions allowing you to interact with all of the associated AWS tools and infrastructure. The simplest way to configure these permissions is to have AGC do it for you. This guarantees the list of permissions is up to date and is the smallest subset of permissions required to run workflows using AGC. You can generate the required IAM policies by following the AGC documentation.
If you are deploying Cromwell with AGC, you must also assign s3:GetObject to the Workbench user for whom you generate the access key in order to stream logs through Workbench. The AgcPermissionsStack does not currently configure this permission.
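As an illustration of the extra s3:GetObject permission, the sketch below builds an IAM policy document in the standard JSON policy format. The helper name log_read_policy and the bucket name my-agc-bucket are hypothetical; substitute the bucket that your AGC deployment actually writes to, and scope the resource more tightly if your setup allows it.

```python
import json


def log_read_policy(bucket: str) -> dict:
    """Build an IAM policy document granting s3:GetObject on one bucket's objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                # GetObject is an object-level action, so the resource is the
                # set of objects in the bucket (hence the trailing /*).
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }


# "my-agc-bucket" is a hypothetical bucket name used only for illustration.
policy = log_read_policy("my-agc-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console's policy editor and attached to the dedicated Workbench user.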
Generating a User
While you may use your own credentials for configuring the AGC engine in Workbench, it is highly recommended that you create a dedicated user for this purpose. A dedicated user ensures that Workbench is given only the necessary permissions and gives you an easy way to revoke access if you ever need to.
Open the IAM console.
Click "Users" in the navigation menu.
On the top bar, choose “Add Users”. Type in a new username and click “Next”.
Under “Permissions options”, select “Attach policies directly”.
Under “Permissions policies” type “Agc” in the filter.
Select all policies matching “AgcPermissionsStack-agcuserpolicy*”. This ensures that the created user has all of the permissions needed to interact with AGC resources. The other policy is for admin operations and is not required.
Click "Next".
Click "Create User". Once the user has been created, follow the AWS Account and Access Keys guide to generate an access key and secret. Save the access key and secret; these will be used to register the engine on Workbench.
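The console steps above could also be scripted. The sketch below is one possible approach, not an official procedure: it assumes the iam argument is a boto3 IAM client (for example boto3.client("iam")) and that the AGC policies are customer-managed policies in your account. The function names and the user name in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase

# Name pattern from the console steps above.
POLICY_PATTERN = "AgcPermissionsStack-agcuserpolicy*"


def select_agc_user_policies(policies):
    """Return the ARNs of policies whose name matches the AGC user-policy pattern."""
    return [
        p["Arn"] for p in policies if fnmatchcase(p["PolicyName"], POLICY_PATTERN)
    ]


def create_workbench_user(iam, user_name):
    """Create a dedicated user, attach the AGC user policies, and return a new access key.

    `iam` is assumed to be a boto3 IAM client.
    """
    iam.create_user(UserName=user_name)

    # Scope="Local" limits the listing to customer-managed policies.
    pages = iam.get_paginator("list_policies").paginate(Scope="Local")
    policies = [p for page in pages for p in page["Policies"]]

    for arn in select_agc_user_policies(policies):
        iam.attach_user_policy(UserName=user_name, PolicyArn=arn)

    # The secret is only returned at creation time, so store it securely.
    return iam.create_access_key(UserName=user_name)["AccessKey"]
```

With boto3 installed and admin credentials configured, a call such as create_workbench_user(boto3.client("iam"), "workbench-agc") would return a dictionary containing AccessKeyId and SecretAccessKey, which are the values used to register the engine on Workbench.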
Create an Access Key and assign it the proper permissions
In order to interact with the AGC WES API directly, you will need to generate an access key and assign it the proper permissions using the AWS Account and Access Keys guide. You can either create an access key on an existing user or create a new IAM user (recommended).