On Premises

Backend Services

Cromwell is an open-source workflow execution engine developed by the Broad Institute of MIT and Harvard. It is designed to run workflows written in the Workflow Description Language (WDL, pronounced ‘widdle’) on either local or cloud infrastructure.

DNAstack WES Service is an open-source adapter built by DNAstack providing a fully featured GA4GH Workflow Execution Service (WES) API on top of an existing Cromwell execution engine. The WES Service can be deployed alongside Cromwell in any environment that Cromwell supports, including on-premises.

The WES Service provides several key features:

  1. Improved security, supporting OAuth2 authentication powered by DNAstack Passport

  2. Improved auditability of API operations

  3. A standardized API for submitting and monitoring workflow execution

  4. The ability to upload attachments as part of a run request and have those attachments available to the running workflow

  5. Simplified log streaming, removing the need to access files directly, thus improving security

  6. Automatic translation of file paths in inputs

Main operations

These are the main operations that occur when using the DNAstack WES Service as the workflow execution backend for Workbench:

  1. Workbench submits the workflow to the WES API

  2. The WES API translates the request and submits it to Cromwell

  3. Cromwell generates individual task definitions

  4. Cromwell dispatches task definitions to the underlying execution service

  5. Tasks are executed on the environment's compute infrastructure

  6. Outputs are written to the attached storage

  7. Cromwell returns the result of running the workflow to the WES Service

  8. The WES Service returns the workflow results to Workbench

Various status monitoring operations also take place and are reported by Workbench as described in the User Guide.
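
Once the WES Service is running (see the Setup section below), the same state information is also available directly from the standard GA4GH WES endpoints. A minimal sketch, assuming the service's default port of 8090 and a placeholder run ID:

# Check the state of a single run (replace ${RUN_ID} with a real run ID)
curl "http://localhost:8090/ga4gh/wes/v1/runs/${RUN_ID}/status"
# Example response: {"run_id": "...", "state": "RUNNING"}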

Deployment and Configuration

The following guide describes how to deploy the DNAstack WES Service on a local HPC or compute infrastructure and connect it to Workbench. There are many configuration options which will not be covered; for a complete list please visit the DNAstack WES Service GitHub Page.

If you have previously followed the setup guide, you can skip ahead to Connecting to Workbench.

Pre-requisites

  • Java 17+ installed

  • Download the latest release of the WES Service

  • Download the latest release of Cromwell (or have Cromwell accessible locally)

  • NGINX (or another reverse proxy software) installed

Setup

You can get started in seconds with the DNAstack WES Service and Cromwell. Each environment and compute system requires a different Cromwell configuration; the available options are described in the Cromwell documentation.
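
As a minimal illustration, the override file below keeps Cromwell's default Local backend and simply limits how many tasks run at once; the file name local.conf and the limit are arbitrary choices for this sketch, and the backend stanzas for schedulers such as SLURM or SGE are covered in the Cromwell documentation. If you use such a file, add -Dconfig.file=local.conf to the Cromwell start command in step 1.

# local.conf -- illustrative Cromwell override; adjust for your environment
include required(classpath("application"))

backend {
  default = "Local"
  providers {
    Local {
      config {
        # Maximum number of tasks Cromwell will run concurrently on this node
        concurrent-job-limit = 4
      }
    }
  }
}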

  1. Start Cromwell in server mode

CROMWELL_VERSION="85"
java -jar "cromwell-${CROMWELL_VERSION}.jar" server
  2. On the same compute node, start the DNAstack WES Service and bind it to localhost so that it does not accept traffic from external clients. The DNAstack WES Service expects Cromwell to be available at http://localhost:8000 by default; if Cromwell is available on a different port or is not running locally, you can configure the location by specifying -Dwes.cromwell.url="<IP>:<PORT>". The DNAstack WES Service will start on port 8090.

DNASTACK_WES_VERSION="1.0.0"
java \
  -Dserver.address=127.0.0.1 \
  -Dspring.profiles.active=no-auth \
  -jar "cromwell-wes-service-${DNASTACK_WES_VERSION}.jar"
  3. Submit a workflow to the WES Service

cat <<EOF > main.wdl
version 1.0

task say_hello {
  input {
    String name
  }
  command <<<
    echo "Hello ~{name}"
  >>>

  output {
    String greeting = read_string(stdout())
  } 
}

workflow hello_world {
  call say_hello
}
EOF

RUN_ID=$(curl -X POST http://localhost:8090/ga4gh/wes/v1/runs \
  -F workflow_attachment=@main.wdl \
  -F workflow_url=main.wdl \
  -F workflow_params='{"hello_world.say_hello.name":"Foo"};type=application/json' \
  | jq -r '.run_id')
curl "http://localhost:8090/ga4gh/wes/v1/runs/${RUN_ID}"

Configuring SSL and Authentication

In order to ensure secure communication, Workbench requires both an SSL connection and authentication to be enabled on the WES Service instance. The simplest way to enable these features is with Mutual TLS and a reverse proxy (such as NGINX) running beside the WES Service.

The reverse proxy is responsible for handling all incoming traffic, establishing a secure connection, validating the client certificate, and forwarding requests to the WES Service. With this approach, the WES Service itself never needs to be accessible except through localhost.

Download NGINX

  1. If you have not done so already, you will need to download a reverse proxy. This guide will use NGINX, which is a fast, easy-to-use reverse proxy that supports Mutual TLS out of the box.

Generate the Server Keys

The server certificate will be used by clients connecting to the NGINX instance to establish a secure (https) connection. You can use OpenSSL to generate the certificate.

  1. Copy the following text and save it to a file called server.conf.

    1. Replace both instances of the ${IP_ADDRESS} variable with the actual Public IP address the NGINX instance will be accessible on.

    2. If you plan on connecting to the instance using a domain name that you own, you can uncomment the last line and replace the ${DNS_NAME} variable with the domain name you plan on using.

[req]
default_bits  = 2048
distinguished_name = req_distinguished_name
req_extensions = req_ext
x509_extensions = v3_req
prompt = no
[req_distinguished_name]
countryName = XX
stateOrProvinceName = N/A
localityName = N/A
organizationName = Self-signed certificate
commonName = ${IP_ADDRESS}: Self-signed certificate
[req_ext]
subjectAltName = @alt_names
[v3_req]
subjectAltName = @alt_names
[alt_names]
# Put IP addresses here
IP.1 = ${IP_ADDRESS}
IP.2 = 127.0.0.1
# Uncomment this line if you have a DNS name that you plan to use to connect to the
# compute instance
# DNS.2 = ${DNS_NAME}
  2. Using OpenSSL and the config you just created, generate the server certificate and its private key.

    The server.key file contains sensitive information that will grant anyone the ability to impersonate your WES Service. Treat this file like a password.

openssl req -new -nodes -x509 -days 365 -keyout server.key -out server.crt -config server.conf
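
Optionally, confirm that the generated certificate carries the subject alternative names you expect before moving on:

# Inspect the certificate and check the SAN entries
openssl x509 -in server.crt -noout -text | grep -A 2 "Subject Alternative Name"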

Generate the Client Keys

The client certificate will be used by clients for authentication purposes. The server is given the client's certificate (the public half) and will only accept connections from clients that present it and prove possession of the matching private key.

  1. Copy the following text and save it to a file called client.conf. Replace the ${COMMON_NAME} variable with a string that can be used to identify this certificate.

[req]
default_bits  = 2048
distinguished_name = req_distinguished_name
prompt = no
[req_distinguished_name]
countryName = XX
stateOrProvinceName = N/A
localityName = N/A
organizationName = Self-signed certificate
commonName = ${COMMON_NAME}: Self-signed certificate
  2. Using OpenSSL and the client.conf file, generate the client certificate and private key. You will also want to combine both files into a single client.pem file for later use with Workbench.

    The client.key and client.pem files contain sensitive information that will grant anyone access to your WES service. Treat these files like a password.

openssl req -new -nodes -x509 -days 365 -keyout client.key -out client.crt -config client.conf

# Generate a PEM file containing both the key and the certificate
cat client.key > client.pem
cat client.crt >> client.pem
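
Optionally, confirm the certificate details and that client.pem contains both PEM blocks:

# Show the client certificate's subject and validity period
openssl x509 -in client.crt -noout -subject -dates
# client.pem should contain two PEM blocks: the private key and the certificate
grep -c "BEGIN" client.pem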

Configure and Start NGINX

  1. Copy the following text and save it to a file named nginx.conf. This configuration runs NGINX as a background daemon listening on port 8443; all incoming requests must present the client certificate (client.crt) to be accepted.

# daemon off; # Uncomment me to run in the foreground
# error_log stderr; # Uncomment me to log errors to stderr instead of a file

worker_processes  1;

events {
    worker_connections  1024;
}


http {
    #include       mime.types;
    #default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';


    sendfile        on;

    keepalive_timeout  65;

    # access_log /dev/stdout; # Uncomment me to log messages to stdout instead of a file

    server {
        listen 8443 ssl default_server;
        listen [::]:8443 ssl default_server;
        # Comment out if you want to disable HTTPS
        ssl_certificate         server.crt;
        ssl_certificate_key     server.key;

        ssl_client_certificate  client.crt;
        ssl_verify_client       on;

        location / {
                proxy_pass http://localhost:8090;
        }
    }
}
  2. Start NGINX running in the background

nginx -c "$(pwd)/nginx.conf"
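
With this configuration NGINX runs as a background daemon; if you later edit nginx.conf, you can reload or stop it with the standard signal commands (depending on how NGINX is packaged on your system, you may need to pass the same -c path):

nginx -s reload   # re-read the configuration
nginx -s stop     # shut the proxy down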
  3. Test that you are able to connect to the WES Service through the NGINX reverse proxy, authenticating with the client certificate over an HTTPS connection

curl -k \
  --key client.key \
  --cert client.crt \
  https://127.0.0.1:8443/ga4gh/wes/v1/runs
  4. Finally, make sure you have adjusted any firewall rules to allow incoming requests on port 8443 (example commands for common host firewalls follow the check below). You can verify that external connectivity is working by running the same cURL command as above, changing the IP address to the compute node's public IP:

curl --cacert server.crt \
  --key client.key \
  --cert client.crt \
  https://${PUBLIC_IP}:8443/ga4gh/wes/v1/runs
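
How you open port 8443 depends on your environment. For example, on hosts using ufw or firewalld the commands look roughly like the following (a sketch only; your site's firewall or cloud security groups may differ):

# ufw (e.g. Ubuntu)
sudo ufw allow 8443/tcp

# firewalld (e.g. RHEL/CentOS)
sudo firewall-cmd --permanent --add-port=8443/tcp
sudo firewall-cmd --reload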

Connecting to Workbench

Once you have completed the setup process and validated that the WES instance is publicly accessible, you are ready to connect it to Workbench. You will need the following information:

  1. Contents of the server.crt file generated above.

  2. Contents of the client.pem file generated above.

  3. The public-facing IP address at which the WES Service can be reached.

Configuring the Engine

  1. Log into Workbench and navigate to the Settings page.

  2. Click the Add Engine button in the top right-hand corner and select the GA4GH Workflow Execution Service engine type.

  3. Fill in the engine information:

    1. Type a readable name for the engine. The ID should be auto-generated based on the name.

    2. For the URL field, enter the complete IP and port, prefixed by https, that will be used to access your WES Service. For example, if the compute node is accessible at IP 192.192.1.1 and port 8443, the URL would be: https://192.192.1.1:8443.

  4. Fill in the Provider and Region. If you are running this on-premises, select the Self Hosted option; otherwise choose the provider that corresponds to your compute environment.

  5. If you are using a self-signed certificate, toggle Configure SSL to on under the Environment section.

    1. In the Server Certificate input box, paste the contents of the server.crt file.

  6. Under the Authentication section, change the Method to Mutual TLS.

    1. In the Client Key and Certificate input box, paste the contents of the client.pem file. This file should contain both the client certificate and the client private key.

  7. Click the Save button. If Workbench was able to connect to the engine, you will see a message informing you that it was added, and you will be redirected back to the Settings page.

  8. Once the engine has been created, you are ready to run a workflow.
