
© DNAstack. All rights reserved.


Workflows


Last updated 4 months ago


Overview

A workflow is a systematic sequence of processes designed to achieve a specific goal, from automating business processes to executing bioinformatics analyses. In Workbench, workflows provide structured methods for processing and analyzing data.

Workflow Structure

Each workflow in Workbench consists of metadata (its name, identifier, and description) and a set of versions. Each version is an immutable snapshot of the descriptor files, the "instructions" that Workbench uses to analyze data.

The combination of workflow ID and version ID forms a unique identifier within your Workbench account. This makes every workflow version addressable and lets you track which version was used for a specific execution.
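The structure described above can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, field names, and the `address` helper are hypothetical and are not part of the Workbench API.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class WorkflowVersion:
    """An immutable snapshot of descriptor files (hypothetical model)."""
    version_id: str
    # (path, content) pairs; a tuple keeps the snapshot read-only.
    descriptor_files: Tuple[Tuple[str, str], ...]


@dataclass
class Workflow:
    """Workflow metadata plus its set of versions (hypothetical model)."""
    workflow_id: str
    name: str
    description: str
    versions: Dict[str, WorkflowVersion] = field(default_factory=dict)

    def add_version(self, version: WorkflowVersion) -> None:
        # Versions are snapshots, so an existing version ID is never overwritten.
        if version.version_id in self.versions:
            raise ValueError(f"version {version.version_id!r} already exists")
        self.versions[version.version_id] = version

    def address(self, version_id: str) -> Tuple[str, str]:
        # The (workflow ID, version ID) pair identifies one exact snapshot.
        if version_id not in self.versions:
            raise KeyError(version_id)
        return (self.workflow_id, version_id)


# Illustrative values only.
workflow = Workflow("hello-world", "Hello World", "Prints a greeting")
workflow.add_version(WorkflowVersion("v1", (("main.wdl", "version 1.0"),)))
run_reference = workflow.address("v1")
```

Storing `run_reference` alongside each execution is one way to record exactly which snapshot produced a given result.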

Components and Management

Command Chains

Workflows orchestrate commands in a predefined sequence. This ordered execution ensures methodical data processing, with outputs from one step often serving as inputs for subsequent steps.
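As a minimal sketch of this idea, the following Python snippet runs a fixed sequence of steps and feeds each output into the next step. The step functions and the `run_chain` helper are hypothetical examples, not Workbench functionality.

```python
from typing import Callable, List

# A step takes the previous step's output and returns its own output.
Step = Callable[[str], str]


def run_chain(steps: List[Step], initial_input: str) -> str:
    """Run steps in order, passing each output to the next step."""
    data = initial_input
    for step in steps:
        data = step(data)
    return data


def trim(text: str) -> str:
    # Remove surrounding whitespace.
    return text.strip()


def uppercase(text: str) -> str:
    # Normalize to upper case.
    return text.upper()


def reverse_complement(sequence: str) -> str:
    # Reverse the sequence and swap each base for its complement.
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(sequence))


result = run_chain([trim, uppercase, reverse_complement], "  aacg\n")
print(result)  # prints "CGTT"
```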

Computational Management

Modern workflows coordinate multiple steps with varying requirements:

  • Different computational resources

  • Specific software environments

  • Distinct processing needs

Workflow Languages

Workflow languages bring structure and clarity to complex processes through:

Task Definition

Each task specifies its computational requirements, ensuring optimal execution environments and resource allocation.
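To illustrate, a task definition can be modeled as a record of its command and requirements. The sketch below is an assumption-laden example: the `TaskSpec` fields, the `fits` helper, and the container image names are all hypothetical and do not correspond to any specific workflow language or to Workbench itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    """One task and the resources it asks for (hypothetical model)."""
    name: str
    command: str
    cpu: int
    memory_gb: float
    container_image: str  # software environment the task runs in


def fits(task: TaskSpec, available_cpu: int, available_memory_gb: float) -> bool:
    """Return True if a machine with the given capacity can run the task."""
    return task.cpu <= available_cpu and task.memory_gb <= available_memory_gb


# Two tasks with different requirements; all values are illustrative.
align = TaskSpec("align", "aligner reads.fq", cpu=8, memory_gb=32.0,
                 container_image="example.org/aligner:1.0")
summarize = TaskSpec("summarize", "summarize out.bam", cpu=1, memory_gb=2.0,
                     container_image="example.org/summarizer:1.0")

# A small machine can run the light task but not the heavy one.
small_machine = (2, 4.0)
runnable = [t.name for t in (align, summarize) if fits(t, *small_machine)]
print(runnable)  # prints ['summarize']
```

Declaring requirements per task, rather than per workflow, lets a scheduler place each step on a machine sized for that step.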

Container Integration

Docker containers provide:

  • Consistent software environments

  • Required dependencies

  • Reproducible execution conditions

Task Chaining

Languages enable seamless connections between tasks, creating efficient data processing pipelines where outputs flow automatically to subsequent steps.
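One way to picture task chaining is to derive the execution order from each task's declared inputs and outputs. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the task names, file names, and the `execution_order` helper are hypothetical and are not taken from any particular workflow language.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# Each task declares what it consumes and what it produces (illustrative).
tasks: Dict[str, Dict[str, List[str]]] = {
    "trim": {"inputs": ["raw_reads"], "outputs": ["trimmed_reads"]},
    "align": {"inputs": ["trimmed_reads"], "outputs": ["alignments"]},
    "call_variants": {"inputs": ["alignments"], "outputs": ["variants"]},
    "report": {"inputs": ["alignments", "variants"], "outputs": ["summary"]},
}


def execution_order(tasks: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Order tasks so every task runs after the tasks producing its inputs."""
    # Map each output name to the task that produces it.
    producer = {
        output: name
        for name, spec in tasks.items()
        for output in spec["outputs"]
    }
    # A task depends on the producers of its inputs; inputs with no
    # producer (such as "raw_reads") are treated as external data.
    dependencies = {
        name: {producer[i] for i in spec["inputs"] if i in producer}
        for name, spec in tasks.items()
    }
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(tasks)
print(order)  # prints ['trim', 'align', 'call_variants', 'report']
```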

Conclusion

Workflows provide structured, systematic data processing and analysis. With workflow languages and tools like Workbench, you can design, execute, and manage complex workflows precisely. The result is efficiency, scalability, and reproducibility, which is crucial for scientific and business applications.
