Samples

Overview

The samples group of commands allows users to manage biological samples in Workbench and to sync them from a linked storage account.

Samples Commands

Samples List

List all samples:

omics workbench samples list
  [--max-results]
  [--page]
  [--page-size]
  [--sort]  
  [--platform]
  [--storage-id]

Flag Parameters

| Flag | Description | Example |
| --- | --- | --- |
| --max-results | Limit the total number of results returned. | --max-results 10 |
| --page | Specify the page of results to retrieve. | --page 2 |
| --page-size | The number of results to return in a single page. | --page-size 50 |
| --sort | Sort results by column and direction (ASC or DESC). | --sort platform:ASC |
| --platform | Filter results by the sequencing platform associated with the sample. | --platform pacbio |
| --storage-id | Filter results by the storage account linked to the samples. | --storage-id my-gcp-account |
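
For example, the flags above can be combined in a single call; the platform, storage ID, and sort values below are the illustrative ones from the table.

# List PacBio samples from one storage account, 20 per page, sorted by platform
omics workbench samples list \
  --platform pacbio \
  --storage-id my-gcp-account \
  --page-size 20 \
  --sort platform:ASC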

Samples Describe

Describe one or more samples:

omics workbench samples describe SAMPLE_ID [...SAMPLE_ID]

Positional Parameters

| Parameter | Description | Example |
| --- | --- | --- |
| SAMPLE_ID | The ID of the sample to describe. | sample-123 |
| [...SAMPLE_ID] | Optionally, additional sample IDs to describe. | sample-456 |
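
For example, multiple sample IDs can be passed in a single call; sample-123 and sample-456 are the placeholder IDs from the table above.

# Describe two samples at once (placeholder IDs)
omics workbench samples describe sample-123 sample-456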

Sample Files List

List the files associated with a given sample across all storage accounts and platforms they were synced from. The results can optionally be filtered using the flags below.

omics workbench samples files list SAMPLE_ID
  [--max-results]
  [--page]
  [--page-size]
  [--sort]
  [--platform]
  [--storage]
  [--instrument-id]
  [--platform-type]
  [--provider]
  [--search]

Positional Parameters

| Parameter | Description | Example |
| --- | --- | --- |
| SAMPLE_ID | The ID of the sample to list files for. | sample-123 |

Flag Parameters

| Flag | Description | Example |
| --- | --- | --- |
| --max-results | Limit the total number of results returned. | --max-results 10 |
| --page | Specify the page of results to retrieve. | --page 2 |
| --page-size | The number of results to return in a single page. | --page-size 50 |
| --sort | Sort results by column and direction (ASC or DESC). | --sort path:ASC |
| --platform | Filter results by the sequencing platform associated with the sample. | --platform pacbio |
| --storage | Filter results by the storage account linked to the sample. | --storage my-gcp-account |
| --instrument-id | Filter results by the instrument ID associated with the sample. | --instrument-id 80243 |
| --platform-type | Filter results by the platform type. | --platform-type pacbio |
| --provider | Filter results by the cloud provider. | --provider aws |
| --search | Full-text search across file results. | --search ".bam" |
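
For example, the filters above can be combined to narrow the file listing; the values below are the illustrative ones from the table.

# List a sample's BAM files synced from a PacBio platform on AWS, sorted by path
omics workbench samples files list sample-123 \
  --platform pacbio \
  --provider aws \
  --search ".bam" \
  --sort path:ASC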

Runs Submit

The runs submit command submits one or more workflows to Workbench. For "sample-enabled" workflows, samples can be used as the sole input.

The --samples flag lets supported workflows fill out their inputs from samples without the inputs being defined as JSON. If the --samples flag is used with a workflow that does not support samples, a helpful error message is shown.

To support multiple versions of a single workflow and to avoid issues when a workflow is updated, a backend component handles the logic for mapping samples into the workflow inputs.

Data Localization

When the --samples flag is used, Workbench ensures that the given set of sample inputs matches the regions and providers that the engine is running on.

If there are files from multiple regions or providers, only the files from the supported provider and region are used. The user can override this behavior by manually defining the sample inputs.

Defaults

Default values are handled for specific workflows so that a user does not need to define them. Defaults can be specific to the engine, workflow, provider, or region (or any combination thereof) and are defined within the backend service.

omics workbench runs submit 
  --url "PacificBiosciences:HiFi-human-assembly-WDL--HiFi-human-assembly-WDL/dockstore" 
  --samples [SAMPLE-NAMES]
  [--engine]

Flag Parameters

| Flag | Description | Example |
| --- | --- | --- |
| --samples | An optional flag that accepts a comma-separated list of sample IDs to use in the given workflow. To preserve the relationship between a proband and its parents, prefix the parent samples with father: and mother:; all unprefixed samples are considered children of the defined parents. This flag is not meant to capture every possible pedigree; it solves for the trio use case. | --samples HG001<br>--samples HG001,HG002,HG003<br>--samples father:HG001,mother:HG002,HG003 |
| --engine | An existing flag that allows the user to define the specific engine they would like to run a workflow on. | --engine aws-healthomics-2 |

Examples

# Simple case: mapping a single sample
omics workbench runs submit \
  --url HifiSolves-single-sample/latest \
  --samples HG001

# Simple case: mapping multiple samples
omics workbench runs submit \
  --url HifiSolves-multiple-sample/latest \
  --samples HG001,HG002,HG003

# Complex case: mapping multiple samples with relationships
omics workbench runs submit \
  --url HifiSolves-single-sample/latest \
  --samples father:HG001,mother:HG002,HG003
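
The --engine flag can be combined with any of the examples above to target a specific engine; aws-healthomics-2 is the illustrative engine ID from the flag table.

# Run the trio example on a specific engine (illustrative engine ID)
omics workbench runs submit \
  --url HifiSolves-single-sample/latest \
  --samples father:HG001,mother:HG002,HG003 \
  --engine aws-healthomics-2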
