Samples
Overview
The samples group of commands lets users manage biological samples in Workbench and sync them from a linked storage account.
Samples Commands
Samples List
List all samples:
Flag Parameters
| Flag | Description | Example |
| --- | --- | --- |
| --max-results | Limit the total number of results returned. | --max-results 10 |
| --page | Specify the page of results to retrieve. | --page 2 |
| --page-size | The number of results to return in a single page. | --page-size 50 |
| --sort | Sort results by column and direction (ASC or DESC). | --sort platform:ASC |
| --platform | Filter results by the sequencing platform associated with the sample. | --platform pacbio |
| --storage-id | Filter results by the storage account linked to the samples. | --storage-id my-gcp-account |
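As a rough illustration of how the --sort, --page, --page-size, and --max-results values relate to one another, the following Python sketch applies them to an in-memory list. It is not the CLI's implementation: the helper names `parse_sort` and `select_page` are hypothetical, and it assumes pages are numbered from 1 and that the --max-results cap is applied before paging. The platform values other than pacbio and the ID sample-789 are made up for the example.

```python
def parse_sort(spec: str) -> tuple[str, bool]:
    """Split a 'column:DIRECTION' value into (column, descending)."""
    column, sep, direction = spec.partition(":")
    # Assumption: a missing direction falls back to ascending order.
    direction = direction.upper() if sep else "ASC"
    if not column or direction not in ("ASC", "DESC"):
        raise ValueError(f"invalid sort spec: {spec!r}")
    return column, direction == "DESC"


def select_page(rows, sort, page=1, page_size=50, max_results=None):
    """Sort rows, cap the total, then return one page (pages start at 1)."""
    column, descending = parse_sort(sort)
    ordered = sorted(rows, key=lambda r: r[column], reverse=descending)
    if max_results is not None:
        ordered = ordered[:max_results]
    start = (page - 1) * page_size
    return ordered[start:start + page_size]


# Illustrative records only; real sample records will have more fields.
samples = [
    {"id": "sample-123", "platform": "pacbio"},
    {"id": "sample-456", "platform": "illumina"},
    {"id": "sample-789", "platform": "ont"},
]

first_page = select_page(samples, "platform:ASC", page=1, page_size=2)
print([s["id"] for s in first_page])
```

With a page size of 2, the third record falls onto page 2, which is what `--page 2` would request under these assumptions.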
Samples Describe
Describe one or more samples:
Positional Parameters
| Parameter | Description | Example |
| --- | --- | --- |
| SAMPLE_ID | The ID of the sample to describe. | sample-123 |
| [...SAMPLE_ID] | Optionally, additional sample IDs to describe. | sample-456 |
Sample Files List
List the files associated with a given sample across all storage accounts and platforms they were synced from. Optionally, filter the results using the flags below.
Positional Parameters
| Parameter | Description | Example |
| --- | --- | --- |
| SAMPLE_ID | The ID of the sample to list files for. | sample-123 |
Flag Parameters
| Flag | Description | Example |
| --- | --- | --- |
| --max-results | Limit the total number of results returned. | --max-results 10 |
| --page | Specify the page of results to retrieve. | --page 2 |
| --page-size | The number of results to return in a single page. | --page-size 50 |
| --sort | Sort results by column and direction (ASC or DESC). | --sort path:ASC |
| --platform | Filter results by the sequencing platform associated with the sample. | --platform pacbio |
| --storage | Filter results by the storage account linked to the sample. | --storage my-gcp-account |
| --instrument-id | Filter results by the instrument ID associated with the sample. | --instrument-id 80243 |
| --platform-type | Filter results by the platform type. | --platform-type pacbio |
| --provider | Filter results by the cloud provider. | --provider aws |
| --search | Filter results with a full-text search. | --search ".bam" |
Runs Submit
The runs submit command already exists and lets a user submit one or more workflows to Workbench. This change shows how a user can submit a workflow using samples as the sole input to “sample-enabled” workflows.
A new --samples flag supports running a workflow with one or more samples. Workflows that support it can use samples to fill in their inputs, so the user does not have to define them as JSON. If the --samples flag is used with a workflow that does not support it, the user is presented with a helpful error message.
So that multiple versions of a single workflow can be supported, and so that updating a workflow does not cause issues, a backend component handles the logic for mapping samples into the inputs.
Data Localization
If the samples flag is used, we can ensure a given set of sample inputs matches the target regions and providers that the engine is running on.
If there are files from multiple regions or providers, only the supported provider/region files will be used. The user can modify this behavior by manually defining the sample inputs.
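The selection rule above can be pictured with a small Python sketch. This is only an illustration of the idea, not the backend's logic: the `SampleFile` record, the `localize` helper, the bucket paths, and the region names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleFile:
    """Hypothetical record for one file belonging to a sample."""
    sample_id: str
    path: str
    provider: str
    region: str


def localize(files, engine_provider, engine_region):
    """Keep only the files stored with the engine's provider and region."""
    return [
        f for f in files
        if f.provider == engine_provider and f.region == engine_region
    ]


# Made-up files for two samples, spread over two providers.
files = [
    SampleFile("HG001", "s3://bucket-a/HG001.bam", "aws", "us-east-1"),
    SampleFile("HG001", "gs://bucket-b/HG001.bam", "gcp", "us-central1"),
    SampleFile("HG002", "s3://bucket-a/HG002.bam", "aws", "us-east-1"),
]

usable = localize(files, "aws", "us-east-1")
print([f.path for f in usable])
```

In this sketch the copy of HG001 held with the other provider is simply left out, mirroring the statement that only files from the supported provider and region are used.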
Defaults
Default values will be handled for specific workflows so that a user does not need to define them. Defaults can be specific to the engine, workflow, provider, or region (or any combination thereof) and will be defined within the backend service.
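One way to think about scoped defaults is as a list of rules, each tagged with the engine, workflow, provider, or region it applies to. The Python sketch below merges every rule that matches the current run. It is an assumption-laden illustration: the rule format, the `resolve_defaults` helper, the workflow name, and the setting names are hypothetical, and the precedence (more specific scopes override less specific ones) is assumed rather than documented.

```python
def resolve_defaults(rules, context):
    """Merge the values of every rule whose scope matches the context.

    Rules with more scope keys are applied later, so they override less
    specific ones. This precedence is an assumption of the sketch.
    """
    matching = [
        rule for rule in rules
        if all(context.get(key) == value for key, value in rule["scope"].items())
    ]
    merged = {}
    for rule in sorted(matching, key=lambda r: len(r["scope"])):
        merged.update(rule["values"])
    return merged


# Hypothetical rules: a global default, a provider default, and a
# provider-plus-region override.
rules = [
    {"scope": {}, "values": {"threads": 4}},
    {"scope": {"provider": "aws"}, "values": {"storage_class": "standard"}},
    {"scope": {"provider": "aws", "region": "us-east-1"}, "values": {"threads": 16}},
    {"scope": {"provider": "gcp"}, "values": {"threads": 8}},
]

context = {
    "engine": "aws-healthomics-2",
    "workflow": "trio-calling",  # hypothetical workflow name
    "provider": "aws",
    "region": "us-east-1",
}

print(resolve_defaults(rules, context))
```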
Flag Parameters
| Flag | Description | Example |
| --- | --- | --- |
| --samples | An optional flag that accepts a comma-separated list of sample IDs to use in the given workflow, for example HG001,HG002,HG003. To preserve the relationships between the proband and the parents, the user can add the father and mother prefixes to indicate which samples are the parents. All unprefixed samples are considered children of the defined parents. This flag is not meant to capture every relationship that could occur; at most it covers the trio use case. | --samples HG001<br>--samples HG001,HG002,HG003<br>--samples father:HG001,mother:HG002,HG003 |
| --engine | An existing flag that lets the user choose the specific engine to run a workflow on. | --engine aws-healthomics-2 |
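The father and mother prefix convention for --samples can be sketched in Python as follows. This is a minimal illustration of the described format, not the CLI's parser: the `parse_samples` helper and its error messages are hypothetical, and it assumes sample IDs never contain a colon or a comma.

```python
def parse_samples(value: str) -> dict:
    """Parse a --samples value into father, mother, and children."""
    result = {"father": None, "mother": None, "children": []}
    for item in value.split(","):
        item = item.strip()
        if not item:
            raise ValueError("empty sample ID in list")
        role, sep, sample_id = item.partition(":")
        if not sep:
            # Unprefixed samples are treated as children of the parents.
            result["children"].append(item)
        elif role in ("father", "mother") and sample_id:
            if result[role] is not None:
                raise ValueError(f"{role} given more than once")
            result[role] = sample_id
        else:
            raise ValueError(f"unrecognized entry: {item!r}")
    return result


print(parse_samples("father:HG001,mother:HG002,HG003"))
```

A plain list such as HG001,HG002,HG003 yields three children and no parents, which matches the rule that only prefixed samples are treated as parents.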
Examples