© DNAstack. All rights reserved.

Getting Started with Instruments


Last updated 2 months ago


Before you begin working with Instruments, ensure you understand the prerequisites and basic setup requirements. This guide will walk you through the initial steps of setting up and using Instruments in your environment.

Prerequisites

Before using Instruments, you will need:

  1. A Workbench account with appropriate permissions.

  2. Access to cloud storage containing your sequencing data (AWS, GCP, or Azure).

  3. The Workbench CLI installed and configured.
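Once the prerequisites are in place, a quick sanity check can confirm the CLI is working. This sketch assumes the CLI is installed as the `dnastack` binary described in the Command Line Interface section; your output will reflect your own account:

```shell
# Check that the CLI is installed and on your PATH
dnastack --version

# List the storage accounts currently connected to Workbench
# (requires that you are logged in with appropriate permissions)
dnastack workbench storage list
```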

Uploading Sample Data to the Cloud

Use the native tools provided by your cloud service provider to upload data to their storage services. This ensures compatibility and ease of use.

  • For AWS, use the AWS CLI or the S3 Console.

  • For Azure, use the Azure Storage Explorer or the Azure Portal.

  • For Google Cloud, use the gcloud CLI or the Cloud Storage Console.

Maintain the directory structure and file naming conventions for accurate indexing.
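For example, an upload that preserves the run-folder structure might look like the following. Bucket names, storage accounts, and local paths below are placeholders; substitute your own:

```shell
# AWS: sync a local run folder to S3, preserving its directory structure
aws s3 sync ./r84001_20250101_000000 s3://my-sequencing-bucket/revio/r84001_20250101_000000

# Azure: recursive upload with azcopy (or use Azure Storage Explorer)
azcopy copy "./r84001_20250101_000000" \
  "https://myaccount.blob.core.windows.net/sequencing/revio" --recursive

# Google Cloud: sync with the gcloud CLI
gcloud storage rsync --recursive ./r84001_20250101_000000 \
  gs://my-sequencing-bucket/revio/r84001_20250101_000000
```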

PacBio Data

Metadata File and Directory Structure

  • Workbench relies on the metadata.xml files generated by SMRT Link to identify samples and their associated Movie BAM files.

    • These metadata files map each sample to the correct data files, enabling Workbench to automatically associate the appropriate files.

  • It is critical to preserve the directory structure and metadata files as generated by PacBio instruments during upload to ensure seamless detection.
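As a sketch of what this looks like on disk (all names below are hypothetical; the actual layout is produced by the instrument and SMRT Link), a run directory with a single SMRT Cell might contain:

```
r84001_20250101_000000/                            # one sequencing run
└── 1_A01/                                         # one SMRT Cell
    ├── metadata/
    │   └── m84001_250101_000000_s1.metadata.xml   # maps samples to BAM files
    └── hifi_reads/
        └── m84001_250101_000000_s1.hifi_reads.bam # Movie BAM (unbarcoded: one sample)
```

Preserving this structure during upload lets Workbench locate each metadata.xml file and resolve its associated BAM files automatically.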

Using SMRT Link to Sync Data to the Cloud

  • SMRT Link version 25.1 and later supports native cloud storage synchronization.

    • This feature allows users to directly sync sequencing data, including metadata.xml and BAM files, to a cloud storage bucket of their choice.

Workbench Instruments supports indexing and analyzing the results of PacBio sequencing runs from SMRT Cells on the Revio and Vega platforms. All output data is organized into a directory structure reflecting the SMRT Cells that were run. Barcoding within a SMRT Cell delineates different samples, while unbarcoded Movie BAM files indicate a single sample for the entire SMRT Cell.
