LogoLogo
  • Overview
  • publisher
    • Introduction
    • Getting Started
      • Logging in to Publisher
    • Data Sources
      • Connecting a Data Source
      • Managing a Data Source
      • Connectors
        • AWS S3 Permissions
        • Connecting to AWS S3 Storage
        • Google Cloud Storage (GCS) Permissions
        • Connecting to Google Cloud Storage
        • PostgreSQL Permissions
        • Connecting to PostgreSQL
        • PostgreSQL on Azure Permissions
        • Microsoft Azure Blob Storage Permissions
        • Connecting to Microsoft Azure Blob Storage
        • Connecting to HTTPS
        • Connecting to other sources via Trino
          • BigQuery
    • Collections
      • Creating a Collection
      • Sharing a Collection
      • Collection Filters
      • Editing Collection Metadata
      • Updating Collection Contents
    • Access Policies
      • Creating an Access Policy
      • Managing Access Policies
    • Questions
      • Adding Questions
      • Example Question
    • Settings
      • Viewing Current and Past Administrators
      • Adding an Administrator
      • Removing an Administrator
      • Setting Notification Preferences
  • Explorer
    • Introduction
    • Viewing a Collection
    • Browsing Collections
    • Asking Questions
    • Accessing a Private Collection
      • Requesting Access to a Private Collection
    • Filtering Data in Tables
      • Strings
      • Dates
      • Numbers
  • Workbench
    • Introduction
    • Getting Started
      • Logging into Workbench
      • Connecting an Engine
      • Finding or Importing a Workflow
      • Configuring Workflow Inputs
      • Running and Monitoring a Workflow
      • Locating Outputs
    • Engines
      • Adding and Updating an Engine
        • On AWS HealthOmics
        • On Microsoft Azure
        • On Google Cloud Platform
        • On Premises
      • Parameters
        • AWS HealthOmics
        • Google Cloud Platform
        • Microsoft Azure
        • On-Premises
        • Cromwell
        • Amazon Genomics CLI
    • Workflows
      • Finding Workflows
      • Adding a Workflow
      • Supported Languages
      • Repositories
        • Dockstore
    • Instruments
      • Getting Started with Instruments
      • Connecting a Storage Account
      • Using Sample Data in a Workflow
      • Running Workflows Using Samples
      • Family Based Analysis with Pedigree Information
      • Monitor the Workflow
      • CLI Reference
        • Instruments
        • Storage
        • Samples
        • OpenAPI Specification
    • Entities
    • Terminology
  • Passport
    • Introduction
    • Registering an Email Address for a Google Identity
  • Command Line Interface
    • Installation
    • Usage Examples
    • Working with JSON Data
    • Reference
      • workbench
        • runs submit
        • runs list
        • runs describe
        • runs cancel
        • runs delete
        • runs logs
        • runs tasks list
        • runs events list
        • engines list
        • engines describe
        • engines parameters list
        • engines parameters describe
        • engines health-checks list
        • workflows create
        • workflows list
        • workflows describe
        • workflows update
        • workflows delete
        • workflows versions create
        • workflows versions list
        • workflows versions describe
        • workflows versions files
        • workflows versions update
        • workflows versions delete
        • workflows versions defaults create
        • workflows versions defaults list
        • workflows versions defaults describe
        • workflows versions defaults update
        • workflows versions defaults delete
        • namespaces get-default
        • storage add
        • storage delete
        • storage describe
        • storage list
        • storage update
        • storage platforms add
        • storage platforms delete
        • storage platforms describe
        • storage platforms list
        • samples list
        • samples describe
        • samples files list
      • publisher
        • datasources list
  • Analysis
    • Python Library
    • Popular Environments
      • Cromwell
      • CWL Tool
      • Terra
      • Nextflow
      • DNAnexus
Powered by GitBook

© DNAstack. All rights reserved.

On this page
  • Required Connector Functionality
  • Configuration
  • Data Access
  • Monitoring
  • Connector Types

Was this helpful?

  1. publisher
  2. Data Sources

Connectors

A Connector enables Publisher to interface with various external storage systems. They provide standardized access to files and tables while managing authentication, monitoring, and data access functionality.

Required Connector Functionality

A Connector must fulfill these requirements:

Configuration

  • Provide an HTML user interface for configuring external connections

    • Store encrypted credentials for external systems (e.g., service accounts or dedicated user accounts)

    • Generate an audit trail for configuration changes

Data Access

  • Respond to requests about attached data by:

    • Listing files with current metadata (size, checksums, create/update dates)

    • Generating time-limited signed URLs for file retrieval

    • Listing tables with current metadata (column names & types, row count)

    • Dumping table data with optional column and row subsetting

    • Generating an audit trail for data access

Monitoring

  • Monitor external systems for data changes

    • Notify Publisher of changes

  • Log all external system access, including purpose and success/fail status

Connector Types

  1. AWS S3 Storage

  2. Google Cloud Storage

  3. Microsoft Azure Blob Storage

  4. PostgreSQL

  5. HTTPS

While Publisher includes DNAstack-developed connectors, third-party developers can create additional connectors for other data sources. For information on developing custom connectors, refer to the Data Source Connector Developer Guide.

PreviousManaging a Data SourceNextAWS S3 Permissions

Last updated 2 months ago

Was this helpful?