Connecting to other sources via Trino
Publisher can connect to additional data sources via Trino. This guide explains how to configure Trino connectors in Publisher.
From the Publisher interface, select "Data Sources" in the navigation bar and click "Connect Data Source". Choose "Trino" from the available connectors.
In Trino, the catalog properties define how Trino connects to your data source. A catalog contains schemas and references a data source through a connector, forming the foundation of your data access configuration.
When you configure a Trino data source, you need to specify appropriate catalog properties based on the type of data source you're connecting to. These properties typically include:
Connection details (hostname, port)
Authentication credentials
Schema information
Performance settings
Security configurations
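As a rough illustration of the connection and authentication properties listed above, the sketch below assembles catalog properties for Trino's PostgreSQL connector and renders them as key=value lines. The property names (connector.name, connection-url, connection-user, connection-password) are the ones that connector uses; the hostname, database, user, and password are placeholder values, and the assumption that Publisher accepts properties in this key=value form should be checked against the connector instructions.

```python
def render_catalog_properties(properties: dict[str, str]) -> str:
    """Render properties as key=value lines, one property per line."""
    return "\n".join(f"{key}={value}" for key, value in properties.items())


# Placeholder values for illustration only; replace with your own details.
postgres_catalog = {
    "connector.name": "postgresql",
    "connection-url": "jdbc:postgresql://db.example.com:5432/analytics",
    "connection-user": "trino_reader",
    "connection-password": "<password>",
}

catalog_text = render_catalog_properties(postgres_catalog)
print(catalog_text)
```

Other connectors use different property names, so treat this only as a shape to adapt.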
For the connector types below, refer to the relevant instructions page:
For all other connector types, refer to the documentation for your specific connector.
The Row count query field allows you to specify a SQL query that counts the number of rows in a table. You can use template placeholders in your query, which will be automatically replaced with the appropriate values when the query is executed. Placeholders are surrounded by double braces ({{ and }}).
Available placeholders:
{{catalog}} - The catalog name
{{schema}} - The schema name
{{table}} - The table name
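Placeholders can be used as identifiers, for example in a FROM clause, or as string literals in more complex queries, for example in a WHERE clause that filters a metadata table by schema and table name.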
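Both uses can be sketched with a small Python function that mimics the substitution. This is only an illustration: it assumes plain textual replacement, which may differ from how Publisher actually processes the query, and the function name, the sample values, and the metadata.table_stats table in the second template are hypothetical.

```python
import re

# Matches {{name}} placeholders such as {{catalog}}, {{schema}}, {{table}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_row_count_query(template: str, catalog: str, schema: str, table: str) -> str:
    """Replace each placeholder with its value (assumed plain text substitution)."""
    values = {"catalog": catalog, "schema": schema, "table": table}

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"unknown placeholder: {name}")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


# Placeholders used as identifiers.
IDENTIFIER_TEMPLATE = "SELECT COUNT(*) FROM {{catalog}}.{{schema}}.{{table}}"

# Placeholders used as string literals; metadata.table_stats is a hypothetical table.
LITERAL_TEMPLATE = (
    "SELECT row_count FROM {{catalog}}.metadata.table_stats "
    "WHERE schema_name = '{{schema}}' AND table_name = '{{table}}'"
)

identifier_query = render_row_count_query(IDENTIFIER_TEMPLATE, "lake", "sales", "orders")
literal_query = render_row_count_query(LITERAL_TEMPLATE, "lake", "sales", "orders")
print(identifier_query)
print(literal_query)
```

Note that in the second template the schema and table placeholders sit inside single quotes, so the substituted values become SQL string literals rather than identifiers.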
The Row count query will be executed frequently to provide up-to-date row counts for tables in your data source. As the data source owner, you are responsible for monitoring the cost and performance impact of the query.
While the standard COUNT(*) query works for most data sources, it is recommended to use a more efficient query to retrieve row counts when available, as this can significantly reduce resource usage and costs.
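One way to think about this choice is as a lookup from connector type to a cheaper template, with COUNT(*) as the fallback. The sketch below is an assumption-laden example: the connector labels and function name are hypothetical, and the Iceberg template sums record_count from the table's $files metadata table, which reads file metadata instead of scanning data but can be inaccurate (for example when row-level deletes are present). Verify any such query against your Trino version and data before relying on it.

```python
# Fallback that works on most data sources but may scan the whole table.
DEFAULT_TEMPLATE = "SELECT COUNT(*) FROM {{catalog}}.{{schema}}.{{table}}"

# Hypothetical mapping from a connector label to a cheaper row count template.
EFFICIENT_TEMPLATES = {
    # Approximate count from Iceberg file metadata; may not reflect row-level deletes.
    "iceberg": 'SELECT sum(record_count) FROM {{catalog}}.{{schema}}."{{table}}$files"',
}


def choose_row_count_template(connector: str) -> str:
    """Return a cheaper template when one is known, otherwise the COUNT(*) fallback."""
    return EFFICIENT_TEMPLATES.get(connector.lower(), DEFAULT_TEMPLATE)


iceberg_template = choose_row_count_template("Iceberg")
mysql_template = choose_row_count_template("mysql")
print(iceberg_template)
print(mysql_template)
```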
Trino connectors require specific permissions to access and interact with various data sources. These permissions ensure that Trino can read, write, and manage data as needed. The exact permissions depend on the data source type and the operations that Trino needs to perform.
File-Based Data Sources: For connectors accessing file-based data sources like HDFS, S3, or Azure Blob Storage, Trino needs permissions to list and read files, and to write files if it performs write operations. For S3, this typically means s3:ListBucket and s3:GetObject for reads, plus s3:PutObject for writes, or the equivalent permissions for other storage services.
Database Connectors: For relational databases such as MySQL, PostgreSQL, or SQL Server, Trino requires permissions to execute SQL queries on the relevant tables and schemas: SELECT for read access, plus INSERT, UPDATE, and DELETE if Trino needs to modify data.
NoSQL and Other Data Stores: For NoSQL databases like Cassandra or MongoDB, Trino needs permissions to read and write data. This usually involves permissions to query collections or tables and manage indexes.
Cloud Services: When accessing cloud services such as Google BigQuery or Amazon Athena, Trino needs appropriate API access permissions. This includes roles or policies that allow data querying and management.
Snowflake: For Snowflake, Trino requires permissions to execute SQL queries and manage data. This includes USAGE on the database and schema and SELECT on the tables and views.
Apache Iceberg: For Apache Iceberg tables, Trino requires permissions to access the underlying storage system (e.g., S3, HDFS), including permissions to list and read files in the storage locations.
Note: If you plan to perform any ETL/ELT operations with Iceberg tables, Trino will need write permissions to the storage system in addition to read permissions.
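For S3-backed sources, including Iceberg tables stored in S3, the read and write permissions described above could be expressed as an IAM policy along the following lines. This is a minimal sketch: the bucket name and function name are hypothetical, the write actions shown (s3:PutObject and s3:DeleteObject) are an assumption about what write operations would need, and a real deployment may need a narrower resource scope or additional actions.

```python
import json


def build_s3_policy(bucket: str, allow_writes: bool = False) -> dict:
    """Build an IAM policy document granting list/read access, optionally with writes."""
    object_actions = ["s3:GetObject"]
    if allow_writes:
        # Assumed write actions for ETL/ELT workloads.
        object_actions += ["s3:PutObject", "s3:DeleteObject"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object actions apply to the keys inside the bucket.
                "Effect": "Allow",
                "Action": object_actions,
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


read_only_policy = build_s3_policy("example-data-lake")
read_write_policy = build_s3_policy("example-data-lake", allow_writes=True)
print(json.dumps(read_only_policy, indent=2))
```

Starting from the read-only policy and enabling writes only when ETL/ELT operations are planned keeps the granted access close to what Trino actually uses.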
For detailed information on the specific permissions required for each connector, refer to the official Trino documentation.