Connect to a Data Platform
This section explains how to connect Tecton to your data platform, which may be Databricks, EMR, Dataproc, or Snowflake. Connecting to your data platform allows Tecton to manage resources and execute jobs on your platform.
Connecting to your data platform is an important first step in using Tecton. Tecton uses the platform to execute jobs that materialize features from your data sources and populate your feature store, so it needs permission to access the platform in order to run them.
Once connected, Tecton can execute and monitor these jobs automatically. The Tecton UI also shows key metrics for feature materialization and data platform resource usage.
How you connect Tecton depends on the specific data platform. See the guides below for details:
- Configuring Databricks on AWS or Configuring Databricks on Google Cloud explains how to connect Tecton to Databricks. This is for users leveraging Databricks as their Spark compute engine.
- Configuring EMR explains how to connect Tecton to EMR. This is for users leveraging EMR as their Spark compute engine.
- Configuring Snowflake explains how to connect Tecton to Snowflake. This is for Tecton on Snowflake users.
- Configuring Dataproc explains how to connect Tecton to Dataproc. This is for users leveraging Dataproc as their Spark compute engine.
Each guide covers the concepts that may be required to set up the connection on that platform, such as IAM roles, security groups, VPC peering, and Glue Data Catalog permissions. Please review the full guide for your data platform before getting started.
Let Tecton Support know if you have any questions! We're happy to help you through the setup process.