0.7 to 0.8 Upgrade Guide
Overview
Version 0.8 of the SDK comes with performance and cost improvements, new capabilities to define more powerful features, the ability to manage releases of the Tecton Materialization Runtime, and an enhanced development experience. It also includes minor changes to the behavior and naming of some Tecton objects, methods, and CLI commands. Read the general Upgrade Process page in addition to the following documentation for guidance on how to safely upgrade from 0.7 to 0.8.
To ensure a safe upgrade, Tecton disallows any destructive changes (i.e. Recreates) to your Feature Repository while upgrading. For example, tecton apply will prevent changes to a Feature View's Offline Store format while upgrading, as this would normally cause re-materialization.
Step-by-Step Upgrade Flow
These are the most important changes in Tecton 0.8:
- Materialization clusters are required to pin a specific version of the tecton library (using the tecton_materialization_runtime parameter on Batch Feature Views, Stream Feature Views, and Feature Tables).
- The default Offline Store format is changing from Parquet to Delta.
- Tecton is introducing a new required Repo Config .yaml file to configure defaults in your Feature Repository.
- Stream jobs for Spark-based Stream Feature Views must run on on-demand instances. This may be a breaking change if your Feature Repository explicitly sets instance_availability to spot or spot_with_fallback in stream_compute for Stream Feature Views.
Most customers should follow this guidance while upgrading:
- Repo Config file: Use Tecton's CLI to generate a repo.yaml file with pre-filled defaults for tecton_materialization_runtime:
  - If your Feature Repository does not specify an Offline Store format, run tecton repo-config init --parquet to explicitly set the default format to Parquet (in order to avoid destructive changes or rematerialization).
  - Otherwise, run tecton repo-config init.
- Spark-based Stream Feature Views: If stream_compute is set, ensure that instance_availability is either not specified or is set to on_demand.
- Deprecated Methods, Parameters, or Attributes: Ensure that your Feature Repository does not rely on any Tecton methods, parameters, or attributes that are deprecated or removed in 0.8. This is a breaking change if your Feature Repository relies on methods, parameters, or attributes removed in 0.8.
Please see the following sections for details on all changes in 0.8.
Changes in 0.8
Versioning of the Tecton Materialization Runtime
Tecton 0.8 introduces versioning of the Tecton Materialization Runtime that is deployed to Databricks & EMR clusters for orchestration of backfills and materialization. This further improves the reliability of Tecton releases beyond our robust testing and validation process by letting customers iteratively upgrade Feature Views and Tables.
In 0.8+, tecton_materialization_runtime is a required parameter on Batch Feature Views, Stream Feature Views, and Feature Tables. It must be set to an exact version of the tecton package (e.g. 0.8.0).
Tecton guarantees +1 support: for example, features applied with Tecton SDK 0.8 will support Materialization Runtime versions of 0.8.* or 0.9.*.
This means that jobs for Feature Views & Tables applied using any version of
Tecton will no longer automatically update (or automatically restart, in the
case of Stream Feature Views) when Tecton releases changes to the
materialization runtime. Customers should instead upgrade their Feature Views &
Tables by setting the tecton_materialization_runtime
parameter when upgrading
their SDK to a new minor version or when recommended by Tecton Support. Tecton
will also document release notes in the Changelog.
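As a sketch of what pinning the runtime looks like on a single Feature View, the definition below sets tecton_materialization_runtime explicitly; the source, entity, mode, and schedule values are illustrative placeholders rather than recommendations from this guide:
from datetime import datetime, timedelta
from tecton import batch_feature_view

# Assumes a `transactions` Batch Source and a `user` Entity are defined
# elsewhere in the Feature Repository (placeholder names).
@batch_feature_view(
    sources=[transactions],
    entities=[user],
    mode="spark_sql",
    batch_schedule=timedelta(days=1),
    feature_start_time=datetime(2023, 1, 1),
    # Pin an exact version of the Tecton Materialization Runtime (required in 0.8+).
    tecton_materialization_runtime="0.8.0",
)
def user_transactions(transactions):
    return f"SELECT user_id, timestamp, amount FROM {transactions}"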
The new Repo Config, described in the following section, helps customers define
a default tecton_materialization_runtime
for all Feature Views & Tables in
their Feature Repository.
Repo Config File
Tecton 0.8 introduces the
Repo Config,
a required configuration file used to set defaults for Tecton objects in a
Feature Repository. This helps developers avoid having to specify certain
parameters for every new Tecton object and ultimately results in simpler feature
definitions. For example, developers can use this file to set a default
tecton_materialization_runtime
for all Feature Views in a Feature Repository.
During tecton plan/apply/test, Tecton will look for a Repo Config file named repo.yaml in the root of your Feature Repository. To specify another file, use the --config flag (e.g. tecton plan --config my_config.yaml).
Tecton will automatically generate a new Repo Config file when initializing a new Feature Repository using tecton init. For existing Feature Repositories, run tecton repo-config init to generate a Repo Config file named repo.yaml with pre-filled defaults for tecton_materialization_runtime.
If you are also explicitly setting the Offline Store format to ParquetConfig (as described in the following section), use tecton repo-config init --parquet.
Changes to configuring offline_store
offline_store
is an optional parameter used to configure Tecton's Offline
Store for Batch Feature Views, Stream Feature Views, and Feature Tables.
OfflineStoreConfig
Tecton 0.8 introduces the new OfflineStoreConfig
object for configuring
offline_store
and the new
Publish Features
functionality.
The OfflineStoreConfig object includes an optional staging_table_format parameter that sets the format of the Offline Store to DeltaConfig() or ParquetConfig(). Before 0.8, this was done by setting offline_store=ParquetConfig() or offline_store=DeltaConfig().
In 0.8, offline_store can be set to OfflineStoreConfig, ParquetConfig, or DeltaConfig. In a future version of Tecton, offline_store will only support being set to OfflineStoreConfig.
Changes to the Default Offline Store Format
Tecton 0.8 changes the default format for the Offline Store from Parquet to Delta. If your Feature Repository does not already explicitly set offline_store, you'll need to set the offline_store format to Parquet on Batch Feature Views, Stream Feature Views, and Feature Tables to avoid Recreates during the upgrade. This can be done by:
- Running tecton repo-config init --parquet, which adds the following to your Repo Config file:
offline_store:
kind: OfflineStoreConfig
staging_table_format:
kind: ParquetConfig
- OR by explicitly adding this to your Feature Views & Tables:
offline_store = OfflineStoreConfig(staging_table_format=ParquetConfig())
Changes to Tecton on Snowflake
Feature Views with mode="snowpark"
are deprecated in 0.8 and will be removed
in a future version of the SDK.
The output of get_historical_features() now takes the Data Source parameter data_delay into consideration. See the SDK Reference for more details on this parameter.
Additionally, get_historical_features(start_time, end_time) now returns an additional column named _effective_timestamp. Please see this page for more details on this column.
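For illustration, a sketch of offline retrieval where the new column shows up (the workspace and feature view names are placeholders):
from datetime import datetime
import tecton

# Placeholder workspace and feature view names.
ws = tecton.get_workspace("prod")
fv = ws.get_feature_view("my_feature_view_1")

features = fv.get_historical_features(
    start_time=datetime(2023, 1, 1),
    end_time=datetime(2023, 2, 1),
).to_pandas()

# In 0.8, the returned DataFrame includes an _effective_timestamp column
# alongside the join keys, feature timestamp, and feature values.
print(features["_effective_timestamp"].head())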
Some users previously configured the ALPHA_SNOWFLAKE_COMPUTE_ENABLED flag to opt in to the beta offline retrieval query engine used by Tecton on Snowflake. In 0.8, this new query engine is enabled by default when connected to a Snowflake cluster. To explicitly opt in to this version (for example, when retrieving features in a Jupyter notebook), users can run tecton.conf.set('TECTON_OFFLINE_RETRIEVAL_COMPUTE_MODE', 'snowflake').
Changes to the Athena connector
Tecton can be used with Athena to retrieve features in any Python environment.
In Tecton 0.7, we introduced major optimizations to queries generated using Tecton's Athena backend. Users previously opted in to these optimizations by setting the SQL_DIALECT and ALPHA_ATHENA_COMPUTE_ENABLED configs.
In 0.8, these configs are no longer used as this behavior is enabled by default.
To activate Tecton's Athena backend, run tecton.conf.set('TECTON_OFFLINE_RETRIEVAL_COMPUTE_MODE', 'athena').
To use the deprecated Athena connector, run tecton.conf.set('USE_DEPRECATED_ATHENA_RETRIEVAL', True). The deprecated connector will be removed in a future version of the Tecton SDK.
Instance Type for Spark-based Stream Feature Views
0.8 requires that stream jobs for Spark-based Stream Feature Views use On-Demand instances (instance_availability="on_demand"). This is to prevent issues with Spot Instances negatively impacting the freshness and reliability of Stream Feature Views. If not specified in stream_compute (within the DatabricksClusterConfig or EMRClusterConfig), Tecton will automatically set instance_availability to "on_demand".
This is a breaking change if you previously explicitly set
instance_availability
to spot
or spot_with_fallback
in stream_compute
for Spark-based Stream Feature Views.
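For example, a stream_compute configuration along these lines satisfies the requirement; the instance type and worker count are illustrative only, and instance_availability can simply be omitted since Tecton now defaults it to on_demand:
from tecton import DatabricksClusterConfig

# Illustrative cluster settings for a Spark-based Stream Feature View.
stream_compute = DatabricksClusterConfig(
    instance_type="m5.xlarge",
    number_of_workers=2,
    # Must be "on_demand" (or omitted) in 0.8; "spot" and "spot_with_fallback"
    # are no longer allowed for stream jobs.
    instance_availability="on_demand",
)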
SDK Interfaces that are deprecated or removed in 0.8
Methods that are deprecated in 0.8 and will be removed in the future
Deprecated Method | Replacement |
---|---|
FeatureView.deletion_status() | Retrieve the Deletion Job ID (via FeatureView.delete_keys() or FeatureView.list_materialization_jobs()) and call get_materialization_job(job_id).status |
Methods, parameters, & attributes that were previously deprecated and are officially removed in 0.8
This is a breaking change if your Feature Repository still uses the removed methods, parameters, or attributes.
Removed method, parameter, or attribute | Replacement |
---|---|
Workspace.get_all() | tecton.list_workspaces() |
FeatureView.max_batch_aggregation_interval | FeatureView.max_backfill_interval |
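The sketch below shows the replacement calls side by side; the workspace name, feature view name, and join-key DataFrame are placeholders, and it assumes get_materialization_job() is called on the Feature View as the table above implies:
import pandas as pd
import tecton

# Replaces Workspace.get_all().
print(tecton.list_workspaces())

# Replaces FeatureView.deletion_status(): capture the deletion job ID and
# check its status through the materialization job APIs.
ws = tecton.get_workspace("prod")                          # placeholder workspace
fv = ws.get_feature_view("my_feature_view_1")              # placeholder feature view
keys_to_delete = pd.DataFrame({"user_id": ["user_123"]})   # placeholder join keys
job_id = fv.delete_keys(keys_to_delete)                    # yields the Deletion Job ID
print(fv.get_materialization_job(job_id).status)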
CLI Changes
Changes to tecton access-control
The tecton access-control assign-role
and
tecton access-control unassign-role
commands now apply only to the
currently-selected workspace by default. Use the --workspace
flag to apply ACL
changes to a specific workspace or the --all-workspaces
flag to apply changes
to all workspaces.
Note that assigning or unassigning the admin
role will always apply to all
workspaces since it is a deployment-wide role.
Removal of --safety-checks and --no-safety-checks
The --safety-checks and --no-safety-checks flags were deprecated in 0.7 and are officially removed in 0.8. Use the --yes or -y flag to disable interactive safety checks when running the apply, destroy, plan, and upgrade commands.
FAQ
What does a successful upgrade look like?
Here's an example of a successful apply
during an upgrade. In this case, we've
set the tecton_materialization_runtime
and updated the offline_store
parameter.
✅ Imported 1 Python module from the feature repository
⚠️ Running Tests: No tests found.
✅ Collecting local feature declarations
✅ Performing server-side feature validation: Initializing.
↓↓↓↓↓↓↓↓↓↓↓↓ Plan Start ↓↓↓↓↓↓↓↓↓↓
~ Update Batch Feature View
name: my_feature_view_1
description: Whether the user performing the transaction is over 18 years old.
tecton_materialization_runtime: -> 0.8.0
offline_store: 0.7 Offline Store -> 0.8+ Offline Store
~ Update Batch Feature View
name: my_feature_view_2
description: Whether the user performing the transaction is over 18 years old.
tecton_materialization_runtime: -> 0.8.0
offline_store: 0.7 Offline Store -> 0.8+ Offline Store
↑↑↑↑↑↑↑↑↑↑↑↑ Plan End ↑↑↑↑↑↑↑↑↑↑↑↑
Generated plan ID is ...
View your plan in the Web UI: ...
Are you sure you want to apply this plan to: "your_workspace"? [y/N]> y
🎉 Done! Applied changes to 2 objects in workspace "your_workspace".
What does a blocked upgrade look like?
During an SDK version upgrade, Tecton automatically blocks destructive changes (i.e. Recreates) to prevent unintended changes.
Here's an example of a blocked destructive change:
✅ Imported 1 Python module from the feature repository
⚠️ Running Tests: No tests found.
✅ Collecting local feature declarations
⛔ Performing server-side feature validation: Finished generating plan.
Blocked destructive change to 'feature_view'. Destructive changes are not allowed when upgrading SDK versions.
tecton_materialization_runtime: -> 0.8.0
offline_store:
{"stagingTableFormat":{"parquet":{}}}
⬇⬇⬇⬇⬇
{"stagingTableFormat":{"delta":{"timePartitionSize":"86400s"}}}
In this case, the staging_table_format change (in OfflineStoreConfig) is a destructive change (a Recreate) that was blocked. To resolve this, explicitly set the staging_table_format to ParquetConfig.