0.5 to 0.6 Upgrade Guide
Sunsetting Python 3.7 support
Starting in 0.6, the Tecton SDK and CLI no longer run in Python 3.7 environments. The Tecton SDK and CLI retain compatibility with Python 3.8 and Python 3.9.
⚠️ In rare cases, updating the Python version can cause Tecton to identify an
unexpected diff in transformation logic. In these scenarios, it's typically safe
to use the --suppress-recreates option to override the diff. Tecton recommends
updating your Python version separately from your Tecton SDK version. For
example, if you are currently using Python 3.7 with Tecton 0.5, you could first
update to Python 3.8, and then perform the Tecton 0.6 upgrade.
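As a pre-upgrade sanity check, a short script can confirm that the interpreter is one of the versions named above. This is only a minimal sketch: the helper name is_supported and the SUPPORTED_MINORS set are hypothetical, and the set assumes that 3.8 and 3.9 are the only versions 0.6 supports, which is all this guide states.

```python
import sys

# Assumption: Tecton 0.6 supports exactly the versions named in this guide.
SUPPORTED_MINORS = {(3, 8), (3, 9)}


def is_supported(version=None):
    """Return True if the (major, minor) pair is in the supported set.

    `version` is any sequence such as sys.version_info or (3, 8, 10);
    it defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info
    return (version[0], version[1]) in SUPPORTED_MINORS


# Report on the running interpreter without aborting.
status = "supported" if is_supported() else "not supported"
print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status} by this check")
```

Running this in the environment that hosts the Tecton CLI before changing the SDK version helps keep the Python upgrade and the SDK upgrade as two separate steps.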
Sample Upgrade Process for Feature Repositories
This pull request shows the upgrade process from 0.5.5 to 0.6 for a sample
Feature Repository.
Breaking changes to Feature Repositories
Changes to default feature names when using the last_distinct() aggregation
Impact: Feature Views using the last_distinct() aggregation will cause a
tecton plan error unless feature names are explicitly defined.
With the introduction of the last() aggregation function, Tecton has changed
the default feature name for last_distinct() aggregations to avoid confusion
between the two functions.
Previously, when using the last_distinct() aggregation and not specifying the
name argument, the default name would be set based on the number of values to
be returned, the aggregation time window, and the aggregation interval. For
example, the following Aggregation definition would result in a feature column
named my_column_lastn_15_7d_1d.
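from datetime import timedelta

from tecton import Aggregation, batch_feature_view
from tecton.aggregation_functions import last_distinct


@batch_feature_view(
    # ...
    aggregations=[Aggregation(column="my_column", function=last_distinct(15), time_window=timedelta(days=7))],
    aggregation_interval=timedelta(days=1),
)
def my_fv(data_source):
    pass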
In 0.6, the new default name will be my_column_last_distinct_15_7d_1d.
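To make the new convention concrete, the sketch below rebuilds the 0.6 default name from the column, the number of values, the time window, and the aggregation interval. It is an illustration, not Tecton's implementation: the helpers format_duration and new_default_name are hypothetical, and the duration formatting only covers whole days, hours, or minutes, because those are the only forms that appear in this guide.

```python
from datetime import timedelta


def format_duration(delta: timedelta) -> str:
    """Render a positive whole number of days, hours, or minutes as '7d', '1h', '10m'."""
    seconds = int(delta.total_seconds())
    if seconds <= 0:
        raise ValueError("duration must be positive")
    # Try the largest unit first so that 7 days becomes '7d', not '168h'.
    for unit_seconds, suffix in ((86400, "d"), (3600, "h"), (60, "m")):
        if seconds % unit_seconds == 0:
            return f"{seconds // unit_seconds}{suffix}"
    raise ValueError("only whole minutes, hours, or days are handled in this sketch")


def new_default_name(column: str, n: int, time_window: timedelta, interval: timedelta) -> str:
    """Build a name in the 0.6 pattern <column>_last_distinct_<n>_<window>_<interval>."""
    return f"{column}_last_distinct_{n}_{format_duration(time_window)}_{format_duration(interval)}"


print(new_default_name("my_column", 15, timedelta(days=7), timedelta(days=1)))
```

The printed value matches the name quoted above. How Tecton renders other durations, such as 90 minutes, is not covered by this guide, so treat the tecton plan output as the source of truth for any name you need to pin.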
To upgrade to 0.6 when you used the default feature name previously, set the
name argument to match the legacy naming convention. For example:
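from datetime import timedelta

from tecton import Aggregation, batch_feature_view
from tecton.aggregation_functions import last_distinct


@batch_feature_view(
    # ...
    aggregations=[
        Aggregation(
            column="my_column",
            function=last_distinct(15),
            time_window=timedelta(days=7),
            # Pin the legacy default name so the feature column is unchanged.
            name="my_column_lastn_15_7d_1d",
        )
    ],
    aggregation_interval=timedelta(days=1),
)
def my_fv(data_source):
    pass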
If the explicitly set name matches the existing one, tecton plan should show no
difference.
If you do not set the name parameter, you will see an error like the following
during the upgrade process:
$ tecton plan
Using workspace "prod" on cluster https://your-instance.tecton.ai
✅ Imported 47 Python modules from the feature repository
✅ Collecting local feature declarations
⛔ Performing server-side feature validation: Finished generating plan.
Errors in `user_recent_transactions`(FeatureView) while changing SDK from 0.5.5 to 0.6.0. The default aggregation column name was changed in this SDK from:
amt_lastn10_1h_10m -> amt_last_distinct_10_1h_10m,
please explicitly set 'name' to the legacy name to avoid rematerializing the feature view, such as Aggregation(..., name="amt_lastn10_1h_10m")
=================== StreamFeatureView user_recent_transactions declared in fraud/features/stream_features/last_transactions.py ===================
0025: def user_recent_transactions(transactions):
0026: return f'''
0027: SELECT
0028: user_id,
0029: cast(amt as string) as amt,
0030: timestamp
0031: FROM
0032: {transactions}
0033: '''
0034:
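When several Feature Views are affected, it can help to pull the legacy and new names out of the error text instead of copying them by hand. The sketch below assumes the message format shown in the output above, where each rename appears as "old_name -> new_name"; the function extract_renames is hypothetical, and capturing the tecton plan output as a string is left to you.

```python
import re

# Matches pairs such as "amt_lastn10_1h_10m -> amt_last_distinct_10_1h_10m".
# Assumption: names consist only of letters, digits, and underscores.
RENAME_RE = re.compile(r"(\w+)\s*->\s*(\w+)")


def extract_renames(plan_output: str) -> dict:
    """Return a mapping of legacy default name to new default name."""
    return {old: new for old, new in RENAME_RE.findall(plan_output)}


sample = (
    "The default aggregation column name was changed in this SDK from:\n"
    "amt_lastn10_1h_10m -> amt_last_distinct_10_1h_10m,\n"
    "please explicitly set 'name' to the legacy name"
)

for legacy, new in extract_renames(sample).items():
    # The legacy name is the value to pass as Aggregation(..., name=...).
    print(f'name="{legacy}"  # 0.6 default would be {new}')
```

Each key in the returned mapping is the value to set as the name argument on the corresponding Aggregation.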