Environments in Rift
This feature is currently in Public Preview.
Environments for Rift-based Feature Views are isolated compute environments where materialization jobs are executed.
Rift-based Feature Views with materialization enabled must set the environment field. Pinning an environment ensures that Feature Views and transformation logic run reliably and predictably. You can set it as a default in your Repo Config or through the environment field on a Batch or Stream Feature View.
When creating a Rift-based Feature View, use the latest supported Tecton Environment. If your feature transformation has additional dependencies, create a Custom Python Environment.
Tecton Environments
Tecton publishes a set of Python Environments that include common feature transformation packages that enable materialization with Rift.
Environments are identified by a name and a version number. The table below lists all available Tecton Environments for Rift-based Feature Views.
Environment | Date published |
---|---|
tecton-rift-core-0.9.5 | 2024-05-24 |
tecton-rift-core-0.9.0 | 2024-04-01 |
To view this list from the Tecton CLI, run tecton environment list.
Custom Python Environments
Data pipelines commonly rely on a variety of third-party libraries to help facilitate transformations. With Tecton, users can create custom environments that include dependencies used in Rift-based Feature Views.
The process for creating custom environments is as follows:
- Create a requirements.txt file.
- Create a custom environment via the Tecton CLI.
- Set the Feature View environment field to your custom environment name.
Create a requirements.txt file
Here's an example requirements.txt file that will be referenced in this tutorial:
# PyPI packages
fuzzywuzzy==0.18.0
pydantic<2
tecton[rift-materialization,snowflake]==0.9.0
tecton-runtime==0.0.3
urllib3<1.27
To support batch materialization for Batch & Stream Feature Views, the requirements.txt must include the tecton[rift-materialization] package pinned to a specific version, 0.9.0 or later.
To support On-Demand Feature Views and Ingest API-based Feature Views, the requirements.txt must include the tecton-runtime package pinned to a specific version, 0.0.3 or later.
We recommend using the latest available versions of these packages.
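The pinning rules above can be checked locally before you upload anything. The following is a minimal sketch, not part of the Tecton CLI: the helper names (parse_exact_pins, check_requirements) are hypothetical, it assumes you want the environment to support every job type (so it requires both packages), and it only recognizes plain numeric versions pinned with ==.

```python
import re

# Minimum versions taken from the text above. Assumption: both packages are
# wanted, i.e. the environment should support batch/stream materialization
# as well as On-Demand and Ingest API-based Feature Views.
REQUIRED_MINIMUMS = {
    "tecton": (0, 9, 0),          # must also carry the rift-materialization extra
    "tecton-runtime": (0, 0, 3),
}

# Matches exact pins such as "tecton[rift-materialization,snowflake]==0.9.0".
# Pre-release or local version suffixes are deliberately not handled here.
PIN_PATTERN = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)"
    r"(?:\[(?P<extras>[^\]]*)\])?"
    r"\s*==\s*(?P<version>\d+(?:\.\d+)*)\s*$"
)


def parse_exact_pins(requirements_text):
    """Return {package: (extras, version_tuple)} for lines pinned with '=='."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        match = PIN_PATTERN.match(line)
        if not match:
            continue
        extras = {e.strip() for e in (match.group("extras") or "").split(",") if e.strip()}
        version = tuple(int(part) for part in match.group("version").split("."))
        version += (0,) * (3 - len(version))  # treat "0.9" as "0.9.0"
        pins[match.group("name").lower()] = (extras, version)
    return pins


def check_requirements(requirements_text):
    """Return a list of problems; an empty list means every check passed."""
    pins = parse_exact_pins(requirements_text)
    problems = []
    for name, minimum in REQUIRED_MINIMUMS.items():
        if name not in pins:
            problems.append(f"{name} is missing or not pinned with '=='")
            continue
        extras, version = pins[name]
        if version < minimum:
            wanted = ".".join(str(part) for part in minimum)
            problems.append(f"{name} must be pinned to {wanted} or later")
        if name == "tecton" and "rift-materialization" not in extras:
            problems.append("tecton must include the rift-materialization extra")
    return problems


SAMPLE = """\
# PyPI packages
fuzzywuzzy==0.18.0
pydantic<2
tecton[rift-materialization,snowflake]==0.9.0
tecton-runtime==0.0.3
urllib3<1.27
"""

print(check_requirements(SAMPLE))
```

If you only need batch and stream materialization, remove the tecton-runtime entry from REQUIRED_MINIMUMS. This check does not replace dependency resolution during environment creation; it only catches the two pinning mistakes described above.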
Create an environment via the Tecton CLI
To create an environment in Tecton, use the tecton environment create command.
$ tecton environment create --name "my-custom-env-0.1" --description "My Custom Env 0.1" --requirements /path/to/requirements.txt
💡 Creating environment 'my-custom-env-0.1' for job types:
✅ On Demand
✅ Rift Batch
✅ Rift Stream Ingest
⏳ Resolving dependencies for Python 3.8.17 and architecture x86_64. This may take a few seconds.....
✅ Successfully resolved dependencies
⏳ Downloading wheels. This may take a few seconds.....
⏳ Uploading compressed wheels in parts to S3. This may take a few seconds.....
Upload progress: 100%|████████████████████████████████████████████████| 9/9 [01:44<00:00, 11.62s/it]
✅ Successfully uploaded dependencies
$ tecton environment get --name "my-custom-env-0.1"
Id Name Description Type Status Materialization Version Tecton Transform Version Created At
===================================================================================================================================================================
19ba8091843146dc93e3480cd my-custom-env-0.1 My Custom Env 0.1 CUSTOM READY 0.9.0 0.0.3 2024-01-01 00:51:43 UTC
Use an Environment in a Feature View
You can specify the environment name in your Feature View definition by setting the environment field.
Below is an example Batch Feature View that uses the fuzzywuzzy package available in the my-custom-env-0.1 environment.
from data_sources import product_source
from entities import product
from tecton import batch_feature_view
from tecton.types import Field, Float64, String, Timestamp

fv_schema = [
    Field("product_id", String),
    Field("timestamp", Timestamp),
    Field("product_name", String),
    Field("similarity_score", Float64),
    Field("partial_similarity_score", Float64),
]


@batch_feature_view(
    sources=[product_source],
    entities=[product],
    mode="pandas",
    schema=fv_schema,
    environment="my-custom-env-0.1",
)
def moch_cheesecake_similarity(product_df):
    from fuzzywuzzy import fuzz

    baseline = "Mocha Cheesecake Fudge Brownie Bars"
    product_df["similarity_score"] = product_df["product_name"].apply(lambda name: fuzz.ratio(name, baseline))
    product_df["partial_similarity_score"] = product_df["product_name"].apply(
        lambda name: fuzz.partial_ratio(name, baseline)
    )
    return product_df
Delete an Environment
A custom environment can be deleted via the CLI using the tecton environment delete command. Deletion will fail if the environment is actively being used in a Feature View.
Usage Notes & FAQs
Usage Notes
- The Admin role is required to create and delete a custom environment via the Tecton CLI.
- Custom environment names must be unique across all workspaces and can only contain letters, numbers, hyphens, and underscores.
- Environment creation takes place asynchronously, typically in 2-10 minutes. An environment must be in READY status before it can be used; check its status with tecton environment get.
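Because creation is asynchronous, a deployment script may want to wait for READY before applying Feature Views. The sketch below polls tecton environment get --name, the command shown earlier. The function names, timeout, and polling interval are illustrative, and the status check rests on an assumption about the CLI's table output: that the row for the environment contains the bare token READY once it is usable.

```python
import subprocess
import time


def fetch_environment_table(name):
    """Run `tecton environment get --name <name>` and return its stdout."""
    result = subprocess.run(
        ["tecton", "environment", "get", "--name", name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def is_ready(table_text, name):
    """Return True if the table row for `name` contains the token READY.

    Assumption: the CLI prints one whitespace-separated row per environment,
    as in the sample output above. A description containing the word READY
    would produce a false positive.
    """
    for line in table_text.splitlines():
        tokens = line.split()
        if name in tokens and "READY" in tokens:
            return True
    return False


def wait_until_ready(name, fetch=fetch_environment_table, timeout_s=900, poll_s=30, sleep=time.sleep):
    """Poll until the environment is READY; return False once timeout_s has elapsed."""
    waited = 0
    while True:
        if is_ready(fetch(name), name):
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s
```

For example, wait_until_ready("my-custom-env-0.1") returns True once the environment reports READY, or False after roughly 15 minutes. The fetch and sleep parameters exist so the loop can be exercised in tests without calling the CLI.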
FAQ: How do I resolve dependency resolution errors during environment creation?
- These can be caused by conflicting version requirements in the requirements.txt file. Run the tecton environment resolve-dependencies command to view the fully resolved set of requirements and identify incompatible version specifications.
- Verify that each dependency in the requirements.txt file has a .whl file available for download in PyPI (or any custom Artifactory) for the relevant Python version and x86 architecture.
- Tecton uses the pex utility to generate the fully resolved dependency set. To inspect the underlying commands, run tecton environment resolve-dependencies --verbose.
FAQ: How do I check which dependencies are available in my custom environment?
- Run tecton environment describe to see the input requirements as well as the fully resolved set of dependencies for a custom environment.