Version: 0.7

Pip dependencies and Python environments

Public Preview

This feature is currently in Public Preview.

Build more powerful On-Demand Features by leveraging popular Python packages available in Python Environments. Here's an example On-Demand Feature View that uses the fuzzywuzzy package to get the fuzzy similarity between two strings:

from tecton import on_demand_feature_view, RequestSource
from tecton.types import Field, Int64, String

request_schema = [Field("baseline", String), Field("text", String)]
similarity_request = RequestSource(schema=request_schema)
output_schema = [Field("similarity", Int64), Field("partial_similarity", Int64)]


@on_demand_feature_view(
    sources=[similarity_request],
    mode="python",
    schema=output_schema,
    environments=["tecton-python-extended:0.1"],
)
def fuzzy_similarity_feature_view(request):
    from fuzzywuzzy import fuzz

    result = {
        "similarity": fuzz.ratio(request["baseline"], request["text"]),
        "partial_similarity": fuzz.partial_ratio(request["baseline"], request["text"]),
    }
    return result

Python Environments for On-Demand Feature Views are isolated compute environments where transformations are run during Online feature retrieval. Specifying an environment enables the use of common Python libraries when building real-time features.

Available Python Environments

Tecton publishes a set of Python Environments that include common feature transformation packages.

Python Environments are identified by a name and a version number, such as tecton-python-core:0.1. By pinning your environment, you can be sure that your transformation logic will continue to run reliably.

The following Python Environments are available for use:

tecton-python-core is a lightweight environment with the minimal set of dependencies available
tecton-python-extended offers a larger set of common feature transformation packages

The table below lists all available versions for these environments.

Environment	Date published
tecton-python-core:0.1	2023-07-26
tecton-python-extended:0.1	2023-07-26
tecton-python-extended:0.2	2023-08-02
tecton-python-extended:0.3	2023-08-29
tecton-python-extended:0.4	2023-09-27

To view this list from the Tecton CLI, run tecton environment list-all.

Specifying Environments for On-Demand Feature Views and Feature Services

Tecton-managed Environments are configured with two parameters:

The environments parameter on an On-Demand Feature View definition specifies the set of Environments that the transformation logic is compatible with.
The on_demand_environment on the Feature Service definition specifies the single environment that will be used when running all On-Demand Feature Views in that Feature Service during Online retrieval.

Let’s look at an example. Say we want to create:

A Feature View with a dependency on fuzzywuzzy, which is only available in tecton-python-extended:0.1
A Feature View with a dependency on numpy, which is available in both tecton-python-core:0.1 and tecton-python-extended:0.1.
A Feature Service that contains both of these Feature Views

from tecton import on_demand_feature_view, RequestSource, FeatureService
from tecton.types import Field, Int64, String

request_schema = [Field("baseline", String), Field("text", String)]
similarity_request = RequestSource(schema=request_schema)
output_schema_similarity = [Field("similarity", Int64), Field("partial_similarity", Int64)]


@on_demand_feature_view(
    sources=[similarity_request],
    mode="python",
    schema=output_schema_similarity,
    environments=["tecton-python-extended:0.1"],
)
def fuzzy_similarity_feature_view(request):
    from fuzzywuzzy import fuzz

    result = {
        "similarity": fuzz.ratio(request["baseline"], request["text"]),
        "partial_similarity": fuzz.partial_ratio(request["baseline"], request["text"]),
    }
    return result


letter_count_request = RequestSource(schema=request_schema)
output_schema_letter_count = [Field("letter_count", Int64)]


@on_demand_feature_view(
    sources=[letter_count_request],
    mode="python",
    schema=output_schema_letter_count,
    environments=["tecton-python-core:0.1", "tecton-python-extended:0.1"],
)
def letter_count_feature_view(request):
    import numpy as np

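    # dtype=str keeps this a string array even when "text" is empty
    characters = np.array(list(request["text"]), dtype=str)
    # Cast the NumPy integer to a built-in int for the Int64 output field
    letter_count = int(np.sum(np.char.isalpha(characters)))
    result = {"letter_count": letter_count}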

    return result


my_fs = FeatureService(
    name="text_processing_feature_service",
    features=[fuzzy_similarity_feature_view, letter_count_feature_view],
    on_demand_environment="tecton-python-extended:0.1",
)

Note that:

If environments is not specified for an On-Demand Feature View, then it is assumed to be compatible with all Tecton environments.
If the dependency required for your Feature View is available in multiple environments, then you can list all of those environments in the environments parameter.
During execution, all On-Demand Feature Views within a Feature Service run in the same Environment. As a result, the on_demand_environment specified in the Feature Service must be on the environments list for all On-Demand Feature Views included in the features list.
Conversely, if an On-Demand Feature View specifies an environments constraint, then any Feature Service that includes the On-Demand Feature View must specify an on_demand_environment on that list.
Configuring an on_demand_environment can have an impact on get-features latency. See the section on online feature retrieval latency below.
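The compatibility rule in the notes above can be sketched in plain Python. This is an illustration only, not part of the Tecton SDK: the helper incompatible_feature_views and the view name unconstrained_feature_view are hypothetical, and a value of None stands for an On-Demand Feature View that does not set environments.

```python
from typing import Dict, List, Optional


def incompatible_feature_views(
    service_environment: str,
    view_environments: Dict[str, Optional[List[str]]],
) -> List[str]:
    """Return the names of views that cannot run in service_environment.

    None means the view declared no `environments` constraint and is
    therefore treated as compatible with every environment.
    """
    incompatible = []
    for view_name, environments in view_environments.items():
        if environments is not None and service_environment not in environments:
            incompatible.append(view_name)
    return incompatible


# Hypothetical mapping of view name -> declared environments.
views = {
    "fuzzy_similarity_feature_view": ["tecton-python-extended:0.1"],
    "letter_count_feature_view": ["tecton-python-core:0.1", "tecton-python-extended:0.1"],
    "unconstrained_feature_view": None,
}

print(incompatible_feature_views("tecton-python-extended:0.1", views))  # []
print(incompatible_feature_views("tecton-python-core:0.1", views))  # ['fuzzy_similarity_feature_view']
```

With the example definitions above, only tecton-python-extended:0.1 satisfies every view, which is why it is the value used for on_demand_environment.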

Configuring Notebook and Testing environments to be compatible with package requirements

The Environment configurations above are managed by Tecton and used only during the online execution of On-Demand Feature Views. To develop and test these Feature Views in offline environments, ensure that the relevant dependencies are installed in your local environments.

Below are our suggestions on how to configure offline environments, but there are other ways to install the appropriate dependencies.

Installing dependencies in your Notebook environment

Install individual packages in your notebook with %pip install. Alternatively, copy the full set of dependencies for the relevant version into a requirements.txt file to install all the dependencies at once.

Installing dependencies in your Unit Testing environment

To run unit tests for your On-Demand Feature Views with specific Python dependencies, ensure that the local Python environment executing the unit tests has the proper dependency versions installed.
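One way to catch a mismatched local environment before running the tests is to compare installed package versions against the versions you expect. The sketch below uses only the standard library. The helper find_version_mismatches is hypothetical, and the pinned versions shown are placeholders rather than the actual contents of any Tecton environment, so replace them with the versions listed for the environment you target.

```python
from importlib import metadata
from typing import Dict, Optional


def find_version_mismatches(pinned: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each mismatched package to its installed version (None if missing)."""
    mismatches: Dict[str, Optional[str]] = {}
    for package, expected in pinned.items():
        try:
            installed: Optional[str] = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = installed
    return mismatches


# Placeholder pins (assumed values, not taken from a Tecton environment).
PINNED = {"fuzzywuzzy": "0.18.0", "numpy": "1.24.3"}

for package, installed in find_version_mismatches(PINNED).items():
    print(f"{package}: expected {PINNED[package]}, found {installed}")
```

Running a check like this at the start of a test session makes a version drift visible as an explicit message instead of a confusing transformation failure.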

Impact of using Environments on online feature retrieval latency

The total latency observed is highly dependent on the complexity of the On-Demand Feature View transformation. For example, if the transformation contains sleep(1), then it will take at least 1 second to run.

Configuring the on_demand_environment for a Feature Service adds some overhead to each request, in addition to the time it takes to execute the transformation when calling that Feature Service with the get-features API.

Executing transformations in an environment typically adds 20-50ms on top of the transformation time. This latency will be higher if there is a sudden spike in traffic, as the service scales to match the new load.

If the On-Demand Feature View includes another Feature View as a source, then it must wait for the upstream Feature View to return before executing, making the latency additive. Otherwise, the On-Demand Feature View will be executed in parallel with other Feature Views in the Feature Service.
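The additive and parallel cases can be illustrated with a simplified latency model. This is a rough sketch under stated assumptions, not a description of Tecton's scheduler: the function names are hypothetical, the 30 ms overhead is an arbitrary value inside the 20-50ms range mentioned above, and network and serialization costs are ignored.

```python
from typing import List


def odfv_latency_ms(transform_ms: float, overhead_ms: float, upstream_ms: float = 0.0) -> float:
    """Latency of one On-Demand Feature View.

    If the view reads from an upstream Feature View, it waits for that view
    first, so the upstream latency is added to its own.
    """
    return upstream_ms + overhead_ms + transform_ms


def request_latency_ms(independent_view_ms: List[float], odfv_ms: float) -> float:
    """Views that run in parallel are bounded by the slowest one."""
    return max([odfv_ms, *independent_view_ms])


# Case 1: the ODFV only uses request data, so it runs in parallel with two other views.
parallel = odfv_latency_ms(transform_ms=5.0, overhead_ms=30.0)
print(request_latency_ms([12.0, 18.0], parallel))  # 35.0

# Case 2: the ODFV depends on the 18 ms view, so that latency is added to its own.
chained = odfv_latency_ms(transform_ms=5.0, overhead_ms=30.0, upstream_ms=18.0)
print(request_latency_ms([12.0], chained))  # 53.0
```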

To inspect the impact of your On-Demand Feature Views on the total latency of your get-features request, you can compare the serverTimeSeconds and sloServerTimeSeconds values in the metadataOptions response object. The serverTimeSeconds value represents the entire time it took for Tecton to fulfill the request, while the sloServerTimeSeconds measurement removes time spent on On-Demand Feature View execution.
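As a small sketch of that comparison, the difference between the two values approximates the time spent executing On-Demand Feature Views. The helper name and the sample numbers below are hypothetical, and the code assumes you have already extracted the two fields from the response into a dictionary, since the exact nesting of the response is not shown here.

```python
from typing import Mapping


def on_demand_execution_seconds(timing: Mapping[str, float]) -> float:
    """Approximate time spent on On-Demand Feature View execution.

    Expects the serverTimeSeconds and sloServerTimeSeconds values already
    pulled out of the get-features response.
    """
    total = timing["serverTimeSeconds"]
    without_odfv = timing["sloServerTimeSeconds"]
    # Clamp at zero in case the two measurements are effectively equal.
    return max(0.0, total - without_odfv)


# Hypothetical sample values for illustration.
sample = {"serverTimeSeconds": 0.085, "sloServerTimeSeconds": 0.040}
print(f"{on_demand_execution_seconds(sample) * 1000:.0f} ms spent in On-Demand Feature Views")
```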
