Version: 0.6

On-Demand Feature View

An On-Demand Feature View is used to run row-level, request-time transformations on data from Request Sources, Batch Feature Views, or Stream Feature Views. Unlike Batch and Stream Feature Views, On-Demand Feature Views do not precompute and materialize data to the Feature Store, but instead run transformations both online and offline at the time of the request.

Running transformations at request time can be useful for:

Calculating features based on data that is only available at the time of the request, such as a current transaction or user location
Defining feature crosses that would be inefficient to precompute (example: compare embeddings between two users)
Running additional transformations on Tecton-managed aggregations
Defining new features without needing to rematerialize Feature Store data
Post-processing feature data (example: imputing null values)
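
As a rough illustration of the post-processing case, the plain-Python sketch below imputes a missing aggregation value before deriving a ratio from it. The function name and the feature names (amount_mean_30d, amount_to_mean_ratio) are hypothetical, and the Tecton decorator is left out so the logic can run on its own.

```python
def amount_to_mean_ratio(transaction_request, user_averages):
    """Hypothetical row-level transformation: one input dict per source, one output dict."""
    # Impute a missing (None) aggregation with 0.0.
    amount_mean = user_averages.get("amount_mean_30d") or 0.0
    amount = transaction_request["amount"]
    # With no usable history, fall back to a default ratio of 0.0 instead of dividing by zero.
    ratio = amount / amount_mean if amount_mean > 0 else 0.0
    return {"amount_to_mean_ratio": ratio}


print(amount_to_mean_ratio({"amount": 150.0}, {"amount_mean_30d": 50.0}))
print(amount_to_mean_ratio({"amount": 150.0}, {"amount_mean_30d": None}))
```

Whether 0.0 is a sensible default depends on the model consuming the feature; it is used here only to keep the sketch short.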

Common Examples

Turning a user's GPS coordinates into a geohash
Parsing a user's search string
Checking if a user's incoming transaction is larger than the user's average transaction amount over the last 30 days
Picking the largest of a user's past 10 transactions (when combined with a last-n aggregation in a Stream Feature View)
Computing the cosine similarity between a pre-computed user embedding and a query embedding
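
The cosine-similarity example could look roughly like the sketch below. It is plain Python without the Tecton decorator, and the field names (query_embedding, user_embedding) and the zero-vector fallback are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def user_query_similarity(request, user_embedding_features):
    # Hypothetical field names: "query_embedding" arrives with the request,
    # "user_embedding" comes from a precomputed feature.
    return {
        "user_query_cosine_similarity": cosine_similarity(
            request["query_embedding"], user_embedding_features["user_embedding"]
        )
    }
```

Returning 0.0 for a zero-norm vector keeps the output non-null; another default may suit your use case better.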

info

On-Demand Feature View transformations add request-time latency that depends on the transformation being executed. For example, if your on-demand transformation calls time.sleep(1), it cannot complete in less than 1 second.
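
One way to get a feel for this cost before deploying is to time the transformation function locally. The sketch below is not a Tecton API: slow_transformation and timed_call are hypothetical helpers, and the sleep only simulates expensive work.

```python
import time


def slow_transformation(request):
    # Stand-in for an expensive transformation; the sleep simulates 50 ms of work.
    time.sleep(0.05)
    return {"amount_doubled": request["amount"] * 2}


def timed_call(fn, *args):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


result, elapsed = timed_call(slow_transformation, {"amount": 10.0})
print(result, f"{elapsed * 1000:.1f} ms")
```

A local measurement like this covers only the function body, so the latency observed at serving time will be higher.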

On-Demand Feature Transformations

On-Demand Feature View transformations are written using Python.

When using mode='python', Tecton passes in a row of data for each source in the form of a dictionary. On-demand feature outputs are returned in a single dictionary of one or more feature values. Outputs from an OnDemandFeatureView must be non-null.
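
Because outputs must be non-null, it can help to check a transformation against inputs with missing values in a unit test. The sketch below assumes a hypothetical helper, find_null_outputs, and a hypothetical transformation, transaction_is_large; neither is part of Tecton.

```python
def find_null_outputs(outputs, expected_fields):
    """Return the expected field names that are missing or None in an output dict."""
    return [name for name in expected_fields if outputs.get(name) is None]


def transaction_is_large(transaction_request):
    # Hypothetical transformation: default a missing amount to 0.0 so the output is never None.
    amount = transaction_request.get("amount") or 0.0
    return {"transaction_is_large": amount > 1000.0}


# A request with a null amount still yields a non-null Boolean output.
outputs = transaction_is_large({"amount": None})
print(outputs, find_null_outputs(outputs, ["transaction_is_large"]))
```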

When using mode='pandas', Tecton passes in one or many rows of data in the form of a pandas DataFrame. At offline execution time, Tecton will pass in a batch of several rows. At online inference time, Tecton will typically pass in a single row. Tecton expects the function to return a pandas DataFrame.

Example

The first example below uses mode='python'; the second uses mode='pandas'.

from tecton import on_demand_feature_view, RequestSource
from tecton.types import Float64, Bool, Field
from features.user_transaction_amount_averages import user_transaction_amount_averages


transaction_request = RequestSource(schema=[Field("amount", Float64)])


@on_demand_feature_view(
    sources=[transaction_request, user_transaction_amount_averages],
    mode="python",
    schema=[Field("transaction_amount_is_higher_than_average", Bool)],
)
def transaction_amount_is_higher_than_average(transaction_request, user_transaction_amount_averages):
    # Treat a missing average (for example, no recent transactions) as 0.
    amount_mean = user_transaction_amount_averages["amount_mean_24h_10m"] or 0
    return {"transaction_amount_is_higher_than_average": transaction_request["amount"] > amount_mean}

from tecton import on_demand_feature_view, RequestSource
from tecton.types import Float64, Bool, Field
from features.user_transaction_amount_averages import user_transaction_amount_averages


transaction_request = RequestSource(schema=[Field("amount", Float64)])


@on_demand_feature_view(
    sources=[transaction_request, user_transaction_amount_averages],
    mode="pandas",
    schema=[Field("transaction_amount_is_higher_than_average", Bool)],
)
def transaction_amount_is_higher_than_average(transaction_request, user_transaction_amount_averages):
    # Copy the request amount next to the precomputed average so both sit in one DataFrame.
    user_transaction_amount_averages["amount"] = transaction_request["amount"]
    # Row-wise comparison; rows whose average is missing (NaN) evaluate to False.
    user_transaction_amount_averages["transaction_amount_is_higher_than_average"] = (
        user_transaction_amount_averages["amount"] > user_transaction_amount_averages["amount_mean_24h_10m"]
    )

    # Return only the output column declared in the schema.
    return user_transaction_amount_averages[["transaction_amount_is_higher_than_average"]]

How to choose between pandas and python mode

mode='python' is significantly faster than mode='pandas' during online inference, but slightly slower when generating offline data for training or offline prediction.

Generally, for any online inference use case, use mode='python'. Only consider mode='pandas' if you use an On-Demand Feature View solely to generate training data or offline inference data.

Parameters

See the API reference for the full list of parameters.
