Feature Services
Feature Services group features from Feature Views together for training and serving. They provide a REST API for low-latency online retrieval and methods for offline batch lookups via Tecton's SDK.
It is generally recommended that each model deployed in production have one associated Feature Service deployed, which serves features to the model.
A Feature Service provides:
- A REST API to access feature values at the time of prediction
- A one-line method call to rapidly construct training data for user-specified timestamps and labels
- The ability to observe the endpoint where the data is served to monitor serving throughput, latency, and prediction success rate
Defining a Feature Service
Define a Feature Service using the FeatureService class.
Attributes
A Feature Service definition includes the following attributes:
- name: The unique name of the Feature Service.
- features: The features defined in a Feature View or Feature Table and served by the Feature Service.
- online_serving_enabled: (Optional, default True) If True, users can send real-time requests to this Feature Service, and only Feature Views with online materialization enabled can be added to it.
- logging: (Optional) A configuration for logging feature requests sent to this Feature Service.
- Metadata used to organize the Feature Service. Metadata parameters include description, owner, and tags.
Example: Defining a Feature Service
The following example defines a Feature Service.
from tecton import FeatureService
from feature_repo.shared.features.ad_ground_truth_ctr_performance_7_days import (
    ad_ground_truth_ctr_performance_7_days,
)
from feature_repo.shared.features.user_total_ad_frequency_counts import (
    user_total_ad_frequency_counts,
)
from feature_repo.shared.features.user_ad_impression_counts import (
    user_ad_impression_counts,
)

ctr_prediction_service = FeatureService(
    name="ctr_prediction_service",
    description="A Feature Service used for supporting a CTR prediction model.",
    online_serving_enabled=True,
    features=[
        # add all of the features in a Feature View
        user_total_ad_frequency_counts,
        # add a single feature from a Feature View using double-bracket notation
        user_ad_impression_counts[["count"]],
    ],
)
- The Feature Service uses the user_total_ad_frequency_counts and user_ad_impression_counts Feature Views.
- The list of features in the Feature Service is defined in the features argument. When you pass a Feature View in this argument, the Feature Service contains all the features in that Feature View. To select a subset of features in a Feature View, use double-bracket notation (e.g., FeatureView[['my_feature', 'other_feature']]).
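The selection rule can be illustrated with a small stand-in. This is only a sketch of the semantics, not Tecton's implementation: ToyFeatureView and served_columns are hypothetical names, the features last_seen and total are made up, and the <feature_view>__<feature> column naming is inferred from the sample outputs shown in this document.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ToyFeatureView:
    """Hypothetical stand-in for a Feature View; not the Tecton class."""

    name: str
    features: tuple

    def __getitem__(self, selected: Sequence[str]) -> "ToyFeatureView":
        # Double-bracket notation passes a list, e.g. fv[["count"]].
        if isinstance(selected, str):
            raise TypeError("pass a list of feature names, e.g. fv[['count']]")
        unknown = [f for f in selected if f not in self.features]
        if unknown:
            raise KeyError(f"unknown features: {unknown}")
        return ToyFeatureView(self.name, tuple(selected))


def served_columns(feature_views: List[ToyFeatureView]) -> List[str]:
    """List the columns a service would expose, named <view>__<feature>."""
    return [f"{fv.name}__{f}" for fv in feature_views for f in fv.features]


# Made-up feature names, except "count", which appears in the example above.
impressions = ToyFeatureView("user_ad_impression_counts", ("count", "last_seen"))
frequency = ToyFeatureView("user_total_ad_frequency_counts", ("total",))

# The whole first view is included; only "count" is taken from the second.
columns = served_columns([frequency, impressions[["count"]]])
```

Passing a view unmodified contributes every feature it defines, while the bracketed form contributes only the named subset.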
Using Feature Services
Using the low-latency HTTP API Interface
See the Reading Online Features Using the HTTP API guide.
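As a rough sketch of what an online request might look like, the snippet below builds (but does not send) a POST request with Python's standard library. The endpoint path, the Tecton-key authorization scheme, and the payload field names are assumptions to be checked against the HTTP API guide. The cluster URL, API key, and join key values are placeholders.

```python
import json
import urllib.request


def build_get_features_request(cluster_url, api_key, workspace, service_name, join_keys):
    """Build (but do not send) a POST request for online feature retrieval."""
    # Assumed payload shape; confirm field names in the HTTP API guide.
    payload = {
        "params": {
            "workspace_name": workspace,
            "feature_service_name": service_name,
            "join_key_map": join_keys,
        }
    }
    return urllib.request.Request(
        # Assumed endpoint path; confirm in the HTTP API guide.
        url=f"{cluster_url.rstrip('/')}/api/v1/feature-service/get-features",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Tecton-key {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_get_features_request(
    cluster_url="https://example.tecton.ai",  # placeholder cluster URL
    api_key="MY_API_KEY",  # placeholder; supply a real key at runtime
    workspace="prod",
    service_name="ctr_prediction_service",
    join_keys={"ad_id": "5417", "user_uuid": "6c423390"},  # made-up key values
)
```

Sending the request would be a matter of passing it to urllib.request.urlopen(request) and decoding the JSON response. Keeping construction separate from sending makes the payload easy to inspect and test.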
Using the Offline Feature Retrieval SDK Interface
Use the offline or batch interface for batch prediction jobs or to generate training datasets. To fetch a dataframe from a Feature Service with the Python SDK as a client, use the FeatureService.get_historical_features() method.
To make a batch request, first create a context consisting of the join keys for prediction and the desired feature timestamps:
events = spark.read.parquet("dbfs:/sample_events.pq")
display(events)
Sample output (data not shown):
ad_id | user_uuid | timestamp | clicked |
---|---|---|---|
... | ... | ... | ... |
... | ... | ... | ... |
Then, using get_historical_features(), generate the feature values:
import tecton

ws = tecton.get_workspace("prod")
feature_service = ws.get_feature_service("price_prediction_feature_service")
# to_pandas() converts the result to a pandas DataFrame
result_pandas_df = feature_service.get_historical_features(events).to_pandas()
Sample output (data not shown):
ad_id | user_uuid | timestamp | clicked | ad_ground_truth_ctr_performance_7_days__ad_total_clicks_7days | ad_ground_truth_ctr_performance_7_days__ad_total_impressions_7days |
---|---|---|---|---|---|
... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... |
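Conceptually, each event row is extended with feature values looked up by join key and timestamp. The sketch below illustrates one plausible reading of that lookup: an as-of join that picks the newest feature value at or before each event's timestamp. The source does not spell out these semantics, so this is an assumption. It is not how Tecton computes the result, and the helper names and data are made up.

```python
from bisect import bisect_right


def latest_value_as_of(history, event_time):
    """Return the newest feature value with timestamp <= event_time, else None.

    `history` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, event_time)
    return history[idx - 1][1] if idx > 0 else None


def join_features(events, feature_history, key, feature_name):
    """Append one feature column to each event row using an as-of lookup."""
    rows = []
    for event in events:
        history = feature_history.get(event[key], [])
        value = latest_value_as_of(history, event["timestamp"])
        rows.append({**event, feature_name: value})
    return rows


# Made-up data: integer timestamps stand in for real datetimes.
feature_history = {"ad_1": [(10, 3), (20, 7), (30, 12)]}
events = [
    {"ad_id": "ad_1", "timestamp": 25, "clicked": 1},
    {"ad_id": "ad_1", "timestamp": 5, "clicked": 0},
    {"ad_id": "ad_2", "timestamp": 25, "clicked": 0},
]
training_rows = join_features(events, feature_history, "ad_id", "ad_total_clicks_7days")
```

Under this reading, an event never sees a feature value recorded after its own timestamp, which is what keeps future information out of a training set. Events with no earlier value, or with an unknown key, get None.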
Using Feature Logging
Feature Services have the ability to continuously log online requests and feature vector responses as Tecton Datasets. These logged feature datasets can be used for auditing, analysis, training dataset generation, and spine creation.
To enable feature logging on a Feature Service, add a LoggingConfig as in the example below and optionally specify a sample rate. You can also set log_effective_times=True to log the feature timestamps from the Feature Store. As a reminder, Tecton always serves the latest stored feature values as of the time of the request.
Run tecton apply to apply your changes.
from tecton import FeatureService, LoggingConfig

# The Feature Views are imported as in the earlier example.
ctr_prediction_service = FeatureService(
    name="ctr_prediction_service",
    features=[ad_ground_truth_ctr_performance_7_days, user_total_ad_frequency_counts],
    logging=LoggingConfig(
        sample_rate=0.5,  # fraction of requests to log
        log_effective_times=False,
    ),
)
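To build intuition for the sample rate, the following sketch simulates per-request random sampling. How Tecton actually selects which requests to log is not described here, so the independent coin-flip model and the helper names should_log and count_logged are assumptions for illustration only.

```python
import random


def should_log(sample_rate, rng):
    """Decide whether to log one request, given a rate in [0.0, 1.0]."""
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    # rng.random() is uniform on [0.0, 1.0), so a rate of 1.0 logs everything.
    return rng.random() < sample_rate


def count_logged(num_requests, sample_rate, seed=0):
    """Simulate num_requests requests and count how many would be logged."""
    rng = random.Random(seed)
    return sum(should_log(sample_rate, rng) for _ in range(num_requests))


# With a rate of 0.5, roughly half of 10,000 simulated requests are logged.
logged = count_logged(10_000, sample_rate=0.5)
```

Under this model, a lower sample rate reduces logged volume proportionally, at the cost of a sparser record of production traffic.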
Applying this configuration creates a new Tecton Dataset under the Datasets tab in the Web UI. New feature logs are appended to this dataset every 30 minutes. If the features in the Feature Service change, a new dataset version is created.
This dataset can be fetched in a notebook using the code snippet below.
AWS EMR users will need to follow the instructions for installing Avro libraries in notebooks to use Tecton Datasets, since features are logged in Avro format.
import tecton
ws = tecton.get_workspace("prod")
dataset = ws.get_dataset("ctr_prediction_service.logged_requests.4")
display(dataset.to_pandas())
Sample output (data not shown):
ad_id | user_uuid | timestamp | clicked | ad_ground_truth_ctr_performance_7_days__ad_total_clicks_7days | ad_ground_truth_ctr_performance_7_days__ad_total_impressions_7days |
---|---|---|---|---|---|
... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... |