Creating Feature 4
In this topic, you will create and test the fourth feature, user_home_location. The feature outputs the lat (latitude) and long (longitude) of the user's home location. The transaction_distance_from_home feature, which you will define later, uses these outputs.
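To make the role of lat and long concrete, here is a minimal sketch of one common way to turn two coordinate pairs into a distance, the haversine formula. This is an illustration only: the function name haversine_km and the Earth radius constant are assumptions for this sketch, and the tutorial's actual transaction_distance_from_home definition may compute distance differently.

```python
from math import asin, cos, radians, sin, sqrt

# Mean Earth radius in kilometers (an assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, long1, lat2, long2):
    """Great-circle distance in km between two (lat, long) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(long2 - long1)
    # Haversine of the central angle between the two points.
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Hypothetical usage: a home location and a made-up transaction location.
home_lat, home_long = 45.0033, -93.4875
txn_lat, txn_long = 44.9778, -93.2650  # illustrative values, not from the dataset
distance_km = haversine_km(home_lat, home_long, txn_lat, txn_long)
```

The function takes plain floats, so it can be tried in any Python session without a Tecton workspace.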
In your local feature repository, open the file features/batch_features/user_home_location.py. In the file, uncomment the following code, which defines the Feature View.
from tecton import batch_feature_view, FilteredSource
from entities import user
from data_sources.customers import customers
from datetime import datetime, timedelta


@batch_feature_view(
    sources=[FilteredSource(customers)],
    entities=[user],
    mode="spark_sql",
    online=True,
    offline=True,
    feature_start_time=datetime(2017, 1, 1),
    batch_schedule=timedelta(days=1),
    ttl=timedelta(days=3650),
    description="User date of birth, entered at signup.",
    timestamp_field="signup_timestamp",
)
def user_home_location(customers):
    return f"""
        SELECT
            signup_timestamp,
            user_id,
            lat,
            long
        FROM
            {customers}
        """
In your terminal, run tecton apply to apply this Feature View to your workspace.
Testing the Feature View
Get the feature view from the workspace.
fv = ws.get_feature_view("user_home_location")
Call the run method of the feature view to get feature data for the timestamp range of 2017-04-01 to 2017-07-01, and display the generated feature values.
offline_features = fv.run(datetime(2017, 4, 1), datetime(2017, 7, 1)).to_spark().limit(10)
offline_features.show()
Sample Output:
signup_timestamp | user_id | lat | long |
---|---|---|---|
2017-04-06 00:50:31 | user_709462196403 | 45.0033 | -93.4875 |
2017-05-08 16:07:51 | user_687958452057 | 42.1938 | -85.5639 |
2017-06-15 19:33:18 | user_884240387242 | 38.9021 | -88.6645 |
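As a quick sanity check on output like the sample above, the following sketch verifies that each lat and long falls within the valid geographic range. It runs on plain Python tuples copied from the sample rows. The helper name is_valid_coordinate is an assumption for this sketch and is not part of Tecton.

```python
def is_valid_coordinate(lat, long):
    """Return True if lat is within [-90, 90] and long is within [-180, 180]."""
    return -90.0 <= lat <= 90.0 and -180.0 <= long <= 180.0


# Rows copied from the sample output: (user_id, lat, long).
sample_rows = [
    ("user_709462196403", 45.0033, -93.4875),
    ("user_687958452057", 42.1938, -85.5639),
    ("user_884240387242", 38.9021, -88.6645),
]

# Collect the user_id of any row whose coordinates fall outside the valid range.
invalid_users = [
    user_id for user_id, lat, long in sample_rows
    if not is_valid_coordinate(lat, long)
]
```

An empty invalid_users list means every sampled home location has plausible coordinates.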