Transformation
Summary
A Tecton Transformation. Transformations are used to encapsulate and share transformation logic between Feature Views.
Use the tecton.transformation() decorator to create a Transformation.
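To make the decorator pattern concrete, here is a minimal, library-free sketch. It does not use or reproduce Tecton's implementation: `SketchTransformation`, `transformation_sketch`, the `large_transactions` function, the table name, and the choice to default the name to the function name are all hypothetical and only illustrate how a decorator can wrap a user function together with its mode and metadata.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class SketchTransformation:
    """Hypothetical stand-in for a Transformation object (not Tecton's class)."""

    name: str
    mode: str
    transformer: Callable[..., str]  # the wrapped user function
    description: str = ""
    tags: Dict[str, str] = field(default_factory=dict)


def transformation_sketch(mode: str, name: Optional[str] = None, description: str = ""):
    """Hypothetical decorator factory that wraps a user function in a SketchTransformation."""

    def wrap(user_function: Callable[..., str]) -> SketchTransformation:
        return SketchTransformation(
            # Assumption for this sketch: fall back to the function name.
            name=name or user_function.__name__,
            mode=mode,
            transformer=user_function,
            description=description,
        )

    return wrap


@transformation_sketch(mode="spark_sql", description="Keep only large transactions.")
def large_transactions(transactions: str) -> str:
    # In this sketch, a SQL-style user function returns a query string.
    return f"SELECT user_id, amount FROM {transactions} WHERE amount > 100"


# The decorated name now refers to the wrapper object, not the raw function.
query = large_transactions.transformer("transactions")
print(large_transactions.name, large_transactions.mode)
print(query)
```

The point of the pattern is that the decorated name becomes an object carrying both the logic and its metadata, which is what allows the same logic to be shared between several consumers.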
Attributes
Name | Data Type | Description |
---|---|---|
created_at | Optional[datetime.datetime] | The time that this Tecton object was created or last updated. |
defined_in | Optional[str] | The repo filename where this object was declared. |
description | str | Returns the description of the Tecton object. |
id | str | Returns the unique id of the Tecton object. |
info | | |
name | str | Returns the name of the Tecton object. |
owner | Optional[str] | Returns the owner of the Tecton object. |
tags | Dict[str, str] | Returns the tags of the Tecton object. |
transformer | | The user function for this transformation. |
workspace | Optional[str] | Returns the workspace that this Tecton object belongs to. |
Methods
Name | Description |
---|---|
__init__(...) | Creates a new Transformation. |
run(...) | Run the transformation against inputs. |
summary() | Displays a human-readable summary of this Transformation. |
validate() | Validate this Tecton object and its dependencies (if any). |
__init__(...)
Creates a new Transformation. Use the @transformation decorator to create a Transformation instead of directly using this constructor.
Parameters
- mode (str) – This parameter must be one of “spark_sql”, “pyspark”, “snowflake_sql”, “bigquery_sql”, “snowpark”, “pandas”, or “python”. See Transformation Modes for more details.
- name (str) – A unique name for the Transformation.
- description (Optional[str]) – A human-readable description.
- tags (Optional[Dict[str, str]]) – Tags associated with this Tecton Transformation (key-value pairs of arbitrary metadata).
- owner (Optional[str]) – Owner name (typically the email of the primary maintainer).
- prevent_destroy (Optional[bool]) – If True, this Tecton object will be blocked from being deleted or re-created (i.e. a destructive update) during tecton plan/apply.
- user_function (Optional[Callable[..., Union[str, DataFrame]]]) – The user function for this transformation.
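Because mode accepts only a fixed set of strings, a typo is easy to catch before an object is ever constructed. The following is a small sketch of such a check in plain Python. `ALLOWED_MODES` and `check_mode` are hypothetical helpers written for this illustration, not part of the Tecton SDK. The set of values is copied from the parameter description above.

```python
# Hypothetical helper, not part of the Tecton SDK.
# The allowed values mirror the list in the `mode` parameter description.
ALLOWED_MODES = frozenset(
    {
        "spark_sql",
        "pyspark",
        "snowflake_sql",
        "bigquery_sql",
        "snowpark",
        "pandas",
        "python",
    }
)


def check_mode(mode: str) -> str:
    """Return `mode` unchanged if it is allowed, otherwise raise ValueError."""
    if mode not in ALLOWED_MODES:
        raise ValueError(
            f"Unsupported mode {mode!r}; expected one of {sorted(ALLOWED_MODES)}"
        )
    return mode


print(check_mode("pandas"))
```

Returning the value rather than a boolean lets the check be used inline, for example `mode=check_mode(user_supplied_mode)`.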
run(...)
Run the transformation against inputs.
Currently, this method only supports spark_sql, pyspark, and pandas modes.
Parameters
- *inputs (Union[DataFrame, Series, TectonDataFrame, DataFrame, str, int, float, bool]) – Positional arguments to the transformation function. For PySpark and SQL transformations, these are either pandas.DataFrame or pyspark.sql.DataFrame objects. For on-demand transformations, these are pandas.DataFrame objects.
- context (Optional[BaseMaterializationContext]) – An optional materialization context object. (Default: None)
summary()
Displays a human-readable summary of this Transformation.
validate()
Validate this Tecton object and its dependencies (if any).
Validation performs most of the same checks and operations as tecton plan.
- Check for invalid object configurations, e.g. setting conflicting fields.
- For Data Sources and Feature Views, test query code and derive schemas, e.g. test that a Data Source’s specified S3 path exists or that a Feature View’s SQL code executes and produces supported feature data types.
Objects already applied to Tecton do not need to be re-validated on retrieval (e.g. fv = tecton.get_workspace('prod').get_feature_view('my_fv')) since they have already been validated during tecton plan. Locally defined objects (e.g. my_ds = BatchSource(name="my_ds", ...)) may need to be validated before some of their methods can be called, e.g. my_feature_view.get_features_for_events().