Importing Python Modules and Objects into Transformations
Importing Python modules into transformations
Transformations support only the pandas and numpy modules, and these modules can be used only in Pandas transformations.
Python modules must be imported inside the transformation function.
Avoid using aliases for imports (e.g. use import pandas instead of import pandas as pd).
Any modules used for type annotations in function signatures must be imported outside the function.
In the following example, the pandas module is imported in two places:
- Inside the transformation function, because the function body uses the pandas module
- Outside the transformation function, because pandas type annotations are used in the function signature (my_transformation(request: pandas.DataFrame) -> pandas.DataFrame:)
from tecton import transformation
import pandas  # required for type hints on my_transformation.


@transformation(mode="pandas")
def my_transformation(request: pandas.DataFrame) -> pandas.DataFrame:
    import pandas  # required for pandas.DataFrame() below.

    df = pandas.DataFrame()
    df["amount_is_high"] = (request["amount"] >= 10000).astype("int64")
    return df
Importing Python objects into transformation functions
Object imports must be done outside of the transformation definition.
The following imports of objects into transformation functions are allowed:
- Functions
- Constants
The following imports of objects into transformation functions are not allowed:
- Classes
- Class instances
- Enums
In the following example, my_func, my_int_const, my_string_const, and my_dict_const are imported from my_local_module. The import takes place outside of the transformation function.
from tecton import transformation
import pandas  # required for type hints on my_transformation.
from my_local_module import my_func, my_int_const, my_string_const, my_dict_const


@transformation(mode="pandas")
def my_transformation(request: pandas.DataFrame) -> pandas.DataFrame:
    import pandas  # required for pandas.DataFrame() below.

    df = pandas.DataFrame()
    df[my_dict_const["resultval"]] = my_func(request[my_string_const] >= my_int_const)
    return df
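The example above does not show my_local_module itself. As a rough sketch only, the module might look like the following. The file contents are hypothetical: the constant values and the body of my_func are assumptions, chosen so that the transformation would reproduce the amount_is_high column from the first example. The module contains only functions and constants, in line with the list of allowed imports.

```python
# my_local_module.py
# Hypothetical contents: the original example does not show this file.
# Only functions and constants are defined (no classes, instances, or enums).

# Constants (values are assumptions chosen to mirror the first example).
my_int_const = 10000                             # threshold for "high" amounts
my_string_const = "amount"                       # name of the input column
my_dict_const = {"resultval": "amount_is_high"}  # name of the output column


def my_func(flags):
    """Cast a boolean column to int64.

    `flags` is assumed to be any object with an `astype` method,
    such as the pandas Series produced by a comparison.
    """
    return flags.astype("int64")
```

With these assumed values, the expression in the transformation compares request["amount"] against 10000 and stores the int64 result under "amount_is_high".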