Version: 0.6

Creating and Training a Model

Creating the model

In your notebook, create a RandomForestRegressor model and chain it after the full_pipe pipeline that you created earlier, so that preprocessing and the model form a single pipeline.

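from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline

# Hyperparameters for the random forest
n_estimators = 100  # number of trees
max_depth = 6       # maximum depth of each tree
max_features = 3    # features considered at each split

# Chain the preprocessing pipeline and the regressor into one estimator
rf = make_pipeline(
    full_pipe,
    RandomForestRegressor(n_estimators=n_estimators, max_depth=max_depth, max_features=max_features),
)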
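The full_pipe pipeline is defined earlier in the tutorial and is not reproduced here. As a minimal sketch of how make_pipeline chains a preprocessing step with the regressor, the following uses a hypothetical ColumnTransformer (demo_pipe) with made-up column names in place of full_pipe. These names are assumptions for illustration only:

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical stand-in for full_pipe; the column names are illustrative only.
demo_pipe = ColumnTransformer(
    [
        ("num", StandardScaler(), ["amount"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["category"]),
    ]
)

demo_rf = make_pipeline(
    demo_pipe,
    RandomForestRegressor(n_estimators=100, max_depth=6, max_features=3),
)

# make_pipeline names each step after its lowercased class name.
step_names = [name for name, _ in demo_rf.steps]
print(step_names)
```

The automatically generated step names are what appear in the Pipeline(steps=[...]) representation when the fitted model is displayed.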

Training the model

To train the model, call its fit() method in your notebook. The transforms in the pipeline are fitted and applied to the training data before the model is trained:

rf.fit(X_train, y_train)

Output:

Pipeline(steps=[('columntransformer',
...
...
...
)]
)
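To illustrate that the pipeline applies its transforms before the final estimator sees the data, here is a small sketch on synthetic data. It uses a StandardScaler as a stand-in for full_pipe; the data and the names X_demo, y_demo, and demo_rf are assumptions, not part of the tutorial:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data with 0/1 labels (illustrative only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(float)

demo_rf = make_pipeline(
    StandardScaler(),  # stand-in for full_pipe
    RandomForestRegressor(n_estimators=20, max_depth=6, max_features=3, random_state=0),
)
demo_rf.fit(X_demo, y_demo)

# Reproduce the pipeline by hand: transform first, then predict with the forest.
scaler = demo_rf.named_steps["standardscaler"]
forest = demo_rf.named_steps["randomforestregressor"]
manual = forest.predict(scaler.transform(X_demo))

same = bool(np.allclose(manual, demo_rf.predict(X_demo)))
print(same)
```

Calling predict() on the pipeline gives the same result as transforming the data manually and then calling predict() on the forest, which is why the raw X_train and X_test can be passed to rf directly.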

Make predictions using the testing data

To get a prediction, call the predict() method of the model (in your notebook):

predictions = rf.predict(X_test)
print(predictions)

Sample Output:

[0.32 0.36]
Note

Each prediction is a value between 0 and 1, inclusive. The higher the value, the more likely the transaction is to be fraudulent. The value can be exactly 0 or 1, because a RandomForestRegressor averages the outputs of its trees, and with 0/1 training labels that average reaches 0 or 1 when every tree agrees.
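The range of the predictions can be checked with a short sketch. It assumes synthetic data with 0/1 labels and a bare RandomForestRegressor (no preprocessing); none of these names come from the tutorial:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic features with 0/1 labels (illustrative only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 3))
y_demo = (X_demo[:, 0] > 0.5).astype(float)

forest = RandomForestRegressor(n_estimators=50, max_depth=6, random_state=0)
forest.fit(X_demo, y_demo)

# Score unseen rows; each score is an average of per-tree leaf means.
scores = forest.predict(rng.normal(size=(100, 3)))

in_range = bool(((scores >= 0.0) & (scores <= 1.0)).all())
print(scores.min(), scores.max(), in_range)
```

Because each tree predicts the mean label of a leaf and the forest averages the trees, the scores stay within the range of the training labels. They are scores rather than calibrated probabilities.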

Get the error in the prediction

Compute the error of the predictions that the model made on the testing data (X_test) by comparing them with the actual labels (y_test). This example uses the mean squared error:

from sklearn.metrics import mean_squared_error
mse = mean_squared_error(y_test, predictions)
print(mse)

Sample Output:

0.11599999999999999
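The sample value can be checked by hand. Assuming both test transactions are labeled 0 (not fraudulent), which is an inference from the sample outputs rather than something the tutorial states, the mean squared error of the sample predictions [0.32 0.36] is (0.32² + 0.36²) / 2 = 0.116. A minimal sketch of that arithmetic:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([0.0, 0.0])    # assumed labels, not given in the tutorial
preds = np.array([0.32, 0.36])   # sample predictions shown earlier

# Mean of the squared differences, computed manually and with scikit-learn.
manual_mse = float(np.mean((y_true - preds) ** 2))
sk_mse = float(mean_squared_error(y_true, preds))
print(manual_mse, sk_mse)
```

Both values agree with the sample output up to floating-point rounding.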

Iterating on your model

Based on the prediction error of a trained model, you may choose to iterate on the model by updating your features, using a model with different parameters, or choosing a different model. Iterating on the model is not covered in this tutorial.
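Although iteration is outside the scope of this tutorial, the idea of trying different parameters can be sketched briefly. The example below assumes synthetic data and compares a few arbitrary max_depth values by mean squared error; the data, the candidate depths, and the variable names are all illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data with 0/1 labels (illustrative only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(400, 4))
y_demo = (X_demo[:, 0] * X_demo[:, 1] > 0).astype(float)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

# Train one forest per candidate depth and record its held-out error.
results = {}
for depth in (2, 6, 10):
    model = RandomForestRegressor(
        n_estimators=50, max_depth=depth, max_features=3, random_state=0
    )
    model.fit(X_tr, y_tr)
    results[depth] = mean_squared_error(y_te, model.predict(X_te))

best_depth = min(results, key=results.get)
print(results, best_depth)
```

In practice, compare candidates on a separate validation split or with cross-validation, and keep the testing data for the final error estimate.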
