In this vignette we show how to create a TabNet model using the tidymodels interface.
We are going to use the lending_club dataset.
First, let’s split our dataset into training and testing sets so we can later assess the performance of our model:
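A minimal sketch of this step, assuming the copy of lending_club that ships with the modeldata package. The seed, the default 75/25 proportion, and stratifying on the outcome `Class` are our assumptions, not choices stated above.

```r
library(tidymodels)

# Assumption: lending_club is loaded from the modeldata package.
data("lending_club", package = "modeldata")

# Assumption: a fixed seed so the split itself is reproducible.
set.seed(123)

# initial_split() keeps 75% of the rows for training by default;
# stratifying on Class keeps the class balance similar in both parts.
split <- initial_split(lending_club, strata = Class)
train <- training(split)
test  <- testing(split)

dim(train)
dim(test)
```

Every row ends up in exactly one of `train` or `test`, so the held-out set never influences training.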
We now define our pre-processing steps. Note that TabNet handles categorical variables, so they need no transformation. Normalizing the numeric variables is a good idea, though.
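The pre-processing described here can be sketched as a recipe with a single normalization step. The block rebuilds the training set so it runs on its own. The seed, the stratified split, and the use of `all_numeric_predictors()` as the selector are assumptions for illustration.

```r
library(tidymodels)

# Recreate the training split so this block runs on its own.
data("lending_club", package = "modeldata")
set.seed(123)
train <- training(initial_split(lending_club, strata = Class))

# Class is the outcome; every other column is a predictor.
# Only numeric predictors are centered and scaled; factor columns
# are passed through unchanged.
rec <- recipe(Class ~ ., data = train) %>%
  step_normalize(all_numeric_predictors())

rec
```

The recipe is only a specification at this point. The means and standard deviations are estimated later, from the training data, when the recipe is prepped or fitted as part of a workflow.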
Next, we define our model. We are going to train for 50 epochs with a batch size of 128. There are other hyperparameters, but we are going to use the defaults.
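As a sketch, the model specification could look like the following. The classification mode follows from the binary outcome. Setting `verbose = TRUE` on the engine is our own choice, made only to see training progress.

```r
library(tidymodels)
library(tabnet)

# 50 epochs with a batch size of 128; every other hyperparameter
# is left at its default.
mod <- tabnet(epochs = 50, batch_size = 128) %>%
  # Assumption: verbose output is switched on to follow training.
  set_engine("torch", verbose = TRUE) %>%
  set_mode("classification")

mod
```

Nothing is trained yet. Like any parsnip specification, `mod` only records what to fit and how.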
We also define our
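One object that fits at this point in a tidymodels pipeline is a workflow, which bundles the recipe and the model specification so they are fitted together. The sketch below assumes that is what is meant here. It rebuilds both pieces so it runs on its own, and the seed and split settings are illustrative.

```r
library(tidymodels)
library(tabnet)

# Recreate the training split so this block runs on its own.
data("lending_club", package = "modeldata")
set.seed(123)
train <- training(initial_split(lending_club, strata = Class))

# Pre-processing: normalize the numeric predictors only.
rec <- recipe(Class ~ ., data = train) %>%
  step_normalize(all_numeric_predictors())

# Model: TabNet with 50 epochs and a batch size of 128.
mod <- tabnet(epochs = 50, batch_size = 128) %>%
  set_engine("torch") %>%
  set_mode("classification")

# Bundle the model and the recipe into a single workflow object.
wf <- workflow() %>%
  add_model(mod) %>%
  add_recipe(rec)

wf
```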
We can now define our cross-validation strategy:
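A sketch of the resampling scheme, assuming 5-fold cross-validation, which is consistent with `n = 5` in the metrics reported further down. The seed and the way the training set is rebuilt are assumptions for illustration.

```r
library(tidymodels)

# Recreate the training split so this block runs on its own.
data("lending_club", package = "modeldata")
set.seed(123)
train <- training(initial_split(lending_club, strata = Class))

# Five folds: each fold holds out roughly one fifth of the training
# rows for assessment and fits on the remaining four fifths.
folds <- vfold_cv(train, v = 5)

folds
```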
Next, we fit the model:
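Putting the pieces together, fitting across the resamples could be sketched as below. The block is self-contained, so it rebuilds the data split, recipe, model, and folds. The seed, the stratified split, and the explicit metric set (accuracy and ROC AUC) are assumptions. Training needs a working torch installation and takes several minutes.

```r
library(tidymodels)
library(tabnet)

# Rebuild the inputs so this block runs on its own.
data("lending_club", package = "modeldata")
set.seed(123)
train <- training(initial_split(lending_club, strata = Class))

rec <- recipe(Class ~ ., data = train) %>%
  step_normalize(all_numeric_predictors())

mod <- tabnet(epochs = 50, batch_size = 128) %>%
  set_engine("torch") %>%
  set_mode("classification")

wf <- workflow() %>%
  add_model(mod) %>%
  add_recipe(rec)

folds <- vfold_cv(train, v = 5)

# Fit the workflow once per fold and score each held-out part.
# Assumption: the metric set is spelled out rather than left to defaults.
fit_rs <- fit_resamples(
  wf,
  resamples = folds,
  metrics = metric_set(accuracy, roc_auc)
)

# Average each metric over the five folds.
metrics <- collect_metrics(fit_rs)
metrics
```

Exact values will vary with the seed, the package versions, and torch's own random number generation.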
After a few minutes we can get the results:
    # A tibble: 2 x 5
      .metric  .estimator  mean     n  std_err
      <chr>    <chr>      <dbl> <int>    <dbl>
    1 accuracy binary     0.946     5 0.000713
    2 roc_auc  binary     0.732     5 0.00539
And finally, we can verify the results on our test set:
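One way to do this is to refit the workflow on the full training set and score the held-out rows, as sketched below. The block is self-contained. The seed and split settings are assumptions, and `.pred_bad` is used as the score because `bad` is the first level of `Class`, which yardstick treats as the event by default.

```r
library(tidymodels)
library(tabnet)

# Rebuild the split so this block runs on its own.
data("lending_club", package = "modeldata")
set.seed(123)
split <- initial_split(lending_club, strata = Class)
train <- training(split)
test  <- testing(split)

rec <- recipe(Class ~ ., data = train) %>%
  step_normalize(all_numeric_predictors())

mod <- tabnet(epochs = 50, batch_size = 128) %>%
  set_engine("torch") %>%
  set_mode("classification")

wf <- workflow() %>%
  add_model(mod) %>%
  add_recipe(rec)

# Fit on the whole training set (recipe and model together).
fitted_wf <- fit(wf, data = train)

# Predict class probabilities for the test rows and compute ROC AUC,
# treating "bad" (the first factor level) as the event.
test_auc <- test %>%
  bind_cols(predict(fitted_wf, test, type = "prob")) %>%
  roc_auc(Class, .pred_bad)

test_auc
```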
    # A tibble: 1 x 3
      .metric .estimator .estimate
      <chr>   <chr>          <dbl>
    1 roc_auc binary         0.710