Comparison To ‘Well-tuned Simple Nets Excel on Tabular Datasets’

Discussion Of The Paper¹

The principal aim of the paper was to demonstrate that a simple multi-layer perceptron model with extensive regularisation and data augmentation could produce excellent results on tabular data.

To test this claim, the authors took numerous tabular datasets and compared their model against other neural network and boosted tree models such as XGBoost.

These datasets consist entirely of classification problems.

Unfortunately, the authors used balanced accuracy as the sole performance measure and did not publish detailed results from the individual model runs, so the only figures available are those reported in the paper.
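For context, balanced accuracy is the unweighted mean of per-class recall, so a classifier cannot score well simply by favouring the majority class. A minimal sketch of the metric (the labels below are hypothetical and not taken from the paper's runs):

```python
def balanced_accuracy(y_true, y_pred):
    # Balanced accuracy = unweighted mean of per-class recall.
    # Each class contributes equally regardless of its frequency.
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        support = sum(1 for t in y_true if t == c)
        recalls.append(tp / support)
    return sum(recalls) / len(recalls)

# A majority-class predictor gets 0.8 plain accuracy here,
# but only 0.5 balanced accuracy (recall 1.0 and 0.0 averaged).
score = balanced_accuracy([0, 0, 0, 0, 1], [0, 0, 0, 0, 0])  # → 0.5
```

This is why balanced accuracy is a common choice for the imbalanced class distributions found in many tabular benchmarks.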

The code behind the results is available on GitHub, but the data has not been pre-split into train, validation and test sets. The splitting strategy is embedded in the code, so it is possible to reproduce the splits the researchers used – this was confirmed by a member of the research team.
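Because the splits are seed-driven rather than shipped with the data, reproducing them amounts to re-running the same deterministic shuffle. A minimal sketch of the idea, with a hypothetical seed and a 60/20/20 split (the actual seed, fractions, and any stratification are those embedded in the paper's repository, not the values below):

```python
import random

def split_indices(n, seed=11, frac=(0.6, 0.2, 0.2)):
    # Hypothetical sketch: shuffle row indices with a fixed seed so the
    # same train/validation/test partition is produced on every run.
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = int(frac[0] * n)
    n_val = int(frac[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train, val, test = split_indices(100)
```

Any process with this shape is fully reproducible as long as the seed and split fractions are recorded alongside the code.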

In the interest of time, AIBMod took just two of the datasets, chosen because they overlapped with the ‘Revisiting Deep Learning Models For Tabular Data’ comparison runs. This meant AIBMod could simply reuse the hyperparameters from those comparisons.

AIBMod vs Paper Results Comparison

| Dataset | ADULT INCOME (AD) | HIGGS (HI) |
|---|---|---|
| Problem Type | Binary Classification | Binary Classification |
| Performance Metric | Balanced Accuracy | Balanced Accuracy |
| Which Is Better | Higher | Higher |
| **Model** | | |
| AIBMod | **84.154%** (ROC – 92.936%) | **73.905%** (ROC – 82.296%) |
| MLP+C | 82.443% | 73.546% |
| AutoGL. S | 80.557% | 73.798% |
| XGBoost | 79.824% | 72.944% |
The AIBMod figures are from their test results. All other figures are from Table 2 of the paper under discussion.
Bold figures indicate the best result for each dataset.

Results Discussion

The table above shows that AIBMod’s model surpasses the other tested models on both datasets.

  1. Well-tuned Simple Nets Excel on Tabular Datasets, Kadra et al. – originally published June 2021, updated Nov 2021 ↩︎