Model Background & Objectives

Model Background

Deep learning (i.e. neural network) models have made considerable progress over the last decade on vision, language and game playing problems.

Neural network models were inspired by the biological wonder that is the human brain, which is built on neurons, their activations and the synaptic connections between them.

The architecture of the human brain is very loosely ‘replicated’ mathematically within a neural network model.

Back in 2018, amid the hype surrounding neural network models and their advances, AIBMod initially believed that neural network models should be able to outperform all other machine learning model types on all problem types.

After all, if that were not true, then the neural network models being deployed to control critical functions (self-driving cars, medical analysis and diagnosis, military applications, machine failure detection, job applications, fraud detection, loan decisions, insurance premium calculations and so on) might not be optimal, and so not as performant as they could be.

It would mean that there were fallibilities within neural network models that needed to be found, understood and overcome.

As it turned out, whilst neural network models performed well on business data problems, boosted trees could outperform them with relative ease. To AIBMod, this implied that there was still much to be learnt about neural networks in general.

AIBMod’s Model Objectives

From the outset, AIBMod had various objectives for the deep learning model it would develop:

Objective 1: able to provide state-of-the-art performance on business data problems

Reason: The model has an almost limitless variety of potential applications: credit card fraud detection, credit loan pricing and approval, insurance premium calculation, new product recommendation, machine failure prediction, health and wellbeing analysis and so on.

However it is used, the model aims to provide the most accurate predictions possible given the data it is trained on.

The premise at the outset was that a model capable of significant gains on vision and language problems must be able to learn something of those problems' underlying structures, and that this ability should also be useful on business data problems.
Objective 2: able to handle numerical inputs (such as age or income), categorical inputs (such as gender or country) and missing data

Reason: Language models represent words as numerical vectors (embeddings) and vision models process normalised numerical pixel data.

Business data, by contrast, predominantly comes in both numerical and categorical forms. The numerical data may be distributed in many different ways; categorical features might have high-frequency or low-frequency values; and there may be missing values.

The model needed to be able to transform and process the inputs, however they present themselves, so that it can establish the statistical rules underlying the data.
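A minimal sketch of the kind of input preparation described above (the feature names, the mean-imputation scheme and the index-0 unknown token are all illustrative assumptions, not AIBMod's actual method): numerical features are standardised with an explicit missing-value indicator, and categorical values are mapped to integer indices ready for an embedding layer.

```python
import math

def fit_numeric(values):
    """Compute mean/std over the non-missing values of one numeric column."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    var = sum((v - mean) ** 2 for v in present) / len(present)
    return mean, math.sqrt(var) or 1.0   # fall back to 1.0 if constant

def transform_numeric(v, mean, std):
    """Return (standardised value, missing indicator)."""
    if v is None:
        return 0.0, 1.0                  # impute with the mean, flag as missing
    return (v - mean) / std, 0.0

def build_vocab(values):
    """Map each category (plus an 'unknown' token at index 0) to an index."""
    vocab = {"<unk>": 0}
    for v in values:
        vocab.setdefault(v, len(vocab))
    return vocab

ages = [35.0, None, 52.0, 41.0]
countries = ["UK", "DE", "UK", None]

mean, std = fit_numeric(ages)
vocab = build_vocab(c for c in countries if c is not None)

# Each row becomes ((standardised age, missing flag), country index);
# missing categoricals fall back to the unknown index 0.
rows = [(transform_numeric(a, mean, std), vocab.get(c, 0))
        for a, c in zip(ages, countries)]
```

In a real model, the integer indices would feed an embedding lookup, while the standardised value and missing-indicator pair would be passed to the network directly.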
Objective 3: able to perform with both structured and unstructured input

Reason: Some business datasets are completely structured: they look like an Excel spreadsheet, a large number of rows where each row has the same number of columns or features. Classical data analysis tools such as linear regression, random forests or boosted trees can easily handle these datasets.

Sometimes, however, business data can be highly unstructured. Think of a loan decision / pricing model. Credit bureaus may have information on some prospective borrowers and none on others. Some borrowers may have previously taken out loans or credit cards from the lender, with repayment histories of varying length, and others may not.

Normally, these variably sized inputs would need to be pre-processed (taking averages or standard deviations, for example) to compress the data into fixed-size structures. This implicitly leads to a loss of information.

AIBMod developed a model framework that enables input structures, however they exist in the real world, to be passed into the model as they are, with no pre-processing.

This results in a significant performance boost over the classical models, as can be seen here.
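The information loss caused by the classical compression step described above can be sketched as follows (the repayment-history data and the choice of summary statistics are illustrative assumptions): two differently ordered histories collapse to identical fixed-size features.

```python
import statistics

def summarise(history):
    """Compress a variable-length repayment history into fixed-size features."""
    if not history:                       # no bureau data at all
        return {"n": 0, "mean": 0.0, "std": 0.0}
    return {
        "n": len(history),
        "mean": statistics.fmean(history),
        "std": statistics.pstdev(history),
    }

# Two very different repayment patterns produce identical summaries:
# the ordering of payments (a genuine credit-risk signal) is discarded.
a = summarise([100, 300, 100, 300])
b = summarise([300, 100, 300, 100])
```

Here `a == b`, even though one borrower's payments were trending differently from the other's: whatever signal lived in the sequence itself is gone before the model ever sees the data.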
Objective 4: capable of being interpreted

Reason: Many machine learning applications are subject to regulation: loan pricing, insurance premium calculations and so on.

As the founder of AIBMod has a background in finance, the model needed to have an interpretability framework built in so that it could be used for regulated activities.

AIBMod therefore developed a proprietary training approach which facilitates feature contribution analysis, something that is essential if one wishes to explain model outputs.

This means that for an individual prediction, the interactions and correlations of the features that drove that prediction can be analysed. This is an extremely powerful approach: it does not limit the analysis to linear relationships (as methods such as PCA do); instead, the very model used to produce the overall prediction is also used to explain it.

A couple of worked examples have been set out here.
Objective 5: eradicate the need for feature crossing / manual feature crafting

Reason: The deep learning models that have made large performance gains in vision, language and game playing do not include any manual rules (heuristics); the models learn their ‘decision-making rules’ through the training process, and manual intervention would quite likely be harmful.

It therefore seemed strange to the founder of AIBMod that machine learning competitions on tabular data were often won by linear regression or tree-based models with manually created features (feature crossing, for example).

This is completely at odds with the approaches taken by researchers in vision, language and game playing.

The need to ‘cross features’ highlights a weakness of the models historically used on tabular data: a theoretically pure deep learning model should be able to derive its own feature relations and interactions when and where needed.
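For readers unfamiliar with the practice, feature crossing as used with classical tabular models looks like the sketch below (the column names are hypothetical): two categorical columns are concatenated into a single combined category so that a linear or tree-based model can pick up their interaction directly.

```python
def cross(*values):
    """Combine several categorical values into one crossed category string."""
    return "_x_".join(str(v) for v in values)

rows = [
    {"country": "UK", "device": "mobile"},
    {"country": "UK", "device": "desktop"},
    {"country": "DE", "device": "mobile"},
]

# The crossed column lets a linear model assign a weight to each
# (country, device) combination rather than to each column separately.
for row in rows:
    row["country_x_device"] = cross(row["country"], row["device"])
```

A deep learning model, by contrast, is expected to learn such interactions on its own from the raw columns, which is precisely the point being made above.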
Objective 6: capable of time series analysis

Reason: Time series analysis remains one of the big problem areas for machine learning. AIBMod believes the model framework it has developed will be an excellent ‘engine’ for a time series model because of its ability to process numerical and categorical values effectively in conjunction with each other.

The neural transformer built by AIBMod is currently capable of multivariate time series analysis, but this capability has not yet been fully developed.