"Support Vector Machine and Value of Information, an exploration"
Abstract
This document proposes a framework for a first attempt at extracting the Value of Information from a Support Vector Machine algorithm. The first two chapters give a theoretical introduction with a rigorous mathematical basis, covering concepts ranging from Lagrangian Optimization to counting, statistics, and the laws of probability. Support Vector Machines are outlined along the traditional path, drawing on publications, lectures, and well-known theorems. Value of Information is derived from the notion of a Probabilistic Sensitivity Measure and dealt with extensively; this interpretation departs slightly from the usual one and provides an original perspective. Having introduced the two topics, their union is proposed through a preliminary method, designed to make its results as interpretable as possible. In line with the need to assess the importance of each dimension in determining the final label, simple datasets are chosen. From these, substrips of information are sampled, and the performance of an omniscient classifier is compared with that of a specific classifier. The algorithm and its expected results are presented. Two short digressions address issues encountered during its development: the first concerns the problem of sampling a sufficiently large substrip, with enough datapoints to make the training meaningful; the second is an interesting observation of the unusual behaviour of Support Vector Machines when the training dataset is a substrip. As the expectations are not exactly met, further concerns are assessed and considered, so as to form a full picture of what is needed to improve the pipeline. Lastly, all the points dealt with are summarized in the final chapter, which offers a complete overview.