- SVMlight - lean reference implementation by Thorsten Joachims. I had some problems getting it running in my code (linker errors …)
- SVMlightlib - an easy-to-use wrapper around the original SVMlight by Miha Grčar. Here I had no problems getting it running. Version 1.0 also comes with a .NET wrapper; the following example shows how to use the SVMlightlib functions:

```csharp
public static void Main(string[] args)
{
    // *** Test SVM^light inductive mode ***
    Console.WriteLine("Testing SVM^light inductive mode ...");
    Console.WriteLine("Training ...");
    StreamReader reader = new StreamReader(@"W:\01_data\97_svm_training_test_data\example_textclassification\train.dat");
    List<int> train_set = ReadFeatureVectors(reader);
    reader.Close();
    int model_id = SvmLightLib.TrainModel("", train_set.Count, train_set.ToArray());
    Console.WriteLine("Classifying ...");
    reader = new StreamReader(@"W:\01_data\97_svm_training_test_data\example_textclassification\test.dat");
    List<int> test_set = ReadFeatureVectors(reader);
    reader.Close();
    int correct = 0;
    foreach (int vec_id in test_set)
    {
        int true_lbl = (int)SvmLightLib.GetFeatureVectorLabel(vec_id);
        SvmLightLib.Classify(model_id, 1, new int[] { vec_id });
        Debug.Assert(SvmLightLib.GetFeatureVectorClassifScoreCount(vec_id) == 1);
        double result = SvmLightLib.GetFeatureVectorClassifScore(vec_id, 0);
        int predicted_lbl = result > 0 ? 1 : -1;
        if (true_lbl == predicted_lbl) { correct++; }
    }
    Console.WriteLine("Accuracy: {0:0.00}%", (double)correct / (double)test_set.Count * 100.0);
    // cleanup
    SvmLightLib.DeleteModel(model_id);
    foreach (int vec_id in train_set) { SvmLightLib.DeleteFeatureVector(vec_id); }
    foreach (int vec_id in test_set) { SvmLightLib.DeleteFeatureVector(vec_id); }
    ...
```

Normally an SVM gives you a class label `y=+1` or `y=-1` for a two-class problem and a vector `x` to be classified. But sometimes you rather want class probabilities: `P(y=1|x)`.

There is a lot of work on how SVMs can provide class probabilities. A well-known method is Platt's, described in this talk:

Probabilistic Outputs for SVMs and Comparisons to Regularized Likelihood Methods by Nikos Karampatziakis & John Platt.

This talk also mentions alternative approaches: 1. Wahba's approach, 2. Vapnik's approach, 3. Hastie & Tibshirani's approach.

Platt's method is also implemented in libsvm: use `svm-train -b 1` to train an SVM model that can provide class probabilities and `svm-predict -b 1` to obtain class probabilities for the test vectors.

The main idea seems to be to consider the scores `f(x)` of the SVM decision function (not only the sign, but also the absolute value!), to count how often each score value appears for each class, and to estimate the class probability from these relative frequencies. The idea is also described in this talk: Obtaining Calibrated Probability Estimates from SVMs by Joseph Drish:

• We now reach the part where the outputs of the classifier are converted into well-calibrated posterior probabilities.
• Given a test example x, knowing its distance from the separating hyperplane, and the class to which it belongs, the problem of interest is to know the probability with which x belongs to class j.
• Binning is a simple histogram technique.
• Training examples are first ranked according to their distance scores.
• They are then divided into subsets of equal size – these are called bins.
• Each bin therefore has an upper bound and a lower bound distance score value.
• The number of subsets b is chosen experimentally so that the variance in the resulting probability estimates is reduced.
• Given a test example x, it is placed in the bin according to the distance score predicted for it by the SVM.
• The corresponding estimated probability that x belongs to its predicted class j is the fraction of true positives in the bin, i.e. the fraction of training examples in the bin that actually belong to the class that has been predicted for x.
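The binning procedure described in the talk can be sketched in a few lines. This is an illustrative sketch in Python, not part of SVMlight or SVMlightlib; the scores, labels, and number of bins below are made-up example values, and in practice the scores would come from the SVM decision function `f(x)`:

```python
# Sketch of histogram binning for calibrating SVM scores.
# All data below is made up for illustration; real scores would
# come from the SVM decision function f(x) on training examples.

def fit_bins(scores, labels, b):
    """Rank training examples by score, split them into b equal-size
    bins, and store each bin's upper-bound score together with the
    fraction of positive examples (label +1) that fell into it."""
    ranked = sorted(zip(scores, labels))  # rank by distance score
    size = len(ranked) // b
    bins = []
    for i in range(b):
        chunk = ranked[i * size:(i + 1) * size] if i < b - 1 else ranked[i * size:]
        upper = chunk[-1][0]              # upper-bound score of this bin
        frac_pos = sum(1 for _, y in chunk if y == 1) / len(chunk)
        bins.append((upper, frac_pos))
    return bins

def predict_proba(bins, score):
    """Place a test score into its bin and return the stored fraction
    of positives as the estimated probability P(y=1|x)."""
    for upper, frac_pos in bins:
        if score <= upper:
            return frac_pos
    return bins[-1][1]                    # score above all bin bounds

# Made-up example: 8 training scores with labels, b = 2 bins.
scores = [-2.0, -1.5, -0.8, -0.2, 0.1, 0.7, 1.3, 2.2]
labels = [-1, -1, -1, 1, -1, 1, 1, 1]
bins = fit_bins(scores, labels, b=2)
print(predict_proba(bins, 1.0))  # → 0.75 (the high-score bin)
```

Note that the quality of the estimates depends on the choice of b: too few bins gives coarse probabilities, too many gives high-variance ones, which is why the talk says b is chosen experimentally.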

In his paper Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods (Advances in Large Margin Classifiers, MIT Press, 1999, 61–74), Platt says that the SVM is trained first, and that the parameters of an additional sigmoid function are then fitted to map the SVM outputs into probabilities.
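The sigmoid fit can be sketched as follows. This is an illustrative Python sketch, not Platt's actual pseudo-code: his algorithm uses a more careful Newton-style optimization with regularized targets, while here plain gradient descent minimizes the negative log-likelihood of `P(y=1|f) = 1 / (1 + exp(A*f + B))`. The decision values and labels are made-up example values standing in for SVM outputs on a held-out set:

```python
import math

def platt_proba(A, B, f):
    """Map an SVM decision value f to P(y=1|f) via Platt's sigmoid."""
    z = min(max(A * f + B, -500.0), 500.0)  # clamp to avoid overflow
    return 1.0 / (1.0 + math.exp(z))

def fit_platt(scores, labels, iters=5000, lr=0.01):
    """Fit sigmoid parameters A, B so that P(y=1|f) = 1/(1+exp(A*f+B))
    matches the labels. Plain gradient descent on the negative
    log-likelihood; Platt's original algorithm is more careful."""
    A, B = 0.0, 0.0
    targets = [1.0 if y == 1 else 0.0 for y in labels]
    for _ in range(iters):
        gA = gB = 0.0
        for f, t in zip(scores, targets):
            p = platt_proba(A, B, f)
            gA += (t - p) * f   # d(NLL)/dA
            gB += (t - p)       # d(NLL)/dB
        A -= lr * gA
        B -= lr * gB
    return A, B

# Made-up example: decision values of a held-out set with true labels.
scores = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [-1, -1, -1, 1, 1, 1]
A, B = fit_platt(scores, labels)
print(platt_proba(A, B, 1.5))   # close to 1 for a clearly positive score
```

Unlike binning, this produces a smooth, monotonic mapping from decision values to probabilities, which is one reason Platt's method became the standard (it is what libsvm's `-b 1` is based on).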

public/svm.txt · Last modified: 2012/06/27 19:35 (external edit)