A C++ library on regularized methods for feature selection and model assessment
LibMsFeatS (Model assessment for Feature Selection) is a platform-independent C++ toolbox that implements elastic net and group LASSO for binary and multi-class classification problems. In the multi-class case, the selected features (or groups of features) are those that best discriminate between all the classes simultaneously. This yields a single description that is valid for all the classes, unlike the classical approach of modeling a multi-class problem as several one-vs-all binary problems, which ends up with many different representations to be computed at run time.
Our toolbox also provides model selection functionality, allowing the user to automatically select the best-performing regularization parameters. The input data file follows the libSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) format, to ease interchange with classification tools.
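For readers unfamiliar with the libSVM format, each line holds a label followed by sparse index:value pairs. The following is a minimal sketch of a reader for one such line. It is not taken from the toolbox sources: the names SparseSample, parseLibsvmLine and toDense are hypothetical, and the 1-based indexing is the usual libSVM convention, assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One sample in libSVM format: a label followed by sparse index:value pairs.
// (Hypothetical type, for illustration only.)
struct SparseSample {
    double label = 0.0;
    std::vector<std::pair<int, double>> features;  // (index as written in the file, value)
};

// Parses one line such as "2 1:0.5 3:-1.2 9:4".
// Returns false if the label or any index:value token is malformed.
bool parseLibsvmLine(const std::string& line, SparseSample& out) {
    std::istringstream in(line);
    SparseSample sample;
    if (!(in >> sample.label)) return false;
    std::string token;
    while (in >> token) {
        const std::size_t colon = token.find(':');
        if (colon == std::string::npos || colon == 0 || colon + 1 == token.size()) return false;
        const std::string idxText = token.substr(0, colon);
        const std::string valText = token.substr(colon + 1);
        try {
            std::size_t usedIdx = 0, usedVal = 0;
            const int index = std::stoi(idxText, &usedIdx);
            const double value = std::stod(valText, &usedVal);
            // Reject tokens with trailing garbage such as "3x:0.5".
            if (usedIdx != idxText.size() || usedVal != valText.size()) return false;
            sample.features.emplace_back(index, value);
        } catch (const std::exception&) {
            return false;  // not a number, or out of range
        }
    }
    out = sample;
    return true;
}

// Expands a sparse sample into a dense vector of length numFeatures.
// Assumption: file indices are 1-based; out-of-range indices are ignored.
std::vector<double> toDense(const SparseSample& s, int numFeatures) {
    assert(numFeatures > 0);
    std::vector<double> dense(static_cast<std::size_t>(numFeatures), 0.0);
    for (const auto& f : s.features) {
        if (f.first >= 1 && f.first <= numFeatures) dense[static_cast<std::size_t>(f.first - 1)] = f.second;
    }
    return dense;
}
```

Features that do not appear on a line are implicitly zero, which is why the dense vector is initialized with zeros before the listed values are written.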
The toolbox commands are designed to be intuitive, so that both experts and non-experts on the topic can use them. Our goal is to provide a feature selection tool that can be configured according to different settings, prior knowledge on the problem and specific requirements when these are available or, in their absence, that makes it easy to compare different solutions and select the most appropriate one for a given task.
The toolbox source files can be downloaded from here.
The Boost (http://www.boost.org) and Eigen 3 (http://eigen.tuxfamily.org) libraries must be installed beforehand. The Boost libraries are assumed to be available at the default system locations. If this is not the case, specify their locations by setting the corresponding path variable.
Compilation can be performed with cmake using the CMakeLists.txt file provided with the code. Run cmake as follows
and set the Boost and Eigen 3 path variables if needed. Then simply run
After compilation, two main executables will be available:
The toolbox also provides two utility functions, ascii2binary and binary2ascii, to convert input files from one format to the other.
How to use the toolbox
We provide here some examples of how to use the toolbox. We focus on a multi-class problem and use a dataset, glass.centered, downloaded from http://www.ics.uci.edu/~mlearn/MLRepository.html and provided with the toolbox. The dataset includes 7 classes, and each descriptor has 9 features.
To obtain a sparser representation, simply increase the value of tau. Examples:
./MCgrouplasso --train-set glass.centered --tau 1e-3      # selects all variables
./MCgrouplasso --train-set glass.centered --tau 1e-2      # selects 6 variables (indexes 1 2 3 4 5 7)
./MCgrouplasso --train-set glass.centered --tau 2*1e-2    # selects 4 variables (indexes 1 2 3 7)
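Why a larger tau leads to fewer selected variables can be illustrated with row-wise (group) soft-thresholding, a step commonly used in proximal methods for L1/L2 penalties. The sketch below is an illustration under assumptions, not the toolbox implementation: the weight matrix layout (one row per feature, one column per class) and the names groupSoftThreshold and selectedRows are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed layout: rows = features, columns = classes.
using Matrix = std::vector<std::vector<double>>;

// Row-wise soft-thresholding: each row w is scaled by max(0, 1 - tau / ||w||_2).
// A row whose Euclidean norm is <= tau becomes exactly zero, so the
// corresponding feature is discarded for all classes at once.
Matrix groupSoftThreshold(const Matrix& W, double tau) {
    assert(tau >= 0.0);
    Matrix out = W;
    for (auto& row : out) {
        double sq = 0.0;
        for (double v : row) sq += v * v;
        const double norm = std::sqrt(sq);
        const double scale = (norm > tau) ? 1.0 - tau / norm : 0.0;
        for (double& v : row) v *= scale;
    }
    return out;
}

// Zero-based indices of the rows that are not entirely zero.
std::vector<std::size_t> selectedRows(const Matrix& W) {
    std::vector<std::size_t> idx;
    for (std::size_t i = 0; i < W.size(); ++i) {
        bool nonzero = false;
        for (double v : W[i]) {
            if (v != 0.0) { nonzero = true; break; }
        }
        if (nonzero) idx.push_back(i);
    }
    return idx;
}
```

With a small tau only rows that are already close to zero are removed, while a larger tau zeroes out progressively more rows, which matches the trend of the examples above.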
Let us now suppose we want to select an appropriate value for tau in the range [1e-10 : 1e-3] using 10-fold cross-validation with multi-threading. Then simply run
./MCgrouplasso --train-set glass.centered --tau 1e-10:10:1e-3 --mt --folds 10
An example of the stdout output (the text in square brackets is explanatory annotation):
No group file specified. Using MCL1L2 [specific feature selection algorithm adopted depending on the options (in this case Multi-Class L1L2 since no groups file was specified)]
Parameters selected: tau 2.78256e-05 mu 0
Selected variables: 0 1 2 3 4 5 6 7 8 [single feature indexes, starting from zero]
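The selected value tau 2.78256e-05 coincides with the eighth of ten logarithmically spaced points between 1e-10 and 1e-3, which suggests that the range 1e-10:10:1e-3 is read as min:count:max and expanded on a log scale. This reading is an inference from the output above, not something confirmed by the toolbox documentation. Under that assumption, the candidate grid could be sketched as follows (logGrid is a hypothetical helper name):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Returns `count` values evenly spaced on a log10 scale from lo to hi,
// both endpoints included. Assumes 0 < lo < hi and count >= 2.
std::vector<double> logGrid(double lo, double hi, int count) {
    assert(lo > 0.0 && hi > lo && count >= 2);
    std::vector<double> grid;
    grid.reserve(static_cast<std::size_t>(count));
    const double logLo = std::log10(lo);
    const double step = (std::log10(hi) - logLo) / (count - 1);
    for (int k = 0; k < count; ++k) {
        grid.push_back(std::pow(10.0, logLo + k * step));
    }
    return grid;
}
```

Calling logGrid(1e-10, 1e-3, 10) gives ten candidates whose exponents grow in steps of 7/9. The candidate at zero-based position 7 is 10^(-10 + 49/9), approximately 2.78256e-05.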
Copyright 2014 by Luca Zini, Nicoletta Noceti and Francesca Odone,
Department of Informatics, Bioengineering, Robotics, and Systems Engineering
University of Genova.
This program is free software: you can redistribute it and/or modify it under the terms of the CC-BY Public Licence, Version 4.0, 25 November 2013. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY, including but not limited to the warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the CC-BY Public License copy provided with this software for more details.
If you use libMsFeatS in your research, please cite
Structured multi-class feature selection with an application to face recognition, by L. Zini, N. Noceti, G. Fusco, F. Odone
Pattern Recognition Letters, 2014.
For more information: