OpenMS
Simple interface to support vector machines for classification and regression (via LIBSVM).
#include <OpenMS/ML/SVM/SimpleSVM.h>
Classes
struct | Prediction |
SVM/SVR prediction result.
Public Types
typedef std::map< String, std::vector< double > > | PredictorMap |
Mapping from predictor name to vector of predictor values.
typedef std::map< String, std::pair< double, double > > | ScaleMap |
Mapping from predictor name to predictor min and max.
Public Member Functions
SimpleSVM ()
Default constructor.
~SimpleSVM () override
Destructor.
void | setup (PredictorMap &predictors, const std::map< Size, double > &outcomes, bool classification=true) |
Load data and train a model.
void | predict (std::vector< Prediction > &predictions, std::vector< Size > indexes=std::vector< Size >()) const |
Predict class labels or regression values (and probabilities).
void | predict (PredictorMap &predictors, std::vector< Prediction > &predictions) const |
Predict class labels or regression values (and probabilities).
void | getFeatureWeights (std::map< String, double > &feature_weights) const |
Get the weights used for features (predictors) in the SVM model.
void | writeXvalResults (const String &path) const |
Write cross-validation (parameter optimization) results to a CSV file.
const ScaleMap & | getScaling () const |
Get data range of predictors before scaling to [0, 1].
Public Member Functions inherited from DefaultParamHandler
DefaultParamHandler (const String &name)
Constructor with name that is displayed in error messages.
DefaultParamHandler (const DefaultParamHandler &rhs)
Copy constructor.
virtual | ~DefaultParamHandler () |
Destructor.
DefaultParamHandler & | operator= (const DefaultParamHandler &rhs) |
Assignment operator.
virtual bool | operator== (const DefaultParamHandler &rhs) const |
Equality operator.
void | setParameters (const Param &param) |
Sets the parameters.
const Param & | getParameters () const |
Non-mutable access to the parameters.
const Param & | getDefaults () const |
Non-mutable access to the default parameters.
const String & | getName () const |
Non-mutable access to the name.
void | setName (const String &name) |
Mutable access to the name.
const std::vector< String > & | getSubsections () const |
Non-mutable access to the registered subsections.
Protected Types
typedef std::vector< std::vector< std::vector< double > > > | SVMPerformance |
Classification (or regression) performance for different parameter combinations (C/gamma/p).
Protected Member Functions
void | clear_ () |
void | scaleData_ (PredictorMap &predictors) |
Scale predictor values to range 0-1.
void | convertData_ (const PredictorMap &predictors) |
Convert predictors to LIBSVM format.
std::tuple< double, double, double > | chooseBestParameters_ (bool higher_better) const |
Choose best SVM parameters based on cross-validation results.
void | optimizeParameters_ (bool classification) |
Run cross-validation to optimize SVM parameters.
Protected Member Functions inherited from DefaultParamHandler
virtual void | updateMembers_ () |
This method is used to update extra member variables at the end of the setParameters() method.
void | defaultsToParam_ () |
Updates the parameters after the defaults have been set in the constructor.
Static Protected Member Functions
static void | printNull_ (const char *) |
Dummy function to suppress LIBSVM output.
Protected Attributes
std::vector< std::vector< struct svm_node > > | nodes_ |
Values of predictors (LIBSVM format)
struct svm_problem | data_ |
SVM training data (LIBSVM format)
struct svm_parameter | svm_params_ |
SVM parameters (LIBSVM format)
struct svm_model * | model_ |
Pointer to SVM model (LIBSVM format)
std::vector< String > | predictor_names_ |
Names of predictors in the model (excluding uninformative ones)
Size | n_parts_ |
Number of partitions for cross-validation.
std::vector< double > | log2_C_ |
Parameter values to try during optimization.
std::vector< double > | log2_gamma_ |
std::vector< double > | log2_p_ |
ScaleMap | scaling_ |
Mapping from predictor name to predictor min and max.
SVMPerformance | performance_ |
Cross-validation results.
Protected Attributes inherited from DefaultParamHandler
Param | param_ |
Container for current parameters.
Param | defaults_ |
Container for default parameters. This member should be filled in the constructor of derived classes!
std::vector< String > | subsections_ |
Container for registered subsections. This member should be filled in the constructor of derived classes!
String | error_name_ |
Name that is displayed in error messages during the parameter checking.
bool | check_defaults_ |
If this member is set to false, no checking of parameters is done.
bool | warn_empty_defaults_ |
If this member is set to false, no warning is emitted when defaults are empty.
Additional Inherited Members
Static Public Member Functions inherited from DefaultParamHandler
static void | writeParametersToMetaValues (const Param &write_this, MetaInfoInterface &write_here, const String &key_prefix="") |
Writes all parameters to meta values.
Simple interface to support vector machines for classification and regression (via LIBSVM).
This class supports classification (SVM) and regression (SVR), using either a linear or an RBF kernel.
It uses cross-validation to optimize the respective SVM/SVR parameters C, p (SVR only) and gamma (RBF kernel only).
Usage: SVM models are generated by the setup() method. SimpleSVM supports two common use cases for convenience:
Passing all data (training and test) as predictors to setup() and training on a subset.
Passing only the training data as predictors to setup().
In both cases, the parameter outcomes of setup() defines the training set; it contains the indexes of observations (corresponding to positions in the vectors in predictors) together with the class labels (or regression values) for training.
Given N observations of M predictors, the data are coded as a map of predictors (size M), each a numeric vector of values for different observations (size N).
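The data layout described above can be sketched with standard-library types only. This is a minimal illustration, not OpenMS code: std::string stands in for OpenMS::String, the predictor names mass_error and rt_shift are made up, and the helpers observationCount and outcomesAreValid are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Stand-ins for the documented typedefs (std::string replaces OpenMS::String).
using PredictorMap = std::map<std::string, std::vector<double>>;
using OutcomeMap = std::map<std::size_t, double>;  // observation index -> label

// Returns the common number of observations N, or 0 if the map is empty
// or the predictor vectors differ in length.
std::size_t observationCount(const PredictorMap& predictors)
{
  if (predictors.empty()) return 0;
  const std::size_t n = predictors.begin()->second.size();
  for (const auto& entry : predictors)
  {
    if (entry.second.size() != n) return 0;
  }
  return n;
}

// True if every training index refers to an existing observation.
bool outcomesAreValid(const OutcomeMap& outcomes, std::size_t n_observations)
{
  for (const auto& entry : outcomes)
  {
    if (entry.first >= n_observations) return false;
  }
  return true;
}

// Example data: M = 2 predictors, N = 4 observations.
PredictorMap examplePredictors()
{
  PredictorMap predictors;
  predictors["mass_error"] = {0.1, 0.4, 2.5, 3.0};
  predictors["rt_shift"] = {1.0, 1.2, 7.5, 8.1};
  return predictors;
}

// Observations 0 and 3 form the training set (class labels 0 and 1);
// observations 1 and 2 are left for prediction only.
OutcomeMap exampleOutcomes()
{
  OutcomeMap outcomes;
  outcomes[0] = 0.0;
  outcomes[3] = 1.0;
  return outcomes;
}
```

In this layout the map has M entries and each vector has N elements, so an index in outcomes selects the same position in every predictor vector.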
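To predict class labels (or regression values) based on a model, use one of the predict() methods:
The parameter indexes of predict() takes a vector of indexes corresponding to the observations for which predictions should be made. (With an empty vector, the default, predictions are made for all observations, including those used for training.)
The overload of predict() that takes predictors makes predictions for the observations passed in; their values are changed by the scaling that was applied to the training data in setup().
Name | Type | Default | Restrictions | Description |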
---|---|---|---|---|
kernel | string | RBF | RBF, linear | SVM kernel |
xval | int | 5 | min: 1 | Number of partitions for cross-validation (parameter optimization) |
log2_C | float list | [-5.0, -3.0, -1.0, 1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0, 15.0] | | Values to try for the SVM parameter 'C' during parameter optimization. A value 'x' is used as 'C = 2^x'. |
log2_gamma | float list | [-15.0, -13.0, -11.0, -9.0, -7.0, -5.0, -3.0, -1.0, 1.0, 3.0] | | Values to try for the SVM parameter 'gamma' during parameter optimization (RBF kernel only). A value 'x' is used as 'gamma = 2^x'. |
log2_p | float list | [-15.0, -12.0, -9.0, -6.0, -3.32192809489, 0.0, 3.32192809489, 6.0, 9.0, 12.0, 15.0] | | Values to try for the SVM parameter 'epsilon' during parameter optimization (epsilon-SVR only). A value 'x' is used as 'epsilon = 2^x'. |
epsilon | float | 1.0e-03 | min: 0.0 | Stopping criterion |
cache_size | float | 100.0 | min: 1.0 | Size of the kernel cache (in MB) |
no_shrinking | string | false | true, false | Disable the shrinking heuristics |
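The log2_C, log2_gamma and log2_p lists hold exponents, not the parameter values themselves. The following sketch shows the conversion stated in the table (a value x is used as 2^x); the helper name expandLog2Grid is hypothetical and not part of OpenMS.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Convert log2-scale grid entries x into the actual parameter values 2^x.
std::vector<double> expandLog2Grid(const std::vector<double>& log2_values)
{
  std::vector<double> values;
  values.reserve(log2_values.size());
  for (double x : log2_values)
  {
    values.push_back(std::exp2(x));
  }
  return values;
}
```

For example, the log2_C entry 3.0 corresponds to C = 8, and the log2_p entries -3.32192809489 and 3.32192809489 correspond to epsilon values of approximately 0.1 and 10.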
struct OpenMS::SimpleSVM::Prediction |
SVM/SVR prediction result.
Class Members | ||
---|---|---|
double | outcome | Predicted class label (or regression value) |
map< double, double > | probabilities | Class label (or regression value) and their predicted probabilities. |
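As a small illustration of how the two members relate, the sketch below picks the label with the highest probability from a prediction. It uses a local copy of the struct so that it compiles without OpenMS; the helper mostProbableLabel is hypothetical, and falling back to outcome when no probabilities are present is an assumption made for this example.

```cpp
#include <cassert>
#include <map>

// Local mirror of the documented struct (not the OpenMS definition itself).
struct Prediction
{
  double outcome;
  std::map<double, double> probabilities;  // label -> predicted probability
};

// Returns the label with the highest predicted probability, or the plain
// outcome if no probabilities are available.
double mostProbableLabel(const Prediction& prediction)
{
  double best_label = prediction.outcome;
  double best_probability = -1.0;
  for (const auto& entry : prediction.probabilities)
  {
    if (entry.second > best_probability)
    {
      best_label = entry.first;
      best_probability = entry.second;
    }
  }
  return best_label;
}
```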
typedef std::map<String, std::vector<double> > PredictorMap |
Mapping from predictor name to vector of predictor values.
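typedef std::map<String, std::pair<double, double> > ScaleMap |
Mapping from predictor name to predictor min and max.
typedef std::vector<std::vector<std::vector<double> > > SVMPerformance |
protected
Classification (or regression) performance for different parameter combinations (C/gamma/p).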
SimpleSVM | ( | ) |
Default constructor.
~SimpleSVM | ( | ) |
override |
Destructor.
std::tuple< double, double, double > chooseBestParameters_ | ( | bool | higher_better | ) | const |
protected |
Choose best SVM parameters based on cross-validation results.
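A selection step of this kind can be sketched as a search over a three-dimensional grid of cross-validation scores. This is not the OpenMS implementation: the function bestGridIndexes is hypothetical, it returns grid indexes instead of parameter values, and keeping the first entry on ties is an arbitrary choice made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Same nesting as the documented SVMPerformance typedef.
using Performance = std::vector<std::vector<std::vector<double>>>;

// Returns the grid indexes (i, j, k) of the best entry performance[i][j][k].
// higher_better selects between maximizing (e.g. accuracy) and minimizing
// (e.g. an error measure). Returns (0, 0, 0) for an empty grid.
std::tuple<std::size_t, std::size_t, std::size_t>
bestGridIndexes(const Performance& performance, bool higher_better)
{
  std::size_t best_i = 0, best_j = 0, best_k = 0;
  double best_value = 0.0;
  bool found = false;
  for (std::size_t i = 0; i < performance.size(); ++i)
  {
    for (std::size_t j = 0; j < performance[i].size(); ++j)
    {
      for (std::size_t k = 0; k < performance[i][j].size(); ++k)
      {
        const double value = performance[i][j][k];
        const bool better =
          higher_better ? (value > best_value) : (value < best_value);
        if (!found || better)
        {
          best_i = i;
          best_j = j;
          best_k = k;
          best_value = value;
          found = true;
        }
      }
    }
  }
  return std::make_tuple(best_i, best_j, best_k);
}
```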
void clear_ | ( | ) |
protected
void convertData_ | ( | const PredictorMap & | predictors | ) |
protected |
Convert predictors to LIBSVM format.
void getFeatureWeights | ( | std::map< String, double > & | feature_weights | ) | const |
Get the weights used for features (predictors) in the SVM model.
Currently only supported for two-class classification. If a linear kernel is used, the weights are informative for ranking features.
Exception::Precondition | if no model has been trained, or if the classification involves more than two classes |
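For background on why linear-kernel weights are informative: with a linear kernel, the decision function of a two-class SVM can be written as a weighted sum of the predictors, where the weight vector is the sum of the support vectors multiplied by their signed coefficients. The sketch below computes such a weight vector from dense inputs. It is an illustration under these assumptions, not the code used by getFeatureWeights(); the function linearWeights and its dense input format are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes w[j] = sum over support vectors i of
//   coefficients[i] * support_vectors[i][j].
// The number of weights is taken from the first support vector.
std::vector<double> linearWeights(
  const std::vector<std::vector<double>>& support_vectors,
  const std::vector<double>& coefficients)
{
  assert(support_vectors.size() == coefficients.size());
  std::vector<double> weights;
  if (support_vectors.empty()) return weights;
  weights.assign(support_vectors.front().size(), 0.0);
  for (std::size_t i = 0; i < support_vectors.size(); ++i)
  {
    const std::vector<double>& sv = support_vectors[i];
    for (std::size_t j = 0; j < weights.size() && j < sv.size(); ++j)
    {
      weights[j] += coefficients[i] * sv[j];
    }
  }
  return weights;
}
```

A predictor with a weight of large magnitude has a strong influence on the decision value, which is what makes the weights usable for ranking features.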
const ScaleMap& getScaling | ( | ) | const |
Get data range of predictors before scaling to [0, 1].
void optimizeParameters_ | ( | bool | classification | ) |
protected |
Run cross-validation to optimize SVM parameters.
void predict | ( | PredictorMap & | predictors, |
std::vector< Prediction > & | predictions | ||
) | const |
Predict class labels or regression values (and probabilities).
predictors | Mapping from predictor name to vector of predictor values (for different observations). All vectors should have the same length; values will be changed by scaling applied to training data in setup. |
predictions | Output vector of prediction results (same order as indexes). |
Exception::Precondition | if no model has been trained |
Exception::InvalidValue | if an invalid index is used in indexes |
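The note that predictor values are changed by the scaling applied to the training data can be illustrated as follows: new observations are rescaled with the (min, max) ranges stored during training, not with their own ranges. This is a sketch with standard-library stand-ins for the OpenMS typedefs; the function applyScaling is hypothetical, and leaving predictors without a stored range untouched is an assumption of this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using PredictorMap = std::map<std::string, std::vector<double>>;
using ScaleMap = std::map<std::string, std::pair<double, double>>;  // name -> (min, max)

// Rescales each predictor in place using the stored training range.
// Values outside the training range end up outside [0, 1].
void applyScaling(PredictorMap& predictors, const ScaleMap& scaling)
{
  for (auto& entry : predictors)
  {
    const auto range = scaling.find(entry.first);
    if (range == scaling.end()) continue;  // no stored range: left unchanged
    const double min = range->second.first;
    const double width = range->second.second - min;
    if (width == 0.0) continue;  // avoid division by zero
    for (double& value : entry.second)
    {
      value = (value - min) / width;
    }
  }
}
```

Because the caller's map is modified in place, a copy is needed if the unscaled values are still required afterwards.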
void predict | ( | std::vector< Prediction > & | predictions, |
std::vector< Size > | indexes = std::vector< Size >() |
) | const |
Predict class labels or regression values (and probabilities).
predictions | Output vector of prediction results (same order as indexes). |
indexes | Vector of observation indexes for which predictions are desired. If empty (default), predictions are made for all observations. |
Exception::Precondition | if no model has been trained |
Exception::InvalidValue | if an invalid index is used in indexes |
static void printNull_ | ( | const char * | ) |
inline static protected
Dummy function to suppress LIBSVM output.
void scaleData_ | ( | PredictorMap & | predictors | ) |
protected |
Scale predictor values to range 0-1.
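Scaling to the range 0-1 is commonly done by min-max normalization, sketched below together with the recording of each predictor's original range (the information that getScaling() exposes). This is not the OpenMS implementation: the function scaleToUnitRange is hypothetical, and dropping predictors whose values are all identical is an assumption about what "uninformative" means elsewhere on this page.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using PredictorMap = std::map<std::string, std::vector<double>>;
using ScaleMap = std::map<std::string, std::pair<double, double>>;  // name -> (min, max)

// Scales every predictor to [0, 1] in place and returns the original ranges.
// Empty or constant predictors are removed from the map (assumption).
ScaleMap scaleToUnitRange(PredictorMap& predictors)
{
  ScaleMap scaling;
  for (auto it = predictors.begin(); it != predictors.end();)
  {
    std::vector<double>& values = it->second;
    if (values.empty())
    {
      it = predictors.erase(it);
      continue;
    }
    const auto min_max = std::minmax_element(values.begin(), values.end());
    const double min = *min_max.first;
    const double max = *min_max.second;
    if (min == max)
    {
      it = predictors.erase(it);  // constant predictor carries no information
      continue;
    }
    for (double& value : values)
    {
      value = (value - min) / (max - min);
    }
    scaling[it->first] = std::make_pair(min, max);
    ++it;
  }
  return scaling;
}
```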
void setup | ( | PredictorMap & | predictors, |
const std::map< Size, double > & | outcomes, | ||
bool | classification = true |
) |
Load data and train a model.
predictors | Mapping from predictor name to vector of predictor values (for different observations). All vectors should have the same length; values will be changed by scaling. |
outcomes | Mapping from observation index to class label or regression value in the training set. |
classification | true (default) if SVM classification should be used, SVR otherwise |
Exception::IllegalArgument | if predictors is empty |
Exception::InvalidValue | if an invalid index is used in outcomes |
Exception::MissingInformation | if there are fewer than two class labels in outcomes, or if there are not enough observations for cross-validation |
void writeXvalResults | ( | const String & | path | ) | const |
Write cross-validation (parameter optimization) results to a CSV file.
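struct svm_problem data_ |
protected
SVM training data (LIBSVM format)
std::vector<double> log2_C_ |
protected
Parameter values to try during optimization.
std::vector<double> log2_gamma_ |
protected
std::vector<double> log2_p_ |
protected
struct svm_model* model_ |
protected
Pointer to SVM model (LIBSVM format)
Size n_parts_ |
protected
Number of partitions for cross-validation.
std::vector<std::vector<struct svm_node> > nodes_ |
protected
Values of predictors (LIBSVM format)
SVMPerformance performance_ |
protected
Cross-validation results.
std::vector<String> predictor_names_ |
protected
Names of predictors in the model (excluding uninformative ones)
ScaleMap scaling_ |
protected
Mapping from predictor name to predictor min and max.
struct svm_parameter svm_params_ |
protected
SVM parameters (LIBSVM format)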