The class implements the random forest predictor.
#include <opencv2/ml.hpp>
Public Types | |
enum | Flags { PREDICT_AUTO =0, PREDICT_SUM =(1<<8), PREDICT_MAX_VOTE =(2<<8), PREDICT_MASK =(3<<8) } |
Predict options. | |
Public Member Functions | |
virtual float | calcError (const Ptr< TrainData > &data, bool test, OutputArray resp) const |
Computes error on the training or test dataset. | |
virtual void | clear () |
Clears the algorithm state. | |
virtual bool | empty () const CV_OVERRIDE |
Returns true if the Algorithm is empty (e.g. in the very beginning or after unsuccessful read). | |
virtual int | getActiveVarCount () const =0 |
The size of the randomly selected subset of features used at each tree node to find the best split(s). | |
virtual bool | getCalculateVarImportance () const =0 |
If true then variable importance will be calculated and then it can be retrieved by RTrees::getVarImportance. | |
virtual int | getCVFolds () const =0 |
If CVFolds > 1, the algorithm prunes the built decision tree using a K-fold cross-validation procedure, where K is equal to CVFolds. | |
virtual String | getDefaultName () const |
Returns the algorithm string identifier. | |
virtual int | getMaxCategories () const =0 |
Cluster possible values of a categorical variable into K<=maxCategories clusters to find a suboptimal split. | |
virtual int | getMaxDepth () const =0 |
The maximum possible depth of the tree. | |
virtual int | getMinSampleCount () const =0 |
If the number of samples in a node is less than this parameter then the node will not be split. | |
virtual const std::vector< Node > & | getNodes () const =0 |
Returns all the nodes. | |
virtual cv::Mat | getPriors () const =0 |
The array of a priori class probabilities, sorted by the class label value. | |
virtual float | getRegressionAccuracy () const =0 |
Termination criteria for regression trees. | |
virtual const std::vector< int > & | getRoots () const =0 |
Returns indices of root nodes. | |
virtual const std::vector< Split > & | getSplits () const =0 |
Returns all the splits. | |
virtual const std::vector< int > & | getSubsets () const =0 |
Returns all the bitsets for categorical splits. | |
virtual TermCriteria | getTermCriteria () const =0 |
The termination criteria that specify when the training algorithm stops. | |
virtual bool | getTruncatePrunedTree () const =0 |
If true then pruned branches are physically removed from the tree. | |
virtual bool | getUse1SERule () const =0 |
If true, pruning will be harsher. | |
virtual bool | getUseSurrogates () const =0 |
If true then surrogate splits will be built. | |
virtual int | getVarCount () const =0 |
Returns the number of variables in training samples. | |
virtual Mat | getVarImportance () const =0 |
Returns the variable importance array. | |
virtual void | getVotes (InputArray samples, OutputArray results, int flags) const =0 |
Returns the result of each individual tree in the forest. | |
virtual bool | isClassifier () const =0 |
Returns true if the model is a classifier. | |
virtual bool | isTrained () const =0 |
Returns true if the model is trained. | |
virtual float | predict (InputArray samples, OutputArray results=noArray(), int flags=0) const =0 |
Predicts response(s) for the provided sample(s). | |
virtual void | read (const FileNode &fn) |
Reads algorithm parameters from a file storage. | |
virtual void | save (const String &filename) const |
Saves the algorithm to a file. | |
virtual void | setActiveVarCount (int val)=0 |
The size of the randomly selected subset of features used at each tree node to find the best split(s). | |
virtual void | setCalculateVarImportance (bool val)=0 |
If true then variable importance will be calculated and then it can be retrieved by RTrees::getVarImportance. | |
virtual void | setCVFolds (int val)=0 |
If CVFolds > 1, the algorithm prunes the built decision tree using a K-fold cross-validation procedure, where K is equal to CVFolds. | |
virtual void | setMaxCategories (int val)=0 |
Cluster possible values of a categorical variable into K<=maxCategories clusters to find a suboptimal split. | |
virtual void | setMaxDepth (int val)=0 |
The maximum possible depth of the tree. | |
virtual void | setMinSampleCount (int val)=0 |
If the number of samples in a node is less than this parameter then the node will not be split. | |
virtual void | setPriors (const cv::Mat &val)=0 |
The array of a priori class probabilities, sorted by the class label value. | |
virtual void | setRegressionAccuracy (float val)=0 |
Termination criteria for regression trees. | |
virtual void | setTermCriteria (const TermCriteria &val)=0 |
The termination criteria that specify when the training algorithm stops. | |
virtual void | setTruncatePrunedTree (bool val)=0 |
If true then pruned branches are physically removed from the tree. | |
virtual void | setUse1SERule (bool val)=0 |
If true, pruning will be harsher. | |
virtual void | setUseSurrogates (bool val)=0 |
If true then surrogate splits will be built. | |
virtual bool | train (const Ptr< TrainData > &trainData, int flags=0) |
Trains the statistical model. | |
virtual bool | train (InputArray samples, int layout, InputArray responses) |
Trains the statistical model. | |
virtual void | write (FileStorage &fs) const |
Stores algorithm parameters in a file storage. | |
void | write (const Ptr< FileStorage > &fs, const String &name=String()) const |
Simplified API for language bindings. This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
Static Public Member Functions | |
static Ptr< RTrees > | create () |
Creates the empty model. | |
static Ptr< RTrees > | load (const String &filepath, const String &nodeName=String()) |
Loads and creates a serialized RTrees model from a file. | |
template<typename _Tp > | |
static Ptr< _Tp > | loadFromString (const String &strModel, const String &objname=String()) |
Loads algorithm from a String. | |
template<typename _Tp > | |
static Ptr< _Tp > | read (const FileNode &fn) |
Reads algorithm from the file node. | |
template<typename _Tp > | |
static Ptr< _Tp > | train (const Ptr< TrainData > &data, int flags=0) |
Creates and trains a model with default parameters. | |
Protected Member Functions | |
void | writeFormat (FileStorage &fs) const |
The class implements the random forest predictor.
enum Flags | inherited |
Predict options.
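The values of the Flags enum in the Public Types summary suggest that PREDICT_MASK covers the bits holding the predict mode. The following is a minimal sketch of that reading, not OpenCV's implementation: the enum is redeclared locally with the values listed in the summary, and `predictModeName` is a hypothetical helper introduced only for illustration.

```cpp
#include <string>

// Values copied from the Flags enum listed in the Public Types summary.
enum Flags {
    PREDICT_AUTO     = 0,
    PREDICT_SUM      = (1 << 8),
    PREDICT_MAX_VOTE = (2 << 8),
    PREDICT_MASK     = (3 << 8)
};

// Hypothetical helper: names the predict mode encoded in a flags word.
// Assumption: bits outside PREDICT_MASK carry other options and are ignored.
std::string predictModeName(int flags) {
    switch (flags & PREDICT_MASK) {
        case PREDICT_AUTO:     return "auto";
        case PREDICT_SUM:      return "sum";
        case PREDICT_MAX_VOTE: return "max_vote";
        default:               return "invalid";  // both mode bits set
    }
}
```

For example, `predictModeName(PREDICT_SUM | 1)` yields "sum" under this assumption, because the low bit lies outside the mask.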
virtual float calcError (const Ptr< TrainData > &data, bool test, OutputArray resp) const | virtual, inherited |
Computes error on the training or test dataset.
data | the training data |
test | if true, the error is computed over the test subset of the data; otherwise it is computed over the training subset. Note that if you loaded a completely different dataset to evaluate an already trained classifier, you will probably want to leave the test subset unset (that is, not call TrainData::setTrainTestSplitRatio) and specify test=false, so that the error is computed over the whole new set. |
resp | the optional output responses. |
The method uses StatModel::predict to compute the error. For regression models the error is computed as RMS; for classifiers, as the percentage of misclassified samples (0%-100%).
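The two error measures named above can be sketched in plain C++ as follows. This is an illustration of the formulas only, assuming predictions and ground truth are already available as vectors; `rmsError` and `misclassifiedPercent` are hypothetical names, not part of the library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Root-mean-square error, the measure described for regression models.
double rmsError(const std::vector<double>& predicted,
                const std::vector<double>& actual) {
    assert(predicted.size() == actual.size() && !actual.empty());
    double sumSq = 0.0;
    for (std::size_t i = 0; i < actual.size(); ++i) {
        const double d = predicted[i] - actual[i];
        sumSq += d * d;
    }
    return std::sqrt(sumSq / static_cast<double>(actual.size()));
}

// Percentage (0-100) of misclassified samples, the measure described for classifiers.
double misclassifiedPercent(const std::vector<int>& predicted,
                            const std::vector<int>& actual) {
    assert(predicted.size() == actual.size() && !actual.empty());
    std::size_t wrong = 0;
    for (std::size_t i = 0; i < actual.size(); ++i) {
        if (predicted[i] != actual[i]) ++wrong;
    }
    return 100.0 * static_cast<double>(wrong) / static_cast<double>(actual.size());
}
```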
virtual void clear () | inline, virtual, inherited |
Clears the algorithm state.
Reimplemented in cv::FlannBasedMatcher, and cv::DescriptorMatcher.
static Ptr< RTrees > create () | static |
Creates the empty model.
Use StatModel::train to train the model, the static StatModel::train template to create and train a model in one step, or Algorithm::load to load a pre-trained model.
virtual bool empty () const CV_OVERRIDE | virtual, inherited |
Returns true if the Algorithm is empty (e.g. in the very beginning or after unsuccessful read).
Reimplemented from cv::Algorithm.
virtual int getActiveVarCount () const =0 | pure virtual |
The size of the randomly selected subset of features used at each tree node to find the best split(s).
If you set it to 0 then the size will be set to the square root of the total number of features. Default value is 0.
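The zero default can be read as "derive the subset size from the feature count". A minimal sketch of that rule follows; `effectiveActiveVarCount` is a hypothetical helper, and the rounding to nearest, the lower bound of one feature, and the clamping to the total count are assumptions of this sketch rather than documented behavior.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: subset size actually used for a given setting.
// Assumptions: sqrt is rounded to nearest, at least one feature is kept,
// and an explicit setting is clamped to the total number of features.
int effectiveActiveVarCount(int activeVarCount, int totalVarCount) {
    assert(activeVarCount >= 0 && totalVarCount > 0);
    if (activeVarCount == 0) {
        const int root = static_cast<int>(
            std::lround(std::sqrt(static_cast<double>(totalVarCount))));
        return root < 1 ? 1 : root;
    }
    return activeVarCount < totalVarCount ? activeVarCount : totalVarCount;
}
```

With 16 features and the default setting of 0, this sketch selects 4 features per node.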
virtual bool getCalculateVarImportance () const =0 | pure virtual |
If true then variable importance will be calculated and then it can be retrieved by RTrees::getVarImportance.
Default value is false.
virtual int getCVFolds () const =0 | pure virtual, inherited |
If CVFolds > 1, the algorithm prunes the built decision tree using a K-fold cross-validation procedure, where K is equal to CVFolds.
Default value is 10.
virtual String getDefaultName () const | virtual, inherited |
Returns the algorithm string identifier.
This string is used as top level xml/yml node tag when the object is saved to a file or string.
Reimplemented in cv::AKAZE, cv::KAZE, cv::SimpleBlobDetector, cv::GFTTDetector, cv::AgastFeatureDetector, cv::FastFeatureDetector, cv::MSER, cv::ORB, cv::BRISK, and cv::Feature2D.
virtual int getMaxCategories () const =0 | pure virtual, inherited |
Cluster possible values of a categorical variable into K<=maxCategories clusters to find a suboptimal split.
If a discrete variable, on which the training procedure tries to make a split, takes more than maxCategories values, the precise best subset estimation may take a very long time because the algorithm is exponential. Instead, many decision tree engines (including our implementation) try to find a sub-optimal split in this case by clustering all the samples into maxCategories clusters; that is, some categories are merged together. The clustering is applied only in n > 2-class classification problems for categorical variables with N > maxCategories possible values. In the case of regression and 2-class classification the optimal split can be found efficiently without employing clustering, so the parameter is not used in these cases. Default value is 10.
virtual int getMaxDepth () const =0 | pure virtual, inherited |
The maximum possible depth of the tree.
That is, the training algorithm attempts to split a node while its depth is less than maxDepth. The root node has zero depth. The actual depth may be smaller if the other termination criteria are met (see the outline of the training procedure), and/or if the tree is pruned. Default value is INT_MAX.
virtual int getMinSampleCount () const =0 | pure virtual, inherited |
If the number of samples in a node is less than this parameter then the node will not be split.
Default value is 10.
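Taken together, the maxDepth and minSampleCount descriptions give two necessary conditions for attempting a split. The sketch below encodes just those two conditions; `maySplit` is a hypothetical name, and the real training procedure applies further criteria that are not modeled here.

```cpp
#include <cassert>

// Hypothetical helper: true if a node passes the depth and sample-count checks.
// A node is split only while depth < maxDepth (the root has depth 0), and it is
// not split when it holds fewer than minSampleCount samples.
bool maySplit(int depth, int sampleCount, int maxDepth, int minSampleCount) {
    assert(depth >= 0 && sampleCount >= 0);
    return depth < maxDepth && sampleCount >= minSampleCount;
}
```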
virtual const std::vector< Node > & getNodes () const =0 | pure virtual, inherited |
Returns all the nodes.
All node indices are indices into the returned vector.
virtual cv::Mat getPriors () const =0 | pure virtual, inherited |
The array of a priori class probabilities, sorted by the class label value.
The parameter can be used to tune the decision tree preferences toward a certain class. For example, if you want to detect some rare anomaly occurrence, the training base will likely contain many more normal cases than anomalies, so a very good classification performance will be achieved just by considering every case as normal. To avoid this, the priors can be specified, where the anomaly probability is artificially increased (up to 0.5 or even greater), so the weight of the misclassified anomalies becomes much bigger, and the tree is adjusted properly.
You can also think about this parameter as weights of prediction categories which determine relative weights that you give to misclassification. That is, if the weight of the first category is 1 and the weight of the second category is 10, then each mistake in predicting the second category is equivalent to making 10 mistakes in predicting the first category. Default value is empty Mat.
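The weight interpretation can be made concrete with a small cost function. This is only a sketch of the arithmetic in the example above, assuming class labels are 0-based indices into the weight array; `weightedMistakeCost` is a hypothetical name and is not how the library applies priors internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: total cost of the mistakes, where each misclassified
// sample contributes the weight of its true class.
// Assumption: labels are 0-based indices into classWeight.
double weightedMistakeCost(const std::vector<int>& predicted,
                           const std::vector<int>& actual,
                           const std::vector<double>& classWeight) {
    assert(predicted.size() == actual.size());
    double cost = 0.0;
    for (std::size_t i = 0; i < actual.size(); ++i) {
        if (predicted[i] != actual[i]) {
            cost += classWeight.at(static_cast<std::size_t>(actual[i]));
        }
    }
    return cost;
}
```

With weights {1, 10}, one mistake on a sample of the second category costs as much as ten mistakes on samples of the first category.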
virtual float getRegressionAccuracy () const =0 | pure virtual, inherited |
Termination criteria for regression trees.
If all absolute differences between the estimated value in a node and the values of the training samples in this node are less than this parameter, the node will not be split further. Default value is 0.01f.
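The stopping rule described above amounts to a check over the samples in a node. A minimal sketch, assuming the node's sample values and its estimate are given as plain numbers (`withinRegressionAccuracy` is a hypothetical name):

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: true if every sample value in the node differs from the
// node's estimate by less than regressionAccuracy, so the node is not split further.
bool withinRegressionAccuracy(const std::vector<double>& nodeValues,
                              double nodeEstimate,
                              double regressionAccuracy) {
    for (double v : nodeValues) {
        if (std::fabs(v - nodeEstimate) >= regressionAccuracy) {
            return false;  // at least one sample is too far from the estimate
        }
    }
    return true;
}
```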
virtual const std::vector< int > & getRoots () const =0 | pure virtual, inherited |
Returns indices of root nodes.
virtual const std::vector< Split > & getSplits () const =0 | pure virtual, inherited |
Returns all the splits.
All split indices are indices into the returned vector.
virtual const std::vector< int > & getSubsets () const =0 | pure virtual, inherited |
Returns all the bitsets for categorical splits.
Split::subsetOfs is an offset into the returned vector.
virtual TermCriteria getTermCriteria () const =0 | pure virtual |
The termination criteria that specify when the training algorithm stops.
Training stops either when the specified number of trees has been trained and added to the ensemble or when sufficient accuracy (measured as OOB error) is achieved. Typically, the more trees you have, the better the accuracy. However, the improvement in accuracy generally diminishes and levels off past a certain number of trees. Also keep in mind that prediction time increases linearly with the number of trees. Default value is TermCriteria(TermCriteria::MAX_ITERS + TermCriteria::EPS, 50, 0.1).
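The two stopping conditions can be pictured as a loop that adds trees until either limit is hit. The sketch below simulates that loop over a precomputed list of OOB errors; `treesTrained` is a hypothetical helper, and the exact comparison against the accuracy threshold (here `<=`) is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simulation: oobErrorAfterTree[i] is the OOB error measured after
// tree i+1 has been added. Returns how many trees end up in the ensemble.
// Assumption: training stops as soon as the error is <= eps.
int treesTrained(const std::vector<double>& oobErrorAfterTree,
                 int maxTrees, double eps) {
    assert(maxTrees >= 0);
    int count = 0;
    for (double err : oobErrorAfterTree) {
        if (count >= maxTrees) break;  // tree budget exhausted
        ++count;                       // one more tree trained and added
        if (err <= eps) break;         // sufficient accuracy reached
    }
    return count;
}
```

With the default limits (50 trees, accuracy 0.1) and errors of 0.4, 0.2, 0.09 after the first three trees, this sketch stops at three trees.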
virtual bool getTruncatePrunedTree () const =0 | pure virtual, inherited |
If true then pruned branches are physically removed from the tree.
Otherwise they are retained and it is possible to get results from the original unpruned (or pruned less aggressively) tree. Default value is true.
virtual bool getUse1SERule () const =0 | pure virtual, inherited |
If true, pruning will be harsher.
This makes the tree more compact and more resistant to noise in the training data, but a bit less accurate. Default value is true.
virtual bool getUseSurrogates () const =0 | pure virtual, inherited |
If true then surrogate splits will be built.
These splits make it possible to work with missing data and to compute variable importance correctly. Default value is false.
virtual int getVarCount () const =0 | pure virtual, inherited |
Returns the number of variables in training samples.
virtual Mat getVarImportance () const =0 | pure virtual |
Returns the variable importance array.
The method returns the variable importance vector, computed at the training stage when CalculateVarImportance is set to true. If this flag was set to false, an empty matrix is returned.
virtual void getVotes (InputArray samples, OutputArray results, int flags) const =0 | pure virtual |
Returns the result of each individual tree in the forest.
If the model is a regression model, the method returns each tree's result for each of the samples. If the model is a classifier, it returns a Mat with samples + 1 rows, where the first row gives the class numbers and the following rows give the votes each class received for each sample.
samples | Array containing the samples for which votes will be calculated. |
results | Array where the result of the calculation will be written. |
flags | Flags for defining the type of RTrees. |
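For the classifier case, the described layout (one header row of class numbers, then one row of vote counts per sample) can be consumed as sketched below. The matrix is modeled here as nested vectors rather than a Mat, and `majorityClasses` is a hypothetical helper; ties are resolved in favor of the first class listed, which is a choice of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: given a votes table with samples + 1 rows (row 0 holds the
// class numbers, each following row holds the per-class vote counts for one sample),
// return the class with the most votes for each sample.
std::vector<int> majorityClasses(const std::vector<std::vector<int>>& votes) {
    assert(!votes.empty() && !votes[0].empty());
    const std::vector<int>& labels = votes[0];
    std::vector<int> winners;
    for (std::size_t r = 1; r < votes.size(); ++r) {
        assert(votes[r].size() == labels.size());
        // max_element returns the first maximum, so ties go to the earlier class.
        const auto best = std::max_element(votes[r].begin(), votes[r].end());
        winners.push_back(labels[static_cast<std::size_t>(best - votes[r].begin())]);
    }
    return winners;
}
```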
virtual bool isClassifier () const =0 | pure virtual, inherited |
Returns true if the model is a classifier.
virtual bool isTrained () const =0 | pure virtual, inherited |
Returns true if the model is trained.
static Ptr< RTrees > load (const String &filepath, const String &nodeName=String()) | static |
Loads and creates a serialized RTrees model from a file.
Use RTrees::save to serialize and store an RTrees model to disk. To load the model again, call this function with the path to the file. Optionally, specify the name of the file node containing the classifier.
filepath | path to the serialized RTrees model |
nodeName | name of the node containing the classifier |
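template<typename _Tp > static Ptr< _Tp > loadFromString (const String &strModel, const String &objname=String()) | inline, static, inherited |
Loads algorithm from a String.
strModel | The string variable containing the model you want to load. |
objname | The optional name of the node to read (if empty, the first top-level node will be used) |
This is a static template method of Algorithm. Its usage is as follows (in the case of SVM):
Ptr<SVM> svm = Algorithm::loadFromString<SVM>(myStringModel);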
References CV_WRAP, cv::FileNode::empty(), cv::FileStorage::getFirstTopLevelNode(), cv::FileStorage::MEMORY, and cv::FileStorage::READ.
virtual float predict (InputArray samples, OutputArray results=noArray(), int flags=0) const =0 | pure virtual, inherited |
Predicts response(s) for the provided sample(s).
samples | The input samples, floating-point matrix |
results | The optional output matrix of results. |
flags | The optional flags, model-dependent. See cv::ml::StatModel::Flags. |
Implemented in cv::ml::LogisticRegression, and cv::ml::EM.
virtual void read (const FileNode &fn) | inline, virtual, inherited |
Reads algorithm parameters from a file storage.
Reimplemented in cv::FlannBasedMatcher, cv::DescriptorMatcher, and cv::Feature2D.
template<typename _Tp > static Ptr< _Tp > read (const FileNode &fn) | inline, static, inherited |
Reads algorithm from the file node.
This is a static template method of Algorithm. Its usage is as follows (in the case of SVM):
Ptr<SVM> svm = Algorithm::read<SVM>(fn);
For this method to work, the derived class must override Algorithm::read(const FileNode& fn) and also have a static create() method without parameters (or with all parameters optional).
virtual void save (const String &filename) const | virtual, inherited |
Saves the algorithm to a file.
In order to make this method work, the derived class must implement Algorithm::write(FileStorage& fs).
virtual void setActiveVarCount (int val)=0 | pure virtual |
The size of the randomly selected subset of features used at each tree node to find the best split(s).
virtual void setCalculateVarImportance (bool val)=0 | pure virtual |
If true then variable importance will be calculated and then it can be retrieved by RTrees::getVarImportance.
virtual void setCVFolds (int val)=0 | pure virtual, inherited |
If CVFolds > 1, the algorithm prunes the built decision tree using a K-fold cross-validation procedure, where K is equal to CVFolds.
virtual void setMaxCategories (int val)=0 | pure virtual, inherited |
Cluster possible values of a categorical variable into K<=maxCategories clusters to find a suboptimal split.
virtual void setMaxDepth (int val)=0 | pure virtual, inherited |
The maximum possible depth of the tree.
virtual void setMinSampleCount (int val)=0 | pure virtual, inherited |
If the number of samples in a node is less than this parameter then the node will not be split.
virtual void setPriors (const cv::Mat &val)=0 | pure virtual, inherited |
The array of a priori class probabilities, sorted by the class label value.
virtual void setRegressionAccuracy (float val)=0 | pure virtual, inherited |
Termination criteria for regression trees.
virtual void setTermCriteria (const TermCriteria &val)=0 | pure virtual |
The termination criteria that specify when the training algorithm stops.
virtual void setTruncatePrunedTree (bool val)=0 | pure virtual, inherited |
If true then pruned branches are physically removed from the tree.
virtual void setUse1SERule (bool val)=0 | pure virtual, inherited |
If true, pruning will be harsher.
virtual void setUseSurrogates (bool val)=0 | pure virtual, inherited |
If true then surrogate splits will be built.
virtual bool train (const Ptr< TrainData > &trainData, int flags=0) | virtual, inherited |
Trains the statistical model.
trainData | training data that can be loaded from file using TrainData::loadFromCSV or created with TrainData::create. |
flags | optional flags, depending on the model. Some of the models can be updated with the new training samples, not completely overwritten (such as NormalBayesClassifier or ANN_MLP). |
virtual bool train (InputArray samples, int layout, InputArray responses) | virtual, inherited |
Trains the statistical model.
samples | training samples |
layout | See ml::SampleTypes. |
responses | vector of responses associated with the training samples. |
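template<typename _Tp > static Ptr< _Tp > train (const Ptr< TrainData > &data, int flags=0) | inline, static, inherited |
Creates and trains a model with default parameters.
The class must implement a static create() method with no parameters or with default values for all parameters.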
virtual void write (FileStorage &fs) const | inline, virtual, inherited |
Stores algorithm parameters in a file storage.
Reimplemented in cv::FlannBasedMatcher, cv::DescriptorMatcher, and cv::Feature2D.
References CV_WRAP.
Referenced by cv::Feature2D::write(), and cv::DescriptorMatcher::write().
void write (const Ptr< FileStorage > &fs, const String &name=String()) const | inherited |
Simplified API for language bindings. This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
void writeFormat (FileStorage &fs) const | protected, inherited |