ppforest2 v0.1.0
Projection Pursuit Decision Trees and Random Forests
Abstract base class for projection pursuit random forests.
#include <Forest.hpp>
Public Types | |
| using | FeatureMatrix = types::FeatureMatrix |
| using | FeatureVector = types::FeatureVector |
| using | Outcome = types::Outcome |
| using | OutcomeVector = types::OutcomeVector |
| using | Ptr = std::unique_ptr<Forest> |
| using | RNG = stats::RNG |
Public Types inherited from ppforest2::Model | |
| using | Ptr = std::shared_ptr<Model> |
Public Member Functions | |
| void | accept (Model::Visitor &visitor) const override=0 |
| Accept a model visitor (mode-specific dispatch). | |
| void | add_tree (BaggedTree::Ptr tree) |
| Add a trained bagged tree to the forest. | |
| bool | operator!= (Forest const &other) const |
| bool | operator== (Forest const &other) const |
| Outcome | predict (FeatureVector const &x) const override=0 |
| Per-row prediction (mode-specific: majority vote or mean). | |
| virtual types::OutcomeVector | predict (types::FeatureMatrix const &x) const |
| Predict a matrix of observations. | |
Public Member Functions inherited from ppforest2::Model | |
| virtual | ~Model ()=default |
Static Public Member Functions | |
| static Ptr | train (TrainingSpec const &spec, FeatureMatrix const &x, OutcomeVector const &y) |
| Train a random forest. | |
Static Public Member Functions inherited from ppforest2::Model | |
| static void | check_train_inputs (types::FeatureMatrix const &x, types::OutcomeVector const &y) |
| Validate common training inputs (y non-empty, matching x rows). | |
| static Ptr | train (TrainingSpec const &spec, types::FeatureMatrix &x, types::OutcomeVector &y) |
| Train a model from a training specification. | |
Public Attributes | |
| std::vector< BaggedTree::Ptr > | trees |
Bootstrap-aggregated trees. Each BaggedTree pairs the polymorphic inner Tree with its sample indices. | |
Public Attributes inherited from ppforest2::Model | |
| bool | degenerate = false |
| Whether the model contains degenerate nodes/splits. | |
| TrainingSpec::Ptr | training_spec |
| Training specification used to build this model. | |
Protected Member Functions | |
| Forest ()=default | |
| Forest (TrainingSpec::Ptr spec) | |
| void | build_trees (types::FeatureMatrix const &x, types::OutcomeVector const &y) |
Build training_spec->size bagged trees in parallel and attach them to this forest. | |
| virtual BaggedTree::Ptr | train_tree (FeatureMatrix const &x, OutcomeVector const &y, RNG &rng) const =0 |
Train one bagged tree on a bootstrap resample of x / y. | |
Abstract base class for projection pursuit random forests.
Holds a vector of BaggedTree wrappers (each pairs a Tree with the bootstrap sample indices it was trained on) and the shared training spec. Aggregation logic (majority vote vs mean), proportion predictions, and OOB handling are defined in the concrete subclasses ClassificationForest and RegressionForest.
Construct via Forest::train, which dispatches to the correct concrete type based on training_spec.mode. Diagnostics and variable importance are free functions in models/Evaluation.hpp — they take a Forest const& (or Ptr) and dispatch on mode internally.
| using ppforest2::Forest::Ptr = std::unique_ptr<Forest> |
| using ppforest2::Forest::RNG = stats::RNG |
| ppforest2::Forest::Forest | ( | ) | protected, default |
| ppforest2::Forest::Forest | ( | TrainingSpec::Ptr | spec | ) | inline, explicit, protected |
| void ppforest2::Forest::accept | ( | Model::Visitor & | visitor | ) | const | override, pure virtual |
Accept a model visitor (mode-specific dispatch).
Implements ppforest2::Model.
Implemented in ppforest2::ClassificationForest, and ppforest2::RegressionForest.
| void ppforest2::Forest::add_tree | ( | BaggedTree::Ptr | tree | ) |
Add a trained bagged tree to the forest.
Asserts at runtime that the incoming tree's mode matches this forest's mode (e.g., a ClassificationForest can only accept trees trained under classification). This is the type-safety compromise of keeping Forest::trees mode-agnostic at the container level: the mode is checked once, at assembly time, rather than at every prediction site, and a single runtime check is cheaper than threading templates through every call site that handles a Forest::Ptr.
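The assembly-time check described above can be sketched with stand-in types. This is a minimal illustration, not the library's implementation: TreeMode, ModeTaggedTree, ModeCheckedForest and accepts are hypothetical names, and the use of the assert macro is an assumption about how the runtime assertion might be expressed.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real ppforest2 types carry far more state.
enum class TreeMode { Classification, Regression };

struct ModeTaggedTree {
    using Ptr = std::unique_ptr<ModeTaggedTree>;
    TreeMode mode;
};

struct ModeCheckedForest {
    TreeMode mode;
    std::vector<ModeTaggedTree::Ptr> trees;  // container itself is mode-agnostic

    // True when the tree was trained under the same mode as this forest.
    bool accepts(ModeTaggedTree const& tree) const { return tree.mode == mode; }

    void add_tree(ModeTaggedTree::Ptr tree) {
        // Single runtime check when the forest is assembled; prediction
        // code can then rely on every stored tree having the right mode.
        assert(tree && accepts(*tree));
        trees.push_back(std::move(tree));
    }
};
```

Because the check lives in add_tree, code that only holds a base-class pointer never needs to know the mode at compile time.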
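| void ppforest2::Forest::build_trees | ( | types::FeatureMatrix const & | x, | types::OutcomeVector const & | y | ) | protected |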
Build training_spec->size bagged trees in parallel and attach them to this forest.
Shared training-loop scaffolding for ClassificationForest::train and RegressionForest::train. The two only differ in how a single bagged tree is trained — the outer scaffolding (size guard, OpenMP setup, per-tree retry loop with stream-id RNG, error capture, the post-loop assembly with degenerate-flag propagation) is identical. Mirrors the Tree::build_root pattern: shared algorithm lives on the base; the concrete subclass provides the mode-specific work via the train_tree virtual hook.
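Each tree is trained with an RNG stream whose id is i + attempt * size (slot index i, retry number attempt, forest size size), so distinct (slot, attempt) pairs get distinct streams. This formula is load-bearing for reproducibility.
| Exceptions | Whatever train_tree threw. If multiple iterations threw, the first (lowest-index) exception is rethrown. |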
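A minimal sketch of the two mechanisms mentioned above: the stream-id formula and the "rethrow the lowest-index exception" rule. The function names stream_id and run_all_then_rethrow_first are hypothetical, the loop is serial rather than an OpenMP parallel for, and storing one std::exception_ptr per slot is only one plausible way to make the rethrown exception independent of scheduling.

```cpp
#include <cstddef>
#include <exception>
#include <functional>
#include <vector>

// Stream id for slot i on retry `attempt` in a forest of `size` trees.
// As long as i < size, no two (i, attempt) pairs map to the same id.
inline std::size_t stream_id(std::size_t i, std::size_t attempt, std::size_t size) {
    return i + attempt * size;
}

// Runs work(i) for every slot, captures any exception per slot, and
// after the loop rethrows the captured exception with the lowest index.
inline void run_all_then_rethrow_first(std::size_t size,
                                       std::function<void(std::size_t)> const& work) {
    std::vector<std::exception_ptr> errors(size);  // null means "no error"
    for (std::size_t i = 0; i < size; ++i) {
        try {
            work(i);
        } catch (...) {
            errors[i] = std::current_exception();
        }
    }
    for (auto const& e : errors) {
        if (e) std::rethrow_exception(e);
    }
}
```

Capturing errors instead of throwing from inside the loop matters in the parallel case, since an exception must not escape an OpenMP parallel region.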
| bool ppforest2::Forest::operator!= | ( | Forest const & | other | ) | const |
| bool ppforest2::Forest::operator== | ( | Forest const & | other | ) | const |
| Outcome ppforest2::Forest::predict | ( | FeatureVector const & | x | ) | const | override, pure virtual |
Per-row prediction (mode-specific: majority vote or mean).
Implements ppforest2::Model.
Implemented in ppforest2::ClassificationForest, and ppforest2::RegressionForest.
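The two aggregation rules named above can be sketched as free functions over per-tree predictions. The names majority_vote and mean_prediction are hypothetical, as is the tie-breaking rule (smallest label wins); the concrete subclasses may resolve ties differently. Both functions assume a non-empty input.

```cpp
#include <cstddef>
#include <map>
#include <numeric>
#include <vector>

// Classification: the label predicted by the most trees.
// Ties go to the smallest label, an arbitrary choice for this sketch.
inline int majority_vote(std::vector<int> const& votes) {
    std::map<int, std::size_t> counts;
    for (int v : votes) ++counts[v];

    int best = 0;
    std::size_t best_count = 0;
    for (auto const& entry : counts) {  // iterates labels in ascending order
        if (entry.second > best_count) {
            best = entry.first;
            best_count = entry.second;
        }
    }
    return best;
}

// Regression: the arithmetic mean of the per-tree predictions.
inline double mean_prediction(std::vector<double> const& preds) {
    double const sum = std::accumulate(preds.begin(), preds.end(), 0.0);
    return sum / static_cast<double>(preds.size());
}
```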
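| virtual types::OutcomeVector ppforest2::Forest::predict | ( | types::FeatureMatrix const & | x | ) | const | inline, virtual |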
Predict a matrix of observations.
Default implementation iterates rows and dispatches to the single-row predict. Subclasses may override to vectorize.
| x | Feature matrix (n × p). |
Reimplemented from ppforest2::Model.
Reimplemented in ppforest2::RegressionForest.
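The default row-by-row behaviour can be sketched as follows. Row, Matrix, Outcomes, RowwiseModel and SumModel are hypothetical stand-ins built on std::vector; the actual types::FeatureMatrix and types::OutcomeVector are not shown on this page and may differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the feature and outcome types.
using Row = std::vector<double>;
using Matrix = std::vector<Row>;  // n rows, p columns
using Outcomes = std::vector<double>;

struct RowwiseModel {
    virtual ~RowwiseModel() = default;

    // Single-row prediction: supplied by each concrete model.
    virtual double predict(Row const& x) const = 0;

    // Default matrix prediction: one single-row call per row.
    // A subclass may override this with a vectorized version.
    virtual Outcomes predict(Matrix const& x) const {
        Outcomes out;
        out.reserve(x.size());
        for (Row const& row : x) out.push_back(predict(row));
        return out;
    }
};

// Toy model for illustration only: predicts the sum of the features.
struct SumModel : RowwiseModel {
    using RowwiseModel::predict;  // keep the matrix overload visible
    double predict(Row const& x) const override {
        double s = 0.0;
        for (double v : x) s += v;
        return s;
    }
};
```

The using-declaration in the subclass is needed because overriding one predict overload would otherwise hide the inherited matrix overload.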
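| static Ptr ppforest2::Forest::train | ( | TrainingSpec const & | spec, | FeatureMatrix const & | x, | OutcomeVector const & | y | ) | static |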
Train a random forest.
Dispatches to ClassificationForest::train or RegressionForest::train based on training_spec.mode. Note that the top-level x / y are not mutated here — each bootstrap tree resamples into its own local storage. (Contrast with single-tree Tree::train, where regression mode does permute x / y in place.)
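The mode-based dispatch can be sketched as a small static factory. Everything here is a hypothetical stand-in (Mode, SketchSpec, SketchForest and its subclasses, kind); the sketch omits the feature matrix and outcome vector and performs no training, it only shows how a base-class Ptr can end up holding the right concrete type.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the training spec and forest hierarchy.
enum class Mode { Classification, Regression };

struct SketchSpec {
    Mode mode;
};

struct SketchForest {
    using Ptr = std::unique_ptr<SketchForest>;
    virtual ~SketchForest() = default;
    virtual std::string kind() const = 0;

    // Static entry point: picks the concrete type from spec.mode.
    static Ptr train(SketchSpec const& spec);
};

struct SketchClassificationForest : SketchForest {
    std::string kind() const override { return "classification"; }
};

struct SketchRegressionForest : SketchForest {
    std::string kind() const override { return "regression"; }
};

inline SketchForest::Ptr SketchForest::train(SketchSpec const& spec) {
    switch (spec.mode) {
        case Mode::Classification:
            return std::make_unique<SketchClassificationForest>();
        case Mode::Regression:
            return std::make_unique<SketchRegressionForest>();
    }
    throw std::invalid_argument("unknown mode");  // out-of-range enum value
}
```

Callers work only with SketchForest::Ptr, which mirrors how the documented Forest::Ptr hides the concrete forest type.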
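| virtual BaggedTree::Ptr ppforest2::Forest::train_tree | ( | FeatureMatrix const & | x, | OutcomeVector const & | y, | RNG & | rng | ) | const | protected, pure virtual |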
Train one bagged tree on a bootstrap resample of x / y.
Mode-specific hook invoked by build_trees once per slot in the forest (with retries on degenerate trees). Subclasses that need per-training shared state (e.g. ClassificationForest caches the parent's label partition for stratified sampling) hold it as a private transient pointer set up before build_trees runs.
Called from inside build_trees's OpenMP parallel for, so implementations must be thread-safe and read-only over *this.
Implemented in ppforest2::ClassificationForest, and ppforest2::RegressionForest.
| std::vector<BaggedTree::Ptr> ppforest2::Forest::trees |
Bootstrap-aggregated trees. Each BaggedTree pairs the polymorphic inner Tree with its sample indices.