ppforest2 v0.1.0
Projection Pursuit Decision Trees and Random Forests
Abstract base class for projection pursuit decision trees.
#include <Tree.hpp>
Public Types | |
| using | FeatureMatrix = types::FeatureMatrix |
| using | FeatureVector = types::FeatureVector |
| using | GroupPartition = stats::GroupPartition |
| using | Outcome = types::Outcome |
| using | OutcomeVector = types::OutcomeVector |
| using | Ptr = std::unique_ptr<Tree> |
| using | RNG = stats::RNG |
| using | Root = TreeNode::Ptr |
Public Types inherited from ppforest2::Model | |
| using | Ptr = std::shared_ptr<Model> |
Public Member Functions | |
| void | accept (Model::Visitor &visitor) const override=0 |
| Accept a model visitor (mode-specific dispatch). | |
| bool | operator!= (Tree const &other) const |
| bool | operator== (Tree const &other) const |
| virtual types::OutcomeVector | predict (types::FeatureMatrix const &x) const |
| Predict a matrix of observations. | |
| types::Outcome | predict (types::FeatureVector const &x) const override |
| Predict a single observation. | |
Public Member Functions inherited from ppforest2::Model | |
| virtual | ~Model ()=default |
Static Public Member Functions | |
| static Ptr | train (TrainingSpec const &spec, types::FeatureMatrix &x, types::OutcomeVector &y) |
| Train a tree from a feature matrix and response vector, creating the RNG from the spec's seed. | |
| static Ptr | train (TrainingSpec const &spec, types::FeatureMatrix &x, types::OutcomeVector &y, stats::RNG &rng) |
| Train a tree from a feature matrix and response vector, using a caller-supplied RNG. | |
Static Public Member Functions inherited from ppforest2::Model | |
| static void | check_train_inputs (types::FeatureMatrix const &x, types::OutcomeVector const &y) |
| Validate common training inputs (y non-empty, matching x rows). | |
| static Ptr | train (TrainingSpec const &spec, types::FeatureMatrix &x, types::OutcomeVector &y) |
| Train a model from a training specification. | |
Public Attributes | |
| Root | root |
| Root node of the tree. | |
Public Attributes inherited from ppforest2::Model | |
| bool | degenerate = false |
| Whether the model contains degenerate nodes/splits. | |
| TrainingSpec::Ptr | training_spec |
| Training specification used to build this model. | |
Protected Member Functions | |
| Tree (TreeNode::Ptr root, TrainingSpec::Ptr spec) | |
Static Protected Member Functions | |
| static Root | build_root (TrainingSpec const &spec, FeatureMatrix &x, OutcomeVector &y, GroupPartition const &y_part, RNG &rng) |
| Build the root node of a tree. | |
Abstract base class for projection pursuit decision trees.
Each internal node projects data onto a linear combination of features and splits on the projected value. Leaf values depend on the mode — group labels for classification, mean response for regression — implemented in the concrete subclasses ClassificationTree and RegressionTree.
Construct via the static Tree::train factory, which dispatches to the correct concrete type based on training_spec.mode.
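The factory dispatch described above can be pictured with a minimal sketch. Everything in it is a hypothetical stand-in (the `Mode` enum, `SpecSketch`, and the `*Sketch` classes are not ppforest2 types); it only illustrates a static factory that selects a concrete subclass from a mode field and returns it through a `std::unique_ptr` to the base class.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

namespace factory_sketch {

// Hypothetical stand-ins; the real ppforest2 types are richer.
enum class Mode { Classification, Regression };

struct SpecSketch {
  Mode mode;
};

struct TreeSketch {
  using Ptr = std::unique_ptr<TreeSketch>;
  virtual ~TreeSketch() = default;
  virtual std::string kind() const = 0;

  // Factory: pick the concrete subclass from the spec's mode.
  static Ptr train(SpecSketch const& spec);
};

struct ClassificationTreeSketch : TreeSketch {
  std::string kind() const override { return "classification"; }
};

struct RegressionTreeSketch : TreeSketch {
  std::string kind() const override { return "regression"; }
};

inline TreeSketch::Ptr TreeSketch::train(SpecSketch const& spec) {
  switch (spec.mode) {
    case Mode::Classification:
      return std::make_unique<ClassificationTreeSketch>();
    case Mode::Regression:
      return std::make_unique<RegressionTreeSketch>();
  }
  // Reached only if the enum holds a value outside its declared range.
  throw std::invalid_argument("unknown mode");
}

}  // namespace factory_sketch
```

Callers hold only the base-class pointer and never name the concrete type, which is why the base class can stay abstract.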
| using ppforest2::Tree::Ptr = std::unique_ptr<Tree> |
| using ppforest2::Tree::RNG = stats::RNG |
| using ppforest2::Tree::Root = TreeNode::Ptr |
| ppforest2::Tree::Tree (TreeNode::Ptr root, TrainingSpec::Ptr spec) | inline, protected |
| void ppforest2::Tree::accept (Model::Visitor &visitor) const | override, pure virtual |
Accept a model visitor (mode-specific dispatch).
Implements ppforest2::Model.
Implemented in ppforest2::ClassificationTree, and ppforest2::RegressionTree.
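A pure virtual accept that each concrete tree implements is the usual shape of the visitor pattern. The sketch below shows that pattern in general terms; the visitor interface, its `visit` overloads, and the class names are assumptions for illustration and may not match the actual `Model::Visitor` interface.

```cpp
#include <string>

namespace visitor_sketch {

struct ClassificationTree;
struct RegressionTree;

// Hypothetical visitor interface with one overload per concrete tree type.
struct Visitor {
  virtual ~Visitor() = default;
  virtual void visit(ClassificationTree const&) = 0;
  virtual void visit(RegressionTree const&) = 0;
};

struct Tree {
  virtual ~Tree() = default;
  // Pure virtual: only the concrete type knows which overload to call.
  virtual void accept(Visitor& visitor) const = 0;
};

struct ClassificationTree : Tree {
  void accept(Visitor& visitor) const override { visitor.visit(*this); }
};

struct RegressionTree : Tree {
  void accept(Visitor& visitor) const override { visitor.visit(*this); }
};

// Example visitor that records which mode it was dispatched to.
struct ModeNameVisitor : Visitor {
  std::string name;
  void visit(ClassificationTree const&) override { name = "classification"; }
  void visit(RegressionTree const&) override { name = "regression"; }
};

}  // namespace visitor_sketch
```

Calling `accept` through a `Tree const&` reaches the overload for the dynamic type, so code such as serializers can branch on mode without casts.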
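| static Root ppforest2::Tree::build_root (TrainingSpec const &spec, FeatureMatrix &x, OutcomeVector &y, GroupPartition const &y_part, RNG &rng) | static, protected |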
Build the root node of a tree.
Iteratively grows the tree from the given group partition. Shared implementation used by ClassificationTree::train and RegressionTree::train.
x and y are mutable because regression's ByCutpoint grouping strategy reorders rows in place. Classification mutates neither; the references stay non-const to keep the signature uniform across modes and to avoid the "aliased mutable vs immutable view" smell of the prior pointer-based design.
| spec | Training specification (strategies + mode). |
| x | Feature matrix. Mutated by regression's in-place row reorder; untouched by classification. |
| y | Response vector. Same mutation contract as x. |
| y_part | Initial group partition for the root node. |
| rng | Random number generator (tree-local). |
| bool ppforest2::Tree::operator!= | ( | Tree const & | other | ) | const |
| bool ppforest2::Tree::operator== | ( | Tree const & | other | ) | const |
| virtual types::OutcomeVector ppforest2::Tree::predict (types::FeatureMatrix const &x) const | inline, virtual |
Predict a matrix of observations.
Default implementation iterates rows and dispatches to the single-row predict. Subclasses may override to vectorize.
| x | Feature matrix (n × p). |
Reimplemented from ppforest2::Model.
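The default behaviour described above (iterate rows, call the single-row predict) can be sketched as follows. The container aliases are assumptions: plain `std::vector` stands in for the library's `types::FeatureMatrix` and `types::OutcomeVector`, and `SumModel` is a toy model invented for the example.

```cpp
#include <vector>

namespace batch_sketch {

using FeatureVector = std::vector<double>;
using FeatureMatrix = std::vector<FeatureVector>;  // one inner vector per row
using Outcome = double;
using OutcomeVector = std::vector<Outcome>;

struct Model {
  virtual ~Model() = default;

  // Single-row prediction, supplied by the concrete model.
  virtual Outcome predict(FeatureVector const& x) const = 0;

  // Default batch prediction: one single-row call per row.
  // Subclasses could override this with a vectorized version.
  virtual OutcomeVector predict(FeatureMatrix const& x) const {
    OutcomeVector out;
    out.reserve(x.size());
    for (FeatureVector const& row : x) {
      out.push_back(predict(row));
    }
    return out;
  }
};

// Toy model: the "prediction" is the sum of the features.
struct SumModel : Model {
  using Model::predict;  // keep the matrix overload visible
  Outcome predict(FeatureVector const& x) const override {
    Outcome sum = 0.0;
    for (double v : x) sum += v;
    return sum;
  }
};

}  // namespace batch_sketch
```

The `using Model::predict;` line matters in a design like this: overriding one overload in a subclass otherwise hides the other.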
| types::Outcome ppforest2::Tree::predict (types::FeatureVector const &x) const | override, virtual |
Predict a single observation.
Walks the tree and returns the leaf value. Same implementation for both modes — the leaf value is produced by the mode-specific leaf strategy during training.
Implements ppforest2::Model.
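As a rough illustration of the walk, the sketch below projects the observation onto each internal node's weight vector, compares the result with a cutpoint, and descends until it reaches a leaf. The `Node` layout, the field names, and the "below cutpoint goes to the lower child" rule are all assumptions made for the example, not the actual TreeNode design.

```cpp
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

namespace walk_sketch {

using FeatureVector = std::vector<double>;

struct Node {
  // Internal-node fields (unused on leaves).
  std::vector<double> projector;  // weights of the linear combination
  double cutpoint = 0.0;
  std::unique_ptr<Node> lower;    // taken when projection < cutpoint
  std::unique_ptr<Node> upper;    // taken otherwise
  // Leaf field: stored during training, returned as-is at prediction time.
  double value = 0.0;

  bool is_leaf() const { return !lower && !upper; }
};

// Walk from the root to a leaf. Assumes x has as many entries as each
// projector and that every internal node has both children.
inline double predict(Node const& root, FeatureVector const& x) {
  Node const* node = &root;
  while (!node->is_leaf()) {
    double projected = std::inner_product(
        x.begin(), x.end(), node->projector.begin(), 0.0);
    node = (projected < node->cutpoint) ? node->lower.get()
                                        : node->upper.get();
  }
  return node->value;
}

inline std::unique_ptr<Node> leaf(double value) {
  auto node = std::make_unique<Node>();
  node->value = value;
  return node;
}

inline std::unique_ptr<Node> split(std::vector<double> projector,
                                   double cutpoint,
                                   std::unique_ptr<Node> lower,
                                   std::unique_ptr<Node> upper) {
  auto node = std::make_unique<Node>();
  node->projector = std::move(projector);
  node->cutpoint = cutpoint;
  node->lower = std::move(lower);
  node->upper = std::move(upper);
  return node;
}

}  // namespace walk_sketch
```

Because the leaf simply stores a value, the same walk serves both modes: only what training writes into the leaf differs.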
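| static Ptr ppforest2::Tree::train (TrainingSpec const &spec, types::FeatureMatrix &x, types::OutcomeVector &y) | static |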
Train a tree from a response vector.
Dispatches to ClassificationTree::train or RegressionTree::train based on training_spec.mode. Creates an RNG from the spec's seed.
x and y are taken by mutable reference because some strategies, notably ByCutpoint for regression, permute rows in place during training. Classification training does not mutate them, so the classification path pays no cost for this signature. The alternative, a const-correct public API with a defensive copy at the regression dispatch, would force a full-matrix copy on every single-tree regression call. The library avoids that cost because its natural callers (R bindings, CLI) discard the data right after training anyway. Callers who need to preserve the original row order must copy before calling.
| spec | Training specification (strategies + mode). |
| x | Feature matrix (n × p). May be permuted during regression training. |
| y | Response vector (n) — integer labels for classification, continuous response for regression. May be permuted during regression training. |
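The copy-before-calling advice can be illustrated with a small caller-side sketch. Here `train_in_place` is a hypothetical stand-in for any training call that permutes rows; it sorts rows by response purely as a placeholder for whatever reordering the real strategy applies, and the `std::vector` aliases are not the library's matrix types.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

namespace copy_sketch {

using Row = std::vector<double>;
using Matrix = std::vector<Row>;
using Vector = std::vector<double>;

// Hypothetical stand-in for a training call that reorders rows in place.
// Rows are sorted by response, keeping x and y aligned.
inline void train_in_place(Matrix& x, Vector& y) {
  std::vector<std::size_t> order(y.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::sort(order.begin(), order.end(),
            [&](std::size_t a, std::size_t b) { return y[a] < y[b]; });

  Matrix x_sorted;
  Vector y_sorted;
  x_sorted.reserve(x.size());
  y_sorted.reserve(y.size());
  for (std::size_t i : order) {
    x_sorted.push_back(x[i]);
    y_sorted.push_back(y[i]);
  }
  x = std::move(x_sorted);
  y = std::move(y_sorted);
}

// Caller-side wrapper: train on copies so the originals keep their order.
inline void train_preserving(Matrix const& x, Vector const& y) {
  Matrix x_copy = x;
  Vector y_copy = y;
  train_in_place(x_copy, y_copy);
}

}  // namespace copy_sketch
```

Only callers that need the original order pay for the copy; callers that discard the data afterwards can pass it directly.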
| static Ptr ppforest2::Tree::train (TrainingSpec const &spec, types::FeatureMatrix &x, types::OutcomeVector &y, stats::RNG &rng) | static |
Train a tree from a feature matrix and response vector, using a caller-supplied RNG.
Dispatches to ClassificationTree::train or RegressionTree::train based on training_spec.mode. This overload takes rng as an argument instead of creating one from the spec's seed.
x and y follow the same mutable-reference contract as the seed-based overload: regression training may permute rows in place, so callers who need to preserve the original row order must copy before calling.
| spec | Training specification (strategies + mode). |
| x | Feature matrix (n × p). May be permuted during regression training. |
| y | Response vector (n): integer labels for classification, continuous response for regression. May be permuted during regression training. |
| rng | Random number generator supplied by the caller. |