ppforest2 v0.1.0
Projection Pursuit Decision Trees and Random Forests
Bootstrap-aggregated model wrapper.
#include <Bagged.hpp>
Public Types | |
| using | Ptr = std::unique_ptr<Bagged<M>> |
Public Member Functions | |
| Bagged (std::unique_ptr< M > model, std::vector< int > sample_indices) | |
| bool | degenerate () const |
| Whether the wrapped model reported a degenerate training run. | |
| std::vector< int > | oob_indices (int n_total) const |
| Row indices of training observations not in the bootstrap sample. | |
| bool | operator!= (Bagged const &other) const |
| bool | operator== (Bagged const &other) const |
| Structural equality on the wrapped model only. | |
| types::OutcomeVector | predict (types::FeatureMatrix const &x) const |
| Delegate batch prediction to the wrapped model. | |
| types::Outcome | predict (types::FeatureVector const &x) const |
| Delegate single-row prediction to the wrapped model. | |
| types::OutcomeVector | predict_oob (types::FeatureMatrix const &x, std::vector< int > const &row_idx) const |
| Predict a subset of rows (typically OOB indices). | |
Static Public Member Functions | |
| template<typename Derived = M> | |
| static Ptr | make (std::unique_ptr< Derived > model, std::vector< int > sample_indices) |
| Construct a Bagged<M> and return it as a Ptr. | |
Public Attributes | |
| std::unique_ptr< M > | model |
| The bootstrap-trained model. | |
| std::vector< int > | sample_indices |
| Row indices (into the original training set) used to train model. | |
Bootstrap-aggregated model wrapper.
Pairs a concrete model with the row indices of the bootstrap sample it was trained on, so out-of-bag queries can recover the complementary observations. Templated over M so that the bagging abstraction is independent of the model being aggregated. Today M = Tree, but nothing in this class is tree-specific, and a future Bagged<GBTree> or Bagged<LinearModel> would reuse the same wrapper unchanged.
M must provide:
- predict(types::FeatureVector const&) returning a scalar outcome
- predict(types::FeatureMatrix const&) returning an OutcomeVector
- bool degenerate (field or method) for the forest retry logic
- operator==(M const&) for structural equality round-trips

Defined entirely in the header so consumers don't need an explicit instantiation list: including this file with a complete M is enough to use any Bagged<M>.
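The requirements on M can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in: the `std::vector`-based aliases replace the real `types::` definitions (not shown on this page), `ConstantModel` is an invented model, and `BaggedSketch` is a simplified imitation of the wrapper that supports only the field form of `degenerate`.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the types:: aliases (the real ones are not shown here).
using Outcome = double;
using FeatureVector = std::vector<double>;
using FeatureMatrix = std::vector<FeatureVector>;
using OutcomeVector = std::vector<Outcome>;

// Hypothetical model satisfying the four requirements on M.
struct ConstantModel {
  Outcome value = 0.0;
  bool degenerate = false;  // field form of the degenerate flag

  Outcome predict(FeatureVector const&) const { return value; }
  OutcomeVector predict(FeatureMatrix const& x) const {
    return OutcomeVector(x.size(), value);
  }
  bool operator==(ConstantModel const& other) const {
    return value == other.value;
  }
};

// Simplified imitation of the wrapper, for illustration only.
template <typename M>
struct BaggedSketch {
  std::unique_ptr<M> model;
  std::vector<int> sample_indices;

  BaggedSketch(std::unique_ptr<M> m, std::vector<int> idx)
      : model(std::move(m)), sample_indices(std::move(idx)) {}

  // Assumes M exposes degenerate as a data member.
  bool degenerate() const { return model->degenerate; }

  // Both overloads forward to the wrapped model.
  Outcome predict(FeatureVector const& x) const { return model->predict(x); }
  OutcomeVector predict(FeatureMatrix const& x) const { return model->predict(x); }
};
```

Because the wrapper only forwards calls, any type with these four members could be substituted for `ConstantModel` without touching `BaggedSketch`.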
| using ppforest2::Bagged< M >::Ptr = std::unique_ptr<Bagged<M>> |
| ppforest2::Bagged< M >::Bagged (std::unique_ptr< M > model, std::vector< int > sample_indices) | inline |
| bool ppforest2::Bagged< M >::degenerate () const | inline |
Whether the wrapped model reported a degenerate training run.
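| template<typename Derived = M> static Ptr ppforest2::Bagged< M >::make (std::unique_ptr< Derived > model, std::vector< int > sample_indices) | inline static |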
Construct a Bagged<M> and return it as a Ptr.
Sugar for the std::make_unique<Bagged<M>>(std::move(model), std::move(sample_indices)) boilerplate at every callsite that builds a bag (forest training, test fixtures, deserializers). Derived defaults to M, but lets callers wrap a subclass pointer (e.g. ClassificationTree::Ptr) without an upcast at the callsite — the cast happens here in one place.
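A factory of this shape might look like the sketch below. `TreeStub`, `ClassificationTreeStub`, and `BaggedSketch` are hypothetical stand-ins, not the library's classes; the point is only that `std::unique_ptr<Derived>` converts to `std::unique_ptr<M>` inside the factory, so the callsite needs no upcast.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical base and subclass standing in for Tree and ClassificationTree.
struct TreeStub {
  virtual ~TreeStub() = default;  // virtual so deletion through the base is safe
  virtual int kind() const { return 0; }
};
struct ClassificationTreeStub : TreeStub {
  int kind() const override { return 1; }
};

// Simplified imitation of the wrapper, for illustration only.
template <typename M>
struct BaggedSketch {
  using Ptr = std::unique_ptr<BaggedSketch<M>>;

  std::unique_ptr<M> model;
  std::vector<int> sample_indices;

  BaggedSketch(std::unique_ptr<M> m, std::vector<int> idx)
      : model(std::move(m)), sample_indices(std::move(idx)) {}

  // Derived is deduced from the argument; the conversion to unique_ptr<M>
  // happens once, here, rather than at every callsite.
  template <typename Derived = M>
  static Ptr make(std::unique_ptr<Derived> m, std::vector<int> idx) {
    return std::make_unique<BaggedSketch<M>>(
        std::unique_ptr<M>(std::move(m)), std::move(idx));
  }
};
```

A caller holding a `std::unique_ptr<ClassificationTreeStub>` can pass it straight to `BaggedSketch<TreeStub>::make`, and virtual dispatch on the wrapped model is preserved.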
| std::vector< int > ppforest2::Bagged< M >::oob_indices (int n_total) const | inline |
Row indices of training observations not in the bootstrap sample.
O(n_sample + n_total) bitmap scan. A standard bootstrap sample (n draws with replacement from n rows) covers about 63% of the distinct rows, so in-bag membership is dense and a flat vector<uint8_t> bitmap is faster than the vector<bool> proxy or a tree-set lookup. Out-of-range indices in sample_indices are silently skipped; bootstrap sampling never produces them.
| n_total | Total number of training observations. |
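The bitmap scan described above can be sketched as a free function. The name `oob_indices_sketch` and its signature are illustrative assumptions, not the library's implementation; the sketch follows the documented behavior of marking in-bag rows in a byte vector, skipping out-of-range indices, and collecting the unmarked rows in ascending order.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative helper (hypothetical name): rows in [0, n_total) that never
// appear in sample_indices, in ascending order.
std::vector<int> oob_indices_sketch(std::vector<int> const& sample_indices,
                                    int n_total) {
  if (n_total <= 0) return {};

  // One byte per row; avoids the proxy references of std::vector<bool>.
  std::vector<std::uint8_t> in_bag(static_cast<std::size_t>(n_total), 0);

  // Pass 1, O(n_sample): mark every in-bag row, ignoring out-of-range indices.
  for (int idx : sample_indices) {
    if (idx >= 0 && idx < n_total) {
      in_bag[static_cast<std::size_t>(idx)] = 1;
    }
  }

  // Pass 2, O(n_total): collect the rows that were never marked.
  std::vector<int> oob;
  for (int i = 0; i < n_total; ++i) {
    if (!in_bag[static_cast<std::size_t>(i)]) oob.push_back(i);
  }
  return oob;
}
```

Duplicated sample indices cost nothing extra, since marking a row twice leaves the bitmap unchanged.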
| bool ppforest2::Bagged< M >::operator!= (Bagged const &other) const | inline |
| bool ppforest2::Bagged< M >::operator== (Bagged const &other) const | inline |
Structural equality on the wrapped model only.
sample_indices is deliberately excluded from the comparison. It records which rows were used to train this bag — bookkeeping for OOB computation, not an identity property of the model. Two bags that would produce the same predictions on every input are equal here, even if they were trained on different bootstrap samples.
Callers that need to assert sample-indices round-trip must compare sample_indices directly (see Json.test.cpp's round-trip test).
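The comparison semantics can be sketched as follows. `StubModel` and `BaggedSketch` are hypothetical stand-ins, and the sketch assumes both wrapped model pointers are non-null; how the real class treats a null model is not stated here.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical model with structural equality on a single field.
struct StubModel {
  int id = 0;
  bool operator==(StubModel const& other) const { return id == other.id; }
};

// Simplified imitation of the wrapper, for illustration only.
template <typename M>
struct BaggedSketch {
  std::unique_ptr<M> model;
  std::vector<int> sample_indices;

  BaggedSketch(std::unique_ptr<M> m, std::vector<int> idx)
      : model(std::move(m)), sample_indices(std::move(idx)) {}

  // Compares the wrapped models only; sample_indices is ignored.
  // Assumes both model pointers are non-null.
  bool operator==(BaggedSketch const& other) const {
    return *model == *other.model;
  }
  bool operator!=(BaggedSketch const& other) const {
    return !(*this == other);
  }
};
```

Under this definition, a round-trip test that also cares about the bootstrap sample has to check `sample_indices` as a separate assertion.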
| types::OutcomeVector ppforest2::Bagged< M >::predict (types::FeatureMatrix const &x) const | inline |
Delegate batch prediction to the wrapped model.
| types::Outcome ppforest2::Bagged< M >::predict (types::FeatureVector const &x) const | inline |
Delegate single-row prediction to the wrapped model.
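| types::OutcomeVector ppforest2::Bagged< M >::predict_oob (types::FeatureMatrix const &x, std::vector< int > const &row_idx) const | inline |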
Predict a subset of rows (typically OOB indices).
The returned vector has the same size as row_idx; element i is the wrapped model's prediction for row row_idx[i] of x.
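The indexing contract can be sketched with a row-by-row loop. The `std::vector`-based aliases, `SumModel`, and `predict_rows_sketch` are all hypothetical; the real method may gather rows differently (for example into a sub-matrix before a batch call), but the output layout should match the description above.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-ins for the types:: aliases (the real ones are not shown here).
using Outcome = double;
using FeatureVector = std::vector<double>;
using FeatureMatrix = std::vector<FeatureVector>;
using OutcomeVector = std::vector<Outcome>;

// Hypothetical model: predicts the sum of a row's features.
struct SumModel {
  Outcome predict(FeatureVector const& row) const {
    return std::accumulate(row.begin(), row.end(), 0.0);
  }
};

// Illustrative helper (hypothetical name): out[i] is the model's prediction
// for row row_idx[i] of x, so out.size() == row_idx.size().
template <typename M>
OutcomeVector predict_rows_sketch(M const& model, FeatureMatrix const& x,
                                  std::vector<int> const& row_idx) {
  OutcomeVector out;
  out.reserve(row_idx.size());
  for (int r : row_idx) {
    // at() throws std::out_of_range on a bad index instead of reading past the end.
    out.push_back(model.predict(x.at(static_cast<std::size_t>(r))));
  }
  return out;
}
```

Note that the result is ordered by `row_idx`, not by row number, so passing the output of an OOB index query yields predictions aligned with that list.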
| std::unique_ptr<M> ppforest2::Bagged< M >::model |
The bootstrap-trained model.
| std::vector<int> ppforest2::Bagged< M >::sample_indices |
Row indices (into the original training set) used to train model.