ppforest2 v0.1.0
Projection Pursuit Decision Trees and Random Forests
Simulation.hpp
#pragma once

#include "stats/Stats.hpp"
#include "utils/Types.hpp"

#include <vector>

namespace ppforest2::stats {
  struct SimulationParams {
    float mean = 100.0f;
    float mean_separation = 50.0f;
    float sd = 10.0f;
  };

  DataPacket simulate(int n, int p, int G, RNG& rng, SimulationParams const& params = SimulationParams{});

  struct Split {
    std::vector<int> tr;
    std::vector<int> te;
  };

  Split split(DataPacket const& data, float train_ratio, RNG& rng);
}
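The header only declares simulate, so the following is a minimal sketch of what such a generator might look like, not the library's implementation. It assumes that rows are assigned to groups round-robin and that every feature of a row in group g is drawn from a normal distribution with mean `mean + g * mean_separation` and standard deviation `sd`. `SimData` and `simulate_sketch` are hypothetical names, and `std::mt19937` stands in for the library's pcg32-based RNG.

```cpp
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Mirrors the three fields documented for SimulationParams.
struct SimulationParams {
  float mean = 100.0f;
  float mean_separation = 50.0f;
  float sd = 10.0f;
};

// Hypothetical stand-in for DataPacket: one feature row and one group label per observation.
struct SimData {
  std::vector<std::vector<float>> x;
  std::vector<int> y;
};

// Assumption: round-robin group assignment; each feature of a row in group g
// is drawn from Normal(mean + g * mean_separation, sd).
SimData simulate_sketch(int n, int p, int G, std::mt19937& rng,
                        SimulationParams const& params = SimulationParams{}) {
  assert(n > 0 && p > 0 && G > 0);
  SimData out;
  out.x.reserve(static_cast<std::size_t>(n));
  out.y.reserve(static_cast<std::size_t>(n));
  for (int i = 0; i < n; ++i) {
    int const g = i % G;
    std::normal_distribution<float> dist(
        params.mean + static_cast<float>(g) * params.mean_separation, params.sd);
    std::vector<float> row(static_cast<std::size_t>(p));
    for (float& v : row) v = dist(rng);
    out.x.push_back(std::move(row));
    out.y.push_back(g);
  }
  return out;
}
```

With the default parameters this sketch centers group 0 near 100, group 1 near 150, and group 2 near 200, so the groups are separated by five standard deviations.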
ppforest2::stats
Statistical infrastructure for training and evaluation.
Definition ConfusionMatrix.hpp:11
Split split(DataPacket const &data, float train_ratio, RNG &rng)
Perform a stratified random train/test split on a DataPacket.
DataPacket simulate(int n, int p, int G, RNG &rng, SimulationParams const &params=SimulationParams{})
Generate a simulated dataset with G groups, n rows, and p features.
pcg32 RNG
Definition Stats.hpp:19
DataPacket
Bundled dataset: features, responses, and group labels.
Definition DataPacket.hpp:18
SimulationParams
Parameters for generating simulated classification data.
Definition Simulation.hpp:17
float sd
Standard deviation within each group.
Definition Simulation.hpp:20
float mean_separation
Mean shift between successive groups.
Definition Simulation.hpp:19
float mean
Base mean for the first group.
Definition Simulation.hpp:18
Split
Indices for a train/test split.
Definition Simulation.hpp:41
std::vector<int> te
Test set indices.
Definition Simulation.hpp:43
std::vector<int> tr
Training set indices.
Definition Simulation.hpp:42
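
The stratified split is only declared in this header, so the following is a minimal sketch of one common way to produce the tr and te index vectors, not the library's implementation. It assumes a plain vector of group labels in place of a DataPacket, shuffles the indices within each group, and sends the first floor(train_ratio * group size) of them to the training set. The rounding rule is an assumption. `stratified_split_sketch` is a hypothetical name, and `std::mt19937` stands in for the library's pcg32-based RNG.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <vector>

// Mirrors the Split struct: training and test row indices.
struct Split {
  std::vector<int> tr;
  std::vector<int> te;
};

// Assumption: labels[i] is the group of row i; each group is shuffled independently
// and contributes floor(train_ratio * group_size) indices to the training set.
Split stratified_split_sketch(std::vector<int> const& labels, float train_ratio,
                              std::mt19937& rng) {
  assert(train_ratio >= 0.0f && train_ratio <= 1.0f);

  // Collect the row indices belonging to each group.
  std::map<int, std::vector<int>> by_group;
  for (std::size_t i = 0; i < labels.size(); ++i) {
    by_group[labels[i]].push_back(static_cast<int>(i));
  }

  Split out;
  for (auto& entry : by_group) {
    std::vector<int>& idx = entry.second;
    std::shuffle(idx.begin(), idx.end(), rng);
    auto const n_train =
        static_cast<std::ptrdiff_t>(train_ratio * static_cast<float>(idx.size()));
    out.tr.insert(out.tr.end(), idx.begin(), idx.begin() + n_train);
    out.te.insert(out.te.end(), idx.begin() + n_train, idx.end());
  }
  return out;
}
```

Splitting per group keeps the class proportions of the training and test sets close to those of the full data, which a single global shuffle does not guarantee for small or unbalanced groups.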