Statistical infrastructure for training and evaluation.
Definition ConfusionMatrix.hpp:11
Split split(DataPacket const &data, float train_ratio, RNG &rng)
Perform a stratified random train/test split on a DataPacket.
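The idea behind a stratified split can be sketched as follows. This is a minimal, self-contained illustration and not the library's implementation: `SplitSketch` and `stratified_split_sketch` are hypothetical names, `std::mt19937` stands in for the `pcg32`-based `RNG`, the input is a plain vector of group labels instead of a `DataPacket`, and the rounding rule for the per-group training count is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <random>
#include <vector>

// Hypothetical stand-in for the Split type: row indices only.
struct SplitSketch {
  std::vector<int> tr;  // training set indices
  std::vector<int> te;  // test set indices
};

// Stratified split over group labels: each group contributes roughly
// train_ratio of its rows to the training set.
inline SplitSketch stratified_split_sketch(std::vector<int> const &groups,
                                           float train_ratio,
                                           std::mt19937 &rng) {
  assert(train_ratio >= 0.0f && train_ratio <= 1.0f);

  // Bucket row indices by group label.
  std::map<int, std::vector<int>> by_group;
  for (int i = 0; i < static_cast<int>(groups.size()); ++i) {
    by_group[groups[i]].push_back(i);
  }

  SplitSketch out;
  for (auto &entry : by_group) {
    std::vector<int> &idx = entry.second;
    // Shuffle within the group so the selection is random.
    std::shuffle(idx.begin(), idx.end(), rng);

    // Assumption: round to nearest; the real rounding rule is not documented here.
    std::size_t n_train = static_cast<std::size_t>(
        std::lround(train_ratio * static_cast<float>(idx.size())));
    n_train = std::min(n_train, idx.size());

    auto cut = idx.begin() + static_cast<std::ptrdiff_t>(n_train);
    out.tr.insert(out.tr.end(), idx.begin(), cut);
    out.te.insert(out.te.end(), cut, idx.end());
  }
  return out;
}
```

Splitting within each group, instead of over all rows at once, keeps the group proportions of the training and test sets close to those of the full dataset.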
DataPacket simulate(int n, int p, int G, RNG &rng, SimulationParams const &params=SimulationParams{})
Generate a simulated dataset with G groups, n rows, and p features.
pcg32 RNG
Definition Stats.hpp:19
Bundled dataset: features, responses, and group labels.
Definition DataPacket.hpp:18
Parameters for generating simulated classification data.
Definition Simulation.hpp:17
float sd
Standard deviation within each group.
Definition Simulation.hpp:20
float mean_separation
Mean shift between successive groups.
Definition Simulation.hpp:19
float mean
Base mean for the first group.
Definition Simulation.hpp:18
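One plausible reading of these three parameters is that group g (counting from 0) has mean `mean + g * mean_separation`, with Gaussian noise of standard deviation `sd` around it. The sketch below illustrates that reading only. `SimParamsSketch`, `SimDataSketch`, `group_mean_sketch`, and `simulate_sketch` are hypothetical names, the default values are placeholders, `std::mt19937` stands in for `RNG`, and both the normal distribution and the round-robin group assignment are assumptions not confirmed by this reference.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical mirror of SimulationParams; defaults are placeholders.
struct SimParamsSketch {
  float mean = 0.0f;             // base mean for the first group
  float mean_separation = 1.0f;  // mean shift between successive groups
  float sd = 1.0f;               // standard deviation within each group
};

// Hypothetical container: row-major features plus one group label per row.
struct SimDataSketch {
  std::vector<float> x;    // n * p values, row-major
  std::vector<int> group;  // n labels in [0, G)
};

// Mean of group g under the assumed linear shift.
inline float group_mean_sketch(SimParamsSketch const &params, int g) {
  return params.mean + static_cast<float>(g) * params.mean_separation;
}

inline SimDataSketch simulate_sketch(
    int n, int p, int G, std::mt19937 &rng,
    SimParamsSketch const &params = SimParamsSketch{}) {
  assert(n > 0 && p > 0 && G > 0);
  assert(params.sd > 0.0f);

  SimDataSketch out;
  out.x.reserve(static_cast<std::size_t>(n) * static_cast<std::size_t>(p));
  out.group.reserve(static_cast<std::size_t>(n));

  for (int i = 0; i < n; ++i) {
    int g = i % G;  // assumption: rows are assigned to groups round-robin
    std::normal_distribution<float> dist(group_mean_sketch(params, g),
                                         params.sd);
    out.group.push_back(g);
    for (int j = 0; j < p; ++j) {
      out.x.push_back(dist(rng));
    }
  }
  return out;
}
```

Under this reading, the ratio of `mean_separation` to `sd` controls how much neighbouring groups overlap, and therefore how hard the simulated classification problem is.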
Indices for a train/test split.
Definition Simulation.hpp:41
std::vector< int > te
Test set indices.
Definition Simulation.hpp:43
std::vector< int > tr
Training set indices.
Definition Simulation.hpp:42
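Because the split stores row indices and not copies of the data, a caller has to gather the selected rows itself. The sketch below shows one way to do that. `take_rows_sketch` is a hypothetical helper, and the flat row-major feature buffer is an assumption about layout; the actual `DataPacket` storage is not described in this reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copy the rows listed in idx out of a row-major buffer with p columns.
inline std::vector<float> take_rows_sketch(std::vector<float> const &x, int p,
                                           std::vector<int> const &idx) {
  assert(p > 0);
  std::size_t const cols = static_cast<std::size_t>(p);
  assert(x.size() % cols == 0);
  std::size_t const n_rows = x.size() / cols;

  std::vector<float> out;
  out.reserve(idx.size() * cols);
  for (int row : idx) {
    assert(row >= 0 && static_cast<std::size_t>(row) < n_rows);
    std::size_t const start = static_cast<std::size_t>(row) * cols;
    // Append the p values of this row in their original order.
    for (std::size_t j = 0; j < cols; ++j) {
      out.push_back(x[start + j]);
    }
  }
  return out;
}
```

Calling such a helper once with `tr` and once with `te` would yield the training and test feature blocks in the order the indices are stored.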