Definition Simulation.hpp:10
Statistical infrastructure for training and evaluation.
Definition ConfusionMatrix.hpp:11
Split split(DataPacket const &data, float train_ratio, RNG &rng)
Perform a stratified random train/test split on a DataPacket.
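A stratified split can be sketched as follows. This is a minimal illustration, not the library's implementation: `SplitSketch` and `stratified_split` are hypothetical names, the function takes a plain label vector rather than a `DataPacket`, it accepts any standard-conforming generator in place of `pcg32`, and rounding each group's training count to the nearest integer is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <vector>

// Hypothetical stand-in for the documented Split: train and test indices.
struct SplitSketch {
  std::vector<int> tr;  // training set indices
  std::vector<int> te;  // test set indices
};

// Stratified split over group labels: shuffle the indices of each group
// separately, then send the first round(train_ratio * group_size) of them
// to the training set and the rest to the test set.
template <class Rng>
SplitSketch stratified_split(std::vector<int> const &labels, float train_ratio, Rng &rng) {
  assert(train_ratio >= 0.0f && train_ratio <= 1.0f);

  // Bucket observation indices by group label.
  std::map<int, std::vector<int>> by_group;
  for (std::size_t i = 0; i < labels.size(); ++i) {
    by_group[labels[i]].push_back(static_cast<int>(i));
  }

  SplitSketch out;
  for (auto &entry : by_group) {
    std::vector<int> &idx = entry.second;
    std::shuffle(idx.begin(), idx.end(), rng);

    // Assumption: round to nearest, clamped to the group size.
    auto n_train = static_cast<std::size_t>(
        train_ratio * static_cast<float>(idx.size()) + 0.5f);
    n_train = std::min(n_train, idx.size());
    auto const cut = idx.begin() + static_cast<std::ptrdiff_t>(n_train);

    out.tr.insert(out.tr.end(), idx.begin(), cut);
    out.te.insert(out.te.end(), cut, idx.end());
  }

  // Sorted output keeps the result independent of map iteration details.
  std::sort(out.tr.begin(), out.tr.end());
  std::sort(out.te.begin(), out.te.end());
  return out;
}
```

Because each group is cut separately, the training set keeps roughly the same group proportions as the full dataset, which a plain shuffle of all indices does not guarantee.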
DataPacket simulate(int n, int p, int G, RNG &rng, simulation::params::Classification const &params={})
Generate a simulated classification dataset.
pcg32 RNG
Definition Stats.hpp:24
float Feature
Scalar type for feature values.
Definition Types.hpp:24
Bundled dataset: features, response, and group labels.
Definition DataPacket.hpp:19
Indices for a train/test split.
Definition Simulation.hpp:57
std::vector< int > te
Test set indices.
Definition Simulation.hpp:59
std::vector< int > tr
Training set indices.
Definition Simulation.hpp:58
Classification simulation: group-shifted normals.
Definition Simulation.hpp:17
types::Feature sd
Standard deviation within each group.
Definition Simulation.hpp:20
types::Feature mean
Base mean for the first group.
Definition Simulation.hpp:18
types::Feature mean_separation
Mean shift between successive groups.
Definition Simulation.hpp:19
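The three parameters above suggest a generator along these lines. This is a sketch under stated assumptions rather than the library's code: the type and function names are hypothetical, the default values are placeholders, group `g` is assumed to be centred at `mean + g * mean_separation`, observations are assigned to groups round-robin, and features are stored row-major.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Feature = float;

// Hypothetical mirror of the documented Classification parameters.
// The default values are placeholders, not the library's defaults.
struct ClassificationParamsSketch {
  Feature mean = 0.0f;             // base mean for the first group
  Feature mean_separation = 2.0f;  // mean shift between successive groups
  Feature sd = 1.0f;               // standard deviation within each group
};

struct SimulatedSketch {
  std::vector<Feature> x;  // n * p feature values, row-major
  std::vector<int> group;  // n group labels in [0, G)
};

// Draw n observations with p features each from G group-shifted normals.
template <class Rng>
SimulatedSketch simulate_groups(int n, int p, int G, Rng &rng,
                                ClassificationParamsSketch const &params = {}) {
  assert(n > 0 && p > 0 && G > 0 && params.sd > 0.0f);

  SimulatedSketch out;
  out.x.resize(static_cast<std::size_t>(n) * static_cast<std::size_t>(p));
  out.group.resize(static_cast<std::size_t>(n));

  for (int i = 0; i < n; ++i) {
    int const g = i % G;  // assumption: round-robin group assignment
    out.group[static_cast<std::size_t>(i)] = g;

    // Assumption: group g is centred at mean + g * mean_separation.
    Feature const centre = params.mean + static_cast<Feature>(g) * params.mean_separation;
    std::normal_distribution<Feature> dist(centre, params.sd);

    for (int j = 0; j < p; ++j) {
      out.x[static_cast<std::size_t>(i) * static_cast<std::size_t>(p) +
            static_cast<std::size_t>(j)] = dist(rng);
    }
  }
  return out;
}
```

With `mean_separation` large relative to `sd` the groups are nearly separable; shrinking it makes the classification problem harder.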
Regression simulation: linear model over i.i.d. features.
Definition Simulation.hpp:31
types::Feature sd
Standard deviation of feature values.
Definition Simulation.hpp:35
types::Feature y_intercept
Base intercept added to every response.
Definition Simulation.hpp:33
int n_informative
Number of informative features; 0 selects the default, min(p, 5).
Definition Simulation.hpp:32
types::Feature y_sd
Standard deviation of response noise.
Definition Simulation.hpp:34
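A linear-model simulation with these four parameters might look like the sketch below. Several details are assumptions that the reference does not state: features are drawn with mean zero, every informative feature has coefficient 1, the informative features are the first ones, and a requested count larger than `p` is clamped to `p`. All names and default values are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Feature = float;

// Hypothetical mirror of the documented Regression parameters.
// The default values are placeholders, not the library's defaults.
struct RegressionParamsSketch {
  int n_informative = 0;       // 0 selects min(p, 5)
  Feature y_intercept = 0.0f;  // base intercept added to every response
  Feature y_sd = 1.0f;         // standard deviation of response noise
  Feature sd = 1.0f;           // standard deviation of feature values
};

// Resolve the documented "0 -> min(p, 5)" rule.
// Assumption: a nonzero request is clamped to p.
inline int effective_informative(int p, int n_informative) {
  if (n_informative == 0) return std::min(p, 5);
  return std::min(p, n_informative);
}

struct RegressionSketch {
  std::vector<Feature> x;  // n * p feature values, row-major
  std::vector<Feature> y;  // n responses
};

template <class Rng>
RegressionSketch simulate_regression(int n, int p, Rng &rng,
                                     RegressionParamsSketch const &params = {}) {
  assert(n > 0 && p > 0 && params.n_informative >= 0);
  assert(params.sd > 0.0f && params.y_sd > 0.0f);

  int const k = effective_informative(p, params.n_informative);
  std::normal_distribution<Feature> feature_dist(0.0f, params.sd);  // assumed mean 0
  std::normal_distribution<Feature> noise_dist(0.0f, params.y_sd);

  RegressionSketch out;
  out.x.resize(static_cast<std::size_t>(n) * static_cast<std::size_t>(p));
  out.y.resize(static_cast<std::size_t>(n));

  for (int i = 0; i < n; ++i) {
    Feature signal = params.y_intercept;
    for (int j = 0; j < p; ++j) {
      Feature const v = feature_dist(rng);
      out.x[static_cast<std::size_t>(i) * static_cast<std::size_t>(p) +
            static_cast<std::size_t>(j)] = v;
      if (j < k) signal += v;  // assumption: unit coefficients on the first k features
    }
    out.y[static_cast<std::size_t>(i)] = signal + noise_dist(rng);
  }
  return out;
}
```

Features beyond the first `k` contribute nothing to the response, so they act as pure noise columns for testing feature selection.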