| Signature | Description | Parameters |
|---|---|---|
| `#include <DataFrame/DataFrameStatsVisitors.h>`<br>`template<typename T, typename I = unsigned long> struct CramerVonMisesTestVisitor;`<br>`template<typename T, typename I = unsigned long> using cvonm_test_v = CramerVonMisesTestVisitor<T, I>;` | This functor class calculates the Cramer-von Mises test.<br>The Cramer-von Mises test is a statistical test used to assess how well a sample dataset fits a specified distribution; this visitor tests the sample against the normal distribution. It is a non-parametric goodness-of-fit test, meaning it does not rely on assumptions about the specific distribution of the data. It is often used as an alternative to the Kolmogorov-Smirnov test, especially when deviations in the tails of the distribution are important.<br>The test statistic is either compared against critical values or converted to a p-value. This determines whether the observed difference is statistically significant, which would indicate a poor fit of the data to the hypothesized distribution.<br>`get_result()` returns the test statistic (W²). It quantifies the difference between the sample's empirical cumulative distribution function (ECDF) and the cumulative distribution function (CDF) of the normal distribution. A higher test statistic indicates a greater discrepancy between the sample and the normal distribution.<br>`get_p_value()` returns the p-value, which lies between 0 and 1. Values close to 1 mean there is not enough evidence to conclude that the sample differs significantly from the normal distribution; values close to 0 are evidence against normality.<br>`CramerVonMisesTestVisitor();` | `T`: Column data type.<br>`I`: Index type. |
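
For reference, the textbook form of the statistic that the description calls W² is given below. This is the standard one-sample definition, not a statement about how the visitor computes it internally; the symbols $n$ (sample size), $x_{(i)}$ (the $i$-th smallest observation), $F_n$ (the ECDF) and $F$ (the hypothesized CDF, here the normal CDF) are notation introduced for this illustration.

$$
W^2 \;=\; n \int_{-\infty}^{\infty} \bigl[F_n(x) - F(x)\bigr]^2 \, dF(x)
\;=\; \frac{1}{12n} + \sum_{i=1}^{n} \left[ \frac{2i - 1}{2n} - F\bigl(x_{(i)}\bigr) \right]^2
$$

Each term of the sum compares the ECDF's midpoint height at the $i$-th ordered observation with the value the hypothesized CDF assigns to it, so a sample that tracks $F$ closely gives a value near the minimum $1/(12n)$.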

    static void test_CramerVonMisesTestVisitor()  {

        std::cout << "\nTesting CramerVonMisesTestVisitor{ } ..." << std::endl;

        StrDataFrame    ibm;

        // Load the IBM price data; bail out if the file cannot be read
        try  {
            ibm.read("IBM.csv", io_format::csv2);
        }
        catch (const DataFrameError &ex)  {
            std::cout << ex.what() << std::endl;
            ::exit(-1);
        }

        // Add synthetic columns drawn from known distributions
        const auto              col_s = ibm.get_index().size();
        RandGenParams<double>   p1 { .min_value = 99, .max_value = 200, .seed = 123 };

        ibm.load_column("uniform", gen_uniform_real_dist<double>(col_s, p1));
        ibm.load_column("exponential", gen_exponential_dist<double>(col_s, p1));
        ibm.load_column("lognormal", gen_lognormal_dist<double>(col_s, p1));
        ibm.load_column("normal", gen_normal_dist<double>(col_s, p1));

        RandGenParams<double>   p2 { .seed = 123, .mean = 0, .std = 1.0 };

        ibm.load_column("std_normal", gen_normal_dist<double>(col_s, p2));

        CramerVonMisesTestVisitor<double, std::string>  cvmt;

        // Non-normal columns: large W2 and a p-value at the numerical floor
        ibm.single_act_visit<double>("IBM_Close", cvmt);
        assert((std::fabs(cvmt.get_result() - 8.56304) < 0.00001));
        assert((std::fabs(cvmt.get_p_value() - 2.22045e-16) < 0.0000000000001));

        ibm.single_act_visit<double>("uniform", cvmt);
        assert((std::fabs(cvmt.get_result() - 7.78703) < 0.00001));
        assert((std::fabs(cvmt.get_p_value() - 2.22045e-16) < 0.0000000000001));

        ibm.single_act_visit<double>("exponential", cvmt);
        assert((std::fabs(cvmt.get_result() - 39.0971) < 0.0001));
        assert((std::fabs(cvmt.get_p_value() - 2.22045e-16) < 0.0000000000001));

        ibm.single_act_visit<double>("lognormal", cvmt);
        assert((std::fabs(cvmt.get_result() - 96.4868) < 0.0001));
        assert((std::fabs(cvmt.get_p_value() - 2.22045e-16) < 0.0000000000001));

        // Normal columns: small W2 and a p-value close to 1
        ibm.single_act_visit<double>("normal", cvmt);
        assert((std::fabs(cvmt.get_result() - 0.040035) < 0.000001));
        assert((std::fabs(cvmt.get_p_value() - 0.99) < 0.01));

        ibm.single_act_visit<double>("std_normal", cvmt);
        assert((std::fabs(cvmt.get_result() - 0.040035) < 0.000001));
        assert((std::fabs(cvmt.get_p_value() - 0.99) < 0.01));
    }
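
To make the W² value returned by `get_result()` more concrete, here is a minimal standalone sketch of the textbook one-sample Cramer-von Mises statistic against a normal distribution. It is not the library's implementation. The function names `std_normal_cdf` and `cramer_von_mises_w2` are hypothetical, and standardizing the sample with its own mean and standard deviation is an assumption of this sketch. It does not compute a p-value.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Standard normal CDF expressed through the complementary error function.
inline double std_normal_cdf(double z)  {
    return 0.5 * std::erfc(-z / std::sqrt(2.0));
}

// Hypothetical helper: textbook Cramer-von Mises W^2 of a sample against a
// normal distribution. ASSUMPTION: the normal's mean and standard deviation
// are estimated from the sample itself (sample std with n - 1 denominator).
//
//   W^2 = 1 / (12 n) + sum_{i=1..n} [ (2i - 1) / (2n) - F(x_(i)) ]^2
//
// Precondition: at least two values, not all identical.
inline double cramer_von_mises_w2(std::vector<double> sample)  {
    const std::size_t   n = sample.size();

    assert(n >= 2);

    const double    mean =
        std::accumulate(sample.begin(), sample.end(), 0.0) / n;
    double          sum_sq = 0.0;

    for (const double x : sample)
        sum_sq += (x - mean) * (x - mean);

    const double    sd = std::sqrt(sum_sq / (n - 1));

    assert(sd > 0.0);

    // The formula is defined on the ordered sample x_(1) <= ... <= x_(n)
    std::sort(sample.begin(), sample.end());

    double  w2 = 1.0 / (12.0 * n);

    for (std::size_t i = 0; i < n; ++i)  {
        const double    cdf = std_normal_cdf((sample[i] - mean) / sd);
        const double    diff = (2.0 * (i + 1) - 1.0) / (2.0 * n) - cdf;

        w2 += diff * diff;
    }
    return w2;
}
```

Because the sample is standardized before the comparison, the sketch gives the same value for a sample and for any shifted or rescaled copy of it. That is consistent with the test above, where the "normal" and "std_normal" columns are asserted against the same statistic, but the source does not say whether the visitor takes this exact approach.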