| Signature | Description |
|---|---|
| `template<typename T> struct CanonCorrResult { std::vector<T> coeffs { }; T x_red_idx { }; T y_red_idx { }; };` | Result of a Canonical Correlation Analysis, as returned by the canon_corr() interface. `coeffs`: canonical correlation coefficients. These values represent the strength of the linear relationship between each pair of canonical variates, ranging from -1 to 1, with higher absolute values signifying a stronger association. `x_red_idx`: redundancy index for X. `y_red_idx`: redundancy index for Y. The redundancy index is a measure that indicates how much variance in one set of variables is explained by the linear combination of the other set of variables. It was proposed by Stewart and Love (1968). |
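
To make the redundancy index more concrete, here is a minimal sketch of the commonly cited form of Stewart and Love's measure: for each canonical variate, the average squared loading of a set's variables on that variate, multiplied by the squared canonical correlation, summed over all variates. `redundancy_index` and its `loadings` argument are hypothetical names used for illustration. They are not part of the DataFrame API, and this page does not state whether canon_corr() computes the index in exactly this way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the DataFrame API.
// loadings[i][j] is the correlation between the i-th canonical variate of a
// set and the j-th original variable of that same set. coeffs[i] is the i-th
// canonical correlation coefficient.
double redundancy_index(const std::vector<std::vector<double>> &loadings,
                        const std::vector<double> &coeffs) {
    assert(loadings.size() == coeffs.size());

    double total = 0.0;

    for (std::size_t i = 0; i < coeffs.size(); ++i) {
        assert(!loadings[i].empty());

        // Share of the set's variance captured by its i-th canonical variate
        double var_extracted = 0.0;

        for (const double l : loadings[i])
            var_extracted += l * l;
        var_extracted /= static_cast<double>(loadings[i].size());

        // Scale by the variance shared between the paired canonical variates
        total += var_extracted * coeffs[i] * coeffs[i];
    }
    return total;
}
```

With two variates that each coincide with one of two variables (loadings `{1, 0}` and `{0, 1}`) and canonical correlations 1.0 and 0.5, the sketch gives 0.5 * 1.0 + 0.5 * 0.25 = 0.625.
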
| Signature | Description | Parameters |
|---|---|---|
| `template<typename T> CanonCorrResult<T> canon_corr(std::vector<const char *> &&X_col_names, std::vector<const char *> &&Y_col_names) const;` | Performs Canonical Correlation Analysis (CCA) between two sets of columns, X and Y, and returns the result in the CanonCorrResult struct defined above. CCA is a statistical method for examining and measuring correlations between two sets of variables. It looks for linear combinations of the variables within each set, also referred to as canonical variables, such that the correlation between them is maximized. The main objective is to find relationships and patterns of linkage between the two groups. NOTE: Both sets must contain the same number of columns. | T: Type of the named columns; X_col_names: Names of the first set of columns; Y_col_names: Names of the second set of columns |

    static void test_canon_corr()  {

        std::cout << "\nTesting canon_corr( ) ..." << std::endl;

        StrDataFrame    df;

        try  {
            df.read("IBM.csv", io_format::csv2);
        }
        catch (const DataFrameError &ex)  {
            std::cout << ex.what() << std::endl;
        }

        const auto  result =
            df.canon_corr<double>({ "IBM_Close", "IBM_Open" },
                                  { "IBM_High", "IBM_Low" });

        assert(result.coeffs.size() == 2);
        assert(std::fabs(result.coeffs[0] - 0.999944) < 0.000001);
        assert(std::fabs(result.coeffs[1] - 0.262927) < 0.000001);
        assert(std::fabs(result.x_red_idx - 0.534073) < 0.000001);
        assert(std::fabs(result.y_red_idx - 0.535897) < 0.000001);
    }
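
The maximization that CCA performs can also be sketched without the library for the case of two columns per set. This is only an illustration under stated assumptions: `canon_corr_2x2`, `Mat2`, and the small helpers are hypothetical names, and the approach (square roots of the eigenvalues of Sxx^-1 Sxy Syy^-1 Syx, built from sample covariances) is the textbook formulation, not necessarily how canon_corr() is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2x2 matrix in row-major order: [a b; c d]
struct Mat2 { double a, b, c, d; };

static Mat2 mul(const Mat2 &x, const Mat2 &y) {
    return Mat2 { x.a * y.a + x.b * y.c, x.a * y.b + x.b * y.d,
                  x.c * y.a + x.d * y.c, x.c * y.b + x.d * y.d };
}

static Mat2 transpose(const Mat2 &m) { return Mat2 { m.a, m.c, m.b, m.d }; }

static Mat2 inverse(const Mat2 &m) {
    const double det = m.a * m.d - m.b * m.c;

    assert(det != 0.0);  // Columns within a set must not be collinear
    return Mat2 { m.d / det, -m.b / det, -m.c / det, m.a / det };
}

// Sample covariance of two equally sized columns (n - 1 denominator)
static double sample_cov(const std::vector<double> &u,
                         const std::vector<double> &v) {
    assert(u.size() == v.size() && u.size() > 1);

    const double n = static_cast<double>(u.size());
    double mean_u = 0.0, mean_v = 0.0;

    for (std::size_t i = 0; i < u.size(); ++i) {
        mean_u += u[i];
        mean_v += v[i];
    }
    mean_u /= n;
    mean_v /= n;

    double acc = 0.0;

    for (std::size_t i = 0; i < u.size(); ++i)
        acc += (u[i] - mean_u) * (v[i] - mean_v);
    return acc / (n - 1.0);
}

// Canonical correlations between X = {x1, x2} and Y = {y1, y2}, largest first
std::vector<double> canon_corr_2x2(const std::vector<double> &x1,
                                   const std::vector<double> &x2,
                                   const std::vector<double> &y1,
                                   const std::vector<double> &y2) {
    const Mat2 sxx { sample_cov(x1, x1), sample_cov(x1, x2),
                     sample_cov(x2, x1), sample_cov(x2, x2) };
    const Mat2 syy { sample_cov(y1, y1), sample_cov(y1, y2),
                     sample_cov(y2, y1), sample_cov(y2, y2) };
    const Mat2 sxy { sample_cov(x1, y1), sample_cov(x1, y2),
                     sample_cov(x2, y1), sample_cov(x2, y2) };

    // The eigenvalues of this matrix are the squared canonical correlations
    const Mat2 m = mul(mul(inverse(sxx), sxy),
                       mul(inverse(syy), transpose(sxy)));

    // Closed-form eigenvalues of a 2x2 matrix from its trace and determinant
    const double tr = m.a + m.d;
    const double det = m.a * m.d - m.b * m.c;
    const double root = std::sqrt(std::max(0.0, tr * tr - 4.0 * det));

    // Clamp to [0, 1] to absorb floating-point round-off
    const double lam1 = std::min(1.0, std::max(0.0, (tr + root) / 2.0));
    const double lam2 = std::min(1.0, std::max(0.0, (tr - root) / 2.0));

    return std::vector<double> { std::sqrt(lam1), std::sqrt(lam2) };
}
```

For example, if both sets share one column and their remaining columns are uncorrelated with everything else, the sketch yields one canonical correlation of 1 and one of 0. If the two sets are identical, both correlations are 1.
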