OpenMS
LinearRegression Class Reference

This class offers functions to perform least-squares fits to a straight line model, \( Y(c,x) = c_0 + c_1 x \).

#include <OpenMS/ML/REGRESSION/LinearRegression.h>


Public Member Functions

 LinearRegression ()
 Constructor.
 
virtual ~LinearRegression ()=default
 Destructor.
 
void computeRegression (double confidence_interval_P, std::vector< double >::const_iterator x_begin, std::vector< double >::const_iterator x_end, std::vector< double >::const_iterator y_begin, bool compute_goodness=true)
 This function computes the best-fit linear regression coefficients \( (c_0,c_1) \) of the model \( Y = c_0 + c_1 X \) for the dataset \( (x, y) \).
 
void computeRegressionWeighted (double confidence_interval_P, std::vector< double >::const_iterator x_begin, std::vector< double >::const_iterator x_end, std::vector< double >::const_iterator y_begin, std::vector< double >::const_iterator w_begin, bool compute_goodness=true)
 This function computes the best-fit linear regression coefficients \( (c_0,c_1) \) of the model \( Y = c_0 + c_1 X \) for the weighted dataset \( (x, y) \).
 
double getIntercept () const
 Non-mutable access to the y-intercept of the straight line.
 
double getSlope () const
 Non-mutable access to the slope of the straight line.
 
double getXIntercept () const
 Non-mutable access to the x-intercept of the straight line.
 
double getLower () const
 Non-mutable access to the lower border of confidence interval.
 
double getUpper () const
 Non-mutable access to the upper border of confidence interval.
 
double getTValue () const
 Non-mutable access to the value of the t-distribution.
 
double getRSquared () const
 Non-mutable access to the squared Pearson coefficient.
 
double getStandDevRes () const
 Non-mutable access to the standard deviation of the residuals.
 
double getMeanRes () const
 Non-mutable access to the residual mean.
 
double getStandErrSlope () const
 Non-mutable access to the standard error of the slope.
 
double getChiSquared () const
 Non-mutable access to the chi squared value.
 
double getRSD () const
 Non-mutable access to relative standard deviation.
 

Static Public Member Functions

static double computePointY (double x, double slope, double intercept)
 Given x, computes y = slope * x + intercept.
 

Protected Member Functions

void computeGoodness_ (const std::vector< double > &X, const std::vector< double > &Y, double confidence_interval_P)
 Computes the goodness of the fitted regression line.
 
template<typename Iterator >
double computeChiSquare (Iterator x_begin, Iterator x_end, Iterator y_begin, double slope, double intercept)
 Compute the chi squared of a linear fit.
 
template<typename Iterator >
double computeWeightedChiSquare (Iterator x_begin, Iterator x_end, Iterator y_begin, Iterator w_begin, double slope, double intercept)
 Compute the chi squared of a weighted linear fit.
 

Protected Attributes

double intercept_
 The intercept of the fitted line with the y-axis.
 
double slope_
 The slope of the fitted line.
 
double x_intercept_
 The intercept of the fitted line with the x-axis.
 
double lower_
 The lower bound of the confidence interval.
 
double upper_
 The upper bound of the confidence interval.
 
double t_star_
 The value of the t-statistic.
 
double r_squared_
 The squared correlation coefficient (Pearson).
 
double stand_dev_residuals_
 The standard deviation of the residuals.
 
double mean_residuals_
 Mean of residuals.
 
double stand_error_slope_
 The standard error of the slope.
 
double chi_squared_
 The value of the Chi Squared statistic.
 
double rsd_
 The relative standard deviation.
 

Private Member Functions

 LinearRegression (const LinearRegression &arg)
 Not implemented.
 
LinearRegression & operator= (const LinearRegression &arg)
 Not implemented.
 

Detailed Description

This class offers functions to perform least-squares fits to a straight line model, \( Y(c,x) = c_0 + c_1 x \).

In addition to the y-intercept and the slope of the fitted line, this class computes the following:

  • squared Pearson coefficient
  • value of the t-distribution
  • standard deviation of the residuals
  • standard error of the slope
  • intercept with the x-axis (useful for additive series experiments)
  • lower border of the confidence interval
  • upper border of the confidence interval
  • chi squared value
  • x mean
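
For reference, the textbook closed-form least-squares estimates for the straight line model are given below. This is the standard result for unweighted data with sample means \( \bar{x} \) and \( \bar{y} \); this page does not state which formulas the class evaluates internally, so treat it as background rather than as a description of the implementation.

\[
c_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
c_0 = \bar{y} - c_1 \bar{x}
\]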

Constructor & Destructor Documentation

◆ LinearRegression() [1/2]

LinearRegression ( )
inline

Constructor.

◆ ~LinearRegression()

virtual ~LinearRegression ( )
virtual default

Destructor.

◆ LinearRegression() [2/2]

LinearRegression ( const LinearRegression & arg)
private

Not implemented.

Member Function Documentation

◆ computeChiSquare()

double computeChiSquare ( Iterator  x_begin,
Iterator  x_end,
Iterator  y_begin,
double  slope,
double  intercept 
)
protected

Compute the chi squared of a linear fit.

References LinearRegression::computePointY().
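
As a rough illustration of what a chi squared computation for a linear fit can look like, the standalone sketch below sums the squared residuals between the observed y-values and the fitted line. The names `pointY` and `chiSquareSketch` are hypothetical and not part of OpenMS, and the assumption that every data point has unit error is mine; the actual implementation may differ.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Same formula as the documented static helper: y = slope * x + intercept.
inline double pointY(double x, double slope, double intercept)
{
  return slope * x + intercept;
}

// Hypothetical stand-in for a chi squared computation (assumes unit errors):
// sum over all points of (observed y - predicted y)^2.
template <typename Iterator>
double chiSquareSketch(Iterator x_begin, Iterator x_end, Iterator y_begin,
                       double slope, double intercept)
{
  double chi_squared = 0.0;
  for (; x_begin != x_end; ++x_begin, ++y_begin)
  {
    const double residual = *y_begin - pointY(*x_begin, slope, intercept);
    chi_squared += residual * residual;
  }
  return chi_squared;
}
```

Because the iterator type is a template parameter, the same sketch works for any container whose iterators dereference to a numeric value.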

◆ computeGoodness_()

void computeGoodness_ ( const std::vector< double > &  X,
const std::vector< double > &  Y,
double  confidence_interval_P 
)
protected

Computes the goodness of the fitted regression line.

◆ computePointY()

static double computePointY ( double  x,
double  slope,
double  intercept 
)
inline static

Given x, computes y = slope * x + intercept.

Referenced by LinearRegression::computeChiSquare(), and LinearRegression::computeWeightedChiSquare().

◆ computeRegression()

void computeRegression ( double  confidence_interval_P,
std::vector< double >::const_iterator  x_begin,
std::vector< double >::const_iterator  x_end,
std::vector< double >::const_iterator  y_begin,
bool  compute_goodness = true 
)

This function computes the best-fit linear regression coefficients \( (c_0,c_1) \) of the model \( Y = c_0 + c_1 X \) for the dataset \( (x, y) \).

The x-values of the dataset \( (x,y) \) are given by the iterator range [x_begin,x_end); the corresponding y-values start at position y_begin.

For an "x %" confidence interval, use confidence_interval_P = x/100; for example, pass 0.95 for a 95% confidence interval. Such an interval is constructed so that, over repeated experiments, it contains the true value of the parameter 95% of the time.

Parameters
  confidence_interval_P  Value between 0-1 to determine lower and upper CI borders.
  x_begin                Begin iterator of x values
  x_end                  End iterator of x values
  y_begin                Begin iterator of y values (same length as x)
  compute_goodness       Compute meta stats about the fit. If this is not done, none of the members (except slope and intercept) are meaningful.
Exceptions
  Exception::UnableToFit  is thrown if fitting cannot be performed
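
To make the iterator-based interface concrete, here is a minimal standalone sketch of an ordinary least-squares fit that takes the same three iterators. It does not use OpenMS: the function name `fitLineSketch` is hypothetical, it throws `std::runtime_error` instead of `Exception::UnableToFit`, and it computes only the slope and intercept, with no goodness statistics.

```cpp
#include <cassert>
#include <cmath>
#include <iterator>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper, not the OpenMS implementation.
// Returns {intercept c0, slope c1} of the least-squares line y = c0 + c1 * x.
std::pair<double, double> fitLineSketch(std::vector<double>::const_iterator x_begin,
                                        std::vector<double>::const_iterator x_end,
                                        std::vector<double>::const_iterator y_begin)
{
  const double n = static_cast<double>(std::distance(x_begin, x_end));
  double sum_x = 0.0, sum_y = 0.0, sum_xx = 0.0, sum_xy = 0.0;
  for (auto x_it = x_begin, y_it = y_begin; x_it != x_end; ++x_it, ++y_it)
  {
    sum_x += *x_it;
    sum_y += *y_it;
    sum_xx += (*x_it) * (*x_it);
    sum_xy += (*x_it) * (*y_it);
  }
  // The denominator is zero when all x values are identical (vertical line).
  const double denom = n * sum_xx - sum_x * sum_x;
  if (n < 2.0 || denom == 0.0)
  {
    throw std::runtime_error("unable to fit: need at least two distinct x values");
  }
  const double slope = (n * sum_xy - sum_x * sum_y) / denom;
  const double intercept = (sum_y - slope * sum_x) / n;
  return {intercept, slope};
}
```

As with the documented function, the caller is responsible for making sure the y range is at least as long as the x range.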

◆ computeRegressionWeighted()

void computeRegressionWeighted ( double  confidence_interval_P,
std::vector< double >::const_iterator  x_begin,
std::vector< double >::const_iterator  x_end,
std::vector< double >::const_iterator  y_begin,
std::vector< double >::const_iterator  w_begin,
bool  compute_goodness = true 
)

This function computes the best-fit linear regression coefficients \( (c_0,c_1) \) of the model \( Y = c_0 + c_1 X \) for the weighted dataset \( (x, y) \).

The x-values of the dataset \( (x, y) \) are given by the iterator range [x_begin,x_end); the corresponding y-values start at position y_begin. The data points are weighted by the values starting at w_begin.

For an "x %" confidence interval, use confidence_interval_P = x/100; for example, pass 0.95 for a 95% confidence interval. Such an interval is constructed so that, over repeated experiments, it contains the true value of the parameter 95% of the time.

Parameters
  confidence_interval_P  Value between 0-1 to determine lower and upper CI borders.
  x_begin                Begin iterator of x values
  x_end                  End iterator of x values
  y_begin                Begin iterator of y values (same length as x)
  w_begin                Begin iterator of weight values (same length as x)
  compute_goodness       Compute meta stats about the fit. If this is not done, none of the members (except slope and intercept) are meaningful.
Exceptions
  Exception::UnableToFit  is thrown if fitting cannot be performed
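
The following standalone sketch shows one common way to fit a weighted straight line, assuming each weight multiplies the squared residual of its data point. This page does not say which weighting convention OpenMS uses, so both that assumption and the name `fitWeightedLineSketch` are illustrative only.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper, not the OpenMS implementation.
// Minimizes sum_i w_i * (y_i - c0 - c1 * x_i)^2 and returns {c0, c1}.
std::pair<double, double> fitWeightedLineSketch(std::vector<double>::const_iterator x_begin,
                                                std::vector<double>::const_iterator x_end,
                                                std::vector<double>::const_iterator y_begin,
                                                std::vector<double>::const_iterator w_begin)
{
  double sw = 0.0, swx = 0.0, swy = 0.0, swxx = 0.0, swxy = 0.0;
  auto y_it = y_begin;
  auto w_it = w_begin;
  for (auto x_it = x_begin; x_it != x_end; ++x_it, ++y_it, ++w_it)
  {
    sw += *w_it;
    swx += *w_it * *x_it;
    swy += *w_it * *y_it;
    swxx += *w_it * *x_it * *x_it;
    swxy += *w_it * *x_it * *y_it;
  }
  // Zero when the points with non-zero weight share a single x value.
  const double denom = sw * swxx - swx * swx;
  if (sw <= 0.0 || denom == 0.0)
  {
    throw std::runtime_error("unable to fit: degenerate weighted data");
  }
  const double slope = (sw * swxy - swx * swy) / denom;
  const double intercept = (swy - slope * swx) / sw;
  return {intercept, slope};
}
```

With all weights equal, this reduces to the unweighted least-squares fit; a weight of zero removes a point from the fit entirely.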

◆ computeWeightedChiSquare()

double computeWeightedChiSquare ( Iterator  x_begin,
Iterator  x_end,
Iterator  y_begin,
Iterator  w_begin,
double  slope,
double  intercept 
)
protected

Compute the chi squared of a weighted linear fit.

References LinearRegression::computePointY().
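
A weighted chi squared can be sketched in the same spirit as the unweighted one, with each squared residual scaled by its weight. The name `weightedChiSquareSketch` is hypothetical, and the choice to multiply (rather than divide) by the weight is an assumption not confirmed by this page.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in (not OpenMS code): sum of w_i * (y_i - (slope * x_i + intercept))^2.
template <typename Iterator>
double weightedChiSquareSketch(Iterator x_begin, Iterator x_end, Iterator y_begin,
                               Iterator w_begin, double slope, double intercept)
{
  double chi_squared = 0.0;
  for (; x_begin != x_end; ++x_begin, ++y_begin, ++w_begin)
  {
    const double residual = *y_begin - (slope * *x_begin + intercept);
    chi_squared += *w_begin * residual * residual;
  }
  return chi_squared;
}
```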

◆ getChiSquared()

double getChiSquared ( ) const

Non-mutable access to the chi squared value.

◆ getIntercept()

double getIntercept ( ) const

Non-mutable access to the y-intercept of the straight line.

◆ getLower()

double getLower ( ) const

Non-mutable access to the lower border of confidence interval.

◆ getMeanRes()

double getMeanRes ( ) const

Non-mutable access to the residual mean.

◆ getRSD()

double getRSD ( ) const

Non-mutable access to relative standard deviation.

◆ getRSquared()

double getRSquared ( ) const

Non-mutable access to the squared Pearson coefficient.
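
For readers unfamiliar with the statistic, the standalone sketch below computes the squared Pearson correlation coefficient of two equally long samples using the textbook definition. The function name `rSquaredSketch` is hypothetical, and the sketch is not taken from the OpenMS source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: r^2 = (sum dx*dy)^2 / (sum dx^2 * sum dy^2),
// where dx and dy are deviations from the respective sample means.
// Returns NaN if either sample is empty or has zero variance.
double rSquaredSketch(const std::vector<double>& X, const std::vector<double>& Y)
{
  const std::size_t n = X.size();
  if (n == 0 || Y.size() != n)
  {
    return std::numeric_limits<double>::quiet_NaN();
  }
  double mean_x = 0.0, mean_y = 0.0;
  for (std::size_t i = 0; i < n; ++i)
  {
    mean_x += X[i];
    mean_y += Y[i];
  }
  mean_x /= static_cast<double>(n);
  mean_y /= static_cast<double>(n);

  double sxy = 0.0, sxx = 0.0, syy = 0.0;
  for (std::size_t i = 0; i < n; ++i)
  {
    const double dx = X[i] - mean_x;
    const double dy = Y[i] - mean_y;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  if (sxx == 0.0 || syy == 0.0)
  {
    return std::numeric_limits<double>::quiet_NaN();
  }
  return (sxy * sxy) / (sxx * syy);
}
```

A value of 1 means the points lie exactly on a straight line with non-zero slope; values near 0 indicate little linear relationship.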

◆ getSlope()

double getSlope ( ) const

Non-mutable access to the slope of the straight line.

◆ getStandDevRes()

double getStandDevRes ( ) const

Non-mutable access to the standard deviation of the residuals.

◆ getStandErrSlope()

double getStandErrSlope ( ) const

Non-mutable access to the standard error of the slope.

◆ getTValue()

double getTValue ( ) const

Non-mutable access to the value of the t-distribution.

◆ getUpper()

double getUpper ( ) const

Non-mutable access to the upper border of confidence interval.

◆ getXIntercept()

double getXIntercept ( ) const

Non-mutable access to the x-intercept of the straight line.
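
The x-intercept follows directly from the model \( Y = c_0 + c_1 X \) by setting \( Y = 0 \). The short derivation below assumes a non-zero slope; how the class behaves when \( c_1 = 0 \) is not stated on this page.

\[
0 = c_0 + c_1 x_{\mathrm{int}}
\quad\Longrightarrow\quad
x_{\mathrm{int}} = -\frac{c_0}{c_1}, \qquad c_1 \neq 0
\]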

◆ operator=()

LinearRegression& operator= ( const LinearRegression & arg)
private

Not implemented.

Member Data Documentation

◆ chi_squared_

double chi_squared_
protected

The value of the Chi Squared statistic.

◆ intercept_

double intercept_
protected

The intercept of the fitted line with the y-axis.

◆ lower_

double lower_
protected

The lower bound of the confidence interval.

◆ mean_residuals_

double mean_residuals_
protected

Mean of residuals.

◆ r_squared_

double r_squared_
protected

The squared correlation coefficient (Pearson)

◆ rsd_

double rsd_
protected

The relative standard deviation.

◆ slope_

double slope_
protected

The slope of the fitted line.

◆ stand_dev_residuals_

double stand_dev_residuals_
protected

The standard deviation of the residuals.

◆ stand_error_slope_

double stand_error_slope_
protected

The standard error of the slope.

◆ t_star_

double t_star_
protected

The value of the t-statistic.

◆ upper_

double upper_
protected

The upper bound of the confidence interval.

◆ x_intercept_

double x_intercept_
protected

The intercept of the fitted line with the x-axis.