Time series data fitting by Python linear regression

Linear regression is a method for modeling the relationship between two scalar values: the input variable X and the output variable Y.

For example, if x is your age and y is your annual income, regression analysis lets you estimate how much your income increases per year, or what your annual income will be at a certain age.

You may think it is obvious from the data that your income will increase as you get older, but the beauty of regression analysis is that it shows, statistically and quantitatively, by how much your income will increase (or decrease).


The goal here is to find the linear regression coefficients in Python using two approaches:

  1. Solve the overdetermined system y = ax + b for the coefficients a and b that minimize the sum of squared errors
  2. Use SciPy's linear regression function, linregress(X, Y), from scipy.stats
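
To make the first approach concrete, here is a minimal sketch on a toy example. The four points are illustrative only (they lie exactly on y = 2x + 1 and are not the data used in this post), and solving the normal equations directly is just one way to get the least-squares solution; the script further down uses np.linalg.lstsq instead.

```python
import numpy as np

# Illustrative points lying exactly on y = 2x + 1 (not data from this post)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# Design matrix: one row per observation, columns [1, x]
A = np.column_stack([np.ones_like(x), x])

# Four equations but only two unknowns, so the system is overdetermined.
# The least-squares solution satisfies the normal equations (A^T A) w = A^T y.
w = np.linalg.solve(A.T @ A, A.T @ y)
b, a = w  # w[0] is the intercept b, w[1] is the slope a

print(f"a = {a:.3f}, b = {b:.3f}")
```

Because the toy points sit exactly on a line, the recovered a and b match the line that generated them. With noisy observations the same calculation returns the line with the smallest sum of squared errors.
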
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats


## Obs values ------------------------------------------------------------------
X = np.array([0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95])
Y = np.array([0.78, 0.80, 0.76, 0.69, 0.74, 0.71, 0.79, 0.89, 0.82, 0.93, 0.98])
pred_X = np.arange(0, 1.1, 0.01)   
  
## --------------------------------------------------------------
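#  Approach. 1 - Solve matrix and find maximum likelihood (least-squares) solution
# Rows of A are X**0 (all ones) and X**1, so A.T is the design matrix [1, X]
A = np.array([ X**0, X**1])
# w[0] is the intercept b, w[1] is the slope a
w = np.linalg.lstsq(A.T, Y, rcond=None)[0]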


pred_Y = w[0] + w[1]*pred_X
plt.plot(X, Y, 'ko')     
plt.plot(pred_X, pred_Y, 'r', label="By solving LSQR")       

## --------------------------------------------------------------
#  Approach. 2 - use scipy linregress
a, b, r_val, p_val, std_err = stats.linregress(X,Y)

pred_Y = a*pred_X+b  
plt.plot(X, Y, 'ko')    
plt.plot(pred_X, pred_Y, 'b--', label="By scipy linregress")   
plt.xlim(0.0, 1.0); plt.ylim(0.6, 1.05)
plt.legend(loc="upper left", prop={'size': 14})
plt.show()
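
As a quick check, the sketch below reruns both approaches on the same observations and confirms that they give the same line. It also prints the statistics returned by linregress, which are what let you make the statistical and quantitative statement discussed at the top. The value x_new = 0.75 is an arbitrary point chosen for illustration.

```python
import numpy as np
from scipy import stats

# Same observations as in the script above
X = np.array([0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95])
Y = np.array([0.78, 0.80, 0.76, 0.69, 0.74, 0.71, 0.79, 0.89, 0.82, 0.93, 0.98])

# Approach 1: least squares on the design matrix with columns [1, X]
A = np.column_stack([np.ones_like(X), X])
w, *_ = np.linalg.lstsq(A, Y, rcond=None)

# Approach 2: scipy.stats.linregress
res = stats.linregress(X, Y)

# Both approaches minimize the same squared error, so they should agree
print("same fit:", np.allclose(w, [res.intercept, res.slope]))

# Slope with its standard error, plus goodness-of-fit statistics
print(f"slope a     = {res.slope:.4f} +/- {res.stderr:.4f}")
print(f"intercept b = {res.intercept:.4f}")
print(f"r^2 = {res.rvalue**2:.4f}, p-value = {res.pvalue:.4f}")

# Predict y at an arbitrary x (illustrative value)
x_new = 0.75
y_new = res.intercept + res.slope * x_new
print(f"predicted y at x = {x_new}: {y_new:.4f}")
```

The slope is the estimated change in y per unit of x, and the p-value tests the null hypothesis that the slope is zero.
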