ols

Syntax

ols(Y, X, [intercept=true], [mode=0])

Arguments

Y is the dependent variable; X is the independent variable(s).

Y is a vector; X is a matrix, table, or tuple. When X is a matrix: if its number of rows equals the length of Y, each column of X is a factor; otherwise, if its number of columns equals the length of Y, each row of X is a factor.
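
The orientation rule for a matrix X can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not the system's implementation: the function name `factors_from_matrix`, the list-of-rows representation, and the error raised when neither dimension matches are all assumptions made for this sketch.

```python
def factors_from_matrix(y, x):
    """Return the factors of x as a list of vectors.

    x is a matrix given as a list of rows (a hypothetical representation).
    """
    n = len(y)
    n_rows = len(x)
    n_cols = len(x[0]) if x else 0
    if n_rows == n:
        # Rows match the length of Y: each column of X is a factor.
        return [[row[j] for row in x] for j in range(n_cols)]
    if n_cols == n:
        # Rows do not match but columns do: each row of X is a factor.
        return [list(row) for row in x]
    # Assumption for this sketch: reject input where neither dimension matches.
    raise ValueError("neither dimension of X matches the length of Y")


y = [1, 2, 3, 4, 5]
x = [[1, 1, 1], [4, 4, 5], [8, 2, 1], [2, 3, 1], [3, 8, 5]]  # 5 rows, 3 columns
x_t = [list(col) for col in zip(*x)]                          # 3 rows, 5 columns

# Both orientations resolve to the same three factors.
print(factors_from_matrix(y, x) == factors_from_matrix(y, x_t))
```

Under this rule a square matrix whose size equals the length of Y is read column by column, because the row check is applied first.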

intercept is a Boolean value indicating whether the regression includes an intercept. If it is true, the system automatically adds a column of 1s to X to estimate the intercept. The default value is true.
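
To make the role of the column of 1s concrete, here is a minimal pure-Python sketch of a least-squares fit through the normal equations. It is not the system's implementation: the name `ols_fit`, the list-of-factors input, and the choice of Gauss-Jordan elimination are assumptions made for illustration.

```python
def ols_fit(y, factors, intercept=True):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    # Build the design matrix row by row; prepend 1.0 when an intercept is requested.
    rows = [([1.0] if intercept else []) + [float(f[i]) for f in factors]
            for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                m = a[r][c] / a[c][c]
                a[r] = [v - m * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


x1 = [1, 3, 5, 7, 11, 16, 23]
x2 = [2, 8, 11, 34, 56, 54, 100]
y = [0.1, 4.2, 5.6, 8.8, 22.1, 35.6, 77.2]

print(ols_fit(y, [x1]))                   # intercept first, then the slope
print(ols_fit(y, [x1], intercept=False))  # a single slope through the origin
```

With the intercept enabled, the first coefficient belongs to the added column of 1s, so the result has one more element than the number of factors. The values should agree with the outputs in the Examples section to within rounding.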

mode is an integer (0, 1, or 2) that determines the contents of the output. The default value is 0.

  • 0: a vector of the coefficient estimates.

  • 1: a table with coefficient estimates, standard error, t-statistics, and p-values.

  • 2: a dictionary with the following keys: ANOVA, RegressionStat, Coefficient, and Residual.

ANOVA (one-way analysis of variance)

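Source of Variance DF (degrees of freedom) SS (sum of squares)            MS (mean square)                   F (F-score) Significance
------------------ ----------------------- ------------------------------ ---------------------------------- ----------- ------------
Regression         p                       sum of squares regression, SSR regression mean square, MSR=SSR/p  MSR/MSE     p-value
Residual           n-p-1                   sum of squares error, SSE      mean square error, MSE=SSE/(n-p-1)
Total              n-1                     sum of squares total, SST

Here n is the number of observations and p is the number of factors (independent variables).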
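
The ANOVA quantities can be reproduced by hand from the observed and fitted values. The sketch below uses the textbook definitions (MSR = SSR/p, MSE = SSE/(n-p-1), F = MSR/MSE) and takes SSR as SST minus SSE, which holds for a least-squares fit with an intercept. The function name `anova_rows` is hypothetical, the fitted values are built from the rounded coefficients shown in the Examples section, and the Significance column is omitted because it needs the F distribution.

```python
def anova_rows(y, fitted, p):
    """Return DF, SS, MS and F for a regression with p factors plus an intercept."""
    n = len(y)
    y_mean = sum(y) / n
    sst = sum((v - y_mean) ** 2 for v in y)              # total sum of squares
    sse = sum((v - f) ** 2 for v, f in zip(y, fitted))   # residual sum of squares
    ssr = sst - sse                                      # valid when the model has an intercept
    msr = ssr / p
    mse = sse / (n - p - 1)
    return {
        "Regression": {"DF": p, "SS": ssr, "MS": msr, "F": msr / mse},
        "Residual": {"DF": n - p - 1, "SS": sse, "MS": mse},
        "Total": {"DF": n - 1, "SS": sst},
    }


x1 = [1, 3, 5, 7, 11, 16, 23]
x2 = [2, 8, 11, 34, 56, 54, 100]
y = [0.1, 4.2, 5.6, 8.8, 22.1, 35.6, 77.2]

# Rounded coefficients copied from the ols(y, (x1,x2)) example output.
b0, b1, b2 = -9.494813, 2.806426, 0.13147
fitted = [b0 + b1 * u + b2 * v for u, v in zip(x1, x2)]

for name, row in anova_rows(y, fitted, p=2).items():
    print(name, row)
```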

RegressionStat (Regression statistics)

Item         Description
------------ -----------
R2           R-squared
AdjustedR2   The adjusted R-squared corrected based on the degrees of freedom by comparing the sample size to the number of terms in the regression model.
StdError     The residual standard error/deviation corrected based on the degrees of freedom.
Observations The sample size.
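
As a rough check on these items, the following sketch derives them from the sums of squares using the standard textbook formulas. Whether the system computes them in exactly this way is an assumption; the function name `regression_stats` is hypothetical, and the inputs are the SS values printed in the mode=2 example.

```python
import math


def regression_stats(ssr, sse, n, p):
    """Textbook regression statistics for n observations and p factors."""
    sst = ssr + sse
    r2 = ssr / sst
    # Penalize R-squared for the number of terms relative to the sample size.
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    # Residual standard error: square root of the mean square error.
    std_error = math.sqrt(sse / (n - p - 1))
    return {"R2": r2, "AdjustedR2": adjusted_r2,
            "StdError": std_error, "Observations": n}


stats = regression_stats(ssr=4204.416396, sse=267.220747, n=7, p=2)
for item, value in stats.items():
    print(item, value)
```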

Coefficient

Item     Description
-------- -----------
factor   Independent variables
beta     Estimated regression coefficients
stdError Standard error of the regression coefficients
tstat    t statistic, indicating the significance of the regression coefficients
pvalue   p-value of the t statistic
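
The t statistic of a coefficient is conventionally its estimate divided by its standard error. The short sketch below applies that ratio to the rounded values printed in the Examples section; it is an illustration of the relationship, not a statement about how the system computes the column.

```python
# Rounded values copied from the Coefficient table in the Examples section.
factors = ["intercept", "x1", "x2"]
beta = [-9.494813, 2.806426, 0.13147]
std_error = [5.233168, 1.830782, 0.409081]

# t statistic = estimated coefficient / its standard error.
tstat = [b / s for b, s in zip(beta, std_error)]

for name, t in zip(factors, tstat):
    print(name, t)
```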

Residual: the difference between each predicted value and the actual value.

Details

Return the result of an ordinary-least-squares regression of Y on X.

Note that NULL values in X and Y are treated as 0 in calculations.
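
The effect of this rule can be pictured as a preprocessing step. In the sketch below, Python's `None` stands in for NULL and the helper name `fill_nulls` is hypothetical; it only illustrates the substitution described in the note.

```python
def fill_nulls(values):
    """Replace missing entries (None in this sketch) with 0.0."""
    return [0.0 if v is None else float(v) for v in values]


x = [1, None, 3]
y = [2, 4, None]
print(fill_nulls(x), fill_nulls(y))  # prints [1.0, 0.0, 3.0] [2.0, 4.0, 0.0]
```

Because the substituted zeros take part in the fit like any other observation, rows with missing data may need to be removed beforehand if that is not the intended behavior.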

Examples

$ x1=1 3 5 7 11 16 23
$ x2=2 8 11 34 56 54 100
$ y=0.1 4.2 5.6 8.8 22.1 35.6 77.2;

$ ols(y, x1);
[-9.912821,3.378632]

$ ols(y, (x1,x2));
[-9.494813,2.806426,0.13147]
$ ols(y, (x1,x2), 1, 1);

factor    beta      stdError tstat     pvalue
--------- --------- -------- --------- --------
intercept -9.494813 5.233168 -1.814353 0.143818
x1        2.806426  1.830782 1.532911  0.20007
x2        0.13147   0.409081 0.321379  0.764015

$ ols(y, (x1,x2), 1, 2);
ANOVA->
Breakdown  DF SS          MS          F         Significance
---------- -- ----------- ----------- --------- ------------
Regression 2  4204.416396 2102.208198 31.467739 0.003571
Residual   4  267.220747  66.805187
Total      6  4471.637143

RegressionStat->
item         statistics
------------ ----------
R2           0.940241
AdjustedR2   0.910361
StdError     8.173444
Observations 7

Coefficient->
factor    beta      stdError tstat     pvalue
--------- --------- -------- --------- --------
intercept -9.494813 5.233168 -1.814353 0.143818
x1        2.806426  1.830782 1.532911  0.20007
x2        0.13147   0.409081 0.321379  0.764015

$ x=matrix(1 4 8 2 3, 1 4 2 3 8, 1 5 1 1 5);
$ x;
#0 #1 #2
-- -- --
1  1  1
4  4  5
8  2  1
2  3  1
3  8  5

$ ols(1..5, x);
[1.156537,0.105505,0.91055,-0.697821]

$ ols(1..5, x.transpose());
[1.156537,0.105505,0.91055,-0.697821]
// When the number of rows of X does not match the length of Y but the number of columns does, each row of X is treated as a factor, so both calls return the same result.