Correlation and regression are two analyses based on a multivariate distribution, that is, a distribution of more than one variable. Correlation analysis tells us whether an association exists between two variables ‘x’ and ‘y’, and how strong it is. Regression analysis, on the other hand, predicts the value of the dependent variable from the known value of the independent variable, assuming an average mathematical relationship between the two or more variables.
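The formulas for the regression coefficients and the coefficient of correlation, for n pairs of observations (X, Y):
byx = (n∑XY − ∑X∑Y) / (n∑X² − (∑X)²)
bxy = (n∑XY − ∑X∑Y) / (n∑Y² − (∑Y)²)
r = sqrt(byx × bxy), taking the common sign of byx and bxy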
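The correlation coefficient can be taken as the square root of the product of the two regression coefficients because each coefficient is r scaled by a ratio of standard deviations. The short derivation below is a sketch; the symbols \sigma_x and \sigma_y (standard deviations of x and y) are introduced here and do not appear in the original text.

```latex
b_{yx} = r\,\frac{\sigma_y}{\sigma_x}, \qquad
b_{xy} = r\,\frac{\sigma_x}{\sigma_y}
\quad\Longrightarrow\quad
b_{yx}\, b_{xy} = r^{2}, \qquad
r = \pm\sqrt{b_{yx}\, b_{xy}}
```

The sign of r is the common sign of byx and bxy, since both share the same numerator and have positive denominators.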
Example: Calculate the linear regression coefficients and coefficient of correlation.
x | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
y | 3 | 7 | 10 | 12 | 14 | 17 | 20 | 24
X | Y | X² | Y² | XY
1 | 3 | 1 | 9 | 3
2 | 7 | 4 | 49 | 14
3 | 10 | 9 | 100 | 30
4 | 12 | 16 | 144 | 48
5 | 14 | 25 | 196 | 70
6 | 17 | 36 | 289 | 102
7 | 20 | 49 | 400 | 140
8 | 24 | 64 | 576 | 192
∑X = 36 | ∑Y = 107 | ∑X² = 204 | ∑Y² = 1763 | ∑XY = 599
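Now, using the above formulas with n = 8:
byx = (8×599 − 36×107) / (8×204 − 36×36)
= 940/336
= 2.80
bxy = (8×599 − 36×107) / (8×1763 − 107×107)
= 940/2655
= 0.35
The coefficient of correlation = sqrt(2.80 × 0.35)
= sqrt(0.98)
= 0.99, approximately 1.0 (0.995 if the unrounded coefficients are used)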
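#include <stdio.h>
#include <math.h>

#define MAX_N 10 /* capacity of the data arrays */

int main(void)
{
    float r, x[MAX_N], y[MAX_N], xy[MAX_N], x2[MAX_N], y2[MAX_N];
    float sumx = 0, sumy = 0, sumxy = 0, sumx2 = 0, sumy2 = 0, bxy, byx;
    int n, i;

    printf("\nEnter the no. of data");
    /* Reject a count that is unreadable or does not fit the arrays. */
    if (scanf("%d", &n) != 1 || n < 2 || n > MAX_N) {
        printf("\nThe no. of data must be between 2 and %d\n", MAX_N);
        return 1;
    }

    /* Read the n pairs of observations. */
    for (i = 0; i < n; i++) {
        printf("\nValue of x%d=", i + 1);
        if (scanf("%f", &x[i]) != 1)
            return 1;
        printf("\nValue of y%d=", i + 1);
        if (scanf("%f", &y[i]) != 1)
            return 1;
    }

    /* Squares and cross products for each pair. */
    for (i = 0; i < n; i++) {
        x2[i] = x[i] * x[i];
        y2[i] = y[i] * y[i];
        xy[i] = x[i] * y[i];
    }

    /* Column totals. */
    for (i = 0; i < n; i++) {
        sumx += x[i];
        sumy += y[i];
        sumxy += xy[i];
        sumx2 += x2[i];
        sumy2 += y2[i];
    }

    /* Print the working table followed by the totals. */
    printf("\nx|\ty|\tx2|\ty2|\txy");
    printf("\n***************************************");
    for (i = 0; i < n; i++) {
        printf("\n%.2f|\t%.2f|\t%.2f|\t%.2f|\t%.2f", x[i], y[i], x2[i], y2[i], xy[i]);
    }
    printf("\n***************************************");
    printf("\n%.2f|\t%.2f|\t%.2f|\t%.2f|\t%.2f", sumx, sumy, sumx2, sumy2, sumxy);

    /* Regression coefficients of y on x (byx) and of x on y (bxy). */
    printf("\n\nThe linear regression coefficients are:");
    byx = (n * sumxy - sumx * sumy) / (n * sumx2 - sumx * sumx);
    bxy = (n * sumxy - sumx * sumy) / (n * sumy2 - sumy * sumy);
    printf("\n\nbyx=%.2f\nbxy=%.2f", byx, bxy);

    /* r is the square root of byx*bxy and carries their common sign. */
    r = sqrt(byx * bxy);
    if (byx < 0)
        r = -r;
    printf("\ncoefficient of correlation=%.2f\n", r);
    return 0;
}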
Output:
Enter the no. of data 8
Value of x1=1
Value of y1=3
Value of x2=2
Value of y2=7
Value of x3=3
Value of y3=10
Value of x4=4
Value of y4=12
Value of x5=5
Value of y5=14
Value of x6=6
Value of y6=17
Value of x7=7
Value of y7=20
Value of x8=8
Value of y8=24
x| y| x2| y2| xy
***************************************
1.00| 3.00| 1.00| 9.00| 3.00
2.00| 7.00| 4.00| 49.00| 14.00
3.00| 10.00| 9.00| 100.00| 30.00
4.00| 12.00| 16.00| 144.00| 48.00
5.00| 14.00| 25.00| 196.00| 70.00
6.00| 17.00| 36.00| 289.00| 102.00
7.00| 20.00| 49.00| 400.00| 140.00
8.00| 24.00| 64.00| 576.00| 192.00
***************************************
36.00| 107.00| 204.00| 1763.00| 599.00
The linear regression coefficients are:
byx=2.80
bxy=0.35
coefficient of correlation=1.00
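Since regression is used to predict the dependent variable from a known value of the independent variable, the coefficient byx can be turned into a prediction line y = a + byx·x, where a = mean(y) − byx·mean(x). The sketch below illustrates this under that standard least-squares assumption; the type `RegressionLine` and the functions `fit_y_on_x` and `predict` are hypothetical names introduced here, not part of the program above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for illustration: the fitted line y = intercept + slope * x. */
typedef struct {
    double slope;     /* byx, the regression coefficient of y on x */
    double intercept; /* a = mean(y) - byx * mean(x) */
} RegressionLine;

/* Fit the regression line of y on x from n pairs of observations. */
RegressionLine fit_y_on_x(const double *x, const double *y, size_t n)
{
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    RegressionLine line;
    size_t i;

    assert(n >= 2); /* at least two points are needed */

    for (i = 0; i < n; i++) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }

    /* Same formula as byx in the program above. */
    line.slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    /* The line passes through the point (mean(x), mean(y)). */
    line.intercept = (sy - line.slope * sx) / n;
    return line;
}

/* Predicted value of y for a given x. */
double predict(RegressionLine line, double x)
{
    return line.intercept + line.slope * x;
}
```

For the eight data points of the example this gives a slope of about 2.80 and an intercept of about 0.79, so the predicted y at x = 9 is about 25.96.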