Sunday, December 22, 2013

2.8 Multiple linear regression in R

The previous section explained how to run a simple linear regression in R, where a dependent variable is predicted by only one independent variable. This section explains how to run a multiple linear regression analysis in R. In this case the dependent variable is explained by more than one independent variable.

The command to perform a multiple linear regression with R

In R, there is only a slight difference between the command for a simple regression analysis and the command for a multiple regression analysis. A multiple linear regression is executed in R with the following command: (*name of the regression*)<-lm(*dependent variable*~*independent variable 1* + *independent variable 2* + *independent variable.... etc.*, *name of the dataset*). You can add as many independent variables as you want to predict the dependent variable, as long as you put a + symbol between these variables.
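The command pattern above can be tried out with a small sketch. The Projects dataset itself is not included in this post, so the data frame below is a hypothetical stand-in with invented values; only the column names follow the example used later in this section.

```r
# Hypothetical data standing in for the Projects dataset (all values are invented).
set.seed(1)
Projects <- data.frame(
  PersonnelCosts = runif(50, 1000, 5000),
  MaterialCosts = runif(50, 500, 2000),
  SatisfactionCustomer = sample(1:5, 50, replace = TRUE)
)
Projects$Profit <- 9000 - 0.3 * Projects$PersonnelCosts -
  2 * Projects$MaterialCosts + rnorm(50, sd = 500)

# Dependent variable on the left of ~, independent variables joined with +,
# followed by the name of the dataset.
Regression <- lm(Profit ~ PersonnelCosts + MaterialCosts + SatisfactionCustomer,
                 data = Projects)
```

Writing data = Projects instead of only Projects is optional; it just makes explicit that the second argument is the dataset.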

In the same way as with the simple linear regression, you can display the prediction model with the command *name of the regression* and the detailed information of the regression analysis with the command summary(*name of the regression*).
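The two ways of displaying a regression can be sketched as follows. Because the dataset of this post is not available here, the sketch uses mtcars, a dataset that ships with R; the choice of mpg, wt and hp as variables is only an assumption for illustration.

```r
# Multiple regression on the built-in mtcars dataset (illustrative variables only).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Typing the name of the regression shows the call and the coefficients,
# i.e. the prediction model.
print(fit)

# summary() shows the full analysis: coefficient table with significance stars,
# R-squared, Adjusted R-squared and the F-statistic.
print(summary(fit))
```

At the R prompt, typing fit or summary(fit) on its own line has the same effect as the print() calls.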


Figure 25: Executing a multiple linear regression with R

Interpretation of the regression

In contrast to a simple linear regression, the results of a multiple linear regression show more rows in the coefficients section of the overview: one row for each independent variable. The principle stays the same: the stars at the end of each row indicate the level of significance of that variable.

Figure 25 shows the result of the following regression analysis. To predict the profit from the personnel costs, the material costs and the satisfaction of the customer, use the following command:
Regression<-lm(Profit~PersonnelCosts+MaterialCosts+SatisfactionCustomer,Projects). Entering the name of the regression, Regression, displays the following prediction model:
Profit = 8845.8805 + (-0.2964)X1 + (-1.9292)X2 + (-1338.5787)X3, where X1 stands for personnel costs, X2 for material costs and X3 for the level of customer satisfaction. Each coefficient gives the expected change in profit when that variable increases by one unit and the other variables stay the same.
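As a worked example, the prediction model can be applied by hand. The coefficients below are the ones reported in the text; the input values for the project are hypothetical and chosen only to show the arithmetic.

```r
# Coefficients as reported in the text for the model of Figure 25.
intercept <- 8845.8805
b_personnel <- -0.2964
b_material <- -1.9292
b_satisfaction <- -1338.5787

# Hypothetical project (these input values are invented for illustration).
personnel_costs <- 2000
material_costs <- 1000
satisfaction <- 3

# Profit = intercept + b1 * X1 + b2 * X2 + b3 * X3
predicted_profit <- intercept +
  b_personnel * personnel_costs +
  b_material * material_costs +
  b_satisfaction * satisfaction

print(predicted_profit)
```

With a fitted regression object available, the predict() function with a newdata data frame performs the same calculation using the unrounded coefficients.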

With the command summary(Regression) the full regression analysis appears. Notice that all rows of variables are significant. The variables differ in their level of significance, but each of them contributes significantly to the prediction of the profit. In the example of Figure 25 the Adjusted R-squared is only 0.2529. This means that the three independent variables together explain about 25% of the variation in the profit, after correcting for the number of variables in the model. Although the variables are significant, roughly 75% of the variation in the profit remains unexplained, so individual predictions can still be far from the actual profit.
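The values discussed above can also be read directly from the summary object instead of from the printed overview. The sketch below again uses the built-in mtcars dataset as a stand-in, so its numbers are not those of Figure 25.

```r
# Stand-in model on the built-in mtcars dataset (not the Projects data).
fit <- lm(mpg ~ wt + hp, data = mtcars)
fit_summary <- summary(fit)

# Adjusted R-squared: share of the variation in the dependent variable
# explained by the model, corrected for the number of independent variables.
adj_r2 <- fit_summary$adj.r.squared
print(adj_r2)

# p-values per row of the coefficient table; the significance stars
# in the printed summary are based on these values.
p_values <- coef(fit_summary)[, "Pr(>|t|)"]
print(p_values)
```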

As this section shows, a multiple linear regression works in the same way as a simple linear regression.

To the next step: 2.9 Multiple or simple linear regression analysis with categorical variables in R
