Sunday, June 5, 2016

Regression Modeling in Practice: WEEK 4

Logistic Regression


Assignment: Test a Logistic Regression Model

Source: Data from OECD (“The Organisation for Economic Co-operation and Development”)

Variables Used

*Employment Rate - The number of employed persons aged 15 to 64 divided by the population of the same age. (source: OECD)

*Life Satisfaction - This indicator considers people's evaluation of their life as a whole. (source: OECD)

*Household Disposable Income - The maximum amount that a household can afford to consume without having to reduce its assets or increase its liabilities. (source: OECD)


The explanatory variables were centered for the logistic procedures.
The response variable (Life Satisfaction) was binned into 2 categories.

Introduction:


In this logistic model, I coded my response variable, Life Satisfaction, as 0 if the country has a Life Satisfaction Index below or equal to 6.8 (on a scale of 1 to 10), and as 1 if its index is above 6.8.

I used the centered Employment Rate and centered Household Disposable Income as the explanatory variables. 
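
The coding and centering steps described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for the assignment: the country values are made up, and the helper names `bin_life_satisfaction` and `center` are hypothetical.

```python
from statistics import mean

THRESHOLD = 6.8  # cut-off for Life Satisfaction, taken from the text above


def bin_life_satisfaction(score, threshold=THRESHOLD):
    """Return 1 if the score is above the threshold, else 0."""
    return 1 if score > threshold else 0


def center(values):
    """Subtract the sample mean so the centered variable has mean 0."""
    m = mean(values)
    return [v - m for v in values]


# Made-up illustrative values, NOT actual OECD data.
life_satisfaction = [7.5, 6.8, 5.9, 7.0]
employment_rate = [74.0, 66.0, 58.0, 70.0]

life_sat_binary = [bin_life_satisfaction(s) for s in life_satisfaction]
employment_centered = center(employment_rate)

print(life_sat_binary)      # [1, 0, 0, 1]
print(employment_centered)  # [7.0, -1.0, -9.0, 3.0]
```

Note that a score of exactly 6.8 falls into category 0, matching the "below or equal to 6.8" rule.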

My hypothesis: There is a strong correlation between Life Satisfaction and Employment Rate.

CODE:

OUTPUT:
[STAGE 1]
[STAGE 2]
Summary:

My analysis had two stages. First, I ran a logistic regression with only the primary explanatory variable and the response variable. After obtaining a significant result, I added the second explanatory variable in order to check whether it is significant or whether it confounds the relationship. The results of the two stages are as follows:

[STAGE 1]
The primary explanatory variable “Employment Rate” has a significant relationship with the response variable “Life Satisfaction” (p=0.0019), so the null hypothesis may be rejected. The likelihood ratio test of the global null hypothesis gives p<.0001.

The parameter estimate for Employment Rate (0.2845, p=0.0019) corresponds to an odds ratio of 1.329: for each one-unit increase in Employment Rate, the odds of a country having a Life Satisfaction Index above 6.8 (on a scale of 1 to 10) are multiplied by 1.329.
The 95% confidence interval for the odds ratio is 1.111 to 1.590.
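
As a quick consistency check, the reported Stage 1 figures can be tied together with a little arithmetic. The sketch below uses only the numbers quoted above and assumes the confidence limits are Wald limits of the form exp(estimate ± 1.96 × standard error). That assumption is mine, not something stated in the output.

```python
import math

beta = 0.2845                   # reported parameter estimate for Employment Rate (Stage 1)
ci_low, ci_high = 1.111, 1.590  # reported 95% confidence limits for the odds ratio

# The odds ratio is the exponentiated parameter estimate.
odds_ratio = math.exp(beta)

# Assumption: the limits are Wald limits, exp(beta +/- 1.96 * SE),
# so the standard error can be backed out on the log scale.
std_err = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# Wald z statistic and two-sided p-value from the standard normal distribution.
z = beta / std_err
p_value = math.erfc(abs(z) / math.sqrt(2))

print(round(odds_ratio, 3))  # 1.329
print(round(std_err, 4), round(z, 2), p_value)
```

Under that assumption, the p-value recovered this way comes out close to the reported 0.0019, which suggests the estimate, odds ratio, and confidence limits are mutually consistent.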


[STAGE 2]
After adding the second explanatory variable “Household Disposable Income,” both explanatory variables are significantly associated with “Life Satisfaction” (p=0.0477 for Employment Rate and p=0.0421 for Household Disposable Income). Because Employment Rate remains significant after this adjustment, Household Disposable Income does not confound the relationship.

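This time the odds ratio for Employment Rate is 1.203 (95% confidence interval: 1.002 to 1.446). After controlling for Household Disposable Income, each one-unit increase in Employment Rate multiplies the odds of high Life Satisfaction by 1.203.
The odds ratio for Household Disposable Income is 1.000 (95% confidence interval: 1.000 to 1.001). It is this close to 1 presumably because it describes the change in odds per single unit of income, which is a very small change.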
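
Because a logistic model is linear on the log-odds scale, an odds ratio reported per one-unit change can be rescaled to a change of c units by raising it to the power c. Here is a minimal sketch using the reported Stage 2 odds ratio for Employment Rate; the helper `odds_ratio_for_change` is hypothetical, and the 5-unit change is chosen only for illustration.

```python
def odds_ratio_for_change(or_per_unit, change):
    """Odds ratio for a `change`-unit increase: exp(change * beta) == OR ** change."""
    return or_per_unit ** change


# Reported Stage 2 odds ratio for Employment Rate, per one-unit increase.
or_employment = 1.203

# Illustrative 5-unit increase in Employment Rate.
or_5_units = odds_ratio_for_change(or_employment, 5)
print(round(or_5_units, 2))  # 2.52
```

The same rule applies to any explanatory variable in the model, so the size of a per-unit odds ratio always depends on the units in which the variable is measured.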


The results support my original hypothesis of a significant, positive relationship between Life Satisfaction and Employment Rate. It appears that people are more satisfied with their lives in countries where the employment rate is high.


There was no evidence of confounding for the association between my primary explanatory variable (Employment Rate) and the response variable (Life Satisfaction). After adding the second explanatory variable (Household Disposable Income), the relationship remained statistically significant, although the odds ratio for Employment Rate decreased from 1.329 to 1.203.