Wednesday, 30 March 2016

Data Management and Visualization: WEEK 4

Project:
DATA VISUALIZATION

Source: New data sheet from the OECD (The Organisation for Economic Co-operation and Development)

The objective of this program is to visualize the data by creating charts of both individual variables and pairs of variables.

The source I used is a newly imported data sheet covering 34 developed countries (including 24 European countries), with a GDP per capita variable and several variables describing quality of life. All the data comes from OECD.org.

The variables observed in this assignment are as follows:

*Countries - Australia, Austria, Belgium, Canada, Chile, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Japan, Korea, Luxembourg, Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Slovak Republic, Slovenia, Spain, Sweden, Switzerland, Turkey, United Kingdom, United States.

*GDP – Gross Domestic Product per capita

*LifeSat = Satisfaction Index – “The indicator considers people's evaluation of their life as a whole. It is a weighted-sum of different response categories based on people's rates of their current life relative to the best and worst possible lives for them on a scale from 0 to 10, using the Cantril Ladder (known also as the "Self-Anchoring Striving Scale")” (source: OECD)

*PersEarn = Personal Earnings – “This indicator refers to the average annual wages per full-time equivalent dependent employee, which are obtained by dividing the national-accounts-based total wage bill by the average number of employees in the total economy, which is then multiplied by the ratio of average usual weekly hours per full-time employee to average usually weekly hours for all employees. It considers the employees’ gross remuneration, that is, the total before any deductions are made by the employer in respect of taxes, contributions of employees to social security and pension schemes, life insurance premiums, union dues and other obligations of employee” (source: OECD)

*House Income = Household Disposable Income - “It's the maximum amount that a household can afford to consume without having to reduce its assets or to increase its liabilities. It's obtained adding to people’s gross income (earnings, self-employment and capital income, as well as current monetary transfers received from other sectors) the social transfers in-kind that households receive from governments (such as education and health care services), and then subtracting the taxes on income and wealth, the social security contributions paid by households as well as the depreciation of capital goods consumed by households. Available data refer to the sum of households and non-profit institution serving households” (source: OECD)

*Edu= Education – “Educational attainment considers the number of adults aged 25 to 64 holding at least an upper secondary degree over the population of the same age, as defined by the OECD-ISCED classification” (source: OECD)

*Work – Percentage of the working-age population (aged 15-64) that is employed; "It is the number of employed persons aged 15 to 64 over the population of the same age. Employed people are those aged 15 or more who report that they have worked in gainful employment for at least one hour in the previous week, as defined by the International Labour Organization – ILO." (source: OECD)

In order to create charts, most of the above variables are categorized and new variables are produced:

“GDP2” with 4 categories for “GDP”
“SAT” with 4 categories for “LifeSat”
“Earn” with 5 categories for “PersEarn”
“House” with 3 categories for “House Income”
“ED” with 3 categories for “Edu”
“Work2” with 3 categories for “Work”

Lower category numbers in the new variables correspond to lower values, so category “1” always means the lowest range.
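The categorization step described above can be sketched in Python as follows. This is only a minimal illustration, not the original program: the function name `to_category` and the GDP cut points (30,000, 50,000 and 70,000 dollars) are assumptions inferred from the category ranges mentioned in the graph descriptions.

```python
from bisect import bisect_right

# Assumed upper bounds of GDP categories 1-3 (in dollars).
# Category 4 is everything above the last cut point.
GDP_CUTS = [30_000, 50_000, 70_000]


def to_category(value, cuts):
    """Return a 1-based category number; higher values get higher numbers.

    A value exactly on a cut point goes to the higher category,
    which is an arbitrary choice made for this sketch.
    """
    return bisect_right(cuts, value) + 1


# Luxembourg's GDP per capita from the data sheet.
print(to_category(83_394.4, GDP_CUTS))   # 4
# The average GDP per capita of the 34 countries.
print(to_category(36_023.24, GDP_CUTS))  # 2
```

The same helper could be reused for the other variables by passing a different list of cut points.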

CODE:

GRAPHS:

GDP

This graph is unimodal, with its peak (center) in the $30,000 to $50,000 GDP per capita category.

*Out of 34 developed countries from the data, 20 countries (58.82%) fall into the above category.

The average GDP per capita is $36,023.24.
The standard deviation (spread) is $13,220.13, more than a third of the mean, so the differences between countries are quite large.
The graph appears to be skewed to the right, as frequencies are higher in the lower categories than in the higher ones.


Personal Earnings

It's a bimodal graph, with peaks in the $20,000 to $30,000 and the $40,000 to $50,000 personal earnings categories.
*In 23.53% of the countries in the data, personal earnings are between $20,000 and $30,000, and in 32.35% of the countries they are between $40,000 and $50,000.
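As a quick check, the percentages for the two peaks can be converted back to country counts. The sketch below assumes only that the total is the 34 countries in the data sheet; the helper names are made up for this example.

```python
TOTAL_COUNTRIES = 34  # number of countries in the data sheet


def share_to_count(percent, total=TOTAL_COUNTRIES):
    """Convert a reported percentage back to a whole number of countries."""
    return round(percent / 100 * total)


def count_to_share(count, total=TOTAL_COUNTRIES):
    """Convert a count of countries to a percentage rounded to two decimals."""
    return round(count / total * 100, 2)


low_peak = share_to_count(23.53)   # $20,000 to $30,000 category
high_peak = share_to_count(32.35)  # $40,000 to $50,000 category
print(low_peak, high_peak)                                  # 8 11
print(count_to_share(low_peak), count_to_share(high_peak))  # 23.53 32.35
```

So the two peaks correspond to 8 and 11 of the 34 countries.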

The standard deviation (spread) for this variable is $12,724.



Household Disposable income
This graph is unimodal, with its peak (center) in the $20,000 to $30,000 category.
It's slightly skewed to the right, as frequencies are higher at the lower values.

The average household disposable income is $22,949.47 and the standard deviation (spread) is $6,693. This is much lower than the spread of GDP or Personal Earnings, which means that household disposable income values are much closer to each other across countries.
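Because the variables have different averages, it can help to compare the spread relative to the mean (the coefficient of variation) and not only the raw standard deviations. The sketch below uses the means and standard deviations reported above; Personal Earnings is left out because its mean is not reported here. The function name is an assumption made for this example.

```python
def coefficient_of_variation(mean, std_dev):
    """Standard deviation expressed as a fraction of the mean."""
    return std_dev / mean


gdp_cv = coefficient_of_variation(36_023.2353, 13_220.128)
house_cv = coefficient_of_variation(22_949.47, 6_693)

print(f"GDP: {gdp_cv:.3f}")                # GDP: 0.367
print(f"Household income: {house_cv:.3f}")  # Household income: 0.292
```

The relative spread of household disposable income is also lower than that of GDP, which is consistent with the observation above.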


Education

This graph is unimodal, with its peak (center) in the 70-92% category.
It's skewed to the left, which means that frequencies are higher in the higher categories.

*76.47% of countries have more than 70% of adults with at least an upper secondary (high-school) education. The remaining 23.53% of countries are below this category.

The average percentage is 74.5%
and the standard deviation is 16.26 percentage points.

Life Satisfaction

This graph has its peak (center) at categories “3” and “4”, i.e. the highest Life Satisfaction categories (more than 6 out of 10 index points).

The graph is skewed to the left, which means that frequencies are higher in the higher categories.

The average life satisfaction is 6.59 out of 10 index points
and the standard deviation is 0.8.


Work
The graph is almost flat, which means that it does not have any particular center. There is almost the same number of low, middle and high values.

*The numbers of countries with 48-60%, 60-70% and 70-80% of people aged 15 to 64 in employment are very similar.

The average work percentage is 66%.
The standard deviation (spread) is 7.35 percentage points.

BIVARIATE GRAPHS:





Life Satisfaction vs. GDP

The graphs show the relationship between a country's Life Satisfaction index and its GDP.

There is a visible trend: the higher a country's GDP, the higher the life satisfaction of its people.

Interestingly, the highest-GDP country does not seem to follow the trend. Its life satisfaction score is still reasonably high (6.9/10; category 3 of 4) but lower than in some countries in a lower GDP category.

That country is Luxembourg, with a GDP per capita of $83,394.4. It is the only country in category “4” (GDP per capita higher than $70,000).

Another interesting fact is that all countries in GDP category “3” are in the highest Life Satisfaction category (“4”). The countries in this category are Norway and Switzerland.

The lowest Life Satisfaction category (“1”) is seen only in countries in the lowest GDP category (“1”). The countries in the lowest category of both GDP (“1”) and Life Satisfaction (“1”) are Greece and Hungary.
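The category pairs named in this section can be put into a small cross-tabulation. The sketch below is only an illustration: it contains just the five countries mentioned above, not the full data set, and the variable names are chosen for this example.

```python
from collections import Counter

# (GDP2 category, SAT category) for the countries named in the text only.
named_countries = {
    "Luxembourg": (4, 3),
    "Norway": (3, 4),
    "Switzerland": (3, 4),
    "Greece": (1, 1),
    "Hungary": (1, 1),
}

# Count how many of the named countries share each pair of categories.
crosstab = Counter(named_countries.values())

for (gdp_cat, sat_cat), n in sorted(crosstab.items()):
    print(f"GDP2={gdp_cat}  SAT={sat_cat}  countries={n}")
```

With all 34 countries in the dictionary, the same counter would give the full table of GDP category against Life Satisfaction category.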




GDP vs. Personal Earnings

The second plot shows that GDP and Personal Earnings are very strongly related: the higher the GDP, the higher the Personal Earnings.
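A plot only suggests how strong a relationship is. One way to put a number on it is the Pearson correlation coefficient. The sketch below is a plain-Python version of the formula; the function name and the input values are placeholders made up for illustration and are NOT the OECD figures.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder values for illustration only, not taken from the data sheet.
gdp = [20_000, 30_000, 40_000, 50_000]
earnings = [18_000, 27_000, 33_000, 45_000]

r = pearson_r(gdp, earnings)
print(f"r = {r:.2f}")
```

A value close to 1 means a strong positive linear relationship, a value close to 0 means almost none.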

However, the most interesting question is how much of this money actually reaches households. To check it, I compared the Personal Earnings and Household Income variables:






Personal Earnings vs. Household Income
In the third plot, I decided to check how Household Income depends on Personal Earnings. Once again, the dependency is very strong: the higher the Personal Earnings, the higher the average Household Disposable Income.
Interestingly, in the highest Personal Earnings group, one country seems to have a much lower Household Income than the other countries in the group. This country is Iceland, with $55,716 Personal Earnings (category 5 of 5) and $21,201 Household Income (category 2 of 3).
This suggests that Personal Earnings in Iceland are strongly reduced by such costs as taxes on income and wealth, the social security contributions paid by households and the depreciation of capital goods consumed by households.
In other countries, as we can see in the plot, the results are much closer to each other.
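The gap for Iceland can be expressed as a simple ratio of the two figures quoted above. This is only a rough comparison, because the two indicators are defined differently (see the variable definitions at the top); the function name is made up for this example.

```python
def income_to_earnings_ratio(household_income, personal_earnings):
    """Household disposable income as a fraction of personal earnings."""
    return household_income / personal_earnings


# Figures for Iceland quoted in the text (in dollars).
iceland_ratio = income_to_earnings_ratio(21_201, 55_716)
print(f"{iceland_ratio:.1%}")  # 38.1%
```

Computing the same ratio for every country would show how far Iceland lies from the rest of its group.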


GDP vs. Education

The dependence of Education on GDP is not as obvious as with the other variables.

A high percentage of people with at least a high-school diploma is observed in countries with both lower and higher GDP.

However, it must be noted that the lowest Education category (“1”) appears only in countries in the lowest GDP category (“1”). The countries in the lowest categories for both variables are Turkey, Mexico and Portugal.

Another pattern is that countries in the highest GDP categories (“3” and “4”) fall only into the highest category of high-school graduates (“3”).

Therefore, a relationship between Education and GDP exists, even if it's not very strong and not apparent in all countries. In countries with higher GDP, the average percentage of high-school graduates is higher than in countries with lower GDP.

GDP vs. WORK

The plot of WORK against GDP is very similar to the plot of Life Satisfaction against GDP.

The slope is rising: the higher the GDP, the higher the percentage of working people.

The exception to the pattern is also the same as in the plot of Life Satisfaction against GDP. It is Luxembourg, where 65% of people aged 15 to 64 are employed. This score is lower than in 47% of the countries in the data.

After looking at these results, I decided to check the relationship between WORK and Life Satisfaction:



WORK vs. LIFE SATISFACTION

The above plot shows the relationship between the percentage of working people and Life Satisfaction.

The higher the Work percentage, the higher the Life Satisfaction. This is especially visible in countries with a work percentage over 70%:

There are 11 countries with a work percentage higher than 70%. 9 of those countries are in the highest Life Satisfaction category (“4”) and the 2 remaining ones are in category “3”.


CONCLUSION:

After analyzing and visualizing the given variables, it appears that GDP has a strong relationship with Earnings, Income, Work and Life Satisfaction. The higher the GDP, the higher these variables tend to be.

The richest countries (categories “3” and “4”) also have high categories in the other variables.

Countries with the lowest GDP (category “1”) have more low scores in the other variables.

The plot of Education against GDP was a little different. The spread of the results was much wider. Even in some of the countries with the lowest GDP, the percentage of people with at least a high-school education was very high. However, the average percentage was, again, higher in countries with higher GDP.

In general, this analysis supports the hypothesis that quality of life is better in countries with higher GDP. In higher-GDP countries, people seem to have a better material situation, good education, more work opportunities and higher life satisfaction.

The countries with the highest sum of high categories across all variables are:
Switzerland, Norway, Luxembourg, Australia and United States.

Additionally, there appears to be a strong relationship between Work and Life Satisfaction. The countries with both the highest percentage of working people (aged 15-64) and the highest Life Satisfaction are:
Norway, Sweden, Iceland, New Zealand, Netherlands, Switzerland, Australia, Denmark and Canada.
