# Factors influencing the choice of the people to lead a healthy lifestyle

## The concept of health and its indicators, and the impact of bad habits on it. A study of the factors that influence people's choice to stick to a certain type of lifestyle, using the example of people aged 14 to 40. The stages of the analysis and the results.

Category: Sociology and social science. Type: coursework. Language: English. Date added: 28.08.2016. File size: 882.9 K


Posted on http://www.allbest.ru/

Since we use respondents' answers to a set of lifestyle questions, the initial variables that describe a respondent's lifestyle may be significantly correlated. It is therefore reasonable to start with factor analysis for data reduction. Factor analysis reduces the data by combining the initial variables into linear combinations based on the correlations between them. The factors are linear combinations of the initial variables that retain the information most needed for the research. Factor analysis finds a few common factors (say, q of them) that linearly reconstruct the m original variables:

$$y_{ij} = \beta_{i1}x_{1j} + \beta_{i2}x_{2j} + \cdots + \beta_{iq}x_{qj} + e_{ij},$$ where

· $y_{ij}$ is the value of the ith observation on the jth variable

· $\beta_{ik}$ is the ith observation on the kth common factor

· $x_{kj}$ is the set of linear coefficients called the factor loadings

· $e_{ij}$ is similar to a residual but is known as the jth variable's unique factor

By "reconstructing" we mean that, when principal-component factor analysis is applied, the factors and loadings are chosen so that the residual variances, summed across all equations, are minimal (the eigenvectors are returned in normalized form with unit length, $L'L = I$) [25].

Once the factors and their loadings have been estimated, we can interpret them. Interpretation typically means examining the loadings $x_{kj}$ and assigning a name to each factor. To ensure that the factors are independent, we perform an orthogonalization procedure [25], [14].
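A short worked equation may help connect the loadings to the unique factors. It is a sketch under assumptions that the text does not state explicitly: the common factors are standardized (unit variance), mutually uncorrelated, and uncorrelated with the unique factor. The symbol $h_j^2$ (communality) is introduced here for illustration only.

```latex
\operatorname{Var}(y_j)
  = \sum_{k=1}^{q} x_{kj}^{2}\,\operatorname{Var}(\beta_k) + \operatorname{Var}(e_j)
  = \underbrace{\sum_{k=1}^{q} x_{kj}^{2}}_{h_j^{2}\ \text{(communality)}}
    + \underbrace{\operatorname{Var}(e_j)}_{\text{uniqueness}}
```

Under these assumptions, if the jth variable is itself standardized, its uniqueness equals $1 - h_j^2$, so a variable with large loadings on the retained factors has a small unique part.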

So, factor analysis has three stages:

1. Preparation of the covariance matrix (sometimes the correlation matrix is used instead);

2. Extraction of the initial orthogonal factors (the main stage);

3. Rotation in order to obtain the final solution.

In the first stage, we inspect the covariance matrix to understand the correlations and possible similarities between variables. In the second stage, we run the factor analysis in a statistical package (in our case, Stata). Based on the table of cumulative explained variance, we choose an appropriate number of factors for evaluating the factor loadings (pattern matrix) and the unique variances (the cumulative value should not be more than 0.8). In the third stage, we rerun the factor analysis with the number of factors chosen in stage two and, using the rotation option, obtain orthogonal factors (the rotated pattern matrix). After that, we assign each variable to a specific factor. We do this on the principle that the factor loadings matrix contains the correlations between variables and factors, so each variable is matched to the factor with which it has the maximum correlation.
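The assignment rule from the third stage can be sketched in a few lines of Python. This is only an illustration: the variable names and loading values below are hypothetical and are not results from the study, and the sketch assumes that "maximum correlation" means the largest loading in absolute value.

```python
# Hypothetical rotated loadings: one row per variable, one column per factor.
# These numbers are invented for illustration; they are not the study's output.
loadings = {
    "smokes_now":       [0.81, 0.05, -0.10],
    "cigarettes_daily": [0.77, 0.12, 0.02],
    "runs":             [-0.08, 0.69, 0.11],
    "fitness":          [0.03, 0.74, -0.04],
    "doctor_visits":    [0.06, -0.02, 0.66],
    "health_problems":  [-0.11, 0.09, 0.71],
}


def assign_to_factors(loadings):
    """Map each variable to the index of the factor with the largest |loading|."""
    assignment = {}
    for name, row in loadings.items():
        assignment[name] = max(range(len(row)), key=lambda k: abs(row[k]))
    return assignment


def group_by_factor(assignment):
    """Collect the variables assigned to each factor index."""
    groups = {}
    for name, k in assignment.items():
        groups.setdefault(k, []).append(name)
    return groups


assignment = assign_to_factors(loadings)
groups = group_by_factor(assignment)
for k in sorted(groups):
    print(f"Factor {k + 1}: {', '.join(groups[k])}")
```

Reading the groups side by side is what makes it possible to name each factor, for example a "smoking" factor or a "physical activity" factor in this made-up case.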

The final step is to interpret what each factor means, based on the group of variables it contains. Our main goal is to reduce the complexity of the data set in order to detect its latent structure. Reducing the variables in this way means that we do not have to exclude valuable variables from the analysis. Moreover, the factor components can be used in model estimation without problems such as multicollinearity. The results of factor analysis can also serve as input for cluster analysis, which is the process of identifying groups of objects that are homogeneous within themselves and heterogeneous between each other. Several different methods can be used to identify those clusters.

In our analysis we do not need to create subclusters, so we use a nonhierarchical approach, the K-means method, which is a widely used technique for vast datasets. K-means sets the cluster centroids randomly and assigns each object to the cluster with the closest centroid.

The K-means algorithm is a procedure that seeks to partition the data in such a way that within-cluster variation is minimized, where within-cluster variation is the distance from each observation to the center of its associated cluster. Since we work with a really small dataset and the results of the hierarchical approach are reproducible, we decided to stick to this one [16].
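The procedure described above can be sketched as follows. This is a minimal illustration in plain Python, not the routine used in the study (which relies on Stata): the function names, the seed, and the toy two-dimensional "factor scores" are all assumptions made for the example, and squared Euclidean distance is assumed as the distance measure.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Plain K-means: random initial centroids, then alternate assign/update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn at random from the data
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to the cluster with the closest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_point(members)
    return labels, centroids


# Hypothetical factor scores for six respondents (invented for illustration).
points = [(0.1, 0.2), (0.0, -0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = k_means(points, k=2)
print(labels)
```

Because the starting centroids are random, the cluster numbers themselves are arbitrary; only the grouping of observations is meaningful, and on less clearly separated data different seeds can give different partitions.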

We should also discuss how to choose the appropriate number of clusters. With hierarchical analysis, a suitable option is graphical analysis using a dendrogram. Since we use the K-means method, we apply the Calinski-Harabasz approach [17].

The Calinski-Harabasz criterion is sometimes called the variance ratio criterion (VRC). The Calinski-Harabasz index is defined as

$$VRC_k = \frac{SS_B}{SS_W} \times \frac{N - k}{k - 1},$$ where

· $SS_B$ is the overall between-cluster variance

· $SS_W$ is the overall within-cluster variance

· $k$ is the number of clusters

· $N$ is the number of observations

If $SS_B$ is large and $SS_W$ is small, the clusters can be described as well-defined. To find the optimal number of clusters, we maximize the ratio $VRC_k$ with respect to k. The optimal number of clusters is the solution with the highest Calinski-Harabasz index value [26], [17].
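To make the criterion concrete, here is a small sketch in plain Python. It assumes the usual definitions: the between-cluster sum of squares weights each cluster centroid's squared distance from the overall mean by the cluster size, and the within-cluster sum of squares adds up squared distances of points to their own centroid. The toy data and the candidate labelings are hypothetical.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def calinski_harabasz(points, labels):
    """Variance ratio criterion: (SSB / (k - 1)) / (SSW / (N - k))."""
    n = len(points)
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= k < N clusters")
    overall = centroid(points)
    ssb = sum(len(m) * sq_dist(centroid(m), overall) for m in clusters.values())
    ssw = sum(sq_dist(p, centroid(m)) for m in clusters.values() for p in m)
    return (ssb / (k - 1)) / (ssw / (n - k))


# Hypothetical one-dimensional data with two obvious groups.
points = [(0.0,), (2.0,), (10.0,), (12.0,)]
candidates = {
    "k=2, natural split": [0, 0, 1, 1],
    "k=2, mixed split": [0, 1, 0, 1],
    "k=3": [0, 0, 1, 2],
}
scores = {name: calinski_harabasz(points, lab) for name, lab in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

In this toy case the natural two-cluster split scores 50, the three-cluster solution 25.5, and the mixed split 0.08, so the index picks the natural split. The sketch does not guard against a within-cluster sum of squares of zero.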

After performing the cluster analysis, we use its results to construct a multinomial logit model. In the multinomial logit model we assume that the log-odds of each response follow a linear model

$$\eta_{ij} = \log\frac{\pi_{ij}}{\pi_{iJ}} = \alpha_j + x_i'\beta_j,$$ where

· $\alpha_j$ is a constant

· $\beta_j$ is a vector of regression coefficients, for $j = 1, 2, \ldots, J-1$.

This model is analogous to a logistic regression model. The difference is that we have J-1 equations here instead of one, and the probability distribution of the response is multinomial instead of binomial.

Moreover, the choice of the reference category makes no difference in multinomial regression models, as we can always convert one formulation into another.
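The conversion between reference categories can be written out in one line. In this sketch, $\eta_{ij}$ denotes the log-odds of category j against the reference category J, with constant $\alpha_j$ and coefficient vector $\beta_j$, and k is any other category chosen as the new reference.

```latex
\log\frac{\pi_{ij}}{\pi_{ik}}
  = \log\frac{\pi_{ij}}{\pi_{iJ}} - \log\frac{\pi_{ik}}{\pi_{iJ}}
  = \eta_{ij} - \eta_{ik}
  = (\alpha_j - \alpha_k) + x_i'(\beta_j - \beta_k)
```

So the coefficients relative to a new reference category k are simply differences of the coefficients estimated against J, and the fitted probabilities do not change.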

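To sum everything up, suppose that there are k categorical outcomes and, without loss of generality, let the base outcome be 1. According to Greene W.H. [18], the probability that the response for the jth observation is equal to the ith outcome is

$$p_{ij} = \Pr(y_j = i) =
\begin{cases}
\dfrac{1}{1 + \sum_{m=2}^{k} \exp(x_j\beta_m)}, & \text{if } i = 1 \\[2ex]
\dfrac{\exp(x_j\beta_i)}{1 + \sum_{m=2}^{k} \exp(x_j\beta_m)}, & \text{if } i > 1
\end{cases}$$

where

· $x_j$ is the row vector of observed values of the independent variables for the jth observation

· $\beta_m$ is the coefficient vector for outcome m.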
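As an illustration of how such probabilities are computed, the sketch below evaluates them for a single observation. It assumes the standard form in which the base outcome 1 has all coefficients fixed at zero, so that each non-base outcome contributes exp(x·β) to the numerator and the denominator is 1 plus the sum of those terms. The function name, the covariate values, and the coefficients are hypothetical.

```python
import math


def mlogit_probabilities(x, betas):
    """Outcome probabilities for one observation in a multinomial logit model.

    x     -- covariate values (include 1.0 as the first entry for a constant)
    betas -- coefficient vectors for outcomes 2..k; the base outcome 1 is
             assumed to have all coefficients equal to zero
    Returns a list [p_1, p_2, ..., p_k].
    """
    scores = [sum(xi * bi for xi, bi in zip(x, beta)) for beta in betas]
    denom = 1.0 + sum(math.exp(s) for s in scores)
    return [1.0 / denom] + [math.exp(s) / denom for s in scores]


# Hypothetical observation (constant plus one covariate) and coefficients
# for a three-outcome model; none of these numbers come from the study.
x = [1.0, 0.5]
betas = [[0.2, -1.0], [-0.4, 0.8]]
probs = mlogit_probabilities(x, betas)
print(probs, sum(probs))
```

The probabilities always sum to one. In this example the third outcome has a linear score of zero, so it is exactly as likely as the base outcome.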

# Table 1. Descriptive statistics of the main variables

| Variable | Observations | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| Gender | 111320 | 1.565371 | 0.4957301 | 1 | 2 |
| Number of visits to the doctor | 11032 | 2.095178 | 1.001679 | 1 | 5 |
| Did you have any health problems during the last 30 days? | 11300 | 1.78 | 0.4142646 | 1 | 2 |
| Were you in a hospital during the last 3 months? | 10185 | 1.957486 | 0.2017674 | 1 | 2 |
| How would you rate your health? | 10130 | 3.553011 | 0.6181218 | 1 | 5 |
| Do you smoke now? | 11301 | 1.636758 | 0.4809551 | 1 | 2 |
| Remember please, when did you start smoking? How old were you? | 3112 | 16.31652 | 2.955865 | 4 | 34 |
| Did you smoke during the last 7 days? | 4089 | 1.005869 | 0.0763962 | 1 | 2 |
| How many cigarettes do you smoke daily? | 4025 | 14.12174 | 7.328462 | 1 | 60 |
| Have you ever smoked? | 7123 | 1.812158 | 0.3906135 | 1 | 2 |
| How long ago did you quit smoking? | 1229 | 4.235151 | 4.274483 | 0 | 21 |
| Did you consume alcohol during the last 30 days? | 8555 | 1.293279 | 0.4552917 | 1 | 2 |
| How often you consumed alcohol during the last 30 days? | 6347 | 2.501024 | 1.082534 | 1 | 6 |
| Did you run in the last 30 days? | 9038 | 1.040385 | 0.1968715 | 1 | 2 |
| Did you swim in the last 30 days? | 7904 | 1.042637 | 0.2020493 | 1 | 2 |
| Did you go to fitness in the last 30 days? | 9031 | 1.057579 | 0.2329594 | 1 | 2 |
| How did your weight change over the last year? | 10984 | 2.327749 | 0.7265783 | 1 | 3 |
| Did you dance in the last 30 days? | 9035 | 1.019812 | 0.139361 | 1 | 2 |
| Did you play basketball or football in the last 30 days? | 9039 | 1.958181 | 0.2001859 | 1 | 2 |
| How often do you do physical exercises? | 8390 | 1.834923 | 1.454371 | 1 | 5 |
| Did you miss your work during the last 30 days due to illness? | 6751 | 1.938824 | 0.2396708 | 1 | 2 |
| Year of birth | 11320 | 1981.421 | 5.216236 | 1974 | 1991 |
| Age | 11320 | 28.07862 | 5.954817 | 14 | 40 |
| Weight | 10789 | 69.80668 | 15.47784 | 21 | 160 |

# Table 2. Contingency table for gender and marital status, count (row %)

| gender | never married | married | live together but not married | divorced | widow(er) | total |
|---|---|---|---|---|---|---|
| male | (54.5) | (36.9) | 32 (6.6) | 10 (2.0) | 0 (0.0) | 488 (100) |
| female | | | 71 (11.2) | 32 (5.0) | 2 (0.3) | 636 (100) |
| total | (46.8) | (40.1) | 103 (9.2) | 42 (3.7) | 2 (0.2) | 1124 (100) |

# Table 3. Contingency table for gender and level of education status, count (row %)

| gender | finished 0-6 classes | did not finish school (7-8 classes) | did not finish school (7-8 classes) + additional education | finished high school | finished prof education | finished higher education | total |
|---|---|---|---|---|---|---|---|
| male | (1.4) | 45 (9.2) | 100 (20.5) | 180 (37.0) | 99 (20.3) | 56 (11.5) | 487 (100) |
| female | (1.0) | | 105 (16.5) | 219 (34.4) | 169 (26.5) | 103 (16.2) | 637 (100) |
| total | (1.2) | 80 (7.1) | 205 (18.2) | 399 (35.5) | 268 (23.8) | 159 (14.2) | 1124 (100) |

# Table 4. Contingency table for gender and self-assessment health status, count (row %)

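How do you rate your level of health? (Only three of the five response-category labels survive in the source: good, good, normal.)

| gender | | | | | | total |
|---|---|---|---|---|---|---|
| male | (0.2) | (2.7) | 177 (36.1) | 273 (55.7) | 26 (5.3) | (100) |
| female | (0.3) | | 287 (44.9) | (49.1) | 23 (3.6) | 639 (100) |
| total | (0.3) | 26 (2.3) | 464 (41.1) | 587 (52.0) | (4.3) | 1129 (100) |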

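# Table 5. Contingency table for gender and smoking, count (row %)

Do you smoke?

| gender | yes | no | total |
|---|---|---|---|
| male | (54.9) | (45.1) | (100) |
| female | (20.4) | (79.6) | (100) |
| total | (35.4) | (64.6) | 1124 (100) |

Pearson chi2(1) = 143.5166, Pr = 0.000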

# Table 6. Contingency table for gender and alcohol consumption, count (row %)

Did you consume alcohol during the last 30 days?

| gender | yes | no | total |
|---|---|---|---|
| male | (51.3) | (48.7) | (100) |
| female | (43.5) | (56.5) | (100) |
| total | (46.9) | (53.1) | 1119 (100) |

Pearson chi2(1) = 143.5166, Pr = 0.000
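For readers unfamiliar with the Pearson chi-square statistic reported under the contingency tables, the sketch below shows how it is computed from cell counts. The counts used here are hypothetical and chosen for easy arithmetic; they are not the study's data, and the function name is an assumption made for the example.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


# Hypothetical 2x2 table: rows are gender, columns are yes/no answers.
table = [[30, 20],
         [20, 30]]
chi2, dof = pearson_chi2(table)
print(chi2, dof)
```

With these made-up counts every expected cell is 25, so the statistic is 4.0 on one degree of freedom. A 2x2 table always has one degree of freedom, which is the "(1)" in the notation used under the tables.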

To test these hypotheses, we first conduct a cluster analysis of respondents' health status and then build a multinomial model to discover the factors that affect the probability that an individual falls into a particular cluster.

...
