Evaluating decision tree results with regression analysis
The origin and applications of the decision tree method. The main existing algorithms and the problems they solve. Existing statistical methods applied to the same problems. Categorical binary and non-binary target variables.
Category | Economic and mathematical modeling |
Type | diploma thesis |
Original language | Russian |
Date added | 01.12.2019 |
Sensitivity = mean(as.numeric(metrics_regression$byClass[,1])),
Specificity = mean(as.numeric(metrics_regression$byClass[,2]))))
Next, the rpart tree
library(rpart)
library(rpart.plot)
Fit the tree model. We constrain the number of observations in nodes and leaves and set the maximum tree depth.
NEET_rpart <- rpart(Ocupation ~., data = train, method = 'class',
control = rpart.control(minsplit = 60, minbucket = 30, cp = 0.00001, maxdepth = 30))
Plot the tree
rpart.plot(NEET_rpart, type = 0, cex = T, compress = T, ycompress = T, under = T, varlen = 0, faclen = 5, fallen.leaves = F)
Use the model to predict the target variable on the test subsample as class probabilities.
pred <- as.data.table(predict(object=NEET_rpart, newdata = test, type = 'prob'))
Compute the AUC for the non-binary categorical target
AUC_rpart <- multiclass.roc(actual$V1,pred)$auc
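The `multiclass.roc` call above reduces a multi-class problem to a single AUC; the pROC documentation describes this as the multi-class AUC of Hand and Till (2001), that is, an average over pairwise AUCs. As a rough illustration of that idea, the base-R sketch below averages pairwise AUCs on invented toy data. The helper names `pair_auc` and `hand_till_auc` are hypothetical, and the result is not guaranteed to match pROC's output digit for digit.

```r
# Binary AUC via the rank-sum (Mann-Whitney) statistic.
# scores: numeric scores; is_pos: logical vector, TRUE for the "positive" class.
pair_auc <- function(scores, is_pos) {
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(scores)  # tied scores receive average ranks
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hand and Till style multi-class AUC: the mean of pairwise AUCs.
# actual: factor of true classes; prob: matrix with one named column per class.
hand_till_auc <- function(actual, prob) {
  pairs <- combn(levels(actual), 2, simplify = FALSE)
  pair_values <- vapply(pairs, function(p) {
    keep <- actual %in% p
    a <- actual[keep]
    # Separate the two classes using the probability of the first class...
    a_ij <- pair_auc(prob[keep, p[1]], a == p[1])
    # ...and again using the probability of the second class
    a_ji <- pair_auc(prob[keep, p[2]], a == p[2])
    (a_ij + a_ji) / 2
  }, numeric(1))
  mean(pair_values)
}

# Toy data (invented for illustration): three classes, each row sums to 1
actual <- factor(c("a", "a", "b", "b", "c", "c"))
prob <- matrix(c(0.7, 0.2, 0.1,
                 0.5, 0.3, 0.2,
                 0.2, 0.6, 0.2,
                 0.4, 0.3, 0.3,
                 0.1, 0.2, 0.7,
                 0.2, 0.3, 0.5),
               ncol = 3, byrow = TRUE,
               dimnames = list(NULL, levels(actual)))
auc_toy <- hand_till_auc(actual, prob)
print(auc_toy)
```

A value of 0.5 corresponds to random ranking and 1 to perfect separation of every pair of classes.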
Use the model to predict the target variable on the test subsample as class labels.
pred <- as.data.table(predict(NEET_rpart, newdata = test, type = 'class'))
Compute the quality metrics
metrics_rpart <- confusionMatrix(pred$V1,actual$V1, mode = 'everything')
metrics_rpart <- as.data.table(list(Accuracy = as.numeric(metrics_rpart$overall[1]),
Sensitivity = mean(as.numeric(metrics_rpart$byClass[,1])),
Specificity = mean(as.numeric(metrics_rpart$byClass[,2]))))
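The three figures collected above are overall accuracy plus sensitivity and specificity averaged over the classes (columns 1 and 2 of the `byClass` matrix). A minimal base-R sketch of the same arithmetic, on invented labels, may make the averaging explicit. The function name `macro_metrics` is hypothetical and this is not caret's implementation.

```r
# predicted, actual: factors sharing the same levels
macro_metrics <- function(predicted, actual) {
  cm <- table(Predicted = predicted, Actual = actual)  # rows: predicted, columns: actual
  total <- sum(cm)
  tp <- diag(cm)
  fn <- colSums(cm) - tp   # actual class k, predicted as something else
  fp <- rowSums(cm) - tp   # predicted class k, actually something else
  tn <- total - tp - fn - fp
  c(Accuracy    = sum(tp) / total,
    Sensitivity = mean(tp / (tp + fn)),  # one-vs-rest, averaged over classes
    Specificity = mean(tn / (tn + fp)))
}

# Toy labels (invented for illustration)
lv <- c("a", "b", "c")
actual    <- factor(c("a", "a", "a", "b", "b", "c", "c", "c"), levels = lv)
predicted <- factor(c("a", "a", "b", "b", "c", "c", "c", "a"), levels = lv)
m <- macro_metrics(predicted, actual)
print(round(m, 3))
```

Because the average is unweighted, small classes count as much as large ones, which is worth keeping in mind when the class shares are unbalanced.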
Next, build a tree with the CHAID algorithm
library(CHAID)
Fit the CHAID model
NEET_chtree <- chaid(Ocupation ~., data = train_ch,
control = chaid_control(minprob = 0.001,
minsplit = 60,minbucket = 30))
Plot the tree
plot(NEET_chtree, type = 'simple', gp = gpar(fontsize = 6))
Use the model to predict the target variable on the test subsample as class probabilities.
pred <- as.data.table(predict(object=NEET_chtree,
newdata = test,
type = 'prob'))
Compute the AUC
AUC_chaid <- multiclass.roc(actual$V1,pred)$auc
Use the model to predict the target variable on the test subsample as class labels.
pred <- as.data.table(predict(object=NEET_chtree,
newdata = test,
type = 'response'))
Compute the metrics for the CHAID tree
metrics_chaid <- confusionMatrix(pred$V1,actual$V1, mode = 'everything')
metrics_chaid <- as.data.table(list(Accuracy = as.numeric(metrics_chaid$overall[1]),
Sensitivity = mean(as.numeric(metrics_chaid$byClass[,1])),
Specificity = mean(as.numeric(metrics_chaid$byClass[,2]))))
C5.0. Install and load the package
install.packages('C50')
library(C50)
Fit the C5.0 model
c50tree <- C5.0(Ocupation ~., data = train)
Use the model to predict the target variable on the test subsample as class probabilities.
pred <- as.data.table(predict.C5.0(c50tree,
newdata = test,
type = 'prob'))
Compute the AUC
AUC_c50 <- multiclass.roc(actual$V1,pred)$auc
Use the model to predict the target variable on the test subsample as class labels.
pred <- as.data.table(predict.C5.0(c50tree,
newdata = test,
type = 'class'))
Compute the metrics for the C5.0 tree
metrics_c50 <- confusionMatrix(pred$V1,actual$V1, mode = 'everything')
metrics_c50 <- as.data.table(list(Accuracy = as.numeric(metrics_c50$overall[1]),
Sensitivity = mean(as.numeric(metrics_c50$byClass[,1])),
Specificity = mean(as.numeric(metrics_c50$byClass[,2]))))
metrics_c50
C4.5 and LMT trees from the RWeka package
install.packages('RWeka')
library(RWeka)
Fit the C4.5 and LMT tree models
tree_j48 <- J48(Ocupation ~., data = train, control = Weka_control(U = T, M = 60))
tree_lmt <- LMT(Ocupation ~., data = train, control = Weka_control(I = 1))
Plot the trees
plot(tree_j48)
plot(tree_lmt)
Predict with the C4.5/J48 tree as class probabilities
pred <- as.data.table(predict(object=tree_j48,
newdata = test,
type = 'prob'))
Compute the AUC
AUC_j48 <- multiclass.roc(actual$V1,pred)$auc
AUC_j48
Predict with the C4.5/J48 tree as class labels
pred <- as.data.table(predict(object = tree_j48,
newdata = test,
type = 'class'))
Compute the metrics for the J48/C4.5 tree
metrics_j48 <- confusionMatrix(pred$V1,actual$V1, mode = 'everything')
metrics_j48 <- as.data.table(list(Accuracy = as.numeric(metrics_j48$overall[1]),
Sensitivity = mean(as.numeric(metrics_j48$byClass[,1])),
Specificity = mean(as.numeric(metrics_j48$byClass[,2]))))
metrics_j48
Predict with the LMT tree as class probabilities
pred <- as.data.table(predict(object=tree_lmt,
newdata = test,
type = 'prob'))
Compute the AUC
AUC_lmt <- multiclass.roc(actual$V1,pred)$auc
AUC_lmt
Predict with the LMT tree as class labels
pred <- as.data.table(predict(object = tree_lmt,
newdata = test,
type = 'class'))
Compute the metrics for the LMT tree
metrics_lmt <- confusionMatrix(pred$V1,actual$V1, mode = 'everything')
metrics_lmt <- as.data.table(list(Accuracy = as.numeric(metrics_lmt$overall[1]),
Sensitivity = mean(as.numeric(metrics_lmt$byClass[,1])),
Specificity = mean(as.numeric(metrics_lmt$byClass[,2]))))
metrics_lmt
Print all result tables
Regression model
library(stargazer)
stargazer(regression, type = 'html', out = 'NEET.html', style = 'asr')
Compute the McFadden pseudo R-squared for the multinomial model
nnet.mod.loglik <- nnet:::logLik.multinom(regression)
nnet.mod0 <- multinom(Ocupation ~ 1, train)
nnet.mod0.loglik <- nnet:::logLik.multinom(nnet.mod0)
(nnet.mod.mfr2 <- as.numeric(1 - nnet.mod.loglik/nnet.mod0.loglik))
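The ratio computed above is McFadden's pseudo R-squared: one minus the log-likelihood of the fitted model divided by the log-likelihood of the intercept-only model. The sketch below shows the same ratio with base R alone. It uses a binary logit on the built-in `mtcars` data purely for brevity, which is an assumption of this example and not the multinomial model of the text.

```r
# Binary logit on built-in data (illustrative stand-in for the multinomial model)
fit  <- glm(am ~ wt, data = mtcars, family = binomial)
null <- glm(am ~ 1,  data = mtcars, family = binomial)  # intercept-only model

# McFadden pseudo R-squared: 1 - logLik(model) / logLik(null model)
mcfadden_r2 <- 1 - as.numeric(logLik(fit)) / as.numeric(logLik(null))
print(mcfadden_r2)
```

The generic `logLik()` is used here; values close to 0 mean the predictors add little beyond the intercept.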
Print all metrics for comparison
AUC_regression
metrics_regression
AUC_chaid
metrics_chaid
AUC_rpart
metrics_rpart
AUC_c50
metrics_c50
AUC_j48
metrics_j48
AUC_lmt
metrics_lmt
Appendix. Results of fitting the individual models
Interval target variable
«Height and Life Satisfaction: Evidence from Russia»
Table 1
Regression analysis results
Variable | Target |
Height_sq | 0.0001 |
Height | -0.029 |
Age | 0.037*** |
age_sq | -0.0003*** |
Female | 0.032 |
Health | 0.352*** |
Married | -0.362*** |
Higher_educ | -0.068** |
Ln_wage | -0.091*** |
Believer | -0.158*** |
LN_gpr | -0.044 |
Constant | 4.955 |
N | 5,454 |
R2 | 0.151 |
Adjusted R2 | 0.149 |
Residual Std. Error | 0.794 (df = 5442) |
F Statistic | 88.061*** (df = 11; 5442) |
*p <.05; **p <.01; ***p <.001 |
CART tree specification
n=11291 (2924 observations deleted due to missingness)
node), split, n, deviance, yval
* denotes terminal node
1) root 11291 9561.3150 2.551058
2) Health< 2.5 4565 3018.1060 2.169332
4) LN_gpr< 12.13283 397 129.2645 1.712846 *
5) LN_gpr>=12.13283 4168 2798.2360 2.212812
10) Married>=0.5 2024 1196.6250 2.102767
20) LN_gpr>=13.10486 625 377.1584 1.921600 *
21) LN_gpr< 13.10486 1399 789.7884 2.183703 *
11) Married< 0.5 2144 1553.9620 2.316698
22) Age< 24.5 994 623.1187 2.104628 *
23) Age>=24.5 1150 847.5000 2.500000 *
3) Health>=2.5 6726 5426.5480 2.810140
6) Health< 3.5 5435 3910.9790 2.705612
12) Married>=0.5 2906 1885.9090 2.554370
24) LN_gpr>=13.10486 934 727.0931 2.375803
48) Age< 41.5 291 232.2955 2.127148 *
49) Age>=41.5 643 468.6625 2.488336 *
25) LN_gpr< 13.10486 1972 1114.9290 2.638945 *
13) Married< 0.5 2529 1882.2170 2.879399
26) Age< 26.5 436 306.7706 2.522936 *
27) Age>=26.5 2093 1508.5050 2.953655 *
7) Health>=3.5 1291 1206.1870 3.250194
14) Married>=0.5 511 450.9824 2.994129 *
15) Married< 0.5 780 699.7487 3.417949 *
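For a regression tree, the `deviance` column of the rpart printout is the sum of squared deviations of the target from the node mean, so the gain from a split can be read off as the parent's deviance minus the sum of the children's. The short R sketch below does that arithmetic for the root split of the listing above, with the numbers copied from nodes 1), 2) and 3). The variable names are illustrative only.

```r
# Values copied from nodes 1), 2) and 3) of the listing above
dev_root  <- 9561.3150
dev_left  <- 3018.1060   # Health < 2.5
dev_right <- 5426.5480   # Health >= 2.5

reduction <- dev_root - (dev_left + dev_right)  # drop in the sum of squares
share     <- reduction / dev_root               # fraction of root deviance removed

# Consistency check: child sizes and means should reproduce the root mean
n_left  <- 4565
n_right <- 6726
mean_root <- (n_left * 2.169332 + n_right * 2.810140) / (n_left + n_right)

print(c(reduction = reduction, share = share, mean_root = mean_root))
```

The first split on Health therefore removes roughly 12% of the root deviance, and the weighted child means agree with the root value of 2.551058 up to rounding.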
Ctree tree specification
Model formula:
Target ~ Height_sq + Height + Age + age_sq + Female + Health +
Married + Higher_educ + Ln_wage + Believer + LN_gpr
Fitted party:
[1] root
| [2] Health <= 2
| | [3] Married <= 0
| | | [4] Age <= 30: 2.263 (n = 422, err = 261.8)
| | | [5] Age > 30
| | | | [6] Ln_wage <= 10.43415: 2.579 (n = 442, err = 297.7)
| | | | [7] Ln_wage > 10.43415: 2.340 (n = 153, err = 108.3)
| | [8] Married > 0
| | | [9] Age <= 28
| | | | [10] Believer <= 0: 2.125 (n = 56, err = 26.1)
| | | | [11] Believer > 0: 1.824 (n = 193, err = 90.0)
| | | [12] Age > 28
| | | | [13] Ln_wage <= 10.18494: 2.226 (n = 680, err = 407.1)
| | | | [14] Ln_wage > 10.18494: 2.004 (n = 569, err = 310.0)
| [15] Health > 2
| | [16] Married <= 0
| | | [17] Age <= 29: 2.647 (n = 204, err = 140.6)
| | | [18] Age > 29
| | | | [19] Ln_wage <= 10.37352: 3.004 (n = 804, err = 585.0)
| | | | [20] Ln_wage > 10.37352
| | | | | [21] Believer <= 0: 3.135 (n = 37, err = 28.3)
| | | | | [22] Believer > 0: 2.617 (n = 133, err = 89.4)
| | [23] Married > 0
| | | [24] Age <= 32
| | | | [25] LN_gpr <= 13.02431: 2.410 (n = 161, err = 88.9)
| | | | [26] LN_gpr > 13.02431: 1.990 (n = 101, err = 69.0)
| | | [27] Age > 32
| | | | [28] Ln_wage <= 10.5187
| | | | | [29] Health <= 3: 2.633 (n = 1121, err = 656.3)
| | | | | [30] Health > 3: 2.938 (n = 81, err = 62.7)
| | | | [31] Ln_wage > 10.5187: 2.327 (n = 297, err = 211.3)
Number of inner nodes: 15
Number of terminal nodes: 16
M5P tree specification
M5 unpruned model tree:
(using smoothed linear models)
Health <= 2.5:
| Married <= 0.5:
| | Age <= 30.5:
| | | Ln_wage <= 10.106: LM1 (221/101.503%)
| | | Ln_wage > 10.106: LM2 (201/77.252%)
| | Age > 30.5:
| | | LN_gpr <= 12.932: LM3 (295/84.798%)
| | | LN_gpr > 12.932:
| | | | Ln_wage <= 9.847: LM4 (97/111.27%)
| | | | Ln_wage > 9.847: LM5 (203/101.416%)
| Married > 0.5:
| | LN_gpr <= 13.105:
| | | LN_gpr <= 12.588:
| | | | LN_gpr <= 12.514: LM6 (237/87.162%)
| | | | LN_gpr > 12.514: LM7 (91/85.359%)
| | | LN_gpr > 12.588:
| | | | Age <= 38.5:
| | | | | LN_gpr <= 12.92:
| | | | | | LN_gpr <= 12.867: LM8 (167/80.329%)
| | | | | | LN_gpr > 12.867: LM9 (140/73.576%)
| | | | | LN_gpr > 12.92: LM10 (116/93.523%)
| | | | Age > 38.5: LM11 (280/82.167%)
| | LN_gpr > 13.105:
| | | LN_gpr <= 13.199: LM12 (125/70.464%)
| | | LN_gpr > 13.199:
| | | | Ln_wage <= 10.194: LM13 (134/105.102%)
| | | | Ln_wage > 10.194: LM14 (208/87.101%)
Health > 2.5:
| Married <= 0.5:
| | LN_gpr <= 13:
| | | Ln_wage <= 9.916:
| | | | LN_gpr <= 12.601: LM15 (165/98.416%)
| | | | LN_gpr > 12.601: LM16 (272/95.566%)
| | | Ln_wage > 9.916: LM17 (180/90.712%)
| | LN_gpr > 13:
| | | Age <= 35.5: LM18 (168/108.916%)
| | | Age > 35.5:
| | | | Ln_wage <= 10.389: LM19 (293/96.108%)
| | | | Ln_wage > 10.389: LM20 (100/99.509%)
| Married > 0.5:
| | LN_gpr <= 13.105:
| | | LN_gpr <= 12.616:
| | | | Height_sq <= 29412.5: LM21 (244/86.577%)
| | | | Height_sq > 29412.5: LM22 (179/85.524%)
| | | LN_gpr > 12.616:
| | | | Age <= 38.5: LM23 (225/78.548%)
| | | | Age > 38.5:
| | | | | LN_gpr <= 12.956:
| | | | | | Height_sq <= 28392.5: LM24 (201/92.773%)
| | | | | | Height_sq > 28392.5: LM25 (208/82.793%)
| | | | | LN_gpr > 12.956: LM26 (110/70.236%)
| | LN_gpr > 13.105:
| | | LN_gpr <= 13.463:
| | | | LN_gpr <= 13.41:
| | | | | Believer <= 0.5: LM27 (68/105.521%)
| | | | | Believer > 0.5: LM28 (237/89.905%)
| | | | LN_gpr > 13.41: LM29 (111/115.802%)
| | | LN_gpr > 13.463: LM30 (178/89.16%)
«Human capital of Russian workers: general state and specific features» (original title: «Человеческий капитал российских рабочих: общее состояние и специфические особенности»)
Table 2
Regression analysis results
Variable | Ln_wage |
Work_exp | 0.039*** |
statusCity | 0.147 |
statusSmall town | 0.078 |
statusVillage | -0.229* |
educ | 0.039* |
Work_exp_sq | -0.001* |
Constant | 9.184*** |
N | 1,767 |
R2 | 0.021 |
Adjusted R2 | 0.018 |
Residual Std. Error | 1.674 (df = 1760) |
F Statistic | 6.244*** (df = 6; 1760) |
*p <.05; **p <.01; ***p <.001 |
CART tree specification
n= 1971
1) root 1971 6578.3590 10.133500
2) status=Village 483 1766.6380 9.782897
4) Work_exp< 20.5 319 1224.6390 9.633161 *
5) Work_exp>=20.5 164 520.9348 10.074150 *
3) status=Town,City,Small town 1488 4733.0750 10.247310
6) Work_exp< 3.5 109 1034.3120 9.334591
12) status=Town 41 451.2174 8.751628 *
13) status=City,Small town 68 560.7595 9.686083 *
7) Work_exp>=3.5 1379 3600.7830 10.319460
14) educ< 14.75 1169 2840.0200 10.264020
28) educ< 11.75 574 1145.4280 10.165760
56) status=Town,Small town 292 551.8704 9.998493 *
57) status=City 282 576.9280 10.338970 *
29) educ>=11.75 595 1683.7030 10.358820 *
15) educ>=14.75 210 737.1756 10.628030 *
Ctree tree specification
Fitted party:
[1] root
| [2] status in Town
| | [3] Work_exp <= 3: 8.752 (n = 41, err = 451.2)
| | [4] Work_exp > 3: 10.188 (n = 545, err = 1361.7)
| [5] status in City: 10.240 (n = 651, err = 1519.3)
| [6] status in Small town: 10.166 (n = 90, err = 261.6)
| [7] status in Village: 9.817 (n = 440, err = 1317.8)
Number of inner nodes: 2
Number of terminal nodes: 5
M5P tree specification
M5 unpruned model tree:
(using smoothed linear models)
Work_exp <= 7.5:
| Work_exp <= 3.5:
| | status=Town,Small town,City <= 0.5: LM1 (30/34.127%)
| | status=Town,Small town,City > 0.5:
| | | educ <= 11.5: LM2 (29/234.051%)
| | | educ > 11.5:
| | | | educ <= 12.75: LM3 (34/32.569%)
| | | | educ > 12.75: LM4 (46/203.782%)
| Work_exp > 3.5:
| | status=Small town,City <= 0.5:
| | | educ <= 11.75: LM5 (50/144.26%)
| | | educ > 11.75:
| | | | Work_exp <= 6.5:
| | | | | educ <= 12.75: LM6 (24/40.156%)
| | | | | educ > 12.75: LM7 (36/142.731%)
| | | | Work_exp > 6.5: LM8 (24/21.549%)
| | status=Small town,City > 0.5:
| | | educ <= 11.5: LM9 (24/104.127%)
| | | educ > 11.5:
| | | | educ <= 12.5: LM10 (28/21.877%)
| | | | educ > 12.5: LM11 (43/29.721%)
Work_exp > 7.5:
| Work_exp <= 31.5:
| | Work_exp <= 16.5:
| | | status=Small town,City <= 0.5:
| | | | educ <= 12.25:
| | | | | educ <= 10.5: LM12 (59/37.408%)
| | | | | educ > 10.5:
| | | | | | Work_exp <= 12.5:
| | | | | | | educ <= 11.75: LM13 (36/39.294%)
| | | | | | | educ > 11.75: LM14 (50/150.883%)
| | | | | | Work_exp > 12.5:
| | | | | | | educ <= 11.25: LM15 (25/118.517%)
| | | | | | | educ > 11.25: LM16 (36/26.926%)
| | | | educ > 12.25:
| | | | | Work_exp <= 10.5: LM17 (35/27.788%)
| | | | | Work_exp > 10.5:
| | | | | | Work_exp <= 12.5: LM18 (23/52.219%)
| | | | | | Work_exp > 12.5: LM19 (38/34.58%)
| | | status=Small town,City > 0.5:
| | | | educ <= 14.5:
| | | | | educ <= 13.5:
| | | | | | educ <= 12.75:
| | | | | | | educ <= 9.5: LM20 (17/23.737%)
| | | | | | | educ > 9.5:
| | | | | | | | Work_exp <= 10.5: LM21 (24/35.786%)
| | | | | | | | Work_exp > 10.5: LM22 (41/43.426%)
| | | | | | educ > 12.75: LM23 (24/24.493%)
| | | | | educ > 13.5: LM24 (27/33.396%)
| | | | educ > 14.5: LM25 (46/95.517%)
| | Work_exp > 16.5:
| | | status=Town,Small town,City <= 0.5:
| | | | Work_exp <= 26.5:
| | | | | Work_exp <= 23.5:
| | | | | | Work_exp <= 20.5: LM26 (49/144.175%)
| | | | | | Work_exp > 20.5: LM27 (25/102.422%)
| | | | | Work_exp > 23.5: LM28 (27/202.958%)
| | | | Work_exp > 26.5: LM29 (39/88.819%)
| | | status=Town,Small town,City > 0.5:
| | | | educ <= 11.75:
| | | | | status=City <= 0.5:
| | | | | | Work_exp <= 24.5: LM30 (47/79.038%)
| | | | | | Work_exp > 24.5: LM31 (55/135.093%)
| | | | | status=City > 0.5:
| | | | | | educ <= 10.25: LM32 (39/85.392%)
| | | | | | educ > 10.25:
| | | | | | | Work_exp <= 21.5: LM33 (16/38.987%)
| | | | | | | Work_exp > 21.5: LM34 (44/24.152%)
| | | | educ > 11.75:
| | | | | Work_exp <= 23.5:
| | | | | | Work_exp <= 19.5: LM35 (57/94.783%)
| | | | | | Work_exp > 19.5:
| | | | | | | educ <= 14.5: LM36 (53/115.058%)
| | | | | | | educ > 14.5: LM37 (18/185.284%)
| | | | | Work_exp > 23.5:
| | | | | | Work_exp <= 26.5: LM38 (50/26.88%)
| | | | | | Work_exp > 26.5:
| | | | | | | Work_exp <= 27.5: LM39 (14/153.849%)
| | | | | | | Work_exp > 27.5: LM40 (48/122.849%)
| Work_exp > 31.5:
| | Work_exp <= 35.5:
| | | Work_exp <= 34.5:
| | | | status=Small town,City <= 0.5: LM41 (56/34.574%)
| | | | status=Small town,City > 0.5: LM42 (44/25.007%)
| | | Work_exp > 34.5: LM43 (33/27.167%)
| | Work_exp > 35.5:
| | | Work_exp <= 41.5:
| | | | educ <= 10.5: LM44 (45/47.67%)
| | | | educ > 10.5:
| | | | | status=City <= 0.5: LM45 (48/78.126%)
| | | | | status=City > 0.5: LM46 (31/141.343%)
| | | Work_exp > 41.5:
| | | | Work_exp <= 46.5: LM47 (59/22.912%)
| | | | Work_exp > 46.5: LM48 (21/26.845%)
«Estimates of the "motherhood penalty" in Russia» (original title: «Оценки "штрафа за материнство" в России»)
Table 3
Variable | Ln_wage |
age20-24 | -0.070 |
age25-29 | -0.012 |
age30-34 | 0.032 |
age35-39 | 0.098* |
children_nHas children under 18 | -0.106** |
children_nHas children over 18 | -0.110 |
statusTown | 0.097* |
statusCity | 0.277*** |
statusSmall town | 0.251*** |
partnerNo partner / was married | 0.123** |
partnerNo partner / was not married | 0.024 |
health_count_nLow | 0.011 |
vj11.1Not officially registered | -0.100 |
sectorCommercial sector | 0.212*** |
full_dayFull week | 0.671*** |
university | 0.311*** |
vj61 | 0.356*** |
Constant | 8.723*** |
N | 1,485 |
R2 | 0.268 |
Adjusted R2 | 0.260 |
Residual Std. Error | 0.513 (df = 1467) |
F Statistic | 31.654*** (df = 17; 1467) |
*p <.05; **p <.01; ***p <.001 |
CART tree specification
n=1680 (90 observations deleted due to missingness)
node), split, n, deviance, yval
* denotes terminal node
1) root 1680 606.70860 9.863185
2) university< 0.5 927 297.49560 9.694236
4) vj6=0 803 235.65770 9.640228
8) status=Village 228 67.73815 9.447899
16) sector=Social sphere and civil service 65 14.92798 9.289711 *
17) sector=Commercial sector 163 50.53504 9.510980 *
9) status=Town,City,Small town 575 156.14150 9.716491
18) sector=Social sphere and civil service 97 29.49475 9.495410 *
19) sector=Commercial sector 478 120.94360 9.761355
38) status=Town 202 43.84122 9.681197 *
39) status=City,Small town 276 74.85460 9.820021 *
5) vj6=1 124 44.32803 10.043980 *
3) university>=0.5 753 250.17870 10.071170
6) vj6=0 571 173.66570 9.977542
12) sector=Social sphere and civil service 224 62.33324 9.836797
24) status=Village,Town,Small town 121 32.39154 9.721097 *
25) status=City 103 26.41914 9.972715 *
13) sector=Commercial sector 347 104.03080 10.068400
26) status=Village,Town 123 34.95513 9.951862 *
27) status=City,Small town 224 66.48806 10.132390 *
7) vj6=1 182 55.80187 10.364930
14) status=Village,Town 68 19.35471 10.187810 *
15) status=City,Small town 114 33.04146 10.470580 *
Ctree tree specification
Model formula:
Ln_wage ~ age + children_n + status + partner + health_count_n +
vj11.1 + sector + full_day + university + vj6
Fitted party:
[1] root
| [2] university <= 0
| | [3] vj6 in 0
| | | [4] status in Village
| | | | [5] full_day in Full week
| | | | | [6] sector in Social sphere and civil service
| | | | | | [7] children_n in Has children under 18: 9.196 (n = 42, err = 5.8)
| | | | | | [8] children_n in No children, Has children over 18: 9.542 (n = 20, err = 3.6)
| | | | | [9] sector in Commercial sector: 9.566 (n = 137, err = 37.8)
| | | | [10] full_day in Part-time week: 8.476 (n = 5, err = 3.7)
| | | [11] status in Town
| | | | [12] sector in Social sphere and civil service
| | | | | [13] full_day in Full week: 9.418 (n = 38, err = 6.9)
| | | | | [14] full_day in Part-time week: 8.132 (n = 1, err = 0.0)
| | | | [15] sector in Commercial sector: 9.664 (n = 164, err = 35.7)
| | | [16] status in City
| | | | [17] full_day in Full week: 9.810 (n = 224, err = 60.7)
| | | | [18] full_day in Part-time week: 8.966 (n = 6, err = 0.5)
| | | [19] status in Small town: 9.808 (n = 44, err = 12.5)
| | [20] vj6 in 1: 10.042 (n = 115, err = 38.5)
| [21] university > 0
| | [22] vj6 in 0
| | | [23] sector in Social sphere and civil service: 9.832 (n = 206, err = 54.4)
| | | [24] sector in Commercial sector: 10.064 (n = 312, err = 88.7)
| | [25] vj6 in 1
| | | [26] health_count_n in High: 10.258 (n = 108, err = 25.1)
| | | [27] health_count_n in Low: 10.554 (n = 63, err = 25.5)
Number of inner nodes: 12
Number of terminal nodes: 15
M5P tree specification
M5 unpruned model tree:
(using smoothed linear models)
university <= 0.5:
| status=Small towm,City <= 0.5:
| | sector=Commercial sector <= 0.5: LM1 (121/82.841%)
| | sector=Commercial sector > 0.5:
| | | status=Town,Small towm,City <= 0.5: LM2 (175/92.16%)
| | | status=Town,Small towm,City > 0.5:
| | | | partner=Wasnt married_No partner,Was married_No partner, <= 0.5: LM3 (156/82.782%)
| | | | partner=Wasnt married_No partner,Was married_No partner, > 0.5: LM4 (74/72.609%)
| status=Small towm,City > 0.5:
| | vj11.1=Official job <= 0.5: LM5 (94/118.612%)
| | vj11.1=Official job > 0.5:
| | | health_count_n=Low <= 0.5:
| | | | partner=Wasnt married_No partner,Was married_No partner, <= 0.5: LM6 (133/82.574%)
| | | | partner=Wasnt married_No partner,Was married_No partner, > 0.5: LM7 (88/100.306%)
| | | health_count_n=Low > 0.5: LM8 (85/82.152%)
university > 0.5:
| vj6 <= 0.5:
| | sector=Commercial sector <= 0.5:
| | | status=City <= 0.5: LM9 (120/86.128%)
| | | status=City > 0.5: LM10 (103/84.218%)
| | sector=Commercial sector > 0.5:
| | | status=Small towm,City <= 0.5: LM11 (123/88.647%)
| | | status=Small towm,City > 0.5:
| | | | partner=Wasnt married_No partner,Was married_No partner, <= 0.5: LM12 (155/89.236%)
| | | | partner=Wasnt married_No partner,Was married_No partner, > 0.5: LM13 (68/90.998%)
| vj6 > 0.5: LM14 (182/92.077%)
Ordinal target variable
«When information dominates comparison»
Table 4
Regression specification
Variable | TARGET |
v_age | -0.109 |
age_sq | 0.001 |
Male | -0.004 |
v_nfm | 0.046*** |
Russian | -0.367*** |
Believer | 0.398*** |
MaritalDivorced | -0.108*** |
MaritalMarried | 0.942*** |
MaritalWidowed | 0.014 |
years_educ | 0.051*** |
Experience | 0.017*** |
OccupationClerks | 0.010*** |
OccupationCraft_and_related | 0.046*** |
OccupationElementary_unskilled_occupation | -0.213*** |
OccupationLegislators_senior managers_officials | 0.270*** |
OccupationOperators_and_assemblers | 0.081*** |
OccupationProfessionals | 0.096* |
OccupationServices_workers | -0.100* |
OccupationSkilled_agriculture_and_fish | 1.564*** |
OccupationTechnicians | 0.053 |
RegCentral and Central Black-Earth | -0.067 |
RegEastern Siberia and Far Eastern | -0.097*** |
RegNorth Caucasian | 0.496*** |
RegNorthern and North Western | 1.179*** |
RegSouthern | -0.256*** |
RegUral | -0.324*** |
RegVolga-Vyatski and Volga Basin | -0.158*** |
RegWestern Siberia | 0.008 |
HealthAverage | 1.638*** |
HealthBad | 0.766*** |
HealthGood | 2.563*** |
HealthVery good | 3.505*** |
N | 9,023 |
*p <.05; **p <.01; ***p <.001 |
CHAID tree specification
Model formula:
TARGET ~ v_age + age_sq + Male + v_nfm + Russian + Believer +
Marital + years_educ + Experience + Occupation + Reg + Health
Fitted party:
[1] root
| [2] Health in Very bad: 2 (n = 159, err = 67.9%)
| [3] Health in Average
| | [4] Reg in Moskow and Spb, Western Siberia
| | | [5] Marital in Single, Divorced, Widowed: 3 (n = 432, err = 54.4%)
| | | [6] Marital in Married
| | | | [7] Believer in 0: 3 (n = 211, err = 55.0%)
| | | | [8] Believer in 1: 3 (n = 645, err = 57.2%)
| | [9] Reg in Central and Central Black-Earth, Volga -Vaytski and Volga Basin
| | | [10] Marital in Single, Divorced, Widowed: 3 (n = 650, err = 45.8%)
| | | [11] Marital in Married
| | | | [12] Reg in Moskow and Spb, Central and Central Black-Earth, Eastern Siberia and Far Eastern, North Caucasian, Northern and North Western, Southern, Ural, Western Siberia: 3 (n = 620, err = 46.3%)
| | | | [13] Reg in Volga -Vaytski and Volga Basin: 3 (n = 697, err = 51.1%)
| | [14] Reg in Eastern Siberia and Far Eastern, Ural: 3 (n = 547, err = 54.1%)
| | [15] Reg in North Caucasian: 3 (n = 138, err = 59.4%)
| | [16] Reg in Northern and North Western: 5 (n = 318, err = 67.3%)
| | [17] Reg in Southern: 3 (n = 439, err = 54.9%)
| [18] Health in Bad
| | [19] Reg in Moskow and Spb, Central and Central Black-Earth, Eastern Siberia and Far Eastern, North Caucasian, Southern, Ural, Volga -Vaytski and Volga Basin, Western Siberia
| | | [20] Marital in Single, Widowed: 2 (n = 392, err = 61.7%)
| | | [21] Marital in Divorced: 2 (n = 115, err = 66.1%)
| | | [22] Marital in Married: 3 (n = 463, err = 54.6%)
| | [23] Reg in Northern and North Western: 4 (n = 75, err = 70.7%)
| [24] Health in Good
| | [25] Marital in Single, Widowed: 4 (n = 548, err = 55.5%)
| | [26] Marital in Divorced: 3 (n = 229, err = 57.6%)
| | [27] Marital in Married
| | | [28] Reg in Moskow and Spb, Central and Central Black-Earth, Eastern Siberia and Far Eastern, North Caucasian, Southern, Ural, Volga -Vaytski and Volga Basin, Western Siberia
| | | | [29] Believer in 0: 4 (n = 532, err = 53.8%)
| | | | [30] Believer in 1
| | | | | [31] Reg in Moskow and Spb, Central and Central Black-Earth, Eastern Siberia and Far Eastern, Northern and North Western, Southern, Ural, Volga -Vaytski and Volga Basin, Western Siberia: 4 (n = 1423, err = 49.4%)
| | | | | [32] Reg in North Caucasian: 4 (n = 137, err = 42.3%)
| | | [33] Reg in Northern and North Western: 5 (n = 144, err = 50.7%)
| [34] Health in Very good: 5 (n = 109, err = 56.0%)
Rpart tree specification
n= 11313
node), split, n, loss, yval, (yprob)
* denotes terminal node
1) root 11313 7035 3 (0.019 0.12 0.38 0.36 0.13)
2) Health=Very bad,Average,Bad 6772 3749 3 (0.029 0.17 0.45 0.28 0.073)
4) Health=Very bad,Bad 1327 834 3 (0.092 0.3 0.37 0.19 0.044)
8) Marital=Single,Divorced,Widowed 712 454 2 (0.12 0.36 0.34 0.16 0.022)
16) Male>=0.5 122 65 2 (0.24 0.47 0.2 0.09 0.0082) *
17) Male< 0.5 590 374 3 (0.097 0.34 0.37 0.17 0.025)
34) Reg=Southern,Ural,Western Siberia 164 89 2 (0.085 0.46 0.31 0.15 0) *
35) Reg=Moskow and Spb,Central and Central Black-Earth,Eastern Siberia and Far Eastern,North Caucasian,Northern and North Western,Volga -Vaytski and Volga Basin 426 261 3 (0.1 0.3 0.39 0.18 0.035) *
9) Marital=Married 615 362 3 (0.059 0.23 0.41 0.23 0.068)
18) Reg=Moskow and Spb,Central and Central Black-Earth,Eastern Siberia and Far Eastern,North Caucasian,Southern,Ural,Volga -Vaytski and Volga Basin,Western Siberia 552 309 3 (0.056 0.24 0.44 0.22 0.049) *
19) Reg=Northern and North Western 63 41 4 (0.079 0.17 0.16 0.35 0.24) *
5) Health=Average 5445 2915 3 (0.013 0.14 0.46 0.31 0.08)
10) Reg=Moskow and Spb,Central and Central Black-Earth,Eastern Siberia and Far Eastern,North Caucasian,Southern,Ural,Volga -Vaytski and Volga Basin,Western Siberia 5092 2657 3 (0.014 0.14 0.48 0.31 0.062)
20) Marital=Divorced,Widowed 1158 594 3 (0.029 0.25 0.49 0.21 0.028)
40) Reg=Moskow and Spb,Eastern Siberia and Far Eastern,North Caucasian,Southern,Ural,Western Siberia 628 355 3 (0.04 0.28 0.43 0.22 0.032)
80) Male>=0.5 100 58 2 (0.05 0.42 0.37 0.15 0.01) *
81) Male< 0.5 528 292 3 (0.038 0.25 0.45 0.23 0.036) *
41) Reg=Central and Central Black-Earth,Volga -Vaytski and Volga Basin 530 239 3 (0.017 0.22 0.55 0.2 0.023) *
21) Marital=Single,Married 3934 2063 3 (0.0089 0.11 0.48 0.34 0.072)
42) v_age>=37.5 2631 1328 3 (0.0095 0.11 0.5 0.32 0.062)
84) Reg=Moskow and Spb,Central and Central Black-Earth,Eastern Siberia and Far Eastern,Ural,Volga -Vaytski and Volga Basin 1846 889 3 (0.011 0.11 0.52 0.3 0.065)
168) Marital=Single 100 52 3 (0.02 0.27 0.48 0.22 0.01) *
169) Marital=Married 1746 837 3 (0.01 0.096 0.52 0.3 0.068)
338) Reg=Moskow and Spb,Eastern Siberia and Far Eastern,Ural 634 329 3 (0.013 0.13 0.48 0.29 0.085)
676) Occupation=Unemployed,Elementary_unskilled_occupation,Operators_and_assemblers,Professionals,Services_workers,Technicians 539 265 3 (0.015 0.13 0.51 0.27 0.074) *
677) Occupation=Clerks,Craft_and_related,Legislators_senior managers_officials 95 59 4 (0 0.15 0.33 0.38 0.15) *
339) Reg=Central and Central Black-Earth,Volga -Vaytski and Volga Basin 1112 508 3 (0.009 0.076 0.54 0.31 0.058) *
85) Reg=North Caucasian,Southern,Western Siberia 785 439 3 (0.0064 0.13 0.44 0.37 0.054)
170) Believer< 0.5 176 83 3 (0.0057 0.19 0.53 0.23 0.045) *
171) Believer>=0.5 609 356 3 (0.0066 0.12 0.42 0.4 0.056)
342) Reg=Southern,Western Siberia 531 303 3 (0.0075 0.12 0.43 0.39 0.051) *
343) Reg=North Caucasian 78 38 4 (0 0.077 0.32 0.51 0.09) *
43) v_age< 37.5 1303 735 3 (0.0077 0.089 0.44 0.38 0.092)
86) Reg=Central and Central Black-Earth,Eastern Siberia and Far Eastern,North Caucasian,Southern,Ural,Volga -Vaytski and Volga Basin 928 491 3 (0.0065 0.073 0.47 0.37 0.08)
172) Marital=Single 328 163 3 (0.012 0.11 0.5 0.3 0.067) *
173) Marital=Married 600 328 3 (0.0033 0.052 0.45 0.4 0.087)
346) Reg=Eastern Siberia and Far Eastern,North Caucasian,Southern 149 66 3 (0 0.047 0.56 0.3 0.094) *
347) Reg=Central and Central Black-Earth,Ural,Volga -Vaytski and Volga Basin 451 253 4 (0.0044 0.053 0.42 0.44 0.084) *
87) Reg=Moskow and Spb,Western Siberia 375 229 4 (0.011 0.13 0.35 0.39 0.12) *
11) Reg=Northern and North Western 353 234 5 (0.011 0.099 0.27 0.28 0.34) *
3) Health=Good,Very good 4541 2394 4 (0.0037 0.041 0.28 0.47 0.21)
6) Reg=Moskow and Spb,Central and Central Black-Earth,Eastern Siberia and Far Eastern,Southern,Ural,Volga -Vaytski and Volga Basin,Western Siberia 3720 1980 4 (0.004 0.046 0.31 0.47 0.17)
12) Marital=Divorced,Widowed 316 176 3 (0.025 0.15 0.44 0.32 0.066) *
13) Marital=Single,Married 3404 1765 4 (0.0021 0.037 0.3 0.48 0.18)
26) Health=Good 3262 1684 4 (0.0021 0.038 0.31 0.48 0.17)
52) v_age>=38.5 1004 538 4 (0.001 0.038 0.37 0.46 0.13)
104) Reg=Central and Central Black-Earth,Eastern Siberia and Far Eastern,Southern,Ural,Volga -Vaytski and Volga Basin 743 378 4 (0.0013 0.038 0.36 0.49 0.1)
208) years_educ< 12.5 185 111 3 (0 0.076 0.4 0.39 0.14) *
209) years_educ>=12.5 558 265 4 (0.0018 0.025 0.35 0.53 0.095) *
105) Reg=Moskow and Spb,Western Siberia 261 160 4 (0 0.038 0.37 0.39 0.2) *
53) v_age< 38.5 2258 1146 4 (0.0027 0.038 0.28 0.49 0.19)
106) Believer< 0.5 688 382 4 (0.0015 0.065 0.33 0.44 0.16)
212) Reg=Southern,Ural,Volga -Vaytski and Volga Basin 297 180 3 (0 0.098 0.39 0.37 0.13) *
213) Reg=Moskow and Spb,Central and Central Black-Earth,Eastern Siberia and Far Eastern,Western Siberia 391 196 4 (0.0026 0.041 0.28 0.5 0.18) *
107) Believer>=0.5 1570 764 4 (0.0032 0.026 0.26 0.51 0.2) *
27) Health=Very good 142 81 4 (0 0.007 0.13 0.43 0.43) *
7) Reg=North Caucasian,Northern and North Western 821 414 4 (0.0024 0.021 0.12 0.5 0.36)
14) Reg=Northern and North Western 299 163 5 (0.0033 0.047 0.17 0.32 0.45) *
15) Reg=North Caucasian 522 212 4 (0.0019 0.0057 0.096 0.59 0.3) *
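In the classification printout above, `loss` counts the observations in a node that do not belong to the predicted class (assuming the default loss matrix and no case weights, which is an assumption here since the fitting call is not shown), and `yprob` lists the class shares. The R sketch below checks this reading on the root node, with the numbers copied from node 1). The variable names are illustrative only.

```r
# Root node of the listing above: n = 11313, loss = 7035, predicted class 3
n_root     <- 11313
loss_root  <- 7035
yprob_root <- c(0.019, 0.12, 0.38, 0.36, 0.13)  # class shares as printed (rounded)

error_rate      <- loss_root / n_root      # share of observations outside the predicted class
majority_share  <- max(yprob_root)         # printed share of the predicted class
predicted_class <- which.max(yprob_root)   # position of the largest share

print(c(error_rate = error_rate, one_minus_majority = 1 - majority_share))
print(predicted_class)
```

The two figures agree up to the rounding of the printed shares, and the same ratio can be formed for any other node to compare how pure the leaves are.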
C5.0 tree specification
Call:
C5.0.formula(formula = TARGET ~., data = train, control = C5.0Control(CF = 0.25, minCases
= 30, fuzzyThreshold = F, noGlobalPruning = F))
Class specified by attribute `outcome'
Read 11313 cases (13 attributes) from undefined.data
Decision tree:
Health in {Good,Very good}:
:...Reg = North Caucasian: 4 (521.8/212)
: Reg in {Moskow and Spb,Central and Central Black-Earth,
: : Eastern Siberia and Far Eastern,Southern,Ural,
: : Volga -Vaytski and Volga Basin,Western Siberia}:
: :...Marital in {Single,Married}: 4 (3406.5/1766.6)
: : Marital in {Divorced,Widowed}: 3 (325.1/181.2)
: Reg = Northern and North Western:
: :...v_age <= 49: 5 (267/138)
: v_age > 49: 4 (32/16)
Health in {Very bad,Average,Bad}:
:...Reg = Northern and North Western:
:...Health in {Very bad,Bad}: 4 (117/83)
: Health = Average:
: :...Marital in {Divorced,Widowed}: 3 (72.2/47.2)
: Marital in {Single,Married}:
: :...v_nfm <= 2: 3 (102.8/58.8)
: v_nfm > 2: 5 (178/96)
Reg in {Moskow and Spb,Central and Central Black-Earth,
: Eastern Siberia and Far Eastern,North Caucasian,Southern,Ural,
: Volga -Vaytski and Volga Basin,Western Siberia}:
:...Health in {Very bad,Bad}:
:...Marital = Married: 3 (556.3/311.7)
: Marital in {Single,Divorced,Widowed}:
: :...Male > 0: 2 (114.9/60.9)
: Male <= 0:
: :...Believer <= 0: 2 (45.5/19.4)
: Believer > 0:
: :...Health = Very bad: 2 (66.3/39.2)
: Health = Bad: 3 (430.3/260.9)
Health = Average:
:...Marital in {Single,Divorced,Widowed}: 3 (1767.4/916)
Marital = Married:
:...v_age > 36:
:...Reg in {Moskow and Spb,Central and Central Black-Earth,
: : Eastern Siberia and Far Eastern,Southern,Ural,
: : Volga -Vaytski and Volga Basin,
: : Western Siberia}: 3 (2468.2/1235.7)
: Reg = North Caucasian: 4 (81.5/40)
v_age <= 36:
:...Reg in {Central and Central Black-Earth,
: Eastern Siberia and Far Eastern,North Caucasian,
: Southern}: 3 (303.1/158.1)
Reg = Western Siberia: 4 (136.7/77.7)
Reg = Moskow and Spb:
:...v_nfm <= 3: 3 (30/15)
: v_nfm > 3: 4 (39.5/19.5)
Reg = Ural:
:...v_nfm <= 3: 4 (32/14)
: v_nfm > 3: 3 (32/15)
Reg = Volga -Vaytski and Volga Basin:
:...Russian <= 0: 4 (44.9/16)
Russian > 0:
:...years_educ <= 15: 3 (77.4/38)
years_educ > 15: 4 (64.5/31.5)
J48 tree specification
J48 pruned tree
------------------
Health = Very bad: 2 (159.0/108.0)
Health = Average
| Marital = Single: 3 (349.0/175.0)
| Marital = Divorced: 3 (493.0/255.0)
| Marital = Married
| | Reg = Moskow and Spb
| | | v_nfm <= 2: 3 (126.0/61.0)
| | | v_nfm > 2
| | | | years_educ <= 14: 3 (121.0/62.0)
| | | | years_educ > 14: 4 (112.0/60.0)
| | Reg = Central and Central Black-Earth: 3 (620.0/287.0)
| | Reg = Eastern Siberia and Far Eastern: 3 (140.0/72.0)
| | Reg = North Caucasian: 4 (93.0/51.0)
| | Reg = Northern and North Western
| | | Experience <= 21: 5 (106.0/50.0)
| | | Experience > 21: 3 (114.0/75.0)
| | Reg = Southern
| | | v_age <= 57: 3 (215.0/106.0)
| | | v_age > 57: 4 (110.0/54.0)
| | Reg = Ural
| | | v_nfm <= 2: 4 (100.0/59.0)
| | | v_nfm > 2: 3 (143.0/68.0)
| | Reg = Volga –Vaytski and Volga Basin
| | | v_age <= 40
| | | | v_age <= 34: 4 (111.0/53.0)
| | | | v_age > 34: 3 (106.0/52.0)
| | | v_age > 40: 3 (480.0/236.0)
| | Reg = Western Siberia
| | | v_age <= 36: 4 (117.0/68.0)
| | | v_age > 36: 3 (380.0/204.0)
| Marital = Widowed: 3 (661.0/343.0)
Health = Bad
| Marital = Single: 2 (44.0/22.0)
| Marital = Divorced: 2 (128.0/84.0)
| Marital = Married: 3 (504.0/286.0)
| Marital = Widowed
| | v_age <= 71: 3 (127.0/71.0)
| | v_age > 71
| | | Experience <= 39: 2 (111.0/63.0)
| | | Experience > 39: 3 (131.0/78.0)
Health = Good
| Marital = Single: 4 (426.0/232.0)
| Marital = Divorced: 3 (229.0/132.0)
| Marital = Married: 4 (2236.0/1147.0)
| Marital = Widowed: 4 (122.0/72.0)
Health = Very good: 5 (109.0/61.0)
Number of Leaves : 32
Size of the tree: 47
LMT tree specification
Logistic model tree
------------------
: LM_1:35/35 (9023)
Number of Leaves : 1
Size of the Tree: 1
LM_1:
Class 1:
-0.35 +
[Male] * 0.09 +
[v_nfm] * -0.21 +
[Russian] * 0.69 +
[Believer] * -0.48 +
[Marital=Single] * -0.2 +
[Marital=Divorced] * 0.91 +
[Marital=Married] * -0.79 +
[years_educ] * -0.09 +
[Experience] * -0.01 +
[Occupation=Unemployed] * 0.08 +
[Occupation=Technicians] * -0.35 +
[Reg=Central and Central Black-Earth] * 0.18 +
[Reg=North Caucasian] * -0.35 +
[Reg=Northern and North Western] * -0.22 +
[Reg=Ural] * 0.15 +
[Reg=Western Siberia] * -0.1 +
[Health=Very bad] * 2.23 +
[Health=Average] * -0.13 +
[Health=Bad] * 1.1 +
[Health=Good] * -1.6
Class 2:
0.9 +
[v_nfm] * -0.02 +
[Russian] * 0.14 +
[Believer] * -0.31 +
[Marital=Divorced] * 0.06 +
[Marital=Married] * -1.05 +
[years_educ] * -0.03 +
[Occupation=Unemployed] * 0.29 +
[Occupation=Clerks] * 0.11 +
[Occupation=Elementary_unskilled_occupation] * 0.47 +
[Occupation=Legislators_senior managers_officials] * -0.13 +
[Occupation=Professionals] * -0.11 +
[Occupation=Services_workers] * 0.1 +
[Occupation=Technicians] * -0.06 +
[Reg=Central and Central Black-Earth] * -0.11 +
[Reg=Eastern Siberia and Far Eastern] * 0.09 +
[Reg=North Caucasian] * -0.19 +
[Reg=Northern and North Western] * -0.3 +
[Reg=Southern] * 0.34 +
[Reg=Ural] * 0.29 +
[Reg=Volga –Vaytski and Volga Basin] * 0.04 +
[Reg=Western Siberia] * 0.17 +
[Health=Very bad] * 0.37 +
[Health=Bad] * 0.18 +
[Health=Good] * -1.04 +
[Health=Very good] * -1.63
Class 3:
0.55 +
[v_age] * 0 +
[age_sq] * -0 +
[Male] * -0.11 +
[v_nfm] * -0.01 +
[Russian] * 0.15 +
[Believer] * -0.03 +
[Marital=Single] * 0.04 +
[Marital=Married] * -0.22 +
[years_educ] * -0 +
[Experience] * 0 +
[Occupation=Operators_and_assemblers] * 0.13 +
[Occupation=Services_workers] * 0.05 +
[Occupation=Skilled_agriculture_and_fish] * -1.58 +
[Occupation=Technicians] * 0.04 +
[Reg=Central and Central Black-Earth] * 0.15 +
[Reg=Eastern Siberia and Far Eastern] * -0.09 +
[Reg=North Caucasian] * -0.15 +
[Reg=Northern and North Western] * -0.35 +
[Reg=Volga –Vaytski and Volga Basin] * 0.1 +
[Reg=Western Siberia] * -0.07 +
[Health=Very bad] * -0.25 +
[Health=Average] * 0.54 +
[Health=Bad] * -0.04 +
[Health=Good] * 0.13 +
[Health=Very good] * -0.99
Class 4:
-0.01 +
[v_age] * -0 +
[age_sq] * 0 +
[Male] * 0.02 +
[v_nfm] * 0.01 +
[Russian] * -0.13 +
[Believer] * 0.21 +
[Marital=Divorced] * -0.26 +
[Marital=Married] * 0.07 +
[years_educ] * 0.03 +
[Occupation=Clerks] * 0.05 +
[Occupation=Elementary_unskilled_occupation] * -0.08 +
[Occupation=Legislators_senior managers_officials] * 0.17 +
[Occupation=Services_workers] * -0.06 +
[Occupation=Skilled_agriculture_and_fish] * -0.73 +
[Occupation=Technicians] * -0.06 +
[Reg=North Caucasian] * 0.41 +
[Reg=Ural] * -0.17 +
[Health=Very bad] * -0.49 +
[Health=Average] * 0.21 +
[Health=Bad] * -0.49 +
[Health=Good] * 0.61 +
[Health=Very good] * 0.19
Class 5:
-1.07 +
[v_age] * -0.01 +
[v_nfm] * 0.03 +
[Russian] * -0.29 +
[Believer] * 0.42 +
[Marital=Divorced] * -0.3 +
[Marital=Married] * 0.59 +
[Occupation=Elementary_unskilled_occupation] * -0.37 +
[Occupation=Legislators_senior managers_officials] * 0.16 +
[Occupation=Professionals] * 0.08 +
[Occupation=Services_workers] * -0.26 +
[Occupation=Skilled_agriculture_and_fish] * 0.99 +
[Occupation=Technicians] * 0.06 +
[Reg=Moskow and Spb] * 0.12 +
[Reg=North Caucasian] * 0.53 +
[Reg=Northern and North Western] * 1.4 +
[Reg=Southern] * -0.27 +
[Reg=Ural] * -0.09 +
[Reg=Volga –Vaytski and Volga Basin] * -0.22 +
[Reg=Western Siberia] * 0.1 +
[Health=Very bad] * -0.29 +
[Health=Bad] * -0.81 +
[Health=Good] * 0.77 +
[Health=Very good] * 1.35
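With a single leaf, the LMT above reduces to one multinomial logistic model: LM_1 gives a linear score for each of the five classes. Assuming, as in the usual LMT formulation, that class probabilities are obtained by applying a softmax to these scores, the sketch below evaluates them in R. To stay short it uses only the intercepts and the `Health=Good` coefficients printed above and sets every other term to zero, so the profile is purely illustrative and not a prediction for any real respondent. The variable names are hypothetical.

```r
# Intercepts of LM_1 for classes 1..5, copied from the printout above
intercept <- c(-0.35, 0.9, 0.55, -0.01, -1.07)
# Coefficients of the [Health=Good] dummy for classes 1..5
health_good <- c(-1.6, -1.04, 0.13, 0.61, 0.77)

# Linear scores for a hypothetical observation with Health = Good;
# all remaining terms are set to zero (an illustrative simplification)
score <- intercept + health_good

# Softmax: exponentiate the scores and normalise so they sum to one
prob <- exp(score) / sum(exp(score))
names(prob) <- paste("Class", 1:5)

round(prob, 3)
names(which.max(prob))  # class with the highest probability
```

Under this simplification the ranking of classes is driven entirely by the intercept and the health dummy. A full prediction would add every term of LM_1 that applies to the observation.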
«Socio-demographic characteristics, alcohol drinking and self-rated health among Russian women»
Table 5
Regression specifications
Variable | TARGET |
Agegroup | -0.418*** |
education | 0.141*** |
MaritalMarried | 0.014 |
MaritalSingle | 0.091 |
MaritalWidowed | -0.205*** |
Urban | -0.156*** |
DrinkingDrinks regularly | 0.090** |
DrinkingDrinks seldom | 0.202*** |
N | 6,290 |
*p < .05; **p < .01; ***p < .001
CHAID tree specification
Model formula:
TARGET ~ Agegroup + education + Marital + Urban + Drinking
Fitted party:
[1] root
| [2] Agegroup in 1
| | [3] Drinking in Never drinks: Good (n = 451, err = 31.0%)
| | [4] Drinking in Drinks regularly, Drinks seldom: Good (n = 565, err = 42.1%)
| [5] Agegroup in 2
| | [6] Drinking in Never drinks, Drinks seldom: Good (n = 705, err = 43.8%)
| | [7] Drinking in Drinks regularly: Average (n = 556, err = 50.0%)
| [8] Agegroup in 3: Average (n = 1069, err = 42.8%)
| [9] Agegroup in 4: Average (n = 975, err = 30.4%)
| [10] Agegroup in 5
| | [11] Drinking in Never drinks
| | | [12] education in 1: Bad (n = 341, err = 53.4%)
| | | [13] education in 2
| | | | [14] Marital in Divorced, Widowed: Average (n = 374, err = 49.5%)
| | | | [15] Marital in Married, Single: Average (n = 228, err = 48.2%)
| | | [16] education in 3: Average (n = 202, err = 37.6%)
| | [17] Drinking in Drinks regularly, Drinks seldom: Average (n = 824, err = 29.7%)
Number of inner nodes: 6
Number of terminal nodes: 11
rpart tree specification
n= 7893
node), split, n, loss, yval, (yprob)
* denotes terminal node
1) root 7893 4246 Average (0.013 0.11 0.46 0.4 0.017)
2) Agegroup>=3.5 2959 1134 Average (0.032 0.24 0.62 0.11 0.002)
4) Drinking=Never drinks 1534 717 Average (0.052 0.32 0.53 0.097 0.002)
8) Agegroup>=4.5 1150 587 Average (0.064 0.37 0.49 0.07 0.0017)
16) education< 1.5 341 182 Bad (0.091 0.47 0.37 0.067 0.0059)
32) Marital=Divorced,Widowed 259 129 Bad (0.089 0.5 0.35 0.058 0.0039) *
33) Marital=Married,Single 82 46 Average (0.098 0.35 0.44 0.098 0.012) *
17) education>=1.5 809 372 Average (0.053 0.34 0.54 0.07 0) *
9) Agegroup< 4.5 384 130 Average (0.016 0.14 0.66 0.18 0.0026) *
5) Drinking=Drinks regularly,Drinks seldom 1425 417 Average (0.0098 0.15 0.71 0.13 0.0021) *
3) Agegroup< 3.5 4934 2095 Good (0.0016 0.028 0.37 0.58 0.026)
6) Agegroup>=2.5 1076 459 Average (0.0056 0.057 0.57 0.36 0.0074) *
7) Agegroup< 2.5 3858 1403 Good (0.00052 0.02 0.31 0.64 0.031)
14) Drinking=Drinks regularly 925 458 Good (0.0011 0.024 0.46 0.5 0.014)
28) Agegroup>=1.5 557 279 Average (0.0018 0.022 0.5 0.47 0.011)
56) Marital=Divorced,Married 499 246 Average (0.002 0.018 0.51 0.46 0.012)
112) education< 1.5 35 15 Average (0 0.029 0.57 0.4 0) *
113) education>=1.5 464 231 Average (0.0022 0.017 0.5 0.47 0.013)
226) Marital=Married 410 203 Average (0.0024 0.02 0.5 0.46 0.015) *
227) Marital=Divorced 54 26 Good (0 0 0.48 0.52 0) *
57) Marital=Single,Widowed 58 28 Good (0 0.052 0.43 0.52 0) *
29) Agegroup< 1.5 368 161 Good (0 0.027 0.39 0.56 0.019) *
15) Drinking=Never drinks,Drinks seldom 2933 945 Good (0.00034 0.019 0.27 0.68 0.036)
30) Agegroup>=1.5 1957 625 Good (0.00051 0.02 0.26 0.68 0.038)
60) Marital=Married,Widowed 1819 574 Good (0 0.02 0.26 0.68 0.038)
120) education>=2.5 1202 377 Good (0 0.02 0.25 0.69 0.043)
240) Urban< 0.5 35 16 Average (0 0.029 0.54 0.4 0.029) *
241) Urban>=0.5 1167 356 Good (0 0.02 0.24 0.69 0.044) *
121) education< 2.5 617 197 Good (0 0.019 0.27 0.68 0.029) *
61) Marital=Divorced,Single 138 51 Good (0.0072 0.029 0.3 0.63 0.029) *
31) Agegroup< 1.5 976 320 Good (0 0.015 0.28 0.67 0.033) *
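Each line of the rpart printout follows the legend `node), split, n, loss, yval, (yprob)`: `n` is the number of training observations in the node and `loss` is how many of them do not belong to the predicted class `yval`. A minimal sketch in R, with the numbers copied from node 16 above and hypothetical variable names, turns them into a node error rate:

```r
# Values copied from node 16 of the rpart printout above:
# "16) education< 1.5 341 182 Bad (...)"
node_n    <- 341  # observations that reach the node
node_loss <- 182  # observations not in the predicted class "Bad"

# Share of misclassified training observations in the node
node_error <- node_loss / node_n
round(100 * node_error, 1)  # 53.4
```

The CHAID printout reports err = 53.4% for its node [12], which is defined by the same split conditions and also contains 341 observations.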
C5.0 tree specification
Call:
C5.0.formula(formula = TARGET ~., data = train, control = C5.0Control(CF = 0.25, minCases = 50, fuzzyThreshold = T, noGlobalPruning = F))
Class specified by attribute `outcome'
Read 7893 cases (6 attributes) from undefined.data
Decision tree:
Agegroup <= 2 (3.5):
:...Agegroup <= 2 (2.5): Good (2857.7/1140.8)
: Agegroup >= 3 (2.5):
: :...Marital in {Divorced,Married,Widowed}: Average (1189.5/577.8)
: Marital = Single: Good (153.3/67.5)
Agegroup >= 4 (3.5):
:...Marital in {Divorced,Married,Widowed}: Average (3372.3/1515.8)
Marital = Single:
:...education <= 1 (1.5): Good (196.4/73.5)
education >= 2 (1.5): Average (123.9/65.1)
J48 tree specification
J48 pruned tree
------------------
Agegroup <= 3
| Agegroup <= 2
| | Agegroup <= 1: Good (1016.0/378.0)
| | Agegroup > 1
| | | Drinking = Never drinks: Good (403.0/176.0)
| | | Drinking = Drinks regularly: Average (556.0/278.0)
| | | Drinking = Drinks seldom: Good (302.0/133.0)
| Agegroup > 2: Average (1069.0/458.0)
Agegroup > 3
| Agegroup <= 4: Average (975.0/296.0)
| Agegroup > 4
| | Drinking = Never drinks
| | | education <= 1: Bad (341.0/182.0)
| | | education > 1: Average (804.0/371.0)
| | Drinking = Drinks regularly: Average (414.0/135.0)
| | Drinking = Drinks seldom: Average (410.0/110.0)
Number of Leaves : 10
Size of the tree: 17
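In J48 output each leaf is followed by `(N/E)`: N training instances reach the leaf and E of them are misclassified. Summing both numbers over the leaves gives the resubstitution (training) error of the tree. The sketch below does this in R for the ten leaves printed above; the vector names are illustrative.

```r
# Leaf counts copied from the J48 printout above, in order of appearance
leaf_n <- c(1016, 403, 556, 302, 1069, 975, 341, 804, 414, 410)  # instances per leaf
leaf_e <- c(378, 176, 278, 133, 458, 296, 182, 371, 135, 110)    # misclassified per leaf

total_n <- sum(leaf_n)  # training instances covered by the tree
total_e <- sum(leaf_e)  # misclassified training instances

# Resubstitution error rate of the whole tree
training_error <- total_e / total_n
round(training_error, 3)
```

Because it is measured on the data used to grow the tree, this figure is an optimistic estimate and should not be read as test-set accuracy.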
LMT tree specification
Logistic model tree
------------------
Agegroup <= 3
| Agegroup <= 2
| | Agegroup <= 1: LM_1:10/40 (1016)
| | Agegroup > 1: LM_2:10/40 (1261)
| Agegroup > 2
| | education <= 2: LM_3:10/40 (704)
| | education > 2: LM_4:10/40 (365)
Agegroup > 3: LM_5:10/20 (2944)
Number of Leaves : 5
Size of the Tree: 9
«Probit modelling of national solidarity: the case of Russia»
Table 6
Regression specifications
Variable | TARGET |
Satisfaction | 0.094*** |
Expectation | 0.027 |
Self_provision | 0.158*** |
Money_satisfaction | 0.127*** |
Wealth_Self_assesment | -0.092*** |
Health | 0.038 |
Education | -0.069*** |
Male | 0.036 |
Age | -0.007 |