Kaggle Titanic, Part 3
Posted by Freeman耀
tags:
Building on the analysis in the previous two posts, we still need to do some preprocessing. Looking at the Age and Fare columns, their values vary over a much wider range than the other features. From what we know about logistic regression and gradient descent, features on very different scales slow convergence considerably and may even prevent it. We therefore use the preprocessing module from scikit-learn to standardize Age and Fare to zero mean and unit variance, so that most values fall roughly within [-1, 1].
# Standardize the wide-ranging features so that logistic regression converges faster
import sklearn.preprocessing as preprocessing

scaler = preprocessing.StandardScaler()
# StandardScaler expects a 2-D input, so pass df[['Age']] rather than df['Age']
df['Age_scaled'] = scaler.fit_transform(df[['Age']])
df['Fare_scaled'] = scaler.fit_transform(df[['Fare']])
print(df)
| | PassengerId | Survived | Age | SibSp | Parch | Fare | Cabin_No | Cabin_Yes | Embarked_C | Embarked_Q | Embarked_S | Sex_female | Sex_male | Pclass_1 | Pclass_2 | Pclass_3 | Age_scaled | Fare_scaled |
| 0 | 1 | 0 | 22.000000 | 1 | 0 | 7.2500 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | -0.561417 | -0.502445 |
| 1 | 2 | 1 | 38.000000 | 1 | 0 | 71.2833 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0.613177 | 0.786845 |
| 2 | 3 | 1 | 26.000000 | 0 | 0 | 7.9250 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | -0.267768 | -0.488854 |
| 3 | 4 | 1 | 35.000000 | 1 | 0 | 53.1000 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0.392941 | 0.420730 |
| 4 | 5 | 0 | 35.000000 | 0 | 0 | 8.0500 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0.392941 | -0.486337 |
| 5 | 6 | 0 | 23.828953 | 0 | 0 | 8.4583 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | -0.427149 | -0.478116 |
| 6 | 7 | 0 | 54.000000 | 0 | 0 | 51.8625 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1.787771 | 0.395814 |
| 7 | 8 | 0 | 2.000000 | 3 | 1 | 21.0750 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | -2.029659 | -0.224083 |
| 8 | 9 | 1 | 27.000000 | 0 | 2 | 11.1333 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | -0.194356 | -0.424256 |
| 9 | 10 | 1 | 14.000000 | 1 | 0 | 30.0708 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | -1.148714 | -0.042956 |
| 10 | 11 | 1 | 4.000000 | 1 | 1 | 16.7000 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | -1.882835 | -0.312172 |
| 11 | 12 | 1 | 58.000000 | 0 | 0 | 26.5500 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 2.081420 | -0.113846 |
| 12 | 13 | 0 | 20.000000 | 0 | 0 | 8.0500 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | -0.708241 | -0.486337 |
| 13 | 14 | 0 | 39.000000 | 1 | 5 | 31.2750 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0.686589 | -0.018709 |
| 14 | 15 | 0 | 14.000000 | 0 | 0 | 7.8542 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | -1.148714 | -0.490280 |
| 15 | 16 | 1 | 55.000000 | 0 | 0 | 16.0000 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 1.861183 | -0.326267 |
| 16 | 17 | 0 | 2.000000 | 4 | 1 | 29.1250 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | -2.029659 | -0.061999 |
| 17 | 18 | 1 | 32.066493 | 0 | 0 | 13.0000 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0.177586 | -0.386671 |
| 18 | 19 | 0 | 31.000000 | 1 | 0 | 18.0000 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0.099292 | -0.285997 |
| 19 | 20 | 1 | 29.518205 | 0 | 0 | 7.2250 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | -0.009489 | -0.502949 |
| 20 | 21 | 0 | 35.000000 | 0 | 0 | 26.0000 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0.392941 | -0.124920 |
| 21 | 22 | 1 | 34.000000 | 0 | 0 | 13.0000 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0.319529 | -0.386671 |
| 22 | 23 | 1 | 15.000000 | 0 | 0 | 8.0292 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | -1.075302 | -0.486756 |
| 23 | 24 | 1 | 28.000000 | 0 | 0 | 35.5000 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | -0.120944 | 0.066360 |
| 24 | 25 | 0 | 8.000000 | 3 | 1 | 21.0750 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | -1.589186 | -0.224083 |
| 25 | 26 | 1 | 38.000000 | 1 | 5 | 31.3875 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0.613177 | -0.016444 |
| 26 | 27 | 0 | 29.518205 | 0 | 0 | 7.2250 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | -0.009489 | -0.502949 |
| 27 | 28 | 0 | 19.000000 | 3 | 2 | 263.0000 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | -0.781653 | 4.647001 |
| 28 | 29 | 1 | 22.380113 | 0 | 0 | 7.8792 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | -0.533512 | -0.489776 |
| 29 | 30 | 0 | 27.947206 | 0 | 0 | 7.8958 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | -0.124820 | -0.489442 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 861 | 862 | 0 | 21.000000 | 1 | 0 | 11.5000 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | -0.634829 | -0.416873 |
| 862 | 863 | 1 | 48.000000 | 0 | 0 | 25.9292 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1.347299 | -0.126345 |
| 863 | 864 | 0 | 10.888325 | 8 | 2 | 69.5500 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | -1.377148 | 0.751946 |
| 864 | 865 | 0 | 24.000000 | 0 | 0 | 13.0000 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | -0.414592 | -0.386671 |
| 865 | 866 | 1 | 42.000000 | ... |

(The remainder of this row and the rest of the output are truncated in the source.)
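Under the hood, StandardScaler computes a z-score per column: it subtracts the column mean and divides by the population standard deviation (ddof=0), which is why the scaled Age and Fare columns end up centered at 0 with unit variance rather than being hard-clipped to [-1, 1]. A minimal NumPy sketch of the same computation, on illustrative values (not taken from the Titanic output):

```python
import numpy as np

# Illustrative fare-like values; any 1-D numeric array works the same way
fare = np.array([7.25, 71.2833, 7.925, 53.1, 8.05])

# z-score standardization, equivalent to what StandardScaler does per column:
# subtract the mean, divide by the population standard deviation (ddof=0)
fare_scaled = (fare - fare.mean()) / fare.std()

# The result has mean 0 and standard deviation 1
print(fare_scaled.mean())
print(fare_scaled.std())
```

In a full pipeline the scaler's mean and standard deviation should be learned from the training set only and then reused to transform the test set, so both sets are scaled consistently.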