import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
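import sklearn.linear_model
import sklearn.model_selection  ## needed for train_test_split below
Resolving Multicollinearity
Let's use Ridge and Lasso to overcome multicollinearity.
This material is restructured from the lectures of Professor 최규빈 (Department of Statistics, Jeonbuk National University).
1. Library imports
2. Ridge: L2 penalty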
df = pd.read_csv("https://raw.githubusercontent.com/guebin/MP2023/main/posts/employment_multicollinearity.csv")
np.random.seed(43052)
df['employment_score'] = df.gpa * 1.0 + df.toeic* 1/100 + np.random.randn(500)
df
| | employment_score | gpa | toeic | toeic0 | toeic1 | toeic2 | toeic3 | toeic4 | toeic5 | toeic6 | ... | toeic490 | toeic491 | toeic492 | toeic493 | toeic494 | toeic495 | toeic496 | toeic497 | toeic498 | toeic499 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.784955 | 0.051535 | 135 | 129.566309 | 133.078481 | 121.678398 | 113.457366 | 133.564200 | 136.026566 | 141.793547 | ... | 132.014696 | 140.013265 | 135.575816 | 143.863346 | 152.162740 | 132.850033 | 115.956496 | 131.842126 | 125.090801 | 143.568527 |
| 1 | 10.789671 | 0.355496 | 935 | 940.563187 | 935.723570 | 939.190519 | 938.995672 | 945.376482 | 927.469901 | 952.424087 | ... | 942.251184 | 923.241548 | 939.924802 | 921.912261 | 953.250300 | 931.743615 | 940.205853 | 930.575825 | 941.530348 | 934.221055 |
| 2 | 8.221213 | 2.228435 | 485 | 493.671390 | 493.909118 | 475.500970 | 480.363752 | 478.868942 | 493.321602 | 490.059102 | ... | 484.438233 | 488.101275 | 485.626742 | 475.330715 | 485.147363 | 468.553780 | 486.870976 | 481.640957 | 499.340808 | 488.197332 |
| 3 | 2.137594 | 1.179701 | 65 | 62.272565 | 55.957257 | 68.521468 | 76.866765 | 51.436321 | 57.166824 | 67.834920 | ... | 67.653225 | 65.710588 | 64.146780 | 76.662194 | 66.837839 | 82.379018 | 69.174745 | 64.475993 | 52.647087 | 59.493275 |
| 4 | 8.650144 | 3.962356 | 445 | 449.280637 | 438.895582 | 433.598274 | 444.081141 | 437.005100 | 434.761142 | 443.135269 | ... | 455.940348 | 435.952854 | 441.521145 | 443.038886 | 433.118847 | 466.103355 | 430.056944 | 423.632873 | 446.973484 | 442.793633 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 495 | 9.057243 | 4.288465 | 280 | 276.680902 | 274.502675 | 277.868536 | 292.283300 | 277.476630 | 281.671647 | 296.307373 | ... | 269.541846 | 278.220546 | 278.484758 | 284.901284 | 272.451612 | 265.784490 | 275.795948 | 280.465992 | 268.528889 | 283.638470 |
| 496 | 4.108020 | 2.601212 | 310 | 296.940263 | 301.545000 | 306.725610 | 314.811407 | 311.935810 | 309.695838 | 301.979914 | ... | 304.680578 | 295.476836 | 316.582100 | 319.412132 | 312.984039 | 312.372112 | 312.106944 | 314.101927 | 309.409533 | 297.429968 |
| 497 | 2.430590 | 0.042323 | 225 | 206.793217 | 228.335345 | 222.115146 | 216.479498 | 227.469560 | 238.710310 | 233.797065 | ... | 233.469238 | 235.160919 | 228.517306 | 228.349646 | 224.153606 | 230.860484 | 218.683195 | 232.949484 | 236.951938 | 227.997629 |
| 498 | 5.343171 | 1.041416 | 320 | 327.461442 | 323.019899 | 329.589337 | 313.312233 | 315.645050 | 324.448247 | 314.271045 | ... | 326.297700 | 309.893822 | 312.873223 | 322.356584 | 319.332809 | 319.405283 | 324.021917 | 312.363694 | 318.493866 | 310.973930 |
| 499 | 6.505106 | 3.626883 | 375 | 370.966595 | 364.668477 | 371.853566 | 373.574930 | 376.701708 | 356.905085 | 354.584022 | ... | 382.278782 | 379.460816 | 371.031640 | 370.272639 | 375.618182 | 369.252740 | 376.925543 | 391.863103 | 368.735260 | 368.520844 |
500 rows × 503 columns
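In the data above, toeic0 through toeic499 are explanatory variables that are highly correlated with one another: as the table shows, each one closely tracks toeic.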
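To make "highly correlated" concrete, here is a minimal sketch on synthetic data. The column names mirror the table above, but the generating rule (each copy is toeic plus Gaussian noise with standard deviation 10) is an assumption made for illustration, not the recipe behind the actual CSV.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n, k = 500, 5  # 500 rows, 5 noisy copies (the real data has 500 copies)

# Hypothetical generator: toeic on a 5-point grid, copies = toeic + noise.
toeic = rng.integers(2, 199, size=n) * 5
demo = pd.DataFrame({"toeic": toeic})
for j in range(k):
    demo[f"toeic{j}"] = toeic + rng.normal(scale=10, size=n)

# Pairwise Pearson correlations between all columns.
corr = demo.corr()
print(corr.round(3))
min_corr = corr.to_numpy().min()
```

When the noise is small relative to the spread of toeic, every pairwise correlation sits very close to 1, which is the situation the rest of this post deals with.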
### A. True World
## step1
df_train, df_test = sklearn.model_selection.train_test_split(df,test_size=0.3,random_state=42)
X = df_train.loc[:,'gpa':'toeic']
y = df_train[['employment_score']]
XX = df_test.loc[:,'gpa':'toeic']
yy = df_test[['employment_score']]
## step2
predictr = sklearn.linear_model.LinearRegression()
## step3
predictr.fit(X,y)
## step4 : pass
print(f'train_score:\t{predictr.score(X,y):.4f}')
print(f'test_score:\t{predictr.score(XX,yy):.4f}')
train_score: 0.9133
test_score: 0.9127
- This is the result of fitting only the underlying structure; because of the error term, a score of 1.0 is essentially out of reach.
In practice this score is hard to achieve, because it assumes we already know which variables truly matter…
### B. The naive approach…
## step1
df_train, df_test = sklearn.model_selection.train_test_split(df,test_size=0.3,random_state=42)
X = df_train.drop(['employment_score'], axis = 1)
y = df_train[['employment_score']]
XX = df_test.drop(['employment_score'], axis = 1)
yy = df_test[['employment_score']]
## step2
predictr = sklearn.linear_model.LinearRegression()
## step3
predictr.fit(X,y)
## step4 : pass
print(f'train_score: {predictr.score(X,y):.4f}')
print(f'test_score: {predictr.score(XX,yy):.4f}')
train_score: 1.0000
test_score: 0.1171
Clear overfitting: the train score is a perfect 1.0000 while the test score collapses to 0.1171.
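Why the train score hits exactly 1.0: after the 70/30 split there are 350 training rows but 502 features, so least squares has enough freedom to pass through every training point. Below is a small sketch of the same effect on synthetic data. The sizes and the noise-only features are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_train, n_test, p = 35, 15, 50  # more features than training rows

X = rng.normal(size=(n_train + n_test, p))
# Only the first feature matters; the other 49 are pure noise.
y = X[:, 0] + rng.normal(size=n_train + n_test)

X_tr, X_te = X[:n_train], X[n_train:]
y_tr, y_te = y[:n_train], y[n_train:]

model = LinearRegression().fit(X_tr, y_tr)
train_score = model.score(X_tr, y_tr)
test_score = model.score(X_te, y_te)
print(f"train_score: {train_score:.4f}")
print(f"test_score:  {test_score:.4f}")
```

The perfect train score says nothing about the model here; it is a consequence of having more coefficients than observations.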
### C. Ridge
- Statistician: in a case like this, just use Ridge…
## step1
df_train, df_test = sklearn.model_selection.train_test_split(df,test_size=0.3,random_state=42)
X = df_train.loc[:,'gpa':'toeic499']
y = df_train.loc[:,'employment_score']
XX = df_test.loc[:,'gpa':'toeic499']
yy = df_test.loc[:,'employment_score']
## step2
predictr = sklearn.linear_model.Ridge() ## for logistic regression, LogisticRegressionCV(penalty = 'l2') can be used
## step3
predictr.fit(X,y)
## step4 -- pass
print(f'train_score: {predictr.score(X,y):.4f}')
print(f'test_score: {predictr.score(XX,yy):.4f}')
train_score: 1.0000
test_score: 0.1173
??? It doesn't work?
- Just tune the hyperparameter…
## step1 --- use every column
df_train, df_test = sklearn.model_selection.train_test_split(df,test_size=0.3,random_state=42)
X = df_train.loc[:,'gpa':'toeic499']
y = df_train.loc[:,'employment_score']
XX = df_test.loc[:,'gpa':'toeic499']
yy = df_test.loc[:,'employment_score']
## step2
predictr = sklearn.linear_model.Ridge(alpha=5e8) ## alpha = 500000000.
## step3
predictr.fit(X,y)
## step4 -- pass
#---#
print(f'train_score: {predictr.score(X,y):.4f}')
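print(f'test_score: {predictr.score(XX,yy):.4f}')
train_score: 0.7507
test_score: 0.7438
Not on par with the oracle in section A (test score 0.9127), but even under collinearity the fit does not fall apart as long as a suitable alpha is chosen.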
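### D. How Ridge works
- Precise explanation…
Working through the theory with the SVD, one can prove that there always exists an \(\alpha\) for which the fit from sklearn.linear_model.Ridge() is better (in the mean-squared-error sense) than the fit from sklearn.linear_model.LinearRegression()…
Or so I'm told.
- Intuitive explanation (not rigorous)
- Why did LinearRegression fail???
In the employment example, the true coefficient of the TOEIC score is 0.01. So, roughly speaking,
if toeic_coef + toeic0_coef + … + toeic499_coef \(\approx\) 0.01, the answer is about right.
- But that 0.01 can be produced by just a few coefficients… -> every remaining explanatory variable becomes an unnecessary feature.
And those unnecessary features, because of multicollinearity, cause overfitting.
So Ridge penalizes the coefficients so that no handful of them carries everything while the rest become useless; every coefficient ends up contributing.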
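The intuition above can be checked in a tiny case. The sketch below feeds Ridge two identical copies of one feature whose true total coefficient is 2.0 (synthetic data; the duplicated column and alpha=1.0 are assumptions for illustration). Among all ways to split a fixed total across the two copies, the L2 penalty is smallest for the even split, so Ridge should return two nearly equal coefficients.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([x, x])              # two perfectly collinear columns
y = 2.0 * x + 0.1 * rng.normal(size=n)   # true total coefficient is 2.0

model = Ridge(alpha=1.0).fit(X, y)
b1, b2 = model.coef_
print(f"coefficients: {b1:.4f}, {b2:.4f}, sum: {b1 + b2:.4f}")

# L2 penalty of two splits that both add up to 2.0:
even = 1.0 ** 2 + 1.0 ** 2       # split as 1.0 + 1.0
lopsided = 2.0 ** 2 + 0.0 ** 2   # split as 2.0 + 0.0
print(f"penalty even: {even}, penalty lopsided: {lopsided}")
```

Plain least squares has no such preference: any pair of coefficients with the same sum fits these two columns equally well, which is why its individual coefficients are unstable.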
### E. How the coefficients change with \(\alpha\)
- Fit several predictors, each with a different alpha
## step1 --- use every column (gpa, toeic, toeic0 through toeic499)
df_train, df_test = sklearn.model_selection.train_test_split(df,test_size=0.3,random_state=42)
X = df_train.loc[:,'gpa':'toeic499']
y = df_train.loc[:,'employment_score']
XX = df_test.loc[:,'gpa':'toeic499']
yy = df_test.loc[:,'employment_score']
## step2
alphas = [5e2, 5e3, 5e4, 5e5, 5e6, 5e7, 5e8]
predictrs = [sklearn.linear_model.Ridge(alpha=alpha) for alpha in alphas]
## RidgeCV, covered below, can decide which of these values is best
## step3
for predictr in predictrs: ## a plain loop, not a list comprehension
    predictr.fit(X,y)
## step4 -- pass
plt.plot(predictrs[0].coef_[1:], label = r'$\alpha$ = {}'.format(predictrs[0].alpha))
plt.plot(predictrs[2].coef_[1:], label = r'$\alpha$ = {}'.format(predictrs[2].alpha))
plt.legend()
plt.show()
plt.plot(predictrs[3].coef_[1:],label=r'$\alpha$={}'.format(predictrs[3].alpha))
plt.plot(predictrs[5].coef_[1:],label=r'$\alpha$={}'.format(predictrs[5].alpha))
plt.legend()
plt.show()
plt.plot(predictrs[5].coef_[1:],label=r'$\alpha$={}'.format(predictrs[5].alpha))
plt.plot(predictrs[-1].coef_[1:],label=r'$\alpha$={}'.format(predictrs[-1].alpha))
plt.legend()
plt.show()
The larger alpha is, the smaller the spread of the coefficients becomes.
- Looking at the coefficients of the last predictor…
s = pd.Series(predictrs[-1].coef_)
s.set_axis(X.columns, axis = 0)
gpa 0.000001
toeic 0.000019
toeic0 0.000018
toeic1 0.000018
toeic2 0.000019
...
toeic495 0.000018
toeic496 0.000019
toeic497 0.000019
toeic498 0.000019
toeic499 0.000019
Length: 502, dtype: float64
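- The structure now leaves no room for unnecessary variables (one or two coefficients alone can no longer make up the 0.01).
- Every variable is treated as about equally important, at roughly 2e-5 (\(\approx\frac{1}{100}\cdot\frac{1}{501}\)) each.
- The values look slightly smaller overall than \(\frac{1}{100}\cdot\frac{1}{501}\), and that is not just an impression.
[predictr.coef_[1:].sum() for predictr in predictrs]
[0.010274546089787007,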
0.010157633994689774,
0.009948779293105905,
0.009866050921714562,
0.009854882844936588,
0.009820059959693872,
0.00949099901512329]
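The sum shrinks steadily as alpha grows…
1/100*1/501
1.9960079840319362e-05
On top of that, each coefficient is a little below the value we would otherwise expect for it.
### F. Summary of \(\alpha\)
- The L2 penalty is roughly something like a variance…
x = np.random.randn(5)
L2_penalty = (x**2).sum() ## sum of squares: it grows as the values move away from 0
(L2_penalty, 5*(x.var() + (x.mean()**2))) ## n times the second moment, E(X**2)
(10.591975556137934, 10.591975556137934)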
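The two numbers in the output above agree because of the usual decomposition of a sum of squares, written here with the population variance that x.var() computes by default (ddof=0):

\[
\sum_{i=1}^{n} x_i^2 \;=\; \sum_{i=1}^{n} (x_i-\bar{x})^2 + n\bar{x}^2 \;=\; n\left(\mathrm{Var}(x) + \bar{x}^2\right)
\]

With \(n=5\) the right-hand side is exactly the 5*(x.var() + x.mean()**2) in the code. So the L2 penalty of a coefficient vector is \(n\) times its empirical second moment, and it reduces to \(n\) times the variance only when the coefficients average to zero.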
for predictr in predictrs :
    print(
        f'alpha={predictr.alpha:.0e}\t'
        f'l2_penalty={((predictr.coef_)**2).sum():.6f}\t'
        f'sum(toeic_coefs)={((predictr.coef_[1:])).sum():.4f}\t'
        f'test_score={predictr.score(XX,yy):.4f}')
alpha=5e+02 l2_penalty=0.046715 sum(toeic_coefs)=0.0103 test_score=0.2026
alpha=5e+03 l2_penalty=0.021683 sum(toeic_coefs)=0.0102 test_score=0.4638
alpha=5e+04 l2_penalty=0.003263 sum(toeic_coefs)=0.0099 test_score=0.6889
alpha=5e+05 l2_penalty=0.000109 sum(toeic_coefs)=0.0099 test_score=0.7407
alpha=5e+06 l2_penalty=0.000002 sum(toeic_coefs)=0.0099 test_score=0.7447
alpha=5e+07 l2_penalty=0.000000 sum(toeic_coefs)=0.0098 test_score=0.7450
alpha=5e+08 l2_penalty=0.000000 sum(toeic_coefs)=0.0095 test_score=0.7438
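As alpha grows, the L2 penalty term (the sum of squared coefficients) shrinks, and the total of the coefficients gradually drops with it…
On top of that, the test_score also starts to fall past a certain point: it peaks at 0.7450 for alpha=5e7 and drops to 0.7438 at alpha=5e8…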
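Why the l2_penalty column shrinks: ridge has a closed-form solution, \(\hat\beta(\alpha) = (X^\top X + \alpha I)^{-1} X^\top y\), and its squared norm decreases as \(\alpha\) grows. Here is a sketch with NumPy on synthetic data, compared against sklearn's Ridge. The sizes and alpha values are assumptions, and the intercept is switched off so that the formula applies directly.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + rng.normal(size=n)

def ridge_closed_form(X, y, alpha):
    """Solve (X^T X + alpha * I) beta = X^T y (no intercept)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

alphas = [1e-2, 1e0, 1e2, 1e4]
penalties, gaps = [], []
for alpha in alphas:
    beta = ridge_closed_form(X, y, alpha)
    # sklearn minimizes ||y - Xw||^2 + alpha * ||w||^2, so with
    # fit_intercept=False it should agree with the formula above.
    sk = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)
    gaps.append(float(np.abs(beta - sk.coef_).max()))
    penalties.append(float((beta ** 2).sum()))
    print(f"alpha={alpha:.0e}\tl2_penalty={penalties[-1]:.6f}")
```

Since alpha multiplies the penalty inside the objective, a larger alpha makes large coefficients more expensive, and the minimizer responds by shrinking them.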
3. RidgeCV
- Picks the most suitable alpha from the candidates you supply.
## step1
df_train, df_test = sklearn.model_selection.train_test_split(df,test_size=0.3,random_state=42)
X = df_train.loc[:,'gpa':'toeic499']
y = df_train.loc[:,'employment_score']
XX = df_test.loc[:,'gpa':'toeic499']
yy = df_test.loc[:,'employment_score']
## step2
predictr = sklearn.linear_model.RidgeCV() ## no alphas specified for now...
## step3
predictr.fit(X,y)
## step4 -- pass
print(f'df_train score : {predictr.score(X, y):.5f}')
print(f'df_test score : {predictr.score(XX, yy):.5f}')
df_train score : 1.00000
df_test score : 0.11915
Still overfitting…
Why? Because the default candidates are alphas=(0.1, 1.0, 10.0)…
- So let's supply the candidates ourselves.
## step1
df_train, df_test = sklearn.model_selection.train_test_split(df,test_size=0.3,random_state=42)
X = df_train.loc[:,'gpa':'toeic499']
y = df_train.loc[:,'employment_score']
XX = df_test.loc[:,'gpa':'toeic499']
yy = df_test.loc[:,'employment_score']
## step2 -- pass the candidate alpha values to alphas as a list
predictr = sklearn.linear_model.RidgeCV(alphas=[5e2, 5e3, 5e4, 5e5, 5e6, 5e7, 5e8])
## step3
predictr.fit(X,y)
## step4 -- pass
(predictr.score(X, y), predictr.score(XX, yy))
(0.7521268560159359, 0.7450309251010893)
predictr.alpha_
50000000.0
RidgeCV selected alpha = 50,000,000 (5e7) as the best candidate; this is also the alpha with the highest test_score (0.7450) in the earlier comparison.
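Since the default candidates are far too small for this data, a wide logarithmic grid is a convenient way to supply alphas. Below is a sketch on synthetic collinear data; the data generator and the grid bounds are assumptions, not the settings used above. After fitting, alpha_ holds the candidate that RidgeCV selected.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(0)
n, p = 150, 40
base = rng.normal(size=n)
# 40 noisy copies of one signal, imitating the toeic0..toeic499 columns.
X = base[:, None] + 0.1 * rng.normal(size=(n, p))
y = base + rng.normal(size=n)

alphas = np.logspace(-2, 6, 9)  # 1e-2, 1e-1, ..., 1e6
model = RidgeCV(alphas=alphas).fit(X, y)
print("selected alpha:", model.alpha_)
print("train R^2:", round(model.score(X, y), 4))
```

If the selected value lands on either end of the grid, that is a hint to widen the grid in that direction and refit.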
4. Lasso
df = pd.read_csv("https://raw.githubusercontent.com/guebin/MP2023/main/posts/employment_multicollinearity.csv")
np.random.seed(43052)
df['employment_score'] = df.gpa * 1.0 + df.toeic* 1/100 + np.random.randn(500)
df
| | employment_score | gpa | toeic | toeic0 | toeic1 | toeic2 | toeic3 | toeic4 | toeic5 | toeic6 | ... | toeic490 | toeic491 | toeic492 | toeic493 | toeic494 | toeic495 | toeic496 | toeic497 | toeic498 | toeic499 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.784955 | 0.051535 | 135 | 129.566309 | 133.078481 | 121.678398 | 113.457366 | 133.564200 | 136.026566 | 141.793547 | ... | 132.014696 | 140.013265 | 135.575816 | 143.863346 | 152.162740 | 132.850033 | 115.956496 | 131.842126 | 125.090801 | 143.568527 |
| 1 | 10.789671 | 0.355496 | 935 | 940.563187 | 935.723570 | 939.190519 | 938.995672 | 945.376482 | 927.469901 | 952.424087 | ... | 942.251184 | 923.241548 | 939.924802 | 921.912261 | 953.250300 | 931.743615 | 940.205853 | 930.575825 | 941.530348 | 934.221055 |
| 2 | 8.221213 | 2.228435 | 485 | 493.671390 | 493.909118 | 475.500970 | 480.363752 | 478.868942 | 493.321602 | 490.059102 | ... | 484.438233 | 488.101275 | 485.626742 | 475.330715 | 485.147363 | 468.553780 | 486.870976 | 481.640957 | 499.340808 | 488.197332 |
| 3 | 2.137594 | 1.179701 | 65 | 62.272565 | 55.957257 | 68.521468 | 76.866765 | 51.436321 | 57.166824 | 67.834920 | ... | 67.653225 | 65.710588 | 64.146780 | 76.662194 | 66.837839 | 82.379018 | 69.174745 | 64.475993 | 52.647087 | 59.493275 |
| 4 | 8.650144 | 3.962356 | 445 | 449.280637 | 438.895582 | 433.598274 | 444.081141 | 437.005100 | 434.761142 | 443.135269 | ... | 455.940348 | 435.952854 | 441.521145 | 443.038886 | 433.118847 | 466.103355 | 430.056944 | 423.632873 | 446.973484 | 442.793633 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 495 | 9.057243 | 4.288465 | 280 | 276.680902 | 274.502675 | 277.868536 | 292.283300 | 277.476630 | 281.671647 | 296.307373 | ... | 269.541846 | 278.220546 | 278.484758 | 284.901284 | 272.451612 | 265.784490 | 275.795948 | 280.465992 | 268.528889 | 283.638470 |
| 496 | 4.108020 | 2.601212 | 310 | 296.940263 | 301.545000 | 306.725610 | 314.811407 | 311.935810 | 309.695838 | 301.979914 | ... | 304.680578 | 295.476836 | 316.582100 | 319.412132 | 312.984039 | 312.372112 | 312.106944 | 314.101927 | 309.409533 | 297.429968 |
| 497 | 2.430590 | 0.042323 | 225 | 206.793217 | 228.335345 | 222.115146 | 216.479498 | 227.469560 | 238.710310 | 233.797065 | ... | 233.469238 | 235.160919 | 228.517306 | 228.349646 | 224.153606 | 230.860484 | 218.683195 | 232.949484 | 236.951938 | 227.997629 |
| 498 | 5.343171 | 1.041416 | 320 | 327.461442 | 323.019899 | 329.589337 | 313.312233 | 315.645050 | 324.448247 | 314.271045 | ... | 326.297700 | 309.893822 | 312.873223 | 322.356584 | 319.332809 | 319.405283 | 324.021917 | 312.363694 | 318.493866 | 310.973930 |
| 499 | 6.505106 | 3.626883 | 375 | 370.966595 | 364.668477 | 371.853566 | 373.574930 | 376.701708 | 356.905085 | 354.584022 | ... | 382.278782 | 379.460816 | 371.031640 | 370.272639 | 375.618182 | 369.252740 | 376.925543 | 391.863103 | 368.735260 | 368.520844 |
500 rows × 503 columns
### A. Analysis with Lasso
## 1
df_train, df_test = sklearn.model_selection.train_test_split(df, test_size = 0.3, random_state = 42)
X = df_train.drop('employment_score', axis = 1)
y = df_train.employment_score
XX = df_test.drop('employment_score', axis = 1)
yy = df_test.employment_score
## 2
predictr = sklearn.linear_model.Lasso()
## 3
predictr.fit(X, y)
## 4
predictr.score(X, y), predictr.score(XX, yy)
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 5.877e+01, tolerance: 3.337e-01
  model = cd_fast.enet_coordinate_descent(
(0.8600312387900632, 0.8306176063318933)
print(f'train_score:\t {predictr.score(X,y):.4f}')
print(f'test_score:\t {predictr.score(XX,yy):.4f}')
train_score: 0.8600
test_score: 0.8306
Even with alpha left at its default (1.0), the result is far better than the untuned Ridge fit (test score 0.8306 versus 0.1173), although the ConvergenceWarning shows that the solver stopped before fully converging.
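The ConvergenceWarning above suggests raising the number of iterations or checking the scale of the features. One possible way to act on both hints is sketched below on synthetic data. The generator, alpha=0.1, and max_iter=100_000 are assumptions; also note that an alpha applied to standardized features is not comparable to the same alpha on raw TOEIC-scale features.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n, p = 200, 50
toeic = rng.uniform(10, 990, size=n)
# Hypothetical stand-in for the toeic0..toeic499 columns.
X = toeic[:, None] + rng.normal(scale=10, size=(n, p))
y = toeic / 100 + rng.normal(size=n)

# Standardize the features, then allow more coordinate-descent sweeps
# than the default max_iter=1000.
pipe = make_pipeline(StandardScaler(), Lasso(alpha=0.1, max_iter=100_000))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    pipe.fit(X, y)

n_conv = sum(issubclass(w.category, ConvergenceWarning) for w in caught)
n_sweeps = int(np.max(pipe[-1].n_iter_))
train_r2 = pipe.score(X, y)
print("ConvergenceWarnings raised:", n_conv)
print("sweeps used:", n_sweeps)
print("train R^2:", round(train_r2, 4))
```

Whether the warning disappears depends on the data; with strongly collinear columns coordinate descent can remain slow, so it is worth looking at the printed warning count rather than assuming it is zero.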
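### B. How Lasso works
- Precise explanation
Too difficult to cover at this point…
- Intuitive explanation: Lasso takes the opposite route from Ridge. Instead of spreading the total across all the highly correlated explanatory variables, it lets only a few of them make up the total.
It keeps a very small number of coefficients alive and forces the rest to exactly 0.
A coefficient of 0 has the same effect as removing that variable.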
plt.plot(predictr.coef_[1:])
Indeed, the plot shows that many of the coefficients are exactly 0.
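To see the contrast with Ridge in numbers rather than in a plot, the sketch below counts coefficients that are exactly zero for both models on synthetic collinear data. The generator and both alpha values are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, p = 200, 50
base = rng.normal(size=n)
X = base[:, None] + 0.1 * rng.normal(size=(n, p))  # 50 noisy copies
y = base + 0.5 * rng.normal(size=n)

lasso = Lasso(alpha=0.1, max_iter=10_000).fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

# Count coefficients that are exactly 0.0, not merely small.
n_zero_lasso = int((lasso.coef_ == 0).sum())
n_zero_ridge = int((ridge.coef_ == 0).sum())
print(f"Lasso: {n_zero_lasso} of {p} coefficients are exactly 0")
print(f"Ridge: {n_zero_ridge} of {p} coefficients are exactly 0")
```

Ridge shrinks coefficients toward zero without reaching it, while the L1 penalty in Lasso sets many of them to exactly zero, which is what makes Lasso act as a variable selector.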
### C. Changes with the value of \(\alpha\)
- Let's fit several predictors and watch how the coefficients change.
## 1
df_train, df_test = sklearn.model_selection.train_test_split(df, test_size = 0.3, random_state = 42)
X = df_train.drop('employment_score', axis = 1)
y = df_train.employment_score
XX = df_test.drop('employment_score', axis = 1)
yy = df_test.employment_score
## 2
alphas = np.linspace(0.1, 2, 20)
predictrs = [sklearn.linear_model.Lasso(alpha = alpha) for alpha in alphas]
## 3
for predictr in predictrs:
    predictr.fit(X, y)
plt.plot(predictrs[0].coef_[1:], label=r'$\alpha={}$'.format(predictrs[0].alpha))
plt.plot(predictrs[9].coef_[1:], label=r'$\alpha={}$'.format(predictrs[9].alpha.round(5)))
plt.plot(predictrs[-1].coef_[1:], label=r'$\alpha={}$'.format(predictrs[-1].alpha))
plt.legend()
plt.show()
The spread of the coefficient values visibly shrinks as alpha increases.
print(f'alpha={predictrs[0].alpha:.4f}\tsum(toeic_coef)={predictrs[0].coef_[1:].sum()}')
print(f'alpha={predictrs[9].alpha:.4f}\tsum(toeic_coef)={predictrs[9].coef_[1:].sum()}')
print(f'alpha={predictrs[-1].alpha:.4f}\tsum(toeic_coef)={predictrs[-1].coef_[1:].sum()}')
alpha=0.1000 sum(toeic_coef)=0.010169320378140704
alpha=1.0000 sum(toeic_coef)=0.009987870459109604
alpha=2.0000 sum(toeic_coef)=0.009864586871194559
The sum of the toeic coefficients stays close to 0.01 for every predictor…
plt.plot([(predictr.coef_ != 0).sum() for predictr in predictrs])
As alpha grows, the number of nonzero coefficients falls.
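The count falls because a larger alpha raises the bar a feature must clear before it gets a nonzero coefficient. For the objective that scikit-learn documents for Lasso, \(\frac{1}{2n}\lVert y - Xw\rVert_2^2 + \alpha\lVert w\rVert_1\), there is a smallest alpha at which every coefficient is zero; with an intercept it equals \(\max_j |x_j^\top (y-\bar{y})| / n\) computed on centered columns. Here is a sketch on synthetic data (generator and sizes are assumptions):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 200, 30
base = rng.normal(size=n)
X = base[:, None] + 0.1 * rng.normal(size=(n, p))
y = base + 0.5 * rng.normal(size=n)

# Center X and y, mirroring what fit_intercept=True does internally.
Xc = X - X.mean(axis=0)
yc = y - y.mean()
alpha_max = np.abs(Xc.T @ yc).max() / n

counts = {}
for frac in [1.01, 0.5, 0.1]:
    model = Lasso(alpha=frac * alpha_max, max_iter=10_000).fit(X, y)
    counts[frac] = int((model.coef_ != 0).sum())
    print(f"alpha={frac:.2f}*alpha_max\tnonzero={counts[frac]}")
```

Just above alpha_max nothing survives, and features enter the model as alpha is lowered, which matches the downward trend in the plot above when read from right to left.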
### D. LassoCV (cross-validation)
- Automatically picks the most suitable \(\alpha\) from the candidates by cross-validation.
## 1
df_train, df_test = sklearn.model_selection.train_test_split(df, test_size = 0.3, random_state = 42)
X = df_train.drop('employment_score', axis = 1)
y = df_train.employment_score
XX = df_test.drop('employment_score', axis = 1)
yy = df_test.employment_score
## 2
predictr = sklearn.linear_model.LassoCV(alphas = np.linspace(0.1, 2, 20))
## 3
predictr.fit(X, y)
## 4
predictr.score(X, y)
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 8.514e+00, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.024e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.141e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.461e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.014e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.375e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.590e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.812e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.907e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.234e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.637e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.876e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.340e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.293e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.578e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.930e+01, tolerance: 2.707e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.021e+00, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 6.646e-01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.295e+00, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.779e+00, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.310e+00, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 5.064e+00, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 7.075e+00, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.837e+00, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.093e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.277e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.556e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.943e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.170e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.293e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.617e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.108e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.724e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.145e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.746e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.480e+01, tolerance: 2.670e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 6.197e+00, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.328e+00, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 7.987e+00, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 8.132e+00, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.659e+00, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.356e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.658e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.074e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.443e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.203e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.698e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.031e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.921e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.384e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.669e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 5.082e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 5.384e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.782e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.134e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.001e+01, tolerance: 2.721e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 9.107e+00, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 8.057e+00, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 5.464e+00, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 7.606e+00, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 8.704e+00, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 8.481e+00, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 8.384e+00, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 7.910e+00, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 8.173e+00, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.076e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.301e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.835e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.949e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.076e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 2.993e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 3.923e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 5.063e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 5.883e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 5.713e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 4.017e+01, tolerance: 2.540e-01
model = cd_fast.enet_coordinate_descent(
C:\Users\hollyriver\anaconda3\envs\py\lib\site-packages\sklearn\linear_model\_coordinate_descent.py:628: ConvergenceWarning: Objective did not converge. You might want to increase the number of iterations, check the scale of the features or consider increasing regularisation. Duality gap: 1.047e+02, tolerance: 3.337e-01
model = cd_fast.enet_coordinate_descent(
0.9555099850022306
predictr.score(X, y), predictr.score(XX, yy)
(0.9555099850022306, 0.8756348559919926)
There is some overfitting (the score is about 0.956 on X, y versus about 0.876 on XX, yy), but the latter is still quite high.
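The same kind of check can be reproduced on any held-out split. The following is a minimal sketch, not the fit used above: it assumes synthetic data that only imitates the employment example (`gpa`, `toeic`, and ten noisy copies of `toeic`), a fixed-alpha `Lasso` instead of `predictr`, and illustrative names such as `features`, `target`, `train_r2`, and `test_r2`. It also shows that `.score` for a scikit-learn regressor is the R² value.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the employment data (assumption, not the original CSV).
rng = np.random.default_rng(0)
n = 300
gpa = rng.uniform(0, 4.5, n)
toeic = rng.uniform(10, 990, n)
# Ten noisy copies of toeic create strong multicollinearity.
noisy = np.column_stack([toeic + rng.normal(0, 30, n) for _ in range(10)])
features = np.column_stack([gpa, toeic, noisy])
target = gpa + toeic / 100 + rng.normal(0, 1, n)

# Hold out 30% of the rows to measure generalization.
X_tr, X_te, y_tr, y_te = train_test_split(
    features, target, test_size=0.3, random_state=0
)

# Fixed-alpha Lasso; alpha=0.1 is an arbitrary illustrative choice.
model = Lasso(alpha=0.1, max_iter=100_000).fit(X_tr, y_tr)

train_r2 = model.score(X_tr, y_tr)   # R^2 on the rows used for fitting
test_r2 = model.score(X_te, y_te)    # R^2 on the held-out rows
manual_test_r2 = r2_score(y_te, model.predict(X_te))  # same quantity, computed explicitly

print("train R^2:", train_r2)
print("test  R^2:", test_r2)
print("gap      :", train_r2 - test_r2)
```

A large positive gap between the two scores suggests overfitting. A small gap with a high held-out score, as in the result above, suggests the model still generalizes reasonably well.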
5. Mathematical devices that shrink coef toward 0
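- Ridge : L2-penalty
  - A mathematical device that splits the coef values across the explanatory variables according to their weights.
  - Penalty : square the coefficient values of the explanatory variables (including the highly correlated ones) and sum them (the squared L2-norm); the farther this value is from 0, the larger the penalty.
- Lasso : L1-penalty
  - A mathematical device that sets many of the coef values to exactly 0.
  - Penalty : take the absolute values of the coefficient values of the explanatory variables (including the highly correlated ones) and sum them (the L1-norm); the farther this value is from 0, the larger the penalty.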
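The difference between the two penalties can be seen by counting coefficients that are exactly 0. The sketch below is only an illustration under stated assumptions: the data are synthetic (20 noisy copies of a single `toeic`-like signal, not the original dataset), the columns are standardized first, and the penalty strengths `alpha=50.0` and `alpha=0.1` are arbitrary choices rather than tuned values.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_samples, n_copies = 200, 20

# One underlying signal and 20 noisy copies of it (strong multicollinearity).
toeic = rng.uniform(10, 990, n_samples)
copies = np.column_stack(
    [toeic + rng.normal(0, 30, n_samples) for _ in range(n_copies)]
)
score = toeic / 100 + rng.normal(0, 1, n_samples)

# Standardize so that a single penalty strength treats every column alike.
Z = StandardScaler().fit_transform(copies)

# Ridge minimizes ||y - Zw||^2 + alpha * sum(w**2)           (L2 penalty)
ridge = Ridge(alpha=50.0).fit(Z, score)
# Lasso minimizes ||y - Zw||^2 / (2n) + alpha * sum(abs(w))  (L1 penalty)
lasso = Lasso(alpha=0.1, max_iter=100_000).fit(Z, score)

# Count coefficients that are exactly zero under each penalty.
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0))

print("exact zeros, Ridge:", n_zero_ridge, "of", n_copies)
print("exact zeros, Lasso:", n_zero_lasso, "of", n_copies)
print("Ridge coef range  :", ridge.coef_.min(), ridge.coef_.max())
```

In a setup like this, Ridge is expected to keep every coefficient nonzero and spread the weight over the correlated columns, while Lasso is expected to keep only a few columns and set the rest to exactly 0. The exact counts depend on the random data and on the chosen `alpha` values.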