5 Sat

Introduction to Kaggle Machine Learning, Taught by Industry Practitioners

Introduction to Linear Regression

  • A model that performs regression using a linear function
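As a sketch of what "linear function" means here, the prediction can be written as a weighted sum of the input features plus a bias. The symbols below (weights $w_i$, bias $b$, features $x_i$, prediction $\hat{y}$) are notation assumed for illustration and do not come from the course material.

```latex
\hat{y} = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b
```

With a single input feature this reduces to the equation of a straight line, $\hat{y} = w x + b$.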

Performance metrics for regression algorithms - MSE, RMSE, MAE

We need to be able to evaluate the performance of a trained prediction model.

There are many performance metrics; among them, mean squared error (MSE) is widely used.

  • A model with a small MSE can be regarded as a good model.

Because MSE squares each difference before averaging, large differences are amplified and the result is in squared units. To offset this, RMSE, the square root of MSE, is also widely used; it brings the error back to the same units as the target.

In addition, MAE (Mean Absolute Error), which averages the absolute differences between the predictions and the true values, can also be used.
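The three metrics can be sketched by hand in plain Python to make the definitions concrete. The helper names and the sample values below are made up for illustration; the last prediction is deliberately off by 4 to show how squaring amplifies a single large error.

```python
import math

def mse(y_true, y_pred):
    """Mean of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of MSE, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [3.0, 5.0, 7.0, 13.0]  # only the last prediction is wrong, by 4

print(mse(y_true, y_pred))   # 16 / 4 = 4.0
print(rmse(y_true, y_pred))  # sqrt(4.0) = 2.0
print(mae(y_true, y_pred))   # 4 / 4 = 1.0
```

The single error of 4 contributes 16 to the MSE sum but only 4 to the MAE sum, which is why MSE and RMSE react more strongly to outliers than MAE.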

Introduction to scikit-learn

  • A library that makes it easy to implement a wide range of machine learning models, including linear regression

Basic usage

  • Declare an Estimator

    • e.g. LinearRegression

  • Train it by calling .fit()

  • Make predictions by calling .predict()
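The three steps above can be sketched on a tiny toy dataset. The data is invented for illustration and follows y = 2x + 1 exactly, so the fitted model should recover that line.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data (made up for illustration): y = 2x + 1 exactly
X = np.array([[1.0], [2.0], [3.0], [4.0]])  # shape (n_samples, n_features)
y = np.array([3.0, 5.0, 7.0, 9.0])

lr = LinearRegression()                 # 1. declare the estimator
lr.fit(X, y)                            # 2. train on the data
y_new = lr.predict(np.array([[5.0]]))   # 3. predict for an unseen input

print(lr.coef_, lr.intercept_)  # approximately [2.] and 1.0
print(y_new)                    # approximately [11.]
```

Note that scikit-learn expects the features as a 2-D array, even when there is only one feature.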

Splitting the data

from sklearn.model_selection import train_test_split
# Hold out 20% of the samples as a test set; X holds the features, y the targets
X_train, X_test, y_train, y_test =\
    train_test_split(X, y, test_size=0.2)
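To see what the split returns, here is a minimal sketch on hypothetical data with 10 samples. The `random_state` argument is an addition not present in the notes; it only fixes the shuffle so the split is reproducible.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with 2 features each
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# random_state is an optional seed, added here so the split is reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)
print(y_train.shape, y_test.shape)  # (8,) (2,)
```

With `test_size=0.2`, 2 of the 10 samples go to the test set and the remaining 8 are used for training.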

Declaring the Estimator

from sklearn.linear_model import LinearRegression
lr = LinearRegression()

Computing MSE and RMSE

import numpy as np
from sklearn.metrics import mean_squared_error
# y_preds: predictions for the test set, e.g. y_preds = lr.predict(X_test)
MSE = mean_squared_error(y_test, y_preds)
RMSE = np.sqrt(MSE)
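As a small self-contained check of these calls, the sketch below uses made-up arrays in place of real test targets and predictions. It also computes MAE with `mean_absolute_error`, which lives in the same `sklearn.metrics` module.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Made-up values standing in for the true test targets and the model's predictions
y_test = np.array([3.0, 5.0, 7.0])
y_preds = np.array([2.0, 5.0, 9.0])

MSE = mean_squared_error(y_test, y_preds)   # (1 + 0 + 4) / 3
RMSE = np.sqrt(MSE)                         # square root of MSE
MAE = mean_absolute_error(y_test, y_preds)  # (1 + 0 + 2) / 3

print(MSE, RMSE, MAE)
```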

Predicting weight from height with Linear Regression

Prediction model

  • Input : height

  • Output : weight

  • Estimator : Linear Regression
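A minimal end-to-end sketch of this setup might look as follows. The data is synthetic and is not the course dataset: the slope, intercept, noise level, and variable names are all invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data, NOT the course dataset:
# the slope (0.9), intercept (-90), and noise level (3) are invented
rng = np.random.default_rng(0)
height_cm = rng.uniform(150, 190, size=200)
weight_kg = 0.9 * height_cm - 90 + rng.normal(0, 3, size=200)

X = height_cm.reshape(-1, 1)  # scikit-learn expects a 2-D feature matrix
y = weight_kg

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

lr = LinearRegression()
lr.fit(X_train, y_train)       # learn slope and intercept from the training set
y_preds = lr.predict(X_test)   # predict weights for the held-out heights

rmse = np.sqrt(mean_squared_error(y_test, y_preds))
print("slope:", lr.coef_[0], "intercept:", lr.intercept_)
print("test RMSE:", rmse)
```

Because the noise was generated with a standard deviation of 3, a test RMSE close to 3 is about the best a linear model can do on this synthetic data.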

df.apply(lambda x: x * 2.54)

  • Transforms all the data in df by passing it through the lambda function: apply calls the lambda on each column, so every value in df is multiplied by 2.54.
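The sketch below illustrates this behaviour on a hypothetical frame. The column names and values are assumptions, and so is the reading of 2.54 as the inch-to-centimeter factor. It also shows how to apply the lambda to a single column when only that column should be converted.

```python
import pandas as pd

# Hypothetical frame: column names and values are made up for illustration
df = pd.DataFrame({"Height": [60.0, 70.0], "Weight": [110.0, 150.0]})

# DataFrame.apply passes each column to the lambda, so every column is scaled
all_scaled = df.apply(lambda x: x * 2.54)

# To convert only one column, apply the lambda to that column (a Series) instead
height_cm = df["Height"].apply(lambda x: x * 2.54)

print(all_scaled)
print(height_cm)
```

If only the height needs a unit conversion, the single-column form avoids scaling the weight column by accident.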

Introduction to Kaggle and Kaggle Competitions

Kaggle

  • The world's largest community of data scientists

  • A platform for data analysis and prediction competitions aimed at Data Scientists

  • When companies and organizations post data along with a problem to solve, Kaggle's Data Scientists compete to develop the analyses and models that solve it.
