๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ


regression

4/8 ๊ธˆ ๊ธˆ์š”์ผ! ๐Ÿฑ‍๐Ÿ ์˜ค๋Š˜์€ Regression์„ ๋๋‚ธ๋‹ค~~ 4/11 ์›”์š”์ผ์€ ๋จธ์‹ ๋Ÿฌ๋‹ ํ•„๋‹ต ํ‰๊ฐ€, 4/17 ์ผ์š”์ผ์€ ์ˆ˜ํ–‰ํ‰๊ฐ€ 4๊ฐ€์ง€ ์ œ์ถœ์ด ์žˆ๋‹ค. ๊ฒฐ์ธก์น˜ ์ฒ˜๋ฆฌ๋Š” ์‚ญ์ œํ•˜๊ฑฐ๋‚˜, imputation(๋ณด๊ฐ„, ๋Œ€์ฒด) - ํ‰๊ท ํ™” ๊ธฐ๋ฒ•(๋…๋ฆฝ๋ณ€์ˆ˜๋ฅผ ๋Œ€ํ‘œ๊ฐ’์œผ๋กœ ๋Œ€์ฒด), ๋จธ์‹ ๋Ÿฌ๋‹ ๊ธฐ๋ฒ•(์ข…์†๋ณ€์ˆ˜๊ฐ€ ๋Œ€์ƒ. KNN) KNN(K-Nearest Neighbors, K-์ตœ๊ทผ์ ‘ ์ด์›ƒ) : hyperparameter๋Š” k(=1์ผ ๋•Œ ์–ด๋Š ์ •๋„์˜ ์„ฑ๋Šฅ ๋ณด์žฅ)์™€ ๊ฑฐ๋ฆฌ์ธก์ • ๋ฐฉ์‹(์ฃผ๋กœ ์œ ํด๋ผ๋””์•ˆ ์‚ฌ์šฉ) ๋ฐ˜๋“œ์‹œ ์ •๊ทœํ™”๋ฅผ ์ง„ํ–‰ํ•ด์•ผ ํ•จ. ๋ชจ๋“  ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•ด ๊ฑฐ๋ฆฌ๋ฅผ ๊ณ„์‚ฐํ•ด์•ผ ํ•˜๋ฏ€๋กœ ์‹œ๊ฐ„์ด ์˜ค๋ž˜ ๊ฑธ๋ฆด ์ˆ˜ ์žˆ์Œ 1. Logistic Regression + KNN - BMI data import numpy as np import pandas as pd fro.. ๋”๋ณด๊ธฐ
4/5 ํ™” ํ™”์š”์ผ! Logistic Regression์„ ํ™œ์šฉํ•ด ๋จธ์‹ ๋Ÿฌ๋‹ ์ง„ํ–‰ ์‹œ ์ฃผ์˜์‚ฌํ•ญ์„ ์•Œ์•„๋ณธ๋‹ค. ์•ž์œผ๋กœ ์šฐ๋ฆฌ๋Š” Classification(์ดํ•ญ๋ถ„๋ฅ˜)์˜ Metrics๋กœ Accuracy๋ฅผ ์‚ฌ์šฉํ•  ์˜ˆ์ •์ด๋‹ค. ๋ชจ๋ธ ํ‰๊ฐ€ ์ „ ๊ณ ๋ คํ•ด์•ผ ํ•˜๋Š” ๊ฒƒ๋“ค 1. learning rate(ํ•™์Šต๋ฅ ) : loss ๊ฐ’์„ ๋ณด๋ฉด์„œ ํ•™์Šต๋ฅ ์„ ์กฐ์ •ํ•ด์•ผ ํ•จ. ๋ณดํ†ต 1์˜ ๋งˆ์ด๋„ˆ์Šค 4์Šน์œผ๋กœ ์žก์Œ ํ•™์Šต๋ฅ ์ด ๋„ˆ๋ฌด ํฌ๋‹ค๋ฉด global minima(W')๋ฅผ ์ฐพ์„ ์ˆ˜ ์—†๊ฒŒ ๋จ → OverShooting ๋ฐœ์ƒ ํ•™์Šต๋ฅ ์ด ์•„์ฃผ ์ž‘๋‹ค๋ฉด local minima ์ฐพ๊ฒŒ ๋จ 2. Normalization(์ •๊ทœํ™”) : MinMax Scaling - 0 ~ 1. ์ด์ƒ์น˜์— ๋ฏผ๊ฐํ•จ Standardization - ํ‘œ์ค€ํ™”, Z-Score. ์ƒ๋Œ€์ ์œผ๋กœ ์ด์ƒ์น˜์— ๋‘”๊ฐํ•จ, ๋ชจ๋“  ์นผ๋Ÿผ์—.. ๋”๋ณด๊ธฐ
4/4 ์›” ์›”์š”์ผ! ์˜ค๋Š˜์€ ๊ธˆ์š”์ผ์— ์‹ค์Šต ์˜ˆ์ œ๋กœ ์ฃผ์–ด์กŒ๋˜ admission(๋Œ€ํ•™์› ํ•ฉ๊ฒฉ ์—ฌ๋ถ€) ๋ฐ์ดํ„ฐ์…‹์„ Sklearn, Tensorflow๋กœ ๊ตฌํ˜„ํ•˜๊ณ , ์ง€๋‚œ์ฃผ์— ๋ฐฐ์šด Logistic Regression์„ ํ™œ์šฉํ•ด ํ‰๊ฐ€์ง€ํ‘œ(Metrics)๋ฅผ ์•Œ์•„๋ณธ๋‹ค. 1. Logistic Regression by Sklearn import numpy as np import pandas as pd import tensorflow as tf from sklearn import linear_model from sklearn.preprocessing import MinMaxScaler from scipy import stats import matplotlib.pyplot as plt import warnings warnings.filter.. ๋”๋ณด๊ธฐ
3/29 ํ™” ํ™”์š”์ผ! ์˜ค๋Š˜์€ ์–ด์ œ ๋ฐฐ์šด Simple Linear Regression(๋‹จ์ˆœ ์„ ํ˜• ํšŒ๊ท€)์„ ์ฝ”๋“œ๋กœ ๊ตฌํ˜„ํ•œ๋‹ค. 1. Training Data Set ์ค€๋น„ : Data pre-processing(๋ฐ์ดํ„ฐ ์ „์ฒ˜๋ฆฌ). ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ํ˜•ํƒœ๋กœ ์ค€๋น„ 2. Linear Regression Model์„ ์ •์˜ : y = Wx+b(์˜ˆ์ธก ๋ชจ๋ธ). hypothesis(๊ฐ€์„ค) 3. ์ตœ์ ์˜ W(weight, ๊ฐ€์ค‘์น˜), b(bias, ํŽธ์ฐจ)๋ฅผ ๊ตฌํ•˜๋ ค๋ฉด loss function(์†์‹คํ•จ์ˆ˜)/cost function(๋น„์šฉํ•จ์ˆ˜) → MSE 4. Gradient Descent Algorithm(๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ•) : loss function์„ ํŽธ๋ฏธ๋ถ„(W, b) × learning rate 5. ๋ฐ˜๋ณตํ•™์Šต ์ง„ํ–‰ 1. Training Dat.. ๋”๋ณด๊ธฐ
3/28 ์›” ์›”์š”์ผ! ๊ธˆ์š”์ผ์— ์ด์–ด ๋จธ์‹ ๋Ÿฌ๋‹ ๋“ค์–ด๊ฐ„๋‹ค~ Weak AI์˜ ๋จธ์‹ ๋Ÿฌ๋‹ ๊ธฐ๋ฒ•๋“ค : ์ง€๋„ ํ•™์Šต, ๋น„์ง€๋„ ํ•™์Šต, ๊ฐ•ํ™” ํ•™์Šต 1. Regression(ํšŒ๊ท€) : ๋ฐ์ดํ„ฐ์— ์˜ํ–ฅ์„ ์ฃผ๋Š” ์กฐ๊ฑด๋“ค์˜ ์˜ํ–ฅ๋ ฅ์„ ๊ณ ๋ คํ•ด์„œ, ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•œ ์กฐ๊ฑด๋ถ€ ํ‰๊ท ์„ ๊ตฌํ•˜๋Š” ๊ธฐ๋ฒ• * ํ‰๊ท ์„ ๊ตฌํ•  ๋•Œ ์ฃผ์˜ํ•ด์•ผ ํ•  ์  : ํ‰๊ท ์„ ๊ตฌํ•˜๋Š” ๋ฐ์ดํ„ฐ์— ์ด์ƒ์น˜๊ฐ€ ์žˆ์„ ๊ฒฝ์šฐ ๋Œ€ํ‘œ๊ฐ’์œผ๋กœ ์‚ฌ์šฉํ•˜๊ธฐ ์–ด๋ ค์›€. ์ •๊ทœ๋ถ„ํฌ์—ฌ์•ผ ํ•จ! ๊ณ ์ „์  ์„ ํ˜• ํšŒ๊ท€ ๋ชจ๋ธ(Classical Linear Regression Model) ๋‹จ์ˆœ ์„ ํ˜• ํšŒ๊ท€(Simple Linear Regression) import numpy as np import pandas as pd import matplotlib.pyplot as plt df = pd.DataFrame({'๊ณต๋ถ€์‹œ๊ฐ„(x)': [1,2,3.. ๋”๋ณด๊ธฐ
