
Machine Learning Algorithms


Terminology Index

The Index provides links to the key terms.

AI / ML

  • Artificial intelligence (AI) refers to technology for building systems with intelligence that can learn and reason like humans.

  • Machine learning (ML) automatically learns rules from data without the rules being explicitly programmed. It suits general-purpose tasks, but a human must handle feature extraction (preprocessing) for the given data.

  • Deep learning (DL) is machine learning based on artificial neural networks; frameworks such as TensorFlow and PyTorch belong here. Given large amounts of data, the neural network analyzes it and performs feature extraction on its own.

๋ฐ์ดํ„ฐ์ „์ฒ˜๋ฆฌ

Preprocessing์—์„œ๋Š” ํ‘œ์ค€์ ์ˆ˜(z)๋ฅผ ์ด์šฉํ•œ ๋ฐ์ดํ„ฐ ์ •๊ทœํ™” ๋ฐ Train/Test Dataset์„ ์ค€๋น„ํ•˜๋Š” ๊ณผ์ •์„ ์„ค๋ช…ํ•ฉ๋‹ˆ๋‹ค.

๋จธ์‹ ๋Ÿฌ๋‹์—์„œ ํŠน์„ฑ(Feature)๋Š” ์›ํ•˜๋Š” ๊ฐ’์„ ์˜ˆ์ธกํ•˜๊ธฐ ์œ„ํ•ด ํ™œ์šฉํ•˜๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์˜๋ฏธํ•˜๊ณ , ํƒ€๊นƒ(Target)์€ ์˜ˆ์ธกํ•ด์•ผ ํ•  ๊ฐ’์ž…๋‹ˆ๋‹ค.
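A minimal sketch of this step with scikit-learn, assuming illustrative numbers for the feature and target arrays:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Illustrative features (X) and target (y)
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]])
y = np.array([0, 0, 1, 1, 1])

# Split before scaling so test data does not leak into the statistics
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=42
)

# Standard score z = (x - mean) / std, fit on the training set only
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training set only, then reusing it on the test set, keeps the evaluation honest.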

Feature Engineering

Feature engineering is the process of combining the given features to create new ones.

  • Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work (e.g., separating time from a date/time field, combining fields โ€” height/weight). Feature engineering can improve model accuracy and speed up training.

  • Feature transformation: putting data into a format optimized for machine learning and generalization.

  • Feature selection (variable selection, attribute selection) is the process of selecting a subset of relevant features (independent variables, predictors) for use in model construction. Feature selection can improve model accuracy, simplify models, shorten model training times, and reduce overfitting.

  • Correlation analysis is a method of statistical evaluation used to study the strength of a relationship between two numerically measured, continuous variables (e.g., height and weight). This type of analysis is useful when a researcher wants to establish whether there are possible connections between variables.

  • Label encoding: Converting categorical text data into model-understandable numerical data
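The ideas above (separating time from a date/time field, combining fields, and label encoding) can be sketched with pandas and scikit-learn; the column names and values here are purely illustrative:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Illustrative data: a date/time field, height/weight, and a categorical column
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-01 09:00", "2024-06-15 18:30"]),
    "height_m": [1.70, 1.80],
    "weight_kg": [65.0, 81.0],
    "city": ["Seoul", "Busan"],
})

# Separate time from the date/time field
df["hour"] = df["timestamp"].dt.hour

# Combine fields: body mass index from height/weight
df["bmi"] = df["weight_kg"] / df["height_m"] ** 2

# Label encoding: categorical text -> model-understandable numbers
df["city_code"] = LabelEncoder().fit_transform(df["city"])
```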

Machine Learning

Supervised Learning

Regression is a machine learning method used when the dependent variable to predict is a number. The Regression section gives a basic explanation of regression along with example code.
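A minimal regression sketch using scikit-learn's LinearRegression on made-up numeric data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data: the target is a number (here y = 2x + 1)
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression().fit(X, y)
pred = model.predict([[5.0]])  # expect a value near 11
```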

Classification assigns a sample to one of several classes. The Classification section explains it through examples.
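A minimal classification sketch, here using scikit-learn's KNeighborsClassifier on made-up two-class data:

```python
from sklearn.neighbors import KNeighborsClassifier

# Illustrative samples, each belonging to one of two classes
X = [[0.0], [0.1], [0.9], [1.0]]
y = [0, 0, 1, 1]

# Classify a new sample by majority vote among its 3 nearest neighbors
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
label = clf.predict([[0.95]])[0]
```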

Unsupervised Learning

Clustering์˜ ๋Œ€ํ‘œ์ ์ธ ์˜ˆ๋กœ k-Means๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. k-Means๋Š” ๋น„์ง€๋„ํ•™์Šต(Unsupervised Learning)์œผ๋กœ ์ •๋‹ต label์ด ์—†๋Š” ๋ฐ์ดํ„ฐ์—์„œ ์œ ์‚ฌ๋„๋ฅผ ๊ธฐ์ค€์œผ๋กœ k๊ฐœ์˜ ๊ตฐ์ง‘์œผ๋กœ ๋ถ„๋ฅ˜ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

  • K-means is an unsupervised learning algorithm. It attempts to find discrete groupings within data, where members of a group are as similar as possible to one another and as different as possible from members of other groups. You define the attributes you want the algorithm to use to determine similarity.
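A minimal k-Means sketch with scikit-learn, using two made-up groups of points and no labels:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two obvious groups, but no labels are given
X = np.array([[1.0, 1.0], [1.2, 0.8], [9.0, 9.0], [8.8, 9.1]])

# Ask for k=2 clusters; similarity is Euclidean distance over the attributes
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
```

The `labels_` attribute then holds the discovered cluster of each point.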

An example of dimensionality reduction is PCA (Principal Component Analysis). With PCA, the reduced data can be used as training data while preserving as much of the data's variance as possible.
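A minimal PCA sketch with scikit-learn, reducing synthetic 3-D data to 2 components while keeping most of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic 3-D data that mostly varies along one direction
base = rng.normal(size=(100, 1))
X = np.hstack([
    base,
    2 * base + 0.01 * rng.normal(size=(100, 1)),
    0.01 * rng.normal(size=(100, 1)),
])

# Project onto the 2 directions of largest variance
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)
```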

Deep Learning

Deep Learning Algorithms covers an explanation of deep learning along with examples.

๋ชจ๋ธ ๊ณผ์ ํ•ฉ ๋ฐฉ์ง€

regularization์—์„œ๋Š” ๋ชจ๋ธ ๊ณผ์ ํ•ฉ์„ ๋ฐฉ์ง€ํ•˜๋Š” ๋ฐฉ๋ฒ•์— ๋Œ€ํ•ด ์„ค๋ช…ํ•ฉ๋‹ˆ๋‹ค.
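As one example of regularization, ridge regression adds an L2 penalty that shrinks the coefficients toward zero; a sketch with scikit-learn on synthetic data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)

# Few samples, many features: a setting prone to overfitting
X = rng.normal(size=(20, 10))
y = X[:, 0] + 0.1 * rng.normal(size=20)

# alpha controls the penalty strength; larger alpha -> smaller coefficients
ridge = Ridge(alpha=10.0).fit(X, y)
ols = LinearRegression().fit(X, y)
```

The ridge coefficients have a smaller norm than the unpenalized ones, which is what limits the model's ability to fit noise.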

๋ชจ๋ธ ํ‰๊ฐ€

ํ‰๊ฐ€ (Evaluation)์€ ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋ชจ๋ธ ํ‰๊ฐ€ ์ง€ํ‘œ์— ๋Œ€ํ•ด ์„ค๋ช…ํ•ฉ๋‹ˆ๋‹ค.
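Common classification metrics can be sketched with scikit-learn's metrics module (the predictions here are made up):

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)

y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

acc = accuracy_score(y_true, y_pred)    # correct / total
prec = precision_score(y_true, y_pred)  # TP / (TP + FP)
rec = recall_score(y_true, y_pred)      # TP / (TP + FN)
cm = confusion_matrix(y_true, y_pred)   # rows: true class, cols: predicted
```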

Hyperparameter Optimization (HPO)

Hyperparameter Optimization refers to the process of finding the best combination of hyperparameters for each machine learning algorithm.
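A minimal HPO sketch using scikit-learn's GridSearchCV to search a small grid with cross-validation:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Try each candidate value and keep the one with the best CV score
param_grid = {"n_neighbors": [1, 3, 5, 7]}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)

best_k = search.best_params_["n_neighbors"]
```

Random search and Bayesian optimization follow the same idea with different search strategies.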

ML Applications

This section explains application areas where ML can be used.

Machine Learning Examples

XGBoost Algorithms explains various use cases of XGBoost.
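XGBoost exposes a scikit-learn-style fit/predict interface; as a sketch that runs without the xgboost package installed, the same gradient-boosting pattern is shown here with scikit-learn's GradientBoostingClassifier as a stand-in:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# n_estimators: number of boosted trees; learning_rate: shrinkage per tree
gb = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, random_state=0
).fit(X_train, y_train)

score = gb.score(X_test, y_test)
```

With the real library, `xgboost.XGBClassifier` takes the place of `GradientBoostingClassifier` with largely analogous parameters.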

๊ฐ์ข… ์œ ์šฉํ•œ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ

  • Numpy๋กœ ๋ฐ์ดํ„ฐ๋ฅผ ์ค€๋น„ํ•ฉ๋‹ˆ๋‹ค.
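A small sketch of preparing data with NumPy; the numbers are illustrative:

```python
import numpy as np

# Build a feature matrix by stacking two 1-D arrays column-wise
lengths = np.array([25.4, 26.3, 26.5, 29.0])
weights = np.array([242.0, 290.0, 340.0, 363.0])
data = np.column_stack((lengths, weights))

# Shuffle indices to prepare a randomized train/test split
rng = np.random.default_rng(42)
index = rng.permutation(len(data))
shuffled = data[index]
```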

AWS Built-in Algorithms

Built-in Algorithms explains the ML algorithms provided by AWS.

Converting ML Algorithms to Python Code

Convert ML algorithms written in notebooks into Python code.

LLM

A theoretical understanding of the Transformer model.

Reference

ํ˜ผ์ž ๊ณต๋ถ€ํ•˜๋Š” ๋จธ์‹ ๋Ÿฌ๋‹+๋”ฅ๋Ÿฌ๋‹

๋จธ์‹ ๋Ÿฌ๋‹ยท๋”ฅ๋Ÿฌ๋‹ ๋ฌธ์ œํ•ด๊ฒฐ ์ „๋žต - ์‹ ๋ฐฑ๊ท , ๊ณจ๋“ ๋ž˜๋น—

Machine Learning at Work - 한빛미디어

XGBoost์™€ ์‚ฌ์ดํ‚ท๋Ÿฐ์„ ํ™œ์šฉํ•œ ๊ทธ๋ ˆ์ด๋””์–ธํŠธ ๋ถ€์ŠคํŒ… - ํ•œ๋น› ๋ฏธ๋””์–ด

About

This repository summarizes machine learning algorithms.
