PSLeon24/Artificial_Intelligence

Artificial Intelligence

Yeongmin Ko's learning notes

Difference

  • Classification: KNN, Decision Tree
  • Regression: Linear and Logistic Regression
  • Clustering: K-Means, Agglomerative Clustering, DBSCAN
  • image

1. Clustering(K-Means, Agglomerative Clustering, DBSCAN)

  • Clustering: the process of partitioning a set of data objects into subsets so that the objects within each subset are similar to one another

    • Each subset is called a cluster
    • image
  • K-Means

    • K is the number of clusters (subsets)

    • A centroid-based technique

      • The centroid is the mean of the objects belonging to each cluster
    • image
      • In the figure above, K = 4
  • How the algorithm works

    • Step 1. Choose the number of clusters k (k > 0)
    • Step 2. Randomly select k points as the initial centroids
    • Step 3. Assign every point to its nearest centroid, forming k clusters
    • Step 4. Recompute the centroid of each cluster (the mean of its members)
    • Step 5. Repeat steps 3-4 until the centroids no longer change
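The K-Means steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the fixed `seed`, the squared Euclidean distance, and the sample points are all assumptions made for the example.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster 2-D points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 2: pick k distinct points at random as the initial centroids.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iter):
        # Step 3: assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: (p[0] - centroids[c][0]) ** 2
                                        + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Step 4: recompute each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        # Step 5: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Because the initial centroids are random, the cluster ids (0 or 1) may be swapped between runs with different seeds; only the grouping is meaningful.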
  • Agglomerative Clustering

    • Bottom-up strategy
    • Starts with each object forming its own cluster, then iteratively merges clusters into larger and larger clusters until all objects belong to a single cluster
  • How the algorithm works

    • Step 1. Each object forms its own cluster
    • Step 2. At the lowest level, merge the two closest clusters into one cluster
    • Step 3. Repeat step 2 until a single cluster remains
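A small sketch of the merge loop described above. The notes only say "the two closest clusters", so the choice of single linkage (distance between the closest members of two clusters) is an assumption, as are the function name `agglomerative` and the 1-D sample values.

```python
def agglomerative(values):
    """Single-linkage agglomerative clustering on 1-D values.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    # Step 1: every object starts as its own cluster.
    clusters = [[v] for v in values]
    merges = []
    # Step 3: keep merging until a single cluster remains.
    while len(clusters) > 1:
        best = None
        # Step 2: find the two closest clusters (single linkage, an assumption).
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical data.
merges = agglomerative([1.0, 2.0, 6.0, 7.5, 20.0])
for a, b, d in merges:
    print(a, "+", b, "at distance", d)
```

The merge history is what a dendrogram draws: cutting it at a chosen distance yields a flat clustering.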
  • DBSCAN (density-based clustering)

    • If at least n points lie within a given radius of a point, those points are recognized as one cluster
      • Given a point p, if at least m (minPts) points lie within distance e (epsilon) of p, they are recognized as one cluster
    • image
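The density rule above can be sketched as follows. This is a simplified illustration under stated assumptions: a point counts itself among its neighbors, noise is labelled -1, and the function name `dbscan` and the sample points are hypothetical.

```python
def dbscan(points, eps, min_pts):
    """Label 2-D points with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        # Indices of all points within eps of points[i] (including i itself).
        return [j for j, q in enumerate(points)
                if (points[i][0] - q[0]) ** 2 + (points[i][1] - q[1]) ** 2 <= eps ** 2]

    labels = [None] * len(points)      # None means "not visited yet"
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1             # not a core point: provisionally noise
            continue
        cluster_id += 1                # i is a core point: start a new cluster
        labels[i] = cluster_id
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # noise reachable from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:    # j is also a core point: keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical data: two dense groups and one isolated point.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
dbscan_labels = dbscan(pts, eps=1.5, min_pts=3)
print(dbscan_labels)
```

Unlike K-Means, the number of clusters is not given in advance; it falls out of `eps` and `min_pts`.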

2. K-Nearest Neighbors

  • K-Nearest Neighbors (KNN)
    • Classifies unlabeled data points by assigning them the class of the most similar labeled data points
  • How the algorithm works
    • Step 1. Choose the parameter k (k > 0), the number of neighboring data points to compare against
    • Step 2. Determine similarity by calculating the distance between the test point and every other point in the dataset
    • Step 3. Sort the dataset by the distance values computed in step 2
    • Step 4. Determine the categories of the k nearest neighbors
    • Step 5. Use a simple majority vote over the categories of the k nearest neighbors as the category of the test point
  • Advantages
    • Simple and relatively effective
  • Disadvantages
    • Requires selection of an appropriate k
      • If k is too small, the model becomes too complex and overfits
      • If k is too large, the model becomes too simple and underfits
    • Does not produce a model
      • Because no separate trained model is built, every new data point requires a fresh computation, so the computational cost is high
    • Nominal features and missing data require additional processing
      • KNN is mainly used with numeric data, so nominal variables must be converted to numeric form (for example with label encoding or one-hot encoding), and missing values need separate preprocessing, which adds cost
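The five KNN steps above map almost one-to-one onto code. The sketch below assumes Euclidean distance and a tiny hypothetical training set; `knn_predict` is an illustrative name, not part of any library.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """train: list of (features, label) pairs; returns the majority label of the k nearest."""
    # Step 2: distance from the query to every training point (Euclidean, an assumption).
    distances = [(math.dist(features, query), label) for features, label in train]
    # Step 3: sort by distance.
    distances.sort(key=lambda pair: pair[0])
    # Step 4: collect the labels of the k nearest neighbors.
    nearest_labels = [label for _, label in distances[:k]]
    # Step 5: simple majority vote.
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labeled data: class "A" near the origin, class "B" further out.
train = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
         ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B")]
pred_near = knn_predict(train, (2, 2), k=3)   # Step 1: k is chosen by the caller
pred_far = knn_predict(train, (6, 5), k=3)
print(pred_near, pred_far)
```

Every call scans the whole training set, which illustrates the "does not produce a model" cost noted above.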

3. Naive Bayes

  • Bayes' theorem: a theorem that describes the relationship between prior and posterior probabilities
  • image
  • P(H|E) = P(E|H) · P(H) / P(E)
  • Terminology
    • Hypothesis (H): a hypothesis, or a claim that some event has occurred
    • Evidence (E): new information
    • Likelihood = P(E|H): how likely the evidence (E) is to be observed given the hypothesis (H)
      • Probability vs. likelihood
        • Probability: the relative proportion of a particular outcome
          • Sums to 1 over all outcomes (mutually exclusive and exhaustive)
        • Likelihood: the relative proportion with respect to a 'hypothesis'
          • Any number of hypotheses can be set up, and one may even contain another (not mutually exclusive and not exhaustive)
    • Prior probability = P(H): the degree of belief in the claim that some event has occurred
    • Posterior probability = P(H|E): the degree of belief updated after receiving new information
  • How the algorithm works
    • Step 1. Compute the prior probability of each class label
    • Step 2. Compute the likelihood of each attribute for each class
    • Step 3. Plug these values into Bayes' formula to compute the posterior probability
    • Step 4. The results of steps 1-3 show which class has the highest posterior probability (that is, which class the input most probably belongs to)
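A minimal sketch of the four steps above for categorical attributes. It assumes the "naive" conditional independence of attributes given the class and uses raw counts with no smoothing, so an attribute value never seen with a class gives that class a posterior of zero. The function names and the tiny weather-style dataset are hypothetical.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    # Step 1: prior P(class) from class frequencies.
    class_counts = Counter(labels)
    priors = {c: n / len(labels) for c, n in class_counts.items()}
    # Step 2: likelihood P(attribute_i = value | class) from co-occurrence counts.
    value_counts = defaultdict(Counter)   # (class, attribute index) -> Counter of values
    for row, c in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(c, i)][value] += 1

    def likelihood(c, i, value):
        return value_counts[(c, i)][value] / class_counts[c]

    return priors, likelihood

def posterior(priors, likelihood, row):
    # Step 3: Bayes' formula, multiplying per-attribute likelihoods (naive assumption).
    scores = {}
    for c, prior in priors.items():
        score = prior
        for i, value in enumerate(row):
            score *= likelihood(c, i, value)
        scores[c] = score
    total = sum(scores.values())          # P(E), the normalizing constant
    return {c: s / total for c, s in scores.items()}

# Hypothetical data: (weather, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("sunny", "cool"), ("rainy", "hot")]
labels = ["no", "no", "yes", "yes", "yes", "no"]
priors, likelihood = train_naive_bayes(rows, labels)
post = posterior(priors, likelihood, ("rainy", "mild"))
# Step 4: the predicted class is the one with the highest posterior.
best_class = max(post, key=post.get)
print(post, best_class)
```

In practice some form of smoothing (for example adding one to every count) is usually applied to avoid zero likelihoods; that is left out here to keep the steps visible.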

4. Association Mining(Apriori Algorithm)

  • Association Rule
    • An algorithm used to discover meaningful rules among the variables in a dataset
    • e.g., customers who buy ramen are likely to also buy Hetbahn (instant rice).
  • Process of generating association rules
    • Step 1. Support (joint event)
      • A measure of how frequently an itemset appears in the data
      • Support(X) = Count(X) / N
    • Step 2. Confidence (conditional probability)
      • A measure of how often the consequent item (Y) is purchased among the cases in which the antecedent item (X) is purchased
      • Confidence(X → Y) = Support(X, Y) / Support(X)
  • Apriori Algorithm
    • The representative association-rule algorithm; it finds rules about other events that occur together when a particular event occurs
    • How the algorithm works
      • Step 1. Compute the frequency of every item and keep only the items that meet the minimum support
      • Step 2. Combine the remaining items to build candidate itemsets of two items
      • Step 3. Compute the frequency of the candidate itemsets built in step 2 and keep only those that meet the minimum support
      • Step 4. Repeat steps 2-3 on the remaining itemsets, growing the candidates by one item each time, until no more candidate itemsets are produced
      • Step 5. For each frequent itemset, generate all possible association rules and compute the confidence of each
      • Step 6. Keep only the rules whose confidence meets the minimum confidence
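The six Apriori steps above can be sketched compactly with Python sets. This is a simplified version under stated assumptions: thresholds are treated as "greater than or equal to", candidates are built by joining surviving itemsets without a separate pruning pass, and the function name `apriori` and the shopping baskets are hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support, min_confidence):
    n = len(transactions)

    def support(itemset):
        # Support(X) = Count(X) / N
        return sum(1 for t in transactions if itemset <= t) / n

    # Step 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent = {}
    size = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        # Steps 2-4: join surviving itemsets into candidates one item larger,
        # then keep only the candidates that meet the minimum support.
        size += 1
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        current = {c for c in candidates if support(c) >= min_support}

    # Steps 5-6: build rules X -> Y from each frequent itemset and keep the
    # ones whose confidence Support(X, Y) / Support(X) meets the minimum.
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, r)):
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((set(lhs), set(itemset - lhs), confidence))
    return frequent, rules

# Hypothetical shopping baskets.
baskets = [{"ramen", "rice"}, {"ramen", "rice", "egg"}, {"ramen", "egg"},
           {"rice", "milk"}, {"ramen", "rice", "milk"}]
frequent, rules = apriori(baskets, min_support=0.5, min_confidence=0.7)
for lhs, rhs, conf in rules:
    print(lhs, "->", rhs, round(conf, 2))
```

The lookup `frequent[lhs]` is safe because every subset of a frequent itemset is itself frequent, which is the property Apriori relies on.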

5. Collaborative Filtering

  • Collaborative Filtering
    • Examines the similarities among products and users and, on that basis, recommends products that match a user's tastes; it can be divided into user-based and item-based collaborative filtering
      • e.g., music liked by people whose tastes are similar to a given user's is likely to be liked by that user as well
  • Recommendation Systems Applications
    • Amazon: image
    • Netflix: image
    • Watcha: image
  • User-based collaborative filtering (User-based filtering)

    • Basic idea: find similar users who share the target user's interests

      • e.g., in a movie recommendation system, if a user gives a particular movie a high rating, that movie is also recommended to users with similar tastes
    • image
    • Computing the similarity between users (using the Pearson correlation coefficient)

    • image
    • r(A, B) = cov(A, B) / (σ_A · σ_B)
    • Dividing the covariance of users A and B by the product of their standard deviations gives the Pearson correlation coefficient
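As a rough illustration of the user-similarity computation, the sketch below computes the Pearson correlation between two users' ratings. It assumes both lists contain ratings for the same items in the same order; the function name `pearson` and the ratings are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length rating lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    # Covariance and standard deviations; the common 1/n factors cancel out.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    std_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    std_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (std_a * std_b)

# Hypothetical ratings of the same four movies by three users.
user_a = [5, 4, 1, 2]
user_b = [4, 5, 2, 1]   # similar taste to user A
user_c = [1, 2, 5, 4]   # opposite taste to user A
sim_ab = pearson(user_a, user_b)
sim_ac = pearson(user_a, user_c)
print(round(sim_ab, 3), round(sim_ac, 3))
```

A user-based recommender would then weight other users' ratings by these similarities; a user who gives every item the same rating has zero standard deviation and needs special handling.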

  • Item-based collaborative filtering (Item-based filtering)

    • Basic idea: provide recommendations using the similarity between items
      • e.g., if most users who rated movie A highly also rated movie B highly, movie B can be recommended to users who like movie A
    • image
  • image
  • Advantages and disadvantages of user-based collaborative filtering

    • Advantages
      1. Intuitive: recommendations are based on the behavior of users with the same tastes, so they are easy to understand
      2. Personalized recommendations: finding similar users makes personalized recommendations possible
      3. Can recommend new items: when a new item is added to the system, it can easily be recommended according to existing users' tastes
    • Disadvantages
      1. Scalability problem: similarity computation becomes inefficient as the number of users grows; the computational cost is especially high on large datasets
      2. Sparsity problem: when the user-item matrix is sparse (has many empty cells), similar users are hard to find
      3. Cold-start problem: recommendations are difficult when there is not enough information about a new user
  • Advantages and disadvantages of item-based collaborative filtering

    • Advantages
      1. Scalability: the number of items is generally smaller than the number of users, so similarity computation is more efficient
      2. Stability: item similarity does not change much over time, so recommendations are more stable
      3. Mitigates the sparsity problem: recommendations are possible as long as the user has rated at least one item
    • Disadvantages
      1. New-item problem: similarity information for a new item is hard to obtain, so recommending it can be difficult
      2. Less personalization: harder to personalize than user-based collaborative filtering, because it relies on the overall similarity of items rather than on a specific user's tastes
      3. Initial training cost: the initial computation of item-to-item similarities can take a long time

6. Linear Regression

  • Linear Regression: a machine learning algorithm that models the linear relationship between an independent variable (X) and a dependent variable (Y) in the given data in order to predict continuous values (although it is a machine learning algorithm, it forms the basis of artificial neural network algorithms)
    • Y = β0 + β1X + ϵ
    • In the formula above, β denotes the regression coefficients and ϵ is the error term; the coefficients are found with the least-squares method, which minimizes the sum of squared errors
  • image
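To make the least-squares idea concrete, here is a small sketch for the single-variable case, assuming the model y = b0 + b1·x. It uses the closed-form solution that minimizes the sum of squared errors; the function name `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical noise-free data lying exactly on y = 1 + 2x.
b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```

With noisy data the fitted line no longer passes through every point, but it is still the line with the smallest sum of squared vertical errors.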

7. Perceptron & Adaline

  • Perceptron: the most basic form of a single-layer neural network; it has a structure similar to linear regression and is used to solve binary classification problems

    • The perceptron multiplies the inputs by weights and passes their sum through an activation function (usually a step function) to produce a binary output
    • y = step(w · x + b)
    • In the formula above, step is the step function that takes the weighted sum as input and determines the final output, w is the weight vector, x is the input vector, and b is the bias
    • Formula of the step function: image
  • image
  • Adaline: similar to the perceptron, but instead of applying the activation function (step function) to the output, it uses the result of the linear function for learning

    • y = w · x + b
    • In the formula above, w is the weight vector, x is the input vector, and b is the bias. Adaline updates its weights using the mean squared error function, minimizing a continuous error
  • image
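The difference between the two learning rules is easiest to see side by side. In the sketch below the only change is which output feeds the error: the thresholded output for the perceptron, the raw linear output for Adaline. The threshold-at-zero step function, the learning rates, the epoch counts, the 0.5 cut-off used to read Adaline's output as a class, and the AND-gate data are all assumptions chosen for the example.

```python
def step(z):
    # Threshold at zero; this particular convention is an assumption.
    return 1 if z >= 0 else 0

def net_input(w, b, x):
    # Weighted sum w . x + b
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train(samples, use_step, lr, epochs):
    """use_step=True: perceptron rule. use_step=False: Adaline (delta) rule."""
    w, b = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for x, target in samples:
            net = net_input(w, b, x)
            output = step(net) if use_step else net
            error = target - output
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

# Hypothetical task: learn the logical AND of two binary inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
p_w, p_b = train(and_gate, use_step=True, lr=1.0, epochs=10)
a_w, a_b = train(and_gate, use_step=False, lr=0.05, epochs=500)

perceptron_out = [step(net_input(p_w, p_b, x)) for x, _ in and_gate]
# Adaline's output is continuous; 0.5 is used here as the decision cut-off.
adaline_out = [1 if net_input(a_w, a_b, x) >= 0.5 else 0 for x, _ in and_gate]
print(perceptron_out, adaline_out)
```

The perceptron stops changing once every sample is classified correctly, while Adaline keeps nudging the weights toward the least-squares fit of the targets.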

8. Logistic Regression

  • Logistic Regression: a method devised to solve binary classification problems; it computes a linear combination of the input variables and passes it through the logistic (sigmoid) function, the inverse of the logit function, to predict a probability

    • The logistic function is S-shaped and takes values between 0 and 1. Its output is interpreted as the probability that a particular event occurs; in binary classification, as the probability of belonging to the positive class
    • P(Y=1) = 1 / (1 + e^-(β0 + β1X))
    • In the formula above, P(Y=1) is the probability of belonging to the positive class, X is the input variable, and β denotes the regression coefficients. The model uses maximum likelihood estimation (MLE) to find the regression coefficients that maximize the likelihood of the given data
  • Odds: the ratio of the probability of success to the probability of failure, that is, the probability that an event occurs compared with the probability that it does not

    • As p increases from 0 to 1, the odds grow slowly at first and then rise sharply as p approaches 1
    • Odds: p / (1 - p) (p = probability of success); strictly speaking, an odds ratio is the ratio of two such odds
      • e.g., what are the odds when the probability of an event is 80%?
        • 0.8 / (1 - 0.8) = 0.8 / 0.2 = 4
  • Logit function: the natural logarithm of the odds

    • logit(p) = log(p / (1 - p))
    • image
    • Equals 0 when p = 0.5, and tends to negative infinity as p approaches 0 and to positive infinity as p approaches 1
  • image
  • Activation function: a function that transforms the value obtained from the linear function before it is sent to the threshold function; a nonlinear function is usually used

    • Why do activation functions use nonlinear functions?
      • A linear function (a simple rule) separates the data with a straight line, and no matter how many layers are stacked, the rule is still expressed as a single straight line. In other words, repeating linear transformations still yields a linear function, so the extra layers add nothing.
      • A nonlinear function, however, can learn complex patterns in the data and stays nonlinear when composed, so a multi-layer structure becomes meaningful. In addition, most nonlinear activation functions are differentiable, which makes them suitable as activation functions.
  • Sigmoid function
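The odds, logit, and sigmoid definitions in this section fit together as shown in the short sketch below. It assumes the standard sigmoid 1 / (1 + e^-z), which is the inverse of the logit; the function names are illustrative.

```python
import math

def odds(p):
    # Ratio of the probability of success to the probability of failure.
    return p / (1 - p)

def logit(p):
    # Natural logarithm of the odds; maps (0, 1) onto the whole real line.
    return math.log(odds(p))

def sigmoid(z):
    # Inverse of the logit; maps any real number back into (0, 1).
    return 1 / (1 + math.exp(-z))

odds_80 = odds(0.8)                    # the worked example from the notes
logit_half = logit(0.5)                # zero at p = 0.5
round_trip = sigmoid(logit(0.8))       # recovers the original probability
print(odds_80, logit_half, round_trip)
```

Logistic regression models the logit of the positive-class probability as a linear function of the inputs, so applying the sigmoid to that linear score gives the probability itself.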

9. Single Layer Neural Network

  • Single-layer neural network: the simplest form of neural network, consisting of an input layer and an output layer. It covers models such as the perceptron, Adaline, and logistic regression; various activation functions can be used to give a nonlinear output, although a single layer can still only separate classes with a linear decision boundary
  • y = f(w · x + b)
  • In the formula above, f is the activation function (e.g., the sigmoid function or ReLU), w is the weight vector, x is the input vector, and b is the bias
  • image

10. Multi Layer Neural Network

11. Convolutional Neural Network

  • Summary
    • Accepts a volume of size W1 x H1 x D1
    • Requires four hyperparameters:
      • the number of filters K (common settings: powers of 2),
      • their spatial extent F,
      • the stride S,
      • the amount of zero padding P
    • Produces a volume of size W2 x H2 x D2 where:
      • W2 = (W1 - F + 2P)/S + 1
      • H2 = (H1 - F + 2P)/S + 1
      • D2 = K
    • With parameter sharing, it introduces FยทFยทD1 weights per filter, for a total of (FยทFยทD1)ยทK weights and K biases.
    • In the output volume, the d-th depth slice (of size W2 x H2) is the result of performing a valid convolution of the d-th filter over the input volume with a stride of S, and then offset by d-th bias.
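The size formulas in the summary above can be checked with a small helper. This is a sketch only: the function name `conv_output_shape` is hypothetical, and the two sample layer configurations are illustrative values, not taken from the notes.

```python
def conv_output_shape(w1, h1, d1, k, f, s, p):
    """Return ((W2, H2, D2), number of weights, number of biases) for a conv layer."""
    w_steps, rem_w = divmod(w1 - f + 2 * p, s)
    h_steps, rem_h = divmod(h1 - f + 2 * p, s)
    if rem_w or rem_h:
        # The filter positions do not tile the padded input evenly.
        raise ValueError("(W1 - F + 2P) and (H1 - F + 2P) must be divisible by S")
    w2, h2, d2 = w_steps + 1, h_steps + 1, k
    weights = f * f * d1 * k   # F*F*D1 weights per filter, K filters
    biases = k                 # one bias per filter
    return (w2, h2, d2), weights, biases

# Hypothetical layers.
same_size = conv_output_shape(32, 32, 3, k=10, f=5, s=1, p=2)
strided = conv_output_shape(227, 227, 3, k=96, f=11, s=4, p=0)
print(same_size)
print(strided)
```

With F = 5, S = 1 and P = 2 the spatial size is preserved, which is why padding of (F - 1) / 2 is a common choice for stride-1 layers.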

12. Recurrent Neural Network

13. Long Short-Term Memory


Similarity Measure

1. Pearson Correlation

  • Pearson correlation coefficient: a similarity measure whose possible values lie between -1 and 1

  • image
    • The closer to 1, the stronger the positive correlation
    • The closer to -1, the stronger the negative correlation
    • The closer to 0, the weaker the correlation

2. Cosine Similarity

  • Cosine similarity: computes similarity by measuring the angle between vectors
    • Dot product formula: A·B = ||A|| * ||B|| * cosθ
    • Rearranging the formula above gives cosθ = A·B / (||A|| * ||B||)
  • image
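The rearranged dot-product formula above translates directly into code. A minimal sketch, with a hypothetical function name and sample vectors:

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = A.B / (||A|| * ||B||) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cos_orthogonal = cosine_similarity([1, 0], [0, 1])      # perpendicular vectors
cos_parallel = cosine_similarity([1, 2, 3], [2, 4, 6])  # same direction, different length
cos_close = cosine_similarity([3, 4], [4, 3])
print(cos_orthogonal, cos_parallel, cos_close)
```

Because only the angle matters, scaling a vector does not change the result, which is why the measure is popular for comparing rating or term-frequency vectors of different magnitudes.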

Evaluation Metrics

1. Clustering

  • Adjusted Rand Index
    • image

2. Classification

  • Confusion matrix
    • Confusion matrix: a table that categorizes predictions according to whether they match the actual value
    • The most common performance measures consider the model's ability to discern one class versus all others
      • The class of interest is known as the positive class
      • All others are known as negative
    • The relationship between the positive class and negative class predictions can be depicted as a 2 x 2 confusion matrix
      • True Positive (TP): Correctly classified as the class of interest
      • True Negative (TN): Correctly classified as not the class of interest
      • False Positive (FP): Incorrectly classified as the class of interest
      • False Negative (FN): Incorrectly classified as not the class of interest
    • image
      • T and F stand for True and False: T is used when the predicted value matches the actual value, and F when they differ
      • P and N stand for Positive and Negative: P is used when the predicted value is the positive class (1), and N when the predicted value is the negative class (0)
      • e.g., predicted = 0 and actual = 0 gives TN
      • e.g., predicted = 1 and actual = 0 gives FP
  • Accuracy: for a 2 x 2 confusion matrix, accuracy can be expressed as the formula below
    • image
  • Error rate: the error rate is 1 minus the accuracy
    • image
  • Precision: the proportion of actual positives among the cases the model predicted as positive
    • image
    • Precision is easily confused with recall; remembering the keyword "predicted positive" makes the denominator, TP + FP, easy to recall
  • Recall: the proportion of predicted positives among the cases that are actually positive
    • image
    • Recall is easily confused with precision; remembering the keyword "actually positive" makes the denominator, TP + FN, easy to recall
  • F-Score: the harmonic mean of precision and recall
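The metrics above can all be derived from the four confusion-matrix counts. The sketch below assumes the standard definitions: accuracy = (TP + TN) / (TP + TN + FP + FN), precision = TP / (TP + FP), recall = TP / (TP + FN), and the F-score taken as the balanced F1 (an assumption, since the notes do not name a specific beta). The function name and labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Binary metrics with 1 as the positive class and 0 as the negative class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)          # among predicted positives
    recall = tp / (tp + fn)             # among actual positives
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": accuracy, "error_rate": 1 - accuracy,
            "precision": precision, "recall": recall, "f1": f1}

# Hypothetical labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The sketch does not guard against empty denominators; a model that never predicts the positive class would need precision handled separately.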

3. Regression

  • Mean Squared Error (MSE)
  • Mean Absolute Error (MAE)
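Both regression metrics average the prediction errors over the dataset; MSE squares each error while MAE takes its absolute value. A minimal sketch using the standard definitions, with hypothetical values:

```python
def mean_squared_error(y_true, y_pred):
    # Average of squared differences; large errors are penalized more heavily.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    # Average of absolute differences; every unit of error counts equally.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical targets and predictions: errors are 1, 0 and -2.
actual = [3, 5, 2]
predicted = [2, 5, 4]
mse = mean_squared_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print(mse, mae)
```

Here the single error of size 2 contributes 4 of the 5 squared-error units, which shows why MSE is more sensitive to outliers than MAE.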

Etc

  • Overfitting / Underfitting

    • There is a tradeoff between a model's ability to minimize bias and variance.
    • Overfitting: High variance, Low bias
    • Underfitting: High bias, Low variance
  • Regularization

    • L1 Regularization(Lasso)
    • L2 Regularization(Ridge)
  • Optimization

    • Gradient Descent
      • Stochastic Gradient Descent(SGD)
      • Batch Gradient Descent(BGD)
      • Mini-batch gradient descent(MSGD)
    • Backpropagation