2_EDA Analysis

210824

All images have been removed because of copyright.

Lesson 2 - Image Classification & EDA

  • Lesson 2 covered EDA (Exploratory Data Analysis), the process of analyzing data. Analyzing the data is an important part of designing a model. In this practice notebook we use the mask dataset to do some simple analysis and visualization.

  • The mask dataset contains many kinds of information. Taking a broad view, you can gather everyone's metadata and analyze the distribution of gender and age, or look at the distribution of the image values. You can also visualize individual images to explore what data is there, or compare how the images differ depending on whether a mask is worn. This code is only a simple example, and you are free to take the analysis much further!

0. Libraries & Configurations

import os
import sys
from glob import glob
import numpy as np
import pandas as pd
import cv2
from PIL import Image
from tqdm.notebook import tqdm
from time import time

import matplotlib.pyplot as plt
import seaborn as sns
import multiprocessing as mp
  • os : Provides dozens of functions for interacting with the operating system. Here it is mainly used as a tool for specifying the paths of images and folders.

  • sys : A module that gives direct access to the variables and functions provided by the Python interpreter. Here it is only imported and never used.

  • glob : Used to list the files that match a given pattern. It will be used to get the file names needed to open the images.

  • cv2 : The Python module for OpenCV (Open Source Computer Vision Library), an open-source computer vision library. It can manipulate images (open, convert, display, and so on). In this EDA it is used for the Haar cascade face detector; the images themselves are opened with PIL.

    • In the aistages environment, the following must be installed before it can be used:

apt-get install libgl1-mesa-glx 
  • PIL : Short for Python Imaging Library. A library that makes image analysis and processing easy.

  • tqdm : A library for displaying task progress visually.

    • The output of tqdm.tqdm tends to look messy, so tqdm.notebook.tqdm or tqdm.auto.tqdm is often used instead.

    • In the aistages environment, install the following:

pip install ipywidgets
  • time : A module used to measure time on the computer.

  • seaborn : A visualization library built on matplotlib that makes its features more convenient to use.

    • In the aistages environment, install it as follows:

pip install seaborn

class cfg:
    data_dir = './input/data/train'
    img_dir = f'{data_dir}/images'
    df_path = f'{data_dir}/train.csv'

A class is used as a pouch that holds the data paths. The approach I had seen until now was to declare uppercase variables (like DATA_DIR), so I learned that it can also be done this way.

Here the directory that contains the data is declared first as the base path, and the paths to the image data and the csv file are declared by appending to that base path.

As far as I know, writing paths directly as strings can cause problems, because the path separator differs between operating systems (/ versus \). So the usual pattern is to get the directory of the current .py file with os.path.dirname(os.path.abspath(__file__)) and to build the paths of subdirectories and files with os.path.join. Note that __file__ is not defined inside a notebook (.ipynb), so there the base path has to come from somewhere else, such as os.getcwd().
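
As a minimal sketch of the point above, the three paths from cfg can be built without hard-coded separators. The directory layout is the one from cfg; the use of pathlib is my own assumption for illustration and is not something the lesson code does.

```python
import os
from pathlib import Path

# Base directory taken from cfg above; everything else is joined onto it.
data_dir = os.path.join('.', 'input', 'data', 'train')
img_dir = os.path.join(data_dir, 'images')
df_path = os.path.join(data_dir, 'train.csv')

# pathlib alternative (an assumption for illustration): "/" joins path parts.
data_dir_p = Path('input') / 'data' / 'train'
img_dir_p = data_dir_p / 'images'
df_path_p = data_dir_p / 'train.csv'

print(img_dir)
print(df_path_p.name)    # file name only
print(df_path_p.suffix)  # extension, including the dot
```

Either style lets Python pick the separator, so the same code should run unchanged on Linux, macOS, and Windows.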

num2class = ['incorrect_mask', 'mask1', 'mask2', 'mask3',
             'mask4', 'mask5', 'normal']
class2num = {k: v for v, k in enumerate(num2class)}

df = pd.read_csv(cfg.df_path)
df.head()

       id  gender   race  age                    path
0  000001  female  Asian   45  000001_female_Asian_45
1  000002  female  Asian   52  000002_female_Asian_52
2  000004    male  Asian   54    000004_male_Asian_54
3  000005  female  Asian   58  000005_female_Asian_58
4  000006  female  Asian   59  000006_female_Asian_59

Each person has 7 photos: 1 without a mask (normal), 1 with the mask worn incorrectly (incorrect_mask, for example pulled down to the chin), and 5 with the mask worn properly (mask1 to mask5). The code above maps these categorical labels to the numbers 0 through 6.

In Python, a csv file can be loaded as a pandas DataFrame. To do that, pd.read_csv is called with the path to train.csv that was declared in the cfg class earlier.

df.head() shows the DataFrame from the first row; with no argument it shows 5 rows by default. df.tail() shows rows from the end instead.
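
The label mapping and head/tail behaviour described above can be checked on a small stand-in table. The rows of toy below are made up for illustration; only num2class and class2num come from the notes.

```python
import pandas as pd

num2class = ['incorrect_mask', 'mask1', 'mask2', 'mask3',
             'mask4', 'mask5', 'normal']
# enumerate yields (index, name); the dict flips that into name -> index.
class2num = {k: v for v, k in enumerate(num2class)}
print(class2num['incorrect_mask'], class2num['normal'])

# Tiny stand-in for train.csv (hypothetical rows).
toy = pd.DataFrame({
    'id': [f'{i:06d}' for i in range(1, 9)],
    'age': [45, 52, 54, 58, 59, 19, 20, 60],
})
print(len(toy.head()))              # default number of rows
print(len(toy.head(3)))             # explicit number of rows
print(toy.tail(2)['age'].tolist())  # last two rows
```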

1. Image RGB Information and Size

As an analysis of the images that will be the input, let's look at the characteristics of the images using per-channel information, size, object position, and so on.

1.1 Dataset Statistics

  • Here we compute the number and size of the images, and the mean and standard deviation of the R, G, and B values, over the whole set of images.

def get_ext(img_dir, img_id):
    """
    The training image folder consists of several subfolders, and each subfolder holds the photos of one person. This function returns the file extension of the images in a subfolder.
    
    Args:
        img_dir: path to the training dataset image folder
        img_id: name of a subfolder in the training dataset

    Returns:
        ext: file extension of the image
    """
    filename = os.listdir(os.path.join(img_dir, img_id))[0]
    ext = os.path.splitext(filename)[-1].lower()
    return ext

This declares a function called get_ext. The name alone strongly suggests that it gets a file extension: ext is short for "extension".

  • It takes a folder path and a folder name as arguments.

    • img_dir : will usually receive cfg.img_dir

      • Earlier this was set as img_dir = f'{data_dir}/images'.

    • img_id : receives text such as 001131_female_Asian_22.

  • os.listdir : Returns the names of the entries at the given path as a list. It is similar to glob.glob described earlier, but it returns only the names, not the full paths, which is why os.path.join is needed afterwards. The order of the returned list is not guaranteed.

  • os.path.join : Joins path components together.

    • ex) os.path.join('user', 'documents') == 'user/documents' (on Linux and macOS)

  • os.path.splitext : Splits the given file name at its last . into a (root, extension) pair; the extension keeps its leading dot.

So os.listdir returns the 7 image file names in the folder, filename is the first of them, and the function returns that file's extension.
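
The calls used inside get_ext can be tried on a throwaway folder. The folder and file names below are made up; only the os functions are the ones from the notes.

```python
import os
import tempfile

# Hypothetical folder standing in for one person's image directory.
with tempfile.TemporaryDirectory() as img_dir:
    img_id = '000001_female_Asian_45'
    folder = os.path.join(img_dir, img_id)
    os.makedirs(folder)
    for name in ['mask1.JPG', 'normal.JPG']:
        open(os.path.join(folder, name), 'w').close()

    # listdir returns names only; sorting makes the order predictable.
    names = sorted(os.listdir(folder))
    root, ext = os.path.splitext(names[0])  # split at the last dot
    ext = ext.lower()

print(names)
print(root, ext)
# Only the last dot counts, so a double extension keeps its first part.
print(os.path.splitext('archive.tar.gz'))
```

Sorting is not part of get_ext; it is added here only because os.listdir makes no promise about order.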

def get_img_stats(img_dir, img_ids):
    """
    Collects the sizes and the RGB means and standard deviations of the images in the dataset.
    
    Args:
        img_dir: path to the training dataset image folder
        img_ids: names of the subfolders in the training dataset

    Returns:
        img_info: information about the images (size, mean, standard deviation)
    """
    img_info = dict(heights=[], widths=[], means=[], stds=[])
    for img_id in tqdm(img_ids):
        for path in glob(os.path.join(img_dir, img_id, '*')):
            img = np.array(Image.open(path))
            h, w, _ = img.shape
            img_info['heights'].append(h)
            img_info['widths'].append(w)
            img_info['means'].append(img.mean(axis=(0,1)))
            img_info['stds'].append(img.std(axis=(0,1)))
    return img_info

The function seems to be named get_img_stats because it gets statistics about the images. Like get_ext, it takes a folder path and folder names as arguments; the difference is that here there are several folder names.

  • img_info holds the image heights, widths, means, and standard deviations in a dictionary.

  • tqdm(img_ids) : Shows the progress of the for loop visually as a progress bar. For it to render properly, the command below has to be run and the kernel restarted.

    • nodejs also has to be installed with apt-get install nodejs.

    • Even so, it still does not work well for me. A teammate did exactly the same thing and it worked, so what could the problem be?

# jupyter nbextension enable --py widgetsnbextension

Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK
  • np.array(Image.open(path)) : Opens the image with Image.open from PIL, the Python imaging library, then converts it to a numpy array so that it can be analyzed pixel by pixel.

  • img.shape : Returns the image height, width, and number of channels, in that order.

Because img is a numpy array, the numpy methods mean and std are used to get the mean and standard deviation. axis=(0, 1) tells numpy to aggregate over the height and width axes together, so for an image of shape (H, W, 3) the result is one value per channel. For a 2-D array such as the one below, axis=(0, 1) and no axis at all give the same result; for a 3-channel image they do not, because omitting axis also averages across the channels.

>>> a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> a.mean(axis=0)
array([4., 5., 6.])
>>> a.mean(axis=1)
array([2., 5., 8.])
>>> a.mean(axis=(0, 1))
5.0
>>> a.mean()
5.0
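
For an image-shaped array the difference shows up directly. This is a small sketch with a made-up 2x2 RGB array, not data from the mask dataset.

```python
import numpy as np

# Hypothetical 2x2 RGB "image": every pixel is (10, 20, 30).
img = np.zeros((2, 2, 3), dtype=np.uint8)
img[...] = (10, 20, 30)

per_channel = img.mean(axis=(0, 1))  # average over height and width only
overall = img.mean()                 # average over every value, channels included

print(per_channel.shape)  # (3,)
print(per_channel)        # [10. 20. 30.]
print(overall)            # 20.0
```

This is why get_img_stats passes axis=(0, 1): it keeps a separate mean and standard deviation for R, G, and B.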

img_info = get_img_stats(cfg.img_dir, df.path.values[:100])

print(f'Total number of people is {len(df)}')
print(f'Total number of images is {len(df) * 7}')

print(f'Minimum height for dataset is {np.min(img_info["heights"])}')
print(f'Maximum height for dataset is {np.max(img_info["heights"])}')
print(f'Average height for dataset is {int(np.mean(img_info["heights"]))}')
print(f'Minimum width for dataset is {np.min(img_info["widths"])}')
print(f'Maximum width for dataset is {np.max(img_info["widths"])}')
print(f'Average width for dataset is {int(np.mean(img_info["widths"]))}')

print(f'RGB Mean: {np.mean(img_info["means"], axis=0) / 255.}')
print(f'RGB Standard Deviation: {np.mean(img_info["stds"], axis=0) / 255.}')
Total number of people is 2700
Total number of images is 18900
Minimum height for dataset is 512
Maximum height for dataset is 512
Average height for dataset is 512
Minimum width for dataset is 384
Maximum width for dataset is 384
Average width for dataset is 384
RGB Mean: [0.55800916 0.51224077 0.47767341]
RGB Standard Deviation: [0.21817792 0.23804603 0.25183411]
  • df.path.values[:100] : As a reminder, df is the object obtained by reading train.csv with pd.read_csv. The csv file has a column called path, and this takes only the first 100 values of that column, so the statistics above are computed from the first 100 people rather than from the whole dataset.

    • A DataFrame column can be accessed as df.path or df['path']. The two are equivalent, but if the column name contains a space only the latter works.
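
A short sketch of the two access styles, using a made-up frame. The column name 'image path' is hypothetical and exists only to show the space case.

```python
import pandas as pd

# Hypothetical frame; 'image path' is a made-up column name containing a space.
toy = pd.DataFrame({
    'path': ['000001_female_Asian_45', '000002_female_Asian_52'],
    'image path': ['a.jpg', 'b.jpg'],
})

same = toy.path.equals(toy['path'])  # attribute and bracket access agree
first = toy['path'].values[:1]       # NumPy array, sliced like a list
with_space = toy['image path']       # only bracket access works here

print(same)
print(first)
print(with_space.tolist())
```

Bracket access is also the safer habit when a column name collides with a DataFrame attribute or method, such as count.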

1.2 Checking Object Positions

  • Let's try a slightly unusual analysis. This part includes code that goes beyond the lecture, so feel free to skip it.

  • Before deep learning was used to find human faces, a method called Haar Cascade was widely used. Let's use it to visualize a result in a simple way.

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

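This line gives me an error, but writing it as follows does not. Is it just me? Most likely the bare file name is looked up relative to the current working directory, whereas cv2.data.haarcascades points to the folder of cascade files bundled with the opencv-python package.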

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

What is face_cascade? It is a Haar cascade classifier: an object detection method that evaluates simple Haar-like features through a cascade of stages. It is a classic machine learning approach that predates the deep learning detectors used today.

The face_cascade used here is said to learn the best threshold values and then decide whether or not a face is present in the image. My guess is that after something like edge detection a face mostly looks like a vertical ellipse with the outlines of the eyes, nose, and mouth showing, and that it learns along those lines. We are going to use this face_cascade, which is said to have been trained with more than 6,000 features across 38 stages.

  • threshold means a cut-off value; think of it as the point at which a decision is made. If a score of at least 60 on a Korean language exam is required to be exempt from a retake, the threshold is 60. Edge detection mostly leaves only the outlines in an image, but some outlines are sharp and others faint, so the threshold decides how strong an outline must be to count as an edge. face_cascade has learned the best threshold for each stage from a large number of images and many features.
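
To make the idea of a threshold concrete, here is a toy example on one row of pixels. It is only an illustration of thresholding an edge strength; it is not how a Haar cascade computes its features, and both the pixel values and the cut-off of 50 are assumptions.

```python
import numpy as np

# Hypothetical row of grayscale pixels: a faint step, then a sharp step.
row = np.array([10, 10, 25, 25, 200, 200], dtype=np.int32)

# Edge strength = absolute difference between neighbouring pixels.
strength = np.abs(np.diff(row))

threshold = 50                   # assumed cut-off, chosen by hand here
is_edge = strength >= threshold  # only the sharp step survives

print(strength.tolist())
print(is_edge.tolist())
```

Lowering the threshold to 15 would also keep the faint step, which is exactly the trade-off a learned threshold settles.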

imgs = []
img_id = df.iloc[500].path
ext = get_ext(cfg.img_dir, img_id)
for class_id in num2class:
    img = np.array(Image.open(os.path.join(cfg.img_dir, img_id, class_id+ext)))
    imgs.append(img)
imgs = np.array(imgs)

This code takes the path at row 500 of the DataFrame (df.iloc[500]) and loads the 7 images inside that folder.

  • This was the first time I learned that os.path.join accepts more than two arguments.

fig, axes = plt.subplots(1, 3, sharex=True, sharey=True, figsize=(12, 6))
axes[0].imshow(imgs[0])
axes[1].imshow(imgs[1])
axes[2].imshow(imgs[-1])
plt.show()

Because the classes were declared in the following order, imgs[0] is incorrect_mask, imgs[1] is mask1, and imgs[-1] is normal, which gives the output above.

num2class = ['incorrect_mask', 'mask1', 'mask2', 'mask3', 'mask4', 'mask5', 'normal']

2. Analysis of the Target y

Let's check what information we have to predict and how it is distributed.

  • Here we analyze the metadata stored in train.csv. Let's use the seaborn visualization library to look at the gender distribution and the age distribution.

2.1 Distribution of Each y Value on Its Own

plt.figure(figsize=(6, 4.5)) 
ax = sns.countplot(x = 'gender', data = df, palette=["#55967e", "#263959"])

plt.xticks( np.arange(2), ['female', 'male'] )
plt.title('Sex Ratio',fontsize= 14)
plt.xlabel('')
plt.ylabel('Number of images')

counts = df['gender'].value_counts()
counts_pct = [f'{elem * 100:.2f}%' for elem in counts / counts.sum()]
for i, v in enumerate(counts_pct):
    ax.text(i, 0, v, horizontalalignment = 'center', size = 14, color = 'w', fontweight = 'bold')
    
plt.show()

The count for each gender is shown as a bar chart.

  • Line 2

    • The values of the gender column go on the x-axis and their counts on the y-axis, and the bar colors are specified.

  • Line 4-7

    • Sets the x tick labels, the title, and the x-axis and y-axis labels.

  • Line 9-10

    • Counts the rows for each value of the gender column and divides by the total number of rows to express the counts as percentages.

  • Line 11

    • Writes the text v at position (i, 0), horizontally centered and in white.
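
The percentage labels from lines 9-10 can be reproduced on a small series. One thing worth noting is that value_counts sorts by frequency, so this sketch pins the order explicitly with reindex to keep each label under the right bar; that step is my own addition, and the gender values are made up.

```python
import pandas as pd

# Hypothetical gender column; the real one comes from train.csv.
gender = pd.Series(['female', 'female', 'male', 'female'], name='gender')

order = ['female', 'male']                     # fixed bar order
counts = gender.value_counts().reindex(order)  # align counts with that order
labels = [f'{p * 100:.2f}%' for p in counts / counts.sum()]

print(counts.tolist())
print(labels)
```

Passing the same order list to sns.countplot through its order parameter would keep the bars and the labels in step.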

sns.displot(df, x="age", stat="density")
plt.show()

seaborn has both distplot and displot, so don't mix them up; distplot is the older function and has been deprecated. displot draws the distribution as a histogram by default. The x-axis is age and the y-axis is the count for each age bin, and stat decides in which unit the y-axis is expressed. Here it is density; the default is count.

2.2 Joint Distribution of the y Values

  • Let's see how the data is distributed by age and gender.

sns.displot(df, x="age", hue="gender", stat="density")
plt.show()

Adding the hue argument brings in another variable from the data and draws a plot in which its values are distinguished by color. Here gender was added, so male and female are drawn in different colors.

df['age'].describe()
count    2700.000000
mean       37.708148
std        16.985904
min        18.000000
25%        20.000000
50%        36.000000
75%        55.000000
max        60.000000
Name: age, dtype: float64

For a DataFrame, describe gives the main summary statistics of numeric data.
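
describe can also be combined with groupby to get the same statistics per gender, which is the numeric counterpart of a box plot split by gender. The rows below are made up for illustration.

```python
import pandas as pd

# Hypothetical rows standing in for train.csv.
toy = pd.DataFrame({
    'gender': ['female', 'female', 'male', 'male'],
    'age': [20, 60, 18, 30],
})

overall = toy['age'].describe()  # count, mean, std, min, quartiles, max
by_gender = toy.groupby('gender')['age'].describe()

print(overall['mean'])
print(by_gender.loc['female', '50%'])  # median age of the female rows
print(by_gender.loc['male', 'max'])
```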

sns.boxplot(x='gender', y='age', data=df)
plt.show()
  • The age range is the same for men and women, but the distributions differ.

  • The data imbalance looks severe. What kind of approach should be used to deal with it?

3. Checking the Relationship Between X and y

What relationships are there between the images X and y?

  • Once the relationship between the images and y is understood, the preprocessing, data augmentation, or CNN architecture can be adapted to suit the problem being solved.

3.1 Image Size and y

  • All images have the same size, so image size has no relationship with y.

3.2 Image RGB Statistics and y

img_id = df.iloc[500].path
ext = get_ext(cfg.img_dir, img_id)

To check the image statistics, one arbitrary sample is chosen; here it is the one at row 500.

plt.figure()
plt.subplot(111)

for class_id in num2class:
    img = np.array(Image.open(os.path.join(cfg.img_dir, img_id, class_id+ext)).convert('L'))
    histogram, bin_edges = np.histogram(img, bins=256, range=(0, 255))
    sns.lineplot(data=histogram)

plt.legend(num2class)
plt.title('Class Grayscale Histogram Plot', fontsize=15)
plt.show()
  • ๋งˆ์Šคํฌ์˜ ์ข…๋ฅ˜๊ฐ€ 5๊ฐœ๋ผ plot์ด ์‚ฐ๋งŒํ•œ ๊ฒƒ๊ฐ™์œผ๋‹ˆ ๋งˆ์Šคํฌ๋Š” ํ‰๊ท ์„ ์ทจํ•ด์„œ ํ™•์ธํ•ด๋ด…์‹œ๋‹ค.

An Image object has a convert method that changes the image mode. Here the image is converted to L, which is grayscale. After conversion each pixel holds a single lightness value from 0 to 255.

  • Other modes include 1 for bilevel images (each pixel is black or white) and P for palette images (8-bit pixels mapped through a color palette). Transparency is carried by modes with an alpha channel, such as RGBA or LA.
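The Pillow documentation describes the color-to-L conversion as the ITU-R 601-2 luma transform, L = R * 299/1000 + G * 587/1000 + B * 114/1000. The sketch below reproduces that formula with NumPy only. The helper name `rgb_to_luma` and the sample pixels are hypothetical, and Pillow's internal rounding may differ by one level.

```python
import numpy as np

def rgb_to_luma(rgb):
    """Approximate convert('L') for an (H, W, 3) uint8 array."""
    rgb = rgb.astype(np.float64)
    luma = rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
    return np.clip(np.round(luma), 0, 255).astype(np.uint8)

# Hypothetical 1x3 image: pure red, pure green, pure white.
pixels = np.array([[[255, 0, 0], [0, 255, 0], [255, 255, 255]]], dtype=np.uint8)
gray = rgb_to_luma(pixels)
print(gray.shape, gray.dtype)
```

Green contributes the most to the result and blue the least, so two photos with the same overall brightness can still differ in their grayscale histograms.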

np.histogram takes an array as input. bins sets how many intervals the histogram has, and range sets the lower and upper bounds of the values to count. It returns the histogram counts and bin_edges, an array of bin boundaries whose length is len(hist) + 1.
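The return values described above can be checked on a tiny array. This is a minimal sketch; the 2x3 "image" is made up for illustration.

```python
import numpy as np

# Hypothetical tiny grayscale "image", made up for illustration.
img = np.array([[0, 0, 10], [10, 10, 255]], dtype=np.uint8)

# Same arguments as in the notebook: 256 bins over the range 0..255.
hist, bin_edges = np.histogram(img, bins=256, range=(0, 255))

print(len(hist), len(bin_edges))      # number of bins and number of edges
print(hist[0], hist[10], hist[255])   # counts for the values 0, 10, and 255
```

Because the array is flattened before counting, the histogram drops all spatial information and keeps only how often each lightness value occurs.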

plt.figure()
plt.subplot(111)

img = np.array(Image.open(os.path.join(cfg.img_dir, img_id, 'incorrect_mask'+ext)).convert('L'))
histogram, bin_edges = np.histogram(img, bins=256, range=(0, 255))
sns.lineplot(data=histogram)

img = np.array(Image.open(os.path.join(cfg.img_dir, img_id, 'normal'+ext)).convert('L'))
histogram, bin_edges = np.histogram(img, bins=256, range=(0, 255))
sns.lineplot(data=histogram, color='hotpink')

histograms = []
for i in range(1, 6):
    img = np.array(Image.open(os.path.join(cfg.img_dir, img_id, num2class[i]+ext)).convert('L'))
    histogram, bin_edges = np.histogram(img, bins=256, range=(0, 255))
    histograms.append(histogram)
sns.lineplot(data=np.mean(histograms, axis=0))

plt.legend(['incorrect_mask', 'normal', 'mask average'])
plt.title('Class Grayscale Histogram Plot', fontsize=15)
plt.show()

Here the histograms of the five mask images are averaged with np.mean (over axis=0) and drawn as a single line.

  • Shall we also look at the RGB distribution of the photo without a mask?

plt.figure()
plt.subplot(111)

img = np.array(Image.open(os.path.join(cfg.img_dir, img_id, 'normal'+ext)))
colormap = ['red', 'green', 'blue']
for i in range(3):
    histogram, bin_edges = np.histogram(img[..., i], bins=256, range=(0, 255))
    sns.lineplot(data=histogram, color=colormap[i])

plt.legend(colormap)  # label the three channel curves; a bare plt.legend() finds no labeled artists
plt.title('RGB Histogram Plot - Normal', fontsize=15)
plt.show()

3.3 Object position and y

One way to find the relationship between object position and y is to check every image by hand, but it should also be possible to use the face detection from earlier and look at statistics of the detected box positions.

  • Extracting the statistics for each image is left as an exercise for the campers.

  • The code below checks which labels the detector fails to find a face for.

face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
imgs = []
bboxes = []
not_found_idx = []
img_id = df.iloc[504].path
ext = get_ext(cfg.img_dir, img_id)
for i, class_id in enumerate(num2class):
    img = np.array(Image.open(os.path.join(cfg.img_dir, img_id, class_id+ext)))
    bbox = face_cascade.detectMultiScale(img)
    imgs.append(img)
    if len(bbox) != 0:
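        # Keep the largest detection; an element-wise max over (x, y, w, h) would mix values from different boxes.
        bboxes.append(bbox[np.argmax(bbox[:, 2] * bbox[:, 3])])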
    else:
        not_found_idx.append(i)
        print(f'{class_id} not found face')
imgs = np.array(imgs)
bboxes = np.array(bboxes)
incorrect_mask not found face
mask1 not found face
mask5 not found face

This time we select the image at index 504. bbox is obtained from face_cascade.detectMultiScale; it holds the bounding boxes of the detected faces, one (x, y, w, h) row per detection.

Here no face was detected in 3 of the 6 masked photos (5 of which show the mask worn correctly).

fig, axes = plt.subplots(1, len(not_found_idx), sharex=True, sharey=True, figsize=(12, 6))
for i, j in enumerate(not_found_idx):  # j is the class index of an image with no detected face
    axes[i].imshow(imgs[j])
    axes[i].set_title(f'{num2class[j]}')
plt.show()
  • In most images the person is in the center of the frame.

  • mask5 usually yields no bbox.

  • mask1 occasionally yields no bbox as well.
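The per-image box statistics suggested earlier in this section could be sketched as follows. It assumes boxes in the (x, y, w, h) format that detectMultiScale returns. The helper name `bbox_stats`, the sample boxes, and the image size are hypothetical.

```python
import numpy as np

def bbox_stats(bboxes, img_w, img_h):
    """Summarize (x, y, w, h) boxes as normalized centers and area fractions."""
    bboxes = np.asarray(bboxes, dtype=np.float64).reshape(-1, 4)
    cx = (bboxes[:, 0] + bboxes[:, 2] / 2) / img_w   # box center, 0..1 across the width
    cy = (bboxes[:, 1] + bboxes[:, 3] / 2) / img_h   # box center, 0..1 down the height
    area = bboxes[:, 2] * bboxes[:, 3] / (img_w * img_h)
    return {
        'center_x_mean': cx.mean(),
        'center_y_mean': cy.mean(),
        'center_x_std': cx.std(),
        'center_y_std': cy.std(),
        'area_mean': area.mean(),
    }

# Hypothetical detections on a hypothetical 400x500 image.
boxes = [[100, 150, 200, 200], [120, 130, 160, 240]]
stats = bbox_stats(boxes, img_w=400, img_h=500)
print(stats)
```

Center means close to 0.5 with a small standard deviation would support the observation that faces sit near the middle of the frame, which in turn suggests a center crop as a preprocessing step.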

3.4 Checking for data noise

  • There are 7 photos per person (5 with the mask worn correctly, 1 without a mask, 1 with the mask worn incorrectly).

  • In this part we visualize the images directly and inspect them by eye to check whether the labels have problems.

Let's visualize one person's data.

def plot_raw_images(img_dir, img_id):
    """
    Visualize the image without a mask.
    
    Args:
        img_dir: path to the training dataset image folder
        img_id: name of the subfolder in the training dataset
    """
    ext = get_ext(img_dir, img_id)
    img = np.array(Image.open(os.path.join(img_dir, img_id, 'normal' + ext)))
    
    plt.figure(figsize=(6,6))
    plt.imshow(img)
def show_from_id(idx):
    img_id = df.iloc[idx].path
    gen = df.iloc[idx].gender
    age = df.iloc[idx].age
    plot_raw_images(cfg.img_dir, img_id)
    plt.title(f'{gen} {age}')
    plt.show()
  • Looks male but is labeled female

show_from_id(2399)
show_from_id(2400)
  • Looks female but is labeled male

show_from_id(1912)
show_from_id(764)

Quite a few samples turn out to be noisy in a way that obscures the trends in the data.

What would be a good way to handle this?

  • Let's check the mask-wearing state for each id.

def plot_mask_images(img_dir, img_id):
    """
    Visualize the 5 correctly worn mask images and the 1 incorrectly worn image on a 2x3 grid.
    
    Args:
        img_dir: path to the training dataset image folder
        img_id: name of the subfolder in the training dataset
    """
    ext = get_ext(img_dir, img_id)
    imgs = [np.array(Image.open(os.path.join(img_dir, img_id, class_name + ext))) for class_name in num2class[:-1]]
    
    n_rows, n_cols = 2, 3
    fig, axes = plt.subplots(n_rows, n_cols, sharex=True, sharey=True, figsize=(15, 12))
    for i in range(n_rows*n_cols):
        axes[i // n_cols][i % n_cols].imshow(imgs[i])  # row = i // n_cols, column = i % n_cols
        axes[i // n_cols][i % n_cols].set_title(f'{num2class[i]}', color='r')
    plt.tight_layout()
    plt.show()
idx = 500
img_id = df.iloc[idx].path
plot_mask_images(cfg.img_dir, img_id)

I do not know PCA well enough yet, and given the time I cannot explain this part. If I stopped to study and explain it, I would not get to the labeling work. It is a real shame, but my notes stop here.

(Optional) PCA

  • Principal component analysis finds the principal components of the image data distribution.

  • Let's compute the principal component vectors (eigenfaces) of the face images, then reduce the dimensionality with t-SNE and visualize how the distributions differ between classes. (The original text says 300 images; the code sets n_imgs = 100 people, which gives the 700 images reported in the t-SNE log.)
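The idea behind eigenfaces can be sketched with NumPy alone: center the flattened images, take an SVD, and read the principal axes from the right singular vectors. This is a simplified stand-in for sklearn's PCA (no whitening, no randomized solver), run on random data with hypothetical sizes, so the numbers carry no meaning.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, h, w = 20, 8, 6                 # hypothetical tiny "images"
imgs = rng.random((n_samples, h * w))      # each row is one flattened image

n_components = 5
mean_face = imgs.mean(axis=0)
centered = imgs - mean_face

# Rows of vt are the principal axes, ordered by decreasing singular value.
u, s, vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = vt[:n_components].reshape(n_components, h, w)

# Share of the total variance carried by each principal axis.
explained_variance = s ** 2 / (n_samples - 1)
explained_variance_ratio = explained_variance / explained_variance.sum()

# Coordinates of every image in the reduced space.
projected = centered @ vt[:n_components].T
print(eigenfaces.shape, projected.shape)
print(explained_variance_ratio[:n_components])
```

Each eigenface has the same shape as an input image, so it can be displayed with imshow, and the projected coordinates play the role of img_pca in the notebook.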

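from time import time  # needed for the timing calls below

from sklearn.manifold import TSNE
from sklearn.decomposition import PCA
n_imgs = 100

imgs = []
for img_id in df.path.values[:n_imgs]:
    ext = get_ext(cfg.img_dir, img_id)  # look up the extension per folder instead of reusing a stale one
    for class_id in num2class:
        img = np.array(Image.open(os.path.join(cfg.img_dir, img_id, class_id+ext)).convert('L'))
        imgs.append(img)
imgs = np.array(imgs)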
n_samples, h, w = imgs.shape

imgs = np.reshape(imgs, (n_samples, h*w))
n_components = 30

t0 = time()
pca = PCA(n_components=n_components, svd_solver='randomized',
          whiten=True).fit(imgs)
print(f"pca is fitted in {time() - t0:.0f}s")
print(f'Explained variation per principal component: \n{pca.explained_variance_ratio_}')

eigenfaces = pca.components_.reshape((n_components, h, w))
img_pca = pca.transform(imgs)
pca is fitted in 6s
Explained variation per principal component: 
[0.16400588 0.10582972 0.07423492 0.05696156 0.03344151 0.02725828
 0.02416499 0.02329284 0.02024689 0.01692791 0.01573021 0.013579
 0.01292885 0.01185079 0.01141357 0.00954064 0.00822195 0.0078436
 0.00709916 0.00670267 0.00646114 0.00626619 0.0059206  0.00564931
 0.00546337 0.00517184 0.00475507 0.00465659 0.00437207 0.00422053]
pca_df = pd.DataFrame(img_pca, columns=[str(col) for col in range(n_components)])
pca_df['class_id'] = [num2class[n % len(num2class)] for n in range(n_samples)]
pca_df['class_id'] = pca_df['class_id'].map(lambda x: x if x in ['incorrect_mask', 'normal'] else 'mask')
pca_df.head()

          0         1         2         3         4         5         6         7         8         9  ...
0 -0.780943 -0.301832  0.909870  0.561611  0.377147  0.319390  0.490564 -0.157337  0.226589 -0.055671  ...
1 -1.531008  0.054247  1.062841  0.448427  0.221644  0.127141 -0.152839  2.102684  2.525257  0.799864  ...
2 -0.878984 -0.366578  0.961171  0.348036  0.147981 -0.001350  0.647854 -0.286916  0.415846  0.085153  ...
3 -0.354757 -0.443230  1.248900  0.964928  0.401074  0.732453  0.595657 -0.243793  0.063750  0.278082  ...
4 -0.526067 -0.342682  1.048815  0.778287  0.344716  0.535303  0.591965 -0.187899  0.033147  0.083656  ...

         21        22        23        24        25        26        27        28        29        class_id
0  1.402771  0.698502  1.009226  1.024885  0.241752 -0.573497  0.032257  1.244118 -1.303907  incorrect_mask
1 -0.562831 -0.488063 -0.535817 -0.568799  1.178023 -0.692086  0.088736  0.438619 -0.489825            mask
2  0.990493  0.476577  0.431235  1.320901 -0.039927 -0.440994  0.039908  1.034641 -0.618757            mask
3  1.053063 -0.556322  0.290911  0.385133 -0.800782 -0.706160 -0.874645  0.807353 -0.202597            mask
4  1.541471  0.074838  1.002258  0.754855 -0.308133 -0.994480 -0.577034  1.347365 -1.466241            mask

5 rows × 31 columns

plt.figure(figsize=(8,6))
sns.scatterplot(
    x='0', y='1',
    hue="class_id",
    data=pca_df,
    legend="full",
    palette=sns.color_palette("Set2", 3),
    alpha=0.8
)
plt.show()
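# add_subplot(projection='3d') replaces gca(projection='3d'), which recent Matplotlib versions no longer accept.
ax = plt.figure(figsize=(16,10)).add_subplot(projection='3d')
simplified_num2class = ['incorrect_mask', 'mask', 'normal']
simplified_class2num = {k: v for v, k in enumerate(simplified_num2class)}
scatter = ax.scatter(
    xs=pca_df["0"], 
    ys=pca_df["1"], 
    zs=pca_df["2"], 
    c=pca_df['class_id'].map(lambda x: simplified_class2num[x]), 
)
ax.set_xlabel('pc1')
ax.set_ylabel('pc2')
ax.set_zlabel('pc3')

# One legend entry per class, taken from the scatter's color mapping.
ax.legend(scatter.legend_elements()[0], simplified_num2class)
plt.show()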
time_start = time()
# Note: scikit-learn 1.5 renamed n_iter to max_iter; on newer versions pass max_iter=300 instead.
tsne = TSNE(n_components=2, verbose=1, perplexity=40, n_iter=300)
tsne_results = tsne.fit_transform(img_pca)
print('t-SNE done! Time elapsed: {} seconds'.format(time()-time_start))
[t-SNE] Computing 121 nearest neighbors...
[t-SNE] Indexed 700 samples in 0.000s...
[t-SNE] Computed neighbors for 700 samples in 0.036s...
[t-SNE] Computed conditional probabilities for sample 700 / 700
[t-SNE] Mean sigma: 2.323550
[t-SNE] KL divergence after 250 iterations with early exaggeration: 71.543358
[t-SNE] KL divergence after 300 iterations: 1.380570
t-SNE done! Time elapsed: 0.7664225101470947 seconds
pca_df['tsne-2d-one'] = tsne_results[:,0]
pca_df['tsne-2d-two'] = tsne_results[:,1]
plt.figure(figsize=(8,6))
sns.scatterplot(
    x="tsne-2d-one", y="tsne-2d-two",
    hue="class_id",
    palette=sns.color_palette("Set2", 3),
    data=pca_df,
    legend="full",
    alpha=0.8
)
plt.show()

4. Reference


For example, glob('*.exe') returns the names of the files in the current directory that match .exe (['python.exe', 'pythonw.exe']). Such patterns resemble regular expressions, although glob itself uses Unix shell-style wildcards rather than full regex syntax.
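The glob behavior described above can be tried safely in a temporary directory. This is a small sketch; the file names are created only for the demonstration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create three empty files, two of which match the pattern.
    for name in ['python.exe', 'pythonw.exe', 'README.txt']:
        open(os.path.join(tmp_dir, name), 'w').close()

    # '*' matches any run of characters within a single file name.
    paths = glob.glob(os.path.join(tmp_dir, '*.exe'))
    matches = sorted(os.path.basename(p) for p in paths)

print(matches)
```

The results are sorted here because glob does not guarantee any particular order.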

If the code does not run properly, download the haarcascade_frontalface_default.xml file provided by OpenCV into the current path.

face_cascade extracts image features using kernels like the ones above. Each feature is computed by subtracting the pixel values under the white region and adding the pixel values under the black region. With a 24x24 window, this reportedly produces as many as 160,000 features. (I have not worked out exactly what the window is, but I assumed it was the kernel size. More precisely, it is the sub-window of the image being scanned, and kernels of every size and position inside it are enumerated, which is why the count is so large.) Deep learning learns the feature maps of an image on its own. In contrast, machine learning before deep learning relied on people designing image features by hand, like the kernels above. See the linked reference for a detailed explanation.
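The feature computation described above (black-region sum minus white-region sum) can be sketched with an integral image, which makes every rectangle sum a constant-time lookup. This is an illustration, not OpenCV's implementation. The function names are hypothetical, and treating the left half as white and the right half as black is an arbitrary assumption.

```python
import numpy as np

def integral_image(img):
    """Summed-area table padded with a leading zero row and column."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] from four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(img, top, left, height, width):
    """Edge feature: black (right half) sum minus white (left half) sum."""
    ii = integral_image(img)
    half = width // 2
    white = rect_sum(ii, top, left, height, half)
    black = rect_sum(ii, top, left + half, height, half)
    return black - white

# Hypothetical 24x24 window: dark left half, bright right half.
window = np.zeros((24, 24), dtype=np.int64)
window[:, 12:] = 255
feature = two_rect_feature(window, 0, 0, 24, 24)
print(feature)
```

A large magnitude means the window contains a strong vertical edge at that position, and a detector combines many such features of different sizes and positions.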

Regular expressions
haarcascade_frontalface_default.xml
Here
Visualising high-dimensional datasets using PCA and t-SNE in Python
Faces recognition example using eigenfaces and SVMs
Seaborn docs
Face Detection in 2 Minutes using OpenCV & Python