๐Ÿšดโ€โ™‚๏ธ
TIL
  • MAIN
  • : TIL?
  • : WIL
  • : Plan
  • : Retrospective
    • 21Y
      • Wait a moment!
      • 9M 2W
      • 9M1W
      • 8M4W
      • 8M3W
      • 8M2W
      • 8M1W
      • 7M4W
      • 7M3W
      • 7M2W
      • 7M1W
      • 6M5W
      • 1H
    • ์ƒˆ์‚ฌ๋žŒ ๋˜๊ธฐ ํ”„๋กœ์ ํŠธ
      • 2ํšŒ์ฐจ
      • 1ํšŒ์ฐจ
  • TIL : ML
    • Paper Analysis
      • BERT
      • Transformer
    • Boostcamp 2st
      • [S]Data Viz
        • (4-3) Seaborn ์‹ฌํ™”
        • (4-2) Seaborn ๊ธฐ์ดˆ
        • (4-1) Seaborn ์†Œ๊ฐœ
        • (3-4) More Tips
        • (3-3) Facet ์‚ฌ์šฉํ•˜๊ธฐ
        • (3-2) Color ์‚ฌ์šฉํ•˜๊ธฐ
        • (3-1) Text ์‚ฌ์šฉํ•˜๊ธฐ
        • (2-3) Scatter Plot ์‚ฌ์šฉํ•˜๊ธฐ
        • (2-2) Line Plot ์‚ฌ์šฉํ•˜๊ธฐ
        • (2-1) Bar Plot ์‚ฌ์šฉํ•˜๊ธฐ
        • (1-3) Python๊ณผ Matplotlib
        • (1-2) ์‹œ๊ฐํ™”์˜ ์š”์†Œ
        • (1-1) Welcome to Visualization (OT)
      • [P]MRC
        • (2๊ฐ•) Extraction-based MRC
        • (1๊ฐ•) MRC Intro & Python Basics
      • [P]KLUE
        • (5๊ฐ•) BERT ๊ธฐ๋ฐ˜ ๋‹จ์ผ ๋ฌธ์žฅ ๋ถ„๋ฅ˜ ๋ชจ๋ธ ํ•™์Šต
        • (4๊ฐ•) ํ•œ๊ตญ์–ด BERT ์–ธ์–ด ๋ชจ๋ธ ํ•™์Šต
        • [NLP] ๋ฌธ์žฅ ๋‚ด ๊ฐœ์ฒด๊ฐ„ ๊ด€๊ณ„ ์ถ”์ถœ
        • (3๊ฐ•) BERT ์–ธ์–ด๋ชจ๋ธ ์†Œ๊ฐœ
        • (2๊ฐ•) ์ž์—ฐ์–ด์˜ ์ „์ฒ˜๋ฆฌ
        • (1๊ฐ•) ์ธ๊ณต์ง€๋Šฅ๊ณผ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ
      • [U]Stage-CV
      • [U]Stage-NLP
        • 7W Retrospective
        • (10๊ฐ•) Advanced Self-supervised Pre-training Models
        • (09๊ฐ•) Self-supervised Pre-training Models
        • (08๊ฐ•) Transformer (2)
        • (07๊ฐ•) Transformer (1)
        • 6W Retrospective
        • (06๊ฐ•) Beam Search and BLEU score
        • (05๊ฐ•) Sequence to Sequence with Attention
        • (04๊ฐ•) LSTM and GRU
        • (03๊ฐ•) Recurrent Neural Network and Language Modeling
        • (02๊ฐ•) Word Embedding
        • (01๊ฐ•) Intro to NLP, Bag-of-Words
        • [ํ•„์ˆ˜ ๊ณผ์ œ 4] Preprocessing for NMT Model
        • [ํ•„์ˆ˜ ๊ณผ์ œ 3] Subword-level Language Model
        • [ํ•„์ˆ˜ ๊ณผ์ œ2] RNN-based Language Model
        • [์„ ํƒ ๊ณผ์ œ] BERT Fine-tuning with transformers
        • [ํ•„์ˆ˜ ๊ณผ์ œ] Data Preprocessing
      • Mask Wear Image Classification
        • 5W Retrospective
        • Report_Level1_6
        • Performance | Review
        • DAY 11 : HardVoting | MultiLabelClassification
        • DAY 10 : Cutmix
        • DAY 9 : Loss Function
        • DAY 8 : Baseline
        • DAY 7 : Class Imbalance | Stratification
        • DAY 6 : Error Fix
        • DAY 5 : Facenet | Save
        • DAY 4 : VIT | F1_Loss | LrScheduler
        • DAY 3 : DataSet/Lodaer | EfficientNet
        • DAY 2 : Labeling
        • DAY 1 : EDA
        • 2_EDA Analysis
      • [P]Stage-1
        • 4W Retrospective
        • (10๊ฐ•) Experiment Toolkits & Tips
        • (9๊ฐ•) Ensemble
        • (8๊ฐ•) Training & Inference 2
        • (7๊ฐ•) Training & Inference 1
        • (6๊ฐ•) Model 2
        • (5๊ฐ•) Model 1
        • (4๊ฐ•) Data Generation
        • (3๊ฐ•) Dataset
        • (2๊ฐ•) Image Classification & EDA
        • (1๊ฐ•) Competition with AI Stages!
      • [U]Stage-3
        • 3W Retrospective
        • PyTorch
          • (10๊ฐ•) PyTorch Troubleshooting
          • (09๊ฐ•) Hyperparameter Tuning
          • (08๊ฐ•) Multi-GPU ํ•™์Šต
          • (07๊ฐ•) Monitoring tools for PyTorch
          • (06๊ฐ•) ๋ชจ๋ธ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ
          • (05๊ฐ•) Dataset & Dataloader
          • (04๊ฐ•) AutoGrad & Optimizer
          • (03๊ฐ•) PyTorch ํ”„๋กœ์ ํŠธ ๊ตฌ์กฐ ์ดํ•ดํ•˜๊ธฐ
          • (02๊ฐ•) PyTorch Basics
          • (01๊ฐ•) Introduction to PyTorch
      • [U]Stage-2
        • 2W Retrospective
        • DL Basic
          • (10๊ฐ•) Generative Models 2
          • (09๊ฐ•) Generative Models 1
          • (08๊ฐ•) Sequential Models - Transformer
          • (07๊ฐ•) Sequential Models - RNN
          • (06๊ฐ•) Computer Vision Applications
          • (05๊ฐ•) Modern CNN - 1x1 convolution์˜ ์ค‘์š”์„ฑ
          • (04๊ฐ•) Convolution์€ ๋ฌด์—‡์ธ๊ฐ€?
          • (03๊ฐ•) Optimization
          • (02๊ฐ•) ๋‰ด๋Ÿด ๋„คํŠธ์›Œํฌ - MLP (Multi-Layer Perceptron)
          • (01๊ฐ•) ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ณธ ์šฉ์–ด ์„ค๋ช… - Historical Review
        • Assignment
          • [ํ•„์ˆ˜ ๊ณผ์ œ] Multi-headed Attention Assignment
          • [ํ•„์ˆ˜ ๊ณผ์ œ] LSTM Assignment
          • [ํ•„์ˆ˜ ๊ณผ์ œ] CNN Assignment
          • [ํ•„์ˆ˜ ๊ณผ์ œ] Optimization Assignment
          • [ํ•„์ˆ˜ ๊ณผ์ œ] MLP Assignment
      • [U]Stage-1
        • 1W Retrospective
        • AI Math
          • (AI Math 10๊ฐ•) RNN ์ฒซ๊ฑธ์Œ
          • (AI Math 9๊ฐ•) CNN ์ฒซ๊ฑธ์Œ
          • (AI Math 8๊ฐ•) ๋ฒ ์ด์ฆˆ ํ†ต๊ณ„ํ•™ ๋ง›๋ณด๊ธฐ
          • (AI Math 7๊ฐ•) ํ†ต๊ณ„ํ•™ ๋ง›๋ณด๊ธฐ
          • (AI Math 6๊ฐ•) ํ™•๋ฅ ๋ก  ๋ง›๋ณด๊ธฐ
          • (AI Math 5๊ฐ•) ๋”ฅ๋Ÿฌ๋‹ ํ•™์Šต๋ฐฉ๋ฒ• ์ดํ•ดํ•˜๊ธฐ
          • (AI Math 4๊ฐ•) ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ• - ๋งค์šด๋ง›
          • (AI Math 3๊ฐ•) ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ• - ์ˆœํ•œ๋ง›
          • (AI Math 2๊ฐ•) ํ–‰๋ ฌ์ด ๋ญ์˜ˆ์š”?
          • (AI Math 1๊ฐ•) ๋ฒกํ„ฐ๊ฐ€ ๋ญ์˜ˆ์š”?
        • Python
          • (Python 7-2๊ฐ•) pandas II
          • (Python 7-1๊ฐ•) pandas I
          • (Python 6๊ฐ•) numpy
          • (Python 5-2๊ฐ•) Python data handling
          • (Python 5-1๊ฐ•) File / Exception / Log Handling
          • (Python 4-2๊ฐ•) Module and Project
          • (Python 4-1๊ฐ•) Python Object Oriented Programming
          • (Python 3-2๊ฐ•) Pythonic code
          • (Python 3-1๊ฐ•) Python Data Structure
          • (Python 2-4๊ฐ•) String and advanced function concept
          • (Python 2-3๊ฐ•) Conditionals and Loops
          • (Python 2-2๊ฐ•) Function and Console I/O
          • (Python 2-1๊ฐ•) Variables
          • (Python 1-3๊ฐ•) ํŒŒ์ด์ฌ ์ฝ”๋”ฉ ํ™˜๊ฒฝ
          • (Python 1-2๊ฐ•) ํŒŒ์ด์ฌ ๊ฐœ์š”
          • (Python 1-1๊ฐ•) Basic computer class for newbies
        • Assignment
          • [์„ ํƒ ๊ณผ์ œ 3] Maximum Likelihood Estimate
          • [์„ ํƒ ๊ณผ์ œ 2] Backpropagation
          • [์„ ํƒ ๊ณผ์ œ 1] Gradient Descent
          • [ํ•„์ˆ˜ ๊ณผ์ œ 5] Morsecode
          • [ํ•„์ˆ˜ ๊ณผ์ œ 4] Baseball
          • [ํ•„์ˆ˜ ๊ณผ์ œ 3] Text Processing 2
          • [ํ•„์ˆ˜ ๊ณผ์ œ 2] Text Processing 1
          • [ํ•„์ˆ˜ ๊ณผ์ œ 1] Basic Math
    • ๋”ฅ๋Ÿฌ๋‹ CNN ์™„๋ฒฝ ๊ฐ€์ด๋“œ - Fundamental ํŽธ
      • ์ข…ํ•ฉ ์‹ค์Šต 2 - ์บ๊ธ€ Plant Pathology(๋‚˜๋ฌด์žŽ ๋ณ‘ ์ง„๋‹จ) ๊ฒฝ์—ฐ ๋Œ€ํšŒ
      • ์ข…ํ•ฉ ์‹ค์Šต 1 - 120์ข…์˜ Dog Breed Identification ๋ชจ๋ธ ์ตœ์ ํ™”
      • ์‚ฌ์ „ ํ›ˆ๋ จ ๋ชจ๋ธ์˜ ๋ฏธ์„ธ ์กฐ์ • ํ•™์Šต๊ณผ ๋‹ค์–‘ํ•œ Learning Rate Scheduler์˜ ์ ์šฉ
      • Advanced CNN ๋ชจ๋ธ ํŒŒํ—ค์น˜๊ธฐ - ResNet ์ƒ์„ธ์™€ EfficientNet ๊ฐœ์š”
      • Advanced CNN ๋ชจ๋ธ ํŒŒํ—ค์น˜๊ธฐ - AlexNet, VGGNet, GoogLeNet
      • Albumentation์„ ์ด์šฉํ•œ Augmentation๊ธฐ๋ฒ•๊ณผ Keras Sequence ํ™œ์šฉํ•˜๊ธฐ
      • ์‚ฌ์ „ ํ›ˆ๋ จ CNN ๋ชจ๋ธ์˜ ํ™œ์šฉ๊ณผ Keras Generator ๋ฉ”์ปค๋‹ˆ์ฆ˜ ์ดํ•ด
      • ๋ฐ์ดํ„ฐ ์ฆ๊ฐ•์˜ ์ดํ•ด - Keras ImageDataGenerator ํ™œ์šฉ
      • CNN ๋ชจ๋ธ ๊ตฌํ˜„ ๋ฐ ์„ฑ๋Šฅ ํ–ฅ์ƒ ๊ธฐ๋ณธ ๊ธฐ๋ฒ• ์ ์šฉํ•˜๊ธฐ
    • AI School 1st
    • ํ˜„์—… ์‹ค๋ฌด์ž์—๊ฒŒ ๋ฐฐ์šฐ๋Š” Kaggle ๋จธ์‹ ๋Ÿฌ๋‹ ์ž…๋ฌธ
    • ํŒŒ์ด์ฌ ๋”ฅ๋Ÿฌ๋‹ ํŒŒ์ดํ† ์น˜
  • TIL : Python & Math
    • Do It! ์žฅ๊ณ +๋ถ€ํŠธ์ŠคํŠธ๋žฉ: ํŒŒ์ด์ฌ ์›น๊ฐœ๋ฐœ์˜ ์ •์„
      • Relations - ๋‹ค๋Œ€๋‹ค ๊ด€๊ณ„
      • Relations - ๋‹ค๋Œ€์ผ ๊ด€๊ณ„
      • ํ…œํ”Œ๋ฆฟ ํŒŒ์ผ ๋ชจ๋“ˆํ™” ํ•˜๊ธฐ
      • TDD (Test Driven Development)
      • template tags & ์กฐ๊ฑด๋ฌธ
      • ์ •์  ํŒŒ์ผ(static files) & ๋ฏธ๋””์–ด ํŒŒ์ผ(media files)
      • FBV (Function Based View)์™€ CBV (Class Based View)
      • Django ์ž…๋ฌธํ•˜๊ธฐ
      • ๋ถ€ํŠธ์ŠคํŠธ๋žฉ
      • ํ”„๋ก ํŠธ์—”๋“œ ๊ธฐ์ดˆ๋‹ค์ง€๊ธฐ (HTML, CSS, JS)
      • ๋“ค์–ด๊ฐ€๊ธฐ + ํ™˜๊ฒฝ์„ค์ •
    • Algorithm
      • Programmers
        • Level1
          • ์†Œ์ˆ˜ ๋งŒ๋“ค๊ธฐ
          • ์ˆซ์ž ๋ฌธ์ž์—ด๊ณผ ์˜๋‹จ์–ด
          • ์ž์—ฐ์ˆ˜ ๋’ค์ง‘์–ด ๋ฐฐ์—ด๋กœ ๋งŒ๋“ค๊ธฐ
          • ์ •์ˆ˜ ๋‚ด๋ฆผ์ฐจ์ˆœ์œผ๋กœ ๋ฐฐ์น˜ํ•˜๊ธฐ
          • ์ •์ˆ˜ ์ œ๊ณฑ๊ทผ ํŒ๋ณ„
          • ์ œ์ผ ์ž‘์€ ์ˆ˜ ์ œ๊ฑฐํ•˜๊ธฐ
          • ์ง์‚ฌ๊ฐํ˜• ๋ณ„์ฐ๊ธฐ
          • ์ง์ˆ˜์™€ ํ™€์ˆ˜
          • ์ฒด์œก๋ณต
          • ์ตœ๋Œ€๊ณต์•ฝ์ˆ˜์™€ ์ตœ์†Œ๊ณต๋ฐฐ์ˆ˜
          • ์ฝœ๋ผ์ธ  ์ถ”์ธก
          • ํฌ๋ ˆ์ธ ์ธํ˜•๋ฝ‘๊ธฐ ๊ฒŒ์ž„
          • ํ‚คํŒจ๋“œ ๋ˆ„๋ฅด๊ธฐ
          • ํ‰๊ท  ๊ตฌํ•˜๊ธฐ
          • ํฐ์ผ“๋ชฌ
          • ํ•˜์ƒค๋“œ ์ˆ˜
          • ํ•ธ๋“œํฐ ๋ฒˆํ˜ธ ๊ฐ€๋ฆฌ๊ธฐ
          • ํ–‰๋ ฌ์˜ ๋ง์…ˆ
        • Level2
          • ์ˆซ์ž์˜ ํ‘œํ˜„
          • ์ˆœ์œ„ ๊ฒ€์ƒ‰
          • ์ˆ˜์‹ ์ตœ๋Œ€ํ™”
          • ์†Œ์ˆ˜ ์ฐพ๊ธฐ
          • ์†Œ์ˆ˜ ๋งŒ๋“ค๊ธฐ
          • ์‚ผ๊ฐ ๋‹ฌํŒฝ์ด
          • ๋ฌธ์ž์—ด ์••์ถ•
          • ๋ฉ”๋‰ด ๋ฆฌ๋‰ด์–ผ
          • ๋” ๋งต๊ฒŒ
          • ๋•…๋”ฐ๋จน๊ธฐ
          • ๋ฉ€์ฉกํ•œ ์‚ฌ๊ฐํ˜•
          • ๊ด„ํ˜ธ ํšŒ์ „ํ•˜๊ธฐ
          • ๊ด„ํ˜ธ ๋ณ€ํ™˜
          • ๊ตฌ๋ช…๋ณดํŠธ
          • ๊ธฐ๋Šฅ ๊ฐœ๋ฐœ
          • ๋‰ด์Šค ํด๋Ÿฌ์Šคํ„ฐ๋ง
          • ๋‹ค๋ฆฌ๋ฅผ ์ง€๋‚˜๋Š” ํŠธ๋Ÿญ
          • ๋‹ค์Œ ํฐ ์ˆซ์ž
          • ๊ฒŒ์ž„ ๋งต ์ตœ๋‹จ๊ฑฐ๋ฆฌ
          • ๊ฑฐ๋ฆฌ๋‘๊ธฐ ํ™•์ธํ•˜๊ธฐ
          • ๊ฐ€์žฅ ํฐ ์ •์‚ฌ๊ฐํ˜• ์ฐพ๊ธฐ
          • H-Index
          • JadenCase ๋ฌธ์ž์—ด ๋งŒ๋“ค๊ธฐ
          • N๊ฐœ์˜ ์ตœ์†Œ๊ณต๋ฐฐ์ˆ˜
          • N์ง„์ˆ˜ ๊ฒŒ์ž„
          • ๊ฐ€์žฅ ํฐ ์ˆ˜
          • 124 ๋‚˜๋ผ์˜ ์ˆซ์ž
          • 2๊ฐœ ์ดํ•˜๋กœ ๋‹ค๋ฅธ ๋น„ํŠธ
          • [3์ฐจ] ํŒŒ์ผ๋ช… ์ •๋ ฌ
          • [3์ฐจ] ์••์ถ•
          • ์ค„ ์„œ๋Š” ๋ฐฉ๋ฒ•
          • [3์ฐจ] ๋ฐฉ๊ธˆ ๊ทธ๊ณก
          • ๊ฑฐ๋ฆฌ๋‘๊ธฐ ํ™•์ธํ•˜๊ธฐ
        • Level3
          • ๋งค์นญ ์ ์ˆ˜
          • ์™ธ๋ฒฝ ์ ๊ฒ€
          • ๊ธฐ์ง€๊ตญ ์„ค์น˜
          • ์ˆซ์ž ๊ฒŒ์ž„
          • 110 ์˜ฎ๊ธฐ๊ธฐ
          • ๊ด‘๊ณ  ์ œ๊ฑฐ
          • ๊ธธ ์ฐพ๊ธฐ ๊ฒŒ์ž„
          • ์…”ํ‹€๋ฒ„์Šค
          • ๋‹จ์†์นด๋ฉ”๋ผ
          • ํ‘œ ํŽธ์ง‘
          • N-Queen
          • ์ง•๊ฒ€๋‹ค๋ฆฌ ๊ฑด๋„ˆ๊ธฐ
          • ์ตœ๊ณ ์˜ ์ง‘ํ•ฉ
          • ํ•ฉ์Šน ํƒ์‹œ ์š”๊ธˆ
          • ๊ฑฐ์Šค๋ฆ„๋ˆ
          • ํ•˜๋…ธ์ด์˜ ํƒ‘
          • ๋ฉ€๋ฆฌ ๋›ฐ๊ธฐ
          • ๋ชจ๋‘ 0์œผ๋กœ ๋งŒ๋“ค๊ธฐ
        • Level4
    • Head First Python
    • ๋ฐ์ดํ„ฐ ๋ถ„์„์„ ์œ„ํ•œ SQL
    • ๋‹จ ๋‘ ์žฅ์˜ ๋ฌธ์„œ๋กœ ๋ฐ์ดํ„ฐ ๋ถ„์„๊ณผ ์‹œ๊ฐํ™” ๋ฝ€๊ฐœ๊ธฐ
    • Linear Algebra(Khan Academy)
    • ์ธ๊ณต์ง€๋Šฅ์„ ์œ„ํ•œ ์„ ํ˜•๋Œ€์ˆ˜
    • Statistics110
  • TIL : etc
    • [๋”ฐ๋ฐฐ๋Ÿฐ] Kubernetes
    • [๋”ฐ๋ฐฐ๋Ÿฐ] Docker
      • 2. ๋„์ปค ์„ค์น˜ ์‹ค์Šต 1 - ํ•™์ŠตํŽธ(์ค€๋น„๋ฌผ/์‹ค์Šต ์œ ํ˜• ์†Œ๊ฐœ)
      • 1. ์ปจํ…Œ์ด๋„ˆ์™€ ๋„์ปค์˜ ์ดํ•ด - ์ปจํ…Œ์ด๋„ˆ๋ฅผ ์“ฐ๋Š”์ด์œ  / ์ผ๋ฐ˜ํ”„๋กœ๊ทธ๋žจ๊ณผ ์ปจํ…Œ์ด๋„ˆํ”„๋กœ๊ทธ๋žจ์˜ ์ฐจ์ด์ 
      • 0. ๋“œ๋””์–ด ์ฐพ์•„์˜จ Docker ๊ฐ•์˜! ์™•์ดˆ๋ณด์—์„œ ๋„์ปค ๋งˆ์Šคํ„ฐ๋กœ - OT
    • CoinTrading
      • [๊ฐ€์ƒ ํ™”ํ ์ž๋™ ๋งค๋งค ํ”„๋กœ๊ทธ๋žจ] ๋ฐฑํ…Œ์ŠคํŒ… : ๊ฐ„๋‹จํ•œ ํ…Œ์ŠคํŒ…
    • Gatsby
      • 01 ๊นƒ๋ถ ํฌ๊ธฐ ์„ ์–ธ
  • TIL : Project
    • Mask Wear Image Classification
    • Project. GARIGO
  • 2021 TIL
    • CHANGED
    • JUN
      • 30 Wed
      • 29 Tue
      • 28 Mon
      • 27 Sun
      • 26 Sat
      • 25 Fri
      • 24 Thu
      • 23 Wed
      • 22 Tue
      • 21 Mon
      • 20 Sun
      • 19 Sat
      • 18 Fri
      • 17 Thu
      • 16 Wed
      • 15 Tue
      • 14 Mon
      • 13 Sun
      • 12 Sat
      • 11 Fri
      • 10 Thu
      • 9 Wed
      • 8 Tue
      • 7 Mon
      • 6 Sun
      • 5 Sat
      • 4 Fri
      • 3 Thu
      • 2 Wed
      • 1 Tue
    • MAY
      • 31 Mon
      • 30 Sun
      • 29 Sat
      • 28 Fri
      • 27 Thu
      • 26 Wed
      • 25 Tue
      • 24 Mon
      • 23 Sun
      • 22 Sat
      • 21 Fri
      • 20 Thu
      • 19 Wed
      • 18 Tue
      • 17 Mon
      • 16 Sun
      • 15 Sat
      • 14 Fri
      • 13 Thu
      • 12 Wed
      • 11 Tue
      • 10 Mon
      • 9 Sun
      • 8 Sat
      • 7 Fri
      • 6 Thu
      • 5 Wed
      • 4 Tue
      • 3 Mon
      • 2 Sun
      • 1 Sat
    • APR
      • 30 Fri
      • 29 Thu
      • 28 Wed
      • 27 Tue
      • 26 Mon
      • 25 Sun
      • 24 Sat
      • 23 Fri
      • 22 Thu
      • 21 Wed
      • 20 Tue
      • 19 Mon
      • 18 Sun
      • 17 Sat
      • 16 Fri
      • 15 Thu
      • 14 Wed
      • 13 Tue
      • 12 Mon
      • 11 Sun
      • 10 Sat
      • 9 Fri
      • 8 Thu
      • 7 Wed
      • 6 Tue
      • 5 Mon
      • 4 Sun
      • 3 Sat
      • 2 Fri
      • 1 Thu
    • MAR
      • 31 Wed
      • 30 Tue
      • 29 Mon
      • 28 Sun
      • 27 Sat
      • 26 Fri
      • 25 Thu
      • 24 Wed
      • 23 Tue
      • 22 Mon
      • 21 Sun
      • 20 Sat
      • 19 Fri
      • 18 Thu
      • 17 Wed
      • 16 Tue
      • 15 Mon
      • 14 Sun
      • 13 Sat
      • 12 Fri
      • 11 Thu
      • 10 Wed
      • 9 Tue
      • 8 Mon
      • 7 Sun
      • 6 Sat
      • 5 Fri
      • 4 Thu
      • 3 Wed
      • 2 Tue
      • 1 Mon
    • FEB
      • 28 Sun
      • 27 Sat
      • 26 Fri
      • 25 Thu
      • 24 Wed
      • 23 Tue
      • 22 Mon
      • 21 Sun
      • 20 Sat
      • 19 Fri
      • 18 Thu
      • 17 Wed
      • 16 Tue
      • 15 Mon
      • 14 Sun
      • 13 Sat
      • 12 Fri
      • 11 Thu
      • 10 Wed
      • 9 Tue
      • 8 Mon
      • 7 Sun
      • 6 Sat
      • 5 Fri
      • 4 Thu
      • 3 Wed
      • 2 Tue
      • 1 Mon
    • JAN
      • 31 Sun
      • 30 Sat
      • 29 Fri
      • 28 Thu
      • 27 Wed
      • 26 Tue
      • 25 Mon
      • 24 Sun
      • 23 Sat
      • 22 Fri
      • 21 Thu
      • 20 Wed
      • 19 Tue
      • 18 Mon
      • 17 Sun
      • 16 Sat
      • 15 Fri
      • 14 Thu
      • 13 Wed
      • 12 Tue
      • 11 Mon
      • 10 Sun
      • 9 Sat
      • 8 Fri
      • 7 Thu
      • 6 Wed
      • 5 Tue
      • 4 Mon
      • 3 Sun
      • 2 Sat
      • 1 Fri
  • 2020 TIL
    • DEC
      • 31 Thu
      • 30 Wed
      • 29 Tue
      • 28 Mon
      • 27 Sun
      • 26 Sat
      • 25 Fri
      • 24 Thu
      • 23 Wed
      • 22 Tue
      • 21 Mon
      • 20 Sun
      • 19 Sat
      • 18 Fri
      • 17 Thu
      • 16 Wed
      • 15 Tue
      • 14 Mon
      • 13 Sun
      • 12 Sat
      • 11 Fri
      • 10 Thu
      • 9 Wed
      • 8 Tue
      • 7 Mon
      • 6 Sun
      • 5 Sat
      • 4 Fri
      • 3 Tue
      • 2 Wed
      • 1 Tue
    • NOV
      • 30 Mon
Powered by GitBook
On this page
  • 1.Extraction-based MRC
  • ๋ฌธ์ œ ์ •์˜
  • ํ‰๊ฐ€ ๋ฐฉ๋ฒ•
  • Overview
  • 2.Pre-processing
  • 3.Fine-tuning
  • 4.Post-processing
  • ์‹ค์Šต
  • Requirements
  • ๋ฐ์ดํ„ฐ ๋ฐ ํ‰๊ฐ€ ์ง€ํ‘œ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ
  • Pre-trained ๋ชจ๋ธ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ
  • ์„ค์ •ํ•˜๊ธฐ
  • ์ „์ฒ˜๋ฆฌํ•˜๊ธฐ
  • Fine-tuning ํ•˜๊ธฐ
  • ํ‰๊ฐ€ํ•˜

Was this helpful?


(2๊ฐ•) Extraction-based MRC


Last updated 3 years ago


1. Extraction-based MRC

๋ฌธ์ œ ์ •์˜

  • ๋‹ต๋ณ€์„ ์ƒ์„ฑํ•ด๋‚ด๋Š” ๊ฒƒ์ด ์•„๋‹ˆ๋ผ ์ง€๋ฌธ์—์„œ ์ฐพ์„ ์ˆ˜ ์žˆ๊ฒŒ๋œ๋‹ค.

  • ์ด๋Ÿฌํ•œ ๋ฐ์ดํ„ฐ์…‹์€ ์ง์ ‘ ํ•ด๋‹น ์‚ฌ์ดํŠธ์—์„œ ๋‹ค์šด๋ฐ›์„ ์ˆ˜๋„ ์žˆ์ง€๋งŒ huggingface์˜ datasets์—์„œ ๋‹ค์šด์ด ๊ฐ€๋Šฅํ•˜๋‹ค.

ํ‰๊ฐ€ ๋ฐฉ๋ฒ•

  • ์™ผ์ชฝ์€ SQuAD ๋ฐ์ดํ„ฐ์…‹์ด๋ฉฐ ์˜ค๋ฅธ์ชฝ์€ LG์—์„œ SQuAD๋ฅผ ํ† ๋Œ€๋กœ ๋งŒ๋“  KorQuAD์ด๋‹ค.

  • F1 ์ ์ˆ˜๊ฐ€ EM ์ ์ˆ˜๋ณด๋‹ค ๋†’์€ ๊ฒƒ์„ ์•Œ ์ˆ˜ ์žˆ๋‹ค.

  • EM์€ ์˜ˆ์ธก๊ฐ’๊ณผ ์ •๋‹ต์ด ์บ๋ฆญํ„ฐ ๋‹จ์œ„๋กœ ์™„์ „ํžˆ ๋˜‘๊ฐ™์„ ๊ฒฝ์šฐ์—๋งŒ 1์ ์„ ๋ถ€์—ฌํ•˜๋ฉฐ, ํ•˜๋‚˜๋ผ๋„ ๋‹ค๋ฅด๋ฉด 0์ ์„ ๋ถ€์—ฌํ•œ๋‹ค.

  • ๋ฐ˜๋ฉด F1์€ ์˜ˆ์ธก๊ฐ’๊ณผ ์ •๋‹ต์˜ overlap์„ ๋น„์œจ๋กœ ๊ณ„์‚ฐํ•˜๋ฉฐ 0์ ๊ณผ 1์  ์‚ฌ์ด์˜ ๋ถ€๋ถ„์ ์ˆ˜๋ฅผ ๋ฐ›์„ ์ˆ˜ ์žˆ๋‹ค.

์ข€ ๋” ์ž์„ธํžˆ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

  • ์ด ๋•Œ G.T๊ฐ€ ์—ฌ๋Ÿฌ๊ฐœ์ด๋ฏ€๋กœ ๊ฐ๊ฐ์˜ G.T์™€ ์˜ˆ์ธก์„ ๋น„๊ตํ•˜๊ฒŒ ๋˜๊ณ  ๊ฐ€์žฅ ์ตœ๊ณ ์˜ ์ ์ˆ˜๋ฅผ F1 Score๋กœ ์ง€์ •ํ•œ๋‹ค.

Overview

  • ์ง€๋ฌธ๊ณผ ์งˆ๋ฌธ์ด ๊ฐ๊ฐ ์ž„๋ฒ ๋”ฉ๋˜์–ด ๋ชจ๋ธ์— ๋“ค์–ด๊ฐ€๊ฒŒ ๋˜๊ณ  ๋ชจ๋ธ์€ ํŠน์ • ์‹œํ€€์Šค๋ฅผ ๋ฐ˜ํ™˜ํ•˜๋Š” ๊ฒƒ์ด ์•„๋‹ˆ๋ผ ๋‹ต์ด๋ผ๊ณ  ์˜ˆ์ธก๋˜๋Š” ํ† ํฐ์˜ ํฌ์ง€์…˜์„ ๋ฐ˜ํ™˜ํ•œ๋‹ค.

2. Pre-processing

Tokenization

์—ฌ๊ธฐ์„œ๋Š” OOV ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•ด์ฃผ๊ณ  ์ •๋ณดํ•™์ ์œผ๋กœ ์ด์ ์„ ๊ฐ€์ง„ BPE๋ฅผ ์‚ฌ์šฉํ•  ๊ฒƒ์ด๋ฉฐ ์ด ์ค‘ WordPiece Tokenizer๋ฅผ ์‚ฌ์šฉํ•œ๋‹ค.

  • OOV : Out-of-Vocabulary

  • BPE : Byte Pair Encoding

Special Tokens

[CLS] ์งˆ๋ฌธ [SEP] ์ง€๋ฌธ ์˜ ๊ผด๋กœ ๋ชจ๋ธ์— ์ž…๋ ฅ๋œ๋‹ค.

Attention Mask

์ž…๋ ฅ ์‹œํ€€์Šค ์ค‘์—์„œ attention ์—ฐ์‚ฐ์„ ํ•  ๋•Œ ๋ฌด์‹œํ•  ํ† ํฐ์„ ํ‘œ์‹œํ•œ๋‹ค. 0์€ ๋ฌด์‹œ, 1์€ ์—ฐ์‚ฐ์— ํฌํ•จํ•œ๋‹ค. [PAD]์™€ ๊ฐ™์ด ์˜๋ฏธ๊ฐ€ ์—†๋Š” ํŠน์ˆ˜ํ† ํฐ์„ ๋ฌด์‹œํ•˜๊ธฐ ์œ„ํ•ด ์‚ฌ์šฉํ•œ๋‹ค.

Token Type IDs

์งˆ๋ฌธ์—๋Š” 0, ์ง€๋ฌธ์—๋Š” 1์„ ์ฃผ๋ฉฐ PAD์—๋Š” ํŽธ์˜์ƒ 0์„ ์ค€๋‹ค.

๋ชจ๋ธ ์ถœ๋ ฅ๊ฐ’

๋งŒ์•ฝ, ์ •๋‹ต์ด '๋ฏธ๊ตญ ์œก๊ตฐ ๋ถ€ ์ฐธ๋ชจ ์ด์žฅ' ์ด๋ผ๋ฉด 84์™€ 88์„ ๋ฐ˜ํ™˜ํ•˜๊ฒŒ ๋œ๋‹ค. ๊ทผ๋ฐ ๋งŒ์•ฝ ํ† ํฐํ™”๊ฐ€ '์ €๋ฏธ๊ตญ', '์œก๊ตฐ', '๋ถ€', '์ฐธ๋ชจ์ด', '์žฅ์ด' ๋กœ ๋˜์–ด์žˆ์œผ๋ฉด ์–ด๋–ป๊ฒŒ ํ• ๊นŒ? ์ด ๋•Œ๋Š” ์ ์ˆ˜๊ฐ€ ์ข€ ๋‚ฎ์•„์งˆ ์ˆ˜๋Š” ์žˆ๊ฒ ์ง€๋งŒ ๊ทธ๋ž˜๋„ ์ตœ์†Œ SPAN์„ ์žก์•˜๋‹ค๋Š” ๊ฒƒ์œผ๋กœ ์ธ์ง€ํ•˜๊ณ  ๊ทธ๋Œ€๋กœ ์ด ํ† ํฐ์„ ์‚ฌ์šฉํ•˜๊ฒŒ ๋œ๋‹ค.

3. Fine-tuning

The vectors predicted for the start and end tokens give, for each position, the probability that it is the true start or end token; these are trained against the actual start/end positions of the answer using a cross-entropy loss.
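As a toy numeric sketch (made-up logits and a plain-Python softmax, not the actual model), the loss is just the average cross-entropy of the true start and end positions:

```python
import math

def cross_entropy(logits, target):
    # Softmax over positions, then negative log-likelihood of the target position.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return -math.log(exps[target] / total)

start_logits = [0.1, 2.0, 0.3, 0.1]   # position 1 is favored
end_logits   = [0.1, 0.2, 0.2, 1.5]   # position 3 is favored
# True answer span: start at position 1, end at position 3.
loss = (cross_entropy(start_logits, 1) + cross_entropy(end_logits, 3)) / 2
print(round(loss, 3))
```

The loss shrinks as the model puts more probability mass on the correct start and end positions.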

4. Post-processing

๋ถˆ๊ฐ€๋Šฅํ•œ ๋‹ต ์ œ๊ฑฐํ•˜

  • ์งˆ๋ฌธ์—์„œ ๋‹ต์ด ์žˆ์„ ๊ฒฝ์šฐ ์˜ˆ์ธกํ•œ context๋ฅผ ๋ฒ—์–ด๋‚  ์ˆ˜ ์žˆ๋‹ค.

์ตœ์ ์˜ ๋‹ต์•ˆ ์ฐพ๊ธฐ

์‹ค์Šต

Requirements

!pip install datasets==1.4.1
!pip install transformers==4.4.1

# To use utility functions defined in examples.
!git clone https://github.com/huggingface/transformers.git
import sys
sys.path.append('transformers/examples/question-answering')
  • sys.path.append(...) makes the Python files in the cloned examples folder importable in the current environment.

๋ฐ์ดํ„ฐ ๋ฐ ํ‰๊ฐ€ ์ง€ํ‘œ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ

from datasets import load_dataset

datasets = load_dataset("squad_kor_v1")

from datasets import load_metric

metric = load_metric('squad')
  • dataset์€ train๊ณผ valid๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ์œผ๋ฉฐ dataset์˜ ๊ฐ๊ฐ์˜ ์š”์†Œ id, title, context, question, answers๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ๋‹ค.

  • You can access the training set with datasets['train'], and a specific example with datasets['train'][0].

Loading the Pre-trained Model

from transformers import (
    AutoConfig,
    AutoModelForQuestionAnswering,
    AutoTokenizer
)

model_name = "bert-base-multilingual-cased"

config = AutoConfig.from_pretrained(
    model_name
)
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    use_fast=True
)
model = AutoModelForQuestionAnswering.from_pretrained(
    model_name,
    config=config
)
  • config์™€ tokenizer ๊ทธ๋ฆฌ๊ณ  model์„ Auto series library๋ฅผ ํ†ตํ•ด ๋ถˆ๋Ÿฌ์˜ฌ ์ˆ˜ ์žˆ๋‹ค.

์„ค์ •ํ•˜๊ธฐ

max_seq_length = 384 # maximum length of the question, context, and special tokens combined
pad_to_max_length = True
doc_stride = 128 # length of the overlap between adjacent sequences when a long context is split
max_train_samples = 16
max_val_samples = 16
preprocessing_num_workers = 4
batch_size = 4
num_train_epochs = 2
n_best_size = 20
max_answer_length = 30
  • max_seq_length๋ฅผ ์„ค์ •ํ•ด์•ผ ๋ชจ๋ธ์˜ size๋„, pad๋„ ์ •ํ•  ์ˆ˜ ์žˆ๋‹ค.

  • pad_to_max_length=True๋Š” ๋‚จ์€ ์‹œํ€€์Šค๋ฅผ pad๋กœ ์ฑ„์šฐ๊ฒ ๋‹ค๋Š” ๊ฒƒ

  • doc_stride : the length of the overlap between adjacent sequences when a long context is split into several pieces.

  • max_train_samples, max_val_samples : fix the number of training and validation examples. They are set small here because this is a quick test; in practice much larger values are used.

  • preprocessing_num_workers : values above 4 are usually unnecessary; the useful range is hardware-dependent.

  • n_best_size : the number of best candidate answers to keep during post-processing.

  • max_answer_length : keeps overly long answers from being produced.
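To make doc_stride concrete, here is a toy sliding-window split (independent of the real tokenizer, which does this internally via the stride argument): adjacent chunks overlap by stride tokens, so each window starts max_len - stride after the previous one.

```python
def split_with_stride(tokens, max_len, stride):
    # Consecutive windows overlap by `stride` tokens,
    # so each window advances by max_len - stride.
    step = max_len - stride
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
        start += step
    return chunks

tokens = list(range(10))
print(split_with_stride(tokens, max_len=6, stride=2))
# → [[0, 1, 2, 3, 4, 5], [4, 5, 6, 7, 8, 9]]
```

The overlap means an answer cut off at a chunk boundary still appears whole in the neighboring chunk.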

์ „์ฒ˜๋ฆฌํ•˜๊ธฐ

def prepare_train_features(examples):
    # Tokenize the given text. If the text is longer than max_seq_length,
    # slide by `stride` and split it into several pieces.
    # That is, one example can yield several overlapping sequences (features).
    tokenized_examples = tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",  # truncate to max_seq_length, cutting only the second part of the pair (the context)
        max_length=max_seq_length,
        stride=doc_stride,
        return_overflowing_tokens=True, # whether to return the tokens that overflow the max length
        return_offsets_mapping=True,  # whether to return (char_start, char_end) info for each token
        padding="max_length",
    )

    # Since one example can map to several sequences, we need this mapping.
    overflow_to_sample_mapping = tokenized_examples.pop("overflow_to_sample_mapping")
    # offset_mapping tells us which character span of the original context each token covers.
    offset_mapping = tokenized_examples.pop("offset_mapping")

    # Lists that will hold the answer labels
    tokenized_examples["start_positions"] = []
    tokenized_examples["end_positions"] = []

    for i, offsets in enumerate(offset_mapping):
        input_ids = tokenized_examples["input_ids"][i]
        cls_index = input_ids.index(tokenizer.cls_token_id)

        # Find the sequence corresponding to this feature.
        sequence_ids = tokenized_examples.sequence_ids(i)

        # Find the example this sequence belongs to.
        example_index = overflow_to_sample_mapping[i]
        answers = examples["answers"][example_index]

        # Start and end of the answer in the text
        answer_start_offset = answers["answer_start"][0]
        answer_end_offset = answer_start_offset + len(answers["text"][0])

        # Index of the first token of the current span in the text
        token_start_index = 0
        while sequence_ids[token_start_index] != 1:
            token_start_index += 1

        # Index of the last token of the current span in the text
        token_end_index = len(input_ids) - 1
        while sequence_ids[token_end_index] != 1:
            token_end_index -= 1

        # Check whether the answer falls outside the current span
        if not (offsets[token_start_index][0] <= answer_start_offset and offsets[token_end_index][1] >= answer_end_offset):
            tokenized_examples["start_positions"].append(cls_index)
            tokenized_examples["end_positions"].append(cls_index)
        else:
            # Move token_start_index and token_end_index to the start and end of the answer
            while token_start_index < len(offsets) and offsets[token_start_index][0] <= answer_start_offset:
                token_start_index += 1
            tokenized_examples["start_positions"].append(token_start_index - 1)
            while offsets[token_end_index][1] >= answer_end_offset:
                token_end_index -= 1
            tokenized_examples["end_positions"].append(token_end_index + 1)

    return tokenized_examples
  • The data is preprocessed according to the settings above and returned in dictionary form.

train_dataset = datasets["train"]
len(train_dataset)

>>> 60000

train_dataset = train_dataset.select(range(max_train_samples))
len(train_dataset)

>>> 16
  • Using select, you can take just a fixed number of examples from the dataset.

column_names = datasets["train"].column_names
train_dataset = train_dataset.map(
            prepare_train_features,
            batched=True,
            num_proc=preprocessing_num_workers,
            remove_columns=column_names,
            load_from_cache_file=True,
        )

def prepare_validation_features(examples):
    tokenized_examples = tokenizer(
        examples['question'],
        examples['context'],
        truncation="only_second",
        max_length=max_seq_length,
        stride=doc_stride,
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding="max_length",
    )

    sample_mapping = tokenized_examples.pop("overflow_to_sample_mapping")

    tokenized_examples["example_id"] = []

    for i in range(len(tokenized_examples["input_ids"])):
        sequence_ids = tokenized_examples.sequence_ids(i)
        context_index = 1

        sample_index = sample_mapping[i]
        tokenized_examples["example_id"].append(examples["id"][sample_index])

        tokenized_examples["offset_mapping"][i] = [
            (o if sequence_ids[k] == context_index else None)
            for k, o in enumerate(tokenized_examples["offset_mapping"][i])
        ]

    return tokenized_examples
  • valid ๋ฐ์ดํ„ฐ๋„ ๋˜‘๊ฐ™์ด ์ฒ˜๋ฆฌ ์ค€๋‹ค.

eval_examples = datasets["validation"]
eval_examples = eval_examples.select(range(max_val_samples))
eval_dataset = eval_examples.map(
            prepare_validation_features,
            batched=True,
            num_proc=preprocessing_num_workers,
            remove_columns=column_names,
            load_from_cache_file=True,
        )

Fine-tuning ํ•˜๊ธฐ

from transformers import default_data_collator, TrainingArguments, EvalPrediction
from trainer_qa import QuestionAnsweringTrainer
from utils_qa import postprocess_qa_predictions
  • default_data_collator : collates multiple examples into a single batch.

  • TrainingArguments : a convenient way to pass every training configuration option at once.

  • EvalPrediction : bundles predictions and label ids so that evaluation is more convenient.

  • QuestionAnsweringTrainer : makes training (with QA-style evaluation) easier.

  • postprocess_qa_predictions : after obtaining the raw outputs, one more post-processing pass is needed; this function makes that possible.

def compute_metrics(p: EvalPrediction):
    return metric.compute(predictions=p.predictions, references=p.label_ids)

training_args = TrainingArguments(
    output_dir="outputs",
    do_train=True, 
    do_eval=True, 
    learning_rate=3e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=num_train_epochs,
    weight_decay=0.01,
)

trainer = QuestionAnsweringTrainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        eval_examples=datasets["validation"],
        tokenizer=tokenizer,
        data_collator=default_data_collator,
        post_process_function=post_processing_function,
        compute_metrics=compute_metrics,
    )

train_result = trainer.train()
train_result

>>> TrainOutput(global_step=12, training_loss=4.897604942321777, metrics={'train_runtime': 224.1287, 'train_samples_per_second': 0.054, 'total_flos': 19604022976512.0, 'epoch': 2.0, 'init_mem_cpu_alloc_delta': 4986, 'init_mem_cpu_peaked_delta': 16725, 'train_mem_cpu_alloc_delta': 160674, 'train_mem_cpu_peaked_delta': 153619})

ํ‰๊ฐ€ํ•˜

metrics = trainer.evaluate()
metrics

>>> {'epoch': 2.0, 'exact_match': 0.0, 'f1': 0.0}