  1. 2021 TIL
  2. JAN

5 Tue

Review and TIL

Review about AI School

AI School has been running for a little over a month. Today I want to leave an overall review of it. I have written it candidly, so the instructors may find it uncomfortable, but I have no intention of causing offense.

Linear Algebra: Professor Kim Jun-ho

I enjoyed these lectures. They got somewhat harder toward the end, but each explanation was introduced as a development from some underlying principle, which made it fun. It was like following the narrative of a great person's biography. The downside is that there was not much connection to AI. I am curious about what this knowledge is, but also about how it is used; that is more interesting, and I would remember it better when I apply it later. In any case, if the math classes I took at university had been this good, I might have been better at math.

Statistics: Professor Lee Sang-hwan

I lost some momentum here. Students usually find list-style explanations boring, and that was how this felt: an enumerative class in the spirit of "this is the math used in AI, and covering this much is enough." The early material was easy enough to follow, but from the middle onward it became quite boring and difficult.

Data Analysis: Mentor Lee Ho-jun

Very satisfying, and probably not just for me. He used Jupyter, explained every piece of code clearly, and told jokes that only the regulars laugh at. Naturally a class that uses code is more fun, but I also worked hard to really understand it. The week 3 class was fun, the most memorable, and the one I most made my own.

The week 4 class was difficult, perhaps because my level is average or below. The bonus assignment in particular was so hard that it caused me real trouble. I normally treat bonus assignments as regular assignments and do them, but after more than a week without progress I gave up. Since I prefer to learn by building solid fundamentals, I had too many questions and spent a lot of time looking things up, so solving assignments on material I had only covered lightly was quite uncomfortable. Apart from the bonus assignment, it was an enjoyable class.

AWS: Mentor Oh Seong-woo

This was a tough course, and I still have not finished it. The lecture materials were hard to see, so I captured them into Paint and zoomed in as I went, and the audio was too quiet. The most frustrating part was that the lectures moved too briskly and skipped a lot. Instead of explaining "this button is in this area, under this menu item," the explanation was "and if you press this button, you go here," so I had to pause and rewind constantly. Too many problems also came up while working through the code, so I spent too much time googling. That is when I get the sleepiest and most frustrated (like when a math problem will not work out). Then even git clone stopped working (why would something so simple fail?). I got stuck on AWS as well; I searched, asked the mentor, and rebuilt the EC2 instance from scratch, but it still did not work, so I could not move on to the next chapter. Whether or not I kept up with the course matters too, but the course itself was not satisfying. It would be better if it covered how to handle common errors and gave more detailed instructions.

The main reason the course left a negative impression is that it incurred real charges. It was my first time using AWS and I found it hard to use (the interface felt cluttered), so I was surprised when charges appeared and annoyed that this was not mentioned at the start of the course. I searched around and terminated instances and released resources, but the charges grew every day, which kept me in a bad mood. For a week I kept deleting things and the charges continued, so I asked several acquaintances for help; when that still did not resolve it, I contacted customer support. The refund was processed properly, but I was not happy. I think this course also left me with a somewhat negative image of AWS. A disappointing course.

ML Basics: Professor Kang Chang-seong

There are books that explain machine learning simply without math, for fear that the math would obscure the overall story. I also agree with what was said in the lecture: math matters in machine learning, so leaving it out is not recommended. Still, it was a pity that the lecture did not cover how this math is used in which models and how the knowledge can be applied. This may be a fragmentary and hasty judgment, but an instructor's teaching style is usually about the same in every class, so I fear the upcoming classes will be boring. How many students really retained what was taught today? It was rather boring and hard to follow.

If the goal is math for machine learning rather than math for its own sake (though I think the same applies to studying math itself), the lecture needed simpler explanations that pointed out how the topics connect. Today's lecture felt like little more than explaining page 1, then explaining page 2.

I worry that the ML lectures will have a similar feel to the math ones, but I am still looking forward to them.

+ (added 01 / 08)

Ah... I have now finished the course, so here is my follow-up review.

First, it holds true that an instructor's teaching style is about the same in every class. The first class was hard to follow and boring, and that tone continued through the last class.

Second, I think only the instructor was at an expert level. The specialized material was very hard, at least for my level. That may be due to my lack of background knowledge, but the explanations skipped a lot, and even when harder concepts appeared there was little additional explanation. In the Jupyter exercises, too, the instructor would say "just type this code" and move on. I paused the video to write the code, but there was much I did not understand, so I looked things up one by one as I practiced. (Some would call that studying, but having to search for the entire course content?)

Third, given that the first week of a machine learning course was spent entirely on math, the point seems to be that math really is important. I have started many math books and given up partway through many times, but I resolved to learn math once more.

School Review

Note-taking is generally fine in the coding classes, but in the math classes it takes far too long. If AI School encourages us to write TILs yet forbids capturing even simple equations for copyright reasons, it should at least provide free-to-use images or Markdown for the equations used in the lectures. Otherwise every student ends up spending hours transcribing a huge number of equations. (Or maybe they just skip the equations...)

It was also a pity that responses to code reviews and questions came later than expected. I assumed everyone was busy with their jobs and waited for the weekend, but there was still no feedback. (That said, I am grateful to mentor Ho-jun for his excellent feedback.) These days I try to solve most things by googling rather than asking, and I ask on Slack only when I am truly stuck. Overall, the sessions were well run and the communication was very good. Everyone seemed eager to help, which I appreciate. I have only listed the downsides here, but my satisfaction is 7 to 8 points or higher. Maybe this came across as too much complaining. Whine, whine.

One more thought: I think the learning record cards should be recommended by the mentors. Under the current recommendation scheme, not everyone participates, and not everyone reads everyone else's cards. I think "Lee Su-jin" did the best (I have not seen every card either, but of those I saw, theirs was the best written and showed the most effort), to the point that it is strange this person is not ranked near the top.

[Inflearn] Mastering Data Analysis and Visualization with Just Two Documents

Understanding pandas Expanding and Rolling for Time Series Data Analysis - Windows

df.expanding()
Return an Expanding object allowing summary functions to be
applied cumulatively.

df.rolling(n)
Return a Rolling object allowing summary functions to be
applied to windows of length n.
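
To make the two definitions above concrete for myself, here is a minimal sketch on a five-element series. The name `s_demo` and the window length 3 are my own illustrative choices, not part of the cheat sheet.

```python
import pandas as pd

# Tiny illustrative series (hypothetical data, chosen so the sums are easy to check)
s_demo = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# expanding(): the window grows from the first row up to the current row,
# so the sum is applied cumulatively
expanding_sum = s_demo.expanding().sum()

# rolling(3): the window always covers the most recent 3 rows,
# so the first 2 results are NaN
rolling_sum = s_demo.rolling(3).sum()

print(expanding_sum.tolist())  # [1.0, 3.0, 6.0, 10.0, 15.0]
print(rolling_sum.tolist())    # [nan, nan, 6.0, 9.0, 12.0]
```

With no missing values, `expanding().sum()` gives the same numbers as `cumsum()`, which is one way to remember what "cumulatively" means here.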
import pandas as pd
import numpy as np
s = pd.Series(
    np.random.randn(1000),
    index=pd.date_range("2020-01-01", periods=1000))
s.plot()
# Previously the plot would not appear unless %matplotlib inline was set.
# It is now applied by default.
<matplotlib.axes._subplots.AxesSubplot at 0x17adf475548>
s = s.cumsum()
s.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17adf536408>
r = s.rolling(window=60)
r
# The first 59 values are NaN (a full window of 60 observations is required)
Rolling [window=60,center=False,axis=0]
r.mean()
2020-01-01          NaN
2020-01-02          NaN
2020-01-03          NaN
2020-01-04          NaN
2020-01-05          NaN
                ...    
2022-09-22   -29.150736
2022-09-23   -29.142874
2022-09-24   -29.154502
2022-09-25   -29.175149
2022-09-26   -29.231764
Freq: D, Length: 1000, dtype: float64
r = s.rolling(window=30)
s.plot(style='k--')
r.mean().plot(style='k')
<matplotlib.axes._subplots.AxesSubplot at 0x17ae0411508>

Rolling is usually used to compute a moving average, as in the plot above.
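
As a sanity check on what the moving average actually computes, the sketch below compares one rolling value with a mean taken by hand. The seed, the series length of 200, and the position 100 are arbitrary assumptions made only so the example is reproducible.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
walk = pd.Series(rng.standard_normal(200)).cumsum()

window = 30
moving_avg = walk.rolling(window=window).mean()

# The first window - 1 entries are NaN because a full window is not yet available
n_missing = int(moving_avg.isna().sum())

# Any later entry is the plain mean of the current value and the window - 1 before it
manual = walk.iloc[100 - window + 1 : 101].mean()

print(n_missing)
print(bool(np.isclose(moving_avg.iloc[100], manual)))
```

A longer window gives a smoother line but leaves more NaN values at the start of the series.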

df = pd.DataFrame(
    np.random.randn(1000, 4),
    index=pd.date_range("2020-01-01", periods=1000),
    columns=["A", "B", "C", "D"])
df

                   A         B         C         D
2020-01-01 -1.018892  0.842255 -0.987166 -0.597796
2020-01-02  0.109856 -0.017367 -0.120858 -0.263876
2020-01-03 -0.664397  0.611548  0.562033 -0.564003
2020-01-04 -0.159660  0.130362  1.087226  1.136409
2020-01-05 -0.236306  0.901542  0.642744 -1.831807
...              ...       ...       ...       ...
2022-09-22  0.475307 -0.239127  0.852104 -0.170865
2022-09-23  0.000562  0.120297  0.885682 -0.085760
2022-09-24 -0.040042 -0.339307 -0.082087  0.848679
2022-09-25 -0.272293 -1.296961  0.230514 -0.849387
2022-09-26 -0.229110 -0.066247  0.093493 -0.149113

[1000 rows x 4 columns]

df = df.cumsum()
df

                    A          B         C         D
2020-01-01  -1.018892   0.842255 -0.987166 -0.597796
2020-01-02  -0.909036   0.824888 -1.108024 -0.861672
2020-01-03  -1.573433   1.436436 -0.545991 -1.425675
2020-01-04  -1.733093   1.566798  0.541234 -0.289266
2020-01-05  -1.969399   2.468341  1.183978 -2.121073
...               ...        ...       ...       ...
2022-09-22 -55.677922 -53.406551 64.343770  6.064137
2022-09-23 -55.677360 -53.286254 65.229452  5.978377
2022-09-24 -55.717402 -53.625562 65.147365  6.827056
2022-09-25 -55.989695 -54.922523 65.377879  5.977669
2022-09-26 -56.218805 -54.988770 65.471372  5.828556

[1000 rows x 4 columns]

df.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17adf8b32c8>
df.rolling(window=60).sum().plot(subplots=True)
array([<matplotlib.axes._subplots.AxesSubplot object at 0x0000017ADFDB1888>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x0000017ADFDFB1C8>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x0000017ADFE43708>,
       <matplotlib.axes._subplots.AxesSubplot object at 0x0000017ADFE8D8C8>],
      dtype=object)
df.rolling(window=len(df), min_periods=1).mean().plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17ad47a9348>
df.expanding(min_periods=1).mean().plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae1681fc8>
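
The last two plots look identical, which suggests that a rolling window as long as the whole frame with `min_periods=1` behaves like `expanding`. A minimal numeric check, assuming a small random frame of my own (`frame`, 50 rows, fixed seed) rather than the `df` above:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # fixed seed, chosen only for reproducibility
frame = pd.DataFrame(rng.standard_normal((50, 2)), columns=["A", "B"]).cumsum()

# A rolling window that spans every row, allowed to start with a single observation
full_window = frame.rolling(window=len(frame), min_periods=1).mean()

# An expanding window with the same minimum number of observations
expanding_mean = frame.expanding(min_periods=1).mean()

same = bool(np.allclose(full_window, expanding_mean))
print(same)
```

So `expanding` can be read as the special case of `rolling` where the window never drops old rows.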
df.expanding?
dfe = pd.DataFrame({"B": [0, 1, 2, np.nan, 4]})
dfe

     B
0  0.0
1  1.0
2  2.0
3  NaN
4  4.0

dfe.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae17cbd08>

The NaN in the middle breaks the line in the plot => Expanding is needed!

dfe.expanding(2).sum()

     B
0  NaN
1  1.0
2  3.0
3  3.0
4  7.0

dfe.expanding(2).sum().plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae183e688>
dfe.expanding(2).mean().plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae189dec8>
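
To see why the expanding result has no gap at the NaN row, it helps to look at how many valid observations each window holds. This is a small sketch that rebuilds the same five-row frame under the hypothetical name `dfe_demo`:

```python
import numpy as np
import pandas as pd

dfe_demo = pd.DataFrame({"B": [0, 1, 2, np.nan, 4]})

# Number of non-NaN observations seen so far in each expanding window
counts = dfe_demo.expanding().count()

# expanding(2) means min_periods=2: a result needs at least 2 valid observations.
# Row 0 has only one, so it is NaN; row 3 skips its own NaN and reuses the sum so far.
result = dfe_demo.expanding(2).sum()

print(counts["B"].tolist())  # [1.0, 2.0, 3.0, 3.0, 4.0]
print(result["B"].tolist())  # [nan, 1.0, 3.0, 3.0, 7.0]
```

The NaN at row 3 is simply ignored by the window, so the cumulative sum stays at 3.0 there instead of breaking the line.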

Introduction to Visualizing Series and DataFrame Data with Python pandas - Plotting

import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

# Workaround for the minus sign rendering incorrectly in plots when a Korean font is used
mpl.rcParams['axes.unicode_minus'] = False
df.plot?

plot

kind : str
- 'line' : line plot (default)
- 'bar' : vertical bar plot
- 'barh' : horizontal bar plot
- 'hist' : histogram
- 'box' : boxplot
- 'kde' : Kernel Density Estimation plot
- 'density' : same as 'kde'
- 'area' : area plot
- 'pie' : pie plot
- 'scatter' : scatter plot
- 'hexbin' : hexbin plot.
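
Each `kind` value listed above also has a matching accessor method, for example `plot(kind='bar')` and `plot.bar()`. The sketch below checks that both draw the same bars. The three-value series and the `Agg` backend are assumptions of mine so that it runs without a display; they are not from the lecture.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

values = pd.Series([3, 1, 2], index=["a", "b", "c"])  # hypothetical data

fig, (ax_kind, ax_accessor) = plt.subplots(1, 2)

# Same chart requested two ways: via the kind argument and via the accessor method
values.plot(kind="bar", ax=ax_kind)
values.plot.bar(ax=ax_accessor)

# Each bar is a rectangle patch on the axes; compare their heights
heights_kind = [float(p.get_height()) for p in ax_kind.patches]
heights_accessor = [float(p.get_height()) for p in ax_accessor.patches]
print(heights_kind == heights_accessor)

plt.close(fig)
```

I find the accessor form easier to discover with tab completion, while the `kind` form is handy when the chart type is stored in a variable.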
ts = pd.Series(
    np.random.randn(1000),
    index=pd.date_range("2020-01-01", periods=1000))
ts
2020-01-01   -0.259357
2020-01-02    0.660092
2020-01-03   -0.759879
2020-01-04    0.158824
2020-01-05    0.008104
                ...   
2022-09-22   -0.705859
2022-09-23   -0.725916
2022-09-24    0.577275
2022-09-25    1.199486
2022-09-26    1.529162
Freq: D, Length: 1000, dtype: float64
ts.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae19f6608>
ts = ts.cumsum()
ts.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae2b3ac88>
df = pd.DataFrame(np.random.randn(1000, 4),
                  index=ts.index, columns=list('ABCD'))
df

                   A         B         C         D
2020-01-01 -0.494674  0.168681  0.251175  0.574593
2020-01-02  1.715208  0.956359  0.149570  0.385309
2020-01-03  1.064786  0.482158  0.142449  0.829651
2020-01-04 -1.873236 -1.087397 -1.401830 -0.522738
2020-01-05  2.044977  0.423282  0.132798  0.043316
...              ...       ...       ...       ...
2022-09-22 -1.494242  2.286578  0.045736 -0.210665
2022-09-23 -0.922229 -0.520283  0.887929 -0.417726
2022-09-24 -1.244146  0.125490  1.108425  0.116583
2022-09-25 -0.654476 -0.596485 -1.908873 -1.268358
2022-09-26  0.680929  0.989327 -0.790184  1.183248

[1000 rows x 4 columns]

df.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae2e893c8>
df = df.cumsum()
df.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae2fd7648>
df3 = pd.DataFrame(np.random.randn(1000, 2), columns=['B', 'C']).cumsum()
df3.head()

          B         C
0  0.743316  0.712930
1  0.703344  1.902628
2  0.793851  2.953486
3 -0.741258  2.168597
4 -0.241436  3.004576

df3['A'] = pd.Series(list(range(len(df3))))
df3.head()

          B         C  A
0  0.743316  0.712930  0
1  0.703344  1.902628  1
2  0.793851  2.953486  2
3 -0.741258  2.168597  3
4 -0.241436  3.004576  4

df3.plot(x='A', y='B')
<matplotlib.axes._subplots.AxesSubplot at 0x17ae312f688>

Python Data Visualization: Drawing Bar Plots with pandas - Plotting

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
ts = pd.Series(np.random.randn(1000),
               index=pd.date_range("2020-01-01", periods=1000))
ts.head()
2020-01-01   -0.923483
2020-01-02    1.554615
2020-01-03   -0.850197
2020-01-04    0.607606
2020-01-05   -1.544911
Freq: D, dtype: float64
df = pd.DataFrame(np.random.randn(1000, 4),
               index=ts.index, columns=list('ABCD'))
df.head(6)

                   A         B         C         D
2020-01-01 -0.000248  0.468759 -0.570039  0.922824
2020-01-02 -0.900794 -2.259521  0.328642  0.522356
2020-01-03 -0.478821  1.064958  0.245880  1.558642
2020-01-04 -0.953477  1.419711  1.096004  0.581822
2020-01-05  0.400680 -0.037835 -0.767587  0.150695
2020-01-06 -0.470434 -0.026065  0.629644  0.113024

df.tail(3)

                   A         B         C         D
2022-09-24  0.779658 -0.688993  0.583472 -0.693562
2022-09-25  0.514137 -0.193501 -0.004432  0.905607
2022-09-26  0.087004  0.291302  2.354743  0.235747

df.iloc[5]
# Row at integer position 5 (2020-01-06)
A   -0.470434
B   -0.026065
C    0.629644
D    0.113024
Name: 2020-01-06 00:00:00, dtype: float64
df.iloc[5].plot(kind='bar')
<matplotlib.axes._subplots.AxesSubplot at 0x17ae31a01c8>
df.iloc[5].plot.bar()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae43b1c88>
df.iloc[5].plot.bar()
plt.axhline(0, color='k')
<matplotlib.lines.Line2D at 0x17ae45c6d08>
df2 = pd.DataFrame(np.random.rand(10, 4), columns=['a', 'b', 'c', 'd'])
df.head(4)

                   A         B         C         D
2020-01-01 -0.000248  0.468759 -0.570039  0.922824
2020-01-02 -0.900794 -2.259521  0.328642  0.522356
2020-01-03 -0.478821  1.064958  0.245880  1.558642
2020-01-04 -0.953477  1.419711  1.096004  0.581822

df2.plot.bar()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae40b8c08>
df2.plot.bar(stacked=True)
<matplotlib.axes._subplots.AxesSubplot at 0x17ad4e86ac8>
df2.plot.barh(stacked=True)
<matplotlib.axes._subplots.AxesSubplot at 0x17ae4743588>

Python data visualization: understanding histograms and frequency distribution tables - Plotting

Histogram

Frequency distribution table vs histogram

  • Frequency distribution table : shows the number of observations that fall into each interval

  • Histogram : visualizes a frequency distribution table as bars, but it is not the same as a bar plot

Bar plot vs histogram (bar plot vs hist plot)

  • bar plot : a chart that represents categorical data with rectangular bars whose height or length is proportional to the value; used to visualize figures such as sums and means

  • hist plot : shows the frequency in each interval

Histogram vs density plot (hist plot vs density plot)

  • hist plot : frequency in each interval

  • density plot : a probability density function, showing the density of a random variable
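The relationship between a frequency distribution table and a histogram can be sketched in a few lines. This is only an illustration under assumptions: the seeded generator, the sample size, and the names values, freq_table, counts and edges are hypothetical and not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 1000 draws from a standard normal distribution.
rng = np.random.default_rng(0)
values = pd.Series(rng.standard_normal(1000))

# Frequency distribution table: split the value range into 10 equal-width
# intervals and count how many observations fall into each one.
freq_table = pd.cut(values, bins=10).value_counts().sort_index()
print(freq_table)

# np.histogram computes per-interval counts of the same kind;
# a histogram simply draws such counts as adjacent bars.
counts, edges = np.histogram(values, bins=10)
print(counts)
```

Every observation lands in exactly one interval, so both sets of counts add up to the number of samples. A bar plot, by contrast, would take one already-computed number per category and draw it as is.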

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df4 = pd.DataFrame({'a' : np.random.randn(1000) + 1,
                  'b' : np.random.randn(1000),
                  'c' : np.random.randn(1000) - 1},
                 columns = ['a', 'b', 'c'])
df4.head()

          a         b         c
0  0.233627  0.933449 -2.384155
1  2.890279 -0.604678 -1.667775
2  1.490996 -0.958704 -0.533509
3 -0.549594 -1.567981 -2.083608
4  2.881449  2.508202 -4.514146

df4.plot.hist(alpha=0.5)
# alpha sets the transparency
<matplotlib.axes._subplots.AxesSubplot at 0x17ae482a2c8>
df4.plot.hist(stacked=True, bins= 20)
# bins defaults to 10
# the frequencies change with the number of bins
<matplotlib.axes._subplots.AxesSubplot at 0x17ae4a9ec08>
df4['a'].plot.hist(orientation='horizontal', cumulative=True)
<matplotlib.axes._subplots.AxesSubplot at 0x17ae4a83f08>
df4['a_diff'] = df4['a'].diff()
df4['a_shift'] = df4['a'].shift(1)
df4['a_minus'] = df4['a'] - df4['a_shift']
df4[['a', 'a_shift', 'a_minus', 'a_diff']].head()
# diff means differencing: each value minus the value before it

          a   a_shift   a_minus    a_diff
0  0.233627       NaN       NaN       NaN
1  2.890279  0.233627  2.656652  2.656652
2  1.490996  2.890279 -1.399283 -1.399283
3 -0.549594  1.490996 -2.040590 -2.040590
4  2.881449 -0.549594  3.431044  3.431044

df4['a'].diff().hist()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae5eba208>
df4[['a', 'b', 'c']].diff().hist(color='k', alpha = 0.5, bins = 50)
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE61E1B48>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE62C35C8>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE62EC808>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE631A648>]],
      dtype=object)
data = pd.Series(np.random.randn(1000))
data.hist(by=np.random.randint(0, 4, 1000), figsize=(6, 4))
# histograms can also be drawn per category
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x0000017AD71DDB48>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE6550F48>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE65800C8>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE65A6DC8>]],
      dtype=object)
data = pd.DataFrame({'a' : np.random.randn(1000),
                    'b' : np.random.randint(0, 4, 1000)})
data.head()

          a  b
0  0.494461  0
1  0.377023  1
2  0.127343  1
3  0.262922  0
4  0.035662  0

data['a'].hist(by=data['b'], figsize=(6, 4))
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE66BEF48>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE66D74C8>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE6708148>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x0000017AE672FF88>]],
      dtype=object)

Python visualization: drawing a box-and-whisker plot (box plot) - Plotting

box plot

A box plot is not drawn from the raw data directly; it is drawn from the five-number summary, five statistics computed from the data.

  • Minimum

  • First quartile

  • Second quartile (= median)

  • Third quartile

  • Maximum
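The five numbers listed above can be computed directly, which makes it easier to see what the box and whiskers encode. This is a minimal sketch with assumptions: the seeded data and the names s, five and desc are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one column of 10 uniform random values.
rng = np.random.default_rng(0)
s = pd.Series(rng.random(10), name="A")

# Five-number summary: minimum, Q1, median (Q2), Q3, maximum.
five = s.quantile([0.0, 0.25, 0.5, 0.75, 1.0])
five.index = ["min", "Q1", "median", "Q3", "max"]
print(five)

# describe() reports the same values in its min, 25%, 50%, 75% and max rows.
desc = s.describe()
print(desc[["min", "25%", "50%", "75%", "max"]])
```

One caveat worth keeping in mind: with matplotlib's default settings the whiskers reach at most 1.5 times the interquartile range beyond the box, and points farther out are drawn individually. The whisker ends therefore match the minimum and maximum only when no such points exist.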

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(10, 5), columns=['A', 'B', 'C', 'D', 'E'])
df

          A         B         C         D         E
0  0.370232  0.606418  0.652095  0.492909  0.349718
1  0.112048  0.106915  0.265264  0.728334  0.494577
2  0.004974  0.729390  0.920953  0.773177  0.977718
3  0.934579  0.708746  0.029470  0.329899  0.377750
4  0.856339  0.376554  0.731859  0.703761  0.995195
5  0.635157  0.040028  0.316238  0.307590  0.040899
6  0.017509  0.356093  0.728913  0.297290  0.337541
7  0.128064  0.690733  0.733154  0.523859  0.315686
8  0.115399  0.202518  0.540844  0.667318  0.415735
9  0.645682  0.007744  0.336712  0.230469  0.553611

df.describe()

               A          B          C          D          E
count  10.000000  10.000000  10.000000  10.000000  10.000000
mean    0.381998   0.382514   0.525550   0.505461   0.485843
std     0.357076   0.286496   0.277474   0.204475   0.296292
min     0.004974   0.007744   0.029470   0.230469   0.040899
25%     0.112886   0.130816   0.321356   0.313168   0.340585
50%     0.249148   0.366324   0.596469   0.508384   0.396742
75%     0.643050   0.669654   0.731123   0.694650   0.538853
max     0.934579   0.729390   0.920953   0.773177   0.995195

df.plot.box()
<matplotlib.axes._subplots.AxesSubplot at 0x17ae684e288>
color = {
    "boxes": "DarkGreen",
    "whiskers": "DarkOrange",
    "medians": "DarkBlue",
    "caps": "Gray",}
df.plot
<pandas.plotting._core.PlotAccessor object at 0x0000017AE6803288>
df.plot.box(color=color, sym="r+")
<matplotlib.axes._subplots.AxesSubplot at 0x17ae68fb488>
df.plot.box(vert=False, positions=[1, 4, 5, 6, 8])
<matplotlib.axes._subplots.AxesSubplot at 0x17ae6a60088>
df = pd.DataFrame(np.random.rand(10, 5))
df.head()

          0         1         2         3         4
0  0.929338  0.659756  0.972052  0.521413  0.215369
1  0.450177  0.283452  0.816272  0.466250  0.451954
2  0.877936  0.720482  0.350979  0.020901  0.633757
3  0.445642  0.444882  0.349320  0.321260  0.384497
4  0.404033  0.092795  0.097995  0.723962  0.870682

plt.figure()
bp = df.boxplot()
# boxplot() draws a grid
df = pd.DataFrame(np.random.rand(10, 2), columns=["Col1", "Col2"])
df.head()

       Col1      Col2
0  0.463268  0.297339
1  0.594417  0.267667
2  0.666147  0.707854
3  0.378402  0.735593
4  0.420503  0.365746

df["X"] = pd.Series(["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"])
df.head()

       Col1      Col2  X
0  0.463268  0.297339  A
1  0.594417  0.267667  A
2  0.666147  0.707854  A
3  0.378402  0.735593  A
4  0.420503  0.365746  A

plt.figure()
bp = df.boxplot(by="X")
<Figure size 432x288 with 0 Axes>
np.random.seed(1234)
df_box = pd.DataFrame(np.random.randn(50, 2))
df_box["g"] = np.random.choice(["A", "B"], size=50)
df_box.loc[df_box["g"] == "B", 1] += 3
bp = df_box.boxplot(by="g")
bp = df_box.groupby("g").boxplot()

Python data visualization: area plot, and adding grid lines with the grid option - Plotting

Area plot

By default, an area plot is drawn stacked (stacked=True).
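What "stacked" means can be checked numerically without drawing anything. This is a rough sketch under assumptions: the seeded data and the name upper_edges are hypothetical, and the cumulative sum only mirrors how stacked bands sit on top of one another.

```python
import numpy as np
import pandas as pd

# Hypothetical data with the same shape as the example: 10 rows, 4 columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((10, 4)), columns=["a", "b", "c", "d"])

# In a stacked area plot each band sits on top of the previous ones,
# so the upper edge of each band is the running row-wise total.
upper_edges = df.cumsum(axis=1)
print(upper_edges.head())

# The top of the last band equals the row sum of all columns.
print(df.sum(axis=1).head())
```

With stacked=False every column is instead drawn from zero, and the areas overlap with some transparency. Stacking also requires each column to be entirely non-negative or entirely non-positive, which np.random.rand data satisfies.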

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(10, 4), columns=["a", "b", "c", "d"])
df

          a         b         c         d
0  0.156005  0.052212  0.188224  0.807246
1  0.667732  0.712762  0.537092  0.519210
2  0.034637  0.735852  0.533051  0.258751
3  0.250128  0.011195  0.654490  0.954305
4  0.794724  0.648470  0.988780  0.206013
5  0.656444  0.189838  0.076012  0.627008
6  0.191030  0.520235  0.869168  0.957507
7  0.795974  0.170474  0.791833  0.782586
8  0.980319  0.722360  0.134649  0.879211
9  0.751361  0.697953  0.240086  0.953517

df.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x1e3fcb14a88>
df.plot.area()
# looks slightly different from the plot above
# because the areas are stacked
<matplotlib.axes._subplots.AxesSubplot at 0x1e3fcbfed88>
df.plot.area(stacked=False)
<matplotlib.axes._subplots.AxesSubplot at 0x1e3fcc78c08>
df.plot(grid=True)
<matplotlib.axes._subplots.AxesSubplot at 0x1e3fcce2448>

Python data visualization: drawing a scatter plot - Plotting

Scatter plot

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.rand(50, 4), columns=["a", "b", "c", "d"])
df.plot.scatter(x="a", y="b")
<matplotlib.axes._subplots.AxesSubplot at 0x1e3fcd83f48>
df.plot.scatter(x="a", y="b", grid=True)
# add grid lines
<matplotlib.axes._subplots.AxesSubplot at 0x1e3fcdc7dc8>
df.plot.scatter(x="a", y="b", grid=True, s=50)
# set the marker size
<matplotlib.axes._subplots.AxesSubplot at 0x1e3fcd52888>
ax = df.plot.scatter(x="a", y="b", color="Red", label="Group 1")
df.plot.scatter(x="c", y="d", color="DarkBlue", label="Group 2", ax=ax)
<matplotlib.axes._subplots.AxesSubplot at 0x1e3fdfd7188>
df.plot.scatter(x="a", y="b", c="c", s=60)
<matplotlib.axes._subplots.AxesSubplot at 0x1e3fe113c48>
df.plot.scatter(x="a", y="b", s=df["c"] * 200)
<matplotlib.axes._subplots.AxesSubplot at 0x1e3fe1afac8>

Python data visualization: drawing a hexbin plot, which complements the histogram and the scatter plot - Plotting

Use a hexagonal bin plot when there are too many data points. It makes up for the weakness of a scatter plot, where dense points overlap.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame(np.random.randn(1000, 2), columns=["a", "b"])
df["b"] = df["b"] + np.arange(1000)  # add each row's index value, so b increases toward the end
df

             a           b
0     0.143530    0.168155
1     0.510635    1.377011
2    -0.440050    2.127910
3     1.717786    1.872928
4    -0.691979    3.942575
..         ...         ...
995   1.212032  996.148623
996  -1.094107  996.484877
997   1.512110  996.442078
998  -0.523446  999.116458
999  -1.085693  997.781039

1000 rows × 2 columns

df.plot.hexbin(x="a", y="b", gridsize=5)
<matplotlib.axes._subplots.AxesSubplot at 0x1e3ff0e41c8>
df.plot.hexbin(x="a", y="b", gridsize=15)
<matplotlib.axes._subplots.AxesSubplot at 0x1e3ff224508>
df["z"] = np.random.uniform(0, 3, 1000)  # 1000 values drawn uniformly from the range 0 to 3
df

             a           b         z
0     0.143530    0.168155  2.223601
1     0.510635    1.377011  0.424057
2    -0.440050    2.127910  0.388958
3     1.717786    1.872928  0.238032
4    -0.691979    3.942575  1.840137
..         ...         ...       ...
995   1.212032  996.148623  1.041120
996  -1.094107  996.484877  2.818773
997   1.512110  996.442078  1.570454
998  -0.523446  999.116458  1.100353
999  -1.085693  997.781039  0.660611

1000 rows × 3 columns

df.plot.hexbin(x="a", y="b", C="z", gridsize=20)
<matplotlib.axes._subplots.AxesSubplot at 0x1e3ff3ed3c8>
df.plot.hexbin(x="a", y="b", C="z", reduce_C_function=np.max, gridsize=20)

# reduce_C_function
# By default, hexbin computes a histogram of how many points fall around each (x, y).
# If C is given (optionally together with reduce_C_function),
# each bin shows an aggregate of C such as the max, min, median, or mean instead of the count.
<matplotlib.axes._subplots.AxesSubplot at 0x1e3ff53da88>
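The difference between counting points per bin and aggregating a third column per bin can be imitated with plain pandas. Note that this sketch swaps in rectangular bins built with pd.cut and groupby in place of hexagons, so it is an analogy and not what hexbin does internally. The seeded data and the names a_bin, b_bin, cells, counts, means and maxima are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like the example: b drifts upward, z is uniform on [0, 3).
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "a": rng.standard_normal(n),
    "b": rng.standard_normal(n) + np.arange(n),
    "z": rng.uniform(0, 3, n),
})

# Rectangular bins stand in for hexagons here (an assumption made for simplicity).
df["a_bin"] = pd.cut(df["a"], bins=5)
df["b_bin"] = pd.cut(df["b"], bins=5)
cells = df.groupby(["a_bin", "b_bin"], observed=True)["z"]

counts = cells.size()  # analogous to hexbin without C: points per bin
means = cells.mean()   # analogous to C="z" with the default reduce_C_function (np.mean)
maxima = cells.max()   # analogous to C="z", reduce_C_function=np.max

print(counts.head())
print(maxima.head())
```

The color of each bin then encodes whichever of these per-bin numbers was requested, which is why the two hexbin plots above look different even though the points are the same.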

Pandas pie plot: why does seaborn have no pie chart? - Plotting

Pie plot

seaborn does not support pie charts (and has no plans to). Reason: they can be misleading and sometimes fail to represent the data accurately.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
series = pd.Series(3 * np.random.rand(4), index=["a", "b", "c", "d"], name="series")
series
a    1.576532
b    2.637146
c    2.704557
d    1.490052
Name: series, dtype: float64
series.plot.pie(figsize=(6, 6));
df = pd.DataFrame({'mass': [0.330, 4.87, 5.97],
                  'radius': [2439.7, 6051.8, 6378.1]},
                 index=['Mercury', 'Venus', 'Earth'])
plot = df.plot.pie(y='mass', figsize=(5, 5))
plot = df.plot.pie(subplots=True, figsize=(10, 5))
series.plot.pie(
    labels=["AA", "BB", "CC", "DD"],
    colors=["r", "g", "b", "c"],
    autopct="%.2f",
    fontsize=20,
    figsize=(6, 6),)
<matplotlib.axes._subplots.AxesSubplot at 0x1e3ff755408>
series = pd.Series([0.1] * 4, index=["a", "b", "c", "d"], name="series2")
series.plot.pie(figsize=(6, 6))
<matplotlib.axes._subplots.AxesSubplot at 0x1e3ff7a1fc8>

Scatter Matrix Plot: showing scatter plots together with kernel density estimates - Plotting

from pandas.plotting import scatter_matrix
df = pd.DataFrame(np.random.randn(1000, 4), columns=["a", "b", "c", "d"])
df.head()

          a         b         c         d
0  0.887408 -0.002276 -1.342932 -0.202530
1 -1.443966 -0.018683 -0.676024  0.478978
2 -0.665443 -0.916739 -0.566526  0.948019
3 -0.621182 -1.709215 -0.375141 -1.305123
4  1.656003 -0.898862 -1.744376  0.926337

scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal="kde")
# alpha: transparency, figsize: figure size, diagonal: plot drawn on the diagonal, kde: kernel density estimate
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x000001E389800548>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38AC3E488>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38AC6B6C8>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38AC96BC8>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x000001E38ACC74C8>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38ACC7588>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38ACF7248>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38AD50D48>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x000001E38AD813C8>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38ADAAFC8>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38ADDBD08>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38AE0BA48>],
       [<matplotlib.axes._subplots.AxesSubplot object at 0x000001E38AE3E788>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38AE6D4C8>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38AE9E208>,
        <matplotlib.axes._subplots.AxesSubplot object at 0x000001E38AEC5F08>]],
      dtype=object)

Python visualization: drawing a distribution with a Kernel Density Estimate plot - Plotting

KDE

In statistics, kernel density estimation is a non-parametric way to estimate the probability density function of a random variable. This function (plot.kde) uses Gaussian kernels and includes automatic bandwidth determination.

A kernel density estimate is closely related to a histogram, but with a suitable kernel it can be given properties such as smoothness and continuity.

A kernel function is symmetric about the origin and integrates to 1.
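The definition above can be made concrete with a hand-written estimator. This is a sketch under assumptions, not the code pandas runs: the function names gaussian_kernel and kde, the evaluation grid, and the use of Scott's rule of thumb for the bandwidth are illustrative choices.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density: symmetric about 0 and integrates to 1."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(samples, grid, bandwidth):
    """Average of kernels centred on each sample, evaluated on the grid."""
    u = (grid[:, None] - samples[None, :]) / bandwidth
    return gaussian_kernel(u).sum(axis=1) / (len(samples) * bandwidth)

# Hypothetical data: 1000 draws from a standard normal distribution.
rng = np.random.default_rng(0)
samples = rng.standard_normal(1000)

# Scott's rule of thumb for one-dimensional data (an assumed bandwidth choice).
bandwidth = samples.std(ddof=1) * len(samples) ** (-1 / 5)

grid = np.linspace(-6.0, 6.0, 1201)
density = kde(samples, grid, bandwidth)

# Riemann-sum check: the estimated density should integrate to roughly 1.
step = grid[1] - grid[0]
area = density.sum() * step
print(round(area, 3))
```

Each sample contributes one small bump, and the estimate is the average of those bumps. A smaller bandwidth gives a wigglier curve and a larger one gives a smoother curve, much like choosing more or fewer histogram bins.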

When to use non-parametric statistics

  • When the population the data represent is not normally distributed

  • When the population the data represent cannot be adequately transformed to a normal distribution

  • When the sample size is small

  • When the observations are independent of one another

  • When the variable is measured on a nominal or ordinal scale

ser = pd.Series(np.random.randn(1000))
ser
0      0.851606
1      0.012797
2      1.338664
3      0.598750
4      2.415107
         ...   
995    1.056412
996    0.844509
997    0.165994
998   -0.918307
999   -0.441499
Length: 1000, dtype: float64
plt.figure();
ser.plot.hist(alpha=0.5, bins=5)
<matplotlib.axes._subplots.AxesSubplot at 0x1e38b4f7848>
ser.plot.hist(alpha=0.5, bins=10)
<matplotlib.axes._subplots.AxesSubplot at 0x1e38b588688>
ser.plot.hist(alpha=0.5, bins=20)
<matplotlib.axes._subplots.AxesSubplot at 0x1e38b6035c8>
ser.plot.kde()
<matplotlib.axes._subplots.AxesSubplot at 0x1e38b041448>
ser.plot.density()
<matplotlib.axes._subplots.AxesSubplot at 0x1e38b691f08>

The kernel density plot is a smoothed version of the histograms above. A histogram has frequency on its axis, whereas a density plot has density. plot.density() and plot.kde() draw the same plot (density == kde).
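The frequency versus density distinction can also be seen on the histogram itself, since np.histogram can return either raw counts or a normalized density. The snippet below is a small sketch under assumptions: the seeded samples and the names counts, density, widths and area are hypothetical.

```python
import numpy as np

# Hypothetical data: 1000 draws from a standard normal distribution.
rng = np.random.default_rng(0)
samples = rng.standard_normal(1000)

# Frequency axis: raw counts per bin, summing to the number of samples.
counts, edges = np.histogram(samples, bins=20)

# Density axis: counts rescaled so that the total area of the bars is 1.
density, _ = np.histogram(samples, bins=20, density=True)
widths = np.diff(edges)
area = (density * widths).sum()

print(counts.sum())
print(round(area, 6))
```

On the density scale a histogram and a KDE curve become directly comparable, because both describe an area of 1 spread over the value range.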

