16 Wed

TIL

Programmers AI School, 1st Cohort

Week 3, DAY 3

Visualizing Data with Matplotlib

Let's present data in a way that is easy to read.

1. Getting Started with Matplotlib

2. Frequently Used Plotting Options

Size: figsize
Title: title
Labels: _label
Ticks: _ticks
Legend: legend

3. Matplotlib Case Study

Line graph (Plot)
Scatter plot (Scatter Plot)
Box plot (Box Plot)
Bar chart (Bar Chart)
Pie chart (Pie Chart)

4. The Fancier Graphs: seaborn Case Study

Kernel density plot (Kernel Density Plot)
Count plot (Count Plot)
Cat plot (Cat Plot)
Strip plot (Strip Plot)
Heatmap (Heatmap)

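I. Getting Started with Matplotlib

A data visualization library for Python
cf) Library vs. framework
Library: you combine the code it provides to produce a result
ex: numpy, pandas
Framework: you fill in content inside a predefined structure
ex: django, flask
pip install matplotlib
%matplotlib inline: renders plots inline in a Jupyter notebook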

import numpy as np, pandas as pd, matplotlib.pyplot as plt
%matplotlib inline

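II. Frequently Used Plotting Options

plt.plot([1, 2, 3, 4, 5]) # the function that does the actual plotting; here y = x + 1
# Equivalent to plt.plot([0, 1, 2, 3, 4], [1, 2, 3, 4, 5]): with a single list, the indices are used as x
plt.show() # displays the figure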

plt.plot([2,4,2,4,2])
plt.show()

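Figsize: adjusting the size of the Figure

figure: the canvas that holds the plot. figsize is a (width, height) tuple in inches; the size in pixels is figsize multiplied by the figure's dpi (for example, 1 inch is 72 pixels at dpi=72).

plt.figure(figsize=(3, 3)) # create the figure to plot on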

plt.plot([0, 1, 2, 3, 4])
plt.show()
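To make the relationship between figsize and pixels concrete, here is a minimal sketch. The dpi value of 100 is an assumption chosen for easy arithmetic, and the Agg backend is selected only so the snippet also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: running as a plain script)
import matplotlib.pyplot as plt

# figsize is in inches; dpi (dots per inch) converts inches to pixels.
fig = plt.figure(figsize=(3, 3), dpi=100)  # dpi=100 is an assumed value

width_in, height_in = fig.get_size_inches()
width_px = width_in * fig.dpi
height_px = height_in * fig.dpi
print(width_px, height_px)

plt.close(fig)
```

Changing either figsize or dpi changes the pixel size, which is why the same figsize can look different across environments.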

Quadratic Function Graph with plot()

# First, plot the linear function y = x from a list:

plt.plot([0, 1, 2, 3, 4])
plt.show()

# Plot a function using numpy.array

x = np.array([1, 2, 3, 4, 5]) # domain
y = np.array([1, 4, 9, 16, 25]) # range

plt.plot(x, y)
plt.show()

# np.arange(a, b, c): values from a up to (but not including) b in steps of c; here c = 0.01

x = np.arange(-10, 10, 0.01)

plt.xlabel("x value")
plt.ylabel("f(x) value")

plt.plot(x, x**2)
plt.show()

# Set the range of the x and y axes

x = np.arange(-10, 10, 0.01)
plt.xlabel("x value")
plt.ylabel("f(x) value")

plt.axis([-5, 5, 0 , 25]) # [x_min, x_max, y_min, y_max]

plt.plot(x, x**2)
plt.show()

# Set the ticks on the x and y axes

x = np.arange(-10, 10, 0.01)
plt.xlabel("x value")
plt.ylabel("f(x) value")
plt.axis([-5, 5, 0 , 25]) # [x_min, x_max, y_min, y_max]

plt.xticks([i for i in range(-5, 6, 1)])
plt.yticks([i*i for i in range(0, 6)])

plt.plot(x, x**2)
plt.show()

# Add a title to the graph

x = np.arange(-10, 10, 0.01)
plt.xlabel("x value")
plt.ylabel("f(x) value")
plt.axis([-5, 5, 0 , 25]) # [x_min, x_max, y_min, y_max]
plt.xticks([i for i in range(-5, 6, 1)])
plt.yticks([i*i for i in range(0, 6)])

plt.title("y = x^2 graph")

plt.plot(x, x**2)
plt.show()

# Give the plotted line a name (shown in the legend)

x = np.arange(-10, 10, 0.01)
plt.xlabel("x value")
plt.ylabel("f(x) value")
plt.axis([-5, 5, 0 , 25]) # [x_min, x_max, y_min, y_max]
plt.xticks([i for i in range(-5, 6, 1)])
plt.yticks([i*i for i in range(0, 6)])

plt.title("y = x^2 graph")

plt.plot(x, x**2, label="trend")
plt.legend()

plt.show()
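The same options can also be set through Matplotlib's object-oriented interface, which some readers find easier to follow once a figure has more than one plot. The sketch below mirrors the example above; the figsize value is an arbitrary choice, and the Agg backend is used only so it runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: running as a plain script)
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-10, 10, 0.01)

fig, ax = plt.subplots(figsize=(6, 4))    # figsize chosen arbitrarily
ax.plot(x, x**2, label="trend")           # label is what legend() displays
ax.set_xlabel("x value")
ax.set_ylabel("f(x) value")
ax.set_xlim(-5, 5)                        # same effect as plt.axis([-5, 5, 0, 25])
ax.set_ylim(0, 25)
ax.set_xticks(list(range(-5, 6)))
ax.set_yticks([i * i for i in range(6)])
ax.set_title("y = x^2 graph")
ax.legend()
```

In a notebook, plt.show() at the end displays the figure as before.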

III. Matplotlib Case Study

Line Graph (Plot)

.plot()

x = np.arange(20) # 0 to 19
y = np.random.randint(0, 21, 20) # 20 random integers from 0 to 20

x, y

(array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19]),
 array([ 4, 13, 12, 11,  7, 14,  7, 20,  9,  6, 14, 12, 17, 20, 12,  0,  0,
        11, 13,  6]))

plt.plot(x, y)
plt.show()

# Extra: what if we want the y-axis to extend to 20, with ticks every 5?

plt.axis([0, 20, 0 , 20]) # [x_min, x_max, y_min, y_max]
plt.yticks([i for i in range(0, 21, 5)])

plt.plot(x, y)
plt.show()

Scatter Plot

.scatter()

plt.scatter(x, y)
plt.show()

Plot: good for seeing a pattern or trend. Scatter Plot: good for seeing correlation.
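The correlation that a scatter plot suggests visually can also be checked numerically. The following is a small sketch using np.corrcoef on made-up data; both datasets and the seed are hypothetical and only serve to contrast a strong relationship with a weak one.

```python
import numpy as np

rng = np.random.default_rng(0)              # fixed seed, chosen arbitrarily
x = np.arange(20)
y_linear = 2 * x + rng.normal(0, 1, 20)     # hypothetical data: roughly y = 2x
y_random = rng.integers(0, 21, 20)          # hypothetical data: unrelated to x

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is Pearson's r
r_linear = np.corrcoef(x, y_linear)[0, 1]
r_random = np.corrcoef(x, y_random)[0, 1]
print(r_linear, r_random)
```

A value of r near 1 or -1 matches a scatter plot whose points lie close to a line, while a value near 0 matches a shapeless cloud.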

Box Plot

Summarizes numeric data (Q1, Q2, Q3, min, max)

plt.boxplot(y)
plt.show()

# Extra: give the plot the title "Box plot of x, y"

plt.boxplot((x, y))
plt.title("Box plot of x, y")
plt.show()
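The values a box plot summarizes can be computed directly with NumPy. This sketch reuses the y array printed earlier in these notes; np.percentile with its default interpolation is used here, which is an assumption about how the quartiles should be defined.

```python
import numpy as np

# The y values printed earlier in these notes
y = np.array([4, 13, 12, 11, 7, 14, 7, 20, 9, 6,
              14, 12, 17, 20, 12, 0, 0, 11, 13, 6])

q1, q2, q3 = np.percentile(y, [25, 50, 75])  # Q2 is the median
print("min:", y.min(), "Q1:", q1, "Q2:", q2, "Q3:", q3, "max:", y.max())
```

Note that plt.boxplot by default draws whiskers out to at most 1.5 times the interquartile range and marks anything beyond as outliers, so the whisker ends match min and max only when there are no such points.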

Bar Plot

A chart that shows each value of categorical data and its magnitude as a rectangle
.bar()

plt.bar(x, y)
plt.xticks(np.arange(0, 21, 1))
plt.show()

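# Extra: set the xticks properly.

# cf) Histogram
# Shows a frequency distribution as rectangular bars.
# A bar plot shows each individual value.
# A histogram groups values into "classes" (bins).
# Instead of separate values 0, 1, 2, the data is grouped into intervals such as 0 to 2 before plotting.
# .hist()

plt.hist(y, bins=np.arange(0, 21, 2)) # bins: the edges of the class intervals

# Extra: adjust the xticks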
plt.xticks(np.arange(0, 21, 2))
plt.show()
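To see the numbers behind the bars, the same binning can be done with np.histogram, which returns the count per bin. This is a sketch using the y array printed earlier and the same bin edges as above.

```python
import numpy as np

# The y values printed earlier in these notes
y = np.array([4, 13, 12, 11, 7, 14, 7, 20, 9, 6,
              14, 12, 17, 20, 12, 0, 0, 11, 13, 6])

edges = np.arange(0, 21, 2)                  # 0, 2, ..., 20 gives 10 bins
counts, _ = np.histogram(y, bins=edges)

# Each bin is [left, right), except the last one, which also includes its right edge
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:>2} to {right:>2}: {count}")
```

Every value falls in exactly one bin, so the counts add up to the number of data points.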

Pie Chart

A chart that shows each part's share of the whole as a sector
Makes proportions easier to read than other chart types
.pie()

z = [100, 300, 200, 400]

plt.pie(z, labels=['one', 'two', 'three', 'four'])
plt.show()
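The share each sector represents is simply the value divided by the total. A short sketch of that arithmetic, using the same z and labels as above:

```python
import numpy as np

z = np.array([100, 300, 200, 400])
labels = ["one", "two", "three", "four"]

fractions = z / z.sum()                      # each part's share of the whole
for label, frac in zip(labels, fractions):
    print(f"{label}: {frac:.0%}")
```

To print these percentages on the chart itself, plt.pie accepts an autopct format string, for example plt.pie(z, labels=labels, autopct="%1.1f%%").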

IV. The Fancier Graphs: Seaborn Case Study

A library built on top of Matplotlib that offers a wider variety of visualizations

Kernel density plot
Count plot
Cat plot
Strip plot
Heatmap

Importing Seaborn

import seaborn as sns

Kernel Density Plot

Draws a continuous distribution, like the one a histogram approximates, as a smooth curve
sns.kdeplot()

# With a histogram

x = np.arange(0, 22, 2)
print(x)
y = np.random.randint(0, 20, 20)
print(y)
plt.hist(y, bins=x)
plt.show()

[ 0  2  4  6  8 10 12 14 16 18 20]
[13  0 11 13 18  6  1  6 10  1  5 12 14 18  7  3  7  3  6 14]

# kdeplot

sns.kdeplot(y)
plt.show()

# kdeplot with shading

sns.kdeplot(y, fill=True) # fill: shades the area under the curve (named shade before seaborn 0.11)
plt.show()
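To show what "smoothing a histogram into a curve" means, here is a hand-rolled sketch of a Gaussian kernel density estimate with NumPy. It is not how seaborn is implemented internally; the bandwidth of 2.0 and the grid range are assumptions picked for illustration, whereas sns.kdeplot chooses its bandwidth automatically.

```python
import numpy as np

# The y values printed in the histogram example above
y = np.array([13, 0, 11, 13, 18, 6, 1, 6, 10, 1,
              5, 12, 14, 18, 7, 3, 7, 3, 6, 14])

bandwidth = 2.0                                   # assumed smoothing width
grid = np.linspace(y.min() - 10, y.max() + 10, 500)

# Place one Gaussian bump on every observation and average them
z = (grid[:, None] - y[None, :]) / bandwidth
kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
density = kernels.mean(axis=1)

# A density should integrate to about 1 (simple Riemann-sum check)
area = np.sum(density) * (grid[1] - grid[0])
print(area)
```

A larger bandwidth gives a smoother, flatter curve; a smaller one follows the individual data points more closely.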

Count Plot

Visualizes how often each value of a categorical column occurs; equivalent to counting after a groupby
sns.countplot()

vote_df = pd.DataFrame({"name":['Andy', 'Bob', 'Cat'], "vote":[True, True, False]})

vote_df

name  vote
Andy  True
Bob   True
Cat   False

# With a matplotlib bar plot

vote_count = vote_df.groupby('vote').count()
vote_count

vote   name
False     1
True      2

plt.bar(x=[False, True], height=vote_count['name'])
plt.show()

# seaborn's countplot: counts the values and plots the result in one step

sns.countplot(x=vote_df['vote'])
plt.show()
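The counts that countplot draws can also be obtained without plotting. The sketch below uses pandas value_counts on the same vote_df as a quick way to check the numbers.

```python
import pandas as pd

vote_df = pd.DataFrame({"name": ["Andy", "Bob", "Cat"],
                        "vote": [True, True, False]})

# Frequency of each distinct value in the column, most frequent first
counts = vote_df["vote"].value_counts()
print(counts)
```

This gives the same frequencies as the groupby('vote').count() approach above, but as a single Series rather than a DataFrame.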

Cat Plot

"cat" is short for categorical
A function that shows the relationship between a numeric variable and one or more categorical variables
sns.catplot()

covid = pd.read_csv("./country_wise_latest.csv")
covid.head(5)

Output (first five rows, selected columns; the full frame also has Deaths, Active, New cases, New deaths, New recovered, and 1 week change):

Country/Region  Confirmed  Recovered  Deaths / 100 Cases  Recovered / 100 Cases  Deaths / 100 Recovered  Confirmed last week  1 week % increase  WHO Region
Afghanistan         36263      25198                3.50                  69.49                    5.04                35526               2.07  Eastern Mediterranean
Albania              4880       2745                2.95                  56.25                    5.25                 4171              17.00  Europe
Algeria             27973      18837                4.16                  67.34                    6.17                23691              18.07  Africa
Andorra               907        803                5.73                  88.53                    6.48                  884               2.60  Europe
Angola                950        242                4.32                  25.47                   16.94                  749              26.84  Africa

s = sns.catplot(x="WHO Region", y="Confirmed", data=covid) # default: kind='strip'
s.fig.set_size_inches(10, 6)
plt.show()
# catplot: well suited to showing categorical and numeric data together; one function covers several plot kinds

s = sns.catplot(x="WHO Region", y="Confirmed", data=covid, kind='violin')
s.fig.set_size_inches(10, 6)
plt.show()
# The same call with kind='violin' draws one violin per category
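A numeric summary per category is a useful companion to a categorical plot. The sketch below uses a small made-up DataFrame in place of the covid data (the rows are hypothetical and only reuse the column names), and computes per-region statistics with groupby.

```python
import pandas as pd

# Hypothetical toy data standing in for the covid DataFrame
toy = pd.DataFrame({
    "WHO Region": ["Europe", "Europe", "Africa", "Africa", "Africa"],
    "Confirmed": [4880, 907, 27973, 950, 1200],
})

# One row per category, with summary statistics of the numeric column
summary = toy.groupby("WHO Region")["Confirmed"].agg(["count", "min", "median", "max"])
print(summary)
```

With the real data, replacing toy by covid would give the numbers behind each strip or violin.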

Strip Plot

A plot that shows the individual data values, similar to a scatter plot
sns.stripplot()

sns.stripplot(x='WHO Region', y='Recovered', data=covid)
plt.show()
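A strip plot keeps points in the same category from stacking exactly on top of each other by adding a small random horizontal offset (jitter). The following sketch reproduces that idea by hand with NumPy; the category codes, values, and jitter width are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)                        # fixed seed, chosen arbitrarily
categories = np.array([0, 0, 0, 1, 1, 2])             # hypothetical category codes
values = np.array([5.0, 5.0, 7.0, 3.0, 3.0, 9.0])     # hypothetical measurements

jitter_width = 0.1                                    # assumed horizontal spread
offsets = rng.uniform(-jitter_width, jitter_width, size=len(categories))
x_jittered = categories + offsets                     # only x moves; values stay exact

print(np.column_stack([x_jittered, values]))
```

Passing x_jittered and values to plt.scatter would draw a simple strip plot; the y positions remain the true data values.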

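# cf) swarmplot: when points share the same value you cannot tell how many there are, so it spreads them apart.

s = sns.swarmplot(x='WHO Region', y='Recovered', data=covid)
plt.show()
# The messages below are not errors; they are warnings that not all of the points could be placed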

c:\users\32154049\appdata\local\programs\python\python37\lib\site-packages\seaborn\categorical.py:1296: UserWarning: 22.7% of the points cannot be placed; you may want to decrease the size of the markers or use stripplot.
  warnings.warn(msg, UserWarning)
c:\users\32154049\appdata\local\programs\python\python37\lib\site-packages\seaborn\categorical.py:1296: UserWarning: 69.6% of the points cannot be placed; you may want to decrease the size of the markers or use stripplot.
  warnings.warn(msg, UserWarning)
c:\users\32154049\appdata\local\programs\python\python37\lib\site-packages\seaborn\categorical.py:1296: UserWarning: 79.2% of the points cannot be placed; you may want to decrease the size of the markers or use stripplot.
  warnings.warn(msg, UserWarning)
c:\users\32154049\appdata\local\programs\python\python37\lib\site-packages\seaborn\categorical.py:1296: UserWarning: 54.3% of the points cannot be placed; you may want to decrease the size of the markers or use stripplot.
  warnings.warn(msg, UserWarning)
c:\users\32154049\appdata\local\programs\python\python37\lib\site-packages\seaborn\categorical.py:1296: UserWarning: 31.2% of the points cannot be placed; you may want to decrease the size of the markers or use stripplot.
  warnings.warn(msg, UserWarning)

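Heatmap

A plot that represents a matrix of data with colors
sns.heatmap()

covid.select_dtypes("number").corr() # correlation matrix of the numeric columns (recent pandas versions raise an error if text columns are included)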

                             Confirmed        Deaths     Recovered        Active     New cases    New deaths New recovered
Confirmed                     1.000000      0.934698      0.906377      0.927018      0.909720      0.871683      0.859252
Deaths                        0.934698      1.000000      0.832098      0.871586      0.806975      0.814161      0.765114
Recovered                     0.906377      0.832098      1.000000      0.682103      0.818942      0.820338      0.919203
Active                        0.927018      0.871586      0.682103      1.000000      0.851190      0.781123      0.673887
New cases                     0.909720      0.806975      0.818942      0.851190      1.000000      0.935947      0.914765
New deaths                    0.871683      0.814161      0.820338      0.781123      0.935947      1.000000      0.889234
New recovered                 0.859252      0.765114      0.919203      0.673887      0.914765      0.889234      1.000000
Deaths / 100 Cases            0.063550      0.251565      0.048438      0.054380      0.020104      0.060399      0.017090
Recovered / 100 Cases        -0.064815     -0.114529      0.026610     -0.132618     -0.078666     -0.062792     -0.024293
Deaths / 100 Recovered        0.025175      0.169006     -0.027277      0.058386     -0.011637     -0.020750     -0.023340
Confirmed last week           0.999127      0.939082      0.899312      0.931459      0.896084      0.862118      0.839692
1 week change                 0.954710      0.855330      0.910013      0.847642      0.959993      0.894915      0.954321
1 week % increase            -0.010161     -0.034708     -0.013697     -0.003752      0.030791      0.025293      0.032662

                          Deaths / 100 Cases  Recovered / 100 Cases  Deaths / 100 Recovered  Confirmed last week  1 week change  1 week % increase
Confirmed                           0.063550              -0.064815                0.025175             0.999127       0.954710          -0.010161
Deaths                              0.251565              -0.114529                0.169006             0.939082       0.855330          -0.034708
Recovered                           0.048438               0.026610               -0.027277             0.899312       0.910013          -0.013697
Active                              0.054380              -0.132618                0.058386             0.931459       0.847642          -0.003752
New cases                           0.020104              -0.078666               -0.011637             0.896084       0.959993           0.030791
New deaths                          0.060399              -0.062792               -0.020750             0.862118       0.894915           0.025293
New recovered                       0.017090              -0.024293               -0.023340             0.839692       0.954321           0.032662
Deaths / 100 Cases                  1.000000              -0.168920                0.334594             0.069894       0.015095          -0.134534
Recovered / 100 Cases              -0.168920               1.000000               -0.295381            -0.064600      -0.063013          -0.394254
Deaths / 100 Recovered              0.334594              -0.295381                1.000000             0.030460      -0.013763          -0.049083
Confirmed last week                 0.069894              -0.064600                0.030460             1.000000       0.941448          -0.015247
1 week change                       0.015095              -0.063013               -0.013763             0.941448       1.000000           0.026594
1 week % increase                  -0.134534              -0.394254               -0.049083            -0.015247       0.026594           1.000000

# A table of raw numbers is hard to read
# so show the matrix using colors instead
sns.heatmap(covid.select_dtypes("number").corr())
plt.show()
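For comparison, a basic heatmap can also be drawn with Matplotlib alone. The sketch below builds a small made-up DataFrame (the columns a, b, c and the seed are hypothetical), computes its correlation matrix, and colors it with imshow; the Agg backend is used only so it runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: running as a plain script)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)                       # fixed seed, chosen arbitrarily
a = rng.normal(size=50)
toy = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.1, size=50),     # strongly related to a
    "c": rng.normal(size=50),                        # unrelated noise
})
corr = toy.corr()

fig, ax = plt.subplots()
# Fix the color scale to [-1, 1] so colors mean the same across plots
image = ax.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
ax.set_xticks(list(range(len(corr.columns))))
ax.set_xticklabels(list(corr.columns))
ax.set_yticks(list(range(len(corr.index))))
ax.set_yticklabels(list(corr.index))
fig.colorbar(image, ax=ax)
```

sns.heatmap does the labeling and color bar automatically, and its annot=True option additionally writes the value in each cell.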
