DAY 2 : Labeling

210824

The existing data looks like this:

data

          id  gender   race  age                    path
0     000001  female  Asian   45  000001_female_Asian_45
1     000002  female  Asian   52  000002_female_Asian_52
2     000004    male  Asian   54    000004_male_Asian_54
3     000005  female  Asian   58  000005_female_Asian_58
4     000006  female  Asian   59  000006_female_Asian_59
...      ...     ...    ...  ...                     ...
2695  006954    male  Asian   19    006954_male_Asian_19
2696  006955    male  Asian   19    006955_male_Asian_19
2697  006956    male  Asian   19    006956_male_Asian_19
2698  006957    male  Asian   20    006957_male_Asian_20
2699  006959    male  Asian   19    006959_male_Asian_19

2700 rows × 5 columns

As shown above, the current dataframe is a table with the columns id, gender, race, age, and path. There are two reasons it needs labeling.

1. The dataframe currently has one row per folder, and each folder holds the 7 photos of one person. To access each photo individually later, the dataframe has to be expanded to one row per image, with a column that holds each image's path.

2. The rows do not state the class directly, so classifying in the model is possible but inconvenient. To keep the GPU as efficient as possible, it is also better to do this kind of preprocessing on the CPU ahead of time. A label has to be added based on age, gender, and mask-wearing status. Mask status is determined from the image file name.

Each class is built from the following offsets.

  • Mask worn correctly : +0 | Mask worn incorrectly : +6 | No mask : +12

  • Male : +0 | Female : +3

  • Under 30 : +0 | 30 or older and under 60 : +1 | 60 or older : +2

So instead of branching on every case with conditionals, each attribute can be turned into a formula, which makes labeling easy.

  • Mask

    • If the file name contains 'incorrect', add +6

    • If the file name contains 'normal' (the no-mask photo), add +12

  • Gender

    • Male and female have to differ by 3. The only thing that can be compared from the strings alone is their length, so that is what gets used: 'male' is 4 characters and 'female' is 6 characters.

    • The lengths differ by 2, so to make the difference 3 the length is multiplied by 1.5. That gives 6 for male and 9 for female, and subtracting 6 yields +0 and +3.

  • Age

    • The age groups are 30 years wide, so the quotient of the age divided by 30 is used as the offset.
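The formulas above can be checked in isolation with a small sketch. The function names `encode_label` and `decode_label` and the example file names are hypothetical, chosen only for illustration. The cap at 2 for the age group is an added assumption so that ages of 90 and above stay in the "60 or older" group; the formula in these notes does not include it.

```python
from typing import Tuple


def encode_label(gender: str, age: int, filename: str) -> int:
    """Combine the mask, gender, and age offsets into one class index (0-17)."""
    # Age: quotient by 30 gives 0, 1, 2 (cap at 2 is an added assumption).
    age_part = min(int(age) // 30, 2)
    # Gender: len('male') = 4 -> 0, len('female') = 6 -> 3.
    gender_part = int(len(gender) * 1.5 - 6)
    # Mask: any other file name counts as "worn correctly" (+0).
    if "incorrect" in filename:
        mask_part = 6
    elif "normal" in filename:
        mask_part = 12
    else:
        mask_part = 0
    return mask_part + gender_part + age_part


def decode_label(label: int) -> Tuple[int, int, int]:
    """Split a class index back into (mask, gender, age) group indices."""
    mask_group, rest = divmod(label, 6)
    gender_group, age_group = divmod(rest, 3)
    return mask_group, gender_group, age_group


# Hypothetical file names, used only to exercise the three mask cases.
print(encode_label("female", 45, "mask1.jpg"))          # 4
print(encode_label("male", 19, "incorrect_mask.jpg"))   # 6
print(encode_label("male", 19, "normal.jpg"))           # 12
print(decode_label(4))                                  # (0, 1, 1)
```

Because the offsets are 6, 3, and 1, every combination of three mask states, two genders, and three age groups lands on a distinct value from 0 to 17, which is why the label can also be split back into its parts with `divmod`.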

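import os
from glob import glob

import pandas as pd

# DATA_DIR (image root) and FILES (per-person file name patterns) are assumed to be defined earlier.
data2 = []

def new_dataframe(x):
    """Append one row per image for the person folder named x."""
    # Folder names look like 000001_female_Asian_45.
    person_id, gender, race, age = x.split('_')
    for filename in FILES:
        # Resolve the pattern to the actual file on disk.
        path = os.path.join(DATA_DIR, x, filename)
        path = glob(path)[0]
        # Age group (quotient by 30) plus gender offset (male 0, female 3).
        label = (int(age) // 30) + (len(gender) * 1.5 - 6)
        # Mask offset from the file name; anything else counts as worn correctly (+0).
        if 'incorrect' in filename:
            label += 6
        elif 'normal' in filename:
            label += 12
        data2.append([gender, age, path, int(label)])

# apply is used here only for its side effect of filling data2.
data['path'].apply(new_dataframe)
data2 = pd.DataFrame(data=data2, columns=['gender', 'age', 'path', 'label'])
data2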

       gender age                                               path  label
0      female  45  ./input/data/train/images/000001_female_Asian_...      4
1      female  45  ./input/data/train/images/000001_female_Asian_...      4
2      female  45  ./input/data/train/images/000001_female_Asian_...      4
3      female  45  ./input/data/train/images/000001_female_Asian_...      4
4      female  45  ./input/data/train/images/000001_female_Asian_...      4
...       ...  ..                                                ...    ...
18895    male  19  ./input/data/train/images/006959_male_Asian_19...      0
18896    male  19  ./input/data/train/images/006959_male_Asian_19...      0
18897    male  19  ./input/data/train/images/006959_male_Asian_19...      0
18898    male  19  ./input/data/train/images/006959_male_Asian_19...      6
18899    male  19  ./input/data/train/images/006959_male_Asian_19...     12

18900 rows × 4 columns

Finally, to avoid rebuilding the dataframe every time, it is saved as a new csv file that can simply be loaded later.

data2.to_csv("train_data.csv", mode='w', index=False)
  • Setting mode to w overwrites the file. To append to an existing file instead, set it to a. When appending, the header row is written again unless header=False is passed.

  • Without index=False, the index ends up as an extra column when the file is loaded again. to_csv writes the index by default, and read_csv then adds a fresh default index on top of it.
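The effect of index=False can be seen with a small round trip. This is a minimal sketch that uses an in-memory buffer and a made-up two-row frame instead of the real train_data.csv.

```python
import io

import pandas as pd

# Made-up stand-in for data2, just large enough to show the columns.
df = pd.DataFrame({"gender": ["female", "male"], "label": [4, 0]})

# Default (index=True): the row index is written as an unnamed first column.
buf_with_index = io.StringIO()
df.to_csv(buf_with_index)
buf_with_index.seek(0)
cols_with_index = pd.read_csv(buf_with_index).columns.tolist()
print(cols_with_index)      # ['Unnamed: 0', 'gender', 'label']

# index=False: only the real columns are written.
buf_without_index = io.StringIO()
df.to_csv(buf_without_index, index=False)
buf_without_index.seek(0)
cols_without_index = pd.read_csv(buf_without_index).columns.tolist()
print(cols_without_index)   # ['gender', 'label']
```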
