7 Fri

딥러닝 CNN 완벽 가이드 - Fundamental 편

Functional API 이용하여 모델 만들기

Functional API가 충분히 쉬운데 Sequential이 너무 쉽다보니 Functional API가 어렵다는 인식이 생긴다.
우리 강의에서는 Sequential을 거의 쓰지 않을 것

Sequential vs Functional API

일반적으로 Seq를 이용하면 모델을 쉽게 생성할 수 있음
하지만 Keras 프레임워크의 핵심은 Func임.
처음부터 Func로 모델 생성 및 활용 기법을 안 뒤에 Seq를 활용하는 것이 바람직함

Sequential

# Sequential Model을 이용하여 Keras 모델 생성 
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential

INPUT_SIZE = 28

model = Sequential([
    Flatten(input_shape=(INPUT_SIZE, INPUT_SIZE)),
    Dense(100, activation='relu'),
    Dense(30, activation='relu'),
    Dense(10, activation='softmax')
])

model.summary()

model1 = Sequential()
model1.add(Flatten(input_shape=(INPUT_SIZE, INPUT_SIZE)))
model1.add(Dense(100, activation='relu'))
model1.add(Dense(30, activation='relu'))
model1.add(Dense(10, activation='softmax'))

model1.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten (Flatten)            (None, 784)               0         
_________________________________________________________________
dense (Dense)                (None, 100)               78500     
_________________________________________________________________
dense_1 (Dense)              (None, 30)                3030      
_________________________________________________________________
dense_2 (Dense)              (None, 10)                310       
=================================================================
Total params: 81,840
Trainable params: 81,840
Non-trainable params: 0
_________________________________________________________________
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten_1 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 100)               78500     
_________________________________________________________________
dense_4 (Dense)              (None, 30)                3030      
_________________________________________________________________
dense_5 (Dense)              (None, 10)                310       
=================================================================
Total params: 81,840
Trainable params: 81,840
Non-trainable params: 0
_________________________________________________________________

처음에는 아래와 같이 model.add 를 사용했지만 요즘은 Sequential 안에 리스트 형태로 담는다
이 때 Sequential은 지금 입력이 input layer에서 오는 입력인지 other layer에서 오는 입력인지 알지 못한다.
- 따라서 이 때 Flatten 내부에 input_shape 로 인자를 받으면서 input layer에서 오는 인자인것을 확인한다

Functional API

from tensorflow.keras.layers import Input, Flatten, Dense
from tensorflow.keras.models import Model

input_tensor = Input(shape=(INPUT_SIZE, INPUT_SIZE))
x = Flatten()(input_tensor)
x = Dense(100, activation='relu')(x)
x = Dense(30, activation='relu')(x)
output = Dense(10, activation='softmax')(x)

model = Model(inputs=input_tensor, outputs=output)

model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 28, 28)]          0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_6 (Dense)              (None, 100)               78500     
_________________________________________________________________
dense_7 (Dense)              (None, 30)                3030      
_________________________________________________________________
dense_8 (Dense)              (None, 10)                310       
=================================================================
Total params: 81,840
Trainable params: 81,840
Non-trainable params: 0

반면에 Functional API는 코드 한 줄을 더 쓰는 대신 Input layer의 존재를 명시한다.

Functional API 구조 이해하기 - 01

Functional API의 필요성

앞 처리 로직의 결과과 이어지는 처리 로직의 입력 데이터로 주어지는 Chain 형태의 프로그래밍 구현 로직에 적합

conv_out_01=Conv2D(filter=32, kernel_size = 3)(input_tensor)

filter=32, kernel_size = 3

Functional API를 구성하는 주요 파라미터

input tensor

Functional API에 입력되는 데이터값

=> 하이퍼 파리미터와 인자를 따로 구분해서 받는 구조인 듯 싶다

Sequential은 Functional API를 좀 더 쉽게 사용할 수 있도록 함

Functional API 구조 이해하기 - 02

Custom한 Dense Layer 생성

from tensorflow.keras.layers import Layer, Input
from tensorflow.keras.models import Model
import tensorflow as tf

class CustomDense(tf.keras.layers.Layer):
    # CustomDense 객체 생성시 입력되는 초기화 parameter 처리
    def __init__(self, units=32):
        super(CustomDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )
        
    # CustomDense 객체에 callable로 입력된 입력 데이터 처리. 
    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

# input 값을 4개의 원소를 가지는 1차원으로 생성. 
inputs = Input((4,))
# 10개의 unit을 가지는 CustomDense 객체를 생성 후 callable로 inputs값 입력 
outputs = CustomDense(10)(inputs)

# inputs와 outputs로 model 생성. 
model = Model(inputs, outputs)
model.summary()

outputs = CustomDense(10)(inputs)

10 은 init 함수로 전달되어 self.units == 10 이 된다.
inputs 는 call 함수로 전달된다
- 이것은 CustomDense가 tf.keras.layers.layer 를 상속받아서 할 수 있는 구조
- __call__ 던던 메서드와 동일한 기능을 한다
callable 인자 입력 부분을 별도로 수행해도 무방하다

inputs = Input((4,))
# 10개의 unit을 가지는 CustomDense 객체를 생성 후 callable로 inputs값 입력 
my_layer = CustomDense(10)
outputs = my_layer(inputs)

# inputs와 outputs로 model 생성. 
model = Model(inputs, outputs)
model.summary()

Sequential Model의 원리

from tensorflow.keras.models import Sequential

model = Sequential([Input((4,)),
                   CustomDense(10),
                   CustomDense(8), 
                   tf.keras.layers.ReLU()])
model.summary()

위와 같은 Seq가 있다고 하자. 이는 아래 코드처럼 구현이 되어있다

layers_list = [Input((4,)), CustomDense(10), CustomDense(8), tf.keras.layers.ReLU()]

inputs = None
callable_inputs = None
outputs = None
# layers_list에 있는 Functional 객체를 iteration 수행하면서 적용. 
for index, layer in enumerate(layers_list):
    # layers_list의 첫번째 인자는 Input 간주. 
    if index == 0:
        inputs = layer
        callable_inputs = layer
    # Functional 객체에 callable 인자로 callable_inputs를 입력하고 반환 결과 값을 다시 callable_inputs로 할당.     
    else: 
        callable_inputs = layer(callable_inputs)
    
outputs = callable_inputs
model = Model(inputs, outputs)
model.summary()

첫 레이어(index == 0)는 callable_input 을 자기 자신으로 설정하며 그 다음 레이어부터 layer(callable_inputs) 로 설정하는 것을 볼 수있다.
이 코드는 간단한 구조이고 실제로는 조금 더 복잡한 처리들이 있다.
여기서 중요한 점은 Seq 역시 Func API로 구성되어 있다는 것이다.

Dense Layer로 Fashion MNIST 예측 모델 Live Coding 으로 구현 정리 - 01

실습

Dense Layer로 Fashion MNIST 예측 모델 Live Coding 으로 구현 정리 - 02

실습

Keras Callback 개요

대표적으로 학습률을 동적으로 조정하고 싶을 때 사용한다.

그 이외에도 Epoch가 돌아갈 때 동적으로 무언가를 건드리고 싶을 때 사용한다

등록될 수 있는 Callback 들은 다음과 같다

ModelCheckpoint()
ReduceLROnPlateau()
LearningRateScheduler()
EarlyStopping()
TensorBoard()

Keras Callback 실습 - ModelCheckpoint, ReduceLROnPlateau, EarlyStopping

ModelCheckpoint

주기적으로 모델을 파일로 저장하는 것
- 모델이 돌아가는데 6시간이 걸린다고 하면 매번 모델을 돌릴 수 없으니 학습된 모델을 파일로 저장하게 된다
- 이 때 학습이 끝나고 모델을 저장할 수도 있지만 주기적으로 하게 된다
- 왜냐하면, 모델이 돌다가 리소스 부족으로 종료될 수도 있고 학습이 끝난 시점이 모델이 최고 성능을 내는 지점을 지났을 수도 있기 때문이다

ModelCheckpoint(filepath,
                monitor='val_loss',
                verbose=0, 
                save_best_only=False,
                save_weights_only=False,
                mode='auto',
                period=1)

특정 조건에 맞춰서 모델을 파일로 저장
filepath: filepath는 (on_epoch_end에서 전달되는) epoch의 값과 logs의 키로 채워진 이름 형식 옵션을 가질 수 있음. 예를 들어 filepath가 weights.{epoch:02d}-{val_loss:.2f}.hdf5라면, 파일 이름에 세대 번호와 검증 손실을 넣어 모델의 체크포인트가 저장
monitor: 모니터할 지표(loss 또는 평가 지표)
save_best_only: 가장 좋은 성능을 나타내는 모델만 저장할 여부
- 100개를 다 저장해두면 용량이 너무 크니까 가장 좋은 성능의 모델만 저장할지에 대한 여부
save_weights_only: Weights만 저장할 지 여부
- 보통은 모델보다 weights만을 저장하는 것이 좋다
- True로 설정하는 것이 좋음
mode: {auto, min, max} 중 하나. monitor 지표가 감소해야 좋을 경우 min, 증가해야 좋을 경우 max, auto는 monitor 이름에서 자동으로 유추.
- val-loss나 val-accuracy의 성능 지표는 loss같은 경우는 작을 수록 좋고 accuracy는 클수록 좋기 때문에 성능지표에 맞추어 max, min, auto로 설정
period : 몇 epoch마다 저장할지

ReduceLROnPlateau

Plateau는 안정 기에 달하다 라는 뜻
- OnPlateau : 안정적이게 될때까지
- LR : 에러를
- Reduce : 감소시켜라
- 라는 뜻

ReduceLROnPlateau(monitor='val_loss',
                factor=0.1,
                patience=10,
                verbose=0,
                mode='auto',
                min_delta=0.0001,
                cooldown=0,
                min_lr=0)

특정 epochs 횟수동안 성능이 개선 되지 않을 시 Learning rate를 동적으로 감소 시킴
monitor: 모니터할 지표(loss 또는 평가 지표)
factor: 학습 속도를 줄일 인수. new_lr = lr * factor
patience: Learing Rate를 줄이기 전에 monitor할 epochs 횟수.
mode: {auto, min, max} 중 하나. monitor 지표가 감소해야 좋을 경우 min, 증가해야 좋을 경우 max, auto는 monitor 이름에서 유추.

EarlyStopping

훈련 데이터의 오류는 줄지만 검증 데이터의 오류는 증가할 수 있기 때문에 어느 정도의 학습이 진행되면 멈추는 함수

EarlyStopping(monitor='val_loss',
                min_delta=0,
                patience=0,
                verbose=0,
                mode='auto',
                baseline=None,
                restore_best_weights=False)

특정 epochs 동안 성능이 개선되지 않을 시 학습을 조기에 중단
monitor: 모니터할 지표(loss 또는 평가 지표)
patience: Early Stopping 적용 전에 monitor할 epochs 횟수.
mode: {auto, min, max} 중 하나. monitor 지표가 감소해야 좋을 경우 min, 증가해야 좋을 경우 max, auto는 monitor 이름에서 유추.

보통은 하나씩 쓰지 않고 모두 한꺼번에 다 쓴다.

from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping

model = create_model()
model.compile(optimizer=Adam(0.001), loss='categorical_crossentropy', metrics=['accuracy'])

mcp_cb = ModelCheckpoint(filepath='/kaggle/working/weights.{epoch:02d}-{val_loss:.2f}.hdf5', monitor='val_loss', 
                         save_best_only=True, save_weights_only=True, mode='min', period=1, verbose=0)
rlr_cb = ReduceLROnPlateau(monitor='val_loss', factor=0.3, patience=5, mode='min', verbose=1)
ely_cb = EarlyStopping(monitor='val_loss', patience=7, mode='min', verbose=1)

history = model.fit(x=tr_images, y=tr_oh_labels, batch_size=128, epochs=40, validation_data=(val_images, val_oh_labels),
                   callbacks=[mcp_cb, rlr_cb, ely_cb])

model.fit(callbacks=[]) 로 callback 함수들을 적용시켜준 모습을 볼 수 있다

Numpy array와 Tensor 차이, 그리고 fit() 메소드 상세 설명

Numpy 특징

SIMD, Single Instruction Multiple Data 기반으로 수행속도를 최적화 할 수 있고 매우 빠르게 대량 데이터의 수치 연산을 수행할 수 있다
- simd는 명령 한번에 여러가지 처리를 할 수 있다는 뜻
넘파이가 없었다면 파이썬의 머신러닝 운용은 불가능

SIMD

병렬 프로세서의 한 종류로, 하나의 명령어로 여러개의 값을 동시에 계산하는 방식
비디오 게임, 컴퓨터 그래픽스, HPC, High Performance Computing 등의 다양한 분야에서 활용

Numpy Array vs Tensor

Numpy

Numpy는 GPU를 지원하지 않음
보다 범용적인 영역(이미지 처리, 자연과학/공학)에서 처리

Tensor

Tensor는 CPU와 GPU를 모두 지원한다
- 딥러닝 학습은 CPU SIMD 기반의 Numpy로는 감당할 수 없을 정도의 많은 연산이 필요
- 딥러닝 학습을 위해 GPU가 필요
딥러닝 전용의 기능을 가지고 있음
Tensorflow/Keras, Pytorch 등의 딥러닝 프레임워크 별로 기능적인 특성을 반영한 추가 기능들이 있음
최근 추세는 Numpy의 범용적인 영역까지 처리 영역을 확충하고 있음

모델

모델에 입력되는 인자는 np.array 이다.
이 때 모델의 Input layer에서 np -> tensor로 변환한다.
따라서 Flatten layer에서의 입력은 이미 tensor이다.

Previous8 Sat Next6 Thu

Last updated 4 years ago

Was this helpful?