Preventing Overfitting

Dropout

model.add(layers.Dropout(0.2))

→ Overfitting is reduced, but because dropout has the same effect as shrinking the number of nodes, training accuracy and processing speed drop somewhat.

→ In the end this is a hyperparameter-tuning problem: what matters is finding the dropout rate that actually works best (is Dropout(0.2) optimal, or some other percentage?). A tuning sketch follows the model below.

import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dropout(0.2),   # randomly zero 20% of activations during training
    keras.layers.Dense(256, activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax')
])
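Since the best rate is data-dependent, one simple way to pick it is to train the same architecture with a few candidate rates and compare validation accuracy. A minimal sketch, assuming MNIST via keras.datasets and an illustrative build_model helper (neither is from the original notes):

import tensorflow as tf
from tensorflow import keras

(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train / 255.0  # scale pixels to [0, 1]

def build_model(rate):
    # Same architecture as above, parameterized by the dropout rate.
    return keras.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        keras.layers.Dense(256, activation='relu'),
        keras.layers.Dropout(rate),
        keras.layers.Dense(256, activation='relu'),
        keras.layers.Dropout(rate),
        keras.layers.Dense(10, activation='softmax'),
    ])

for rate in (0.1, 0.2, 0.3, 0.5):
    model = build_model(rate)
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    hist = model.fit(x_train, y_train, epochs=5,
                     validation_split=0.2, verbose=0)
    print(rate, max(hist.history['val_accuracy']))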

Keras v2 and later

Keras DropConnect
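DropConnect is a relative of dropout that zeroes individual weights rather than activations. Keras has no built-in layer for it, so the sketch below subclasses Dense as an illustration (DropConnectDense and drop_rate are my names, not a Keras API):

import tensorflow as tf
from tensorflow import keras

class DropConnectDense(keras.layers.Dense):
    """Dense layer that drops random weights (not activations) while training."""
    def __init__(self, units, drop_rate=0.2, **kwargs):
        super().__init__(units, **kwargs)
        self.drop_rate = drop_rate

    def call(self, inputs, training=None):
        kernel = self.kernel
        if training:
            # Zero a random subset of weights; rescale to keep expectations equal.
            keep = tf.cast(
                tf.random.uniform(tf.shape(kernel)) >= self.drop_rate,
                kernel.dtype)
            kernel = kernel * keep / (1.0 - self.drop_rate)
        outputs = tf.matmul(inputs, kernel)
        if self.use_bias:
            outputs = tf.nn.bias_add(outputs, self.bias)
        if self.activation is not None:
            outputs = self.activation(outputs)
        return outputs

# Drop-in replacement for Dense, e.g.:
# keras.layers.Dense(256, activation='relu') -> DropConnectDense(256, drop_rate=0.2, activation='relu')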

SGD optimizer

# Preventing overfitting at compile time: SGD with a small learning rate
sgd = keras.optimizers.SGD(learning_rate=0.001)
model.compile(optimizer=sgd, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

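To see whether the gap between training and validation accuracy actually narrows, train with a validation split (x_train/y_train are assumed to be loaded as in the tuning sketch above; validation_split=0.2 is an illustrative choice):

hist = model.fit(x_train, y_train, epochs=10, validation_split=0.2)
# A widening gap between accuracy and val_accuracy signals overfitting;
# with regularization in place the two curves should stay closer together.
print(hist.history['accuracy'][-1], hist.history['val_accuracy'][-1])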

Weight Initialization

He initialization

model.add(layers.Dense(64, kernel_initializer='he_normal', activation='relu'))
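For reference, the 'he_normal' string is shorthand for an initializer object; in recent TF 2.x the object form also takes an optional seed for reproducibility (a sketch, the seed value is arbitrary):

init = tf.keras.initializers.HeNormal(seed=42)  # seed is optional
model.add(layers.Dense(64, kernel_initializer=init, activation='relu'))
# Without kernel_initializer, Dense defaults to 'glorot_uniform' (Xavier).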

L1 regularization

model.add(layers.Dense(64, activity_regularizer=tf.keras.regularizers.l1(0.01), activation='relu'))

L2 regularization

model.add(layers.Dense(64, kernel_regularizer=tf.keras.regularizers.l2(0.01), activation='relu'))
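Note the distinction in the two snippets above: kernel_regularizer penalizes the weights themselves, while activity_regularizer penalizes the layer's outputs. Both L1 and L2 penalties can also be applied to the weights with a single combined regularizer:

# L1 and L2 penalties applied together to the weights.
reg = tf.keras.regularizers.l1_l2(l1=0.01, l2=0.01)
model.add(layers.Dense(64, kernel_regularizer=reg, activation='relu'))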

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(256, activation='relu', kernel_initializer='he_normal',
                       kernel_regularizer=tf.keras.regularizers.l2(0.01),
                       activity_regularizer=tf.keras.regularizers.l1(0.01)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(256, activation='relu', kernel_initializer='he_normal',
                       kernel_regularizer=tf.keras.regularizers.l2(0.01),
                       activity_regularizer=tf.keras.regularizers.l1(0.01)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax', kernel_initializer='he_normal',
                       kernel_regularizer=tf.keras.regularizers.l2(0.01),
                       activity_regularizer=tf.keras.regularizers.l1(0.01))
])
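A quick way to confirm the penalties are wired in: Keras accumulates them in model.losses and adds them to the training loss during fit(). The expected count below is my reading of the model above (kernel penalties register at build time; activity penalties only appear after a forward pass):

print(len(model.losses))  # expected: 3 L2 kernel penalties, one per Dense layer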

CNN layers + MLP (set the Colab runtime to GPU for faster training)

model = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation='relu', kernel_initializer='he_normal',
                       kernel_regularizer=tf.keras.regularizers.l2(0.01),
                       activity_regularizer=tf.keras.regularizers.l1(0.01)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax', kernel_initializer='he_normal',
                       kernel_regularizer=tf.keras.regularizers.l2(0.01),
                       activity_regularizer=tf.keras.regularizers.l1(0.01))
])
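One practical note: Conv2D expects a channel axis, so MNIST-style (28, 28) grayscale arrays must be reshaped to (28, 28, 1) before training (x_train here is an assumed NumPy array of images):

import numpy as np

# Add a trailing channel dimension: (N, 28, 28) -> (N, 28, 28, 1)
x_train = np.expand_dims(x_train, -1).astype('float32') / 255.0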

Batch Normalization & CNN

model.add(layers.BatchNormalization())

model = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation='relu', kernel_initializer='he_normal',
                       kernel_regularizer=tf.keras.regularizers.l2(0.01),
                       activity_regularizer=tf.keras.regularizers.l1(0.01)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax', kernel_initializer='he_normal',
                       kernel_regularizer=tf.keras.regularizers.l2(0.01),
                       activity_regularizer=tf.keras.regularizers.l1(0.01))
])
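The model above applies BatchNormalization after pooling. Another widely used ordering, shown below as an alternative sketch (not the original notes' layout), normalizes pre-activations between Conv2D and the ReLU; use_bias=False is common there because BatchNorm's own shift parameter makes the conv bias redundant:

model = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), use_bias=False, input_shape=(28, 28, 1)),
    keras.layers.BatchNormalization(),   # normalize pre-activations
    keras.layers.Activation('relu'),
    keras.layers.MaxPooling2D((2, 2)),
])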

Full code

Google Colaboratory
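The full notebook lives in the linked Colab. Below is a minimal end-to-end sketch of the same pipeline; the MNIST dataset choice is an assumption (the notes only imply a 28×28, 10-class task):

import tensorflow as tf
from tensorflow import keras

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = (x_train / 255.0)[..., tf.newaxis]   # scale and add channel axis
x_test = (x_test / 255.0)[..., tf.newaxis]

model = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation='relu', kernel_initializer='he_normal',
                       kernel_regularizer=tf.keras.regularizers.l2(0.01)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activation='softmax'),
])

model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_split=0.2)
print(model.evaluate(x_test, y_test))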