Preventing Overfitting
Dropout
model.add(layers.Dropout(0.2))
→ Dropout helps prevent overfitting,
but because it has the same effect as shrinking the number of nodes, training accuracy and training speed drop somewhat
→ In the end, the rate is a hyperparameter: what matters is finding the dropout percentage that works best, e.g. whether Dropout(0.2) is optimal (see the rate sweep after the model below)!
model = keras.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(256, activation='relu'),
keras.layers.Dropout(0.2),
keras.layers.Dense(256, activation='relu'),
keras.layers.Dropout(0.2),
keras.layers.Dense(10, activation='softmax')
])
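Since the best rate depends on the data and the architecture, one practical approach is to train the same model at a few candidate rates and compare validation accuracy. A minimal sweep sketch, assuming x_train and y_train are already loaded and scaled (the rate grid and optimizer choice are assumptions):

from tensorflow import keras

for rate in [0.1, 0.2, 0.3, 0.5]:  # candidate dropout rates (assumed grid)
    model = keras.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        keras.layers.Dense(256, activation='relu'),
        keras.layers.Dropout(rate),
        keras.layers.Dense(256, activation='relu'),
        keras.layers.Dropout(rate),
        keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    # x_train / y_train are assumed to exist; validation_split holds out 20% for comparison
    history = model.fit(x_train, y_train, epochs=5, validation_split=0.2, verbose=0)
    print(rate, max(history.history['val_accuracy']))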
Keras v2 and above
DropConnect in Keras
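Core Keras does not ship a DropConnect layer. Where dropout zeroes activations, DropConnect zeroes individual weights during training; the custom layer below is a minimal illustrative sketch (the class name DropConnectDense and its drop_rate argument are assumptions, not a Keras API).

import tensorflow as tf
from tensorflow import keras

class DropConnectDense(keras.layers.Dense):
    # Dense layer that drops individual weights (not activations) while training
    def __init__(self, units, drop_rate=0.2, **kwargs):
        super().__init__(units, **kwargs)
        self.drop_rate = drop_rate

    def call(self, inputs, training=None):
        if training:
            # tf.nn.dropout zeroes random kernel entries and rescales the rest by 1/(1-rate)
            kernel = tf.nn.dropout(self.kernel, rate=self.drop_rate)
            outputs = tf.matmul(inputs, kernel)
            if self.use_bias:
                outputs = tf.nn.bias_add(outputs, self.bias)
            return self.activation(outputs) if self.activation is not None else outputs
        return super().call(inputs)

# Usage: model.add(DropConnectDense(256, drop_rate=0.2, activation='relu'))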
SGD optimizer
sgd = keras.optimizers.SGD(learning_rate=0.001)
model.compile(optimizer=sgd, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
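Whether a given configuration actually curbs overfitting shows up as the gap between training and validation accuracy. A short check using Keras's standard validation_split argument (x_train and y_train are assumed to be loaded):

# A widening train/validation gap over epochs signals overfitting
history = model.fit(x_train, y_train, epochs=10, validation_split=0.2)
print('train acc:', history.history['accuracy'][-1])
print('val   acc:', history.history['val_accuracy'][-1])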
Weight initialization
He initialization
model.add(layers.Dense(64, kernel_initializer='he_normal', activation='relu'))  # He-normal init suits ReLU layers
L1 regularization
model.add(layers.Dense(64, activity_regularizer=tf.keras.regularizers.l1(0.01), activation='relu'))  # L1 penalty on the layer's outputs (activations)
L2 regularization
model.add(layers.Dense(64, kernel_regularizer=tf.keras.regularizers.l2(0.01), activation='relu'))  # L2 penalty on the layer's weights
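Keras also provides a combined regularizer when both penalties should act on the weights at once:

model.add(layers.Dense(64, kernel_regularizer=tf.keras.regularizers.l1_l2(l1=0.01, l2=0.01), activation='relu'))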
# MLP combining He initialization, L1 activity / L2 kernel regularization, and dropout
model = keras.Sequential([
keras.layers.Flatten(input_shape=(28, 28)),
keras.layers.Dense(256, activity_regularizer=tf.keras.regularizers.l1(0.01), kernel_initializer='he_normal', kernel_regularizer=tf.keras.regularizers.l2(0.01), activation='relu'),
keras.layers.Dropout(0.2),
keras.layers.Dense(256, activity_regularizer=tf.keras.regularizers.l1(0.01), kernel_initializer='he_normal', kernel_regularizer=tf.keras.regularizers.l2(0.01), activation='relu'),
keras.layers.Dropout(0.2),
keras.layers.Dense(10, activity_regularizer=tf.keras.regularizers.l1(0.01), kernel_initializer='he_normal', kernel_regularizer=tf.keras.regularizers.l2(0.01), activation='softmax')
])
CNN layers + MLP (switch the runtime to GPU)
model = keras.Sequential([
keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
keras.layers.MaxPooling2D((2, 2)),
keras.layers.Conv2D(64, (3, 3), activation='relu'),
keras.layers.MaxPooling2D((2, 2)),
keras.layers.Conv2D(64, (3, 3), activation='relu'),
keras.layers.Flatten(),
keras.layers.Dense(64, activity_regularizer=tf.keras.regularizers.l1(0.01), kernel_initializer='he_normal', kernel_regularizer=tf.keras.regularizers.l2(0.01), activation='relu'),
keras.layers.Dropout(0.2),
keras.layers.Dense(10, activity_regularizer=tf.keras.regularizers.l1(0.01), kernel_initializer='he_normal', kernel_regularizer=tf.keras.regularizers.l2(0.01), activation='softmax')
])
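Note input_shape=(28, 28, 1): Conv2D expects a channel axis, so flat 28x28 images must be reshaped before training. A minimal sketch (the data variables and epoch count are assumptions):

x_train = x_train.reshape(-1, 28, 28, 1)  # add the channel dimension Conv2D expects
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_split=0.2)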
Batch normalization & CNN
model.add(layers.BatchNormalization())
# Same CNN with a BatchNormalization layer after each pooling step
model = keras.Sequential([
keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
keras.layers.MaxPooling2D((2, 2)),
keras.layers.BatchNormalization(),
keras.layers.Conv2D(64, (3, 3), activation='relu'),
keras.layers.MaxPooling2D((2, 2)),
keras.layers.BatchNormalization(),
keras.layers.Conv2D(64, (3, 3), activation='relu'),
keras.layers.Flatten(),
keras.layers.Dense(64, activity_regularizer=tf.keras.regularizers.l1(0.01), kernel_initializer='he_normal', kernel_regularizer=tf.keras.regularizers.l2(0.01), activation='relu'),
keras.layers.Dropout(0.2),
keras.layers.Dense(10, activity_regularizer=tf.keras.regularizers.l1(0.01), kernel_initializer='he_normal', kernel_regularizer=tf.keras.regularizers.l2(0.01), activation='softmax')
])
Full code
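A sketch assembling the pieces above into one end-to-end script; the MNIST dataset, the 255.0 scaling, and the epoch count are assumptions for illustration, not part of the original notes.

import tensorflow as tf
from tensorflow import keras

# Assumed dataset: MNIST, scaled to [0, 1] and reshaped for Conv2D
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1) / 255.0
x_test = x_test.reshape(-1, 28, 28, 1) / 255.0

# CNN with batch normalization, He initialization, L1/L2 regularization, and dropout
model = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.BatchNormalization(),
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activity_regularizer=tf.keras.regularizers.l1(0.01),
                       kernel_initializer='he_normal',
                       kernel_regularizer=tf.keras.regularizers.l2(0.01),
                       activation='relu'),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10, activity_regularizer=tf.keras.regularizers.l1(0.01),
                       kernel_initializer='he_normal',
                       kernel_regularizer=tf.keras.regularizers.l2(0.01),
                       activation='softmax')
])

sgd = keras.optimizers.SGD(learning_rate=0.001)
model.compile(optimizer=sgd, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, validation_split=0.2)
model.evaluate(x_test, y_test)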