1. Abstract

2. Introduction

architecture

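Modeled on ResNet and VGG
- optional shortcut : uses ResNet-style shortcut paths; depth is extended from the 6 layers of earlier models to as many as 29 layers
- filter : adapts VGG's size-3 filters, which act like 3-grams; the size is kept small so that character-level properties are preserved
- effect: reduced memory usage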
look-up-table
- acts as the input layer and produces a 2D tensor
- tensor size = embeddings of the $s$ characters = $(f_{0}, s)$
- $f_{0}$ = dimension of the input text (the character embedding size), $s$ = 1024 (fixed)
- takes character-level input
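
The look-up table can be illustrated with a minimal NumPy sketch. The alphabet, the use of index 0 for padding and unknown characters, and the random initialisation are assumptions made for illustration only; $f_{0} = 16$ is the embedding size listed in the training settings of these notes.

```python
import numpy as np

# Hypothetical alphabet for illustration; index 0 is reserved for padding/unknown.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,;!?'"
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode(text, s=1024):
    """Map text to a fixed-length sequence of s character indices."""
    indices = [CHAR_TO_INDEX.get(ch, 0) for ch in text.lower()[:s]]
    indices += [0] * (s - len(indices))  # pad short inputs with index 0
    return np.array(indices, dtype=np.int64)


def look_up(indices, table):
    """Return the (f0, s) tensor of character embeddings."""
    # table has shape (vocabulary size, f0); indexing gives (s, f0).
    return table[indices].T


f0 = 16  # character embedding size
rng = np.random.default_rng(0)
# Randomly initialised here; in a real model the table is learned.
table = rng.normal(size=(len(ALPHABET) + 1, f0))

x = look_up(encode("Very deep CNN"), table)
print(x.shape)  # (16, 1024)
```

Inputs longer than $s$ characters are truncated and shorter ones are padded, so every text yields a tensor of the same size.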
At the end, k-max pooling extracts $512 \times k$ features, which are flattened into a vector and classified by a fully connected ReLU classifier.
Only temporal batch norm is used for regularization; dropout is not used.
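
The k-max pooling step can be sketched as follows. This is a minimal NumPy illustration, assuming the usual definition in which the k largest activations of each feature map are kept in their original temporal order. The value `k = 8` and the temporal length 128 are assumptions, since the notes do not state them.

```python
import numpy as np


def k_max_pooling(features, k):
    """Keep the k largest activations per feature map, in their original order.

    features: array of shape (channels, time).
    returns:  array of shape (channels, k).
    """
    # Indices of the k largest values along the time axis (unordered).
    top = np.argpartition(features, -k, axis=1)[:, -k:]
    # Sort the indices so the selected values keep their temporal order.
    top = np.sort(top, axis=1)
    return np.take_along_axis(features, top, axis=1)


rng = np.random.default_rng(0)
features = rng.normal(size=(512, 128))  # 512 feature maps; length 128 is illustrative
k = 8                                   # assumed value of k
pooled = k_max_pooling(features, k)
vector = pooled.reshape(-1)             # flattened input to the classifier
print(vector.shape)  # (4096,)
```

Because the output size depends only on the number of feature maps and on k, the classifier always receives a vector of $512 \times k$ values.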

convolutional block

A sequence of convolutional layers in the form Convolution -> Batch Normalization -> ReLU
Temporal batch normalization applies the same kind of regularization as batch normalization
- the activations in a mini-batch are jointly normalized over temporal (instead of spatial) locations
The small filter size keeps the number of parameters low
- this makes it possible to greatly increase the depth of the network by stacking convolution layers
The depth of the overall architecture is controlled by the number of convolutional blocks
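
One Convolution -> Batch Normalization -> ReLU step can be sketched in plain NumPy as below. This is an illustration rather than the authors' implementation: the tensor sizes, the zero padding, and the function names are assumptions, and the batch norm uses only the statistics of the current mini-batch (no running averages).

```python
import numpy as np


def conv1d(x, weight, bias):
    """Temporal convolution with zero padding that preserves the length.

    x: (batch, in_channels, time); weight: (out_channels, in_channels, kernel).
    """
    kernel = weight.shape[2]
    pad = kernel // 2
    padded = np.pad(x, ((0, 0), (0, 0), (pad, pad)))
    time = x.shape[2]
    out = np.zeros((x.shape[0], weight.shape[0], time))
    for offset in range(kernel):
        # Contribution of filter tap `offset` at every time step.
        window = padded[:, :, offset:offset + time]
        out += np.einsum("bct,oc->bot", window, weight[:, :, offset])
    return out + bias[None, :, None]


def temporal_batch_norm(x, gamma, beta, eps=1e-5):
    """Normalise each channel jointly over the batch and temporal axes."""
    mean = x.mean(axis=(0, 2), keepdims=True)
    var = x.var(axis=(0, 2), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma[None, :, None] * x_hat + beta[None, :, None]


def conv_bn_relu(x, weight, bias, gamma, beta):
    """Convolution -> temporal batch normalisation -> ReLU."""
    y = conv1d(x, weight, bias)
    y = temporal_batch_norm(y, gamma, beta)
    return np.maximum(y, 0.0)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16, 32))             # (batch, channels, time), illustrative sizes
weight = rng.normal(size=(64, 16, 3)) * 0.1  # 64 filters of size 3
bias = np.zeros(64)
gamma, beta = np.ones(64), np.zeros(64)

y = conv_bn_relu(x, weight, bias, gamma, beta)
print(y.shape)  # (4, 64, 32)
```

With filter size 3, the layer in this sketch has 3 × 16 × 64 + 64 = 3136 convolution parameters, which illustrates why many such layers can be stacked cheaply.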

result1

Trained with character embedding size 16, mini-batch size 128, learning rate 0.01, and momentum 0.9.
For the pooling applied between blocks, max-pooling worked better than the other pooling methods.
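
The training settings listed above can be collected into a small configuration, together with a sketch of how the learning rate and momentum enter a parameter update. The notes do not name the optimizer, so SGD with momentum in one common formulation is assumed here, and the quadratic toy objective exists only to exercise the update.

```python
# Hyperparameters taken from the notes.
config = {
    "embedding_size": 16,
    "batch_size": 128,
    "learning_rate": 0.01,
    "momentum": 0.9,
}


def sgd_momentum_step(params, grads, velocity, lr, momentum):
    """One update: v <- momentum * v - lr * g, then p <- p + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


# Toy objective f(p) = p^2 with gradient 2p (assumed, for illustration only).
params, velocity = [1.0], [0.0]
for _ in range(200):
    grads = [2.0 * p for p in params]
    params, velocity = sgd_momentum_step(
        params, grads, velocity, config["learning_rate"], config["momentum"]
    )
print(abs(params[0]) < 1e-2)
```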

result2