Hey everyone, I've got a question:

I am testing speech recognition with a CNN + BLSTMP network (https://github.com/espnet/espnet). The result is comparable to using only the BLSTMP. However, when I add BN to the CNN layers, the result becomes worse. I have already implemented a network with similar characteristics for another sequence application, and there BN improves the result even when the decode batch size is . Has anyone tried BN in a CNN for sequence training? Does it work well? Does it depend on the framework (TF, Chainer or PyTorch), or does BN just not work well for speech sequences (I would need a reference for that), so that it would be better to implement a normalization along the sequence?
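To make the last question concrete, here is a minimal sketch of what I mean by normalizing "along the sequence" versus BN. It is plain Python, not ESPnet code. The function names `batch_norm_train` and `sequence_norm` and the toy utterances are made up for illustration, each frame is reduced to a single scalar feature, and the learned scale/shift and the running averages that BN uses at inference time are left out.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability (assumed value)


def batch_norm_train(batch):
    """Training-mode BN for one channel: statistics are pooled over
    every frame of every utterance in the batch."""
    frames = [x for utt in batch for x in utt]
    mu, var = fmean(frames), pvariance(frames)
    return [[(x - mu) / (var + EPS) ** 0.5 for x in utt] for utt in batch]


def sequence_norm(batch):
    """Per-utterance normalization: each sequence uses only its own
    mean and variance over time."""
    out = []
    for utt in batch:
        mu, var = fmean(utt), pvariance(utt)
        out.append([(x - mu) / (var + EPS) ** 0.5 for x in utt])
    return out


# Hypothetical utterances of different lengths (one scalar per frame).
utt = [0.1, 0.4, 0.2, 0.9]
quiet = [0.0, 0.1, 0.05]
loud = [5.0, 7.0, 6.0, 8.0, 9.0]

# The same utterance, batched with two different neighbours.
bn_a = batch_norm_train([utt, quiet])[0]
bn_b = batch_norm_train([utt, loud])[0]
sn_a = sequence_norm([utt, quiet])[0]
sn_b = sequence_norm([utt, loud])[0]

print("BN output independent of batch:", bn_a == bn_b)
print("Sequence norm independent of batch:", sn_a == sn_b)
```

With batch statistics, the normalized features of `utt` change depending on what else is in the batch, while the per-sequence version does not. Whether this is what hurts in my setup I don't know; it is only meant to show the difference I am asking about.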

Regards



Source: https://www.reddit.com/r//comments/8xsv9q/d_speech__training_and_bn/
