I know how beam search works: at each decoder step, we keep the top k results and continue decoding from them. What I want to ask is whether beam search is applied only at test time, or at both training and test time?
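
To be concrete about what I mean by beam search, here is a minimal sketch of the decoding loop as I understand it. The `toy_model` scoring function, its four-token vocabulary, and the `bos`/`eos` ids are all made up for illustration; in a real system the decoder would supply the log-probabilities at each step.

```python
import math


def toy_model(prefix):
    """Hypothetical stand-in for a decoder: log-probs over a 4-token vocabulary.

    Vocabulary (assumed): 0 = <bos>, 1 = <eos>, 2 = "a", 3 = "b".
    """
    if len(prefix) < 3:
        probs = [0.0, 0.2, 0.5, 0.3]
    else:
        probs = [0.0, 0.9, 0.05, 0.05]
    return [math.log(p) if p > 0 else float("-inf") for p in probs]


def beam_search(step_log_probs, k, max_len, bos=0, eos=1):
    """Keep the k highest-scoring partial sequences at every decoder step."""
    beams = [([bos], 0.0)]  # (tokens, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == eos:
                # Finished hypotheses are carried over unchanged.
                candidates.append((tokens, score))
                continue
            # Expand this hypothesis with every possible next token.
            for tok, lp in enumerate(step_log_probs(tokens)):
                candidates.append((tokens + [tok], score + lp))
        # Prune: keep only the top k candidates by cumulative score.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
        if all(tokens[-1] == eos for tokens, _ in beams):
            break
    return beams


beams = beam_search(toy_model, k=2, max_len=5)
for tokens, score in beams:
    print(tokens, round(math.exp(score), 4))
```

With `k=1` this reduces to greedy decoding, which is why I think of `k` as the only knob that changes the procedure.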

