What do I mean by this?

We can start training an RNN with a truncation length of 1, i.e. it acts like a feed-forward network. Once it has been trained to some extent, we increase the truncation length to 2, and so on.
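To make the procedure concrete, here is a minimal sketch of the data side of such a curriculum: a sequence is split into non-overlapping windows of the current truncation length, and the length is increased stage by stage. The function names (`chunks`, `curriculum_stages`) and the linear 1, 2, ... schedule are my own illustrative choices, not something prescribed by the question; in a real truncated-BPTT loop you would also carry the hidden state across chunks and stop gradients at each chunk boundary.

```python
def chunks(seq, k):
    """Split a sequence into consecutive truncation windows of length k.

    Each window is what one truncated-BPTT step would backpropagate through;
    the trailing window may be shorter than k.
    """
    return [seq[i:i + k] for i in range(0, len(seq), k)]

def curriculum_stages(seq, max_len):
    """Yield (truncation_length, windows) for lengths 1, 2, ..., max_len.

    A hypothetical training loop would run some epochs at each stage
    before moving to the next, longer truncation length.
    """
    for k in range(1, max_len + 1):
        yield k, chunks(seq, k)

# Example: a length-8 toy sequence under truncation lengths 1..3.
seq = list(range(8))
for k, windows in curriculum_stages(seq, 3):
    print(k, windows)
```

At truncation length 1 every window is a single timestep, matching the "acts like a feed-forward network" case; longer stages expose progressively longer dependencies.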

Would it be reasonable to think that shorter sequences are somewhat easier to learn, so that they induce the RNN to learn a reasonable set of weights quickly, and are hence beneficial as a curriculum?

Update 1: I have been persuaded otherwise. I now think that truncated sequences are not necessarily easier to learn.