Fine-tuning a LM pre-trained on the 1B Word Benchmark (1BLM) has performed well on QA and other NLP tasks. Why don't we construct a gigantic text corpus drawn from novels, textbooks, news, webpages, and conversations, so that a LM trained on it can produce structured output (e.g., multiple sentences) conditioned on an input passage? Note that there is no fine-tuning here: we train the model once on this one giant dataset (pure LM, not Seq2Seq), and that's it. Unlike 1BLM, which has no inter-sentence dependency, each minibatch sample here is a randomly chosen span of consecutive sentences, so the model can learn cross-sentence structure. The model wouldn't only do QA; it would produce an appropriate output for whatever task the input implies. The data would be noisy, and explicit question-answer pairs would make up only a tiny fraction of the corpus, but I believe the sheer size of the dataset would let the model generalize past these issues. Any feedback?
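For concreteness, here's a minimal sketch of the sampling scheme I have in mind. Everything in it is a placeholder: `documents` (a list of documents, each a list of sentence strings), `tokenizer`, and the token budget are all assumptions, not a fixed design.

```python
import random

def make_lm_batch(documents, tokenizer, batch_size=32, max_tokens=512):
    """Build a minibatch where each item is a span of consecutive sentences.

    documents: list of documents, each a list of sentence strings
               (hypothetical corpus format).
    tokenizer: callable mapping a string to a list of token ids
               (stand-in for whatever tokenizer is actually used).
    """
    batch = []
    while len(batch) < batch_size:
        doc = random.choice(documents)
        if not doc:
            continue
        start = random.randrange(len(doc))  # random starting sentence
        tokens = []
        # Greedily append consecutive sentences until the token budget is
        # hit, so each training example keeps inter-sentence dependencies
        # (unlike 1BLM, where sentences are independent).
        for sentence in doc[start:]:
            ids = tokenizer(sentence)
            if tokens and len(tokens) + len(ids) > max_tokens:
                break
            tokens.extend(ids[:max_tokens])
        batch.append(tokens)
    return batch  # feed to the LM with the usual next-token prediction loss

# Toy usage with whitespace "tokenization" just to show the shape:
toy_corpus = [["The cat sat.", "It purred.", "Then it slept."]]
toy_tokenizer = lambda s: s.split()
print(make_lm_batch(toy_corpus, toy_tokenizer, batch_size=2, max_tokens=8))
```

The point is only that the sampling unit changes from single shuffled sentences (as in 1BLM) to contiguous multi-sentence spans; the training objective stays plain next-token LM.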