I’ve been out of the loop on image classification progress for the last couple of years. From what I can tell, quite a few recent papers (e.g. Inception-v4, MobileNet) seem to skip preprocessing steps such as image normalization and mean value subtraction, which earlier work (e.g. the original ResNet) applied.
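To be concrete about what I mean by those steps, here is a minimal sketch of per-channel mean subtraction and division by the standard deviation; the mean/std values are the commonly quoted ImageNet statistics and are only illustrative, not taken from any of the papers above:

```python
import numpy as np

# Commonly quoted per-channel ImageNet statistics (RGB order); illustrative only.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image_uint8: np.ndarray) -> np.ndarray:
    """Scale an HxWx3 uint8 image to [0, 1], then subtract the per-channel
    mean and divide by the per-channel standard deviation."""
    x = image_uint8.astype(np.float32) / 255.0
    return (x - IMAGENET_MEAN) / IMAGENET_STD

# Example on a random "image": channel means end up roughly centered around zero.
img = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
print(preprocess(img).mean(axis=(0, 1)))
```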
When did these techniques fall out of favor, and what was the motivation? I see arguments along the lines of “it probably doesn’t matter much since we do batch/layer/instance/x-normalization anyway”, but I haven’t been able to find results showing that this is actually the case. Thoughts, arguments, and references to papers arguing this convincingly would be highly appreciated.