We just open-sourced a project to create labeled datasets for ML on satellite imagery. There are only a handful of high-quality satellite datasets out there, so our team built something to generate new ones quickly and easily. It pulls label information from OpenStreetMap and saves both the imagery and the labels into numpy arrays that you can drop into ML workflows. You can filter by common OSM tags like roads, buildings, railroads, etc., and it can package data for classification, object detection, or segmentation.
Heads up that you need to define a source for the satellite imagery yourself (and most high-quality sources aren’t free), but you can use free imagery tiles from OpenAerialMap or some available from England here or here.
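To give a rough idea of what "imagery and labels in numpy arrays" looks like on the consuming side, here is a minimal sketch. It does not call the project itself: the array names (`x_train`, `y_train`, `x_test`, `y_test`), the 256x256 RGB tile size, and the three-class label layout are all assumptions made up for illustration, and the data is random. Check the project's own docs for the actual file layout.

```python
import io

import numpy as np

# Hypothetical stand-in for a packaged classification dataset:
# 4 RGB tiles of 256x256 pixels, each with a 3-element label vector
# (assumed order: background, roads, buildings).
rng = np.random.default_rng(0)
tiles = rng.integers(0, 256, size=(4, 256, 256, 3), dtype=np.uint8)
labels = np.array(
    [[1, 0, 0], [0, 1, 0], [0, 1, 1], [1, 0, 0]], dtype=np.uint8
)

# Write the arrays to an in-memory .npz archive so the example needs no files.
buffer = io.BytesIO()
np.savez(
    buffer,
    x_train=tiles[:3],
    y_train=labels[:3],
    x_test=tiles[3:],
    y_test=labels[3:],
)
buffer.seek(0)

# Load the archive the way a training script might.
data = np.load(buffer)
x_train, y_train = data["x_train"], data["y_train"]
x_test, y_test = data["x_test"], data["y_test"]

# Scale pixel values to [0, 1] before handing them to a model.
x_train_scaled = x_train.astype(np.float32) / 255.0

print(x_train_scaled.shape, y_train.shape, x_test.shape, y_test.shape)
```

From there the arrays can go straight into whatever framework you use. For object detection or segmentation the label array would have a different shape (boxes or per-pixel masks), which this sketch does not cover.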