Earth-observing remote sensing data, including aerial photography and satellite imagery, offer a snapshot of the world from which we can learn about the state of our environment, anthropogenic systems, and natural resources. Components of energy systems that are visible from above can be assessed by combining these remote sensing data with machine learning methods. Here we focus on the information gap surrounding distributed solar photovoltaic (PV) arrays, for which data on deployments at small geographic scales are limited. We created a machine learning dataset to support the development of methods for automatically identifying solar PV locations in remote sensing imagery.
This dataset contains the geospatial coordinates and border vertices for 19,433 solar panels across 601 high-resolution images from four cities in California. Dataset applications include training object detection and other machine learning algorithms that use remote sensing imagery, developing algorithms specifically for predictive detection of distributed PV systems, and analyzing the socioeconomic correlates of PV deployment.
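As an illustration of how border vertices of this kind might be used, the following minimal sketch computes the area and centroid of a single panel outline with the shoelace formula. It assumes each panel is represented as an ordered list of (x, y) vertices in image pixel coordinates; that representation, the function name, and the example values are hypothetical and do not describe the dataset's actual file format.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area_and_centroid(vertices: List[Point]) -> Tuple[float, Point]:
    """Return (area, centroid) of a simple polygon via the shoelace formula.

    `vertices` is an ordered list of (x, y) points, clockwise or
    counterclockwise, without repeating the first point at the end.
    Assumed input format; the dataset's real layout may differ.
    """
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")

    twice_area = 0.0  # signed, twice the polygon area
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the polygon
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")

    # Dividing by the signed area keeps the centroid correct for either
    # vertex orientation.
    centroid = (cx / (3.0 * twice_area), cy / (3.0 * twice_area))
    return abs(twice_area) / 2.0, centroid


# Hypothetical panel outline: a 4 x 2 pixel rectangle.
example_panel = [(10.0, 20.0), (14.0, 20.0), (14.0, 22.0), (10.0, 22.0)]
area, center = polygon_area_and_centroid(example_panel)
print(area, center)
```

Areas computed this way are in squared pixel units; converting them to physical area would require the ground resolution of the corresponding image.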
Links to the aerial photographs from Fresno, Stockton, Oxnard, and Modesto can be found in the references.