This dataset combines reported emissions and population data with external datasets:
- City population from the Geonames database
- Country GDP per capita from the World Bank
- Country emissions per capita from EDGAR
The results from the following data quality checks are displayed:
- Population Check - A comparison between reported populations and figures from the Geonames database
- Emissions Check 1 - A multiple regression model (using city population and country GDP per capita) is used to identify statistical outliers
- Emissions Check 2 - A comparison between city per capita emissions and country per capita emissions
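As an illustration of Emissions Check 1, the sketch below fits an ordinary least squares regression of emissions on city population and country GDP per capita and flags cities with large residuals. The model form is not specified here, so the log transform of all three variables, the cut-off of 2 residual standard deviations, and the function names (solve_linear, flag_regression_outliers) are all assumptions made for this example, not the method actually used to produce the flags.

```python
import math


def solve_linear(a, b):
    """Solve the small square system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def flag_regression_outliers(emissions, population, gdp_pc, z_cut=2.0):
    """Return one boolean per city; True marks a statistical outlier.

    Assumed model: log(emissions) ~ 1 + log(population) + log(gdp_pc).
    Assumed rule: |residual| > z_cut * residual standard deviation.
    All inputs must be positive and of equal length.
    """
    rows = [[1.0, math.log(p), math.log(g)] for p, g in zip(population, gdp_pc)]
    y = [math.log(e) for e in emissions]
    k = 3  # intercept plus two predictors
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sd = math.sqrt(sum(e * e for e in resid) / (len(y) - k))
    return [abs(e) > z_cut * sd for e in resid]
```

The function needs more than three cities to estimate the residual spread, and a single cut-off flags both anomalously high and anomalously low emissions.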
Field names:
geoid - A unique identifier corresponding to the city's record in the Geonames database
account_id - The CRM organisation ID
org_name - The CRM organisation name
rep.population - The reported population
rep.population.source - The source of the reported population. The population data used is primarily from question 4.5, as this figure corresponds to the year the inventory was taken. If no population is provided in question 4.5, then question 0.5 (current population) is used instead.
geo.population - The city population provided in the Geonames database (used to verify the reported population)
effective.population - The population used in the subsequent emissions checks, defined once the population check has been carried out. If a city passes the population check, its reported population is used; otherwise the external figure from the Geonames database is used
has.inventory - This is the city's response to question 4.0 in the 2019 questionnaire, which asks the city to state whether they have an emissions inventory to report
inventory.type - If the city has an emissions inventory, this field specifies the format that this inventory is provided in
effective.s1.s2.sum - The total scope 1 + scope 2 emissions reported by the city. This is obtained from either question 4.6a or 4.6b, depending on whether the city followed the CRF or the GPC
city.emissions.per.cap - The city's effective.s1.s2.sum is divided by the effective.population to calculate the emissions per capita for that city
country_co2e_per_capita_2012 - Country-level per capita emissions obtained from EDGAR’s Global Greenhouse Gas Emissions from 1970 to 2012 (EDGARv4.3.2 dataset). Although the data is from 2012, this is the most recent dataset available that includes all greenhouse gases. More recent datasets are available for CO2 alone, but these are not consistent with the cities' data, which generally includes other GHGs too. Link here: https://edgar.jrc.ec.europa.eu/overview.php?v=CO2andGHG1970-2016&dst=GHGpc
country.GDP.pc.2018 - This is country level GDP per capita in 2018 in US$, obtained from the World Bank. Link here: https://data.worldbank.org/indicator/NY.GDP.PCAP.KD
flag.population - The data quality flag identifying cities whose reported population differs from the external (Geonames) population by more than 50% in either direction
flag.emissions.check1.statistical.outliers - The data quality flag identifying cities whose emissions appear to be anomalously high or low, based on a statistical regression on city population and country GDP per capita
flag.emissions.check2.per.capita.country - The data quality flag identifying cities whose per capita emissions differ greatly from the country per capita emissions (a generous threshold of 10x is used, as some variation is expected)
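
The population check, the effective population, the per capita calculation and Emissions Check 2 described above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset. It assumes that the 50% difference is measured relative to the Geonames figure, that a missing reported population counts as failing the check, that flags are booleans, that the 10x threshold applies in both directions, and that city and country emissions are expressed in the same units. The function names are hypothetical.

```python
def population_flag(rep_population, geo_population, tolerance=0.5):
    """True when the reported population fails the population check."""
    if rep_population is None:
        return True  # assumption: no reported figure counts as a failure
    relative_diff = abs(rep_population - geo_population) / geo_population
    return relative_diff > tolerance


def effective_population(rep_population, geo_population, tolerance=0.5):
    """Reported population if the check passes, otherwise the Geonames one."""
    if population_flag(rep_population, geo_population, tolerance):
        return geo_population
    return rep_population


def emissions_per_capita(effective_s1_s2_sum, eff_population):
    """city.emissions.per.cap: scope 1 + 2 emissions per inhabitant."""
    return effective_s1_s2_sum / eff_population


def per_capita_country_flag(city_per_cap, country_per_cap, factor=10.0):
    """True when city and country per capita emissions differ by over 10x."""
    ratio = city_per_cap / country_per_cap
    return ratio > factor or ratio < 1.0 / factor


# Example with made-up numbers: the reported population is far above the
# Geonames figure, so the Geonames figure becomes the effective population.
pop = effective_population(2_000_000, 900_000)  # 900_000
per_cap = emissions_per_capita(4_500_000, pop)  # 5.0
flagged = per_capita_country_flag(per_cap, 6.0)  # False
```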