Population equivalence what?

Well, I don’t really know what to call them, but you might have seen them in the wild before. Here’s one example of Canada, where the map is divided into four areas with equal populations.

I created a few of these of Sweden and thought it might be interesting to share the methodology. In this example we’ll be using a population grid of 1 km^2 squares. The grid shapefile can be downloaded from here.

We start by loading the data and using the column Ruta as the index.

import geopandas as gpd

df = gpd.read_file('grid1km_sex_20181231/Inspire_Sex_Sweref_region.shp')
df = df.set_index('Ruta')

We can plot the data to see what we’re dealing with.

df.plot(figsize=(8, 16));

We’re going to do something a bit different than in the Canada example. We’ll start by selecting four squares in different parts of the country.

# Indices of selected squares
gbg_loc = '3310006425000'
sthlm_loc = '6970006576000'
north_loc = '7420007652000'
malmo_loc = '3710006163000'

We then write a custom function which creates a buffer around the selected square, selects all squares that intersect the buffer, and sums up the population counts (TotPop) of those squares. It keeps expanding the buffer by 1 km at a time until the total population reaches the pop threshold, then returns the indices of the selected squares.

def tally_pop(pop, start_loc):
    """Return the indices of the squares around `start_loc`
    which in total host a population of at least `pop`."""
    buffer = 1000  # buffer distance in grid units (metres)
    total_pop = 0
    geom = df.loc[start_loc].geometry
    while total_pop < pop:
        # Select every square touching the buffered start square
        rows = df[df.intersects(geom.buffer(buffer))]
        total_pop = rows.TotPop.sum()
        buffer += 1000  # widen the search by 1 km
    return rows.index.tolist()
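
To see the expanding-search logic in isolation, here is a minimal sketch on a toy grid that needs nothing beyond the standard library. The grid, the populations, and the name `tally_pop_toy` are all made up for illustration, and distance is measured between cell centres rather than through a polygon buffer, so it only approximates what `tally_pop` does.

```python
import math

def tally_pop_toy(cells, pop, start, step=1.0):
    """Return the keys of all cells within the smallest search radius
    (grown in increments of `step`) whose populations sum to at least `pop`.

    `cells` maps (x, y) cell centres to population counts.
    """
    # Guard against an endless loop when the threshold can never be met
    if pop > sum(cells.values()):
        raise ValueError("threshold exceeds the total population")
    radius = 0.0
    while True:
        selected = [c for c in cells if math.dist(c, start) <= radius]
        if sum(cells[c] for c in selected) >= pop:
            return selected
        radius += step

# Hypothetical 5x5 grid: a dense "city" cell at (0, 0), sparse elsewhere
toy = {(x, y): 10 for x in range(5) for y in range(5)}
toy[(0, 0)] = 500

dense_zone = tally_pop_toy(toy, 500, (0, 0))
sparse_zone = tally_pop_toy(toy, 100, (4, 4))
print(len(dense_zone), len(sparse_zone))  # 1 11
```

The dense start cell meets its threshold on its own, while the sparse corner has to grow to a radius of three cells before it gathers enough people. That is the same effect the real grid shows on a much larger scale.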

We then run this function for our four selected starting squares, and set the population threshold to one million.

north_1m = tally_pop(1e6, north_loc)
gbg_1m = tally_pop(1e6, gbg_loc)
sthlm_1m = tally_pop(1e6, sthlm_loc)
malmo_1m = tally_pop(1e6, malmo_loc)
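
Since each zone grows independently, two zones could in principle end up sharing squares. A quick sanity check on the returned index lists might look like the sketch below. The helper `shared_squares` and the example lists are hypothetical; with the real results you would pass something like `{'gbg': gbg_1m, 'sthlm': sthlm_1m, 'malmo': malmo_1m, 'north': north_1m}`.

```python
from itertools import combinations

def shared_squares(zones):
    """Return {(name_a, name_b): shared indices} for every pair of
    zones that has at least one grid square in common."""
    overlaps = {}
    for (name_a, idx_a), (name_b, idx_b) in combinations(zones.items(), 2):
        common = set(idx_a) & set(idx_b)
        if common:
            overlaps[(name_a, name_b)] = common
    return overlaps

# Made-up index lists standing in for the real tally_pop results
example = {
    'a': ['1', '2', '3'],
    'b': ['3', '4'],
    'c': ['5'],
}
print(shared_squares(example))  # {('a', 'b'): {'3'}}
```

An empty dictionary means the zones are disjoint.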

Now that we have the indices of all returned buffer zones, we can proceed to plot the results.

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 16))
df.plot(color='#ffc830', ax=ax)
df.loc[gbg_1m + sthlm_1m + malmo_1m + north_1m].plot(color='#806ced', ax=ax);

And there you have it. As expected (if you know the Swedish landscape), the northern buffer is huge in comparison to the buffers starting from the three largest cities. The Stockholm zone in particular stands out with its high population density.