Controlling Geographical Distortions¶
Flow cartograms distort the shapes of geographical regions so that their areas $A$ are proportional to a variable of interest $v$, such as population size. Distortions are often noticeable at the outer edges because the algorithm lacks geographic constraints beyond the data regions. Below are some advanced options to help control or reduce these distortions. Keep in mind that these techniques often require trial and error, and if pushed to extremes, they may lead to even greater distortions or cause the algorithm to fail to converge.
The flow cartogram algorithm internally constructs a gridded density field that encompasses all regions and includes a margin. Each grid point within a region is assigned the region's current variable density, represented as $\frac{v}{A}$. Points outside the regions are assigned a target density, which equals the mean density across all regions, calculated as $\frac{\sum{v}}{\sum{A}}$. This approach ensures that the total area of the regions remains unchanged. The density field is then transformed into a velocity flow field, indicating how space needs to expand or contract to equalize the regional densities with the target density.
By adjusting the density or velocity fields (using the density_mod, Dx/Dy, and and anisotropy options), a user can gain some control over spatial distortions. For instance, introducing anisotropy allows the user to direct the flow so that expansions and contractions primarily occur along either the horizontal or vertical axis. Another approach is to gradually decrease the velocity near the outer boundaries, such as coastlines, which helps minimize distortions at these edges but may result in greater distortions at the inner boundaries, like those separating countries or states. Similarly, a smoother transition in the variable density at the outer borders instead of a sudden drop-off to the target density can help reduce local velocity fields, leading to fewer distortions.
Distortions can also be very noticeable for islands or non-contiguous regions that have a stretched (non-compact) shape, may have a large area, but a low variable density. For example, when computing a flow cartogram of the US States using population size as a variable, Alaska is strongly deformed and reduced to a thin line. This deformation can be largely avoided by separately scaling all connected regions and islands to the correct area before computing the flow cartogram (using the prescale_components option).
In summary, MorphOptions provides the following options to mitigate distortions in flow cartograms:
| Option | What it does |
|---|---|
prescale_components |
Pre-scales each group of connected geometries to its target area before morphing starts |
density_mod |
density field modification |
Dx/Dy |
axis-aligned anisotropy |
anisotropy |
general velocity field modification |
Import Libraries¶
import matplotlib.pyplot as plt
import numpy as np
import carto_flow.data as examples
import carto_flow.flow_cartogram as flow
Load Geographic Data¶
Load the contiguous US States plus Alaska, Puerto Rico, and Hawaii.
gdf = examples.load_us_census(population=True, contiguous_only=False)
ax = gdf.plot(
"Population (Millions)",
cmap="RdYlGn_r",
linewidth=0.3,
legend=True,
legend_kwds={"shrink": 0.7, "label": "Population (Millions)"},
)
ax.axis("off");
Prescaling¶
Without prescaling, Alaska shows extreme deformation in the form of a thin skeleton, and the algorithm does not converge (Alaska's morph error is 41.7%). With prescaling, the algorithm is better behaved, and Alaska is not deformed at all, because it was prescaled to the correct area.
# Without prescaling
result_no_prescale = flow.morph_gdf(gdf, "Population", options=flow.MorphOptions(grid_size=512))
# With prescaling
result_prescale = flow.morph_gdf(gdf, "Population", options=flow.MorphOptions(grid_size=512, prescale_components=True))
Morph geometries: 25%|█████████▏ | 127/500 [00:05<00:16, 22.41it/s, max=41.7%, mean=1.5% - stalled] Morph geometries: 20%|███████▏ | 102/500 [00:02<00:11, 36.00it/s, max=6.3%, mean=0.4% - converged]
# Let's look at the morph error percentages of Alaska
print("Morph Error % for Alaska")
no_prescale_err = result_no_prescale.to_geodataframe().query("`State Name`=='Alaska'")["_morph_error_pct"].values[0]
prescale_err = result_prescale.to_geodataframe().query("`State Name`=='Alaska'")["_morph_error_pct"].values[0]
print(f" No pre-scaling: {no_prescale_err:05.2f}%")
print(f"With pre-scaling: {prescale_err:05.2f}%")
Morph Error % for Alaska No pre-scaling: 41.70% With pre-scaling: 00.00%
fig, axes = plt.subplots(2, 1, figsize=(7, 9), sharex=True, sharey=True)
result_no_prescale.to_geodataframe().plot(
"Population (Millions)",
ax=axes[0],
cmap="RdYlGn_r",
linewidth=0.3,
legend=True,
legend_kwds={"shrink": 0.5, "label": "Population (Millions)"},
)
axes[0].set(title="Without prescaling")
axes[0].axis("off")
result_prescale.to_geodataframe().plot(
"Population (Millions)",
ax=axes[1],
cmap="RdYlGn_r",
linewidth=0.3,
legend=True,
legend_kwds={"shrink": 0.5, "label": "Population (Millions)"},
)
axes[1].set(title="With prescaling", frame_on=False)
axes[1].axis("off")
plt.tight_layout();
Density Field Modulation¶
In the following example, the density field is adjusted so that density values extend beyond the borders of the states, gradually transitioning back to the target density. This modification indirectly alters the velocity fields, ensuring that the morphed outer borders of the regions stay closer to their original positions. To modify the density field, use the density_mod option and set it to a pipeline of DensityModulator objects. Two modulators are available: DensityBorderExtension and DensitySmooth (but users can build their own if needed).
# let's only work with the contiguous US States
gdf_contig = gdf[~gdf["State Name"].isin(["Alaska", "Hawaii", "Puerto Rico"])]
Create a DensityBorderExtension modifier that extends the densities 100km beyond the state borders with a subsequent 100km-wide transition to the target density.
density_mod = flow.density.DensityBorderExtension(extension_width=100_000, transition_width=100_000)
Let's preview the modulator's impact.
# let's preview the impact of the density modulator
plot = flow.preview_density_modulator(
density_mod,
gdf_contig,
column="Population",
# use area_scale to convert m^2 to km^2
# density units are then people / km^2
area_scale=1e-6,
grid_size=256,
show="output",
colorbar_kwargs={"shrink": 0.75},
)
_ = plot.ax.set(title="Density Border Extension")
_ = plot.ax.axis("off")
Let's compute and compare cartograms with and without the density modifier.
# set area_scale=1e-6 to convert m^2 to km^2, which means density fields have the units people/km^2
base_options = flow.MorphOptions(grid_size=512, area_scale=1e-6)
result_default = flow.morph_gdf(gdf_contig, "Population", options=base_options)
result_density_mod = flow.morph_gdf(
gdf_contig,
"Population",
options=base_options.copy_with(
density_mod=density_mod,
),
)
Morph geometries: 43%|██████████████▉ | 214/500 [00:08<00:11, 24.39it/s, max=7.2%, mean=0.5% - converged] Morph geometries: 50%|█████████████████▌ | 250/500 [00:07<00:07, 32.30it/s, max=8.1%, mean=0.5% - converged]
import geopandas as gpd
outer = gpd.GeoSeries([gdf_contig.union_all().boundary])
fig, axes = plt.subplots(2, 1, figsize=(7, 9), sharex=True, sharey=True)
result_default.to_geodataframe().plot(
"Population (Millions)",
ax=axes[0],
cmap="RdYlGn_r",
linewidth=0.3,
edgecolor="white",
legend=True,
legend_kwds={"shrink": 0.7, "label": "Population (Millions)"},
)
axes[0].set(title="Without Density Field Modification")
axes[0].axis("off")
outer.plot(ax=axes[0], edgecolor="k", lw=0.5)
result_density_mod.to_geodataframe().plot(
"Population (Millions)",
ax=axes[1],
cmap="RdYlGn_r",
linewidth=0.3,
edgecolor="white",
legend=True,
legend_kwds={"shrink": 0.7, "label": "Population (Millions)"},
)
axes[1].set(title="With Density Field Modification", frame_on=False)
axes[1].axis("off")
outer.plot(ax=axes[1], edgecolor="k", lw=0.5)
plt.tight_layout();
Anisotropy¶
In the following example, we modify the velocity field directly by altering the diffusion coefficients for flow in the horizontal (Dx) or vertical direction (Dy). A higher diffusion coefficient means that flows are faster in the corresponding direction. Note that what matters is the ratio between Dx and Dy, and not their absolute values.
# let's only work with the contiguous US States
gdf_contig = gdf[~gdf["State Name"].isin(["Alaska", "Hawaii", "Puerto Rico"])]
Let's examine how the velocity field changes when we vary the diffusion constants in the horizontal and vertical directions. It's important to note that setting Dx=5 and Dy=1 is equivalent to setting Dx=1 and Dy=0.2 (i.e., the ratio is what matters). In the example below, when Dx/Dy=5, the horizontal flow increases relative to the vertical flow. Conversely, when Dx/Dy=0.2, the vertical flow dominates.
fig, axes = plt.subplots(3, 1, figsize=(5, 10), sharex=True, sharey=True)
anisotropy = 5
for dx, ax in zip((1, anisotropy, 1 / anisotropy), axes, strict=False):
plot = flow.preview_velocity_modulator(
gdf=gdf_contig,
column="Population",
Dx=dx,
grid_size=256,
skip=8,
# heatmap='output', heatmap_type='magnitude', heatmap_colorbar_kwargs={'shrink': 0.7}, #heatmap_cmap='twilight_shifted', heatmap_alpha_from_magnitude=True,
ax=ax,
colorbar_kwargs={"shrink": 0.7},
)
_ = plot.ax.axis("off")
Let's compute and compare cartograms with and without the anisotropy modifiers.
base_options = flow.MorphOptions(grid_size=512)
# isotropic
result_default = flow.morph_gdf(gdf_contig, "Population", options=base_options)
# anisotropic: Dx/Dy=5
result_dx = flow.morph_gdf(gdf_contig, "Population", options=base_options.copy_with(Dx=5))
# anisotropic: Dx/Dy=0.2
result_dy = flow.morph_gdf(gdf_contig, "Population", options=base_options.copy_with(Dy=5))
Morph geometries: 43%|██████████████▉ | 214/500 [00:09<00:12, 23.19it/s, max=7.2%, mean=0.5% - converged] Morph geometries: 53%|██████████████████▌ | 265/500 [00:07<00:07, 33.14it/s, max=8.6%, mean=1.7% - converged] Morph geometries: 47%|████████████████▍ | 234/500 [00:07<00:08, 31.95it/s, max=8.1%, mean=1.3% - converged]
import geopandas as gpd
outer = gpd.GeoSeries([gdf_contig.union_all().boundary])
fig, axes = plt.subplots(3, 1, figsize=(7, 12), sharex=True, sharey=True)
result_default.to_geodataframe().plot(
"Population (Millions)",
ax=axes[0],
cmap="RdYlGn_r",
linewidth=0.3,
edgecolor="white",
legend=True,
legend_kwds={"shrink": 0.7, "label": "Population (Millions)"},
)
axes[0].set(title="Isotropy")
axes[0].axis("off")
outer.plot(ax=axes[0], edgecolor="k", lw=0.5)
result_dx.to_geodataframe().plot(
"Population (Millions)",
ax=axes[1],
cmap="RdYlGn_r",
linewidth=0.3,
edgecolor="white",
legend=True,
legend_kwds={"shrink": 0.7, "label": "Population (Millions)"},
)
axes[1].set(title="Dx/Dy=5: Stronger East-West Diffusion")
axes[1].axis("off")
outer.plot(ax=axes[1], edgecolor="k", lw=0.5)
result_dy.to_geodataframe().plot(
"Population (Millions)",
ax=axes[2],
cmap="RdYlGn_r",
linewidth=0.3,
edgecolor="white",
legend=True,
legend_kwds={"shrink": 0.7, "label": "Population (Millions)"},
)
axes[2].set(title="Dx/Dy=0.2: Stronger North-South Diffusion", frame_on=False)
axes[2].axis("off")
outer.plot(ax=axes[2], edgecolor="k", lw=0.5)
plt.tight_layout();
Velocity Field Modulation¶
Modulating velocity through the Dx and Dy anisotropy options is limited to the horizontal and vertical axes and is applied globally. More general modulation of velocity can be done using the anisotropy option, which takes a pipeline of VelocityModulator objects. Available modulators are: BoundaryDecay, BoundaryNormalDecay, VelocitySmooth, DirectionalTensor, and LocalizedTensor. Multiple modulators can be combined into a pipeline by "adding" them, e.g., DirectionalTensor(1.7) + VelocitySmooth(5).
We demonstrate only the effects of these modifiers on the velocity field, and we encourage the user to explore how these modifiers influence deformations in cartograms.
# let's only work with the contiguous US States
gdf_contig = gdf[~gdf["State Name"].isin(["Alaska", "Hawaii", "Puerto Rico"])]
Boundary Decay¶
boundary_decay_mod = flow.anisotropy.BoundaryDecay(decay_length=100_000, damping_floor=0.2, smooth=20_000)
plot = flow.preview_velocity_modulator(
boundary_decay_mod,
gdf=gdf_contig,
column="Population",
grid_size=256,
skip=8,
colorbar_kwargs={"shrink": 0.7},
)
_ = plot.ax.set(title="Boundary Decay")
_ = plot.ax.axis("off")
Boundary Normal Decay¶
boundary_normal_mod = flow.anisotropy.BoundaryNormalDecay(decay_length=500_000, damping_floor=0.0, smooth=20_000)
plot = flow.preview_velocity_modulator(
boundary_normal_mod,
gdf=gdf_contig,
column="Population",
grid_size=256,
skip=8,
colorbar_kwargs={"shrink": 0.7},
)
_ = plot.ax.set(title="Boundary Normal Vector Decay")
_ = plot.ax.axis("off")
Directional Tensor¶
Radial Modifier¶
center_state = "Texas"
center = gdf_contig.query(f"`State Name`=='{center_state}'").geometry.centroid
center = np.array((center.x, center.y))
radial_mod = flow.DirectionalTensor.radial(center=center, Dpar=5)
plot = flow.preview_velocity_modulator(
radial_mod,
gdf=gdf_contig,
column="Population",
grid_size=256,
skip=8,
colorbar_kwargs={"shrink": 0.7},
)
_ = plot.ax.set(title=f"Radial Modifier center over {center_state}")
_ = plot.ax.axis("off")
Tangential Modifier¶
center_state = "Texas"
center = gdf_contig.query(f"`State Name`=='{center_state}'").geometry.centroid
center = np.array((center.x, center.y))
tangential_mod = flow.DirectionalTensor.tangential(center=center, Dpar=5)
plot = flow.preview_velocity_modulator(
tangential_mod,
gdf=gdf_contig,
column="Population", # show_vectors='diff',
grid_size=256,
skip=8,
colorbar_kwargs={"shrink": 0.7},
)
_ = plot.ax.set(title=f"Tangential Modifier centered over {center_state}")
_ = plot.ax.axis("off")
Directional Modifier¶
directional_mod = flow.DirectionalTensor(theta=0.25 * np.pi, Dpar=5)
plot = flow.preview_velocity_modulator(
directional_mod,
gdf=gdf_contig,
column="Population", # show_vectors='diff',
grid_size=256,
skip=8,
colorbar_kwargs={"shrink": 0.7},
)
_ = plot.ax.set(title="Northeast-Southwest Directional Tensor")
_ = plot.ax.axis("off")
Seeded Modifier¶
seeds = []
# promote north-south flow over Texas, east-west flow over Montana, and Northeast-Southwest flow over New York
for state, angle in (("Texas", 0.5 * np.pi), ("Montana", 0), ("New York", 0.25 * np.pi)):
c = gdf_contig.query(f"`State Name`=='{state}'").squeeze().geometry.centroid
seeds.append((c.x, c.y, angle))
# Supply the seeds and use a diffusion coefficient of 5 parallel to the seed angle
seeded_mod = flow.DirectionalTensor.from_seeds(seeds, Dpar=5)
plot = flow.preview_velocity_modulator(
seeded_mod,
gdf=gdf_contig,
column="Population",
grid_size=256,
skip=8,
colorbar_kwargs={"shrink": 0.7},
)
_ = plot.ax.set(title="Seeded Directional Tensor")
_ = plot.ax.axis("off")
Localized Tensor¶
seeds = []
for state, angle, sigma in (
("Texas", 0.5 * np.pi, 700_000),
("Montana", 0, (500_000, 250_000)),
("New York", 0.25 * np.pi, (500_000, 250_000)),
):
c = gdf_contig.query(f"`State Name`=='{state}'").squeeze().geometry.centroid
seeds.append((c.x, c.y, angle, sigma))
local_tensor_mod = flow.LocalizedTensor(seeds, default_Dpar=5)
plot = flow.preview_velocity_modulator(
local_tensor_mod,
gdf=gdf_contig,
column="Population",
grid_size=256,
skip=8,
colorbar_kwargs={"shrink": 0.7},
arrow_scale=2,
)
_ = plot.ax.set(title="Localized Tensor")
_ = plot.ax.axis("off")