Multiresolution Flow Cartogram Creation¶
Flow cartograms distort geography so that each region's area is proportional to a data value. The underlying algorithm iteratively displaces coordinates on a velocity field derived from a density grid. Three key parameters control the trade-off between accuracy and compute time:
| Parameter | What it controls |
|---|---|
| `grid_size` | Resolution of the density / velocity grid |
| `dt` | Size of each displacement step |
| `recompute_every` | How often the density and velocity fields are recomputed |
This tutorial walks through each parameter's impact and then shows how multiresolution morphing lets you get accurate results without paying the full cost of a fine grid from the start.
What You'll Learn¶
- How `grid_size`, `dt`, and `recompute_every` affect convergence and compute time
- How these parameters interact
- How multiresolution morphing improves the accuracy / speed trade-off
Prerequisites¶
Install carto-flow first:
pip install carto-flow
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import pandas as pd
import carto_flow.data as examples
import carto_flow.flow_cartogram as flow
# Helper function to compare two flow cartogram maps
def _plot_error_maps(r1, r2, title1, title2, figsize=(12, 5)):
"""Side-by-side error maps with a shared linear symmetric color scale.
Scale spans ±max_tolerance (white = no error, red = over-expanded,
blue = under-expanded). Regions outside ±tolerance are shown in a
distinct saturated color with triangular colorbar caps.
"""
max_tol_pct = (
max(
r1.options.max_tol if r1.options else 0.10,
r2.options.max_tol if r2.options else 0.10,
)
* 100
)
norm = mcolors.Normalize(vmin=-max_tol_pct, vmax=max_tol_pct)
cmap = plt.cm.RdBu_r.copy()
cmap.set_over("#47000f") # dark red: over-expanded beyond tolerance
cmap.set_under("#032041") # dark blue: under-expanded beyond tolerance
fig, axes = plt.subplots(1, 2, figsize=figsize)
r1.plot(column="_morph_error_pct", ax=axes[0], norm=norm, cmap=cmap, legend=False)
axes[0].set_title(title1)
axes[0].axis("off")
r2.plot(column="_morph_error_pct", ax=axes[1], norm=norm, cmap=cmap, legend=False)
axes[1].set_title(title2)
axes[1].axis("off")
# Shared colorbar; triangular caps show the out-of-tolerance color
sm = plt.cm.ScalarMappable(cmap=cmap, norm=norm)
sm.set_array([])
cbar = fig.colorbar(sm, ax=list(axes), fraction=0.03, pad=0.04, extend="both", shrink=0.7)
ticks = [-max_tol_pct, -max_tol_pct / 2, 0, max_tol_pct / 2, max_tol_pct]
cbar.set_ticks(ticks)
cbar.set_ticklabels(["0%" if t == 0 else f"{t:+g}%" for t in ticks])
cbar.set_label("Area error")
# Load population and geometry data for the lower 48 US States
us_states = examples.load_us_census(population=True)
column = "Population"
1. The accuracy / speed trade-off¶
Every iteration of the flow cartogram algorithm:
- Rasterizes current geometries onto a grid to compute a density field
- Solves a Poisson equation (via FFT) to derive a velocity field
- Displaces all coordinates by one time step
Accuracy improves as:
- the grid gets finer (more spatial detail captured)
- the time step gets smaller (smoother, more stable displacements)
- the density field is recomputed more often (tracks geometry changes)
But each of these improvements also increases computation time. The next three sections isolate each factor.
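The per-iteration loop above can be sketched as a toy on weighted points in the unit square (a minimal illustration with NumPy, not the library's implementation — the real algorithm rasterizes polygon geometries and normalizes the velocity field):

```python
import numpy as np

def flow_iteration(points, weights, grid_size=64, dt=0.2):
    """One toy iteration of the flow-cartogram loop."""
    # 1. Rasterize: bin weighted points into a density grid on [0, 1]^2
    density, xe, ye = np.histogram2d(
        points[:, 0], points[:, 1], bins=grid_size,
        range=[[0.0, 1.0], [0.0, 1.0]], weights=weights,
    )
    density -= density.mean()  # zero-mean source term for the Poisson solve
    # 2. Solve the Poisson equation via FFT, then take v = -grad(phi)
    k = 2 * np.pi * np.fft.fftfreq(grid_size)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    ksq = kx**2 + ky**2
    ksq[0, 0] = 1.0            # avoid 0/0 at the mean mode
    phi_hat = np.fft.fft2(density) / -ksq
    phi_hat[0, 0] = 0.0
    phi = np.real(np.fft.ifft2(phi_hat))
    vx, vy = np.gradient(-phi)
    # 3. Displace: each point moves dt along its cell's velocity
    ix = np.clip((points[:, 0] * grid_size).astype(int), 0, grid_size - 1)
    iy = np.clip((points[:, 1] * grid_size).astype(int), 0, grid_size - 1)
    return points + dt * np.column_stack([vx[ix, iy], vy[ix, iy]])
```

Each call performs exactly one rasterize → solve → displace cycle; the real algorithm repeats this until the area errors fall within tolerance.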
2. Effect of grid_size¶
grid_size sets the resolution of the internal density and velocity grid.
A finer grid resolves small features better and produces smoother velocity
fields, but each FFT solve is O(N log N) in the number of grid cells.
Too coarse: small regions may not register in the density field at all, causing the algorithm to stall because the velocity field lacks the spatial detail needed to push them toward their target area.
Too fine: each iteration is slower, and the gain in accuracy diminishes once the grid is finer than the smallest polygon.
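The O(N log N) claim above translates into a quick back-of-the-envelope cost estimate (a sketch, not a benchmark — constant factors are ignored):

```python
import math

def fft_cost(grid_size):
    """Relative cost of one FFT Poisson solve on a square grid."""
    n = grid_size ** 2           # number of grid cells
    return n * math.log2(n)      # proportional to operations per solve

# Doubling grid_size quadruples the cell count, so each solve costs
# a bit more than 4x as much:
ratio = fft_cost(512) / fft_cost(256)  # → 4.5
```

This is why the duration column in the table below grows super-linearly with `grid_size`, on top of the extra iterations a finer grid requires.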
A practical starting point: run at grid_size=256 first and inspect the final
error map (column="_morph_error_pct"). If small regions show persistently high
error, increase the grid size. If the result converges quickly and error is
uniform, the grid is sufficient.
# Compute flow cartograms for multiple grid sizes
grid_sizes = [64, 128, 256, 512]
results_grid = {}
for gs in grid_sizes:
opts = flow.MorphOptions(grid_size=gs, n_iter=300, show_progress=False, mean_tol=0.005, max_tol=0.02)
results_grid[gs] = flow.morph_gdf(us_states, column, options=opts)
# Summary: status / iterations / duration / final mean error for all grid sizes
summary_grid = pd.DataFrame([
{
"grid_size": gs,
"status": r.status.value,
"iterations": r.niterations,
"duration_s": round(r.duration, 2),
"time per iteration_ms": round(1000 * r.duration / r.niterations, 1),
"mean_error_pct": round(r.get_errors().mean_error_pct, 2),
}
for gs, r in results_grid.items()
])
display(summary_grid)
| | grid_size | status | iterations | duration_s | time per iteration_ms | mean_error_pct |
|---|---|---|---|---|---|---|
| 0 | 64 | stalled | 39 | 2.53 | 64.9 | 34.99 |
| 1 | 128 | stalled | 69 | 0.70 | 10.2 | 13.76 |
| 2 | 256 | converged | 112 | 1.92 | 17.2 | 0.23 |
| 3 | 512 | converged | 215 | 8.37 | 39.0 | 0.19 |
fig, axes = plt.subplots(1, 2, figsize=(10, 4), gridspec_kw={"width_ratios": [5, 2]})
# Convergence curves for all grid sizes
for gs, r in results_grid.items():
axes[0].plot(
r.convergence.iterations,
r.convergence.mean_errors_pct,
        label=f"grid={gs} {'✔' if r.status.value == 'converged' else '✗'}",
)
axes[0].set(title="Convergence by grid size", xlabel="Iteration", ylabel="Mean error (%)")
axes[0].legend()
# Compute time for all grid sizes
_ = summary_grid.plot(
x="grid_size", y="duration_s", kind="line", marker="o", ylabel="duration (s)", legend=False, ax=axes[1]
)
axes[1].set(xticks=grid_sizes, ylim=(0, None))
plt.tight_layout()
# Compare the cartogram results
_plot_error_maps(
results_grid[64],
results_grid[512],
f"grid=64 — {results_grid[64].status.value}",
f"grid=512 — {results_grid[512].status.value}",
)
3. Effect of dt¶
dt scales how far coordinates move in each iteration.
The actual displacement per step is proportional to dt × min(dx, dy),
where dx, dy are the grid cell dimensions.
Large dt (e.g. 0.5): fewer iterations needed to reach the target density,
but large steps can overshoot — a region contracts past its target, the velocity
reverses, and the algorithm oscillates without converging.
Small dt (e.g. 0.05): stable and precise, but many more iterations are
needed to move geometries a meaningful distance.
The default dt=0.2 is a balanced starting point. For smooth, relatively
uniform input data you can increase it; for data with large contrasts
(some regions much smaller / larger than others) reduce it.
Oscillation signal: if the mean error stops decreasing and bounces up and down,
`dt` may be too large. Reduce it by half and re-run. The `status` field will report `STALLED` when `stall_patience` triggers.
# Compute flow cartograms for multiple dt values
dt_values = [0.05, 0.1, 0.2, 0.3, 0.5]
results_dt = {}
opts_base = flow.MorphOptions(grid_size=256, n_iter=300, show_progress=False, mean_tol=0.005, max_tol=0.02)
for dt in dt_values:
results_dt[dt] = flow.morph_gdf(us_states, column, options=opts_base.copy_with(dt=dt))
# Summary: status / iterations / duration / final mean error for all dt values
summary_dt = pd.DataFrame([
{
"dt": dt,
"status": r.status.value,
"iterations": r.niterations,
"duration_s": round(r.duration, 2),
"time per iteration_ms": round(1000 * r.duration / r.niterations, 1),
"mean_error_pct": round(r.get_errors().mean_error_pct, 2),
}
for dt, r in results_dt.items()
])
display(summary_dt)
| | dt | status | iterations | duration_s | time per iteration_ms | mean_error_pct |
|---|---|---|---|---|---|---|
| 0 | 0.05 | completed | 300 | 3.27 | 10.9 | 39.32 |
| 1 | 0.10 | converged | 224 | 2.36 | 10.5 | 0.18 |
| 2 | 0.20 | converged | 112 | 1.23 | 11.0 | 0.23 |
| 3 | 0.30 | stalled | 85 | 0.95 | 11.2 | 4.30 |
| 4 | 0.50 | stalled | 59 | 0.65 | 11.1 | 17.96 |
fig, axes = plt.subplots(1, 2, figsize=(10, 4), gridspec_kw={"width_ratios": [5, 2]})
# Convergence curves for all dt values
for dt, r in results_dt.items():
axes[0].plot(
r.convergence.iterations,
r.convergence.mean_errors_pct,
        label=f"dt={dt} {'✔' if r.status.value == 'converged' else '✗'}",
)
axes[0].set(title="Convergence by dt", xlabel="Iteration", ylabel="Mean error (%)")
axes[0].legend()
# Compute time for all dt values
_ = summary_dt.plot(x="dt", y="duration_s", kind="line", marker="o", ylabel="duration (s)", legend=False, ax=axes[1])
axes[1].set(xticks=dt_values, ylim=(0, None))
axes[1].set_xticklabels(dt_values, rotation=45)
plt.tight_layout()
# Compare the cartogram results
_plot_error_maps(
results_dt[0.1],
results_dt[0.5],
f"dt=0.1 — {results_dt[0.1].status.value}",
f"dt=0.5 — {results_dt[0.5].status.value}",
)
4. Effect of recompute_every¶
Computing the density and velocity fields (rasterization + FFT solve) is the most expensive
part of each iteration. recompute_every=N skips this on N-1 out of every N
steps, reusing the cached velocity field instead.
recompute_every=1: maximum accuracy — the velocity field always reflects
the current geometry. Slowest per-iteration, but least likely to stall.
recompute_every=10 (default): a good balance. Geometries do not move
much in 10 steps with dt=0.2, so the cached field remains a good approximation.
recompute_every=50+: the velocity field can become stale, especially
early in morphing when geometries are still far from their targets.
The algorithm may stall or oscillate.
Reducing recompute_every is particularly helpful when morphing data with
extreme value contrasts.
# Compute flow cartograms for multiple recompute_every values
recompute_values = [1, 5, 10, 25, 50]
results_rc = {}
opts_base = flow.MorphOptions(grid_size=256, n_iter=300, dt=0.2, show_progress=False, mean_tol=0.005, max_tol=0.02)
for rc in recompute_values:
results_rc[rc] = flow.morph_gdf(us_states, column, options=opts_base.copy_with(recompute_every=rc))
# Summary: status / iterations / duration / final mean error for all recompute_every values
summary_rc = pd.DataFrame([
{
"recompute_every": rc,
"status": r.status.value,
"iterations": r.niterations,
"duration_s": round(r.duration, 2),
"time per iteration_ms": round(1000 * r.duration / r.niterations, 1),
"mean_error_pct": round(r.get_errors().mean_error_pct, 2),
}
for rc, r in results_rc.items()
])
display(summary_rc)
| | recompute_every | status | iterations | duration_s | time per iteration_ms | mean_error_pct |
|---|---|---|---|---|---|---|
| 0 | 1 | converged | 115 | 10.75 | 93.5 | 0.38 |
| 1 | 5 | converged | 118 | 2.35 | 19.9 | 0.32 |
| 2 | 10 | converged | 112 | 1.24 | 11.0 | 0.23 |
| 3 | 25 | stalled | 136 | 0.74 | 5.4 | 6.49 |
| 4 | 50 | stalled | 164 | 0.53 | 3.2 | 8.19 |
fig, axes = plt.subplots(1, 2, figsize=(10, 4), gridspec_kw={"width_ratios": [5, 2]})
# Convergence curves for all recompute_every values
for rc, r in results_rc.items():
axes[0].plot(
r.convergence.iterations,
r.convergence.mean_errors_pct,
        label=f"recompute_every={rc} {'✔' if r.status.value == 'converged' else '✗'}",
)
axes[0].set(title="Convergence by recompute_every", xlabel="Iteration", ylabel="Mean error (%)")
axes[0].legend()
# Compute time for all recompute_every values
_ = summary_rc.plot(
x="recompute_every", y="duration_s", kind="line", marker="o", ylabel="duration (s)", legend=False, ax=axes[1]
)
axes[1].set(xticks=recompute_values, ylim=(0, None))
plt.tight_layout()
# Compare the cartogram results
_plot_error_maps(
results_rc[1],
results_rc[50],
f"recompute_every=1 — {results_rc[1].status.value}",
f"recompute_every=50 — {results_rc[50].status.value}",
)
Oscillations and the recompute_every cycle¶
With the recompute_every option, the velocity field is refreshed only every N
iterations. After the velocity field is computed, the error typically decreases, but
this decline may slow down or even reverse as the cached velocity field no longer
accurately reflects the actual situation. This cycle continues until the velocity
field is recomputed, correcting the accumulated drift. As a result, an oscillatory
pattern emerges in the convergence curve. The default setting for stall_patience is 5,
which will terminate the run if errors increase for more than 5 iterations. If you set
stall_patience to None, this check is disabled, allowing for the full oscillation
pattern to be observed.
The example below illustrates an oscillatory pattern, demonstrating that the algorithm
does not converge. The orange lines indicate each field recomputation. By default, the
function will stop when a stall is detected. However, to visualize the oscillation
phenomenon, stall detection has been disabled (with stall_patience=None).
# Oscillation demo: disable stall detection to see the full recompute_every cycle
result_osc = flow.morph_gdf(
us_states,
column,
options=flow.MorphOptions(
grid_size=128,
n_iter=300,
dt=0.2,
recompute_every=20, # long cycle → visible oscillations
stall_patience=None, # disable stall detection so run completes
show_progress=False,
mean_tol=0.005,
max_tol=0.02,
),
)
fig, ax = plt.subplots(figsize=(9, 4))
ax.plot(result_osc.convergence.iterations, result_osc.convergence.mean_errors_pct, color="steelblue")
# mark each field recompute step
for k in range(0, result_osc.niterations, 20):
ax.axvline(k, color="orange", linewidth=0.8, alpha=0.6, label="field recompute" if k == 0 else None)
ax.set(
title="Oscillations tied to recompute_every=20 (stall_patience=None)", xlabel="Iteration", ylabel="Mean error (%)"
)
ax.legend()
plt.tight_layout()
5. A note on how the parameters interact¶
The three parameters are not independent. Understanding their interactions helps you avoid counterproductive combinations.
The effective step size¶
The actual displacement per iteration is computed as:
$$\Delta x = dt \times \frac{\min(dx, dy)}{v_{\max}}$$
where $dx$, $dy$ are the physical size of one grid cell. A finer grid has
smaller cells, so the absolute step size shrinks proportionally.
Doubling grid_size halves the absolute step size, meaning the algorithm
needs roughly twice as many iterations to achieve the same displacement —
even with the same dt.
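The formula above can be written out directly (a square grid and a fixed `v_max` normalization are assumed here for simplicity):

```python
def effective_step(dt, grid_size, extent=1.0, v_max=1.0):
    """Absolute displacement per iteration, following the formula above.

    On a square grid, min(dx, dy) = extent / grid_size.
    """
    cell = extent / grid_size
    return dt * cell / v_max

# Doubling grid_size halves the absolute step at the same dt:
step_256 = effective_step(0.2, 256)
step_512 = effective_step(0.2, 512)  # half of step_256
```

This is the quantitative reason the 512 grid in section 2 needed roughly twice the iterations of the 256 grid.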
When convergence is slow or stalling, changing only one of grid_size or dt
(not both simultaneously) is a good diagnostic step. If the error plateaus on a
coarse grid, try increasing grid_size. If it oscillates, reduce dt.
Changing both at once makes it hard to attribute the effect.
Velocity field staleness¶
If dt is large, geometries move a lot each step. The density field changes
rapidly, and the cached velocity field becomes a poor guide sooner.
Large dt should be paired with small recompute_every to keep the
velocity field fresh.
Conversely, with a small dt (e.g. after increasing grid_size), geometries
barely move between steps, so the cached field stays accurate for longer.
You can safely increase recompute_every without losing much accuracy,
trading some quality for speed.
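One way to operationalize this pairing is to keep the product `dt × recompute_every` roughly constant, so geometries move about the same total distance per cached field. This is a heuristic sketch, not a library feature:

```python
def suggest_recompute_every(dt, base_dt=0.2, base_every=10):
    """Scale recompute_every inversely with dt, anchored at the
    defaults (dt=0.2, recompute_every=10)."""
    return max(1, round(base_every * base_dt / dt))

suggest_recompute_every(0.4)  # → 5  (larger steps, fresher field)
suggest_recompute_every(0.1)  # → 20 (smaller steps, cache lasts longer)
```

Treat the result as a starting point and verify with the convergence curve; data with extreme contrasts may still need a smaller value.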
6. Multiresolution morphing¶
The three parameters interact: a large grid requires many iterations; a small
dt requires many iterations; frequent recomputation makes each iteration
slower. Choosing parameters for a single run forces you to pay the full cost.
Multiresolution morphing breaks the problem into stages:
- Coarse grid (fast) — large-scale shape rearrangements happen quickly. Convergence may be approximate; the key is to get geometries close to their targets without the expense of a fine grid.
- Finer grid — start from the approximately-morphed geometries of the previous stage. Less work remains, so convergence is fast even on a larger grid.
- Finest grid — polish remaining errors. Typically only a few dozen iterations are needed because coarser stages already did the heavy lifting.
This strategy is analogous to multigrid solvers in numerical analysis: coarse grids propagate information across long distances efficiently; fine grids resolve local detail.
The min_resolution parameter sets the coarsest grid; each subsequent level
doubles the resolution. With min_resolution=128, levels=3 the grids are
128 → 256 → 512.
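The level schedule implied by `min_resolution` and `levels` is simply a doubling sequence:

```python
def resolution_schedule(min_resolution, levels):
    """Grid size used at each level: each level doubles the previous one."""
    return [min_resolution * 2**k for k in range(levels)]

resolution_schedule(128, 3)  # → [128, 256, 512]
```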
Note:
`multiresolution_morph()` is a convenience wrapper that returns only the final-level `Cartogram`. To access per-level results and plot per-level convergence histories, use `CartogramWorkflow` directly as shown below.
# Baseline: single fine-grid run
opts_single = flow.MorphOptions(
grid_size=512, n_iter=500, show_progress=True, progress_message="Single 512", mean_tol=0.005, max_tol=0.01
)
result_single = flow.morph_gdf(us_states, column, options=opts_single)
print(f"Single-res: {result_single.duration:.1f}s status={result_single.status.value}")
print(f" mean error: {result_single.get_errors().mean_error_pct:.2f}%")
Single 512: 45%|███████████▋ | 225/500 [00:06<00:08, 33.54it/s, max=0.8%, mean=0.1% - converged]
Single-res: 6.7s status=converged
  mean error: 0.12%
from carto_flow.flow_cartogram import CartogramWorkflow
# Multiresolution: 128 → 256 → 512
multi_workflow = CartogramWorkflow(
us_states, column, options=flow.MorphOptions(n_iter=200, dt=0.2, show_progress=True, mean_tol=0.005, max_tol=0.02)
)
multi_workflow.morph_multiresolution(min_resolution=128, levels=3)
result_multi = multi_workflow.latest
total_multi_duration = sum(r.duration for r in multi_workflow if r is not multi_workflow.original)
print(f"Multi-res: {total_multi_duration:.1f}s status={result_multi.status.value}")
print(f" mean error: {result_multi.get_errors().mean_error_pct:.2f}%")
Morphing with 128x89 grid: 34%|███ | 67/200 [00:00<00:01, 102.05it/s, max=1861.3%, mean=14.5% - stalled]
Refining with 256x178 grid: 13%|█▋ | 26/200 [00:00<00:02, 70.05it/s, max=4.3%, mean=0.9% - stalled]
Refining with 512x356 grid: 1%| | 2/200 [00:00<00:28, 6.99it/s, max=0.3%, mean=0.0% - converged]
Multi-res: 1.4s status=converged
  mean error: 0.03%
# Summary: status / iterations / duration / final mean error for single- and multi-resolution approaches
summary_multi = pd.DataFrame({
"method": ("single", "multi"),
"status": (result_single.status.value, result_multi.status.value),
"iterations": (
result_single.niterations,
sum(r.niterations for r in multi_workflow if r is not multi_workflow.original),
),
"duration_s": (round(result_single.duration, 2), round(total_multi_duration, 2)),
"mean_error_pct": (
round(result_single.get_errors().mean_error_pct, 2),
round(result_multi.get_errors().mean_error_pct, 2),
),
})
display(summary_multi)
| | method | status | iterations | duration_s | mean_error_pct |
|---|---|---|---|---|---|
| 0 | single | converged | 225 | 6.74 | 0.12 |
| 1 | multi | converged | 95 | 1.42 | 0.03 |
# Convergence comparison: stack per-level curves on a single axis
fig, axes = plt.subplots(1, 2, figsize=(9, 4), gridspec_kw={"width_ratios": (5, 1)})
# Single-resolution curve
axes[0].plot(
result_single.convergence.iterations,
result_single.convergence.mean_errors_pct,
color="gray",
label=f"single 512 ({result_single.duration:.1f}s)",
)
# Multiresolution: one curve per level, offset on x-axis
offset = 0
colors = ["#1f77b4", "#ff7f0e", "#2ca02c"]
level_idx = 0
for level_result in multi_workflow:
if level_result is multi_workflow.original:
continue # skip unmorphed state
iters = level_result.convergence.iterations + offset
gs = level_result.grid.sx # actual grid width used
color = colors[level_idx % len(colors)]
axes[0].plot(
iters,
level_result.convergence.mean_errors_pct,
color=color,
label=f"level {level_idx + 1}: grid={gs} ({level_result.duration:.1f}s)",
)
axes[0].axvline(offset, color=color, lw=1, alpha=0.4)
offset += level_result.niterations
level_idx += 1
axes[0].set(
title="Convergence: single-resolution vs multi-resolution", xlabel="Cumulative iteration", ylabel="Mean error (%)"
)
axes[0].legend()
# Compute time
_ = summary_multi.plot(x="method", y="duration_s", kind="bar", ylabel="duration (s)", legend=False, ax=axes[1])
axes[1].set(ylim=(0, None))
plt.tight_layout()
# Compare the cartogram results
total_multi_duration = sum(r.duration for r in multi_workflow if r is not multi_workflow.original)
_plot_error_maps(
result_single,
result_multi,
f"Single-resolution 512\n{result_single.duration:.1f}s, {result_single.status.value}",
f"Multiresolution (128→256→512)\n{total_multi_duration:.1f}s, {result_multi.status.value}",
figsize=(14, 5),
)
7. Choosing parameters in practice¶
The `status` field tells you what happened and what to do next:
- `CONVERGED` — tolerances met; the result is accurate
- `COMPLETED` — iteration limit reached; try increasing `n_iter` or adjusting parameters
- `STALLED` — error started growing; reduce `dt` or `recompute_every`, or increase `grid_size`
When diagnosing convergence problems, change one parameter at a time:
- Error plateaus at a high value → try a finer `grid_size`
- Error oscillates / `STALLED` → increase `grid_size` or reduce `dt`
- Converges but slowly → use multiresolution
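These rules can be folded into a simple retry heuristic. This is a sketch: the status strings match the `status.value` outputs shown in the tables above, and the dict keys mirror `MorphOptions` fields, but it is not library code — adapt the parameter updates to your data:

```python
def next_options(status, opts):
    """Map a run status to an updated options dict, or None when done."""
    if status == "converged":
        return None                                    # keep the result
    if status == "completed":                          # iteration budget ran out
        return {**opts, "n_iter": opts["n_iter"] * 2}
    if status == "stalled":                            # oscillating: calm the dynamics
        return {**opts,
                "dt": opts["dt"] / 2,
                "recompute_every": max(1, opts["recompute_every"] // 2)}
    raise ValueError(f"unknown status: {status}")
```

A driver loop would call the morph function, feed `result.status.value` and the current options into `next_options`, and stop as soon as it returns `None` or a retry budget is exhausted.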
Additional Resources¶
- API Reference: morph_gdf
- API Reference: MorphOptions
- API Reference: Cartogram object
- API Reference: CartogramWorkflow
This notebook is part of the carto-flow documentation. Report issues at https://github.com/bright-fakl/carto-flow/issues