Flow Cartogram Algorithm¶

Overview¶

The flow cartogram algorithm deforms polygon geometries so that each region's area becomes proportional to an associated data value while maintaining spatial contiguity and approximate shape recognition. The approach follows Gastner and Newman's diffusion-based density-equalizing map method [1]: high-density regions expand and low-density regions contract until all regions reach a uniform equilibrium density.

The algorithm is implemented in morph_geometries() (algorithm.py). The high-level API is CartogramWorkflow (workflow.py).

Mathematical Foundation¶

Equilibrium Condition¶

Given \(n\) regions with data values \(v_1, \ldots, v_n\) and current areas \(a_1, \ldots, a_n\), the target is for every region to have the same density:

\[ \rho^* = \frac{\sum_i v_i}{\sum_i a_i} \]

At convergence, each region has area:

\[ a_i^{\text{final}} = \frac{v_i}{\rho^*} \]

Error Metric¶

Area error is measured in log₂ space:

\[ e_i = \log_2\!\left(\frac{a_i^{\text{current}}}{a_i^{\text{target}}}\right) \]

This representation is symmetric: \(e_i = +1\) means the region is twice as large as its target; \(e_i = -1\) means half as large. The equivalent percentage error is:

\[ e_i^{\%} = \operatorname{sign}(e_i) \cdot \left(2^{|e_i|} - 1\right) \cdot 100 \]

Computational Pipeline¶

---
displayMode: compact
config:
  theme: neutral
---
flowchart LR
    subgraph Init[INITIALIZATION]
        direction TB
        I1[Compute areas<br/>and target density]
        I1 --> I2[Create Grid]
        I2 --> I3[Optional:<br/>Pre-scale components]
    end

    subgraph Loop[ITERATION LOOP]
        direction TB
        L1[Compute density field<br/>rasterize geometries to grid]
        L1 --> L2[Compute velocity field<br/>FFT Poisson solver]
        L2 --> L3[Compute adaptive<br/>time step]
        L3 --> L4[Displace coordinates<br/>bilinear interpolation]
        L4 --> L5[Compute errors]
        L5 --> L6{Converged<br/>or stalled?}
        L6 -->|No| L1
        L6 -->|Yes| L7[Finalize]
    end

    Init --> Loop

    classDef hidden display: none;

Initialization¶

Compute initial polygon areas and target density \(\rho^*\).
Build a Grid covering the input bounds (with optional margin).
Optionally pre-scale connected components (prescale_components=True): each group of geometrically adjacent polygons is uniformly scaled to its target combined area before morphing begins. This reduces over-deformation of outer polygons. See Reduce Boundary Distortion.

Per-Iteration Steps¶

1. Density Field¶

Each polygon is rasterized onto the grid: cells whose centers fall inside polygon \(i\) receive density \(\rho_i = v_i / a_i\). Cells outside all polygons are filled with the global mean density \(\rho^*\).

Rasterization uses shapely.contains_xy() for vectorized point-in-polygon tests. The resulting field can be transformed by a DensityModulator before the velocity solve (see Modulators below).

By default, the density field is recomputed every recompute_every iterations (default: 10). Between recomputations the same field is reused, which is valid because the velocity pattern changes slowly near convergence.

2. Velocity Field¶

The velocity field is computed by solving a Poisson equation in the frequency domain [1]. The normalized density:

\[ \tilde{\rho} = \frac{\rho - \bar{\rho}}{\bar{\rho}} \]

is the source term. The Poisson equation \(\nabla^2 \phi = \tilde{\rho}\) is solved spectrally:

\[ \hat{\phi}(k_x, k_y) = -\frac{\hat{\tilde{\rho}}(k_x, k_y)}{\dfrac{k_x^2}{D_x} + \dfrac{k_y^2}{D_y}} \]

where \(k_x = 2\pi \cdot \text{fftfreq}(n_x, \Delta x)\), \(k_y = 2\pi \cdot \text{fftfreq}(n_y, \Delta y)\), and \(D_x\), \(D_y\) are anisotropy parameters that scale flow in each axis direction. The \(k=0\) component is set to zero (imposing zero mean velocity). The velocity components follow from the potential:

\[ v_x = \operatorname{Re}\!\left(\mathcal{F}^{-1}\!\left[i k_x \hat{\phi}\right]\right), \qquad v_y = \operatorname{Re}\!\left(\mathcal{F}^{-1}\!\left[i k_y \hat{\phi}\right]\right) \]

The velocity field is then normalized by the maximum velocity from the first iteration so that time steps remain consistent across iterations.

A VelocityModulator can modify the field after computation (see Modulators).

Three implementations are available: compute_velocity_anisotropic() (NumPy FFT, reference), compute_velocity_anisotropic_rfft() (real FFT with cached wavenumbers), and VelocityComputerFFTW (pre-allocated arrays, FFTW plans, SIMD-aligned memory).

3. Adaptive Time Step¶

\[ \Delta t = dt \cdot \frac{\min(\Delta x, \Delta y)}{\max(|v_x|, |v_y|)} \]

The factor \(dt \in (0, 1]\) is set in MorphOptions. This CFL-like expression ensures that no vertex moves more than one cell width per iteration, preventing coordinate overshooting.

4. Displacement¶

Every polygon vertex (and any provided landmark or coordinate set) is displaced by bilinear interpolation of the velocity field:

\[ \mathbf{p}^{\text{new}} = \mathbf{p}^{\text{old}} + \Delta t \cdot \mathbf{v}(\mathbf{p}^{\text{old}}) \]

Interpolation is implemented as a Numba JIT-compiled, parallelized function that computes cell indices directly (no searchsorted overhead).

The coords parameter supports three formats—\((N, 2)\) point arrays, \((X, Y)\) meshgrid tuples, and \((M, N, 2)\) mesh arrays—which are converted internally and restored to the original format after displacement.

5. Error Computation and Termination¶

After displacement, polygon areas are recomputed from the deformed vertices and per-geometry log₂ errors \(e_i\) are calculated. The algorithm terminates when:

Convergence: \(\bar{e} < \log_2(1 + \tau_{\text{mean}})\) and \(\max e_i < \log_2(1 + \tau_{\text{max}})\), where \(\tau_{\text{mean}}\) and \(\tau_{\text{max}}\) are the mean_tol and max_tol parameters (expressed as fractions, e.g. 0.05 for 5%).
Stall: The maximum error increases for stall_patience consecutive iterations.
Iteration limit: n_iter iterations have been completed.

Scalar error metrics are recorded in ConvergenceHistory at every iteration. Full CartogramSnapshot objects (including geometries) are saved at every snapshot_every iterations, and always at the final iteration regardless of the termination reason.

Grid¶

Grid (grid.py) defines the spatial discretization for the FFT computation.

Constructor parameter	Type	Description
`bounds`	`(xmin, ymin, xmax, ymax)`	Spatial extent
`size`	`int` or `(sx, sy)`	Grid points on longest axis, or explicit size
`margin`	`float`	Fractional padding added around bounds
`square`	`bool`	Enforce equal cell spacing (\(\Delta x = \Delta y\))

Cell centers, edges, and spacings are computed lazily on first access and cached as properties.

Multi-Resolution Grids¶

build_multilevel_grids(min_resolution, levels, margin, square) returns a dyadic hierarchy of Grid objects. Each successive level doubles the resolution on the longest axis. The coarsest grid has min_resolution points; there are levels grids total.

This hierarchy is used by CartogramWorkflow.morph_multiresolution() to run morphing sequentially from coarse to fine, with each level starting from the geometry produced by the previous level.

Density and Velocity Modulators¶

The algorithm exposes two extension points for modifying the density and velocity fields without altering the core loop:

DensityModulator: transforms the density grid after rasterization. Multiple modulators can be chained with +. The density_mod parameter in MorphOptions accepts a modulator or chain.
VelocityModulator: transforms the velocity field after the FFT solve. Supplied via the anisotropy parameter in MorphOptions.

Both are defined as abstract base classes in anisotropy.py. The primary use case is reducing outer-boundary distortion: for example, DensityBorderExtension inpaints interior densities outward so the density gradient at the boundary is less abrupt. See the Reduce Boundary Distortion how-to for practical guidance.

MorphOptions¶

MorphOptions (options.py) is a validated dataclass that controls all algorithm parameters.

Parameter	Default	Description
`dt`	1.0	Time step scalar controlling displacement magnitude per iteration
`n_iter`	500	Maximum iterations
`recompute_every`	10	Recompute density/velocity every N iterations
`snapshot_every`	`None`	Save full snapshot every N iterations; `None` = final only
`mean_tol`	0.05	Convergence threshold for mean area error (fraction)
`max_tol`	0.1	Convergence threshold for max area error (fraction)
`stall_patience`	5	Stall after N consecutive iterations of increasing max error; `None` = disabled
`grid`	`None`	Pre-constructed `Grid`; takes precedence over size/margin/square
`grid_size`	100	Grid points on longest axis
`grid_margin`	0.5	Fractional padding around input bounds
`grid_square`	`True`	Enforce square cells
`Dx`, `Dy`	1.0	Anisotropy factors for the FFT solver
`anisotropy`	`None`	`VelocityModulator` applied after the FFT solve
`density_mod`	`None`	`DensityModulator` applied after rasterization
`area_scale`	1.0	Multiplier applied to polygon areas before density computation
`prescale_components`	`False`	Pre-scale connected components to target area
`save_internals`	`False`	Save density/velocity grids in `Cartogram.internals`

Three preset factory methods are available: MorphOptions.preset_fast(), .preset_balanced(), and .preset_high_quality().

Limitations¶

Periodic boundary conditions. The FFT solver assumes periodic boundary conditions. This means the velocity field wraps at grid edges, which can produce artifacts if input geometries are close to the grid boundary. Mitigate by setting grid_margin to add sufficient padding (default 0.5 = 50% of the bounding box on each side).

Outer-boundary distortion. Exterior cells are assigned mean density, creating a sharp density step at the boundary of the data region. Outer polygons tend to expand more than expected. This is partly mitigated by prescale_components and by density modulators such as DensityBorderExtension.

Grid resolution trade-off. Higher grid resolution produces a smoother, more accurate velocity field but increases computation time. The density field is only as detailed as the grid; very thin or small polygons may not rasterize correctly at low resolution.

Iterative approximation. The algorithm does not solve the equilibrium condition exactly in one step. Convergence is reached after many iterations, and the result depends on dt, n_iter, mean_tol, and max_tol. Tight tolerances may require many iterations; loose tolerances may leave visible residual errors.

References¶

[1] M. T. Gastner and M. E. J. Newman, "Diffusion-based method for producing density-equalizing maps," Proc. Natl. Acad. Sci. U.S.A., vol. 101, no. 20, pp. 7499–7504, 2004. DOI: 10.1073/pnas.0400280101 · arXiv: physics/0401102 Reference implementation (C): https://websites.umich.edu/~mejn/cart/

[2] M. T. Gastner, V. Seguy, and P. More, "Fast flow-based algorithm for creating density-equalizing map projections," Proc. Natl. Acad. Sci. U.S.A., vol. 115, no. 10, pp. E2156–E2164, 2018. DOI: 10.1073/pnas.1712674115 Reference implementation (C): github.com/Flow-Based-Cartograms/go_cart

carto-flow implements the method of [1]. The 2018 algorithm [2] replaces the iterative density rasterization with a prescribed linear density path ρ(t) = (1-t)·ρ₀ + t·ρ̄, which eliminates re-rasterization at each step and is approximately 40× faster in practice.