
carto_flow.proportional_cartogram.dot_density

Dot density map generation for GeoDataFrames.

Provides functions to generate randomly positioned markers inside geometries proportional to column values, and to visualize them as dot density maps.

Functions:

  • generate_dot_density

    Generate randomly positioned dots inside geometries to represent variable magnitudes.

  • plot_dot_density

    Plot a dot density map with base geometries as background.

generate_dot_density

generate_dot_density(
    gdf: GeoDataFrame,
    columns: str | Sequence[str],
    n_dots: int = 100,
    normalization: Literal[
        "sum", "maximum", "row", None
    ] = None,
    invert: bool = False,
    seed: int | None = None,
) -> gpd.GeoDataFrame

Generate randomly positioned dots inside geometries to represent variable magnitudes.

For each geometry, all dots are sampled in a single rejection-sampling pass and then assigned to categories by shuffling the points and slicing them into consecutive groups. Each dot therefore belongs to exactly one category, and the sampling cost is paid once per geometry rather than once per category.
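
The sample-once, shuffle-and-slice idea can be sketched in plain Python. This is an illustrative sketch only, not the library's implementation: the helper names point_in_polygon and sample_and_assign are hypothetical, the polygon is a simple ring of (x, y) vertices without holes, and points are drawn one at a time from the bounding box.

```python
import random


def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices (hypothetical helper)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_and_assign(ring, counts, rng):
    """Sample sum(counts) points inside the ring, then split them by category.

    counts maps a category name to the number of dots for this geometry.
    """
    total = sum(counts.values())
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)

    # Rejection sampling: draw from the bounding box, keep points that fall inside.
    points = []
    while len(points) < total:
        x = rng.uniform(min_x, max_x)
        y = rng.uniform(min_y, max_y)
        if point_in_polygon(x, y, ring):
            points.append((x, y))

    # Shuffle once, then hand consecutive slices to each category.
    rng.shuffle(points)
    assigned = {}
    start = 0
    for category, k in counts.items():
        assigned[category] = points[start:start + k]
        start += k
    return assigned


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
dots = sample_and_assign(
    triangle, {"agriculture": 5, "industry": 3}, random.Random(0)
)
```

Because every category takes a distinct slice of the same point list, no sampled point can end up in two categories.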

Parameters:

  • gdf (GeoDataFrame) –

    Input GeoDataFrame with polygon geometries.

  • columns (str or Sequence[str]) –

    Column name(s) containing values to represent as dots.

    • Single string: Single-variable density plot. The geometry with the largest value gets n_dots markers; others are scaled proportionally. Use normalization='maximum' (or leave None, which applies 'maximum' automatically for a single column).
    • List of strings: Multi-fraction dot density map. Each column contributes dots of a distinct category, colored separately.
  • n_dots (int, default: 100 ) –

    Number of dots for a fully-filled geometry (fraction = 1.0). Each geometry gets round(fraction * n_dots) dots per category. For a single column with normalization='maximum', the geometry with the maximum value receives n_dots dots.

  • normalization (('sum', 'maximum', 'row', None), default: None ) –

    How to convert column values to fractions:

    • 'sum': Divide each value by the sum of all values across all columns and rows.
    • 'maximum': Divide by the maximum row sum; the geometry with the largest total gets n_dots dots in total.
    • 'row': Normalise each row to sum to 1.0 (only valid for multiple columns). Every geometry gets exactly n_dots dots in total.
    • None: Use values directly as fractions. Must be in [0, 1] and row sums must not exceed 1.0.

    For a single column, None behaves the same as 'maximum'.

  • invert (bool, default: False ) –

    If True, replace each computed fraction with 1 - fraction.

  • seed (int, default: None ) –

    Random seed for reproducibility.
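
The normalization modes and the round(fraction * n_dots) rule described above can be illustrated with a small pure-Python sketch. The helper names to_fractions and dot_counts are hypothetical and the code is written from the parameter descriptions only; it is not the library's implementation, it ignores invert, and it does not handle rows that sum to zero.

```python
def to_fractions(rows, normalization=None):
    """Convert raw values to fractions.

    rows is a list of rows, each a list of non-negative values (one per column).
    """
    n_cols = len(rows[0])
    if normalization is None and n_cols == 1:
        # Documented behaviour: for a single column, None acts like 'maximum'.
        normalization = "maximum"

    if normalization == "sum":
        # Divide by the grand total over all rows and columns.
        denom = sum(sum(row) for row in rows)
        return [[v / denom for v in row] for row in rows]
    if normalization == "maximum":
        # Divide by the largest row sum.
        denom = max(sum(row) for row in rows)
        return [[v / denom for v in row] for row in rows]
    if normalization == "row":
        if n_cols == 1:
            raise ValueError("'row' is only valid for multiple columns")
        # Each row is scaled to sum to 1.0.
        return [[v / sum(row) for v in row] for row in rows]
    if normalization is None:
        # Values are taken as fractions directly.
        return [list(row) for row in rows]
    raise ValueError(f"unknown normalization: {normalization!r}")


def dot_counts(fractions, n_dots):
    """Dots per geometry and category: round(fraction * n_dots)."""
    return [[round(f * n_dots) for f in row] for row in fractions]


rows = [[30, 10], [20, 20], [10, 10]]
by_sum = dot_counts(to_fractions(rows, "sum"), 100)
by_max = dot_counts(to_fractions(rows, "maximum"), 100)
by_row = dot_counts(to_fractions(rows, "row"), 100)
single = dot_counts(to_fractions([[50], [100], [25]]), 100)
```

In this toy data the largest row total is 40, so under 'maximum' the first geometry receives 100 dots in total, while under 'row' every geometry receives 100.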

Returns:

  • GeoDataFrame

    GeoDataFrame of Point geometries with columns:

    • geometry : Point
    • category : str — the column name this dot represents
    • original_index : the index value of the source row in gdf
    • value : float — the raw value from the source column

Raises:

  • ValueError

    If columns are not found or contain invalid values, or if row sums exceed 1.0 when normalization is None.

  • TypeError

    If gdf is not a GeoDataFrame.
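
When passing values directly as fractions (normalization=None with several columns), it can help to check the data before calling the function. The sketch below is a hypothetical pre-check based on the constraints listed above; the helper name check_direct_fractions and the tolerance tol are assumptions and are not part of the library.

```python
def check_direct_fractions(rows, tol=1e-9):
    """Raise ValueError if rows cannot be used as fractions directly.

    rows is a list of rows, each a list of per-column values.
    tol is an assumed allowance for floating-point error in row sums.
    """
    for i, row in enumerate(rows):
        for v in row:
            # NaN fails this comparison too, so it is rejected as well.
            if not 0.0 <= v <= 1.0:
                raise ValueError(f"row {i}: value {v!r} is outside [0, 1]")
        if sum(row) > 1.0 + tol:
            raise ValueError(f"row {i}: fractions sum to {sum(row)} > 1.0")


valid = [[0.2, 0.3, 0.5], [0.1, 0.1, 0.1]]
check_direct_fractions(valid)  # passes silently
```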

Examples:

Single-variable density:

>>> dots = generate_dot_density(gdf, "population", n_dots=100,
...                             normalization="maximum", seed=0)
>>> dots.plot(markersize=3)

Multi-fraction dot density:

>>> dots = generate_dot_density(
...     gdf,
...     columns=["agriculture", "industry", "services"],
...     n_dots=200,
...     normalization="row",
...     seed=42,
... )
>>> dots.groupby("category").size()

plot_dot_density

plot_dot_density(
    gdf: GeoDataFrame,
    dots_gdf: GeoDataFrame | None = None,
    columns: str | Sequence[str] | None = None,
    n_dots: int = 100,
    normalization: Literal[
        "sum", "maximum", "row", None
    ] = None,
    invert: bool = False,
    seed: int | None = None,
    palette: dict[str, str] | None = None,
    cmap: str | None = None,
    size: float = 4,
    alpha: float | Sequence[float] = 0.7,
    marker: str | Sequence[str] = "o",
    base_color: str = "#e0e0e0",
    base_edgecolor: str = "white",
    base_linewidth: float = 0.5,
    base_alpha: float = 1.0,
    ax: Axes | None = None,
    legend: bool = True,
    **kwargs: Any,
) -> DotDensityPlotResult

Plot a dot density map with base geometries as background.

Dots are placed randomly inside each geometry, with the count per category proportional to the corresponding column value. Each category is rendered in a distinct color.

Parameters:

  • gdf (GeoDataFrame) –

    Input GeoDataFrame with polygon geometries. Used for the background and, when dots_gdf is None, for generating dots.

  • dots_gdf (GeoDataFrame, default: None ) –

    Pre-generated dots from generate_dot_density. If provided, columns, n_dots, normalization, invert, and seed are ignored.

  • columns (str or Sequence[str], default: None ) –

    Column name(s) for dot generation. Required when dots_gdf is None.

  • n_dots (int, default: 100 ) –

    Passed to generate_dot_density when dots_gdf is None.

  • normalization (('sum', 'maximum', 'row', None), default: None ) –

    Passed to generate_dot_density when dots_gdf is None.

  • invert (bool, default: False ) –

    Passed to generate_dot_density when dots_gdf is None.

  • seed (int, default: None ) –

    Passed to generate_dot_density when dots_gdf is None.

  • palette (dict[str, str], default: None ) –

    Mapping from category names to colors. If not provided, colors are drawn from cmap.

  • cmap (str, default: None ) –

    Colormap name for categorical colors. Defaults to 'tab10'.

  • size (float, default: 4 ) –

    Marker size in points.

  • alpha (float or Sequence[float], default: 0.7 ) –

    Marker transparency. A single float applies to all categories. A sequence of floats sets transparency per category, in the order categories appear in dots_gdf["category"].unique(). Length must match the number of categories.

  • marker (str or Sequence[str], default: 'o' ) –

    Matplotlib marker style. A single string applies to all categories. A sequence sets the marker per category (same ordering and length rules as alpha). Any marker accepted by ax.scatter() works, e.g. 'o', 's', '^', '*'.

  • base_color (str, default: '#e0e0e0' ) –

    Fill color for the background geometries.

  • base_edgecolor (str, default: 'white' ) –

    Edge color for background geometries.

  • base_linewidth (float, default: 0.5 ) –

    Edge line width for background geometries.

  • base_alpha (float, default: 1.0 ) –

    Transparency for background geometries.

  • ax (Axes, default: None ) –

    Axes to plot on. If None, a new figure is created.

  • legend (bool, default: True ) –

    Whether to add a legend for dot categories.

  • **kwargs (Any, default: {} ) –

    Additional keyword arguments passed to ax.scatter().
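
The scalar-or-sequence rule for alpha and marker, together with the palette and cmap fallback, can be sketched as follows. This is an illustration of the documented rules rather than the library's code: the helper name resolve_styles is hypothetical, the 'tab10' colors are hard-coded as hex strings so that the sketch has no dependencies, and cycling the colors when there are more than ten categories is an assumption.

```python
# The ten colors of Matplotlib's 'tab10' colormap, as hex strings.
TAB10 = [
    "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
    "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf",
]


def resolve_styles(categories, palette=None, alpha=0.7, marker="o"):
    """Return a {category: {"color", "alpha", "marker"}} mapping.

    categories is the ordered list of category names, e.g. the order in
    which they appear in dots_gdf["category"].unique().
    """

    def broadcast(value, name):
        # A single value applies to every category.
        if isinstance(value, (str, int, float)):
            return [value] * len(categories)
        # A sequence must provide exactly one entry per category.
        value = list(value)
        if len(value) != len(categories):
            raise ValueError(
                f"{name} has {len(value)} entries for "
                f"{len(categories)} categories"
            )
        return value

    alphas = broadcast(alpha, "alpha")
    markers = broadcast(marker, "marker")

    styles = {}
    for i, category in enumerate(categories):
        if palette is not None:
            color = palette[category]
        else:
            # Assumption: cycle through the colormap if it runs out.
            color = TAB10[i % len(TAB10)]
        styles[category] = {
            "color": color,
            "alpha": alphas[i],
            "marker": markers[i],
        }
    return styles


styles = resolve_styles(
    ["agriculture", "industry", "services"],
    alpha=[0.9, 0.6, 0.3],
    marker="s",
)
```

Here alpha varies per category while the single marker 's' is applied to all three.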

Returns:

  • DotDensityPlotResult

    Result with axes, scatter collections, base polygon collection, and legend.

Examples:

Quick single-variable density plot:

>>> result = plot_dot_density(gdf, columns="population",
...                           n_dots=100, normalization="maximum")

Multi-fraction with custom colors:

>>> result = plot_dot_density(
...     gdf,
...     columns=["agriculture", "industry", "services"],
...     normalization="row",
...     palette={"agriculture": "#4daf4a", "industry": "#377eb8", "services": "#e41a1c"},
...     n_dots=150,
... )

Reuse pre-generated dots:

>>> dots = generate_dot_density(gdf, "population", n_dots=100, seed=0)
>>> result = plot_dot_density(gdf, dots_gdf=dots)