carto_flow.proportional_cartogram.dot_density
Dot density map generation for GeoDataFrames.
Provides functions to generate randomly positioned markers inside geometries proportional to column values, and to visualize them as dot density maps.
Functions:
-
generate_dot_density – Generate randomly positioned dots inside geometries to represent variable magnitudes.
-
plot_dot_density – Plot a dot density map with base geometries as background.
generate_dot_density
generate_dot_density(
gdf: GeoDataFrame,
columns: str | Sequence[str],
n_dots: int = 100,
normalization: Literal[
"sum", "maximum", "row", None
] = None,
invert: bool = False,
seed: int | None = None,
) -> gpd.GeoDataFrame
Generate randomly positioned dots inside geometries to represent variable magnitudes.
For each geometry, dots are sampled once (one rejection-sampling pass) and then assigned to categories by shuffling and slicing. This means dots from different categories do not overlap and the full sampling cost is paid once per geometry.
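The sample-once, shuffle-and-slice idea described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper names (point_in_polygon, sample_points, assign_categories), the ray-casting containment test, and the bounding-box proposal distribution are all assumptions made for the example.

```python
import random

def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Toggle for every edge crossed by a horizontal ray going right from (x, y).
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def sample_points(ring, n, rng):
    """Rejection sampling: propose points in the bounding box, keep those inside."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    points = []
    while len(points) < n:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, ring):
            points.append((x, y))
    return points

def assign_categories(points, counts, rng):
    """Shuffle the sampled points once, then hand each category a consecutive slice."""
    shuffled = points[:]
    rng.shuffle(shuffled)
    result, start = {}, 0
    for category, count in counts.items():
        result[category] = shuffled[start:start + count]
        start += count
    return result

rng = random.Random(0)
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]  # hypothetical geometry
counts = {"agriculture": 5, "industry": 3, "services": 2}  # hypothetical dot counts

# One sampling pass for the whole geometry, sized to the total dot count.
points = sample_points(triangle, sum(counts.values()), rng)
dots = assign_categories(points, counts, rng)
```

Because every category draws from disjoint slices of one shuffled list, each sampled point ends up in exactly one category. Note that this sketch would loop forever on a zero-area ring; a real implementation needs a guard for that case.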
Parameters:
-
gdf (GeoDataFrame) – Input GeoDataFrame with polygon geometries.
-
columns (str or Sequence[str]) – Column name(s) containing values to represent as dots.
- Single string: Single-variable density plot. The geometry with the largest value gets n_dots markers; others are scaled proportionally. Use normalization='maximum' (or leave None, which applies 'maximum' automatically for a single column).
- List of strings: Multi-fraction dot density map. Each column contributes dots of a distinct category, colored separately.
-
n_dots (int, default: 100) – Number of dots for a fully-filled geometry (fraction = 1.0). Each geometry gets round(fraction * n_dots) dots per category. For a single column with normalization='maximum', the geometry with the maximum value receives n_dots dots.
-
normalization (('sum', 'maximum', 'row', None), default: None) – How to convert column values to fractions:
- 'sum': Divide each value by the sum of all values across all columns and rows.
- 'maximum': Divide by the maximum row sum; the geometry with the largest total gets n_dots dots in total.
- 'row': Normalise each row to sum to 1.0 (only valid for multiple columns). Every geometry gets exactly n_dots dots in total.
- None: Use values directly as fractions. Values must be in [0, 1] and row sums must not exceed 1.0.
For a single column, None behaves the same as 'maximum'.
-
invert (bool, default: False) – Whether to invert computed fractions (1 - fraction).
-
seed (int, default: None) – Random seed for reproducibility.
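The normalization modes and the round(fraction * n_dots) rule described above can be illustrated with a small stand-alone sketch. The function names to_fractions and dot_counts are hypothetical, the input is a plain list of rows rather than a GeoDataFrame, and raising an error for 'row' with a single column is an assumption; the library's actual handling of edge cases such as zero sums may differ.

```python
def to_fractions(rows, normalization):
    """Convert raw values (one list per geometry, one entry per column) to fractions."""
    single_column = all(len(row) == 1 for row in rows)
    if normalization is None and single_column:
        normalization = "maximum"  # documented behaviour for a single column
    if normalization == "sum":
        # One shared denominator: the grand total over all rows and columns.
        denominators = [sum(sum(row) for row in rows)] * len(rows)
    elif normalization == "maximum":
        # One shared denominator: the largest row sum.
        denominators = [max(sum(row) for row in rows)] * len(rows)
    elif normalization == "row":
        if single_column:
            # Assumption: reject 'row' for a single column instead of ignoring it.
            raise ValueError("'row' is only valid for multiple columns")
        denominators = [sum(row) for row in rows]
    elif normalization is None:
        for row in rows:
            if any(not 0 <= v <= 1 for v in row) or sum(row) > 1:
                raise ValueError("values must be in [0, 1] and row sums must not exceed 1.0")
        denominators = [1] * len(rows)
    else:
        raise ValueError(f"unknown normalization: {normalization!r}")
    return [[v / d for v in row] for row, d in zip(rows, denominators)]

def dot_counts(fractions, n_dots):
    """Dots per geometry and category: round(fraction * n_dots)."""
    return [[round(f * n_dots) for f in row] for row in fractions]

rows = [[30, 10], [20, 20], [10, 10]]  # hypothetical values: 3 geometries, 2 columns
for mode in ("sum", "maximum", "row"):
    print(mode, dot_counts(to_fractions(rows, mode), n_dots=100))
```

With 'row', every geometry ends up with n_dots dots in total; with 'maximum', only the geometry with the largest row sum does.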
Returns:
-
GeoDataFrame – GeoDataFrame of Point geometries with columns:
- geometry: Point
- category: str, the column name this dot represents
- original_index: the index value of the source row in gdf
- value: float, the raw value from the source column
Raises:
-
ValueError – If columns are not found, contain invalid values, or row sums exceed 1.0 when normalization is None.
-
TypeError – If gdf is not a GeoDataFrame.
Examples:
Single-variable density:
>>> dots = generate_dot_density(gdf, "population", n_dots=100,
... normalization="maximum", seed=0)
>>> dots.plot(markersize=3)
Multi-fraction dot density:
plot_dot_density
plot_dot_density(
gdf: GeoDataFrame,
dots_gdf: GeoDataFrame | None = None,
columns: str | Sequence[str] | None = None,
n_dots: int = 100,
normalization: Literal[
"sum", "maximum", "row", None
] = None,
invert: bool = False,
seed: int | None = None,
palette: dict[str, str] | None = None,
cmap: str | None = None,
size: float = 4,
alpha: float | Sequence[float] = 0.7,
marker: str | Sequence[str] = "o",
base_color: str = "#e0e0e0",
base_edgecolor: str = "white",
base_linewidth: float = 0.5,
base_alpha: float = 1.0,
ax: Axes | None = None,
legend: bool = True,
**kwargs: Any,
) -> DotDensityPlotResult
Plot a dot density map with base geometries as background.
Dots are placed randomly inside each geometry, with the count per category proportional to the corresponding column value. Each category is rendered in a distinct color.
Parameters:
-
gdf (GeoDataFrame) – Input GeoDataFrame with polygon geometries. Used for the background and, when dots_gdf is None, for generating dots.
-
dots_gdf (GeoDataFrame, default: None) – Pre-generated dots from generate_dot_density. If provided, columns, n_dots, normalization, invert, and seed are ignored.
-
columns (str or Sequence[str], default: None) – Column name(s) for dot generation. Required when dots_gdf is None.
-
n_dots (int, default: 100) – Passed to generate_dot_density when dots_gdf is None.
-
normalization (('sum', 'maximum', 'row', None), default: None) – Passed to generate_dot_density when dots_gdf is None.
-
invert (bool, default: False) – Passed to generate_dot_density when dots_gdf is None.
-
seed (int, default: None) – Passed to generate_dot_density when dots_gdf is None.
-
palette (dict[str, str], default: None) – Mapping from category names to colors. If not provided, colors are drawn from cmap.
-
cmap (str, default: None) – Colormap name for categorical colors. Defaults to 'tab10'.
-
size (float, default: 4) – Marker size in points.
-
alpha (float or Sequence[float], default: 0.7) – Marker transparency. A single float applies to all categories. A sequence of floats sets transparency per category, in the order categories appear in dots_gdf["category"].unique(). Length must match the number of categories.
-
marker (str or Sequence[str], default: 'o') – Matplotlib marker style. A single string applies to all categories. A sequence sets the marker per category (same ordering and length rules as alpha). Any marker accepted by ax.scatter() works, e.g. 'o', 's', '^', '*'.
-
base_color (str, default: '#e0e0e0') – Fill color for the background geometries.
-
base_edgecolor (str, default: 'white') – Edge color for background geometries.
-
base_linewidth (float, default: 0.5) – Edge line width for background geometries.
-
base_alpha (float, default: 1.0) – Transparency for background geometries.
-
ax (Axes, default: None) – Axes to plot on. If None, a new figure is created.
-
legend (bool, default: True) – Whether to add a legend for dot categories.
-
**kwargs (Any, default: {}) – Additional keyword arguments passed to ax.scatter().
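The ordering and length rules for per-category alpha and marker values can be sketched as follows. The helpers unique_in_order and per_category are hypothetical and operate on a plain list of labels; the first mimics the first-appearance order that pandas' unique() returns, and the second is only one plausible way to broadcast or validate the argument.

```python
def unique_in_order(labels):
    """Distinct labels in order of first appearance, like pandas' unique()."""
    return list(dict.fromkeys(labels))

def per_category(value, categories, name):
    """Broadcast a scalar to every category, or pair a sequence with them in order."""
    if isinstance(value, (str, int, float)):
        return {category: value for category in categories}
    values = list(value)
    if len(values) != len(categories):
        raise ValueError(
            f"{name} has {len(values)} entries but there are {len(categories)} categories"
        )
    return dict(zip(categories, values))

# Hypothetical contents of the category column of a dots GeoDataFrame.
labels = ["industry", "agriculture", "industry", "services", "agriculture"]
categories = unique_in_order(labels)

alphas = per_category([0.9, 0.5, 0.7], categories, "alpha")  # one value per category
markers = per_category("o", categories, "marker")            # single value for all
```

Since the order follows first appearance in the category column rather than the order of the columns argument, passing a palette-style mapping or checking the unique() order first avoids mismatched styles.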
Returns:
-
DotDensityPlotResult – Result with axes, scatter collections, base polygon collection, and legend.
Examples:
Quick single-variable density plot:
Multi-fraction with custom colors:
>>> result = plot_dot_density(
... gdf,
... columns=["agriculture", "industry", "services"],
... normalization="row",
... palette={"agriculture": "#4daf4a", "industry": "#377eb8", "services": "#e41a1c"},
... n_dots=150,
... )
Reuse pre-generated dots: