
carto_flow.data

Data loading utilities for example datasets used in documentation and examples.

This module provides convenience functions to load example datasets without requiring users to manually download and manage data files. The datasets are either included with the package or downloaded on-demand from reliable sources.

Optional Dependencies

This module has optional dependencies that are not required for the core functionality of carto-flow:

  • censusdis: For accessing US census data (optional, for demographic examples)

If these dependencies are not installed, functions that require them will raise a clear ImportError with instructions on how to install the missing packages.
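
Callers who want to check for the optional dependency before calling a function that needs it can probe for the package first. The following is a minimal sketch using only the standard library; the helper name `has_censusdis` is hypothetical and not part of carto-flow, and the install command in the message is an assumption about how the package is distributed.

```python
import importlib.util


def has_censusdis() -> bool:
    """Return True if the optional censusdis package can be found."""
    # find_spec returns None for a missing top-level package instead of raising.
    return importlib.util.find_spec("censusdis") is not None


if has_censusdis():
    print("censusdis is available; census-backed loaders can be used.")
else:
    # Assumed install command; check the carto-flow installation notes.
    print("censusdis is missing; try `pip install censusdis`.")
```

This avoids triggering the ImportError at call time, which can be convenient in scripts that should fall back to one of the bundled datasets.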

State-Region and State-Division Mappings

The following dictionaries provide mappings between US states (identified by their 2-digit FIPS codes) and their respective census regions and divisions, as defined by the US Census Bureau:

  • STATE_REGIONS: Maps state FIPS codes to region codes (1-4)
  • STATE_DIVISIONS: Maps state FIPS codes to division codes (1-9)
  • REGION_NAMES: Maps region codes to human-readable region names
  • DIVISION_NAMES: Maps division codes to human-readable division names
  • REGION_DIVISIONS: Maps region codes to lists of division codes within that region
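
As a rough illustration of how these mappings fit together, the sketch below resolves a state to its region name by way of its division. The dictionaries here are small stand-in excerpts, not the module's actual objects; the key type (zero-padded FIPS strings) and the value types (integer codes) are assumptions, as is the helper `region_name_for_state`.

```python
# Stand-in excerpts for illustration; the real dictionaries are defined in carto_flow.data.
# Assumption: keys are zero-padded FIPS strings and codes are integers.
STATE_DIVISIONS = {"06": 9, "36": 2, "48": 7}  # California, New York, Texas
REGION_DIVISIONS = {1: [1, 2], 2: [3, 4], 3: [5, 6, 7], 4: [8, 9]}
REGION_NAMES = {1: "Northeast", 2: "Midwest", 3: "South", 4: "West"}

# Invert REGION_DIVISIONS so each division code points at its region code.
DIVISION_REGION = {
    division: region
    for region, divisions in REGION_DIVISIONS.items()
    for division in divisions
}


def region_name_for_state(fips: str) -> str:
    """Resolve a state FIPS code to its region name via its division (hypothetical helper)."""
    return REGION_NAMES[DIVISION_REGION[STATE_DIVISIONS[fips]]]


print(region_name_for_state("06"))  # West
```

With the real module, the same lookup would presumably start from `from carto_flow.data import STATE_DIVISIONS, REGION_DIVISIONS, REGION_NAMES`, or use STATE_REGIONS directly to skip the division step.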

Functions:

  • load_sample_cities

    Load a sample dataset of cities with population information.

  • load_us_census

    Load US state boundaries with ACS demographic data from the US Census Bureau.

  • load_us_states

    Load a US states dataset with population and area information.

  • load_world

    Load a world countries dataset with population estimates.

load_sample_cities

load_sample_cities() -> geopandas.GeoDataFrame

Load a sample dataset of cities with population information.

This dataset includes major cities around the world with population estimates.

Returns:

  • GeoDataFrame

    Cities dataset with geometry and population information.

Examples:

>>> from carto_flow.data import load_sample_cities
>>> gdf = load_sample_cities()
>>> print(gdf.shape)
(200, ...)

load_us_census

load_us_census(
    population: bool = True,
    race: bool = False,
    poverty: bool = False,
    simplify: float | None = None,
    vintage: int = 2020,
    contiguous_only: bool = True,
) -> geopandas.GeoDataFrame

Load US state boundaries with ACS demographic data from the US Census Bureau.

Downloads American Community Survey (ACS) 5-year estimates for US states, projected to the Albers equal-area projection (ESRI:102008). Alaska, Hawaii, and Puerto Rico are excluded by default (contiguous_only=True).

Parameters:

  • population (bool, default: True ) –

    Include total population. Adds columns: Population, Population (Millions), Population Density.

  • race (bool, default: False ) –

    Include race/ethnicity breakdown. Adds columns: Total Race, White, Black or African American, Asian, Hispanic or Latino, and corresponding <group> % columns.

  • poverty (bool, default: False ) –

    Include poverty status. Adds columns: Total Poverty, Below Poverty Level, Above Poverty Level, Below Poverty Level %, Above Poverty Level %.

  • simplify (float or None, default: None ) –

    If given, simplify geometries using this tolerance in meters (units of ESRI:102008). A value of 1000 gives a good balance of detail vs. speed. None keeps full-resolution geometries.

  • vintage (int, default: 2020 ) –

    ACS 5-year vintage year.

  • contiguous_only (bool, default: True ) –

    Restrict the result to the contiguous United States. With the default True, Alaska, Hawaii, and Puerto Rico are excluded.

Returns:

  • GeoDataFrame

    US states dataset in ESRI:102008 (Albers equal-area, meters).

Examples:

>>> from carto_flow.data import load_us_census
>>> gdf = load_us_census(population=True, race=True, simplify=1000)
>>> print(gdf["State Name"].head())
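
The `<group> %` columns listed under the race and poverty parameters are presumably each group's share of the corresponding total column. The sketch below illustrates that arithmetic on a plain dictionary with made-up numbers; the helper `add_percent_columns` is hypothetical, and the exact formula and rounding used by carto-flow are assumptions.

```python
def add_percent_columns(row: dict, total_key: str, group_keys: list[str]) -> dict:
    """Return a copy of row with a '<group> %' entry for each group column."""
    out = dict(row)
    total = row[total_key]
    for key in group_keys:
        # Guard against a zero total to avoid ZeroDivisionError.
        out[f"{key} %"] = 100.0 * row[key] / total if total else float("nan")
    return out


# Illustrative numbers only, not real ACS estimates.
row = {"Total Poverty": 1000, "Below Poverty Level": 125, "Above Poverty Level": 875}
result = add_percent_columns(
    row, "Total Poverty", ["Below Poverty Level", "Above Poverty Level"]
)
print(result["Below Poverty Level %"])  # 12.5
```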

load_us_states

load_us_states() -> geopandas.GeoDataFrame

Load a US states dataset with population and area information.

This dataset includes state boundaries, population estimates, and area information from the US Census Bureau.

Returns:

  • GeoDataFrame

    US states dataset with geometry and demographic information.

Examples:

>>> from carto_flow.data import load_us_states
>>> gdf = load_us_states()
>>> print(gdf.shape)
(51, ...)

load_world

load_world() -> geopandas.GeoDataFrame

Load a world countries dataset with population estimates.

This dataset includes country boundaries and population estimates from Natural Earth (public domain data).

Returns:

  • GeoDataFrame

    World countries dataset with geometry and population information. Columns:

      • name: Country name
      • continent: Continent name
      • pop_est: Population estimate
      • gdp_md_est: GDP estimate (millions USD)
      • geometry: Country boundaries

Examples:

>>> from carto_flow.data import load_world
>>> gdf = load_world()
>>> print(gdf.shape)
(177, 5)