GeoFlowKit

A Python package for handling and analyzing geographical flow data, extending pandas and geopandas with flow-specific operations.

Overview

GeoFlowKit provides FlowSeries and FlowDataFrame types, which are subclasses of pandas.Series and pandas.DataFrame respectively. They are designed to work with flow data consisting of origin-destination (OD) pairs, similar to how geopandas.GeoSeries and geopandas.GeoDataFrame work with geometries.

Installation

pip install geoflowkit

Or install from source:

pip install .

Dependencies

shapely
numpy
pandas
geopandas >= 1.0.1
matplotlib
scikit-learn
tqdm
numba

Quick Start

Creating Flow Objects

import numpy as np
from geoflowkit import Flow, FlowSeries, FlowDataFrame

# Create a single Flow (origin-destination pair)
flow = Flow([[0, 0], [1, 1]])

# Access origin and destination points
print(flow.o)  # POINT (0 0)
print(flow.d)  # POINT (1 1)

Creating FlowSeries

# From a list of Flow objects
fs = FlowSeries([
    Flow([[0, 0], [1, 1]]),
    Flow([[1, 1], [2, 2]]),
    Flow([[2, 2], [3, 3]])
], crs="EPSG:4326")

# From coordinate arrays using flows_from_od
from geoflowkit import flows_from_od

o_points = np.array([[0, 0], [1, 1], [2, 2]])
d_points = np.array([[1, 1], [2, 2], [3, 3]])
fs = flows_from_od(o_points, d_points, crs="EPSG:4326")

Creating FlowDataFrame

# Create a FlowDataFrame with attributes
data = {
    'id': [1, 2, 3],
    'value': [10, 20, 30],
    'geometry': fs
}
fdf = FlowDataFrame(data, crs="EPSG:4326")
print(fdf)

Reading Data from Files

# Read from CSV (specify origin/destination columns)
fdf = read_csv(
    'flow_data.csv',
    use_cols=['ox', 'oy', 'dx', 'dy'],
    crs='EPSG:4326'
)

# Read from GeoPackage
fdf = read_file('flow_data.gpkg', layer='flows')

Core Features

Flow Properties

# Access origin and destination points
origins = fdf.o  # GeoSeries of origin points
destinations = fdf.d  # GeoSeries of destination points

# Flow length and angle
lengths = fdf.length  # Distance from origin to destination
angles = fdf.angle    # Direction of flow (radians)

# Flow density and volume
density = fdf.density  # Flows per unit area
volume = fdf.volume    # Total bounding area

Spatial Operations

# Calculate pairwise distances between flows
from geoflowkit import pairwise_distances

dist_matrix = pairwise_distances(fdf, distance='max')

# Clip flows within a polygon
clipped = fdf.clip(polygon_mask)

# Select flows within bounds
within_bounds = fdf.within(bounds_box)

Spatial Clustering Scale Detection (K/L Functions)

# Calculate K function for spatial clustering
from geoflowkit import k_func, l_func

r_list, kr_list = k_func(fdf, dr=0.1, k=1)
r_list, lr_list = l_func(fdf, dr=0.1, k=1)

# Local L function for individual flows
from geoflowkit import local_l_func

llrs = local_l_func(fdf, r=0.5)

Grid Aggregation

# Divide study area into grid and aggregate flows
gridded = fdf.to_grid(delta_x=0.1, delta_y=0.1)

Visualization

# Plot flows as arrows
ax = fdf.plot(kind='arrow', column='value')

# Plot FlowSeries
ax = fs.plot()

Flow Clustering

# K-medoid clustering
from geoflowkit import kmedoid

labels = kmedoid(fdf, n_clusters=5)

# DBSCAN clustering
from geoflowkit import dbscan

labels = dbscan(fdf, eps=0.5, min_samples=5)

Manifold Learning (FTSNE)

from geoflowkit import FTSNE

# Global interpretability (separate O and D)
transformer = FTSNE(perplexity=200, learning_rate='auto')
X_embedded = transformer.fit_transform(
    fdf,
    identity={'o': 0, 'd': 1}
)

# Local interpretability (union O and D)
X_embedded = transformer.fit_transform(
    fdf,
    union={('o', 'd'): (0, 1)}
)

Location Centrality (I-index)

The I-index quantifies the irreplaceability of a location based on flows, combining flow volume and flow length into a single metric following the H-index principle.

from geoflowkit.spatial import i_index

# Calculate I-index for each zone
result = i_index(fdf, zones)

# Using origin points instead of destination
result = i_index(fdf, zones, od_type='o')

# With custom alpha parameter
result = i_index(fdf, zones, alpha=1000.0)

I-index definition: The I-index of a location is the maximum value of i such that at least i flows with a length of at least α × i meters have reached this location. Higher values indicate more irreplaceable locations that attract many long-distance flows.

Examples

Jupyter notebook examples are available in the examples/ folder:

basic_usage.ipynb - Basic usage of Flow, FlowSeries, and FlowDataFrame
clustering.ipynb - K-medoid and DBSCAN clustering for flow data
kl_function.ipynb - K/L functions for spatial clustering detection
ft_sne.ipynb - FTSNE manifold learning for flow data
centrality.ipynb - I-index for location irreplaceability

API Reference

Classes

Flow: Geometry object representing an origin-destination pair
FlowSeries: pandas Series subclass for storing Flow objects
FlowDataFrame: pandas DataFrame subclass with Flow geometry column

Key Functions

flows_from_od(o, d, crs=None): Create FlowSeries from coordinate arrays
flows_from_geometry(geometry, crs=None): Create FlowSeries from geometry objects
read_csv(file_path, use_cols, crs=None, **kwargs): Read flow data from CSV
read_file(file_path, **kwargs): Read flow data from vector file
pairwise_distances(fdf, distance='max', ...): Calculate flow distance matrix
k_func(fdf, dr, k=1, distance='max', ...): K function for spatial clustering detection
l_func(fdf, dr, k=1, distance='max', ...): L function for spatial clustering detection
local_l_func(fdf, r, distance='max', ...): Local L function for individual flows
kmedoid(fdf, n_clusters=5, ...): K-medoid clustering for flows
dbscan(fdf, eps=0.5, min_samples=5, ...): DBSCAN clustering for flows
i_index(fdf, zones, alpha=None, od_type='d', ...): I-index for location irreplaceability

License

GeoFlowKit is licensed under the MIT License.

Contact

For questions or feedback: djw@lreis.ac.cn

Name		Name	Last commit message	Last commit date
Latest commit History 40 Commits
examples		examples
geoflowkit		geoflowkit
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GeoFlowKit

Overview

Installation

Dependencies

Quick Start

Creating Flow Objects

Creating FlowSeries

Creating FlowDataFrame

Reading Data from Files

Core Features

Flow Properties

Spatial Operations

Spatial Clustering Scale Detection (K/L Functions)

Grid Aggregation

Visualization

Flow Clustering

Manifold Learning (FTSNE)

Location Centrality (I-index)

Examples

API Reference

Classes

Key Functions

License

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

GeoFlowKit

Overview

Installation

Dependencies

Quick Start

Creating Flow Objects

Creating FlowSeries

Creating FlowDataFrame

Reading Data from Files

Core Features

Flow Properties

Spatial Operations

Spatial Clustering Scale Detection (K/L Functions)

Grid Aggregation

Visualization

Flow Clustering

Manifold Learning (FTSNE)

Location Centrality (I-index)

Examples

API Reference

Classes

Key Functions

License

Contact

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages