Merged
5 changes: 3 additions & 2 deletions .github/workflows/claude-code-review.yml
@@ -1,8 +1,9 @@
name: Claude Code Review

on:
-pull_request:
-types: [opened, synchronize, ready_for_review, reopened]
+workflow_dispatch:
+# pull_request:
+# types: [opened, synchronize, ready_for_review, reopened]
# Optional: Only run on specific file changes
# paths:
# - "src/**/*.ts"
212 changes: 212 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,212 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Overview

`flownet` is an R package for transport modeling that implements network processing, route enumeration, and traffic assignment algorithms. The package is maintained by CPCS transport consultants and provides high-performance tools through a combination of R, fastverse packages, and custom C implementations.

## Development Commands

### Package Building and Testing

```r
# Build and install the package
devtools::install()

# Run tests
devtools::test()

# Run specific test file
testthat::test_file("tests/testthat/test-assignment.R")

# Check package (like R CMD check)
devtools::check()

# Build documentation
devtools::document()

# Build vignettes
devtools::build_vignettes()
```

### C Code Compilation

The package includes custom C implementations in `src/` for performance-critical operations:
- `path_sized_logit.c` - Core path-sized logit algorithm
- `utils.c` - Utility functions for path operations
- `init.c` - Registration of C functions

After modifying C code:
```r
devtools::clean_dll() # Clean compiled objects
devtools::load_all() # Recompile and reload
```

### Testing Commands

```bash
# Run R CMD check from terminal
R CMD build . && R CMD check flownet_*.tar.gz

# Run tests with coverage
Rscript -e "covr::package_coverage()"
```

## Architecture

### Core Algorithms

The package centers on two traffic assignment methods:

1. **All-or-Nothing (AoN)**: Fast assignment that allocates all flow to shortest paths. Implementation uses batched shortest path computation grouped by origin nodes for efficiency.

2. **Path-Sized Logit (PSL)**: Sophisticated stochastic assignment considering multiple alternative routes with overlap correction. The algorithm:
- Enumerates candidate routes using distance matrices and geographic filtering
- Filters routes by detour factor (`detour.max`) and angle constraints (`angle.max`)
- Computes path-size factors to penalize overlapping routes
- Assigns flows probabilistically based on generalized costs
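
As a rough illustration of the last two steps, the following base R sketch computes path-size factors and logit probabilities for one OD pair. It uses the textbook path-size logit form, not the package's C implementation; the toy routes, `edge_cost`, `beta`, and the demand of 100 are all hypothetical.

```r
# Hypothetical toy data: three candidate routes for one OD pair,
# each given as a vector of edge IDs, plus a cost per edge.
paths <- list(c(1L, 2L, 3L), c(1L, 4L, 3L), c(5L, 6L))
edge_cost <- c(2, 3, 2, 4, 5, 3)  # cost of edges 1..6
beta <- 1                         # assumed cost sensitivity

# Number of candidate routes that use each edge
edge_use <- tabulate(unlist(paths), nbins = length(edge_cost))

# Total cost of each route
path_cost <- vapply(paths, function(p) sum(edge_cost[p]), numeric(1))

# Path-size factor: cost share of each edge, discounted by how many
# routes share that edge (1 means the route overlaps with no other)
path_size <- vapply(seq_along(paths), function(k) {
  p <- paths[[k]]
  sum(edge_cost[p] / path_cost[k] / edge_use[p])
}, numeric(1))

# Logit choice probabilities with the log path-size correction
u <- -beta * path_cost + log(path_size)
prob <- exp(u - max(u)) / sum(exp(u - max(u)))

# Split a hypothetical demand of 100 across the routes
flow <- 100 * prob
```

Routes 1 and 2 share edges 1 and 3, so their path-size factors fall below 1 and they receive less flow than an equally costly independent route would.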

### Route Enumeration Strategy

The PSL method uses a three-step approach to avoid computing implausible paths:

1. **Pre-selection**: Uses precomputed distance matrices to identify promising intermediate nodes based on total cost (origin→intermediate→destination)
2. **Geographic filtering**: When `angle.max` is specified and coordinates are available, filters nodes using the triangle equation with geodesic distances
3. **Path computation**: Only computes actual paths for pre-selected candidates, then filters duplicates

This strategy dramatically reduces computational cost compared to enumerating all possible paths.
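
A minimal base R sketch of the pre-selection step, assuming a precomputed cost matrix is available. The matrix values, the node indices, and the name `detour_max` (standing in for `detour.max`) are hypothetical, and the angle filter is not shown.

```r
# Hypothetical 5-node network: symmetric shortest-path cost matrix
dmat <- matrix(c(
   0, 4, 7, 9, 12,
   4, 0, 3, 6,  8,
   7, 3, 0, 5,  6,
   9, 6, 5, 0,  4,
  12, 8, 6, 4,  0), nrow = 5, byrow = TRUE)

o <- 1L           # origin node
d <- 5L           # destination node
detour_max <- 1.05  # stands in for `detour.max`

# Cost of travelling origin -> intermediate -> destination, for every node
via_cost <- dmat[o, ] + dmat[, d]

# Keep intermediates whose detour stays within the allowed factor
candidates <- which(via_cost <= detour_max * dmat[o, d])
candidates <- setdiff(candidates, c(o, d))
```

Only the surviving `candidates` would then need actual shortest-path calls, which is where the savings come from.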

### Network Processing Pipeline

Typical workflow for processing spatial networks:

1. **`linestrings_to_graph()`**: Convert sf LINESTRING geometries to graph data frames with node coordinates (FX, FY, TX, TY)
2. **`create_undirected_graph()`**: Normalize edge directions and aggregate bidirectional links
3. **`consolidate_graph()`**: Contract intermediate nodes (degree-2 nodes) recursively, merging edges while preserving important nodes
4. **`simplify_network()`**: Further reduce network size using either:
- Shortest-paths method: Keep only edges traversed by shortest paths between key nodes
- Cluster method: Spatially cluster nodes using leaderCluster and contract graph

### Parallelization

The package uses `mirai` for asynchronous parallelism:
- Work is split across OD-pairs and distributed to daemon processes
- Each daemon processes a subset independently
- Results are aggregated after collection
- The `nthreads` parameter controls the number of parallel workers
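
The split-and-aggregate pattern can be sketched in base R as follows. The parallel branch of `run_assignment()` draws a random worker index per OD pair with `sample.int()` and then maps over the groups with `mirai_map()`; here `lapply()` and the toy `worker()` function are stand-ins so the example runs without daemons.

```r
set.seed(1)     # only to make this sketch reproducible
N <- 20L        # hypothetical number of OD pairs
nthreads <- 4L  # hypothetical number of workers

# Assign each OD pair to a worker at random, then group the indices
ind <- sample.int(nthreads, N, replace = TRUE)
ind_list <- split(seq_len(N), ind)

# Stand-in for the per-chunk assignment: returns a partial result
worker <- function(idx) sum(idx)

# Sequential stand-in for the parallel map over chunks
partial <- lapply(ind_list, worker)

# Aggregate the partial results after collection
total <- Reduce(`+`, partial)
```

Because every OD pair lands in exactly one chunk, summing the partial results gives the same total as a single sequential pass.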

### C Integration

Performance-critical operations are implemented in C and called via `.Call()`:
- `C_compute_path_sized_logit`: Core PSL computation including overlap detection and probability calculation
- `C_check_path_duplicates`: Detect paths with duplicate edges (invalid routes)
- `C_assign_flows_to_paths`: Batch flow assignment for AoN method
- `C_mark_edges_traversed`: Track edge usage for simplify_network
- `C_set_vector_elt`: Efficient list element assignment

## Code Organization

### Main Source Files

- **`R/assignment.R`** (802 lines): Contains `run_assignment()` and both AoN and PSL core functions. The PSL implementation handles distance matrix chunking for large networks and coordinates with C functions for path overlap calculations.

- **`R/utils.R`** (1273 lines): Network processing functions including:
- Graph conversion utilities (`linestrings_to_graph`, `nodes_from_graph`, etc.)
- `consolidate_graph()`: Recursive node contraction with sophisticated degree tracking
- `simplify_network()`: Two methods for network reduction
- `melt_od_matrix()`: OD matrix format conversion

- **`R/data.R`**: Documentation for included datasets (Africa network, cities, trade flows)

### C Source Files

- **`src/path_sized_logit.c`**: Implements path-size factor computation and flow assignment
- **`src/utils.c`**: Helper functions for path operations
- **`src/init.c`**: C function registration for .Call interface

### Tests

Tests are organized by functionality in `tests/testthat/`:
- `test-assignment.R`: Traffic assignment methods
- `test-graph-utils.R`: Graph utility functions
- `test-network-processing.R`: Network conversion and processing
- `test-od-matrix.R`: OD matrix operations
- `test-consolidation.R`: Graph consolidation

## Dependencies

### Core Dependencies
- **collapse** (≥ 2.1.5): Fast data transformations, used extensively for grouping, aggregation, and memory-efficient operations
- **igraph** (≥ 2.1.4): Shortest path algorithms via Dijkstra
- **sf** (≥ 1.0.0): Spatial data handling for LINESTRING networks
- **geodist** (≥ 0.1.1): Fast haversine distance calculations for geographic filtering
- **leaderCluster** (≥ 1.5.0): Efficient spatial clustering for network simplification
- **mirai** (≥ 2.5.2): Asynchronous parallelism
- **kit** (≥ 0.0.21): Fast tabulation and vectorized operations

## Key Implementation Details

### Distance Matrix Strategy

The package uses adaptive distance matrix computation:
- If network size ≤ `sqrt(dmat.max.size)`, precompute full distance matrix once
- Otherwise, compute in chunks as needed during OD-pair iteration
- Separate geodesic distance matrices are used for angle-based filtering
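
The decision rule might look like the sketch below. `plan_dmat()` is a hypothetical helper, "network size" is assumed to mean the number of nodes, and the chunk sizing (as many origin rows as fit within the size limit) is an assumption rather than the package's actual logic.

```r
# Hypothetical helper: decide between one full matrix and row chunks
plan_dmat <- function(n_nodes, dmat_max_size) {
  # A full n x n matrix has n^2 cells, hence the sqrt() comparison
  if (n_nodes <= sqrt(dmat_max_size)) {
    return(list(mode = "full", chunks = list(seq_len(n_nodes))))
  }
  # Assumed chunking: as many origin rows as fit within the size limit
  rows_per_chunk <- max(1L, as.integer(dmat_max_size %/% n_nodes))
  chunk_id <- (seq_len(n_nodes) - 1L) %/% rows_per_chunk
  list(mode = "chunked",
       chunks = unname(split(seq_len(n_nodes), chunk_id)))
}

small <- plan_dmat(100, 1e4)   # 100 nodes fit: one full matrix
large <- plan_dmat(1000, 1e4)  # 1000 nodes: chunks of 10 origin rows
```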

### Graph Representation

Graphs are represented as data frames with:
- `from`, `to`: Node IDs (integers)
- `FX`, `FY`, `TX`, `TY`: Node coordinates (for spatial operations)
- `edge`: Edge identifier (optional, regenerated by many functions)
- Cost/attribute columns (e.g., `duration`, `cost`, `distance`)

Internally, igraph is used for shortest path computation, but the primary data structure is a data frame for flexibility and integration with fastverse tools.
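
For orientation, a graph in this layout could be built by hand as below. The values are made up, and the node table is derived with plain base R in the spirit of `nodes_from_graph()`, not by calling it.

```r
# Hypothetical three-edge graph in the data frame layout described above
graph <- data.frame(
  from = c(1L, 2L, 3L),
  to   = c(2L, 3L, 4L),
  FX = c(0, 1, 2), FY = c(0, 0, 1),  # coordinates of the "from" nodes
  TX = c(1, 2, 3), TY = c(0, 1, 1),  # coordinates of the "to" nodes
  duration = c(5, 7, 4)              # example cost column
)
graph$edge <- seq_len(nrow(graph))   # optional edge identifier

# Derive a unique node table from both edge endpoints
nodes <- unique(rbind(
  data.frame(node = graph$from, X = graph$FX, Y = graph$FY),
  data.frame(node = graph$to,   X = graph$TX, Y = graph$TY)
))
nodes <- nodes[order(nodes$node), ]
```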

### Node Consolidation Algorithm

The `consolidate_graph()` function uses a sophisticated multi-pass approach:
1. Drop loop edges, duplicates, and singleton edges (optional)
2. Identify degree-2 nodes (or nodes with deg_from=1 and deg_to=1 for directed graphs)
3. For undirected graphs, orient edges so intermediate nodes appear as "from" in one edge and "to" in another
4. Merge edges through intermediate nodes, tracking groups via `gid` vector
5. Aggregate edge attributes using collapse::collap()
6. Repeat recursively if `recursive = "full"` until no more consolidation possible

The `by` parameter allows preserving mode/type boundaries by preventing consolidation across different link characteristics.
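
The core idea of contracting degree-2 nodes can be sketched in base R. This is a deliberately simplified, undirected version: `contract_degree2()` is a hypothetical helper that merges one node at a time and sums a single `cost` column, whereas the package tracks groups and aggregates attributes with collapse. The `by` parameter is not modelled.

```r
# Hypothetical helper: repeatedly contract degree-2 nodes (undirected)
contract_degree2 <- function(g, keep = integer(0)) {
  repeat {
    deg <- table(c(g$from, g$to))
    cand <- setdiff(as.integer(names(deg)[as.vector(deg) == 2L]), keep)
    merged <- FALSE
    for (v in cand) {
      idx <- which(g$from == v | g$to == v)
      if (length(idx) != 2L) next        # self-loop on v: skip
      ends <- c(g$from[idx], g$to[idx])
      ends <- ends[ends != v]
      # Skip if merging would create a loop edge
      if (length(ends) != 2L || ends[1] == ends[2]) next
      # Replace the two edges through v by one edge with summed cost
      g <- rbind(g[-idx, ],
                 data.frame(from = ends[1], to = ends[2],
                            cost = sum(g$cost[idx])))
      merged <- TRUE
      break                              # recompute degrees after a merge
    }
    if (!merged) return(g)
  }
}

# Toy graph: chain 1-2-3-4 with a branch 3-5; node 2 is intermediate
g <- data.frame(from = c(1L, 2L, 3L, 3L),
                to   = c(2L, 3L, 4L, 5L),
                cost = c(1, 2, 3, 4))
res <- contract_degree2(g)
```

Node 2 disappears and edges 1-2 and 2-3 become a single edge 1-3, while node 3 is kept because it is a junction.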

## Working with Spatial Data

The package integrates with sf for spatial operations:

```r
# Typical pattern for mapping OD zones to network nodes
nodes <- nodes_from_graph(graph, sf = TRUE)
nearest_nodes <- nodes$node[st_nearest_feature(od_zones, nodes)]
```

Coordinate columns (FX, FY, TX, TY) are preserved through most operations and can be used to convert back to sf LINESTRING objects with `linestrings_from_graph()`.

## Performance Considerations

- Use `method = "AoN"` for large networks when route alternatives are not needed (much faster)
- Adjust `detour.max` and `angle.max` to control PSL computation time (lower values = faster)
- Set `unique.cost = TRUE` to deduplicate routes with same total cost
- Use `dmat.max.size` to control memory usage for large networks
- Enable multithreading with `nthreads` for large OD matrices
- Consider consolidating and simplifying networks before assignment to reduce computational burden

## Package Structure

This is a standard R package with:
- DESCRIPTION: Package metadata and dependencies
- NAMESPACE: Exported functions (managed by roxygen2)
- R/: R source code
- src/: C source code with compiled .so/.dll
- man/: Documentation (auto-generated from roxygen2 comments)
- tests/testthat/: Test suite
- vignettes/: Package vignettes
- data/: Included datasets (Africa network, cities, trade)

Use roxygen2 for documentation - add `#'` comments above functions and run `devtools::document()` to update NAMESPACE and man/ files.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Package: flownet
Type: Package
Title: Transport Modeling: Network Processing, Route Enumeration, and Traffic Assignment
-Version: 0.2.1.9000
+Version: 0.2.2
Authors@R: c(person("Sebastian", "Krantz", email = "sebastian.krantz@graduateinstitute.ch", role = c("aut", "cre")),
person("Kamol", "Roy", role = "ctb"))
Description: High-performance tools for transport modeling - network processing, route
1 change: 0 additions & 1 deletion NAMESPACE
@@ -89,7 +89,6 @@ importFrom(kit,fpmin)
importFrom(kit,iif)
importFrom(leaderCluster,leaderCluster)
importFrom(mirai,daemons)
-importFrom(mirai,everywhere)
importFrom(mirai,mirai_map)
importFrom(progress,progress_bar)
importFrom(sf,st_as_sf)
4 changes: 3 additions & 1 deletion NEWS.md
@@ -1,7 +1,9 @@
-# flownet 0.2.1.9000
+# flownet 0.2.2

- Fixed issue in `consolidate_graph()` which used to modify columns (`from` and `to`) in-place. Users in older versions are advised to input a `data.table::copy()` of the graph to retain it.

+- Fixes issue with multithreading for newer versions of *mirai* (or R). Thanks @kent37 (#69).

# flownet 0.2.1

- `angle.max` constraint in `run_assignment()` is now two-sided (angle measured from origin and destination node against the straight line between them), rather than just one-sided (from origin). Also, the implementation is slightly more efficient.
7 changes: 2 additions & 5 deletions R/assignment.R
@@ -232,7 +232,7 @@
#' @importFrom kit fpmin fpmax
#' @importFrom igraph V graph_from_data_frame delete_vertex_attr igraph_options distances shortest_paths vcount ecount
#' @importFrom geodist geodist_vec
-#' @importFrom mirai mirai_map daemons everywhere
+#' @importFrom mirai mirai_map daemons
#' @importFrom progress progress_bar
run_assignment <- function(graph_df, od_matrix_long,
directed = FALSE,
@@ -661,13 +661,10 @@ run_assignment <- function(graph_df, od_matrix_long,
if(!is.finite(nthreads) || nthreads <= 1L) {
res$final_flows <- run_assignment_core(seq_len(N), verbose, TRUE)
} else {
-envir <- environment()
# Split OD matrix in equal parts
ind <- sample.int(as.integer(nthreads), N, replace = TRUE)
ind_list <- gsplit(g = if(is_aon) sort(ind) else ind) # Since AoN should reduce calls to shortest_paths()
daemons(n = nthreads - 1L)
-# Pass current environment dynamically
-everywhere({}, envir)
# Now run the map in the background
res_other <- mirai_map(ind_list[-1L], run_assignment_core)
# Runs the first instance in the current session
@@ -697,7 +694,7 @@ run_assignment <- function(graph_df, od_matrix_long,
}
}
res$final_flows <- final_flows
-rm(res_other, envir, ind_list, final_flows)
+rm(res_other, ind_list, final_flows)
}

if(anyNA(od_pairs)) {