Skip to content

Refactor#289

Open
FlxPo wants to merge 12 commits intomainfrom
refactor
Open

Refactor#289
FlxPo wants to merge 12 commits intomainfrom
refactor

Conversation

@FlxPo
Copy link
Copy Markdown
Contributor

@FlxPo FlxPo commented Mar 27, 2026

Motivation

The package and class structure has drifted over time through accumulated PRs. Many of those PRs introduced local changes, new classes, and new files without enough attention to the overall project structure, naming conventions, and core concepts. It's getting difficult to follow the logic of the package, fix bugs, or add new features.

Related issues:

Changes

This PR is a big refactor that redefines the core concepts of the package, and groups the files by conceptual proximity. The modelling workflow remains the same overall: define transport zones, generate a population, derive activity schedules from surveys, compute transport costs, and simulate mobility behaviors iteratively.

The package now uses submodules to have a folder structure that is easier to understand and navigate for users. Even if most previous concepts still exist, a lot of classes were renamed, broken up and moved around, so breaking changes are to be expected.

The project is now organized around the following concepts, aligned as much as possible with mobility and transport science concepts :

  • mobility.spatial: spatial definition of the model. This module now groups the classes and helpers used to define the study area and its zoning: StudyArea, TransportZones, local administrative units and categories, administrative boundaries, and the osm submodule for raw OpenStreetMap extracts and borders. The R script used to prepare transport zones also lives here.

  • mobility.population: population generation and localization. This module contains the main Population class together with the data assets it depends on, such as CityLegalPopulation and CensusLocalizedIndividuals.

  • mobility.surveys: survey-based behavioural inputs. This module contains the generic MobilitySurvey abstraction, the MobilitySurveyAggregator, and country-specific implementations under surveys.france (EMP, ENTD).

  • mobility.activities: activity concepts and their opportunity models. We used to call them "motives" because the french mobility survey has a trip oriented terminology, but I think it makes sense to organize our concepts around the activity schedule concept, which is now central in Mobility. The common Activity base class is now separated from the concrete activities Home, Other, Work, Study, Shop, and Leisure. Activity-specific subpackages group each activity class with the data assets that describe its spatial opportunities and flows, for example jobs for work, schools for studies, shops for shopping, and leisure facilities for leisure.

  • mobility.transport: transport supply and generalized costs. This module is split into three complementary parts:

    • transport.modes: transport mode definitions such as Walk, Bicycle, Car, Carpool, and PublicTransport, plus shared mode concepts in core, choice utilities in choice, and GTFS / intermodal public transport logic under public_transport.
    • transport.graphs: the graph-building pipeline, organized by graph state: simplified, modified, congested, contracted, plus core graph utilities.
    • transport.costs: travel cost computation and aggregation, including routing and generalized-cost parameters, path travel costs, congestion-related OD flow assets, and the TransportCostsAggregator.
  • mobility.trips: trip and day-plan generation. This module is now split between:

    • trips.individual_year_trips, for annual trip sampling at the individual level;
    • trips.group_day_trips, for the iterative grouped day-plan model. Inside group_day_trips, the code is further organized by role: core for the main GroupDayTrips / Run logic and parameters, plans for activity, destination and mode sequence construction, iterations for persisted iteration state, transitions for congestion and transition metrics, and evaluation for result analysis.
  • mobility.impacts: post-processing of transport impacts. This module groups the carbon-related logic, including carbon_computation, default GHG values, and the ADEME Base Carbone integration.

  • mobility.runtime: technical infrastructure that supports the domain modules. This module now isolates the package internals that are not mobility concepts themselves: cached Asset abstractions (asset, file_asset, in_memory_asset), file download and I/O helpers, R integration, and bundled static resources used at runtime.

  • At the package root, the remaining files are limited to package entry points and configuration: __init__.py exposes the public API, and config.py contains set_params and environment setup.

Breaking changes

  • The internal package layout was heavily reorganized. Top-level imports are still exposed where possible (for example mobility.Car), but direct imports based on the previous file structure will likely need to be updated.
  • Many previous internal module paths were regrouped under new domain-oriented subpackages, especially under mobility.spatial, mobility.activities, mobility.transport, mobility.trips, mobility.population, mobility.surveys, mobility.impacts, and mobility.runtime.
  • Old internal import paths under motives, parsers, transport_modes, transport_graphs, transport_costs, study_area, and transport_zones were reorganized and may need to be updated.
  • PopulationTrips becomes GroupDayTrips, which is a wrapper around a weekday Run and an optional weekend Run. Each Run exposes outputs and evaluation methods similar to those previously associated with PopulationTrips.
  • Trips becomes IndividualYearTrips.
  • Motive classes become Activity classes.
  • Some concrete class names were also simplified, for example CarMode becomes Car, and activity classes now use names such as Home, Work, Study, Shop, Leisure, and Other.
  • Some dead code and legacy scripts were removed from the package entirely, or moved to the top-level experiments/ folder when they were kept for reference.
  • set_params is still available as mobility.set_params, but the underlying module is now mobility.config instead of mobility.set_params for direct imports.

Example

From the quickstart :

import os
import dotenv
import mobility
from mobility.trips.group_day_trips import Parameters

dotenv.load_dotenv()

mobility.set_params(
    package_data_folder_path=os.environ["MOBILITY_PACKAGE_DATA_FOLDER"],
    project_data_folder_path=os.environ["MOBILITY_PROJECT_DATA_FOLDER"]
)

# Using Foix (a small town) and a limited radius for quick results
transport_zones = mobility.TransportZones("fr-09122", radius=10.0)

# Using EMP, the latest national mobility survey for France
survey = mobility.EMPMobilitySurvey()

# Creating a synthetic population of 1000 for the area
population = mobility.Population(transport_zones, sample_size=1000)

# Simulating trips for this population for car, walk, bicycle
population_trips = mobility.GroupDayTrips(
    population=population,
    modes=[
        mobility.Car(transport_zones),
        mobility.Walk(transport_zones),
        mobility.Bicycle(transport_zones),
    ],
    activities=[
        mobility.Home(),
        mobility.Work(),
        mobility.Other(population=population),
    ],
    surveys=[survey],
    parameters=Parameters(
        n_iterations=1,
        mode_sequence_search_parallel=False,
    ),
)

# You can get weekday plan steps to inspect them
weekday_plan_steps = population_trips.get()["weekday_plan_steps"].collect()

# You can compute global metrics for weekday trips
global_metrics = population_trips.weekday_run.evaluate("global_metrics")

# You can plot weekday OD flows, with labels for prominent cities
weekday_results = population_trips.weekday_run.results()
labels = weekday_results.get_prominent_cities()
weekday_results.plot_od_flows(labels=labels)

# You can get a report of the parameters used in the model
report = population_trips.parameters_dataframe()

AI-assisted contribution

Select one:

  • AI used for substantial parts of this PR

If substantial, briefly describe:

  • Scope of AI-assisted content: splitting up classes, reorganizing methods within and across classes, naming suggestions.
  • What you reviewed or changed: all AI generated changes were reviewed to align with the package style. The package is organized around mobility concepts like transport zones or modes, materialized or in memory input and output assets that are linked together by a dependency graph and a caching system. Changes were made to the names of the classes, methods, variable names, code flow, to make the intent of the code as clear as possible for modellers, whore are not often not professional developers).
  • How you validated it (tests, checks, manual verification): model runs for test cases (from a clean cache), local tests with the unit / integration test suite. Local tests are still ongoing to check that we don't have regressions.

Checklist

  • I have reviewed all code in this PR
  • I understand the code and can maintain it
  • I added or ran appropriate tests/checks for the changed behavior

@FlxPo FlxPo self-assigned this Mar 27, 2026
@FlxPo FlxPo added the enhancement New feature or request label Mar 27, 2026
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 30, 2026

Codecov Report

❌ Patch coverage is 82.58391% with 275 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.25%. Comparing base (79769a7) to head (45400aa).

Files with missing lines Patch % Lines
mobility/trips/group_day_trips/core/results.py 43.96% 144 Missing ⚠️
...ips/group_day_trips/iterations/iteration_assets.py 80.80% 19 Missing ⚠️
.../carpool/detailed/detailed_carpool_travel_costs.py 33.33% 18 Missing ⚠️
...ity/trips/group_day_trips/iterations/iterations.py 81.81% 16 Missing ⚠️
...lity/trips/group_day_trips/core/group_day_trips.py 81.53% 12 Missing ⚠️
...detailed/detailed_carpool_travel_costs_snapshot.py 55.00% 9 Missing ⚠️
...ips/group_day_trips/plans/destination_sequences.py 90.52% 9 Missing ⚠️
...lity/transport/costs/transport_costs_aggregator.py 92.78% 7 Missing ⚠️
mobility/transport/costs/path/path_travel_costs.py 85.71% 6 Missing ⚠️
mobility/trips/group_day_trips/core/run.py 95.48% 6 Missing ⚠️
... and 15 more
Additional details and impacted files
@@             Coverage Diff             @@
##             main     #289       +/-   ##
===========================================
+ Coverage   59.06%   73.25%   +14.18%     
===========================================
  Files         129      143       +14     
  Lines        6738     6068      -670     
===========================================
+ Hits         3980     4445      +465     
+ Misses       2758     1623     -1135     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@Mind-the-Cap Mind-the-Cap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, a great improvement! Thanks @FlxPo for the huge work on that.

I'm approving but I would greatly prefer if we kept CarMode, PublicTransportMode, BicycleMode, etc. and WorkActivity, etc., not just Car and Work. It would me much clearer what theses classes are about (modes and activities structured in the very same way). It would avoid future confusion (out of my mind: would Train be a mode specifically for trains, a very precise activity of going to sport trainings, or a future class for machine learning?)

I would be keen to have more precise docstrings in many cases (I highlighted some of them). It would avoid some documentation debt!

congestion_state: CongestionState | None = None,
):

# Hack to match the current API and compute the GHG emissions from
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have an issue for that todo?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet ! For a better separation of concerns, I think we should put impact calculations in specific classes or functions in the new impacts folder.

.with_columns(exp_u=pl.col("cost").neg().exp())
.with_columns(prob=pl.col("exp_u")/pl.col("exp_u").sum().over(["from", "to"]))

# Keep only the first 99.9 % of the distribution
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be interesting to have a parameter to control this percentage? Do we know the effect of using 99% or 99.99% instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we could have a parameter. The idea is to prune modes with very low probability.

iteration=None,
mode_name: str,
):
"""Persist congestion flows as a period-scoped snapshot asset."""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I woudl understand better with a more detailed asset.
What's unclear to me:

  • what is a snapshot in that context?
  • what means period-scoped, and which period?
  • what means "persist", notably as the function is called "create"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming could be improved, I agree ! I dropped all "snapshot" terms in #296 in favor of more explicit terms. Period scoped in this context means the value of flows for a given iteration. They are persisted in a "snapshot" asset at each iteration because we need to be able to resume calculations from the middle of a run.

from mobility.transport.modes.core.transport_mode import TransportMode


class GroupDayTrips:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this name quite unclear to be fair!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we talked about PopulationGroupsTrips in #155. I wanted to add the concept of time period to differentiate between this population level run for week days and weekends from the individual level run over a year. So I tried PopulationGroupsDayTrips, DemandGroupsDayTrips, PopSegmentDayTrips... But I was trying to avoid long class names for users, so went with GroupDayTrips.

Any suggestions ?

from ..transitions.transition_metrics import state_waterfall as _state_waterfall


class RunResults:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably be a little more precise!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be more specific ?



@dataclass
class RunState:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be great to have a little doc on this class!

rng_state: object


class Iteration:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this name might be confusing as we have other kinds of interations (notably for congestion)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe RunIteration ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants