Editor "Load version" scrambles pattern stops across routes — bug in gtfs-lib dependency (PatternFinder.createPatternObjects) #648

@canales

Summary

When loading a published GTFS version into the GTFS editor ("Load version"), patterns in the feed receive the stop sequence of a different pattern, typically one from a different route. Shapes display correctly; only stop sequences are affected. The corruption is silent: no error or warning is surfaced. The bug is deterministic and reproduces on every load of a multi-route feed whose file-loaded pattern count matches the derived pattern count.

Note: The fix does not belong in this repo. The root cause is in ibi-group/gtfs-lib at PatternFinder.createPatternObjects(). Filing here because ibi-group/gtfs-lib does not have issues enabled. Resolution requires a patch to gtfs-lib and a dependency bump in datatools-server.


Steps to Reproduce

  1. Start with any published feed that has 2+ routes, each with 2+ trip patterns
  2. Click Load version on the published version and confirm
  3. Open any route in the editor → Trip patterns tab
  4. Inspect the stop sequence of any pattern

Expected: stops match the pattern's shape and the published stop_times

Actual: stops belong to a different pattern, typically from a different route


Root Cause (in ibi-group/gtfs-lib)

PatternFinder.createPatternObjects() includes the following logic:

boolean usePatternsFromFeed =
    patternsFromFeed.size() == tripsForPattern.keySet().size();

if (usePatternsFromFeed) {
    pattern.pattern_id =
        patternsFromFeed.get(patternsFromFeedIndex).pattern_id; // ← BY INDEX
}

When the pattern counts match, file-loaded pattern IDs are assigned to derived patterns by list index, not by a content-based match. The two collections have incompatible orderings:

| List | Sort order |
| --- | --- |
| patternsFromFeed | pattern_id ascending |
| tripsForPattern (LinkedHashMultimap) | first trip occurrence in trips.txt |

These orderings are generally unrelated in real-world feeds, so pattern_stops rows end up written with the pattern_id of a different pattern unless the two orderings happen to coincide. Since the patterns table is not recreated when usePatternsFromFeed is true, it retains correct route/shape metadata, but the pattern_stops rows reference the wrong pattern_ids. No SQL constraint catches this; the corruption is silent and semantic.

The code comment at the assignment site explicitly acknowledges the problem:

"There is no viable relationship between patterns that are loaded from a feed (patternsFromFeed) and patterns generated here."


Proof

For a feed with 2 routes (Route A, Route B), each with 2 patterns:

patternsFromFeed order (ascending pattern_id):

| Index | pattern_id | Route | Shape |
| --- | --- | --- | --- |
| 0 | "1" | Route A | shape X |
| 1 | "2" | Route A | shape Y |
| 2 | "3" | Route B | shape W |
| 3 | "5" | Route B | shape Z |

Note: gap at "4" — a previously deleted pattern left a non-contiguous sequence, a common real-world condition.

tripsForPattern order (trip file order):

| Index | Route | Shape |
| --- | --- | --- |
| 0 | Route B | shape Z |
| 1 | Route B | shape W |
| 2 | Route A | shape X |
| 3 | Route A | shape Y |

Positional assignment result — 0 out of 4 correct:

| pattern_id written | Stops stored | patterns table says |
| --- | --- | --- |
| "1" | Route B / shape Z stops | Route A / shape X |
| "2" | Route B / shape W stops | Route A / shape Y |
| "3" | Route A / shape X stops | Route B / shape W |
| "5" | Route A / shape Y stops | Route B / shape Z |

Both mismatches were verified programmatically against a real GTFS feed and confirmed in the live editor.
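The worked example above can be replayed in a few lines of Java. This is only a sketch: `PositionalAssignmentDemo` and `Row` are hypothetical stand-ins invented for illustration, not gtfs-lib classes, and the data is the four-pattern example from the tables rather than a real feed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types for illustration; not gtfs-lib's own classes.
class PositionalAssignmentDemo {

    static final class Row {
        final String patternId; // null for derived patterns, which have no ID yet
        final String route;
        final String shape;

        Row(String patternId, String route, String shape) {
            this.patternId = patternId;
            this.route = route;
            this.shape = shape;
        }
    }

    // File-loaded patterns, ordered by ascending pattern_id.
    static List<Row> patternsFromFeed() {
        return Arrays.asList(
            new Row("1", "Route A", "shape X"),
            new Row("2", "Route A", "shape Y"),
            new Row("3", "Route B", "shape W"),
            new Row("5", "Route B", "shape Z"));
    }

    // Derived patterns, ordered by first trip occurrence in trips.txt.
    static List<Row> derivedPatterns() {
        return Arrays.asList(
            new Row(null, "Route B", "shape Z"),
            new Row(null, "Route B", "shape W"),
            new Row(null, "Route A", "shape X"),
            new Row(null, "Route A", "shape Y"));
    }

    // Pairs the two lists by index and counts pairs that describe the same route and shape.
    static int countCorrectByIndex(List<Row> fromFeed, List<Row> derived) {
        int correct = 0;
        for (int i = 0; i < derived.size(); i++) {
            Row file = fromFeed.get(i);
            Row d = derived.get(i);
            System.out.println("pattern_id " + file.patternId
                + ": stops from " + d.route + " / " + d.shape
                + ", patterns table says " + file.route + " / " + file.shape);
            if (file.route.equals(d.route) && file.shape.equals(d.shape)) {
                correct++;
            }
        }
        return correct;
    }

    public static void main(String[] args) {
        int correct = countCorrectByIndex(patternsFromFeed(), derivedPatterns());
        System.out.println(correct + " out of 4 correct");
    }
}
```

With these two orderings, no index pairs a file pattern with the derived pattern that shares its route and shape, which matches the "0 out of 4" result in the table.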


Suggested Fix (in ibi-group/gtfs-lib)

Replace positional index assignment with a shape_id-keyed lookup in PatternFinder.createPatternObjects():

// Build the lookup once, before the loop
Map<String, Pattern> filePatternsByShape = new HashMap<>();
for (Pattern p : patternsFromFeed) {
    // Skip patterns without a shape so that null never acts as a key
    if (p.shape_id != null) {
        filePatternsByShape.put(p.shape_id, p);
    }
}

// Inside the loop: replace the positional lookup with a content-keyed match
if (usePatternsFromFeed) {
    String shapeId = pattern.associatedShapes.isEmpty() ? null
        : pattern.associatedShapes.iterator().next();
    Pattern filePattern = shapeId == null ? null : filePatternsByShape.get(shapeId);
    if (filePattern != null) {
        pattern.pattern_id = filePattern.pattern_id;
        pattern.name = filePattern.name;
    } else {
        // No file pattern for this shape: fall back to a generated ID,
        // which must not collide with an ID already taken from the file
        pattern.pattern_id = Integer.toString(nextPatternId++);
    }
}

shape_id is a natural stable key: it is written to both trips.txt and datatools_patterns.txt at export time, which makes it a suitable join key between the two lists, provided each pattern has exactly one shape_id and no two patterns share one.
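To check the keyed lookup against the example from the Proof section, here is a minimal self-contained sketch. `ShapeKeyedAssignmentDemo`, `FilePattern`, `assignByShape` and `example` are hypothetical names made up for this illustration, not gtfs-lib API. It also assumes each shape_id belongs to at most one file pattern, and throws if that assumption fails instead of guessing, since this issue does not cover what should happen in that case.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types for illustration; not gtfs-lib's own classes.
class ShapeKeyedAssignmentDemo {

    static final class FilePattern {
        final String patternId;
        final String shapeId;

        FilePattern(String patternId, String shapeId) {
            this.patternId = patternId;
            this.shapeId = shapeId;
        }
    }

    // Maps each derived shape_id to the pattern_id of the file pattern with the
    // same shape_id, or to null where the file has no such pattern.
    static Map<String, String> assignByShape(List<FilePattern> fromFeed,
                                             List<String> derivedShapeIds) {
        Map<String, String> idByShape = new HashMap<>();
        for (FilePattern p : fromFeed) {
            String previous = idByShape.putIfAbsent(p.shapeId, p.patternId);
            if (previous != null) {
                // Assumption violated: the key is ambiguous, so fail loudly.
                throw new IllegalStateException("shape_id " + p.shapeId
                    + " is shared by patterns " + previous + " and " + p.patternId);
            }
        }
        Map<String, String> assigned = new LinkedHashMap<>();
        for (String shapeId : derivedShapeIds) {
            assigned.put(shapeId, idByShape.get(shapeId));
        }
        return assigned;
    }

    // The four-pattern example: file order by pattern_id, derived order by trip occurrence.
    static Map<String, String> example() {
        List<FilePattern> fromFeed = Arrays.asList(
            new FilePattern("1", "shape X"),
            new FilePattern("2", "shape Y"),
            new FilePattern("3", "shape W"),
            new FilePattern("5", "shape Z"));
        List<String> derived = Arrays.asList("shape Z", "shape W", "shape X", "shape Y");
        return assignByShape(fromFeed, derived);
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : example().entrySet()) {
            System.out.println(e.getKey() + " -> pattern_id " + e.getValue());
        }
    }
}
```

Because the match no longer depends on position, the result is the same regardless of how either list is ordered.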


Related

  • Secondary issue also identified in gtfs-lib: shape_id is not part of TripPatternKey.equals(), so trips with the same stops but different shapes are merged into one pattern with non-deterministic shape assignment (associatedShapes.iterator().next() on a HashSet). Lower severity but worth addressing separately.
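The merging effect described in this bullet can be illustrated with a small sketch. `StopsOnlyKey` is a hypothetical class written for this example; it only mimics the described behaviour (equality on stops alone) and is not a copy of gtfs-lib's TripPatternKey.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class StopsOnlyKeyDemo {

    // Hypothetical key whose equals() and hashCode() ignore the shape.
    static final class StopsOnlyKey {
        final List<String> stops;
        final String shapeId;

        StopsOnlyKey(List<String> stops, String shapeId) {
            this.stops = stops;
            this.shapeId = shapeId;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof StopsOnlyKey && stops.equals(((StopsOnlyKey) o).stops);
        }

        @Override
        public int hashCode() {
            return stops.hashCode();
        }
    }

    // Groups trips by key and collects the shapes seen for each group.
    static Map<StopsOnlyKey, Set<String>> groupShapesByKey(List<StopsOnlyKey> tripKeys) {
        Map<StopsOnlyKey, Set<String>> shapesByKey = new LinkedHashMap<>();
        for (StopsOnlyKey key : tripKeys) {
            shapesByKey.computeIfAbsent(key, k -> new HashSet<>()).add(key.shapeId);
        }
        return shapesByKey;
    }

    // Two trips with identical stops but different shapes.
    static Map<StopsOnlyKey, Set<String>> example() {
        List<String> stops = Arrays.asList("s1", "s2", "s3");
        return groupShapesByKey(Arrays.asList(
            new StopsOnlyKey(stops, "shape X"),
            new StopsOnlyKey(stops, "shape Y")));
    }

    public static void main(String[] args) {
        Map<StopsOnlyKey, Set<String>> groups = example();
        System.out.println("patterns: " + groups.size());
        System.out.println("shapes in merged pattern: " + groups.values().iterator().next().size());
    }
}
```

The two trips collapse into one group holding both shapes. Picking a single shape from that HashSet with iterator().next() then depends on an iteration order that Java does not specify.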
