aggregate_channels does not re-index per-channel identifier columns #4546

@h-mayorquin

I came here thinking about these two: SpikeInterface/probeinterface#425 and SpikeInterface/probeinterface#420.

aggregate_channels concatenates per-channel arrays verbatim across children, including columns whose values are local integer identifiers. Two such columns are affected: the top-level group property, and the probe_index field inside contact_vector. Neither is shifted per child, so values from different children end up sharing the same namespace in the aggregate. A group value of 0 after aggregation can refer to shank 0 of any original recording, and a probe_index value of 0 in the combined contact_vector no longer identifies a unique probe, which is why ProbeGroup.from_numpy at read time collapses distinct probes into a single synthetic one.

```python
from spikeinterface.core import generate_recording, aggregate_channels

# Two independent 4-channel recordings, each with its own local groups 0 and 1.
rec_A = generate_recording(num_channels=4, durations=[1.0], set_probe=False).rename_channels(["a0", "a1", "a2", "a3"])
rec_A.set_property("group", [0, 0, 1, 1])

rec_B = generate_recording(num_channels=4, durations=[1.0], set_probe=False).rename_channels(["b0", "b1", "b2", "b3"])
rec_B.set_property("group", [0, 0, 1, 1])

# Aggregation concatenates the "group" values of both children verbatim.
combined = aggregate_channels([rec_A, rec_B])

# Splitting by "group" should separate shanks; print which channels land together.
for group_id, sub in combined.split_by("group").items():
    print(f"group {group_id}: {list(sub.get_channel_ids())}")
```

Expected: four groups, one per shank per probe.

Observed on main:

```
group 0: ['a0', 'a1', 'b0', 'b1']
group 1: ['a2', 'a3', 'b2', 'b3']
```

Group 0 now mixes shank 0 of probe A with shank 0 of probe B, and group 1 mixes shank 1 of probe A with shank 1 of probe B. Any per-shank pipeline built on split_by("group") silently operates on cross-probe mixtures.
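To make the expected result concrete, here is a minimal pure-Python sketch of what per-child re-indexing of a local integer column could look like. The helper name `offset_local_ids` is hypothetical and is not part of spikeinterface; it only illustrates the remapping, under the assumption that the aggregate keeps channels in child order.

```python
def offset_local_ids(per_child_ids):
    """Concatenate per-child integer ids, remapping each child's ids
    to a globally unique, dense range (hypothetical helper)."""
    combined = []
    next_free = 0  # first global id that has not been handed out yet
    for local_ids in per_child_ids:
        # Map each distinct local id to a fresh global id, in sorted order.
        mapping = {
            local: next_free + rank
            for rank, local in enumerate(sorted(set(local_ids)))
        }
        combined.extend(mapping[local] for local in local_ids)
        next_free += len(mapping)
    return combined


groups_a = [0, 0, 1, 1]  # "group" property of rec_A in the example above
groups_b = [0, 0, 1, 1]  # "group" property of rec_B
new_groups = offset_local_ids([groups_a, groups_b])
print(new_groups)  # [0, 0, 1, 1, 2, 2, 3, 3]
```

With values like these, the aggregate would have four distinct groups, one per shank per probe. As a possible user-side workaround until the behavior changes, the remapped list could presumably be written back with `combined.set_property("group", new_groups)`.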

The probe_index case inside contact_vector has the same mechanism: aggregating two single-probe recordings (each with probe_index = 0) produces a combined contact_vector whose probe_index column is all zeros, so combined.get_probegroup() reconstructs a single merged probe regardless of how many distinct probes fed in. If the children also share contact_ids (common, since probes typically number contacts from 0), the collapsed namespace makes ProbeGroup.from_numpy raise ValueError: contact_ids must be unique within a Probe instead of silently returning the merged probe.
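The collapse can be illustrated with a toy model. This is not probeinterface's implementation: `rebuild_probes` and `concatenate_rows` are hypothetical stand-ins that only mimic the two behaviors described above (grouping rows by `probe_index`, and rejecting duplicate contact ids within one probe). The shift assumes each child numbers its probes densely from 0.

```python
from collections import defaultdict


def rebuild_probes(rows):
    """Toy stand-in for probe reconstruction: group (probe_index, contact_id)
    rows by probe_index and require unique contact ids inside each probe."""
    probes = defaultdict(list)
    for probe_index, contact_id in rows:
        probes[probe_index].append(contact_id)
    for contact_ids in probes.values():
        if len(set(contact_ids)) != len(contact_ids):
            raise ValueError("contact_ids must be unique within a Probe")
    return dict(probes)


def concatenate_rows(children, shift_probe_index):
    """Concatenate per-child rows, optionally shifting probe_index per child."""
    rows, offset = [], 0
    for child in children:
        rows.extend((p + offset, c) for p, c in child)
        if shift_probe_index:
            # Assumes the child's probe indices are dense and start at 0.
            offset += len({p for p, _ in child})
    return rows


child_a = [(0, "0"), (0, "1")]  # one probe, contacts numbered from 0
child_b = [(0, "0"), (0, "1")]  # a different probe with the same local ids

shifted = rebuild_probes(concatenate_rows([child_a, child_b], shift_probe_index=True))
print(shifted)  # {0: ['0', '1'], 1: ['0', '1']}

try:
    rebuild_probes(concatenate_rows([child_a, child_b], shift_probe_index=False))
    collapsed_error = None
except ValueError as err:
    collapsed_error = str(err)
print(collapsed_error)  # contact_ids must be unique within a Probe
```

Without the shift, both children land under probe 0 and the duplicate contact ids trigger the error. With the shift, the two probes stay distinct even though their contact ids overlap.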

The same mechanism affects any per-probe-local integer property set by extractors (for example IBL's shank, shank_row, shank_col, adc, index_on_probe; Maxwell's electrode; Biocam's row, col): after aggregation these all share a single namespace across children.
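For columns whose values carry meaning of their own (a hardware electrode number, for instance), shifting the values may not be desirable. One alternative, sketched below purely as an illustration, is to keep the values and pair each with the index of the child it came from. The helper `tag_with_child` and the sample values are hypothetical, not taken from any extractor.

```python
def tag_with_child(per_child_values):
    """Pair every per-channel value with the index of the child it came from."""
    return [
        (child_index, value)
        for child_index, values in enumerate(per_child_values)
        for value in values
    ]


electrode_a = [12, 13]  # hypothetical per-probe-local values from child 0
electrode_b = [12, 40]  # child 1 reuses the value 12
tagged = tag_with_child([electrode_a, electrode_b])
print(tagged)  # [(0, 12), (0, 13), (1, 12), (1, 40)]

# The bare values collide across children; the (child, value) pairs do not.
print(len(set(electrode_a + electrode_b)), len(set(tagged)))  # 3 4
```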

Labels: bug
