aggregate_channels does not re-index per-channel identifier columns #4546

@h-mayorquin

I came here thinking about these two: SpikeInterface/probeinterface#425 and SpikeInterface/probeinterface#420.

aggregate_channels concatenates per-channel arrays verbatim across children, including columns whose values are local integer identifiers. Two such columns are affected: the top-level group property, and the probe_index field inside contact_vector. Neither is shifted per child, so values from different children end up sharing the same namespace in the aggregate. A group value of 0 after aggregation can refer to shank 0 of any original recording. Likewise, a probe_index value of 0 in the combined contact_vector no longer identifies a unique probe, which is why ProbeGroup.from_numpy collapses distinct probes into a single synthetic one at read time.

from spikeinterface.core import generate_recording, aggregate_channels

# Two 4-channel recordings standing in for two 2-shank probes; both use local group ids 0 and 1
rec_A = generate_recording(num_channels=4, durations=[1.0], set_probe=False).rename_channels(["a0", "a1", "a2", "a3"])
rec_A.set_property("group", [0, 0, 1, 1])

rec_B = generate_recording(num_channels=4, durations=[1.0], set_probe=False).rename_channels(["b0", "b1", "b2", "b3"])
rec_B.set_property("group", [0, 0, 1, 1])

# Aggregate the children, then split on the concatenated "group" property
combined = aggregate_channels([rec_A, rec_B])

for group_id, sub in combined.split_by("group").items():
    print(f"group {group_id}: {list(sub.get_channel_ids())}")

Expected: four groups, one per shank per probe.

Observed on main:

group 0: ['a0', 'a1', 'b0', 'b1']
group 1: ['a2', 'a3', 'b2', 'b3']

Group 0 now mixes shank 0 of probe A with shank 0 of probe B, and group 1 mixes shank 1 of probe A with shank 1 of probe B. Any per-shank pipeline built on split_by("group") silently operates on cross-probe mixtures.
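To make the expected behaviour concrete, here is a minimal sketch of per-child re-indexing using only NumPy. The helper name offset_local_ids is hypothetical and is not part of spikeinterface; it only illustrates one way the "group" values could be shifted so that each child gets its own range.

```python
import numpy as np


def offset_local_ids(per_child_ids):
    """Shift each child's local integer ids so they stay unique after concatenation.

    Hypothetical helper for illustration only, not a spikeinterface function.
    """
    shifted = []
    offset = 0
    for ids in per_child_ids:
        ids = np.asarray(ids)
        # Remap the local values to 0..k-1 (sorted order), so gaps or
        # non-zero starting values do not leak into the next child's range.
        uniq, dense = np.unique(ids, return_inverse=True)
        shifted.append(dense + offset)
        # The next child starts after the k distinct ids used by this one.
        offset += uniq.size
    return shifted


# Same local group layout as rec_A and rec_B in the reproduction.
groups_a, groups_b = offset_local_ids([[0, 0, 1, 1], [0, 0, 1, 1]])
combined_groups = np.concatenate([groups_a, groups_b])
print(combined_groups.tolist())  # [0, 0, 1, 1, 2, 2, 3, 3]
```

As a possible workaround until this is fixed, the shifted arrays could be written back to each child with set_property("group", ...) before calling aggregate_channels, which should yield four groups from split_by("group").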

The probe_index case inside contact_vector follows the same mechanism: aggregating two single-probe recordings (each with probe_index = 0) produces a combined contact_vector whose probe_index column is all zeros, so combined.get_probegroup() reconstructs a single merged probe regardless of how many distinct probes were fed in. If the children also share contact_ids (common, since probes typically number contacts from 0), the collapsed namespace makes ProbeGroup.from_numpy raise "ValueError: contact_ids must be unique within a Probe" instead of silently returning the merged probe.
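The collision can be illustrated with plain NumPy structured arrays. This is a sketch under assumptions: the dtype below is a reduced stand-in for contact_vector (only probe_index is taken from the description above; the other fields and both helper functions are hypothetical), and it does not call probeinterface.

```python
import numpy as np

# Assumed, simplified layout; the real contact_vector carries more fields.
CONTACT_DTYPE = [("probe_index", "int64"), ("contact_ids", "U8"),
                 ("x", "float64"), ("y", "float64")]


def make_contact_vector(num_contacts):
    """Build a single-probe vector: probe_index is 0, contacts numbered from 0."""
    cv = np.zeros(num_contacts, dtype=CONTACT_DTYPE)
    cv["contact_ids"] = [str(i) for i in range(num_contacts)]
    cv["y"] = np.arange(num_contacts) * 20.0
    return cv


def concatenate_with_probe_offset(vectors):
    """Concatenate contact vectors, giving each child its own probe_index range."""
    shifted = []
    offset = 0
    for cv in vectors:
        cv = cv.copy()  # leave the caller's array untouched
        uniq, dense = np.unique(cv["probe_index"], return_inverse=True)
        cv["probe_index"] = dense + offset
        offset += uniq.size
        shifted.append(cv)
    return np.concatenate(shifted)


cv_a = make_contact_vector(4)
cv_b = make_contact_vector(4)

# Verbatim concatenation: a single probe_index value, duplicated contact ids.
naive = np.concatenate([cv_a, cv_b])
print(np.unique(naive["probe_index"]).tolist())  # [0]

# With an offset per child, (probe_index, contact_id) pairs are unique again.
fixed = concatenate_with_probe_offset([cv_a, cv_b])
pairs = set(zip(fixed["probe_index"].tolist(), fixed["contact_ids"].tolist()))
print(len(pairs))  # 8
```

In the naive array, the eight rows share only four distinct contact ids under one probe_index, which matches the uniqueness failure described above.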

The same mechanism affects any per-probe-local integer property set by extractors (for example IBL's shank, shank_row, shank_col, adc, index_on_probe; Maxwell's electrode; Biocam's row, col): after aggregation these all share a single namespace across children.
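For properties such as row, col, or electrode, shifting the values would change what they mean, so an offset may not be the right fix. One alternative, sketched below with NumPy only, is to keep the local values and record which child each channel came from, then group on the pair. The names tag_with_child_index, group_by_child_and_value, and child_index are hypothetical and do not exist in spikeinterface.

```python
import numpy as np


def tag_with_child_index(per_child_values):
    """Concatenate local values and build a parallel array of source-child indices."""
    values = np.concatenate([np.asarray(v) for v in per_child_values])
    child_index = np.concatenate(
        [np.full(len(v), i, dtype=np.int64) for i, v in enumerate(per_child_values)]
    )
    return child_index, values


def group_by_child_and_value(child_index, values):
    """Map each (child, local value) pair to the aggregate channel positions it covers."""
    groups = {}
    for position, key in enumerate(zip(child_index.tolist(), values.tolist())):
        groups.setdefault(key, []).append(position)
    return groups


# Two children that both use local shank ids 0 and 1.
child_index, shank = tag_with_child_index([[0, 0, 1, 1], [0, 0, 1, 1]])
groups = group_by_child_and_value(child_index, shank)
for key, positions in groups.items():
    print(key, positions)
```

The local values survive unchanged, and the composite key separates shank 0 of the first child from shank 0 of the second.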


Labels: bug
