I came here thinking about these two: SpikeInterface/probeinterface#425 and SpikeInterface/probeinterface#420.
aggregate_channels concatenates per-channel arrays verbatim across children, including columns whose values are local integer identifiers. Two such columns are affected: the top-level group property, and the probe_index field inside contact_vector. Neither is shifted per child, so values from different children end up sharing the same namespace in the aggregate. A group value of 0 after aggregation can refer to shank 0 of any original recording, and a probe_index value of 0 in the combined contact_vector no longer identifies a unique probe, which is why ProbeGroup.from_numpy at read time collapses distinct probes into a single synthetic one.
```python
from spikeinterface.core import generate_recording, aggregate_channels

# Two 4-channel recordings, each with two local groups numbered 0 and 1.
rec_A = generate_recording(num_channels=4, durations=[1.0], set_probe=False).rename_channels(["a0", "a1", "a2", "a3"])
rec_A.set_property("group", [0, 0, 1, 1])
rec_B = generate_recording(num_channels=4, durations=[1.0], set_probe=False).rename_channels(["b0", "b1", "b2", "b3"])
rec_B.set_property("group", [0, 0, 1, 1])

combined = aggregate_channels([rec_A, rec_B])

# Split the aggregate by "group" and list the channels that land in each group.
for group_id, sub in combined.split_by("group").items():
    print(f"group {group_id}: {list(sub.get_channel_ids())}")
```
Expected: four groups, one per shank per probe.
Observed on main:
```
group 0: ['a0', 'a1', 'b0', 'b1']
group 1: ['a2', 'a3', 'b2', 'b3']
```
Group 0 now mixes shank 0 of probe A with shank 0 of probe B, and group 1 mixes shank 1 of probe A with shank 1 of probe B. Any per-shank pipeline built on split_by("group") silently operates on cross-probe mixtures.
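To make the collision concrete without depending on spikeinterface, here is a minimal pure-Python sketch of verbatim concatenation next to a per-child offset. The helper `concat_with_offsets` is hypothetical (it is not part of spikeinterface), and the sketch assumes local ids are non-negative integers. It is only one possible way to keep the namespaces apart, not necessarily the fix that should land.

```python
# Hypothetical helper, not part of spikeinterface: shifts each child's local
# integer ids so they stay distinct after concatenation.
# Assumes ids are non-negative integers.
def concat_with_offsets(children_ids):
    combined = []
    offset = 0
    for ids in children_ids:
        combined.extend(i + offset for i in ids)
        if ids:
            # The next child starts just past the largest id used so far.
            offset = max(combined) + 1
    return combined


group_a = [0, 0, 1, 1]  # local groups of the first child
group_b = [0, 0, 1, 1]  # local groups of the second child

verbatim = group_a + group_b  # what a plain concatenation yields
shifted = concat_with_offsets([group_a, group_b])

print(sorted(set(verbatim)))  # distinct group values after plain concatenation
print(sorted(set(shifted)))   # distinct group values after per-child offsets
```

With plain concatenation only two distinct group values survive, while the shifted version keeps four, one per shank per probe.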
The probe_index case inside contact_vector has the same mechanism: aggregating two single-probe recordings (each with probe_index = 0) produces a combined contact_vector whose probe_index column is all zeros, so combined.get_probegroup() reconstructs a single merged probe regardless of how many distinct probes fed in. If the children also share contact_ids (common, since probes typically number contacts from 0), the collapsed namespace makes ProbeGroup.from_numpy raise ValueError: contact_ids must be unique within a Probe instead of silently returning the merged probe.
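The probe_index collapse can be illustrated the same way. The sketch below models contact_vector rows as plain `(probe_index, contact_id)` tuples, which is a simplification of the real structured array, and `group_by_probe` is a hypothetical stand-in for the grouping that happens at read time. The re-indexing step assumes every child holds exactly one probe.

```python
from collections import defaultdict

# Toy stand-ins for contact_vector rows: (probe_index, contact_id).
# Both children are single-probe, so both use probe_index 0 and contact ids from 0.
probe_a = [(0, "0"), (0, "1")]
probe_b = [(0, "0"), (0, "1")]


def group_by_probe(rows):
    """Hypothetical stand-in for read-time grouping: collect contact ids per probe_index."""
    probes = defaultdict(list)
    for probe_index, contact_id in rows:
        probes[probe_index].append(contact_id)
    return dict(probes)


# Verbatim concatenation: everything lands under probe_index 0.
merged = group_by_probe(probe_a + probe_b)
has_duplicates = any(len(ids) != len(set(ids)) for ids in merged.values())

# Assumption: one probe per child, so the child's position can replace probe_index.
reindexed = [
    (child, contact_id)
    for child, rows in enumerate([probe_a, probe_b])
    for _, contact_id in rows
]
separated = group_by_probe(reindexed)

print(len(merged), has_duplicates)  # number of probes seen, and whether contact ids clash
print(len(separated))               # number of probes seen after re-indexing
```

In the verbatim case a single probe is seen and its contact ids clash, which matches the ValueError described above. After re-indexing, two probes are seen and each keeps unique contact ids.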
The same mechanism affects any per-probe-local integer property set by extractors (for example IBL's shank, shank_row, shank_col, adc, index_on_probe; Maxwell's electrode; Biocam's row, col): after aggregation these all share a single namespace across children.