Detailed overview of quality metrics

Detailed overview of the quality metrics

Quality metrics to weed out noise units

Spikes that come from real neurons (as opposed to being artifacts) have stereotypical waveform shapes that typically include a set number of peaks (1-2) and troughs (1), waveforms with set durations, and a waveform with spatial decay across channels. We use the following waveform-based metrics to weed-out artefactual units.

Number of waveform peaks

nPeaks

We detect waveform peaks using the template waveform for each unit on the peak channel for that unit (channel with maximum absolute value). Any peak with a minimum prominence of `param.minThreshDetectPeaksTroughs` multiplied by the waveform maximum absolute value are detected, using MATLAB's `findpeaks` function. If this function returns no peak, the maximum value is taken as peak.

Units with a template waveform containing more than detected peaks (qMetric.nPeaks) than param.maxNPeaks are classified as noise.

Note: for units that are recorded in brain regions with atypical numbers of peaks and troughs (like the complex spikes of Purkinje cells in the cerebellum), adjust the default parameters.

Number of waveform troughs

nTroughs

We detect waveform troughs using the template waveform for each unit on the peak channel for that unit (channel with maximum absolute value). Any trough with a minimum prominence of `param.minThreshDetectPeaksTroughs` multiplied by the waveform maximum absolute value are detected, using MATLAB's `findpeaks` function. If this function returns no trough, the minimum value is taken as peak.

Units with a template waveform containing more detected troughs (qMetric.nTroughs) than param.maxNTroughs are classified as noise.

Note: for units that are recorded in brain regions with atypical numbers of peaks and troughs (like the complex spikes of Purkinje cells in the cerebellum), adjust the default parameters.

Waveform peak-to-trough duration

nTroughs

To calculate a waveform's peak-to-trough duration (in us), we take the time samples (divided by `param.ephys_sample_rate` and multiplied by `1e6`) either: - if the waveform has a trough that is bigger any of the peaks: from the waveforms biggest trough to the biggest peak that occurs *after* the trough - if the waveform has a trough that is smaller any of the peaks: from the waveforms biggest peak to the biggest trough that occurs *after* the peak.

Units with peak-to-trough template waveform durations (qMetric.waveformDuration_peakTrough) of less than param.minWvDuration us or more than param.maxWvDuration are classified as noise.

Waveform spatial decay

On high-density probes, like Neuropixels, recording sites are tightly spaced together (typically between 15 and 20um). Thanks to this, each neuron is recorded on multiple sites, that are at different distances from the neuron. The sites closest to the neuron will record the highest amplitude spikes, and spikes' amplitudes will decrease as sites get further away from the neuron. This property if often not true in the case of artefactual units.

We compute each units spatial decay by taking the waveform's absolute maximum value for each unit, on the peak channel and channels up to 75-100 um away from the peak channel. Then if param.spDecayLinFit is true, we use MATLAB's function polyfit to fit a first-order polynomial to these points and extract the slope of this fit. Otherwise, we use an exponential fit (this is the preferred method).

If using a linear fit, units with a spatial decay slope (qMetric.spatialDecaySlope) smaller than param.minSpatialDecaySlope (a.u.)/um are classified as noise (image below).

If using an exponential fit units with a spatial decay slope (qMetric.spatialDecaySlope) smaller than param.minSpatialDecaySlopeExp or greater than param.maxSpatialDecaySlopeExp (a.u.)/um are classified as noise.

Waveform baseline 'flatness'

nPeaks

Empirically, we noticed that artefactual units often had a 'non-flat' baseline (see below). To quantify this, we compute how large the biggest value in the window param.waveformBaselineWindowStart:param.waveformBaselineWindowStop is relative to the waveforms largest value.

Units with a maximum waveform baseline value (qMetric.waveformBaselineFlatness) exceeding a fraction of the waveforms maximum absolute value by param.maxWvBaselineFraction are classified as noise.

Waveform 'repolarization peak'-to-trough ratio

Units with a 'repolarization peak'-to-trough ratio greater than param.maxScndPeakToTroughRatio is classified as noise.

Quality metrics to classify non-somatic units

Somatic waveforms are defined as waveforms where the largest trough precedes the largest peak, and the trough is larger than the peak (Deligkaris, Bullmann & Frey, 2016). If param.somatic is equal to 1, the waveforms that meet these criteria are classified into a separate non-somatic category.

Waveform 'main peak'-to-trough ratio

Units with a 'main peak'-to-trough ratio greater than param.minTroughToPeakRatio are classified as non-somatic. To explain what we mean by the different peaks and troughs, here are two example waveforms with labels:

Waveform 'first peak'-to-'repolarization peak' ratio

Units with a 'first peak'-to-'repolarization peak' ratio greater than param.maxPeak1ToPeak2Ratio_nonSomatic, as well as a first peak of width smaller than param.minWidthFirstPeak_nonSomatic samples, a ttough of width smaller than param.minWidthTroughnonSomatic samples and main peak-to-trough ratio greater than param.minTroughToPeak2Ratio_nonSomatic) are classified as non-somatic.

Quality metrics to weed out multi-units

After classifying noise units, the remaining units are classified as good single units if the criteria below are met. They are classifying as multi-units otherwise.

Number of spikes

Below a certain amount of spikes, ephys properties like ACGs will not be reliable. A good minimum to use is 300 empirically, because Kilosort2 does not attempt to split any clusters that have less than 300 spikes in the post-processing phase.

If a unit has less than param.minSpikes spikes, it is classified as multi-unit activity.

Percentage of spikes below detection threshold

The percent of spikes missing (false negatives) is estimated by fitting a gaussian the distribution of amplitudes, with a cutoff parameter. This assumes the spike amplitudes follow a gaussian distribution, which is not strictly true for bursty cells, like for instance Medium Spiny Neurons in the striatum.

Optionally, if param.computeTimeChunks is true, the recording is split in time chunks of size param.deltaTimeChunk, and the percentage of spikes missing (and the fraction of refractpry period violations -see below) is estimated separately on these time chunks. Then, only the time chunks with a percent of spikes missing and fraction of refractory period violations (see below) below the threshold are kept, and the rest of the quality metrics are computed on these epochs. These time points are stored in qMetric.useTheseTimesStart and qMetric.useTheseTimesStop. This can be of use in cases where there is a lot of drift during the recording. All subsequent metrics are computed only on these time chunks.

Two additional metrics are calculated, but not currently used:

qMetric.ksTest_pValue gives the output p-value of a Kolmogorov–Smirnov test
qMetric.percentageSpikesMissing_symmetric estimates the percentage of spikes missing a distribution of spike amplitudes symmetric on both sides of its mode.

If the estimated percentage of spikes missing (qMetric.percentageSpikesMissing_gaussian) is above the param.maxPercSpikesMissing, the unit is classified as multi-unit activity.

Below: example of unit with many spikes below the detection threshold in the first two time chunks of the recording.

Refractory period violations

We have two possible methods to compute rate of refractory period violations:

If param.hillOrLlobetMethod is true, the Hill et al. method is used. In brief, the rate of refractory period violations (false positives) is estimated using r = 2*(tauR - tauC) * N^2 * (1-Fp) * Fp / T , solving for Fp, with tauR the refractory period, tauC the censored period, T the total experiment duration, r the number of refractory period violations, Fp the rate of contamination. method from Hill et al., J. Neuro, 2011. tauC is defined in param.tauC and tauR is defined in param.tauR_valuesMin, in seconds.
If param.hillOrLlobetMethod is false, the Llobet et al. method is used. Hill et al's solution is partially incorrect because they used an expression from Meunier et al (2003) that assumed contaminating spikes came from a single neuron with a refractory period. The correct expression, derived in Llobet et al (bioRxiv 2022), accounts for contamination from true Poisson processes like electrical noise or multiple nearby neurons. In practice, below a rate of 30, both methods are highly correlated and agree. You can see in the plot below how the two are correlated in one dataset (each dot is one unit).

Optionally, if param.computeTimeChunks is true, the recording is split in time chunks of size param.deltaTimeChunk, and the rate of refractory period violations (and the percentage of spikes missing - see above) estimated separately on these time chunks. Then, only the time chunks with a rate of refractory period violation and percentage of spikes missing (see above) below the threshold are kept, and the rest of the quality metrics are computed on these epochs. These time points are stored in qMetric.useTheseTimesStart and qMetric.useTheseTimesStop. This can be of use in cases where there is a lot of drift during the recording. All subsequent metrics are computed only on these time chunks.

Additionally, if param.tauR_valuesMax is larger than param.tauR_valuesMin, the rate refractory violations is calculated for several possible tauR values, ranging from param.tauR_valuesMin to param.tauR_valuesMax in steps of param.tauR_valuesStep. This matrix (units by tauR value) of rate of refractory period values is saved seperatly from the other quality metrics, in a templates._bc_fractionRefractoryPeriodViolationsPerTauR.parquet file. The tauR value used to calculate the rate of refractory periods to classify a unit is defined as the tauR value which gives the smallest estimate.

If the estimated rate refractory period violations (qMetric.fractionRPVs_estimatedTauR) is above param.maxRPVviolations, the unit is classified as multi-unit activity.

Below: examples of a unit with a small rate of refractory period violations (left) and one with a large rate (right).

Drift estimate

If param.computeDrift is equal to 1, we estimate the amount of drift based on a metrics described in Seigle et al., 2021:

To compute the maximum drift for one unit, the peak channel was calculated from the top principal components of every spike. Next, the peak channel values are binned in 51-s intervals, and the median value is calculated across all spikes in each bin (assuming at least 10 spikes per bin). The maximum drift is defined as the difference between the maximum peak channel and the minimum peak channel across all bins. The average maximum drift across all units is used to identify sessions with a high amount of probe motion relative to the brain.

Each units cumulative drift is also calculated, and stored in qMetric.cumDriftEstimate. This metric is not currently in use.

Units with a maximum drift (qMetric.maxDriftEstimate) larger than param.maxDrift are classified as multi-unit activity.

Bombcell can uses anotherhelpful tool to assess drift, and can also define time chunks of low drift: optionally, if param.computeTimeChunks is true, the recording is split in time chunks of size param.deltaTimeChunk, and the fraction of refractory period violations and percentage of spikes missing is estimated separately on these time chunks. Then, only the time chunks with a fraction of refractory period violation and percentage of spikes missing below the threshold are kept, and the rest of the quality metrics are computed on these epochs. These time points are stored in qMetric.useTheseTimesStart and qMetric.useTheseTimesStop. This can be of use in cases where there is a lot of drift during the recording. All subsequent metrics are computed only on these time chunks.

Presence ratio

The presence ratio is the fraction of time chunks that contain recordings from this unit, with a chunk considered "present" if it has at least 5% of the maximum spike count seen in any chunk.

Units with a presence ratio qMetric.presenceRatio smaller than param.minPresenceRatio are classified as multi-unit activity.

Mean raw waveform amplitude

Units with low amplitude are noisy, further away units, that are more likely to be multi-unit activity. If param.extractRaw is equal to 1, bombcell extracts param.nRawSpikesToExtract spikes of width param.spikeWidth (in samples) from the raw binary file located at param.rawFile. Bombcell then uses information in the recording's .meta or .oebin meta file located at param.ephysMetaFile to get amplitude values in uV.

If a unit's raw mean waveform maximum amplitude is below param.minAmplitude, it is classified as multi-unit activity.

Below: examples of a unit with high amplitude (blue) and one with low amplitude (red).

Signal-to-noise ratio

If param.extractRaw is equal to 1, bombcell extracts param.nRawSpikesToExtract spikes of width param.spikeWidth (in samples) from the raw binary file located at param.rawFile. A signal-to-noise ratio is calculated by taking each units maximum waveform value (on its peak channel) and dividing that by the variance across its raw extracted waveform baselines. The waveform baseline is defined by param.waveformBaselineNoiseWindow (in samples).

Units with a signal-to-noise ratio (qMetric.signalToNoiseRatio) below param.minSNR are classified as multi-unit activity.

Distance metrics: isolation distance, L-ratio and silhouette score

If param.computeDistanceMetrics is equal to 1, bombcell computes a few distance metrics. This is the most time-consuming part of the script. See Harris et al., Neuron, 2001 for more information on isolation distance, (see Schmitzer-Torbert and Redish, 2004) for more information on l-ratio and (see Rousseeuw, 1987 for more information on silhouette-score.

Note that these metrics take a long time to compute, and they are not very robust to recordings with a lot of drift (as is the case for most acute Neuropixels recordings). For this reason, these distance metrics are disabled by default. Empirically, we have found that good proxies for how well a unit is isolated from others is the unit's mean raw waveform amplitude and signal-to-noise ratio. Neurons that are larger/closer to the probe tend to be well isolated.

If a unit has distance metrics below param.minIsoD for the isolation distance, above param.lratioMax for l-ratio, or above param.ssMax for the silhouette score it is classified as multi-unit activity.

Below: examples of a unit with high isolation distance (left) and one with low isolation distance (right).

💣 Any issues? To get support, create a github issue, create a pull request or write a message on the the Neuropixels slack workgroup.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly