The estimation of auditory evoked potentials requires deconvolution when the duration of the responses to be recovered exceeds the inter-stimulus interval. Based on least squares deconvolution, in this article we extend the procedure to the case of a multi-response convolutional model, that is, a model in which different categories of stimulus are expected to evoke different responses. The computational cost of the multi-response deconvolution significantly increases with the number of responses to be deconvolved, which restricts its applicability in practical situations. In order to alleviate this restriction, we propose to perform the multi-response deconvolution in a reduced representation space associated with a latency-dependent filtering of auditory responses, which provides a significant dimensionality reduction. We demonstrate the practical viability of the multi-response deconvolution with auditory responses evoked by clicks presented at different levels and categorized according to their stimulation level. The multi-response deconvolution applied in a reduced representation space provides the least squares estimation of the responses with a reasonable computational load. matlab/Octave code implementing the proposed procedure is included as supplementary material.
I. INTRODUCTION
Auditory Evoked Potentials (AEPs) are commonly applied to study the auditory system in the context of both research and clinical perspectives (Burkard and Don, 2007). Due to the low amplitude of AEP responses and the presence of background noise, the estimation of evoked responses from the electroencephalogram (EEG) usually includes a repetitive presentation of stimuli and the synchronous averaging of epochs (Thornton, 2007).
Synchronous averaging requires the inter-stimulus interval (ISI) to be longer than the response duration in order to avoid overlapping adjacent responses. The conventional protocols for AEP recording suggest appropriate configuration of the EEG filtering, response duration, and maximum stimulation rate specific for the estimation of auditory brainstem responses (ABR), middle latency responses (MLR), or cortical auditory evoked potentials (CAEPs) in order to avoid overlapping of responses (Hall, 2007).
If the response duration is longer than the ISI (because of the high stimulation rate) a deconvolution-based estimation (instead of synchronous averaging) is required to obtain the responses (Bohorquez and Ozdamar, 2006; Eysholdt and Schreiner, 1982; Valderrama , 2014b). Deconvolution methods for recovering AEP responses include maximum length sequences (MLS) (Eysholdt and Schreiner, 1982; Thornton and Slaven, 1993), adjacent-responses (ADJAR) (Woldorff, 1993), quasi-periodic sequence deconvolution (QSD) (Jewett , 2004), continuous loop averaging deconvolution (CLAD) (Bohorquez and Ozdamar, 2006; Ozdamar and Bohorquez, 2006), linear deconvolution for baseline correction (LDBC) (Lütkenhöner, 2010), randomized stimulation and averaging (RSA) (Valderrama , 2012), iterative randomized stimulation and averaging (IRSA) (de la Torre , 2019; Valderrama , 2014b; Valderrama , 2016) and randomized stimulation with least squares deconvolution (RSLSD) (Bardy , 2014a; Bardy , 2014b; Bardy , 2014c; de la Torre , 2019). Alternatively, the use of uncorrelated stimulation sequences (i.e., with ISI following a Poisson distribution) allows to deconvolve the responses by synchronous average (Maddox and Lee, 2018; Polonenko and Maddox, 2019, 2022; Stoll and Maddox, 2023). However, for specific experimental configurations and ISI distributions, and because of short- and long-term adaptation mechanisms (Valderrama , 2014a; Valderrama , 2016), the responses associated with different ISI values could be significantly different, which would lead to an inconsistent convolutional model and to an inaccurate estimation of the responses. For this reason, the use of a Poisson ISI distribution (with a large spread in the ISI values) makes it recommendable to verify that the magnitude of the ISI effect is moderate (as analyzed in Maddox and Lee, 2018; Polonenko and Maddox, 2019, 2022).
While some deconvolution methods impose constraints to the stimulation pattern (like MLS or CLAD), others (like IRSA and RSLSD) provide much more flexibility in the stimulus design, because they only require the autocorrelation matrix of the stimulation sequence to be invertible (Bardy , 2014a; Bardy , 2014c; de la Torre , 2019; de la Torre , 2022; Valderrama , 2014a; Valderrama , 2016). Thanks to this flexibility, audiological experiments with more ecologically-valid stimuli (i.e., more closely related to natural stimulation) are possible (Burkard , 2018; de la Torre , 2019; Finneran , 2019; Martinez , 2021; Valderrama , 2014b; Valderrama , 2019; Valderrama , 2016). Deconvolution-based AEP estimation methods allow the study of neural adaptation mechanisms (Gillespie and Muller, 2009; Thornton and Coleman, 1975; Thornton and Slaven, 1993; Valderrama , 2014b), estimation of the response from extensive portions of the auditory pathway (de la Torre , 2020; de la Torre , 2022; Holt and Ozdamar, 2016; Kohl , 2019) and the analysis of the neurophysiological response to complex stimuli more natural than repetitive sequences of clicks, like natural speech (Maddox and Lee, 2018; Valderrama , 2019) or responses associated with binaural activity (Martinez , 2021).
The convolutional model assumes the hearing system to be linear and time invariant (LTI), which is an acceptable approach only under certain circumstances (Lütkenhöner, 2016; Valderrama , 2014a). Under specific conditions (for example, if a stimulation pattern is presented repetitively, always at the same stimulation level) the responses evoked by each stimulus are expected to be identical and the hearing system can be considered LTI-like. However, if the expected response is not always the same, the behavior of the hearing system cannot be considered LTI-like (for example, if the stimuli are presented at different levels, the latency and amplitude of the waves change according to the stimulation level). Under a multi-response convolutional model (Bardy , 2014a; Bardy , 2014b; Valderrama , 2016), different categories of stimuli can be defined, each category evoking a specific response. An appropriate categorization of the stimuli (for example, a categorization based on stimulation level) provides an LTI-like behavior of the hearing system without the requirement of repetitive stimulation patterns. This categorization increases the flexibility in the design of audiological experiments (Bardy, 2022; Bardy , 2014a; Bardy , 2014b; Bardy , 2014c; Martinez , 2021; Metzger , 2020; Valderrama , 2019; Valderrama , 2016). In the multi-response paradigm, the deconvolution problem consists in the estimation of the response associated with each category of stimuli (Bardy , 2014a; Bardy , 2014b; Bardy , 2014c; Dale, 1999).
Convolution can be mathematically described in terms of matrix formulation. If the response length is J samples, deconvolution involves the inversion of a J × J matrix. Multi-response deconvolution can also be described as a matrix-inversion problem. If the model involves M independent responses, each one with a length of J samples, the size of the matrix to be inverted is . Therefore, the computational cost of multi-response deconvolution increases very fast with the number of responses (since the complexity of matrix algorithms increases faster than for matrices; Davie and Stothers, 2013), both if the deconvolution is performed as a matrix division (Bardy , 2014a) or via an iterative procedure (Martinez , 2021; Valderrama , 2019; Valderrama , 2016).
The size of the matrix to be inverted (and therefore, the computational cost) can significantly be reduced by the application of the recently proposed latency-dependent filtering and downsampling (LDFDS) procedure to the AEP responses (de la Torre , 2020). LDFDS is based on the local bandwidth required to adequately represent each component of the AEP, progressively decreasing as the latency increases. A filtering method that is locally adapted to the required bandwidth reduces the noise of the responses, and the locally-adapted downsampling provides a substantial reduction of the dimensionality required to represent the evoked responses. This idea is particularly useful for processing the complete auditory pathway response (i.e., including brainstem, middle latency, and cortical components simultaneously): for example, the complete response usually requiring around 10 000 samples at a constant sampling-rate (one second sampled at 10 kHz) can be correctly represented after LDFDS with around 120 samples (three decades of latency represented with 40 samples per decade) (de la Torre , 2020). As proposed by Lütkenhöner (2016) and de la Torre (2022), the deconvolution problem can be constrained to a reduced representation space (defined according to the expected behavior of the responses). The deconvolution constrained to the reduced representation space provided by LDFDS provides AEP responses less affected by noise (thanks to the latency-dependent filtering) and substantially reduces the computational cost (thanks to the dimensionality reduction) (de la Torre , 2022).
Based on the previous studies about deconvolution constrained to a reduced representation space and about multi-response deconvolution, in this work, we combine both concepts and propose the multi-response deconvolution constrained to the reduced representation space provided by LDFDS. As in the case of the single-response deconvolution, the proposed subspace-constrained deconvolution provides the least squares estimation of the evoked response (de la Torre , 2022). Additionally, the dimensionality reduction, from J (typically several thousands) to Jr (around one hundred), substantially reduces the size of the matrix to be inverted, , which significantly reduces the computational cost associated with the deconvolution. Therefore, multi-response AEP experiments which are usually difficult (or even impossible) to be deconvolved in the complete representation space, can easily be deconvolved in the reduced representation space.
This article presents the mathematical formulation of multi-response deconvolution performed in the reduced representation space. We demonstrate the viability of the proposed method using auditory responses evoked by clicks with randomized stimulation levels (Martinez , 2023), in which the stimulus presentation level is uniformly distributed in the interval 0–80 dB normal hearing level (nHL), and the multi-response model is defined by categorizing the stimuli according to their level.
II. SUBSPACE-CONSTRAINED MULTI-RESPONSE DECONVOLUTION
A. Least-squares deconvolution
Therefore, the LS estimation of the evoked response requires the synchronous averaging of the EEG ( ) and the inversion of the normalized autocorrelation matrix of the stimulation sequence (Rs). This LS estimation can be obtained by matrix division (as proposed in RSLSD; Bardy , 2014a; Bardy , 2014b; Bardy , 2014c) or by an iterative procedure (as proposed by IRSA; de la Torre , 2019). It is important to note that, according to Eq. (4), the synchronous average is a biased estimator of the response. The unbiased estimation of the response requires the application of the inverse of the autocorrelation matrix in order to compensate for the effect of the response overlapping. Additionally, the application of the inverted matrix could amplify some components of the noise in the estimated response, due to low eigenvalues in Rs. The condition number of the matrix (i.e., the ratio of the largest to the smallest eigenvalues, in absolute value) provides an approximate evaluation of the magnitude of both effects (the bias of the synchronous average and the potential amplification of specific components of the noise).
B. Multi-response deconvolution
This multi-response deconvolution can be obtained by matrix division (Bardy , 2014a), or alternatively, with an iterative procedure (Valderrama , 2016). However, the numerical algorithms for matrix division (or the proposed iterative alternatives) show a computational complexity growing faster than (being Nd the number of dimensions, in this case, J × M), which limits the practical use of multi-response deconvolution if the length of the response J or the number of responses M are large.
C. Deconvolution in a reduced representation space
The LDFDS matrix is proposed as an orthonormal projector (the rows of the Vr matrix are orthonormal), which preserves the metrics (i.e., the distances and energies) in the reduced subspace and allows the recovery of the latency-dependent filtered response in the original representation (at the original uniform sampling rate and with J components) by applying the transpose of the Vr matrix to the response in the subspace representation (de la Torre , 2020).
In summary, subspace-constrained LS deconvolution of the EEG can be obtained with the following steps: (a) the autocorrelation matrix Rs and the synchronous averaging of the EEG are transformed to the subspace using the transformation Vr; (b) the autocorrelation matrix in the reduced representation is inverted; and (c) the inverted reduced autocorrelation matrix is applied to the reduced synchronous averaging. The LS deconvolution in Eq. (16) is similar to that in Eq. (4), with the difference that the problem is solved in the reduced representation space, and therefore it involves the vectors and of Jr components (instead of J), and the inversion of the matrix Rsr [instead of the inversion of the matrix Rs], which provides a substantial reduction of computational cost (de la Torre , 2022). Again, this subspace-constrained LS deconvolution can be implemented via matrix division or by an iterative procedure.
D. Multi-response deconvolution in a reduced representation space
As previously described, multi-response deconvolution is usually difficult to implement due to the high dimensionality (J × M) involved in the matrix division. On the other hand, subspace-constrained deconvolution provides a substantial dimensionality reduction (from J, typically several thousands, to Jr, around one hundred). The obvious solution for a practical implementation of multi-response deconvolution consists in its formulation in a Jr dimension reduced representation space, i.e., involving a dimensionality in the matrix division. This section provides the mathematical formulation of multi-response deconvolution constrained to the reduced representation space.
The supplementary material includes a detailed explanation of the matrix operations involved in single- and multi-response convolution and deconvolution (supplementary material Sec. 1); matlab/Octave code (scripts and functions) implementing the subspace-constrained multi-response deconvolution proposed in this manuscript with some examples and simulations (supplementary material Sec. 2); and the extension of the subspace-constrained multi-response deconvolution for the case in which a specific orthonormal projector Vm is applied to each response (supplementary material Sec. 3).
III. EXPERIMENTS AND RESULTS
The proposed subspace-constrained multi-response deconvolution has been evaluated using auditory responses evoked by clicks presented at different levels and categorized according to the stimulation level. The purpose of the experiments is to demonstrate the viability of multi-response deconvolution performed in a reduced representation and to compare this method with the multi-response deconvolution performed in the complete representation space.
A. Experimental design
In these AEP experiments, the stimulation consisted of rarefaction clicks of 0.1 ms, presented at a randomized stimulation rate with an ISI in the range 15–30 ms (average stimulation rate of 44.44 stim/s), and with a random stimulation level with continuous uniform distribution in the range 0–80 dB nHL. The 0 dB nHL reference level was estimated as described in Martinez (2023), i.e., as the mean threshold level estimated in a sample of ten normal-hearing adults (five female, 23–38 years) who presented pure-tone threshold levels within the normal range in the 0.5–8 kHz frequency range and had no history of any type of auditory dysfunction. The clicks were presented diotically through ER-3A insert earphones (providing a flat response in the range 0.1–4 kHz) with a tube length of around 28 cm.
The EEGs were recorded using surface active electrodes (MettingVanRijn , 1996) located at the forehead (Fz, active), the two mastoids (Tp9 and Tp10, reference), and middle forehead (Fp1, common mode sense; Fp2, driven right-leg), using a Biosemi ActiveTwo AD-box and the ActiView EEG acquisition software. The EEG was digitized in bipolar mode (recommended configuration for ABR) at a sampling rate of 16 384 Hz with 24 bits/sample. The bipolar mode includes a bandpass filtering (100–3300 Hz bandwidth) which has been extended (with digital filtering using an appropriate inverse filter) to the range 20–3300 Hz in order to allow the acquisition of MLR responses. The EEG to be deconvolved has been obtained as the signal at the active channel (Fz) minus the average of both reference channels (Tp9 and Tp10). The EEG database contains recordings from eight subjects (aged 23–51 years, two female). All participants met the inclusion criteria of reporting no hearing difficulties and absence of a history of auditory dysfunction. Each EEG recording consists of 1890 s (i.e., 31.5 min) and includes the responses to a total of 84 000 stimuli.
The AEP responses include 200 ms of transient after the presentation of the stimuli, and therefore the length of the responses is J = 3277 samples. Taking into account that most of the energy of the evoked responses is below ten oscillations per decade of latency, the latency-dependent filtering and downsampling are performed using a resolution of 40 samples/decade (de la Torre , 2020), which preserves the frequency components below 3.37 kHz, 628 and 68.8 Hz at latencies of 1, 10, and 100 ms, respectively, and also reduces the dimensionality of the responses in the subspace to Jr = 91 components.
In these multi-response deconvolution experiments, in addition to the deconvolution of the responses, the matrix involved in the matrix division has been analyzed in terms of its condition number in order to evaluate potential problems associated with the matrix inversion or the matrix division problem [a large condition number reports ill-conditioned deconvolution with potential amplification of noise in the estimated responses (Bardy , 2014a)].
The multi-response deconvolutions performed in both the complete and the reduced representation spaces are compared taking into account the deconvolved responses, the condition number of the matrices involved in the deconvolution, and the computational cost, which is evaluated in terms of execution time, measured using a desktop computer with an Intel-Core i7-3770 CPU, 3.40 GHz, 8.00 GB RAM running the algorithms in matlab.
B. Multi-response deconvolution in the complete and the reduced representation space
Figure 1 represents the AEP responses obtained with multi-response deconvolution in subject 1 (the latency axis is represented in logarithmic scale in order to facilitate the simultaneous visualization of the ABR and MLR components). The figure includes the responses obtained with the multi-response conventional deconvolution (i.e., in the complete representation space) and with the subspace-constrained deconvolution (i.e., in the reduced representation space). These responses are estimated from the same EEG. In this example, the multi-response deconvolution has been performed by categorizing the stimuli into ten groups according to the stimulation level (in ranges of 8 dB) and each response has been obtained from approximately 8400 stimuli. In these plots, the expected changes in the responses (i.e., in the amplitude and latency of the components) can be observed as a function of the stimulation intensity.
There are several important differences between the deconvolution performed in the complete or the reduced representation space. First, the latency-dependent filtering applied in the second case reduces the high frequency noise according to the local bandwidth, which is more evident for later latencies (the noise reduction provided by the latency dependent filtering looks like a simple low-pass filtering due to the logarithmic compression of the latency axis in the figure). Second, there is a substantial reduction of the execution time when the deconvolution is performed in the reduced representation space: in this example, with ten categories, the dimension of the multi-response in the complete representation space is 32 770, while it is only 910 in the reduced representation space, and the deconvolution took 1064.2 s in the complete representation space, while only 30.3 s in the reduced representation space. Finally, the eigenvalues of the matrix to be inverted could not be computed in the case of the complete representation space due to a memory overflow error (the allocation of a 32 770 × 32 770 matrix requires 8.0 GB of free memory).
Figure 2 shows the AEP responses for subject 1, for different categorizations based on the stimulation level, using between 4 and 32 categories (intervals between 20 and 2.5 dB). All these traces correspond to deconvolutions performed in the reduced representation space, and have been obtained from the same EEG (like those in Fig. 1). The total number of events (84 000 with random stimulation level) are distributed among the different categories (21 000, 10 500, 5250, and 2625 events per category for 4, 8, 16, and 32 categories, respectively) according to the stimulation level. Each categorization and multi-response deconvolution was performed in an independent procedure after the EEG recording. These figures show that, as the intensity range decreases (the number of categories increases), the resolution in intensity improves, but fewer stimuli are involved in the estimation of each trace, and therefore the traces are more affected by noise. On the other hand, the categorization with wide intervals implies some spread of the latencies and amplitudes for the individual responses included in each category, making the LTI-like assumption weak and causing an amplitude reduction in the estimated responses, as observed for the intervals of 20 dB. Supplementary material (Sec. 4) includes similar plots for more categorizations (2, 3, 4, 5, 6, 8, 10, 12, 16, 20, 24, and 32 categories), for the multi-response deconvolution performed in the reduced representation space, and in the complete representation space for those cases in which it could be computed without a memory overflow error (i.e., ten categories or less).
Table I compares the computational cost when the deconvolution is performed in the complete or the reduced representation space. For each categorization of the stimuli (from 2 to 32 categories), the number of dimensions involved in the deconvolutions is specified, as well as the execution time for the estimation of the responses (including the initialization time, the deconvolution time, and the total execution time), the computation time devoted to the estimation of the eigenvalues, and the condition number of the matrix involved in the deconvolution. These results correspond to the deconvolution of the responses for subject 1. Figure 3 compares the execution times for the multi-response deconvolution performed in both the complete and the reduced representation space.
Experiment . | Complete representation space . | Subspace-constrained deconvolution . | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Numb. . | Interval . | Numb. . | Execution time (seconds) . | Cond. . | Numb. . | Execution time (seconds) . | Cond. . | ||||||
categ. . | (dB) . | dimen. . | init. . | deconv. . | total . | eigenval. . | number . | dimen. . | init. . | deconv. . | total . | eigenval. . | number . |
2 | 40.0 | 6554 | 7.80 | 2.54 | 10.34 | 104.05 | 230.0 | 182 | 7.15 | 0.0004 | 7.15 | 0.0085 | 220.7 |
3 | 26.7 | 9831 | 9.95 | 7.28 | 17.23 | 318.37 | 246.8 | 273 | 8.87 | 0.0008 | 8.87 | 0.0133 | 220.7 |
4 | 20.0 | 13 108 | 12.48 | 15.97 | 28.45 | 723.91 | 241.0 | 364 | 10.70 | 0.0017 | 10.70 | 0.0255 | 220.7 |
5 | 16.0 | 16 385 | 15.86 | 33.01 | 48.88 | 8642.96 | 256.2 | 455 | 12.99 | 0.0024 | 12.99 | 0.0547 | 220.8 |
6 | 13.3 | 19 662 | 20.60 | 59.16 | 79.76 | mem. overflow | 546 | 15.98 | 0.0049 | 15.99 | 0.0784 | 220.8 | |
8 | 10.0 | 26 216 | 31.20 | 273.21 | 304.41 | mem. overflow | 728 | 22.78 | 0.0067 | 22.79 | 0.2239 | 220.8 | |
10 | 8.0 | 32 770 | 46.17 | 1018.04 | 1064.21 | mem. overflow | 910 | 30.29 | 0.0120 | 30.30 | 0.2933 | 220.9 | |
12 | 6.7 | 39 324 | memory overflow | mem. overflow | 1092 | 38.68 | 0.0218 | 38.70 | 0.5386 | 220.9 | |||
16 | 5.0 | 52 432 | memory overflow | mem. overflow | 1456 | 56.90 | 0.0531 | 56.95 | 1.1333 | 221.1 | |||
20 | 4.0 | 65 540 | memory overflow | mem. overflow | 1820 | 77.56 | 0.1038 | 77.67 | 2.4461 | 221.1 | |||
24 | 3.3 | 78 648 | memory overflow | mem. overflow | 2184 | 97.57 | 0.1114 | 97.68 | 3.6501 | 221.1 | |||
32 | 2.5 | 1 04 864 | memory overflow | mem. overflow | 2912 | 135.48 | 0.2279 | 135.71 | 7.8493 | 221.5 |
Experiment . | Complete representation space . | Subspace-constrained deconvolution . | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Numb. . | Interval . | Numb. . | Execution time (seconds) . | Cond. . | Numb. . | Execution time (seconds) . | Cond. . | ||||||
categ. . | (dB) . | dimen. . | init. . | deconv. . | total . | eigenval. . | number . | dimen. . | init. . | deconv. . | total . | eigenval. . | number . |
2 | 40.0 | 6554 | 7.80 | 2.54 | 10.34 | 104.05 | 230.0 | 182 | 7.15 | 0.0004 | 7.15 | 0.0085 | 220.7 |
3 | 26.7 | 9831 | 9.95 | 7.28 | 17.23 | 318.37 | 246.8 | 273 | 8.87 | 0.0008 | 8.87 | 0.0133 | 220.7 |
4 | 20.0 | 13 108 | 12.48 | 15.97 | 28.45 | 723.91 | 241.0 | 364 | 10.70 | 0.0017 | 10.70 | 0.0255 | 220.7 |
5 | 16.0 | 16 385 | 15.86 | 33.01 | 48.88 | 8642.96 | 256.2 | 455 | 12.99 | 0.0024 | 12.99 | 0.0547 | 220.8 |
6 | 13.3 | 19 662 | 20.60 | 59.16 | 79.76 | mem. overflow | 546 | 15.98 | 0.0049 | 15.99 | 0.0784 | 220.8 | |
8 | 10.0 | 26 216 | 31.20 | 273.21 | 304.41 | mem. overflow | 728 | 22.78 | 0.0067 | 22.79 | 0.2239 | 220.8 | |
10 | 8.0 | 32 770 | 46.17 | 1018.04 | 1064.21 | mem. overflow | 910 | 30.29 | 0.0120 | 30.30 | 0.2933 | 220.9 | |
12 | 6.7 | 39 324 | memory overflow | mem. overflow | 1092 | 38.68 | 0.0218 | 38.70 | 0.5386 | 220.9 | |||
16 | 5.0 | 52 432 | memory overflow | mem. overflow | 1456 | 56.90 | 0.0531 | 56.95 | 1.1333 | 221.1 | |||
20 | 4.0 | 65 540 | memory overflow | mem. overflow | 1820 | 77.56 | 0.1038 | 77.67 | 2.4461 | 221.1 | |||
24 | 3.3 | 78 648 | memory overflow | mem. overflow | 2184 | 97.57 | 0.1114 | 97.68 | 3.6501 | 221.1 | |||
32 | 2.5 | 1 04 864 | memory overflow | mem. overflow | 2912 | 135.48 | 0.2279 | 135.71 | 7.8493 | 221.5 |
Since the representation of each response requires 3277 samples in the complete representation space but only 91 in the reduced representation space, there is a substantial reduction in the computational cost associated with the multi-response deconvolution, particularly as the number of categories increases. When the deconvolution is performed in the complete representation space, most of the computational cost is dedicated to the matrix division (except for categories 2 and 3). However, in the reduced representation space, the matrix division computational cost is ridiculous compared with that of the initialization (less than 0.2% in the worst case). The initialization time involves the computation of the cross-correlations (between the EEG and the stimulation sequences in order to estimate , and among the stimulation sequences in order to estimate ). The initialization time is similar in the complete and reduced representation spaces, even though it is slightly smaller in the reduced representation space because the allocation of memory for requires more resources than that for .
In addition to the unpractical execution time when the multi-response deconvolution is performed in the complete representation space, for 12 categories or more the size of the matrix involved in the deconvolution is too large and cannot be allocated in memory (leading to a memory overflow error).
The time devoted to the computation of the eigenvalues is also reported in Table I. While deconvolution requires a matrix division (which is computationally easier than a matrix inversion), the estimation of the eigenvalues of this matrix is significantly more expensive in terms of execution time and memory allocation. In the case of the complete representation space, the memory overflow error is reported for six categories or more, and the estimation of the eigenvalues takes around 2 h 24 min for five categories. As can be observed in the table, the condition number of the matrices (i.e., the ratio of the largest to the smallest eigenvalues) is smaller in the reduced representation space, meaning the LS deconvolution is better conditioned in the reduced representation space. This result is expected according to the Cauchy interlacing theorem (also known as the Poincaré separation theorem), which gives upper and lower bounds of eigenvalues of a real symmetric matrix obtained as the orthogonal projection of a larger real symmetric matrix.
Supplementary material (Sec. 5) includes plots representing the execution time for the multi-response deconvolution and the eigenvalues computation (as a function of both, the number of estimated responses, and the number of components of the multi-response vector), an analysis of the memory requirements and an estimation of the computational cost expected for other computer architectures.
C. Grand average and individual AEP responses
Figure 4 shows the individual AEP responses, using a multi-response deconvolution with 16 categories (stimulation intervals of 5 dB) estimated from the 31.5 min of EEG available for each subject. Each trace is obtained with approximately 5250 stimuli. The grand average AEP response (involving 42 000 stimuli per trace) is also included. The main ABR and MLR components are labeled in the first trace of each plot. Supplementary material (section 6) includes individual and grand average responses, estimated with several categorizations (intervals of 20, 10, and 5 dB).
The plots with individual traces (subjects 1 to 8) visually show that most of the AEP components from wave I (of the ABR) to the Pb (of the MLR) can be identified at the higher stimulation levels. The amplitude of the components decreases with the reduction of the stimulation intensity, and also an increase in the latencies, particularly in the ABR components, is observed. The comparison of the grand average and the individual AEP responses reveals a high consistency across subjects for most of the components. A post-auricular muscle response (PAMR, an action potential occurring at approximately 13 ms after the stimulus onset resulting from the contraction of the post-auricular muscle) is observed in subject 5 (Sec. 6 of the supplementary material compares the grand average responses including and excluding this subject).
D. Quality of the AEP estimations for different categorizations
The quality of the AEP responses was determined in terms of their SNR. The SNR was calculated as the ratio between the energy of the AEP waveform and the energy of the noise in the latency interval 0–200 ms. The SNR was calculated in the reduced representation (since the transformation Vr between the complete and the reduced representation spaces is orthonormal, the energy is the same in both representations). The respective energies of the response and the noise are obtained according to the “plus-minus reference” procedure proposed by Schimmel (1967). In our implementation, the procedure was estimated using the first half and the second half of the available epochs (Martinez , 2023), specifically for each categorization and for each subject.
Figure 5 represents the average SNR estimated for each categorization and each subject as a function of the interval width used for the categorization. The figure includes individual traces for each subject (with triangles) and traces for the grand average (with circles). The number of categories is also indicated in the figure. A reduction in a factor 2 in the number of categories (for example, from 32 categories, 2.5 dB intervals, to 16 categories, 5 dB intervals) implies an increment in a factor 2 in the number of epochs per category contributing to the response estimation, and therefore, a SNR improvement of 3 dB would be expected (because the amplitude of the noise in the estimated response would decrease 3 dB as the number of epochs is doubled). This improvement is approximately achieved on the left side of the plot (many categories with narrow intervals), but the improvement is smaller than expected (the slope decreases) as we move to the right side of the plot. The reduction of the SNR improvement with respect to the expected one seems to be associated with the fact that wide categories cause a spread of latencies (and a weak verification of the LTI-like assumption) and the subsequent amplitude reduction in the response estimation (as observed in Fig. 2).
Supplementary material (Sec. 7) contains a detailed description of the procedure for the SNR estimation (including a simulation to verify that the procedure provides a non-biased SNR estimation) and detailed results (including also the average signal and noise levels estimated for each subject and categorization).
E. Estimation of latencies and amplitudes
Even though a very detailed analysis of the morphology of the estimated AEP responses is not the main objective of this manuscript (because of its extension, such analysis will be considered in a future manuscript), in Fig. 6, we include measurements of amplitudes and latencies for the ABR wave V and the MLR wave Na-Pa (two of the most consistent components across subjects) as a function of the stimulation level. These measurements correspond to a categorization of the responses in intervals of 10 dB (which provides a reasonable trade-off between noise and resolution). The results presented in the figure include individual measurements for each subject (triangles) and mean across the subjects (circles). The latency results are corrected (reduced by 0.82 ms with respect to the measurements in the evoked responses) taking into account the length of the transducer tube (28 cm).
In the case of wave V, a clear dependence is observed in the plots, with a reduction of latency and an increase in the amplitude of the wave as the stimulation level increases. Due to the noise affecting the estimated evoked responses, the amplitude measurements are affected by fluctuations. The latency measurements are significantly more stable. In the case of wave Na-Pa, a strong dependence is observed in the amplitude, with a significant inter-subject variability. The plot with the largest amplitude corresponds to subject 5 (with a large PAMR component). In this case, the Na negative peak is influenced by the PAMR component (as observed in Fig. 4), which causes an overestimation of the Na-Pa amplitude. For this reason, the measurements corresponding to this subject were excluded from the average. The dependence on the stimulation level is significantly smaller for the latency of the peak Pa. Supplementary material (Sec. 8) includes measurements of waves V and Na-Pa (latencies and amplitudes) for categorizations in intervals of 20 dB (less noisy responses but with lower resolution) and 5 dB (higher resolution but more noisy responses).
IV. DISCUSSION AND CONCLUSIONS
In this article, we propose multi-response deconvolution of AEPs performed in the reduced representation space provided by the latency-dependent filtering and downsampling. The proposed method can be considered a natural and obvious combination of the multi-response deconvolution proposed by Bardy (2014a) and the subspace-constrained deconvolution proposed by de la Torre (2022). However, the proposed method provides a very useful and practical tool, because in the complete representation space, the multi-response deconvolution is difficult to be applied due to the computational cost, while this difficulty is alleviated in the reduced representation space. In the experiments presented in Fig. 1, the time required for the AEP estimation was reduced from 18 min to 30 s (and with more categories, the AEP estimation was not possible in the complete representation space due to memory overflow). Interestingly, the computational time required to deconvolve the responses (135.7 s for 32 categorizations in the most complex experiment) was significantly smaller than the duration of the EEG (i.e., the time devoted to EEG recording, 1890 s in the experiments). This fact could be applied in the future to estimate and represent the responses in real time during the recording session. Additionally, the latency-dependent filtering involved in the reduced representation provides a reduction of high frequency noise.
The reduced representation space also allows a detailed analysis of the matrix involved in the deconvolution (for instance, the estimation of the eigenvalues or the condition number). This analysis is computationally more expensive than the deconvolution. The eigenvalues and the condition number are useful to assess potential problems in the deconvolved responses (the matrix division amplifies those components with low eigenvalues, and the over-amplification of background noise increases the risk of highly noisy AEP estimations). On the other hand, the dimensionality reduction guarantees a reduction of the condition number according to the Cauchy interlacing theorem, as discussed in de la Torre (2022), which is more important in the multi-response deconvolution (the condition number grows with the number of categories, particularly in the complete representation space, as can be observed in Table I). This reduction of the condition number leads to a better conditioned deconvolution problem (less susceptible to noise amplification). The supplementary material (Sec. 9) includes an analysis (including examples and simulations) of the eigenvalues and the condition number and their utility to detect ill-conditioned deconvolution problems.
The evoked responses observed in Figs. 1, 2, and 4 are compatible with those described in de la Torre (2019, 2020) and de la Torre (2022) (for the ISI range 15–30 ms). These plots provide clear and consistent ABR and MLR responses. However, in the responses presented in this article, the CAEP components are difficult to observe due to the filter applied to the EEG (with a 20–3300 Hz bandwidth). The Pb/P1 wave (latency around 60 ms) is attenuated with respect to the amplitudes observed in the previous works, and the P2 wave, also attenuated, can be observed only in subjects 1 and 3. As in the previous works (even though with some limitations due to the bandwidth), the proposed comprehensive AEP representation facilitates the simultaneous exploration of peripheral and central components.
The proposed multi-response deconvolution allows an a posteriori categorization, as can be observed in Fig. 2. In these experiments, the stimuli can easily be categorized according to their intensity, and different categorizations can be defined using different sizes of the intensity intervals because the response morphology changes only subtly for small variations of the stimulation intensity.
Using few categories (with wide intensity intervals and many stimuli in each category) reduces the computational cost of the multi-response deconvolution and provides responses less affected by noise, as observed in Fig. 5, but affected by an amplitude reduction (associated with the spread of amplitudes and latencies within each category and the subsequent weak verification of the LTI-assumption), as observed in Fig. 2. With more categories, the amplitude of the components is usually larger (as observed in Fig. 2 and in the Secs. 4 and 6 in the supplementary material), because the spread in their latencies is smaller. Additionally, a better resolution in intensity is achieved (which would be important if the purpose of the exploration was, for example, to estimate click-based hearing thresholds). The optimal categorization would depend on the total number of available stimuli (which grows with the recording time) and the level of the background noise in the EEG. The flexibility in the multi-response categorization could be very useful for clinical applications of AEP explorations in order to optimize the information obtained from the recorded data since the categorization could easily be adapted after the exploration (or even during the exploration) according to the noise conditions and the purpose of the exploration (for example, identification of an electrophysiological response, or accurate determination of click-based thresholds).
One of the limitations of the convolutional model for the AEP responses is the LTI assumption of the hearing system. The hearing system is not lineal (an increase in the stimulus level produces a non-linear but approximately logarithmic increase in the response, with threshold and saturation effects and with changes in the latencies and morphology). Similarly, the hearing system is not time invariant due to adaptation and fatigue mechanisms (Valderrama , 2014a; Valderrama , 2016). Under specific circumstances, the no-LTI hearing system behaves as an LTI-like system (for example for a quasi-stationary presentation of repetitive stimuli), if the response to each stimulus can be assumed to be identical. In such a case, the convolutional model can be applied to estimate the responses via deconvolution (or with synchronous averaging if the responses are not overlapped), but this limitation strongly constrains the stimulation patterns that can be used to evoke consistent responses. The proposed multi-response deconvolution allows the definition of different categories of stimuli (which are expected to evoke different responses) and therefore allows for the extension of the LTI-like assumption to stimulation patterns much more elaborated than the classical repetitive patterns. This flexibility can be used to make the acquisition sessions more comfortable for the explored subjects (some of them find annoying or boring the repetitive stimulation patterns, Martinez , 2023), or for the assessment of specific aspects of the hearing perception (requiring complex stimulation patterns).
The multi-response estimation of evoked responses have some precedents in the literature. In the context of auditory steady state responses (ASSR) the simultaneous estimation of responses to different stimulation frequencies is very common (John , 1998; Lins and Picton, 1995; Luts , 2006), and more recently, some interesting experiments have been proposed to evaluate the auditory attention (Manting , 2023). Compared with those approaches, the multi-response deconvolution proposed in the present article has the advantage of providing the transient evoked responses, and therefore information about the source of the neural activity. In the context of multiple estimations of transient responses, the parallel auditory brainstem response (Polonenko and Maddox, 2019, 2022) proposes a multi-frequency stimulation procedure using non-correlated stimulation sequences and therefore allowing them to deconvolve the evoked response using the synchronous average (without the need of an elaborated deconvolution procedure and without limitations related to computational cost). This is possible (in spite of the response overlapping) because the autocorrelation matrix is the identity matrix for uncorrelated stimulation sequences. However, this approach is restricted to experiments in which the stimulation events are specifically configured to be statistically independent (which restricts the flexibility in the experimental design to stimulation sequences following a Poisson distribution). Additionally, the large ISI spread of uncorrelated stimulation sequences could compromise the verification of the LTI-assumption (weak when epochs with different expected responses are grouped under the same category), and makes it recommendable the verification of the size of the ISI effect (Maddox and Lee, 2018; Polonenko and Maddox, 2019, 2022; Stoll and Maddox, 2023) in order to guarantee an accurate estimation of the responses.
In general, a quasi-uncorrelated stimulation sequence (i.e., with large ISI spread) provides a low condition number in the autocorrelation matrix. This guarantees low amplification of the noise in the estimated response and a small bias between the response and the synchronous average (as discussed and simulated in Sec. 9 of the supplementary material). However, a large spread of ISI compromises the verification of the LTI-assumption, and could lead to incorrect estimation of the responses. On the other hand, a narrow ISI range guarantees the verification of the LTI-assumption, but causes a large condition number (and amplification of noise, requiring an increment of the number of events to compensate for this effect). According to our previous experience with adaptation of evoked responses (de la Torre , 2022; Valderrama , 2014a; Valderrama , 2016), a uniform ISI distribution with the upper limit twice the lower one (one octave ISI range) provides a reasonable trade-off between the verification of the LTI-assumption and the condition number.
In the present article, we have applied a relatively simple stimulation pattern (the stimuli are always the same except for the stimulation level), in order to describe the proposed procedure with a simple experiment. However, the proposed multi-response deconvolution can be extended to experimental designs using more complex stimulation patterns and more audacious and creative categorizations. In a previous work, we stimulated with bursts of clicks, and the stimuli were categorized taking into account their position in the burst in order to study fast and slow adaptation mechanisms in the hearing system (Valderrama , 2019). In another study, the stimuli were presented binaurally, where the stimulation pattern was affected by changes in the interaural delay, and different categories were defined for the stimulation pattern or the changes in the interaural delay (Martinez , 2021). In these studies, an iterative (and time consuming) multi-response deconvolution was applied, and the experimental design was constrained by the computational cost required by the deconvolution. With the procedure proposed in this article, the computational cost is alleviated, and more complex AEP experiments can be designed, which will offer new perspectives for the research and assessment of the hearing system.
SUPPLEMENTARY MATERIAL
See the supplementary material for (Sec. 1) detailed description of convolution and deconvolution with matrix formulation including examples and simulations; (Sec. 2) description of the code (also provided as supplementary material) with algorithms, examples of use and simulations; (Sec. 3) extension of the subspace-constrained multi-response deconvolution for the case of response-specific transformation; (Sec. 4) detailed results of AEP responses for subject 1; (Sec. 5) detailed results about computational cost; (Sec. 6) detailed results about individual and grand average AEP responses; (Sec. 7) description of the procedure for SNR estimation in AEP responses and detailed results; (Sec. 8) detailed results about estimated latencies and amplitudes of ABR wave V and MLR wave Na-Pa; (Sec. 9) a detailed analysis about eigenvalues and matrix condition number, including simulations for single- and multi-response deconvolution. See also supplementary material for a zip file with code and data with algorithms, examples and simulations: (a) scripts implementing simple examples of convolution and deconvolution; (b) functions, scripts and data with the algorithms for multi-response deconvolution, with examples of use and simulations; (c) a script with a simulation verifying the accuracy of the procedure for SNR estimation; (d) functions and scripts implementing simulations related to the effect of the eigenvalues and the condition number on the deconvolved responses.
ACKNOWLEDGMENTS
This research was supported by the (1) PID2020-119073GB-I00 project grant, funded by MICIN / AEI / 10.13039/501100011033; (2) B.TIC.382.UGR20 project grant, funded by “Junta de Andalucia—Consejeria de Economia, Conocimiento, Empresas y Universidad” and by the “European Union - ERDF A way to make Europe”; (3) ProyExcel_00152 project grant, funded by “Junta de Andalucia - Consejeria de Universidad, Investigacion e Innovacion”; and (4) “Ayudas Maria Zambrano” post-doctoral fellowship, funded by the Ministry of Universities of the Spanish Government and by the “European Union—Next Generation EU.” The authors would like to sincerely acknowledge the contribution and implication of our coauthor J. C. Segura, who unfortunately passed away September 10, 2023. The authors also acknowledge the constructive revision of the anonymous reviewers.
AUTHOR DECLARATIONS
Conflict of Interest
The authors declare no conflict of interest.
Ethics Approval
Ethics approval to conduct the present research was obtained from the Human Research Ethics Committee of the University of Granada (Ref. no. 961/CEIH/2019).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.