2026-05-20
Stack Overflow: View Question
Tags: audio, webrtc, signal-processing, aec, echo-cancellation
Score: 0 | Views: 43
The asker is probing a real pain point: when acoustic echo cancellation (AEC) breaks down (reverberant rooms, double-talk, abrupt acoustic changes), commercial stacks such as WebRTC's AEC3, Zoom, and Meet fall back to residual echo suppression that effectively makes the call half-duplex through volume ducking and near-end gating. Their proposal: instead of ducking the whole spectrum, partition it. Allocate disjoint frequency sub-bands to the near-end and far-end speakers in real time, so the two signals cannot overlap in frequency inside the echo loop.
Why it's interesting: AEC fundamentally relies on estimating a linear echo path. When that estimate diverges, nothing bounds the residual: it can be as loud as the raw echo, or louder if the filter misadapts. Frequency-domain duplexing sidesteps adaptive filtering by enforcing orthogonality in a different domain, borrowing an idea from FDMA radio (frequency-division duplexing is the closest analogue). The clever bit is that speech is sparse in frequency: voiced segments concentrate energy in formants below ~3.5 kHz, with gaps between them. So you don't need to discard half the band; you allocate based on instantaneous spectral occupancy.
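To make "allocate based on instantaneous spectral occupancy" concrete, here is a minimal NumPy sketch for a single frame. It is not the asker's algorithm: the function name `allocate_bins`, the `band_bins` grouping, and the "whichever side has more energy owns the band" rule are all assumptions chosen for illustration, and a real system would add smoothing across frames to avoid audible switching.

```python
import numpy as np

def allocate_bins(near_frame, far_frame, band_bins=4):
    """Return a boolean mask over rfft bins: True where the near end owns the bin.

    Hypothetical rule (an assumption, not from the question): bins are grouped
    into bands of `band_bins`, and each band goes to the side with more energy.
    """
    window = np.hanning(len(near_frame))
    near_pow = np.abs(np.fft.rfft(near_frame * window)) ** 2
    far_pow = np.abs(np.fft.rfft(far_frame * window)) ** 2

    n_bins = near_pow.size
    near_owns = np.zeros(n_bins, dtype=bool)
    for start in range(0, n_bins, band_bins):
        stop = min(start + band_bins, n_bins)
        # Ties go to the near end; the choice is arbitrary in this sketch.
        near_owns[start:stop] = near_pow[start:stop].sum() >= far_pow[start:stop].sum()
    return near_owns

# Toy input: near end is a 500 Hz tone, far end a 2 kHz tone (16 kHz, 512 samples).
fs, n = 16000, 512
t = np.arange(n) / fs
near = np.sin(2 * np.pi * 500 * t)
far = np.sin(2 * np.pi * 2000 * t)

near_owns = allocate_bins(near, far)
far_owns = ~near_owns  # disjoint by construction: every bin has exactly one owner
```

With a bin spacing of 31.25 Hz, the 500 Hz tone falls on bin 16 and the 2 kHz tone on bin 64, so the near end should own the band around bin 16 and the far end the band around bin 64. Real speech is far less tidy than two tones, which is where the allocation policy becomes the hard part.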
Why it's hard:
Prior art to check: The closest published work is perceptually weighted residual echo suppression (Valin et al., RNNoise lineage) and frequency-domain double-talk detection in AEC3. Sub-band and frequency-domain AEC is itself standard (e.g. the partitioned-block frequency-domain adaptive filter, PBFDAF). What the asker is describing is closer to cognitive-radio-style dynamic spectrum access applied to acoustic full-duplex. I'm not aware of it as a deployed AEC fallback, though academic "spectral duplexing" papers exist for hearing aids.
Direction: Prototype with WebRTC's audio processing module (APM). Tap the far-end render signal, compute its short-time spectrum, and, when a double-talk detector fires, apply a complementary mask to the near-end capture instead of a flat suppression gain. Compare PESQ/POLQA scores against AEC3's default suppressor under reverberant conditions (image-source simulator, RT60 = 0.6 s).
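The masking step can be sketched offline before touching APM. The snippet below is a standalone NumPy illustration and does not use any WebRTC API: `complementary_gain`, `suppress_frame`, the `floor` parameter, and the caller-supplied `double_talk` flag are all hypothetical names, and the "one minus normalized far-end power" gain is just one plausible definition of a complementary mask. It also processes a single frame with no overlap-add, which a real prototype would need.

```python
import numpy as np

def complementary_gain(far_frame, floor=0.1):
    """Per-bin gain for the near-end capture: low where the far end is strong."""
    window = np.hanning(len(far_frame))
    far_pow = np.abs(np.fft.rfft(far_frame * window)) ** 2
    peak = far_pow.max()
    if peak <= 0.0:
        return np.ones_like(far_pow)  # silent far end: nothing to avoid
    occupancy = far_pow / peak        # 1.0 at the far end's strongest bin
    # Assumed mask shape: complement of occupancy, never below `floor`.
    return np.clip(1.0 - occupancy, floor, 1.0)

def suppress_frame(capture_frame, far_frame, double_talk, floor=0.1):
    """Apply the complementary mask only when the (external) detector fires."""
    if not double_talk:
        return capture_frame
    gain = complementary_gain(far_frame, floor)
    spectrum = np.fft.rfft(capture_frame)
    return np.fft.irfft(spectrum * gain, n=len(capture_frame))

# Toy input: capture = near-end 500 Hz tone plus a 2 kHz echo of the far end.
fs, n = 16000, 512
t = np.arange(n) / fs
far = np.sin(2 * np.pi * 2000 * t)
capture = np.sin(2 * np.pi * 500 * t) + 0.5 * far

masked = suppress_frame(capture, far, double_talk=True)
```

In this toy case the 2 kHz echo component is attenuated to the `floor` gain while the 500 Hz near-end tone passes almost untouched. A flat suppression gain would have attenuated both equally, which is the comparison the PESQ/POLQA evaluation is meant to quantify.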
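Gotcha: If the far end is also doing this, you've created a coupled control loop across the network. Each side reacts to an allocation that is already one network delay old, so stability is limited by the round-trip time (RTT). Allocation must be unilateral or negotiated out-of-band.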
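A toy model shows why the bilateral case misbehaves. The sketch below tracks a single frequency bin and assumes the simplest possible policy, namely that an end vacates the bin whenever the delayed signal it hears from the other end occupies it. The function `simulate_bilateral` and its parameters are hypothetical, and the model ignores everything except the delay.

```python
def simulate_bilateral(delay_frames, n_frames, b_yields=True):
    """Toy model of one frequency bin; 1 = that end transmits in the bin.

    Each end hears the other after a one-way delay of `delay_frames` and
    vacates the bin whenever the delayed remote signal occupies it.
    With b_yields=False, end B never yields (unilateral allocation by A).
    """
    a, b = [], []
    for t in range(n_frames):
        # Assumption: before any remote audio arrives, the bin counts as occupied.
        heard_b = b[t - delay_frames] if t >= delay_frames else 1
        heard_a = a[t - delay_frames] if t >= delay_frames else 1
        a.append(1 - heard_b)
        b.append(1 - heard_a if b_yields else 1)
    return a, b

a, b = simulate_bilateral(delay_frames=3, n_frames=12)
print(a)  # [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]
print(b)  # [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]

a_uni, b_uni = simulate_bilateral(delay_frames=3, n_frames=12, b_yields=False)
print(a_uni)  # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

When both ends yield, they flip in lockstep with a period of twice the one-way delay (one RTT): both are silent or both transmit, so the bin is never actually shared. When only one end yields, the allocation settles immediately, which is the argument for unilateral or out-of-band negotiation.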
