182c29810SThomas Gleixner // SPDX-License-Identifier: GPL-2.0-only
26e2055a9SGreg Kroah-Hartman /*
36e2055a9SGreg Kroah-Hartman * SpanDSP - a series of DSP components for telephony
46e2055a9SGreg Kroah-Hartman *
56e2055a9SGreg Kroah-Hartman * echo.c - A line echo canceller. This code is being developed
66e2055a9SGreg Kroah-Hartman * against and partially complies with G168.
76e2055a9SGreg Kroah-Hartman *
86e2055a9SGreg Kroah-Hartman * Written by Steve Underwood <steveu@coppice.org>
96e2055a9SGreg Kroah-Hartman * and David Rowe <david_at_rowetel_dot_com>
106e2055a9SGreg Kroah-Hartman *
116e2055a9SGreg Kroah-Hartman * Copyright (C) 2001, 2003 Steve Underwood, 2007 David Rowe
126e2055a9SGreg Kroah-Hartman *
136e2055a9SGreg Kroah-Hartman * Based on a bit from here, a bit from there, eye of toad, ear of
146e2055a9SGreg Kroah-Hartman * bat, 15 years of failed attempts by David and a few fried brain
156e2055a9SGreg Kroah-Hartman * cells.
166e2055a9SGreg Kroah-Hartman *
176e2055a9SGreg Kroah-Hartman * All rights reserved.
186e2055a9SGreg Kroah-Hartman */
196e2055a9SGreg Kroah-Hartman
206e2055a9SGreg Kroah-Hartman /*! \file */
216e2055a9SGreg Kroah-Hartman
226e2055a9SGreg Kroah-Hartman /* Implementation Notes
236e2055a9SGreg Kroah-Hartman David Rowe
246e2055a9SGreg Kroah-Hartman April 2007
256e2055a9SGreg Kroah-Hartman
266e2055a9SGreg Kroah-Hartman This code started life as Steve's NLMS algorithm with a tap
276e2055a9SGreg Kroah-Hartman rotation algorithm to handle divergence during double talk. I
286e2055a9SGreg Kroah-Hartman added a Geigel Double Talk Detector (DTD) [2] and performed some
296e2055a9SGreg Kroah-Hartman G168 tests. However I had trouble meeting the G168 requirements,
306e2055a9SGreg Kroah-Hartman especially for double talk - there were always cases where my DTD
316e2055a9SGreg Kroah-Hartman failed, for example where near end speech was under the 6dB
326e2055a9SGreg Kroah-Hartman threshold required for declaring double talk.
336e2055a9SGreg Kroah-Hartman
346e2055a9SGreg Kroah-Hartman So I tried a two path algorithm [1], which has so far given better
356e2055a9SGreg Kroah-Hartman results. The original tap rotation/Geigel algorithm is available
366e2055a9SGreg Kroah-Hartman in SVN http://svn.rowetel.com/software/oslec/tags/before_16bit.
376e2055a9SGreg Kroah-Hartman It's probably possible to make it work if some one wants to put some
386e2055a9SGreg Kroah-Hartman serious work into it.
396e2055a9SGreg Kroah-Hartman
406e2055a9SGreg Kroah-Hartman At present no special treatment is provided for tones, which
416e2055a9SGreg Kroah-Hartman generally cause NLMS algorithms to diverge. Initial runs of a
426e2055a9SGreg Kroah-Hartman subset of the G168 tests for tones (e.g ./echo_test 6) show the
436e2055a9SGreg Kroah-Hartman current algorithm is passing OK, which is kind of surprising. The
446e2055a9SGreg Kroah-Hartman full set of tests needs to be performed to confirm this result.
456e2055a9SGreg Kroah-Hartman
466e2055a9SGreg Kroah-Hartman One other interesting change is that I have managed to get the NLMS
476e2055a9SGreg Kroah-Hartman code to work with 16 bit coefficients, rather than the original 32
486e2055a9SGreg Kroah-Hartman bit coefficents. This reduces the MIPs and storage required.
496e2055a9SGreg Kroah-Hartman I evaulated the 16 bit port using g168_tests.sh and listening tests
506e2055a9SGreg Kroah-Hartman on 4 real-world samples.
516e2055a9SGreg Kroah-Hartman
526e2055a9SGreg Kroah-Hartman I also attempted the implementation of a block based NLMS update
536e2055a9SGreg Kroah-Hartman [2] but although this passes g168_tests.sh it didn't converge well
546e2055a9SGreg Kroah-Hartman on the real-world samples. I have no idea why, perhaps a scaling
556e2055a9SGreg Kroah-Hartman problem. The block based code is also available in SVN
566e2055a9SGreg Kroah-Hartman http://svn.rowetel.com/software/oslec/tags/before_16bit. If this
576e2055a9SGreg Kroah-Hartman code can be debugged, it will lead to further reduction in MIPS, as
586e2055a9SGreg Kroah-Hartman the block update code maps nicely onto DSP instruction sets (it's a
596e2055a9SGreg Kroah-Hartman dot product) compared to the current sample-by-sample update.
606e2055a9SGreg Kroah-Hartman
616e2055a9SGreg Kroah-Hartman Steve also has some nice notes on echo cancellers in echo.h
626e2055a9SGreg Kroah-Hartman
636e2055a9SGreg Kroah-Hartman References:
646e2055a9SGreg Kroah-Hartman
656e2055a9SGreg Kroah-Hartman [1] Ochiai, Areseki, and Ogihara, "Echo Canceller with Two Echo
666e2055a9SGreg Kroah-Hartman Path Models", IEEE Transactions on communications, COM-25,
676e2055a9SGreg Kroah-Hartman No. 6, June
686e2055a9SGreg Kroah-Hartman 1977.
69*4e74eeb2SAlexander A. Klimov https://www.rowetel.com/images/echo/dual_path_paper.pdf
706e2055a9SGreg Kroah-Hartman
716e2055a9SGreg Kroah-Hartman [2] The classic, very useful paper that tells you how to
726e2055a9SGreg Kroah-Hartman actually build a real world echo canceller:
736e2055a9SGreg Kroah-Hartman Messerschmitt, Hedberg, Cole, Haoui, Winship, "Digital Voice
746e2055a9SGreg Kroah-Hartman Echo Canceller with a TMS320020,
75*4e74eeb2SAlexander A. Klimov https://www.rowetel.com/images/echo/spra129.pdf
766e2055a9SGreg Kroah-Hartman
776e2055a9SGreg Kroah-Hartman [3] I have written a series of blog posts on this work, here is
786e2055a9SGreg Kroah-Hartman Part 1: http://www.rowetel.com/blog/?p=18
796e2055a9SGreg Kroah-Hartman
806e2055a9SGreg Kroah-Hartman [4] The source code http://svn.rowetel.com/software/oslec/
816e2055a9SGreg Kroah-Hartman
826e2055a9SGreg Kroah-Hartman [5] A nice reference on LMS filters:
83*4e74eeb2SAlexander A. Klimov https://en.wikipedia.org/wiki/Least_mean_squares_filter
846e2055a9SGreg Kroah-Hartman
856e2055a9SGreg Kroah-Hartman Credits:
866e2055a9SGreg Kroah-Hartman
876e2055a9SGreg Kroah-Hartman Thanks to Steve Underwood, Jean-Marc Valin, and Ramakrishnan
886e2055a9SGreg Kroah-Hartman Muthukrishnan for their suggestions and email discussions. Thanks
896e2055a9SGreg Kroah-Hartman also to those people who collected echo samples for me such as
906e2055a9SGreg Kroah-Hartman Mark, Pawel, and Pavel.
916e2055a9SGreg Kroah-Hartman */
926e2055a9SGreg Kroah-Hartman
936e2055a9SGreg Kroah-Hartman #include <linux/kernel.h>
946e2055a9SGreg Kroah-Hartman #include <linux/module.h>
956e2055a9SGreg Kroah-Hartman #include <linux/slab.h>
966e2055a9SGreg Kroah-Hartman
976e2055a9SGreg Kroah-Hartman #include "echo.h"
986e2055a9SGreg Kroah-Hartman
996e2055a9SGreg Kroah-Hartman #define MIN_TX_POWER_FOR_ADAPTION 64
1006e2055a9SGreg Kroah-Hartman #define MIN_RX_POWER_FOR_ADAPTION 64
1016e2055a9SGreg Kroah-Hartman #define DTD_HANGOVER 600 /* 600 samples, or 75ms */
1026e2055a9SGreg Kroah-Hartman #define DC_LOG2BETA 3 /* log2() of DC filter Beta */
1036e2055a9SGreg Kroah-Hartman
1046e2055a9SGreg Kroah-Hartman /* adapting coeffs using the traditional stochastic descent (N)LMS algorithm */
1056e2055a9SGreg Kroah-Hartman
lms_adapt_bg(struct oslec_state * ec,int clean,int shift)1066e2055a9SGreg Kroah-Hartman static inline void lms_adapt_bg(struct oslec_state *ec, int clean, int shift)
1076e2055a9SGreg Kroah-Hartman {
1086e2055a9SGreg Kroah-Hartman int i;
1096e2055a9SGreg Kroah-Hartman
1106e2055a9SGreg Kroah-Hartman int offset1;
1116e2055a9SGreg Kroah-Hartman int offset2;
1126e2055a9SGreg Kroah-Hartman int factor;
1136e2055a9SGreg Kroah-Hartman int exp;
1146e2055a9SGreg Kroah-Hartman
1156e2055a9SGreg Kroah-Hartman if (shift > 0)
1166e2055a9SGreg Kroah-Hartman factor = clean << shift;
1176e2055a9SGreg Kroah-Hartman else
1186e2055a9SGreg Kroah-Hartman factor = clean >> -shift;
1196e2055a9SGreg Kroah-Hartman
1206e2055a9SGreg Kroah-Hartman /* Update the FIR taps */
1216e2055a9SGreg Kroah-Hartman
1226e2055a9SGreg Kroah-Hartman offset2 = ec->curr_pos;
1236e2055a9SGreg Kroah-Hartman offset1 = ec->taps - offset2;
1246e2055a9SGreg Kroah-Hartman
1256e2055a9SGreg Kroah-Hartman for (i = ec->taps - 1; i >= offset1; i--) {
1266e2055a9SGreg Kroah-Hartman exp = (ec->fir_state_bg.history[i - offset1] * factor);
1276e2055a9SGreg Kroah-Hartman ec->fir_taps16[1][i] += (int16_t) ((exp + (1 << 14)) >> 15);
1286e2055a9SGreg Kroah-Hartman }
1296e2055a9SGreg Kroah-Hartman for (; i >= 0; i--) {
1306e2055a9SGreg Kroah-Hartman exp = (ec->fir_state_bg.history[i + offset2] * factor);
1316e2055a9SGreg Kroah-Hartman ec->fir_taps16[1][i] += (int16_t) ((exp + (1 << 14)) >> 15);
1326e2055a9SGreg Kroah-Hartman }
1336e2055a9SGreg Kroah-Hartman }
1346e2055a9SGreg Kroah-Hartman
top_bit(unsigned int bits)1356e2055a9SGreg Kroah-Hartman static inline int top_bit(unsigned int bits)
1366e2055a9SGreg Kroah-Hartman {
1376e2055a9SGreg Kroah-Hartman if (bits == 0)
1386e2055a9SGreg Kroah-Hartman return -1;
1396e2055a9SGreg Kroah-Hartman else
1406e2055a9SGreg Kroah-Hartman return (int)fls((int32_t) bits) - 1;
1416e2055a9SGreg Kroah-Hartman }
1426e2055a9SGreg Kroah-Hartman
oslec_create(int len,int adaption_mode)1436e2055a9SGreg Kroah-Hartman struct oslec_state *oslec_create(int len, int adaption_mode)
1446e2055a9SGreg Kroah-Hartman {
1456e2055a9SGreg Kroah-Hartman struct oslec_state *ec;
1466e2055a9SGreg Kroah-Hartman int i;
1476e2055a9SGreg Kroah-Hartman const int16_t *history;
1486e2055a9SGreg Kroah-Hartman
1496e2055a9SGreg Kroah-Hartman ec = kzalloc(sizeof(*ec), GFP_KERNEL);
1506e2055a9SGreg Kroah-Hartman if (!ec)
1516e2055a9SGreg Kroah-Hartman return NULL;
1526e2055a9SGreg Kroah-Hartman
1536e2055a9SGreg Kroah-Hartman ec->taps = len;
1546e2055a9SGreg Kroah-Hartman ec->log2taps = top_bit(len);
1556e2055a9SGreg Kroah-Hartman ec->curr_pos = ec->taps - 1;
1566e2055a9SGreg Kroah-Hartman
1576e2055a9SGreg Kroah-Hartman ec->fir_taps16[0] =
1586e2055a9SGreg Kroah-Hartman kcalloc(ec->taps, sizeof(int16_t), GFP_KERNEL);
1596e2055a9SGreg Kroah-Hartman if (!ec->fir_taps16[0])
1606e2055a9SGreg Kroah-Hartman goto error_oom_0;
1616e2055a9SGreg Kroah-Hartman
1626e2055a9SGreg Kroah-Hartman ec->fir_taps16[1] =
1636e2055a9SGreg Kroah-Hartman kcalloc(ec->taps, sizeof(int16_t), GFP_KERNEL);
1646e2055a9SGreg Kroah-Hartman if (!ec->fir_taps16[1])
1656e2055a9SGreg Kroah-Hartman goto error_oom_1;
1666e2055a9SGreg Kroah-Hartman
1676e2055a9SGreg Kroah-Hartman history = fir16_create(&ec->fir_state, ec->fir_taps16[0], ec->taps);
1686e2055a9SGreg Kroah-Hartman if (!history)
1696e2055a9SGreg Kroah-Hartman goto error_state;
1706e2055a9SGreg Kroah-Hartman history = fir16_create(&ec->fir_state_bg, ec->fir_taps16[1], ec->taps);
1716e2055a9SGreg Kroah-Hartman if (!history)
1726e2055a9SGreg Kroah-Hartman goto error_state_bg;
1736e2055a9SGreg Kroah-Hartman
1746e2055a9SGreg Kroah-Hartman for (i = 0; i < 5; i++)
1756e2055a9SGreg Kroah-Hartman ec->xvtx[i] = ec->yvtx[i] = ec->xvrx[i] = ec->yvrx[i] = 0;
1766e2055a9SGreg Kroah-Hartman
1776e2055a9SGreg Kroah-Hartman ec->cng_level = 1000;
1786e2055a9SGreg Kroah-Hartman oslec_adaption_mode(ec, adaption_mode);
1796e2055a9SGreg Kroah-Hartman
1806e2055a9SGreg Kroah-Hartman ec->snapshot = kcalloc(ec->taps, sizeof(int16_t), GFP_KERNEL);
1816e2055a9SGreg Kroah-Hartman if (!ec->snapshot)
1826e2055a9SGreg Kroah-Hartman goto error_snap;
1836e2055a9SGreg Kroah-Hartman
1846e2055a9SGreg Kroah-Hartman ec->cond_met = 0;
1856e2055a9SGreg Kroah-Hartman ec->pstates = 0;
1866e2055a9SGreg Kroah-Hartman ec->ltxacc = ec->lrxacc = ec->lcleanacc = ec->lclean_bgacc = 0;
1876e2055a9SGreg Kroah-Hartman ec->ltx = ec->lrx = ec->lclean = ec->lclean_bg = 0;
1886e2055a9SGreg Kroah-Hartman ec->tx_1 = ec->tx_2 = ec->rx_1 = ec->rx_2 = 0;
1896e2055a9SGreg Kroah-Hartman ec->lbgn = ec->lbgn_acc = 0;
1906e2055a9SGreg Kroah-Hartman ec->lbgn_upper = 200;
1916e2055a9SGreg Kroah-Hartman ec->lbgn_upper_acc = ec->lbgn_upper << 13;
1926e2055a9SGreg Kroah-Hartman
1936e2055a9SGreg Kroah-Hartman return ec;
1946e2055a9SGreg Kroah-Hartman
1956e2055a9SGreg Kroah-Hartman error_snap:
1966e2055a9SGreg Kroah-Hartman fir16_free(&ec->fir_state_bg);
1976e2055a9SGreg Kroah-Hartman error_state_bg:
1986e2055a9SGreg Kroah-Hartman fir16_free(&ec->fir_state);
1996e2055a9SGreg Kroah-Hartman error_state:
2006e2055a9SGreg Kroah-Hartman kfree(ec->fir_taps16[1]);
2016e2055a9SGreg Kroah-Hartman error_oom_1:
2026e2055a9SGreg Kroah-Hartman kfree(ec->fir_taps16[0]);
2036e2055a9SGreg Kroah-Hartman error_oom_0:
2046e2055a9SGreg Kroah-Hartman kfree(ec);
2056e2055a9SGreg Kroah-Hartman return NULL;
2066e2055a9SGreg Kroah-Hartman }
2076e2055a9SGreg Kroah-Hartman EXPORT_SYMBOL_GPL(oslec_create);
2086e2055a9SGreg Kroah-Hartman
oslec_free(struct oslec_state * ec)2096e2055a9SGreg Kroah-Hartman void oslec_free(struct oslec_state *ec)
2106e2055a9SGreg Kroah-Hartman {
2116e2055a9SGreg Kroah-Hartman int i;
2126e2055a9SGreg Kroah-Hartman
2136e2055a9SGreg Kroah-Hartman fir16_free(&ec->fir_state);
2146e2055a9SGreg Kroah-Hartman fir16_free(&ec->fir_state_bg);
2156e2055a9SGreg Kroah-Hartman for (i = 0; i < 2; i++)
2166e2055a9SGreg Kroah-Hartman kfree(ec->fir_taps16[i]);
2176e2055a9SGreg Kroah-Hartman kfree(ec->snapshot);
2186e2055a9SGreg Kroah-Hartman kfree(ec);
2196e2055a9SGreg Kroah-Hartman }
2206e2055a9SGreg Kroah-Hartman EXPORT_SYMBOL_GPL(oslec_free);
2216e2055a9SGreg Kroah-Hartman
oslec_adaption_mode(struct oslec_state * ec,int adaption_mode)2226e2055a9SGreg Kroah-Hartman void oslec_adaption_mode(struct oslec_state *ec, int adaption_mode)
2236e2055a9SGreg Kroah-Hartman {
2246e2055a9SGreg Kroah-Hartman ec->adaption_mode = adaption_mode;
2256e2055a9SGreg Kroah-Hartman }
2266e2055a9SGreg Kroah-Hartman EXPORT_SYMBOL_GPL(oslec_adaption_mode);
2276e2055a9SGreg Kroah-Hartman
oslec_flush(struct oslec_state * ec)2286e2055a9SGreg Kroah-Hartman void oslec_flush(struct oslec_state *ec)
2296e2055a9SGreg Kroah-Hartman {
2306e2055a9SGreg Kroah-Hartman int i;
2316e2055a9SGreg Kroah-Hartman
2326e2055a9SGreg Kroah-Hartman ec->ltxacc = ec->lrxacc = ec->lcleanacc = ec->lclean_bgacc = 0;
2336e2055a9SGreg Kroah-Hartman ec->ltx = ec->lrx = ec->lclean = ec->lclean_bg = 0;
2346e2055a9SGreg Kroah-Hartman ec->tx_1 = ec->tx_2 = ec->rx_1 = ec->rx_2 = 0;
2356e2055a9SGreg Kroah-Hartman
2366e2055a9SGreg Kroah-Hartman ec->lbgn = ec->lbgn_acc = 0;
2376e2055a9SGreg Kroah-Hartman ec->lbgn_upper = 200;
2386e2055a9SGreg Kroah-Hartman ec->lbgn_upper_acc = ec->lbgn_upper << 13;
2396e2055a9SGreg Kroah-Hartman
2406e2055a9SGreg Kroah-Hartman ec->nonupdate_dwell = 0;
2416e2055a9SGreg Kroah-Hartman
2426e2055a9SGreg Kroah-Hartman fir16_flush(&ec->fir_state);
2436e2055a9SGreg Kroah-Hartman fir16_flush(&ec->fir_state_bg);
2446e2055a9SGreg Kroah-Hartman ec->fir_state.curr_pos = ec->taps - 1;
2456e2055a9SGreg Kroah-Hartman ec->fir_state_bg.curr_pos = ec->taps - 1;
2466e2055a9SGreg Kroah-Hartman for (i = 0; i < 2; i++)
2476e2055a9SGreg Kroah-Hartman memset(ec->fir_taps16[i], 0, ec->taps * sizeof(int16_t));
2486e2055a9SGreg Kroah-Hartman
2496e2055a9SGreg Kroah-Hartman ec->curr_pos = ec->taps - 1;
2506e2055a9SGreg Kroah-Hartman ec->pstates = 0;
2516e2055a9SGreg Kroah-Hartman }
2526e2055a9SGreg Kroah-Hartman EXPORT_SYMBOL_GPL(oslec_flush);
2536e2055a9SGreg Kroah-Hartman
oslec_snapshot(struct oslec_state * ec)2546e2055a9SGreg Kroah-Hartman void oslec_snapshot(struct oslec_state *ec)
2556e2055a9SGreg Kroah-Hartman {
2566e2055a9SGreg Kroah-Hartman memcpy(ec->snapshot, ec->fir_taps16[0], ec->taps * sizeof(int16_t));
2576e2055a9SGreg Kroah-Hartman }
2586e2055a9SGreg Kroah-Hartman EXPORT_SYMBOL_GPL(oslec_snapshot);
2596e2055a9SGreg Kroah-Hartman
2606e2055a9SGreg Kroah-Hartman /* Dual Path Echo Canceller */
2616e2055a9SGreg Kroah-Hartman
oslec_update(struct oslec_state * ec,int16_t tx,int16_t rx)2626e2055a9SGreg Kroah-Hartman int16_t oslec_update(struct oslec_state *ec, int16_t tx, int16_t rx)
2636e2055a9SGreg Kroah-Hartman {
2646e2055a9SGreg Kroah-Hartman int32_t echo_value;
2656e2055a9SGreg Kroah-Hartman int clean_bg;
2666e2055a9SGreg Kroah-Hartman int tmp;
2676e2055a9SGreg Kroah-Hartman int tmp1;
2686e2055a9SGreg Kroah-Hartman
2696e2055a9SGreg Kroah-Hartman /*
2706e2055a9SGreg Kroah-Hartman * Input scaling was found be required to prevent problems when tx
2716e2055a9SGreg Kroah-Hartman * starts clipping. Another possible way to handle this would be the
2726e2055a9SGreg Kroah-Hartman * filter coefficent scaling.
2736e2055a9SGreg Kroah-Hartman */
2746e2055a9SGreg Kroah-Hartman
2756e2055a9SGreg Kroah-Hartman ec->tx = tx;
2766e2055a9SGreg Kroah-Hartman ec->rx = rx;
2776e2055a9SGreg Kroah-Hartman tx >>= 1;
2786e2055a9SGreg Kroah-Hartman rx >>= 1;
2796e2055a9SGreg Kroah-Hartman
2806e2055a9SGreg Kroah-Hartman /*
2816e2055a9SGreg Kroah-Hartman * Filter DC, 3dB point is 160Hz (I think), note 32 bit precision
2826e2055a9SGreg Kroah-Hartman * required otherwise values do not track down to 0. Zero at DC, Pole
2836e2055a9SGreg Kroah-Hartman * at (1-Beta) on real axis. Some chip sets (like Si labs) don't
2846e2055a9SGreg Kroah-Hartman * need this, but something like a $10 X100P card does. Any DC really
2856e2055a9SGreg Kroah-Hartman * slows down convergence.
2866e2055a9SGreg Kroah-Hartman *
2876e2055a9SGreg Kroah-Hartman * Note: removes some low frequency from the signal, this reduces the
2886e2055a9SGreg Kroah-Hartman * speech quality when listening to samples through headphones but may
2896e2055a9SGreg Kroah-Hartman * not be obvious through a telephone handset.
2906e2055a9SGreg Kroah-Hartman *
2916e2055a9SGreg Kroah-Hartman * Note that the 3dB frequency in radians is approx Beta, e.g. for Beta
2926e2055a9SGreg Kroah-Hartman * = 2^(-3) = 0.125, 3dB freq is 0.125 rads = 159Hz.
2936e2055a9SGreg Kroah-Hartman */
2946e2055a9SGreg Kroah-Hartman
2956e2055a9SGreg Kroah-Hartman if (ec->adaption_mode & ECHO_CAN_USE_RX_HPF) {
2966e2055a9SGreg Kroah-Hartman tmp = rx << 15;
2976e2055a9SGreg Kroah-Hartman
2986e2055a9SGreg Kroah-Hartman /*
2996e2055a9SGreg Kroah-Hartman * Make sure the gain of the HPF is 1.0. This can still
3006e2055a9SGreg Kroah-Hartman * saturate a little under impulse conditions, and it might
3016e2055a9SGreg Kroah-Hartman * roll to 32768 and need clipping on sustained peak level
3026e2055a9SGreg Kroah-Hartman * signals. However, the scale of such clipping is small, and
3036e2055a9SGreg Kroah-Hartman * the error due to any saturation should not markedly affect
3046e2055a9SGreg Kroah-Hartman * the downstream processing.
3056e2055a9SGreg Kroah-Hartman */
3066e2055a9SGreg Kroah-Hartman tmp -= (tmp >> 4);
3076e2055a9SGreg Kroah-Hartman
3086e2055a9SGreg Kroah-Hartman ec->rx_1 += -(ec->rx_1 >> DC_LOG2BETA) + tmp - ec->rx_2;
3096e2055a9SGreg Kroah-Hartman
3106e2055a9SGreg Kroah-Hartman /*
3116e2055a9SGreg Kroah-Hartman * hard limit filter to prevent clipping. Note that at this
3126e2055a9SGreg Kroah-Hartman * stage rx should be limited to +/- 16383 due to right shift
3136e2055a9SGreg Kroah-Hartman * above
3146e2055a9SGreg Kroah-Hartman */
3156e2055a9SGreg Kroah-Hartman tmp1 = ec->rx_1 >> 15;
3166e2055a9SGreg Kroah-Hartman if (tmp1 > 16383)
3176e2055a9SGreg Kroah-Hartman tmp1 = 16383;
3186e2055a9SGreg Kroah-Hartman if (tmp1 < -16383)
3196e2055a9SGreg Kroah-Hartman tmp1 = -16383;
3206e2055a9SGreg Kroah-Hartman rx = tmp1;
3216e2055a9SGreg Kroah-Hartman ec->rx_2 = tmp;
3226e2055a9SGreg Kroah-Hartman }
3236e2055a9SGreg Kroah-Hartman
3246e2055a9SGreg Kroah-Hartman /* Block average of power in the filter states. Used for
3256e2055a9SGreg Kroah-Hartman adaption power calculation. */
3266e2055a9SGreg Kroah-Hartman
3276e2055a9SGreg Kroah-Hartman {
3286e2055a9SGreg Kroah-Hartman int new, old;
3296e2055a9SGreg Kroah-Hartman
3306e2055a9SGreg Kroah-Hartman /* efficient "out with the old and in with the new" algorithm so
3316e2055a9SGreg Kroah-Hartman we don't have to recalculate over the whole block of
3326e2055a9SGreg Kroah-Hartman samples. */
3336e2055a9SGreg Kroah-Hartman new = (int)tx * (int)tx;
3346e2055a9SGreg Kroah-Hartman old = (int)ec->fir_state.history[ec->fir_state.curr_pos] *
3356e2055a9SGreg Kroah-Hartman (int)ec->fir_state.history[ec->fir_state.curr_pos];
3366e2055a9SGreg Kroah-Hartman ec->pstates +=
3376e2055a9SGreg Kroah-Hartman ((new - old) + (1 << (ec->log2taps - 1))) >> ec->log2taps;
3386e2055a9SGreg Kroah-Hartman if (ec->pstates < 0)
3396e2055a9SGreg Kroah-Hartman ec->pstates = 0;
3406e2055a9SGreg Kroah-Hartman }
3416e2055a9SGreg Kroah-Hartman
3426e2055a9SGreg Kroah-Hartman /* Calculate short term average levels using simple single pole IIRs */
3436e2055a9SGreg Kroah-Hartman
3446e2055a9SGreg Kroah-Hartman ec->ltxacc += abs(tx) - ec->ltx;
3456e2055a9SGreg Kroah-Hartman ec->ltx = (ec->ltxacc + (1 << 4)) >> 5;
3466e2055a9SGreg Kroah-Hartman ec->lrxacc += abs(rx) - ec->lrx;
3476e2055a9SGreg Kroah-Hartman ec->lrx = (ec->lrxacc + (1 << 4)) >> 5;
3486e2055a9SGreg Kroah-Hartman
3496e2055a9SGreg Kroah-Hartman /* Foreground filter */
3506e2055a9SGreg Kroah-Hartman
3516e2055a9SGreg Kroah-Hartman ec->fir_state.coeffs = ec->fir_taps16[0];
3526e2055a9SGreg Kroah-Hartman echo_value = fir16(&ec->fir_state, tx);
3536e2055a9SGreg Kroah-Hartman ec->clean = rx - echo_value;
3546e2055a9SGreg Kroah-Hartman ec->lcleanacc += abs(ec->clean) - ec->lclean;
3556e2055a9SGreg Kroah-Hartman ec->lclean = (ec->lcleanacc + (1 << 4)) >> 5;
3566e2055a9SGreg Kroah-Hartman
3576e2055a9SGreg Kroah-Hartman /* Background filter */
3586e2055a9SGreg Kroah-Hartman
3596e2055a9SGreg Kroah-Hartman echo_value = fir16(&ec->fir_state_bg, tx);
3606e2055a9SGreg Kroah-Hartman clean_bg = rx - echo_value;
3616e2055a9SGreg Kroah-Hartman ec->lclean_bgacc += abs(clean_bg) - ec->lclean_bg;
3626e2055a9SGreg Kroah-Hartman ec->lclean_bg = (ec->lclean_bgacc + (1 << 4)) >> 5;
3636e2055a9SGreg Kroah-Hartman
3646e2055a9SGreg Kroah-Hartman /* Background Filter adaption */
3656e2055a9SGreg Kroah-Hartman
3666e2055a9SGreg Kroah-Hartman /* Almost always adap bg filter, just simple DT and energy
3676e2055a9SGreg Kroah-Hartman detection to minimise adaption in cases of strong double talk.
3686e2055a9SGreg Kroah-Hartman However this is not critical for the dual path algorithm.
3696e2055a9SGreg Kroah-Hartman */
3706e2055a9SGreg Kroah-Hartman ec->factor = 0;
3716e2055a9SGreg Kroah-Hartman ec->shift = 0;
37285dc2c65SNathan Chancellor if (!ec->nonupdate_dwell) {
3736e2055a9SGreg Kroah-Hartman int p, logp, shift;
3746e2055a9SGreg Kroah-Hartman
3756e2055a9SGreg Kroah-Hartman /* Determine:
3766e2055a9SGreg Kroah-Hartman
3776e2055a9SGreg Kroah-Hartman f = Beta * clean_bg_rx/P ------ (1)
3786e2055a9SGreg Kroah-Hartman
3796e2055a9SGreg Kroah-Hartman where P is the total power in the filter states.
3806e2055a9SGreg Kroah-Hartman
3816e2055a9SGreg Kroah-Hartman The Boffins have shown that if we obey (1) we converge
3826e2055a9SGreg Kroah-Hartman quickly and avoid instability.
3836e2055a9SGreg Kroah-Hartman
3846e2055a9SGreg Kroah-Hartman The correct factor f must be in Q30, as this is the fixed
3856e2055a9SGreg Kroah-Hartman point format required by the lms_adapt_bg() function,
3866e2055a9SGreg Kroah-Hartman therefore the scaled version of (1) is:
3876e2055a9SGreg Kroah-Hartman
3886e2055a9SGreg Kroah-Hartman (2^30) * f = (2^30) * Beta * clean_bg_rx/P
3896e2055a9SGreg Kroah-Hartman factor = (2^30) * Beta * clean_bg_rx/P ----- (2)
3906e2055a9SGreg Kroah-Hartman
3916e2055a9SGreg Kroah-Hartman We have chosen Beta = 0.25 by experiment, so:
3926e2055a9SGreg Kroah-Hartman
3936e2055a9SGreg Kroah-Hartman factor = (2^30) * (2^-2) * clean_bg_rx/P
3946e2055a9SGreg Kroah-Hartman
3956e2055a9SGreg Kroah-Hartman (30 - 2 - log2(P))
3966e2055a9SGreg Kroah-Hartman factor = clean_bg_rx 2 ----- (3)
3976e2055a9SGreg Kroah-Hartman
3986e2055a9SGreg Kroah-Hartman To avoid a divide we approximate log2(P) as top_bit(P),
3996e2055a9SGreg Kroah-Hartman which returns the position of the highest non-zero bit in
4006e2055a9SGreg Kroah-Hartman P. This approximation introduces an error as large as a
4016e2055a9SGreg Kroah-Hartman factor of 2, but the algorithm seems to handle it OK.
4026e2055a9SGreg Kroah-Hartman
4036e2055a9SGreg Kroah-Hartman Come to think of it a divide may not be a big deal on a
4046e2055a9SGreg Kroah-Hartman modern DSP, so its probably worth checking out the cycles
4056e2055a9SGreg Kroah-Hartman for a divide versus a top_bit() implementation.
4066e2055a9SGreg Kroah-Hartman */
4076e2055a9SGreg Kroah-Hartman
4086e2055a9SGreg Kroah-Hartman p = MIN_TX_POWER_FOR_ADAPTION + ec->pstates;
4096e2055a9SGreg Kroah-Hartman logp = top_bit(p) + ec->log2taps;
4106e2055a9SGreg Kroah-Hartman shift = 30 - 2 - logp;
4116e2055a9SGreg Kroah-Hartman ec->shift = shift;
4126e2055a9SGreg Kroah-Hartman
4136e2055a9SGreg Kroah-Hartman lms_adapt_bg(ec, clean_bg, shift);
4146e2055a9SGreg Kroah-Hartman }
4156e2055a9SGreg Kroah-Hartman
4166e2055a9SGreg Kroah-Hartman /* very simple DTD to make sure we dont try and adapt with strong
4176e2055a9SGreg Kroah-Hartman near end speech */
4186e2055a9SGreg Kroah-Hartman
4196e2055a9SGreg Kroah-Hartman ec->adapt = 0;
4206e2055a9SGreg Kroah-Hartman if ((ec->lrx > MIN_RX_POWER_FOR_ADAPTION) && (ec->lrx > ec->ltx))
4216e2055a9SGreg Kroah-Hartman ec->nonupdate_dwell = DTD_HANGOVER;
4226e2055a9SGreg Kroah-Hartman if (ec->nonupdate_dwell)
4236e2055a9SGreg Kroah-Hartman ec->nonupdate_dwell--;
4246e2055a9SGreg Kroah-Hartman
4256e2055a9SGreg Kroah-Hartman /* Transfer logic */
4266e2055a9SGreg Kroah-Hartman
4276e2055a9SGreg Kroah-Hartman /* These conditions are from the dual path paper [1], I messed with
4286e2055a9SGreg Kroah-Hartman them a bit to improve performance. */
4296e2055a9SGreg Kroah-Hartman
4306e2055a9SGreg Kroah-Hartman if ((ec->adaption_mode & ECHO_CAN_USE_ADAPTION) &&
4316e2055a9SGreg Kroah-Hartman (ec->nonupdate_dwell == 0) &&
4326e2055a9SGreg Kroah-Hartman /* (ec->Lclean_bg < 0.875*ec->Lclean) */
4336e2055a9SGreg Kroah-Hartman (8 * ec->lclean_bg < 7 * ec->lclean) &&
4346e2055a9SGreg Kroah-Hartman /* (ec->Lclean_bg < 0.125*ec->Ltx) */
4356e2055a9SGreg Kroah-Hartman (8 * ec->lclean_bg < ec->ltx)) {
4366e2055a9SGreg Kroah-Hartman if (ec->cond_met == 6) {
4376e2055a9SGreg Kroah-Hartman /*
4386e2055a9SGreg Kroah-Hartman * BG filter has had better results for 6 consecutive
4396e2055a9SGreg Kroah-Hartman * samples
4406e2055a9SGreg Kroah-Hartman */
4416e2055a9SGreg Kroah-Hartman ec->adapt = 1;
4426e2055a9SGreg Kroah-Hartman memcpy(ec->fir_taps16[0], ec->fir_taps16[1],
4436e2055a9SGreg Kroah-Hartman ec->taps * sizeof(int16_t));
4446e2055a9SGreg Kroah-Hartman } else
4456e2055a9SGreg Kroah-Hartman ec->cond_met++;
4466e2055a9SGreg Kroah-Hartman } else
4476e2055a9SGreg Kroah-Hartman ec->cond_met = 0;
4486e2055a9SGreg Kroah-Hartman
4496e2055a9SGreg Kroah-Hartman /* Non-Linear Processing */
4506e2055a9SGreg Kroah-Hartman
4516e2055a9SGreg Kroah-Hartman ec->clean_nlp = ec->clean;
4526e2055a9SGreg Kroah-Hartman if (ec->adaption_mode & ECHO_CAN_USE_NLP) {
4536e2055a9SGreg Kroah-Hartman /*
4546e2055a9SGreg Kroah-Hartman * Non-linear processor - a fancy way to say "zap small
4556e2055a9SGreg Kroah-Hartman * signals, to avoid residual echo due to (uLaw/ALaw)
4566e2055a9SGreg Kroah-Hartman * non-linearity in the channel.".
4576e2055a9SGreg Kroah-Hartman */
4586e2055a9SGreg Kroah-Hartman
4596e2055a9SGreg Kroah-Hartman if ((16 * ec->lclean < ec->ltx)) {
4606e2055a9SGreg Kroah-Hartman /*
4616e2055a9SGreg Kroah-Hartman * Our e/c has improved echo by at least 24 dB (each
4626e2055a9SGreg Kroah-Hartman * factor of 2 is 6dB, so 2*2*2*2=16 is the same as
4636e2055a9SGreg Kroah-Hartman * 6+6+6+6=24dB)
4646e2055a9SGreg Kroah-Hartman */
4656e2055a9SGreg Kroah-Hartman if (ec->adaption_mode & ECHO_CAN_USE_CNG) {
4666e2055a9SGreg Kroah-Hartman ec->cng_level = ec->lbgn;
4676e2055a9SGreg Kroah-Hartman
4686e2055a9SGreg Kroah-Hartman /*
4696e2055a9SGreg Kroah-Hartman * Very elementary comfort noise generation.
4706e2055a9SGreg Kroah-Hartman * Just random numbers rolled off very vaguely
4716e2055a9SGreg Kroah-Hartman * Hoth-like. DR: This noise doesn't sound
4726e2055a9SGreg Kroah-Hartman * quite right to me - I suspect there are some
4736e2055a9SGreg Kroah-Hartman * overflow issues in the filtering as it's too
4746e2055a9SGreg Kroah-Hartman * "crackly".
4756e2055a9SGreg Kroah-Hartman * TODO: debug this, maybe just play noise at
4766e2055a9SGreg Kroah-Hartman * high level or look at spectrum.
4776e2055a9SGreg Kroah-Hartman */
4786e2055a9SGreg Kroah-Hartman
4796e2055a9SGreg Kroah-Hartman ec->cng_rndnum =
4806e2055a9SGreg Kroah-Hartman 1664525U * ec->cng_rndnum + 1013904223U;
4816e2055a9SGreg Kroah-Hartman ec->cng_filter =
4826e2055a9SGreg Kroah-Hartman ((ec->cng_rndnum & 0xFFFF) - 32768 +
4836e2055a9SGreg Kroah-Hartman 5 * ec->cng_filter) >> 3;
4846e2055a9SGreg Kroah-Hartman ec->clean_nlp =
4856e2055a9SGreg Kroah-Hartman (ec->cng_filter * ec->cng_level * 8) >> 14;
4866e2055a9SGreg Kroah-Hartman
4876e2055a9SGreg Kroah-Hartman } else if (ec->adaption_mode & ECHO_CAN_USE_CLIP) {
4886e2055a9SGreg Kroah-Hartman /* This sounds much better than CNG */
4896e2055a9SGreg Kroah-Hartman if (ec->clean_nlp > ec->lbgn)
4906e2055a9SGreg Kroah-Hartman ec->clean_nlp = ec->lbgn;
4916e2055a9SGreg Kroah-Hartman if (ec->clean_nlp < -ec->lbgn)
4926e2055a9SGreg Kroah-Hartman ec->clean_nlp = -ec->lbgn;
4936e2055a9SGreg Kroah-Hartman } else {
4946e2055a9SGreg Kroah-Hartman /*
4956e2055a9SGreg Kroah-Hartman * just mute the residual, doesn't sound very
4966e2055a9SGreg Kroah-Hartman * good, used mainly in G168 tests
4976e2055a9SGreg Kroah-Hartman */
4986e2055a9SGreg Kroah-Hartman ec->clean_nlp = 0;
4996e2055a9SGreg Kroah-Hartman }
5006e2055a9SGreg Kroah-Hartman } else {
5016e2055a9SGreg Kroah-Hartman /*
5026e2055a9SGreg Kroah-Hartman * Background noise estimator. I tried a few
5036e2055a9SGreg Kroah-Hartman * algorithms here without much luck. This very simple
5046e2055a9SGreg Kroah-Hartman * one seems to work best, we just average the level
5056e2055a9SGreg Kroah-Hartman * using a slow (1 sec time const) filter if the
5066e2055a9SGreg Kroah-Hartman * current level is less than a (experimentally
5076e2055a9SGreg Kroah-Hartman * derived) constant. This means we dont include high
5086e2055a9SGreg Kroah-Hartman * level signals like near end speech. When combined
5096e2055a9SGreg Kroah-Hartman * with CNG or especially CLIP seems to work OK.
5106e2055a9SGreg Kroah-Hartman */
5116e2055a9SGreg Kroah-Hartman if (ec->lclean < 40) {
5126e2055a9SGreg Kroah-Hartman ec->lbgn_acc += abs(ec->clean) - ec->lbgn;
5136e2055a9SGreg Kroah-Hartman ec->lbgn = (ec->lbgn_acc + (1 << 11)) >> 12;
5146e2055a9SGreg Kroah-Hartman }
5156e2055a9SGreg Kroah-Hartman }
5166e2055a9SGreg Kroah-Hartman }
5176e2055a9SGreg Kroah-Hartman
5186e2055a9SGreg Kroah-Hartman /* Roll around the taps buffer */
5196e2055a9SGreg Kroah-Hartman if (ec->curr_pos <= 0)
5206e2055a9SGreg Kroah-Hartman ec->curr_pos = ec->taps;
5216e2055a9SGreg Kroah-Hartman ec->curr_pos--;
5226e2055a9SGreg Kroah-Hartman
5236e2055a9SGreg Kroah-Hartman if (ec->adaption_mode & ECHO_CAN_DISABLE)
5246e2055a9SGreg Kroah-Hartman ec->clean_nlp = rx;
5256e2055a9SGreg Kroah-Hartman
5266e2055a9SGreg Kroah-Hartman /* Output scaled back up again to match input scaling */
5276e2055a9SGreg Kroah-Hartman
5286e2055a9SGreg Kroah-Hartman return (int16_t) ec->clean_nlp << 1;
5296e2055a9SGreg Kroah-Hartman }
5306e2055a9SGreg Kroah-Hartman EXPORT_SYMBOL_GPL(oslec_update);
5316e2055a9SGreg Kroah-Hartman
5326e2055a9SGreg Kroah-Hartman /* This function is separated from the echo canceller is it is usually called
5336e2055a9SGreg Kroah-Hartman as part of the tx process. See rx HP (DC blocking) filter above, it's
5346e2055a9SGreg Kroah-Hartman the same design.
5356e2055a9SGreg Kroah-Hartman
5366e2055a9SGreg Kroah-Hartman Some soft phones send speech signals with a lot of low frequency
5376e2055a9SGreg Kroah-Hartman energy, e.g. down to 20Hz. This can make the hybrid non-linear
5386e2055a9SGreg Kroah-Hartman which causes the echo canceller to fall over. This filter can help
5396e2055a9SGreg Kroah-Hartman by removing any low frequency before it gets to the tx port of the
5406e2055a9SGreg Kroah-Hartman hybrid.
5416e2055a9SGreg Kroah-Hartman
5426e2055a9SGreg Kroah-Hartman It can also help by removing and DC in the tx signal. DC is bad
5436e2055a9SGreg Kroah-Hartman for LMS algorithms.
5446e2055a9SGreg Kroah-Hartman
5456e2055a9SGreg Kroah-Hartman This is one of the classic DC removal filters, adjusted to provide
5466e2055a9SGreg Kroah-Hartman sufficient bass rolloff to meet the above requirement to protect hybrids
5476e2055a9SGreg Kroah-Hartman from things that upset them. The difference between successive samples
5486e2055a9SGreg Kroah-Hartman produces a lousy HPF, and then a suitably placed pole flattens things out.
5496e2055a9SGreg Kroah-Hartman The final result is a nicely rolled off bass end. The filtering is
5506e2055a9SGreg Kroah-Hartman implemented with extended fractional precision, which noise shapes things,
5516e2055a9SGreg Kroah-Hartman giving very clean DC removal.
5526e2055a9SGreg Kroah-Hartman */
5536e2055a9SGreg Kroah-Hartman
oslec_hpf_tx(struct oslec_state * ec,int16_t tx)5546e2055a9SGreg Kroah-Hartman int16_t oslec_hpf_tx(struct oslec_state *ec, int16_t tx)
5556e2055a9SGreg Kroah-Hartman {
5566e2055a9SGreg Kroah-Hartman int tmp;
5576e2055a9SGreg Kroah-Hartman int tmp1;
5586e2055a9SGreg Kroah-Hartman
5596e2055a9SGreg Kroah-Hartman if (ec->adaption_mode & ECHO_CAN_USE_TX_HPF) {
5606e2055a9SGreg Kroah-Hartman tmp = tx << 15;
5616e2055a9SGreg Kroah-Hartman
5626e2055a9SGreg Kroah-Hartman /*
5636e2055a9SGreg Kroah-Hartman * Make sure the gain of the HPF is 1.0. The first can still
5646e2055a9SGreg Kroah-Hartman * saturate a little under impulse conditions, and it might
5656e2055a9SGreg Kroah-Hartman * roll to 32768 and need clipping on sustained peak level
5666e2055a9SGreg Kroah-Hartman * signals. However, the scale of such clipping is small, and
5676e2055a9SGreg Kroah-Hartman * the error due to any saturation should not markedly affect
5686e2055a9SGreg Kroah-Hartman * the downstream processing.
5696e2055a9SGreg Kroah-Hartman */
5706e2055a9SGreg Kroah-Hartman tmp -= (tmp >> 4);
5716e2055a9SGreg Kroah-Hartman
5726e2055a9SGreg Kroah-Hartman ec->tx_1 += -(ec->tx_1 >> DC_LOG2BETA) + tmp - ec->tx_2;
5736e2055a9SGreg Kroah-Hartman tmp1 = ec->tx_1 >> 15;
5746e2055a9SGreg Kroah-Hartman if (tmp1 > 32767)
5756e2055a9SGreg Kroah-Hartman tmp1 = 32767;
5766e2055a9SGreg Kroah-Hartman if (tmp1 < -32767)
5776e2055a9SGreg Kroah-Hartman tmp1 = -32767;
5786e2055a9SGreg Kroah-Hartman tx = tmp1;
5796e2055a9SGreg Kroah-Hartman ec->tx_2 = tmp;
5806e2055a9SGreg Kroah-Hartman }
5816e2055a9SGreg Kroah-Hartman
5826e2055a9SGreg Kroah-Hartman return tx;
5836e2055a9SGreg Kroah-Hartman }
5846e2055a9SGreg Kroah-Hartman EXPORT_SYMBOL_GPL(oslec_hpf_tx);
5856e2055a9SGreg Kroah-Hartman
5866e2055a9SGreg Kroah-Hartman MODULE_LICENSE("GPL");
5876e2055a9SGreg Kroah-Hartman MODULE_AUTHOR("David Rowe");
5886e2055a9SGreg Kroah-Hartman MODULE_DESCRIPTION("Open Source Line Echo Canceller");
5896e2055a9SGreg Kroah-Hartman MODULE_VERSION("0.3.0");
590