DSPRelated.com
Forums

A Query about "Improvements to the Sliding DFT Algorithm"

Started by etheory 4 weeks ago6 replieslatest reply 3 weeks ago190 views

Hi all!

My first post here! Long time lurker first time poster.

This one is kinda specifically directed at @RickLyons but happy for anyone to answer.

I am reading: https://www.researchgate.net/publication/352874961... and for specifically the topology of Figures 4(a) and 5(a) I have implemented them in code:

template < typename CF >
struct SlidingNUDFT2
{
  // @param: freq frequency in hertz
  // @param: N DFT size in samples
  // @param: T inverse of sample rate
  __forceinline void setup(float freq, float N, float inv_N, float T)
  {
    const float k = freq * N * T;
    const float twopik = 2.f * PI * k;
    const float twopikinvN = twopik * inv_N;
    coeff1 = exp(-1if * twopik); // exp(-i2πk)
    coeff2 = exp(1if * twopikinvN); // exp(i2πk/N)
    cosCoeff = 2.f * cosf(twopikinvN); // 2cos(2πk/N)
  }

  __forceinline CF step(float xn, float xn_minus_N)
  {
#if 0
    // Figure 5(a)
    // need to contact authors, as this doesn't seem to work correctly....
    // stage 1 - complex comb 1
    const CF comb = (xn * coeff1) - xn_minus_N;
    // stage 2 - real resonator
    const CF rr = comb + (rrprev * cosCoeff) - rrprev2;
    // stage 3 - complex comb 2
    const CF comb2 = (rr * coeff2) - rrprev;
    // state update
    rrprev2 = rrprev;
    rrprev = rr;
    return comb2;
#else
    // Figure 4(a)
    // stage 1 - complex comb 1
    const CF comb = (xn * coeff1) - xn_minus_N;
    // stage 2 - complex comb 2
    const CF rr = (comb + rrprev) * coeff2;
    // state update
    rrprev = rr;
    return rr;
#endif
  }

  float cosCoeff;
  CF rrprev = 0.f;
  CF rrprev2 = 0.f;
  CF coeff1, coeff2;
};

And the Blackman-Harris filtered version:

template < typename CF >
struct SlidingNUDFT2_Filtered
{
  __forceinline void setup(float freq, float N, float inv_N, float T)
  {
    f[0].setup(freq, N, inv_N, T);
    f[1].setup(freq - 1.f / (N * T), N, inv_N, T);
    f[2].setup(freq + 1.f / (N * T), N, inv_N, T);
    f[3].setup(freq - 2.f / (N * T), N, inv_N, T);
    f[4].setup(freq + 2.f / (N * T), N, inv_N, T);
    f[5].setup(freq - 3.f / (N * T), N, inv_N, T);
    f[6].setup(freq + 3.f / (N * T), N, inv_N, T);
  }

  __forceinline CF step(float xn, float xn_minus_N)
  {
    return (1.f / 0.35875f) * (0.35875f * f[0].step(xn, xn_minus_N)
    - 0.5f * 0.48829f * (f[1].step(xn, xn_minus_N) + f[2].step(xn, xn_minus_N))
    + 0.5f * 0.14128f * (f[3].step(xn, xn_minus_N) + f[4].step(xn, xn_minus_N))
    - 0.5f * 0.01168f * (f[5].step(xn, xn_minus_N) + f[6].step(xn, xn_minus_N)));
  }

  SlidingNUDFT2<CF> f[7];
};

Where CF = std::complex.

You can see there is an issue with the result however, with the "marginally stable" version of 4(a) looking like this:

4_a_4805.jpg

and the "guaranteed stable" version of 5(a) looking like this:

5_a_25466.jpg

Note that the 5(a) version has artifacts on the bottom right that seem like a stability issue.

Note that in my code listing above for 5(a) I have swapped the order of the second complex filter and the real resonator, but that isn't an issue since all the operations are linear and hence order independent, and I got exactly the same issue in my results using the exact same flow order as the paper.

Any suggestions about what might be going wrong here are welcome.

Thank you,

Luke

[ - ]
Reply by napiermJanuary 21, 2025

Hello,

This looks very interesting but I can't be the only one who is a bit lost.

What is it you are doing here?

I don't know how to interpret these images.

Mark Napier


[ - ]
Reply by etheoryJanuary 21, 2025

Hi Mark,

I am using the SlidingDFT to perform a log-space spectrogram of a short drum loop.

The image is a log-space 20Hz-20kHz up the vertical axis (which is precisely why the paper was so attractive to me, since the k value doesn't have to be an integer, you can do log-space graphs, which for audio is super useful), and time along the x axis.

The luminance of the greyscale image is the amplitude in dB from -120dB is black to 0dB is white.

I hope that helps.

[ - ]
Reply by Rick LyonsJanuary 24, 2025

Hello Luke.

I'm not able to say whether or not my "Non-integer k Sliding DFT" (Figure 5) is applicable to your application. With that being said, you can test your Figure 5 code by applying an impulse input (a single one-valued sample followed by more than N zero-valued samples) to your Figure 5 code. The correct impulse response is a complex sinusoid whose length is N samples.

The DFT of the correct impulse response (after padding it with lots of zeros) should have a sin(x)/x spectral magnitude with a peak value of N.

I'm not able to interpret your code but I must say your definition of variable 'k' troubles me. The variable k should be a dimensionless number between zero and integer N. It looks to me like your k is a value of time measured in seconds, which is not correct. I may be wrong. Luke, that's about all I can say to you at this point. 

Good Luck.

[-Rick-]



[ - ]
Reply by etheoryJanuary 25, 2025

Hi @RickLyons thanks so much for responding!


Here is what I get from an impulse input when I do a 64 sample SlidingDFT for a range of different k values:

checkimpulseresponse_27407.jpg


Which looks right to me. Though I note with your "guaranteed stable" version, after the N samples, there is a bit of content left over, i.e.

61, 0.301741600037, -0.953390121460
62, 0.806765913963, -0.590871930122
63, 1.000000357628, -0.000000238419
64, 0.000000417233, 0.000000000000
65, 0.000000298023, 0.000000208616
66, 0.000000149012, 0.000000357628
67, -0.000000096858, 0.000000357628
68, -0.000000268221, 0.000000238419
69, -0.000000357628, 0.000000029802
70, -0.000000298023, -0.000000178814
71, -0.000000119209, -0.000000357628
72, 0.000000115484, -0.000000417233
73, 0.000000327826, -0.000000238419
74, 0.000000417233, 0.000000000000
75, 0.000000357628, 0.000000283122
76, 0.000000119209, 0.000000476837
77, -0.000000207685, 0.000000476837
78, -0.000000447035, 0.000000238419
79, -0.000000596046, -0.000000089407
80, -0.000000417233, -0.000000387430
81, -0.000000119209, -0.000000536442
82, 0.000000202679, -0.000000536442
83, 0.000000476837, -0.000000298023
84, 0.000000536442, 0.000000044703
85, 0.000000417233, 0.000000372529
86, 0.000000119209, 0.000000596046
87, -0.000000257045, 0.000000596046
88, -0.000000596046, 0.000000357628
89, -0.000000655651, -0.000000059605
90, -0.000000476837, -0.000000447035
91, -0.000000089407, -0.000000655651
92, 0.000000329688, -0.000000596046
93, 0.000000625849, -0.000000298023
94, 0.000000655651, 0.000000104308
95, 0.000000417233, 0.000000461936
96, 0.000000089407, 0.000000596046
97, -0.000000253320, 0.000000536442
98, -0.000000506639, 0.000000298023
99, -0.000000596046, -0.000000074506
100, -0.000000417233, -0.000000417233

This happens for both the "marginally stable" and "guaranteed stable" versions. Though I guess this is to be expected. Float isn't perfect precision.


As for the value of k, hopefully this interactive spreadsheet will clear up my use-case:
https://www.desmos.com/calculator/u2upssovok

This shows the absolute value of the transfer function plotted for equation 7 of your paper.

This shows that k, varying between 0 and N varies the position of a complex bandpass at a peak frequency of 2*pi*k/N. Therefore I am effectively setting k as the frequency at which to position the bandpass filter.

This allows me to arbitrarily position the SlidingDFT frequencies wherever I want them to be, thus allowing log-space SlidingDFT as a variant to the linear-space SlidingDFT typically used.

Is that not a correct interpretation? It certainly seems to be given the magnitude response.

I therefore basically interpret your paper as a re-interpretation of the SlidingDFT as a fractionally-settable complex band-pass filter.

Thanks, I hope that helps you understand my usage of your awesome paper!

Thank you,

Luke


[ - ]
Reply by Rick LyonsJanuary 25, 2025

Hello Luke. First, I take back what I wrote about your software definition of variable 'k'. My mistake. Your software definition for 'k' is fine.

That interactive spreadsheet demo is *beautifully* informative. Good job!

[-Rick-]


[ - ]
Reply by etheoryJanuary 25, 2025

OK good to know thanks @RickLyons.

My main point of confusion is still why the guaranteed stable version begins to introduce noise in the low frequency SlidingDFT after a few hundred samples, whereas the marginally stable version doesn't.

I still cannot figure out if this is an issue with my implementation or the algorithm itself.

I'd be happy to post all my code and any more details in order to resolve it.

Thanks,

Luke