Multirate: interpolation valuable when you can decimate only?

Started by jerryjameso 5 years ago, 7 replies, latest reply 5 years ago, 113 views

Hi all,

I would like some opinions on resampling architectures.

My current FPGA system (heavily overclocked, 100x) has a 65/64 resampler followed by a 5/18 resampler, but there seems to be no need for such a high ratio (and, consequently, over 1K taps)!

I think that breaking this down into 13/16 -> 5/12 -> 5/6 seems obviously better, although I haven't done the Matlab work yet to start analyzing the redesign.
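
As a quick sanity check on the arithmetic (not on filter cost), the sketch below multiplies out both chains and lists the sample rate after each stage relative to the input rate. The helper names `overall` and `relative_rates` are made up for illustration; only the ratios come from the post.

```python
from fractions import Fraction

# Existing chain and the proposed three-stage replacement (ratios from the post).
existing = [Fraction(65, 64), Fraction(5, 18)]
proposed = [Fraction(13, 16), Fraction(5, 12), Fraction(5, 6)]

def overall(stages):
    """Multiply the per-stage ratios into one overall ratio."""
    total = Fraction(1)
    for s in stages:
        total *= s
    return total

def relative_rates(stages):
    """Sample rate after each stage, relative to the input rate."""
    rates, r = [], Fraction(1)
    for s in stages:
        r *= s
        rates.append(r)
    return rates

print(overall(existing), overall(proposed))  # both reduce to 325/1152
print(relative_rates(existing))  # the 65/64 stage output runs above the input rate
print(relative_rates(proposed))  # every stage output runs below the input rate
```

Both chains give the same overall 325/1152; the difference is only in the intermediate rates, which says nothing yet about the tap counts each stage would need.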

Are there papers or books I have yet to find that discuss general resampling architecture trade-offs and these subtle points? Despite that high ratio (65/64), is it best to avoid interpolating when possible?

Thanks and best, Jerry

Reply by kaz, January 18, 2019

In all cases you are interpolating/decimating to achieve fractional rate conversion.

Your second set of three ratios looks better for filter size but requires three designs.

The other extreme is a single polyphase architecture that interpolates by 325 and decimates by 1152. You only need to compute 1 of every 1152 samples at the interpolated rate and can skip the rest, which greatly reduces resources: it requires a large coefficient LUT but only a few taps per output.

Reply by jerryjameso, January 18, 2019

Thanks Kaz,

I understand that I am both interpolating (numerator) and decimating (denominator) in all cases; I'll rephrase. Is it better to avoid the higher sample rate in a given stage when the end result is a lower rate (325/1152) from the channelizing code? Is there any motivation to avoid a stage (such as 65/64) that produces that slight rate increase when you're feeding a 5/18 stage afterwards, aside from the larger filter that the 65/64 drives?

Appreciate the discussion. Thanks for the polyphase description.


Reply by kaz, January 18, 2019

Option 1: breaking the design into a cascade can indeed help, since each stage only has to cut off as sharply as that stage requires, and the next or final filter helps with the rest of the cutoff.

Option 2: if you go for a single 325/1152 filter, then you need to cut off at Nyquist after the 1152 decimation, which is very sharp indeed. The polyphase approach works better here, as you only need one polyphase output every 1152 ticks at the interpolated rate. Such a filter (the main filter or prototype) is actually just stored in memory as a LUT. You only point to a given location every 1152 ticks and compute the polyphase output, which can be very small. On the next output sample you jump 1152 modulo 325, and so on...

Option 3: use a CIC filter + equaliser.

Option 1 is more common, but option 2 can be made variable and is attractive if you are targeting a single design task. The quality of the output signal has more to do with filter design than with implementation structure.
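
To make the pointer jump in option 2 concrete, here is a small sketch of the phase schedule for I = 325 and D = 1152. It assumes the usual convention that output m sits at position m*D in the (never actually built) interpolated stream; `phase_schedule` is an illustrative name, not something from the thread.

```python
I, D = 325, 1152   # interpolation and decimation factors (65*5 and 64*18)

def phase_schedule(I, D, n_outputs):
    """Yield (phase, inputs_consumed) for each output sample.

    phase selects which of the I polyphase branches to use; inputs_consumed
    is how many new input samples are shifted in before the next output.
    """
    phase = 0
    for _ in range(n_outputs):
        nxt = phase + D
        yield phase, nxt // I      # carry out of the modulo-I adder
        phase = nxt % I

schedule = list(phase_schedule(I, D, I))
steps = {adv for _, adv in schedule}
total_inputs = sum(adv for _, adv in schedule)

print(D % I)           # phase jump per output
print(sorted(steps))   # possible input advances per output
print(total_inputs)    # inputs consumed over one full cycle of I outputs
```

Over one cycle of 325 outputs the pointer visits every phase once and the input advances by either 3 or 4 samples per output, 1152 in total.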

Reply by dgshaw6, January 18, 2019

I agree with the single polyphase structure you suggest, but I would implement it so that you get 1 new output for every 3 or 4 inputs.

So you have a delay line fed at the input rate (the 1152 side of the ratio), and a set of FIR coefficient phases from which you select based on the current alignment of the input to the output.

I just went through this with a 100 MHz to 30.72 MHz rate conversion.

Matlab took quite some time to design the filter I wanted, but then I remapped it into a 2 dim array of the different phases, and it works great.
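
The post does not show the remapping itself; one plausible layout, sketched here with a toy filter, is to put every I-th prototype tap into the same row so that row p holds the coefficients for phase p. The function name `split_into_phases` and the toy values are assumptions for illustration only.

```python
def split_into_phases(h, I):
    """Rearrange prototype taps h into I rows; row p holds h[p], h[p+I], ..."""
    if len(h) % I:
        raise ValueError("prototype length must be a multiple of I")
    return [h[p::I] for p in range(I)]

# Toy stand-in for the real prototype: each tap value equals its own index,
# so the layout is easy to read. I = 4 phases, 3 taps per phase.
h = list(range(12))
phases = split_into_phases(h, 4)
for row in phases:
    print(row)
```

Each output sample then needs only one row of this table, multiplied against the most recent input samples in the delay line.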

Reply by kaz, January 18, 2019

For the single-filter approach as a polyphase structure, here is what I do:

Prototype filter cutoff: whichever of I or D is larger sets the cutoff. In this case I would design the cutoff at 1/(2*1152) of the interpolated sample rate.

Prototype filter size: I * integer, e.g. 325*10 = 3250 taps (for the LUT), where 10 is the polyphase length.

To address all 10 taps of a polyphase in one clock tick, arrange the LUT accordingly as 10 blocks, each with 325 locations.

Pointer: start at 0, then add D modulo I to get the pointer for each new output sample.

Get the sum of products of the 10 taps as usual (the input pipe is just 10 stages).

That is it.

Modelling: Matlab upfirdn should give identical results using the same filter.

Theory: interpolation/decimation is done partly in the head and partly on the platform.

When we jump the pointer we are interpolating in the head but discarding those samples, so just ignore them.

You advance the input once each time the modulo adder overflows.

The filter gain may have to be scaled up because of the I factor.
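
The recipe above can be sketched in plain Python and checked against a brute-force "zero-stuff, filter, keep every D-th sample" model, which plays the role that upfirdn plays in the post. This is only a sketch under stated assumptions: the Hann-windowed sinc prototype, the normalisation of the coefficient sum to I, and the function names are all illustrative choices, not the design used in the thread.

```python
import math

def prototype(I, D, taps_per_phase):
    """Hann-windowed sinc low-pass (assumed design), coefficient sum scaled to I."""
    n_taps = I * taps_per_phase
    fc = 0.5 / max(I, D)              # cutoff in cycles per interpolated sample
    mid = (n_taps - 1) / 2
    raw = []
    for n in range(n_taps):
        t = n - mid
        sinc = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * (n + 0.5) / n_taps)
        raw.append(sinc * hann)
    scale = I / sum(raw)              # make up for the gain lost to zero-stuffing
    return [c * scale for c in raw]

def resample_reference(x, h, I, D):
    """Zero-stuff by I, filter with h, keep every D-th sample (slow but obvious)."""
    up = [0.0] * (len(x) * I)
    up[::I] = x
    out = []
    for n in range(0, len(up), D):
        acc = 0.0
        for j in range(min(len(h), n + 1)):
            acc += h[j] * up[n - j]
        out.append(acc)
    return out

def resample_polyphase(x, h, I, D):
    """Compute only the kept outputs, using one polyphase branch per output."""
    taps = len(h) // I
    out = []
    phase, idx = 0, 0                 # idx: newest input sample in the delay line
    while idx < len(x):
        acc = 0.0
        for k in range(min(taps, idx + 1)):
            acc += h[phase + k * I] * x[idx - k]   # block k, location phase
        out.append(acc)
        phase += D                    # jump D places in the virtual interpolated stream
        idx += phase // I             # adder overflows: shift in that many new inputs
        phase %= I
    return out

I, D = 325, 1152
h = prototype(I, D, taps_per_phase=10)        # 3250-entry coefficient table
x = [math.sin(0.05 * n) for n in range(40)]   # arbitrary test input
ref = resample_reference(x, h, I, D)
poly = resample_polyphase(x, h, I, D)
print(len(ref), len(poly))
print(max(abs(a - b) for a, b in zip(ref, poly)))
```

The polyphase version touches at most 10 coefficients per output, whereas the reference walks the whole 3250-tap table, mostly multiplying by stuffed zeros. The two agree sample for sample because the branch index `phase + k * I` picks out exactly the taps that line up with real input samples.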

Reply by jerryjameso, January 18, 2019


Thanks for the good discussion on the topic!

Reply by Slartibartfast, January 18, 2019

I suspect that one of the reasons the subtle tradeoffs don't get detailed treatment in papers/texts is that the implementation possibilities are so varied, and often change quickly as the technologies change. This has been true in silicon/FPGAs over the years, where the available hardware has evolved enough to make big changes in the approaches used. Sometimes those tradeoffs make one method better or worse than others, and since it depends on so many different things, you are usually on your own for sorting out what might be best for your particular application and implementation space.

That said, I think kaz caught some of the basic options well, i.e., a basic cascade structure, a polyphase structure, or a CIC structure followed by a smaller FIR to clean things up.

So, yeah, it depends...  ;)