DSPRelated.com
Forums

DSP C++ Project

Started by gilgamash 6 years ago · 5 replies · latest reply 6 years ago · 643 views

Greetings all,

I am looking for an interesting project in the DSP and Machine Learning area that focuses on modern C++ (C++14/C++17; I really don't want old-fashioned C or C++98/C++03 any longer...). Ideally with version control, CI, etc.

Looking forward to hearing from you,

best regards,

Andy


Reply by johndyson10, August 1, 2018

I don't know if my project meets your criteria -- it isn't 'machine learning', but it is definitely intense DSP using SIMD, written in the C++17 dialect.  I am not using all of the features of C++17, and I use very little of the std library except for some of the multi-threading primitives on a couple of associated projects.

The project is a desperately needed DolbyA decoder.  Much music master material from the mid-1960s through the early 1990s is DolbyA encoded, and some of that material has also leaked out into consumer distribution (though that isn't my specific market).  The real DolbyA HW has a few disadvantages: first, those decoders work only in the analog domain, yet a lot of material is left DolbyA encoded on digital masters; second, the HW DolbyA units produce excess intermod distortion, as any simple fast gain control device will.

So, my project is a DolbyA compatible decoder (only).  It matches real DolbyA decoding closely enough that it sounds the same, but it deliberately differs in the fastest parts of the attack/decay so as to avoid much of the excess intermod effects.  It also uses other algorithms to avoid creating some of the other distortion products, and does some further things to mitigate SOME intermod even after it has been created.  Great pains were also taken to avoid aliasing (nonlinear operations on the audio produce distortion products that wrap around the Nyquist frequency).

The code is written in (as mentioned above) a fairly up-to-date C++ dialect, using whatever makes sense in the application -- including things like lambda expressions when they are more appropriate than separate functions.

The code is probably 75% SIMD vector operations (timewise probably 95%) using the vector primitives available in GCC and CLANG/LLVM.  The vector primitives are used by wrapper classes that usefully support data items different in shape from the CPU primitives, or cases where it is better to avoid letting the compiler choose how to split up a type.  In the case of the DolbyA compatible decoder project, the code uses lots of 4-wide 64-bit and 8-wide 32-bit data items and operations.  The wrapper routines implement the math, logic, conversion and selection primitives that I have needed.  The code doesn't depend on the nice ?: vector operation from GCC because CLANG/LLVM doesn't support that operation in the same way, so I use a composite primitive that acts similarly (something like "res = Vselect(Vgt(a,b), c, d);" does the same thing as the vector version of the GCC "res = (a > b) ? c : d").  This keeps the source code compatible between GCC and CLANG/LLVM.  The vector wrapper classes also support a simple printf for debugging purposes only.  There is also support in the class libraries for CPUs with differing SIMD shapes -- so it provides close to optimum classes for ATOM (SSE2/3) or Haswell+ (AVX/AVX2) type machines.  The vector class library also supports degenerate vectors of length one for symmetry and compatibility reasons.
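
For readers who have not seen this idiom, here is a minimal sketch of how a composite select like the one described above could be written with the GCC/CLANG vector extension.  It is not the project's code: the names Vselect and Vgt come from the post, but their bodies, the type names f32x4 and i32x4, and the 4-wide float shape are assumptions made purely for illustration.  It relies on a vector comparison producing an integer mask (all bits set where true, zero where false), which is then used to blend the two inputs bit by bit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 4-wide types built on the GCC/Clang vector_size extension.
typedef float        f32x4 __attribute__((vector_size(16)));
typedef std::int32_t i32x4 __attribute__((vector_size(16)));

// Lane-wise a > b. Each result lane is -1 (all bits set) if true, else 0.
inline i32x4 Vgt(f32x4 a, f32x4 b) { return a > b; }

// Lane-wise select: take c where the mask lane is set, d elsewhere.
// The blend is done on the raw bits, so no vector ?: operator is needed.
inline f32x4 Vselect(i32x4 mask, f32x4 c, f32x4 d) {
    i32x4 ci, di;
    std::memcpy(&ci, &c, sizeof ci);   // reinterpret float lanes as integers
    std::memcpy(&di, &d, sizeof di);
    const i32x4 ri = (ci & mask) | (di & ~mask);
    f32x4 r;
    std::memcpy(&r, &ri, sizeof r);    // and back to float lanes
    return r;
}

// Usage sketch: equivalent of the vector form of res = (a > b) ? c : d.
inline void vselectExample() {
    const f32x4 a = {1.0f, 5.0f, 3.0f, -2.0f};
    const f32x4 b = {2.0f, 4.0f, 3.0f, -3.0f};
    const f32x4 c = {10.0f, 20.0f, 30.0f, 40.0f};
    const f32x4 d = {-10.0f, -20.0f, -30.0f, -40.0f};
    const f32x4 res = Vselect(Vgt(a, b), c, d);
    assert(res[0] == -10.0f);  // 1 > 2 is false, so d is chosen
    assert(res[1] == 20.0f);   // 5 > 4 is true, so c is chosen
}
```

Because only comparisons and bitwise operators on vector types are used, the same source should build with either compiler; a real wrapper class would presumably add the wider shapes (4x64, 8x32) mentioned above.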

The code also contains a large attack/decay algorithm class library (for supporting other than DolbyA decoding - other audio processing schemes also).  The attack/decay class library makes writing compressors/expanders/limiters/NR software much easier.  The primitives are almost exactly what is needed to provide the best performance in the sense of the attack/decay shapes and avoiding intermod effects (to the extent possible.)  So, those details have already been written for flexibility, thereby allowing more focus on the higher level matters.
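
To give a rough idea of the kind of primitive such a library is built from, here is a minimal sketch of a textbook one-pole attack/release envelope follower.  This is an assumption for illustration only, not the decoder's algorithm (which, as noted, shapes the attack/decay far more carefully to limit intermod).  The class name EnvelopeFollower and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Textbook one-pole envelope follower with separate attack and release
// time constants. Illustrative only; not the project's attack/decay code.
class EnvelopeFollower {
public:
    EnvelopeFollower(double attackSeconds, double releaseSeconds, double sampleRate)
        : attackCoeff_(coeff(attackSeconds, sampleRate)),
          releaseCoeff_(coeff(releaseSeconds, sampleRate)) {}

    // Feed one sample, get the updated envelope estimate.
    double process(double x) {
        const double level = std::fabs(x);
        // Rising level uses the (fast) attack coefficient,
        // falling level uses the (slow) release coefficient.
        const double c = (level > env_) ? attackCoeff_ : releaseCoeff_;
        env_ = c * env_ + (1.0 - c) * level;
        return env_;
    }

    double value() const { return env_; }

private:
    // Smoothing coefficient for a time constant given in seconds.
    static double coeff(double seconds, double sampleRate) {
        assert(seconds > 0.0 && sampleRate > 0.0);
        return std::exp(-1.0 / (seconds * sampleRate));
    }

    double attackCoeff_;
    double releaseCoeff_;
    double env_ = 0.0;
};
```

A compressor or expander would then map value() to a gain.  With very short attack times, a gain signal that moves this quickly modulates the audio, which is one source of the intermod effects discussed in this thread.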

If I couldn't have used C++, the project would have required much more tedious programming.  So many things are able to be abstracted away like multi-channel audio data, gain control, close to optimal expansion for FIR filters, managing consistent delays between different kinds of constant delay filters -- so mixing the results of a 255 tap and a 1023 tap filter still automatically keeps the resulting filtered data time synchronized.
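
As an illustration of the delay bookkeeping mentioned above, here is a minimal sketch that assumes both filters are linear-phase FIRs with an odd tap count, so each delays its signal by (taps - 1) / 2 samples: 127 samples for 255 taps and 511 samples for 1023 taps.  The shorter path then needs 384 extra samples of delay before mixing.  The function names firDelay, delayBy and mixAligned are hypothetical and not taken from the project.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Group delay in samples of a linear-phase FIR with an odd number of taps.
constexpr std::size_t firDelay(std::size_t taps) { return (taps - 1) / 2; }

// Delay a block by n samples, zero-filling the start and keeping the length.
inline std::vector<float> delayBy(const std::vector<float>& x, std::size_t n) {
    std::vector<float> y(x.size(), 0.0f);
    for (std::size_t i = n; i < x.size(); ++i) y[i] = x[i - n];
    return y;
}

// Mix two already-filtered signals, padding the path with the shorter
// filter delay so that both arrive time-aligned.
inline std::vector<float> mixAligned(const std::vector<float>& a, std::size_t tapsA,
                                     const std::vector<float>& b, std::size_t tapsB) {
    assert(a.size() == b.size());
    assert(tapsA % 2 == 1 && tapsB % 2 == 1);
    const std::size_t da = firDelay(tapsA);
    const std::size_t db = firDelay(tapsB);
    const std::size_t target = (da > db) ? da : db;
    const std::vector<float> ad = delayBy(a, target - da);
    const std::vector<float> bd = delayBy(b, target - db);
    std::vector<float> out(a.size());
    for (std::size_t i = 0; i < out.size(); ++i) out[i] = ad[i] + bd[i];
    return out;
}
```

In a class-based design, each filter object could carry its own delay so that a mixing operation applies this compensation automatically, which is the kind of detail the abstraction hides from the caller.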

The abstractions available in C++ have been a lifesaver, and the project would have required more expressions copied over and over again with the simpler (but still very useful) C language, and worse -- more tedious operations on different, but similar variables (e.g. left and right audio data.)  Also, the raw efficiency of C++ and how I utilized it (with similar results from two different compilers), has given me close to the best performance possible given the algorithms and CPU type that I am using.

So -- this is a general description of a project which has benefitted greatly from using C++, and even more from both the vector extensions (not part of the standard) and some of the newer, useful C++11 through C++17 features.

John

Reply by gilgamash, August 1, 2018

Hi John,

this sounds interesting enough -- is there a repository where I can have a look? Also, what kind of technical descriptions are you using for the decoding? Would love to see if I can fit in!

Regards and thanks for taking the time to reply,

Andy


Reply by johndyson10, August 1, 2018

Thanks -- DolbyA is really frustrating because there are no real publicly available specifications.  There are a few patents (whose techniques I haven't used) and the HW schematics.  Those are the only 'specs' available.  After very exhaustive testing and iterations, with myself and recording pros listening -- I have a design that DOES work better than a real HW DolbyA.  The big advantage is the relative lack of intermod.  For example, ABBA is one of my 'basket case' tests -- where some of the choruses on normal releases degrade into vocal 'blobs'.  That severe blurring effect is the result of intermod.  The decoder does a rather surprising job of extracting the voices so they are somewhat (if not perfectly) separated.  The anti-intermod was the result of very serious, almost insanely heroic countermeasures.  Since some of the possible 'customers' are historical archives - only the best possible is acceptable.

I need to keep some of the techniques private, but I plan to expose numerous techniques applicable to general purpose compressors/expanders.  Soon, I hope to have enough time to write a few serious paragraphs on this topic so that 'DSP' programmers starting on such a project know what to expect (lots of unexpected performance issues), and how to solve those issues.  Compressors/expanders with slow attack/release times are relatively easy to do -- but when the attack times are very quick there are unpleasant natural distortion mechanisms.  Also, there is the matter of detection techniques and dynamic attack/release times that might be helpful in GP applications.  The DolbyA compatible decoder is nearly a worst case of difficulty while still retaining enough quality to recover the recording.  Obviously (after studying the behavior) the timing was pushed in the original HW design to avoid apparent noise modulation, which would counteract the NR advantage.

We worked really hard to gather all the available research to create a 'specification', which mostly resides in the software itself.  Actually, considering the 'insanely heroic measures' in the code -- the actual decoder is a pretty short piece of C++, but there are some fairly innovative supporting C++ classes, and almost all of the intense math is done with the SIMD classes (using SIMD instructions).

So, I do intend to write up something that might be useful for a future compressor/expander/NR developer.  Because other people depend on the future of the DolbyA compatible decoder project, I am retaining some of the tricks (or herculean effort details) simply because I am hoping for the project to be successful.  Frankly, I don't intend to make money on it, but am simply trying to keep the focus on a design that might be a lot of help in preserving history.  When the time comes, we do intend to make the source code (or linkable libraries) available for plugin developers.  But -- to clarify -- many useful techniques developed for the decoder project (and other compressors/expanders in the past) are going to be documented in a useful form.

Maybe I should soon make time to document some of this stuff -- if I hadn't developed a bunch of useful C++ classes as an infrastructure, I doubt that I would be sane today :-).


John

Reply by johndyson10, August 1, 2018

I wanted to follow up -- even if this is slightly off topic, being about the DolbyA compatible decoder project (instead of C++ in DSP applications).  A lot of what I wrote probably appears to be blather -- only because there is so much to talk about and a HUGE number of technical details that I want to talk about so very much.

However, in the case of the decoder, the best proof is in the listening.  Note that what the decoder does is avoid the intermodulation that a normal DolbyA would produce.  With many kinds of music, that intermod causes a fuzziness or a lack of precision in the sound.  I have a site where I have a 30-second snippet of the very best, most accurate COMMERCIAL copy of an ABBA song (Mamma Mia) -- from a Polar CD -- with no processing.  The other snippet is a copy, processed by my decoder from a master tape of exactly the same recording.  If the master tape is processed with a real DolbyA unit, it will also sound fuzzy like the CD.  The compatible decoder makes the difference.

Generally, if you listen to both versions of the song, the frequency response balance is very similar, but the clarity of the decoded version is significantly better.  ABBA is not known for pristine recording clarity, but I think that comparing 'Mia-orig.mp3' and 'Mia-decodera.mp3' will show the difference between the 'blob-chorus' and the 'vocal-chorus', with the decoder doing a better job of extracting the voices.  THERE IS NO EQUALIZATION/FAKERY IN MY COPY.

Actually, the benefit is variable, but almost always makes a GREAT improvement for complex choruses, complex groups of instruments, and subtle instrument sounds.  90% of the benefit comes from very aggressive intermodulation distortion improvement.

Here is the site with the two mp3 examples.  There are many cases where lossless is critical to hear the improvement -- this example is flawed (it is mp3), but the improvement is still obvious:  https://spaces.hightail.com/space/xghqJodgrj

John

Reply by Lito844, August 1, 2018

This is one of the most interesting threads I've seen here in some time. If I weren't so overextended I'd consider getting involved.

In any event I just wanted to let you know there are folks like me on this forum that find this kind of thing fascinating and your passion for it shines brightly through your writing. Keep us posted.