What is the most efficient way to represent or approximate a waveform?
Hello,
I was wondering: what is the most efficient way of representing a waveform?
Note: By efficient, I mean using the least amount of storage space.
Take, for example, the following waveform (an Audacity screenshot of an audio clip):
Assuming the aforementioned waveform is 5 megabytes in size, is there a more efficient way of representing the same waveform (e.g., by using something like an FFT)?
If so, would it be possible to go back and forth between the original waveform and the waveform representation?
Lastly, assuming the only other way to represent a waveform is through an approximation (resulting in a partial loss of data), roughly how much data is discarded or lost in the approximation?
Thank you,
Nelson
Shannon taught us that lossless (bit-exact) compression is not possible "in general". Look up "source coding". However, frequently the requirement is only something that "sounds like" or "looks like" the original. That is how speech and music compression mechanisms work. To build such a lossy compression mechanism, you need a criterion for comparing two waveforms and a model for the generation of the waveform. You can then do "analysis by synthesis" to find the model parameters that best match the given waveform.
How well you can do depends on how true the model is and how similar the reconstructed waveform needs to be to the original. As an example, telephone-quality speech (8000 samples per second) would require 128 kbps to represent using 16-bit samples (8 bits adds too much noise using linear quantization). But the ear hears logarithmically (i.e., it is more sensitive at small amplitudes), so you can get indistinguishable reconstructed sound using logarithmic quantization (companded PCM) at 64 kbps. Speech is also not white noise - it has much more energy at low frequencies! So, by coding differences (ADPCM), one can drop to 32 kbps. More sophisticated models (e.g., CELP-based) can drop down to 8 kbps or lower with some degradation (but they only work on speech, not on other signals with the same bandwidth). Even more complex methods can get you down to about 3 kbps. You can do a similar analysis of music compression, with similar results.
Y(J)S
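To make the logarithmic-PCM step above concrete, here is a minimal sketch in Python/NumPy. It uses the smooth textbook μ-law formula with μ = 255, not the piecewise segment table of the actual G.711 standard, so treat it as an illustration of the idea rather than a standard-compliant codec:

```python
import numpy as np

MU = 255.0  # mu-law constant used in North American telephony

def mulaw_encode(x):
    """Compress samples in [-1, 1] to 8-bit codes on a logarithmic scale."""
    y = np.sign(x) * np.log1p(MU * np.abs(x)) / np.log1p(MU)  # y in [-1, 1]
    return np.round((y + 1) * 127.5).astype(np.uint8)

def mulaw_decode(codes):
    """Expand 8-bit codes back to samples in [-1, 1] (inverse of the above)."""
    y = codes.astype(np.float64) / 127.5 - 1.0
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(MU)) / MU
```

Even this simple step halves the storage of 16-bit linear PCM while keeping small-amplitude detail, which is exactly the 128 kbps to 64 kbps reduction described above.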
An FFT is a transformation that produces essentially the same amount of data, so by itself it won't help. If the input is real with "n" points, the output is "n" complex values ("2n" numbers), but the spectrum of a real signal is conjugate-symmetric, so only about half of those values are independent; the information content is unchanged.
But it may be possible to determine that some of the spectrum can be filtered out (zeroed), leaving fewer values to store. Then you just have to keep track of where the removed zeros belong... a form of coding unto itself.
Data compression is likely the better way but you asked....
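For illustration, here is a rough sketch of that "zero the weak bins and remember where the survivors go" idea, assuming Python/NumPy; keep_fraction is a hypothetical tuning knob, and the reconstruction is lossy:

```python
import numpy as np

def sparsify_spectrum(x, keep_fraction=0.05):
    """Keep only the strongest FFT bins; return (indices, values, length)."""
    X = np.fft.rfft(x)                      # real FFT: n//2 + 1 complex bins
    k = max(1, int(len(X) * keep_fraction))
    idx = np.argsort(np.abs(X))[-k:]        # indices of the k strongest bins
    return idx, X[idx], len(x)

def reconstruct(idx, vals, n):
    """Rebuild an approximation from the stored bins (the rest stay zero)."""
    X = np.zeros(n // 2 + 1, dtype=complex)
    X[idx] = vals
    return np.fft.irfft(X, n)
```

The saving comes entirely from how sparse the spectrum really is: a few strong tones compress enormously, broadband noise not at all.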
Waveform compression has been around for a long time and is well-studied. Depending on your requirements you can look into all of the various audio compression techniques (e.g., mp3, mp4), and standards for video or voice compression if those fit your application better.
Many of the popular audio compression algorithms are transform-based, but use cosine or sine transforms instead of full Fourier Transforms. And, yes, generally compression comes at a cost of some loss.
You could also just treat your waveform data like a data file and use the lossless file-compression standards, too (zip, gzip, etc.). Those depend on the amount of entropy in the data, so if your signal is high in entropy it may not compress very well.
You have a lot of options from existing techniques and standards, so I'd suspect that one of them may fit your application well.
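As a quick illustration of that entropy dependence, here is a sketch assuming Python with NumPy and the standard-library zlib (the same DEFLATE method zip uses): a predictable sine compresses far better than white noise.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# A smooth, predictable waveform compresses well...
sine = (np.sin(2 * np.pi * 440 * np.arange(n) / 48_000) * 32767).astype(np.int16)
# ...while white noise (maximum entropy) barely compresses at all.
noise = rng.integers(-32768, 32767, n, dtype=np.int16)

for name, sig in [("sine", sine), ("noise", noise)]:
    raw = sig.tobytes()
    packed = zlib.compress(raw, level=9)
    print(f"{name}: {len(raw)} -> {len(packed)} bytes "
          f"({len(packed) / len(raw):.0%})")
```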
Just adding one piece of information: waveform-compression loss is usually measured as a signal-to-quantization-noise ratio (SQNR), expressed in dB: https://en.wikipedia.org/wiki/Signal-to-quantization-noise_ratio
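For reference, a minimal sketch of that measurement, assuming the usual definition (signal power over error power, in dB):

```python
import numpy as np

def sqnr_db(original, reconstructed):
    """Signal power over quantization-error power, in dB."""
    noise = original - reconstructed
    return 10 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))
```

Comparing, say, the 8-bit linear and 8-bit logarithmic reconstructions mentioned above against the original would make the quality difference between the two quantizers explicit.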
Fourier Transforms are not the best way to exactly encode a signal. The transform is defined over infinite time, so practical implementations window the signal, which distorts the frequency domain. Then, to get the original signal back, the inverse transform is also defined over infinite time, so in practice the frequency domain is also windowed, adding more distortion.
There are some good audio compression methods, like MP3 (smaller result with some distortion) and FLAC (less compressed with no distortion), but I'm not sure if those are effective on waveforms that are not audio. You could even try something simple like ZIP.
Most important aspects have been mentioned before, but I wanted to give another viewpoint on lossless compression: the redundancy that can be removed from a wave file (and I suspect you are talking about audio - both because nelsona always asks about audio *g* and you posted an Audacity screenshot :) is mostly contained in the sample storage format. In normal PCM WAV, a certain upper limit on the sample value is assumed, and enough bits to represent values up to that limit are reserved for every sample. That is very inefficient, because 16 or 24 or n bits are stored to hold values that are often a lot smaller and would require fewer bits.
Lossless compression modes usually exploit this redundancy by replacing that storage mechanism with something prefix-based (like Golomb codes or Rice codes - look them up on Wikipedia), meaning every number has a prefix that tells the reading device how many bits to read.
In addition, those methods usually apply some technique to make the sample values even smaller, like ADPCM or the like (see the sketch after this post).
Depending on the content of your signal, you can obtain bit-perfect size reductions of about 20% (mastered pop song) to 70% (classical music).
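As referenced above, here is a minimal, hypothetical illustration of both steps in Python - a first-order predictor to shrink the values, then Rice coding to spend fewer bits on the small ones. A fixed parameter k is assumed for simplicity; real codecs like FLAC choose it adaptively:

```python
def zigzag(v):
    # Interleave signed residuals into non-negative ints: 0,-1,1,-2,2 -> 0,1,2,3,4
    return 2 * v if v >= 0 else -2 * v - 1

def rice_encode(value, k):
    # Rice code (k >= 1): unary quotient, '0' terminator, k binary remainder bits.
    # Small values get short codes; large values remain representable.
    q, r = value >> k, value & ((1 << k) - 1)
    return "1" * q + "0" + format(r, f"0{k}b")

def encode_samples(samples, k=4):
    # First-order prediction: neighbouring audio samples are similar,
    # so the deltas are small and Rice-code compactly.
    bits, prev = [], 0
    for s in samples:
        bits.append(rice_encode(zigzag(s - prev), k))
        prev = s
    return "".join(bits)

# Example: a gentle ramp encodes in 35 bits instead of 7 x 16 = 112.
print(len(encode_samples([0, 3, 5, 6, 6, 4, 1])))
```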
With FLAC or other lossless compression formats you can squeeze the storage space, and you can go back and forth without loss.
- Uli
You can decimate, as long as the result still contains the information you are after.
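A minimal sketch of that approach, assuming Python/SciPy and a signal whose content fits comfortably inside a quarter of the original bandwidth:

```python
import numpy as np
from scipy.signal import decimate

fs = 48_000                              # original sample rate
t = np.arange(fs) / fs                   # one second of signal
x = np.sin(2 * np.pi * 1_000 * t)        # 1 kHz tone, well below the new Nyquist

# decimate() low-pass filters first (to prevent aliasing), then keeps
# every 4th sample: a 4x storage saving if nothing above 6 kHz matters.
y = decimate(x, 4)
print(len(x), "->", len(y))              # 48000 -> 12000
```

Unlike the lossless schemes above, anything above the new Nyquist frequency is gone for good, so this only fits when you know the band of interest in advance.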