The Spectral Complexity of a Single Musical Note

Started by Rick Lyons 5 years ago, 11 replies, latest reply 5 years ago, 1272 views
On page 25 of the most recent January issue of the IEEE Signal Processing Magazine the following interesting spectral plot was presented. That plot is the spectral magnitude of a 34.65 Hz C#1 musical note played on a piano. (In that plot, copied here without permission, I added the arrows.) The blue spectral spikes in the plot are called "partials" or "overtones." (The vertical dashed lines are integer multiples, harmonics, of the 34.65 Hz fundamental frequency.)


What's interesting about that spectral plot is that in the upper frequency range the Nth partial is located at a frequency higher than the Nth harmonic (N times the fundamental frequency)! For example, as shown by my arrows, the 22nd partial is located at a noticeably higher frequency than the 22nd harmonic frequency. The article attributes this phenomenon to the stiffness of piano strings. So a piano C#1 note's partials (overtones) are not harmonic!

And to me, another fascinating characteristic of the above spectrum is that in the audio signal there exists NO spectral energy at the C#1 note's fundamental frequency of 34.65 Hz. The fundamental is missing! But when we hear a piano's C#1 note our ear/brain combination, somehow, perceives the existence of a 34.65 Hz fundamental tone. (By the way, the C#1 note's fundamental frequency is certainly low. In their standard tunings no four-string bass guitar or cello can play a C#1 note.)

Reply by woodpecker, May 7, 2019

Hi Rick,

This is indeed an interesting spectral plot. However, a series of plots would be even more interesting, i.e.:

a) Plots from microphones placed at different distances from the piano string.

b) A spectrogram to show how the difference between harmonic and partial frequencies varies with time.

c) A comparison plot from a bass guitar with the E string detuned to C#1.

As Jeff has said, more information from the article might help us figure out why this effect occurs (e.g., FFT parameters). Unfortunately, I don't have access to the article so cannot really comment further.

Could the lack of fundamental frequency be due to some kind of phase cancellation effect?

Reply by Cedron, May 7, 2019
Hi Rick,

I, too, was puzzled by the lack of the fundamental. Of the three explanations so far:

1) Sounding board

2) Recording Equipment

3) Actually Missing

I would rate the last one least likely by far.

What also interested me was the attenuated tones #8, #16, and #24. It turns out that is due to where on the string the hammer hits. You might find this site interesting:


This lecture addresses what I was looking for:

Donald E. Hall:  The hammer and the string

There is also a sounds section where many samples are available.
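As a rough illustration of the hammer-position effect mentioned above: in the textbook model of an ideal flexible string, exciting the string at a fraction x0 of its length drives the nth mode in proportion to |sin(n * pi * x0)|, so any mode with a node at the strike point is not excited. The sketch below assumes a strike point at 1/8 of the string length. That value is a guess, chosen only because it would suppress partials 8, 16, and 24; the actual strike position is not given in this thread, and `mode_weight` is a hypothetical helper.

```python
import math

def mode_weight(n, strike_fraction):
    """Relative excitation of mode n for an ideal string struck at strike_fraction of its length."""
    return abs(math.sin(n * math.pi * strike_fraction))

# Assumed strike point: 1/8 of the string length (illustrative, not from the article).
STRIKE_FRACTION = 1 / 8

weights = {n: mode_weight(n, STRIKE_FRACTION) for n in range(1, 25)}

# Modes whose weight is zero up to floating-point rounding.
suppressed = [n for n, w in weights.items() if w < 1e-9]

for n, w in weights.items():
    print(f"partial {n:2d}: weight {w:.3f}")
print("suppressed partials:", suppressed)
```

A real hammer has finite width and contact time, so actual partials are attenuated rather than removed entirely.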

Reply by Rick Lyons, May 7, 2019

Hi Cedron.

Your referenced material, "The Hammer and the String," was very interesting. It clearly illustrates the complexity of what initially appears to be a simple mechanical system.

To determine whether the fundamental spectral component of a piano's C#1 exists, as Laurent Millot said, we would need to know the frequency response of the microphone and data acquisition system used to capture the piano's audio.

There is a topic in audio signal processing known as the "missing fundamental." Our standard telephone system operates on the "missing fundamental" principle. Cedron, when you have nothing better to do, have a look at the following web page:


Reply by Cedron, May 7, 2019
Sorry, I should have mentioned that I am quite familiar with the concept of the missing fundamental. What your ear hears (and autocorrelation detects) is the repeat pattern; whether that particular tone is part of the waveform doesn't matter. In this case, the sound originates with a vibrating string, in which the fundamental is typically dominant.
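To make the autocorrelation remark concrete, here is a minimal sketch. It builds a signal from harmonics 2 through 10 of 34.65 Hz with no component at 34.65 Hz itself, then looks for the lag with the largest autocorrelation. The sample rate, the number of harmonics, their equal amplitudes, and the lag search range are all assumptions chosen for a tidy demonstration (3465 Hz gives exactly 100 samples per period). This is not a model of the piano recording.

```python
import math

F0 = 34.65                 # the "missing" fundamental, Hz
FS = 3465.0                # assumed sample rate: exactly 100 samples per period of F0
HARMONICS = range(2, 11)   # partials 2..10 only; no energy at F0 itself
WINDOW = 1000              # samples summed per lag (10 full periods)
MIN_LAG, MAX_LAG = 20, 150 # assumed lag search range, in samples

def sample(i):
    """Sum of equal-amplitude harmonics 2..10 of F0 at sample index i."""
    t = i / FS
    return sum(math.sin(2 * math.pi * k * F0 * t) for k in HARMONICS)

x = [sample(i) for i in range(WINDOW + MAX_LAG)]

def autocorr(lag):
    """Unnormalized autocorrelation of x at the given lag over a fixed window."""
    return sum(x[i] * x[i + lag] for i in range(WINDOW))

best_lag = max(range(MIN_LAG, MAX_LAG + 1), key=autocorr)
estimated_pitch = FS / best_lag
print(f"best lag = {best_lag} samples, estimated pitch = {estimated_pitch:.2f} Hz")
```

The strongest repeat occurs at the period of the absent fundamental, because that is the shortest interval after which all of the harmonics realign.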

If you pluck a guitar string and gently press your finger at the halfway point, you will dampen the fundamental. That is the only way I know of for the fundamental to go missing. But it also dampens all the odd-numbered overtones (counting the fundamental as the first), which clearly doesn't happen in your graph.

If you look closely at your graph, there is a max at the fundamental; it's just not as big or clean as the others. This does not distinguish between explanations #1 and #2, though. There is also a bit of a max at half the fundamental. This would tend to suggest it is the sounding board with some non-linear effects. If it were a high-pass filter (HPF), then there would have to be some other effect in the process creating that.

It could be both; I don't think we can definitely tell from just this. Like you said, it's complicated.

Reply by Cedron, May 7, 2019


I took a small excerpt of ten waveforms from one of the strikes in "sound_example_8.mp3".  

This is what the 1/N normalized DFT looks like:


Zooming in a bit:


You can clearly see that the fundamental is present, but not dominant. Interestingly, the same drift of the overtones away from the harmonic positions occurs here as well.
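For readers unfamiliar with the "1/N normalized DFT" mentioned above, here is a minimal sketch of that scaling. With the 1/N factor, a real sinusoid of amplitude A that falls exactly on a bin appears with magnitude A/2 in that bin and in its mirror bin. The test tone, length, and bin number are arbitrary assumptions; this is not the code used to produce the plots in this post.

```python
import cmath
import math

def dft_normalized(x):
    """Direct DFT scaled by 1/N (slow O(N^2) form, fine for short signals)."""
    n = len(x)
    return [
        sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)) / n
        for k in range(n)
    ]

N = 64
AMPLITUDE = 3.0
BIN = 5   # tone placed exactly on bin 5 (assumed, to avoid leakage)

signal = [AMPLITUDE * math.cos(2 * math.pi * BIN * i / N) for i in range(N)]
magnitudes = [abs(v) for v in dft_normalized(signal)]

print(f"|X[{BIN}]| = {magnitudes[BIN]:.3f}, |X[{N - BIN}]| = {magnitudes[N - BIN]:.3f}")
```

With this scaling the bin heights relate directly to sinusoid amplitudes, independent of the excerpt length.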

Reply by jbrower, May 7, 2019

Rick, does the article say anything about mic / data acquisition system specs? The plot seems to me a tad suspicious in that "all shape seems to be lost" around 60 Hz. Maybe the measurement system could not capture anything below that?

I'm curious, what's the first key where the fundamental does appear?


Reply by lmillot, May 7, 2019


I also suspect that the data acquisition system includes a high-pass filter with a cutoff frequency around 100 Hz, in order to suppress at least the contribution of the mains supply (50 Hz or 60 Hz, depending on the country).
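To get a feel for how much a 100 Hz high-pass filter would attenuate a 34.65 Hz fundamental, here is a small sketch using the magnitude response of an ideal analog Butterworth high-pass filter. The filter type and the orders tried are assumptions, since nothing is known about the actual acquisition chain.

```python
import math

def butterworth_highpass_gain_db(freq_hz, cutoff_hz, order):
    """Magnitude response in dB of an ideal analog Butterworth high-pass filter."""
    magnitude = 1.0 / math.sqrt(1.0 + (cutoff_hz / freq_hz) ** (2 * order))
    return 20.0 * math.log10(magnitude)

CUTOFF_HZ = 100.0        # cutoff suggested above
FUNDAMENTAL_HZ = 34.65   # C#1 fundamental

# The filter order is unknown; these values are illustrative guesses.
for order in (1, 2, 4):
    gain = butterworth_highpass_gain_db(FUNDAMENTAL_HZ, CUTOFF_HZ, order)
    print(f"order {order}: gain at {FUNDAMENTAL_HZ} Hz = {gain:.1f} dB")
```

Even a first-order filter takes roughly 10 dB off the fundamental, and steeper filters remove far more, so a filter of this kind could plausibly hide a weak fundamental.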

I have very often encountered this problem in musical acoustics presentations at conventions, mainly in studies of wind instruments.

I performed experiments on the diatonic harmonica at the end of the nineties, while preparing my PhD, using a data acquisition system able to go almost down to 0 Hz (a National Instruments data acquisition card and related conditioners) with a quite tiny Endevco pressure sensor supporting up to 170 dB SPL.

With these experiments, I could notice at least three things:

- even with the instruction "do not blow or draw too strongly" given to the musician, I measured peak levels up to 166 dB SPL within the reed channels of the diatonic harmonica;

- most of the energy of the overpressure within the reed channel was located below 50 Hz, which means that if you do not measure "from 0 Hz" but only above 100 Hz, or even above 200 Hz as some musical acousticians do, you may miss the major part of the inner phenomenon;

- when passing from the inside of the reed channel to the outside of the instrument, a level loss of around 90 dB occurs and the energy balance shifts in favor of the musical content of the pressure signal: the remaining energy was mostly concentrated in the fundamental and the harmonics, so above 390 Hz for the notes I could test.
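To put the levels quoted above into pressure units, here is a short conversion sketch. It assumes the usual 20 micropascal reference for dB SPL in air and treats the quoted figures as plain level values, without distinguishing peak from RMS.

```python
P_REF_PA = 20e-6   # standard reference pressure for dB SPL in air (assumed here)

def spl_to_pascal(level_db):
    """Convert a sound pressure level in dB SPL to pressure in pascals."""
    return P_REF_PA * 10 ** (level_db / 20)

inside_pa = spl_to_pascal(166)        # level quoted inside the reed channel
outside_pa = spl_to_pascal(166 - 90)  # after the roughly 90 dB loss quoted above

print(f"inside:  {inside_pa:.0f} Pa")
print(f"outside: {outside_pa:.3f} Pa")
print(f"pressure ratio: {inside_pa / outside_pa:.0f}")
```

A 90 dB drop corresponds to a pressure ratio of about 31,600, which shows how different the signal inside the reed channel is from what a microphone outside the instrument receives.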

In fact, I do think that for many musical instruments the measurements start at too high a frequency, so we do not capture the right phenomenon, notably when studying wind instruments or sources (vents and fronts of loudspeakers, the so-called proximity effect for microphones).

But this problem is easy to understand, since people in Acoustics, and also in Musical Acoustics, assume that physical phenomena are waves without mass transfer. Yet when looking, for instance, at the cheeks of Dizzy Gillespie playing the trumpet, I cannot accept that the air from his mouth never enters the trumpet, as implied by the common model of a wave-based resonator, which cannot handle a mass transfer.

This is what I call a source-resonator paradox: a purely wave-based resonator is excited by a flow, yet this does not seem to be an abnormal coupling to my colleagues in Musical Acoustics...

Laurent Millot  

Reply by Rick Lyons, May 7, 2019
Hi Jeff. The article's title is "Automatic Music Transcription" which is the design of algorithms to convert audio music signals such as the following:


                         Figure 1

into a human-readable musical score such as the following:


                 Figure 2

Because polyphonic music signals (instruments simultaneously playing multiple notes) are so spectrally complicated, automatic music transcription is a terribly complicated problem. Here's a quote from the article's text:

"For example, even in a simple piece as in Figure 2, most pairs of simultaneous notes are separated by musically consonant intervals, which acoustically means that many of their partials overlap [e.g., the A and D notes around 4 seconds, marked with gray circles in Figure 2, share a high number of partials]. In this case, it can be difficult to disentangle how much energy belongs to which note. The task is further complicated by the fact that the spectrotemporal properties of notes vary considerably between different pitches, playing styles, dynamics, and recording conditions. Furthermore, the stiffness property of the strings affects the travel speed of transverse waves based on their frequency. As a result, the partials of instruments like the piano are not found at perfect integer multiples of the fundamental frequency. [See Lyons' original posted spectrum.] Because of this property, called inharmonicity, the positions of partials differ between individual pianos."
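The partial overlap described in the quote can be sketched numerically. The example below lists which ideal harmonics of a D and an A fall within a few cents of each other. The octaves (D4 and A4), the equal-tempered frequencies, the 5-cent tolerance, and the limit of 12 partials are assumptions for illustration. The sketch also ignores inharmonicity, which would shift the real partials slightly.

```python
import math

def cents(f_a, f_b):
    """Interval from f_b to f_a in cents."""
    return 1200.0 * math.log2(f_a / f_b)

def overlapping_partials(f1, f2, max_partial=12, tolerance_cents=5.0):
    """Pairs (n, m) where the nth harmonic of f1 is within tolerance of the mth harmonic of f2."""
    return [
        (n, m)
        for n in range(1, max_partial + 1)
        for m in range(1, max_partial + 1)
        if abs(cents(n * f1, m * f2)) < tolerance_cents
    ]

D4_HZ = 293.66   # assumed octave, equal-tempered D4
A4_HZ = 440.00   # assumed octave, A4

overlaps = overlapping_partials(D4_HZ, A4_HZ)
for n, m in overlaps:
    print(f"D partial {n} ({n * D4_HZ:.1f} Hz) ~ A partial {m} ({m * A4_HZ:.1f} Hz)")
```

Every third partial of the D lands almost on top of every second partial of the A, which is why the energy in those bins is hard to assign to one note.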

Jeff, to answer your questions: the article gave no discussion of microphones or data acquisition hardware. They did state that the "missing fundamental" in my original spectral plot was due to the inability of a piano's soundboard to resonate at frequencies below 50 Hz.

You asked, "What's the first key where the fundamental does appear?" I don't have an answer for you but earlier this evening I thought to myself, "I wonder if higher-pitched piano notes contain a fundamental spectral component. I'd sure like to see the spectrum of a piano's C#5 note (whose fundamental is at 554.36 Hz)."
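As a quick check on the note frequencies quoted in this thread, the sketch below computes them from the standard 12-tone equal-temperament formula with A4 = 440 Hz. The MIDI note numbers assume the common convention that middle C (C4) is note 60, and `equal_tempered_hz` is a hypothetical helper name.

```python
def equal_tempered_hz(midi_note, a4_hz=440.0):
    """Frequency of a MIDI note number in 12-tone equal temperament (A4 = MIDI 69)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12)

C_SHARP_1 = 25   # MIDI number, assuming middle C (C4) = 60
C_SHARP_5 = 73

print(f"C#1 = {equal_tempered_hz(C_SHARP_1):.2f} Hz")
print(f"C#5 = {equal_tempered_hz(C_SHARP_5):.2f} Hz")
```

The two notes are exactly four octaves apart, so their nominal frequencies differ by a factor of 16.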

Reply by jbrower, May 7, 2019


I did a quick survey on automatic music transcription. It sounds to me like it's an even more difficult problem than speech recognition. Speech also exhibits inharmonicity, generating from a fundamental pitch a limited number of inharmonic frequencies (vowels, or "formants" in the vocal tract lingo). Communication depends on a moving pattern of these to produce symbols (words). As I understand it, the cochlea performs something like a sliding FFT analysis (a spectrogram) and the brain performs what amounts to image recognition in the frequency domain on formant patterns. I might guess that similar techniques are being used for music transcription.

Thinking about it in these terms, the fundamental becomes less important. Speaker-independent speech recognition (aka ASR) algorithms by definition ignore the fundamental (i.e., each person's pitch). It could be derived from the formants, and I think to some extent that is done for speaker identification.


Reply by KnutInge, May 7, 2019

Due to the acoustic-mechanics of pianos, overtones are not exact multiples of the fundamental:


I have heard that this is one reason why piano tuners will often apply "stretch tuning": tuning high keys slightly above their mathematically correct value (at least in a well-tempered sense) and low keys slightly below. This supposedly "aligns" the fundamentals of high keys with the partials of the low keys.
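The size of the stretch can be estimated with the commonly used stiff-string model, in which the nth partial sits at n * f0 * sqrt(1 + B * n^2), where f0 is the fundamental of the equivalent ideal string and B is the inharmonicity coefficient. The sketch below computes how far the 2nd partial of a note lies above a pure 2:1 octave of its 1st partial, in cents. The B values are illustrative guesses, not measurements of any piano.

```python
import math

def partial_hz(n, f0, b):
    """nth partial of a stiff string under the model n * f0 * sqrt(1 + b * n^2)."""
    return n * f0 * math.sqrt(1.0 + b * n * n)

def octave_stretch_cents(b):
    """Cents by which the 2nd partial exceeds a pure 2:1 octave above the 1st partial."""
    ratio = partial_hz(2, 1.0, b) / partial_hz(1, 1.0, b)
    return 1200.0 * math.log2(ratio / 2.0)

# Illustrative inharmonicity coefficients (assumed, not measured).
for b in (0.0, 1e-4, 1e-3):
    print(f"B = {b:g}: octave stretched by {octave_stretch_cents(b):.2f} cents")
```

Tuning the upper note to match that 2nd partial, rather than an exact doubling, is the idea behind the stretch described above.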

20 years ago, when sampled piano sounds were cool and memory was scarce, you would have electronic musical instruments with 512 kB or 4 MB of ROM containing sampled sounds that were either "one-shot" (drums, percussion) or looped (piano, organ, ...). The problem with piano sounds was that, because they feature inharmonic harmonics (sic), the waveform does not lend itself well to defining two loop points in some periodically stationary region.


Reply by Tim Wescott, May 7, 2019

Bells are weak in their fundamentals: https://en.wikipedia.org/wiki/Strike_tone.

The inharmonicity is a known thing, and it's why the partials are called that, and not harmonics.  "Due to the stiffness of piano strings" is an interesting synopsis -- basically, if the string were the same density and under the same tension, but were perfectly flexible, the partials would fall on the harmonics.  The string isn't, and it's constrained at the ends in a way that either doesn't help to put the partials on the harmonics, or even makes it worse.

See also https://en.wikipedia.org/wiki/Stretched_tuning
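To see how stiffness pushes partials above the harmonics, here is a minimal sketch based on the commonly used stiff-string model, f_n = n * f0 * sqrt(1 + B * n^2). With B = 0 (a perfectly flexible string) the partials fall exactly on the harmonics. The value of B used below is an illustrative assumption, not a measurement of the piano in the original plot. In this model f0 is the fundamental of the equivalent ideal string, so the actual first partial is very slightly higher than f0.

```python
import math

F0_HZ = 34.65            # nominal C#1 fundamental from the original post
B_ASSUMED = 1e-4         # illustrative inharmonicity coefficient (assumed)

def stiff_string_partial_hz(n, f0, b):
    """nth partial of a stiff string: n * f0 * sqrt(1 + b * n^2)."""
    return n * f0 * math.sqrt(1.0 + b * n * n)

for n in (1, 8, 16, 22):
    harmonic = n * F0_HZ
    partial = stiff_string_partial_hz(n, F0_HZ, B_ASSUMED)
    print(f"n = {n:2d}: harmonic {harmonic:7.2f} Hz, partial {partial:7.2f} Hz, "
          f"offset {partial - harmonic:5.2f} Hz")
```

The offset grows quickly with n, which matches the observation that the deviation is most visible in the upper part of the spectrum.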