My project partner and I have been working on recreating the project described in the paper
"AUDIO TRANSPORT: A GENERALIZED PORTAMENTO VIA OPTIMAL TRANSPORT", by Trevor Henderson and Justin Solomon.
In this project we interpolate between two signals in the STFT domain, using optimal transport.
After the interpolation, an audio signal has to be synthesized from the resulting spectra.
So far we haven't been able to synthesize the signal successfully - the synthesized audio suffers from transients.
Are there any rules of thumb we should know about signal synthesis?
You give no detail about what you are doing, so it is impossible to offer any detailed criticism or to make assumptions about what you might or might not be doing. So I'll ask this: are you familiar with overlap-save or overlap-add techniques?
The goal of this project is to "stitch" two audio signals together smoothly, working in the STFT domain, to achieve an effect called "portamento", as described in the paper.
As to what we are trying to do:
Given two signals, x and y, we perform the following operations:
1) Calculate their STFTs with Matlab's stft function, giving the spectra X and Y,
using a Hann window of size 2206, zero-padded to size 8192, with 50% overlap.
2) Find an optimal plan for transporting the spectral density in X to Y (using the solution of optimal transport in 1D), under the assumption that both signals have the same energy.
3) Linear interpolation of the spectral intensity, with respect to the optimal plan obtained in (2).
4) Phase accumulation - adding phase to the synthetic spectra of the transition part described in (3).
5) Synthesis of the transition part, using the istft function in Matlab,
using a rectangular window of size 2206, zero-padded to size 8192, with 75% overlap.
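To make steps 2 and 3 concrete, here is a minimal Python sketch of a 1D transport plan and the interpolation along it. It is an illustration only, not the code from the paper or from our Matlab implementation: the function names `transport_plan_1d` and `interpolate_mass` are hypothetical, the masses are assumed to be non-negative with equal totals, and the mass is moved bin by bin, which is a simplification.

```python
def transport_plan_1d(p, q, tol=1e-12):
    """Greedy left-to-right sweep that builds a 1D transport plan.

    p and q are non-negative masses on sorted bins, assumed to have
    equal totals. Returns a list of (i, j, mass) triples meaning
    "move `mass` from bin i of p to bin j of q".
    """
    plan = []
    i, j = 0, 0
    rem_p, rem_q = p[0], q[0]
    while True:
        m = min(rem_p, rem_q)          # move as much mass as both bins allow
        if m > tol:
            plan.append((i, j, m))
        rem_p -= m
        rem_q -= m
        if rem_p <= tol:               # source bin emptied: go to the next one
            i += 1
            if i == len(p):
                break
            rem_p = p[i]
        if rem_q <= tol:               # target bin filled: go to the next one
            j += 1
            if j == len(q):
                break
            rem_q = q[j]
    return plan


def interpolate_mass(plan, k, n_bins):
    """Place each parcel of mass at position (1 - k) * i + k * j.

    k = 0 reproduces the source masses, k = 1 the target masses.
    Positions are rounded to the nearest bin (an assumption made to
    keep the sketch short).
    """
    out = [0.0] * n_bins
    for i, j, m in plan:
        pos = (1.0 - k) * i + k * j
        b = min(n_bins - 1, max(0, int(round(pos))))
        out[b] += m
    return out


# A single peak in bin 0 moving to bin 4 sits in bin 2 halfway through.
plan = transport_plan_1d([1.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 1.0])
halfway = interpolate_mass(plan, 0.5, 5)
```

The point of the sketch is that the peak slides across the bins instead of fading out in one bin and in at another, which is what a plain cross-fade of the two spectra would do.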
As for stages 1-4, we are fairly sure we got them right, based on reading the paper and testing our outputs. In (5), however, we just guessed what the authors of the paper meant, and judging by the results, we got it wrong.
About overlap-save and overlap-add techniques - we hadn't heard of them before, and after some quick reading on Google it seems that those techniques are aimed at filtering long time series, or at filtering in real time. If we are mistaken and they are relevant to our work, we'd love to read more about them.
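For reference, overlap-add is also the usual way of turning a sequence of STFT frames back into one signal, so it may be relevant to stage (5). Below is a minimal Python sketch of the idea under stated assumptions: it works on windowed time-domain frames directly (the FFT and inverse FFT are skipped because they cancel when the frames are not modified), the helper names are hypothetical, the 44100 Hz sample rate and 440 Hz test tone are assumptions, and it is not a description of what Matlab's istft does internally. Note that the same hop size (1103 samples, i.e. 50% of 2206) is used for splitting and for re-assembling the frames.

```python
import math


def periodic_hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]


def split_frames(x, window, hop):
    """Cut x into overlapping frames and apply the analysis window."""
    n = len(window)
    return [[x[s + i] * window[i] for i in range(n)]
            for s in range(0, len(x) - n + 1, hop)]


def overlap_add(frames, synth_window, analysis_window, hop):
    """Weighted overlap-add of frames spaced `hop` samples apart.

    Each output sample is divided by the summed product of the analysis
    and synthesis windows covering it, so unmodified frames reproduce
    the input wherever that sum is not (almost) zero.
    """
    n = len(synth_window)
    length = hop * (len(frames) - 1) + n
    out = [0.0] * length
    norm = [0.0] * length
    for f, frame in enumerate(frames):
        start = f * hop
        for i in range(n):
            out[start + i] += frame[i] * synth_window[i]
            norm[start + i] += analysis_window[i] * synth_window[i]
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]


win_len, hop = 2206, 1103                      # 50% overlap, as in step 1
x = [math.sin(2.0 * math.pi * 440.0 * t / 44100.0) for t in range(win_len * 6)]
w = periodic_hann(win_len)
frames = split_frames(x, w, hop)
y = overlap_add(frames, [1.0] * win_len, w, hop)   # rectangular synthesis window
err = max(abs(a - b) for a, b in zip(x[hop:-hop], y[hop:-hop]))
```

In this sketch `err` stays at rounding-error level because the frames are re-assembled at the spacing they were cut with. Placing the same frames at a different spacing would put each frame at a time position it was not analysed at, so the hop size used for synthesis seems worth checking against the one used for analysis.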
Thank you in advance!