The reproduction of voiced sounds by physical modeling is addressed,
with particular focus on fitting a physically
constrained model to real voice samples. A source-filter
scheme is adopted in which the vocal tract is represented by an all-pole
filter and the voice source model relies on a lumped
mechano-aerodynamic scheme inspired by the mass-spring paradigm. The
vocal folds are represented by a mechanical resonator plus a delay
line which takes into account the vertical phase differences. The
vocal fold displacement is coupled to the glottal flow by means of a
general parametric nonlinear model. An adaptive data-driven identification
procedure is outlined, where the parameters of the model
are tuned in order to accurately match the target speech waveform.
The simultaneous optimization of the source and the vocal tract parameters
is discussed. A recursive algorithm based on Kalman filtering is proposed
and evaluated, and its performance on time-varying voiced signals is examined.
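
As a rough illustration of recursive, Kalman-based identification, the sketch below tracks only the coefficients of an all-pole (autoregressive) filter from a signal. It is not the algorithm proposed here, which also tunes the nonlinear source parameters. The function name `kalman_ar_track`, the random-walk model for the coefficients, and the noise settings `q` and `r` are all assumptions made for the example.

```python
import numpy as np

def kalman_ar_track(y, order=2, q=1e-6, r=1.0):
    """Track all-pole (AR) coefficients with a random-walk Kalman filter.

    Assumed model (illustrative only):
        a[n] = a[n-1] + w[n],          w ~ N(0, q*I)   (slow parameter drift)
        y[n] = h[n]^T a[n] + v[n],     v ~ N(0, r)     (prediction residual)
    with regressor h[n] = [y[n-1], ..., y[n-order]].
    Returns an array of shape (len(y), order) with the estimate at each sample.
    """
    y = np.asarray(y, dtype=float)
    a = np.zeros(order)              # current coefficient estimate
    P = np.eye(order)                # covariance of the estimate
    Q = q * np.eye(order)            # process noise covariance
    history = np.zeros((len(y), order))
    for n in range(len(y)):
        if n >= order:
            h = y[n - order:n][::-1]         # y[n-1], ..., y[n-order]
            P = P + Q                        # predict: random-walk drift
            s = h @ P @ h + r                # innovation variance
            k = P @ h / s                    # Kalman gain
            a = a + k * (y[n] - h @ a)       # correct with prediction error
            P = P - np.outer(k, h @ P)       # covariance update
        history[n] = a
    return history
```

Larger values of `q` let the estimate follow time-varying coefficients more quickly, at the cost of noisier estimates. Smaller values behave more like a long-memory least-squares fit.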