HiFi-Glot: Neural formant synthesis with differentiable resonant filters
Lauri Juvela, Pablo Pérez Zarazaga, Gustav Eje Henter, Zofia Malisz
Summary
The goal of this work is to develop a speaker-independent speech synthesis system driven by a small set of phonetically meaningful speech parameters.
The system is structured similarly to the source-filter model, which allows the spectral envelope and the glottal excitation to be inspected and manipulated independently.
It provides a controllable environment in which individual speech parameters can be manipulated to generate a realistic speech signal.
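To make the source-filter idea concrete, the following is a minimal sketch of a classical (non-neural) source-filter synthesiser: an impulse-train excitation passed through a cascade of two-pole resonators, one per formant. It is not the HiFi-Glot implementation. The function names, the Klatt-style unity-DC-gain normalisation, and the example formant and bandwidth values are all assumptions chosen for illustration.

```python
import numpy as np


def resonator_coeffs(freq_hz, bandwidth_hz, sample_rate):
    """Return (b0, a1, a2) for a two-pole resonator centred near freq_hz.

    Difference equation: y[n] = b0*x[n] - a1*y[n-1] - a2*y[n-2].
    """
    r = np.exp(-np.pi * bandwidth_hz / sample_rate)  # pole radius from bandwidth
    theta = 2.0 * np.pi * freq_hz / sample_rate      # pole angle from frequency
    a1 = -2.0 * r * np.cos(theta)
    a2 = r * r
    b0 = 1.0 + a1 + a2                               # assumed choice: unity gain at DC
    return b0, a1, a2


def apply_resonator(x, freq_hz, bandwidth_hz, sample_rate):
    """Filter x with a single two-pole resonator (direct-form recursion)."""
    b0, a1, a2 = resonator_coeffs(freq_hz, bandwidth_hz, sample_rate)
    y = np.zeros(len(x), dtype=float)
    y1 = y2 = 0.0
    for n in range(len(x)):
        y[n] = b0 * x[n] - a1 * y1 - a2 * y2
        y2, y1 = y1, y[n]
    return y


def impulse_train(f0_hz, duration_s, sample_rate):
    """Crude stand-in for a glottal source: one unit impulse per pitch period."""
    num_samples = int(round(duration_s * sample_rate))
    period = int(round(sample_rate / f0_hz))
    x = np.zeros(num_samples)
    x[::period] = 1.0
    return x


def synthesize_vowel(f0_hz, formants_hz, bandwidths_hz,
                     duration_s=0.2, sample_rate=16000):
    """Excitation (source) shaped by a cascade of formant resonators (filter)."""
    signal = impulse_train(f0_hz, duration_s, sample_rate)
    for freq, bandwidth in zip(formants_hz, bandwidths_hz):
        signal = apply_resonator(signal, freq, bandwidth, sample_rate)
    return signal


# Illustrative values only, not taken from the paper.
vowel = synthesize_vowel(120.0, [700.0, 1200.0, 2600.0], [80.0, 90.0, 120.0])
```

Because the excitation and the filter are separate steps, changing `f0_hz` alters the pitch without moving the resonances, and changing `formants_hz` moves the resonances without altering the pitch.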
A pre-print of this article is available on arXiv (arXiv:2409.14823).
Visual overview
Code
Source code and pre-trained models can be found by following the instructions in our repository.
Synthesised speech
We first present copy-synthesis samples generated with the proposed HiFi-Glot model, compared with our previous work on neural formant synthesis (NFS), an end-to-end implementation of that model (NFS-E2E), and Praat.
| System | Reference | HiFi-Glot | NFS-E2E | NFS | Praat |
|---|---|---|---|---|---|
| Sample 1 | |||||
| Sample 2 | |||||
| Sample 3 | |||||

Manipulation samples are created by scaling one formant frequency (F1 to F4) by a factor between 0.7 and 1.3, in steps of 0.1.
| Scale F1 | 0.7 | 0.8 | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 |
|---|---|---|---|---|---|---|---|
| HiFi-Glot | |||||||
| NFS-E2E | |||||||
| NFS | |||||||
| Praat | |||||||

| Scale F2 | 0.7 | 0.8 | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 |
|---|---|---|---|---|---|---|---|
| HiFi-Glot | |||||||
| NFS-E2E | |||||||
| NFS | |||||||
| Praat | |||||||

| Scale F3 | 0.7 | 0.8 | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 |
|---|---|---|---|---|---|---|---|
| HiFi-Glot | |||||||
| NFS-E2E | |||||||
| NFS | |||||||
| Praat | |||||||

| Scale F4 | 0.7 | 0.8 | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 |
|---|---|---|---|---|---|---|---|
| HiFi-Glot | |||||||
| NFS-E2E | |||||||
| NFS | |||||||
| Praat | |||||||
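
The formant scaling behind the manipulation samples above can be sketched as follows. This is a hedged illustration rather than the code used to produce the samples: the `(num_frames, 4)` array layout, the function name `scale_formant`, and the example frequency values are assumptions. Only the scale factors and the F1 to F4 labels come from the tables.

```python
import numpy as np

FORMANT_NAMES = ("F1", "F2", "F3", "F4")
SCALE_FACTORS = (0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3)  # as listed in the tables


def scale_formant(formant_tracks_hz, formant_name, factor):
    """Return a copy of the formant tracks with one formant scaled by `factor`.

    formant_tracks_hz: array of shape (num_frames, 4) holding F1..F4 in Hz
    per frame (assumed layout). The input array is left unmodified.
    """
    tracks = np.array(formant_tracks_hz, dtype=float)  # np.array copies by default
    column = FORMANT_NAMES.index(formant_name)
    tracks[:, column] *= factor
    return tracks


# Two frames of made-up formant values in Hz, for illustration only.
tracks = np.array([[500.0, 1500.0, 2500.0, 3500.0],
                   [520.0, 1480.0, 2550.0, 3450.0]])

# One manipulated copy per scale factor, mirroring a row of the F1 table.
f1_variants = {factor: scale_formant(tracks, "F1", factor)
               for factor in SCALE_FACTORS}
```

Each manipulated track would then be passed, together with the unchanged remaining parameters, to whichever synthesiser is being compared.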
Citation information
@article{juvela2024hifi,
title={HiFi-Glot: Neural Formant Synthesis with Differentiable Resonant Filters},
author={Juvela, Lauri and P{\'e}rez Zarazaga, Pablo and Henter, Gustav Eje and Malisz, Zofia},
journal={arXiv preprint arXiv:2409.14823},
year={2024}
}