Deep learning speech synthesis uses deep neural networks (DNNs) to produce artificial speech from text (text-to-speech) or from a spectrum (vocoder). The networks are trained on a large amount of recorded speech and, in the case of a text-to-speech system, the associated labels and/or input text.
Some DNN-based speech synthesizers are approaching the naturalness of the human voice.
https://en.wikipedia.org/wiki/Deep_learning_speech_synthesis