An end-to-end text-to-speech (TTS) system generates acoustic features directly from input text and synthesizes speech from them. Applying these models to Persian raises two challenges: the lack of suitable data, and the need to detect exceptions and Ezafe between words inherently, that is, without a grapheme-to-phoneme module. In this paper, we propose to use a specific end-to-end TTS system, Tacotron2, and suggest solutions for these problems. To address the lack of data, we collect a dataset suited to end-to-end text-to-speech, comprising 21 hours of Persian speech and the corresponding text. To address exception and Ezafe detection, we use multi-resolution convolution and part-of-speech embedding layers in the encoder of Tacotron2. In addition, the Mel-spectrogram generation process in Tacotron2 is unstable because of the high dropout rate at inference time. To handle this problem, we propose to use a convex optimization method named Net-Trim. Experimental results show that our proposed method increases the Tacotron2 mean opinion score from 3.01 to 3.97. Furthermore, the proposed method decreases Mel cepstral distortion in comparison with Tacotron2.

Aghasi A, Abdi A, Nguyen N, Romberg J (2017) Net-Trim: Convex pruning of deep neural networks with performance guarantee. 31st Conference on Neural Information Processing Systems (NIPS).