Nana-HDR: A Non-attentive Non-autoregressive Hybrid Model for TTS
Note: The texts below do not appear in the training set. Each is converted into the system input (Chinese Pinyin with tone and pause marks) by a pre-trained front-end model.
Part 1: Audio samples synthesized by Tacotron with GMM-based attention, Fastspeech, and our Nana-HDR (using the original LPCNet as the vocoder).
Female voice:
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example1:胡说!别人不了解,我还不了解你?
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example2:也好,你们促使我做了一个大胆的决定。
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example3:我本不是她正经婆婆,没的摆什么谱,三天两头来见,她也累我也烦。
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example4:老爷,你可还记得几年前三姑娘夭折的时候。
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example5:强撑着道:“你念得……很好,只是错了几个字而已,不妨事的,慢慢学就好。”
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example6:想她堂堂一大学生装文盲容易吗?
Male voice:
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example1:镇东王府邸占地极广,正门日间夜间都是大大敞开。
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example2:老爷,你可还记得几年前三姑娘夭折的时候。
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example3:强撑着道:“你念得……很好,只是错了几个字而已,不妨事的,慢慢学就好。”
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example4:哼,那就要命一条要头一颗,真的无路可走,她也不会客气。
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example5:却被你后娘拿你硬冒名顶了进宫,耽误了你一辈子。
Tacotron with GMM-based attention
Fastspeech
Nana-HDR
example6:明兰浑身哆嗦着,迅速抬头四下看,只见小船被灯笼照得通明。
Part 2: Audio samples synthesized in ablation studies (using the original LPCNet as the vocoder).