By admin, 21 May 2025

https://github.com/hexgrad/kokoro

Kokoro is an open-weight TTS model with 82 million parameters. Despite its lightweight architecture, it delivers comparable quality to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects.

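Kokoro supports Mandarin. Synthesizing a short 8-character sentence (about 2.5 seconds of audio) took 0.7 seconds, so the model really does seem fast. Unfortunately, the Mandarin tones are not quite right, and the training code is not open-sourced.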
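To put the timing above in perspective, here is a minimal sketch that turns the two measured numbers (2.5 seconds of audio, 0.7 seconds of synthesis) into a real-time factor. The variable names are illustrative only, and a single short sentence is a rough data point, not a benchmark.

```python
# Numbers taken from the single measurement reported above.
audio_seconds = 2.5      # duration of the synthesized speech
synthesis_seconds = 0.7  # wall-clock time spent generating it

# Real-time factor (RTF): values below 1.0 mean faster than real time.
rtf = synthesis_seconds / audio_seconds
speedup = audio_seconds / synthesis_seconds

print(f"RTF = {rtf:.2f}, {speedup:.2f}x faster than real time")
```

An RTF of about 0.28 means roughly 3.6 seconds of speech are produced per second of compute on this particular machine and sentence.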

A comparison from an AI:


By admin, 20 May 2025

Mel spectrogram parameters

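  • num_mels: 80
    The number of mel filter banks, which sets the dimensionality of the mel spectrogram. 80 is the standard Tacotron2 configuration.
  • mel_fmin: 50.0, mel_fmax: 7600.0
    The lowest and highest frequencies (Hz) covered by the mel spectrogram. For Cantonese, these values cover most of the frequency range of speech (Cantonese tonal variation may call for a relatively high upper limit).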
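As a rough illustration of what these three parameters control, the sketch below spaces 80 bands evenly on the mel scale between 50 Hz and 7600 Hz and converts the band edges back to Hz. It assumes the HTK-style formula mel = 2595 * log10(1 + f / 700); audio toolkits may use a different mel variant, so exact values can differ from a real training pipeline. The helper names are hypothetical and not taken from any Tacotron2 codebase.

```python
import math


def hz_to_mel(hz: float) -> float:
    """Convert Hz to mel using the HTK-style formula (an assumption here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(num_mels: int, fmin: float, fmax: float) -> list:
    """Return num_mels + 2 frequencies (Hz), equally spaced on the mel scale.

    Triangular filter i spans edges[i] .. edges[i + 2] and peaks at edges[i + 1].
    """
    lo, hi = hz_to_mel(fmin), hz_to_mel(fmax)
    step = (hi - lo) / (num_mels + 1)
    return [mel_to_hz(lo + i * step) for i in range(num_mels + 2)]


edges = mel_band_edges(num_mels=80, fmin=50.0, fmax=7600.0)
centers = edges[1:-1]  # one center frequency per mel band

print(f"{len(centers)} bands, first center {centers[0]:.1f} Hz, "
      f"last center {centers[-1]:.1f} Hz")
```

Because the spacing is uniform in mel rather than in Hz, the bands are narrow at low frequencies, where pitch and tone information lives, and progressively wider toward mel_fmax.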
