Common Voice Dataset

By admin, 11 October 2024

https://commonvoice.mozilla.org/en/datasets

We’re building an open source, multi-language dataset of voices that anyone can use to train speech-enabled applications.

It includes both Cantonese and Mandarin Chinese!

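I sampled the Cantonese (Chinese (Hong Kong)) speech data and the quality is poor: the speakers' voices are not clear enough (nowhere near voice-actor quality), the background noise is fairly heavy, and the label files contain errors. The dataset also has a separate category named Cantonese.

My impression is that generating the data with an existing TTS system would probably give much better quality.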

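Cantonese audio statistics as of June 25, 2025:

Total files: 123,195
Total duration: 8,552 min 7.33 s (513,127.33 s, about 142.5 hours)
Average duration: 4.17 s
Longest clip: 1 min 42.5 s (102.50 s)
Shortest clip: 0.2 s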
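
For reference, summary figures like the ones above can be reproduced from a list of per-clip durations. The following is only a minimal sketch, not the script used for this post: it assumes the durations (in seconds) have already been extracted from the audio files by some other tool, and the function names `summarize_durations` and `format_duration` are made up for illustration.

```python
import math


def format_duration(seconds: float) -> str:
    """Format a duration in seconds as 'M min S.SS s'."""
    # Work in whole centiseconds so that rounding never yields '60.00 s'.
    total_cs = round(seconds * 100)
    minutes, cs = divmod(total_cs, 6000)
    return f"{minutes} min {cs / 100:.2f} s"


def summarize_durations(durations: list[float]) -> dict:
    """Return count, total, mean, longest and shortest clip duration.

    `durations` is assumed to hold one value per audio clip, in seconds.
    """
    if not durations:
        raise ValueError("no durations given")
    total = math.fsum(durations)  # fsum limits floating-point drift
    return {
        "count": len(durations),
        "total": total,
        "mean": total / len(durations),
        "longest": max(durations),
        "shortest": min(durations),
    }


if __name__ == "__main__":
    # Tiny made-up sample, not real Common Voice data.
    stats = summarize_durations([4.2, 3.9, 102.5, 0.2])
    for name in ("total", "mean", "longest", "shortest"):
        print(f"{name}: {format_duration(stats[name])} ({stats[name]:.2f} s)")
```

As a sanity check on the figures in this post, 513,127.33 s divided by 123,195 files is about 4.17 s per clip, which matches the reported average.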

Tags

TTS
AI
