TradePoint.io

Google AI Releases WAXAL: A Multilingual African Speech Dataset for Training Automatic Speech Recognition and Text-to-Speech Models

March 17, 2026
in AI & Technology

Speech technology still has a data distribution problem. Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) systems have improved rapidly for high-resource languages, but many African languages remain poorly represented in open corpora. A team of researchers from Google and collaborators has introduced WAXAL, an open multilingual speech dataset covering 24 African languages, with an ASR component built from transcribed natural speech and a TTS component built from studio-quality single-speaker recordings.

WAXAL is structured as two separate resources because ASR and TTS have different data requirements. The ASR side is designed around diverse speakers, natural environments, and spontaneous language production. The TTS side is designed around controlled recording conditions, phonetically balanced scripts, and cleaner single-speaker audio suited for synthesis. That separation is technically important: a dataset that is useful for robust recognition in noisy real-world settings is usually not the same dataset that produces strong single-speaker TTS models.

https://arxiv.org/pdf/2602.02734

How the ASR data was collected

The ASR portion of WAXAL was collected using image-prompted speech. Speakers were shown images and asked to describe what they saw in their native language, which is a more natural setup than simple prompted reading. Recordings were captured in speakers’ natural environments, each with a minimum duration of 15 seconds. The collection process also tracked metadata such as speaker age, gender, language, and recording environment. Only a subset of the full collected audio was transcribed: the research team states that the current ASR release includes transcriptions for about 10% of the total recorded audio. Those transcriptions were produced by paid local linguistic experts, using local scripts where available and English-alphabet transliteration otherwise.

This matters for anyone building multilingual ASR systems. Image-prompted speech tends to capture more natural lexical and syntactic variation than tightly scripted reading, but it also makes transcription harder and increases variation across speakers, domains, and acoustic conditions. WAXAL leans into that tradeoff rather than avoiding it. The result is not a perfectly clean benchmark dataset; it is closer to a field-collected multilingual ASR corpus with real variability baked in.
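
To make the collection constraints concrete, here is a minimal Python sketch of how a downstream user might represent and filter such recordings. The class name, field names, and helper functions are hypothetical and are not taken from the WAXAL release; only the 15-second minimum and the listed metadata categories come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Minimum recording length described for the ASR collection.
MIN_DURATION_S = 15.0


@dataclass
class AsrRecording:
    # Hypothetical schema: these field names mirror the metadata categories
    # mentioned in the article, not the actual column names of the dataset.
    speaker_id: str
    language: str
    age: int
    gender: str
    environment: str
    duration_s: float
    transcript: Optional[str] = None  # most recordings have no transcript


def meets_minimum(rec: AsrRecording) -> bool:
    """True if the recording satisfies the minimum-duration rule."""
    return rec.duration_s >= MIN_DURATION_S


def split_by_transcription(
    recs: List[AsrRecording],
) -> Tuple[List[AsrRecording], List[AsrRecording]]:
    """Separate valid recordings into transcribed and untranscribed sets."""
    valid = [r for r in recs if meets_minimum(r)]
    labeled = [r for r in valid if r.transcript]
    unlabeled = [r for r in valid if not r.transcript]
    return labeled, unlabeled


def transcribed_fraction(recs: List[AsrRecording]) -> float:
    """Share of total audio duration that has a transcript."""
    total = sum(r.duration_s for r in recs)
    done = sum(r.duration_s for r in recs if r.transcript)
    return done / total if total else 0.0
```

Under this sketch, the transcribed subset would feed supervised ASR training, while the untranscribed remainder could still be useful for self-supervised pretraining or later annotation.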

How the TTS data was collected

The TTS side of WAXAL was built very differently: it was designed for high-quality, single-speaker synthetic voices. For each target language, the research team created a phonetically balanced script of approximately 108,500 words. They contracted 72 community participants, evenly split between male and female voice actors, and recorded them in professional studio-like environments to reduce background noise and preserve audio fidelity. The target was approximately 16 hours of clean edited audio per voice actor.

This is the right design choice for synthesis. TTS models care much more about consistency in pronunciation, recording conditions, microphone quality, and speaker identity than ASR systems do. WAXAL therefore avoids the common mistake of treating ‘speech data’ as a single category, when in practice ASR and TTS pipelines want very different supervision signals.
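
As a rough sanity check on the TTS figures, the short Python sketch below computes the speaking rate implied by the script length and the audio target. It assumes, purely for illustration, that one voice actor reads the full script once and that the roughly 16 hours of edited audio correspond to that single pass; the article does not state this, and word lengths differ across languages, so the number is only a back-of-envelope estimate.

```python
# Approximate figures quoted in the article.
SCRIPT_WORDS = 108_500   # words in the phonetically balanced script
TARGET_HOURS = 16        # clean edited audio per voice actor


def implied_words_per_minute(words: int, hours: float) -> float:
    """Average speaking rate if the whole script fills the audio target."""
    return words / (hours * 60)


wpm = implied_words_per_minute(SCRIPT_WORDS, TARGET_HOURS)
print(f"Implied rate: {wpm:.0f} words per minute")
# prints "Implied rate: 113 words per minute"
```

A rate in this range is consistent with careful, deliberate read speech rather than fast conversational delivery, which fits the controlled recording setup described for the TTS data.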

Key Takeaways

  • WAXAL is an open multilingual speech corpus built for low-resource African language ASR and TTS.
  • The ASR data uses image-prompted, natural speech collected in real-world environments.
  • The TTS data uses studio-quality, single-speaker recordings with phonetically balanced scripts.


The post Google AI Releases WAXAL: A Multilingual African Speech Dataset for Training Automatic Speech Recognition and Text-to-Speech Models appeared first on MarkTechPost.
