Build Recurrent-Depth Transformers with OpenMythos for MLA, GQA, Sparse MoE, and Loop-Scaled Reasoning

May 22, 2026

import torch
# Import path is an assumption; adjust it to match your installed OpenMythos version.
from open_mythos.main import OpenMythos, MythosConfig

device = "cuda" if torch.cuda.is_available() else "cpu"


def build_model(attn_type: str = "mla", max_loop_iters: int = 8) -> tuple:
    """Build a small OpenMythos model. Two attention variants are supported.

    MLA: Multi-head Latent Attention (compressed KV cache, DeepSeek-V2 style)
    GQA: Grouped-Query Attention (fewer KV heads than Q heads)
    """
    # Settings shared by both attention variants.
    base = dict(
        vocab_size        = 64,
        dim               = 128,
        n_heads           = 4,
        max_seq_len       = 32,
        max_loop_iters    = max_loop_iters,
        prelude_layers    = 1,
        coda_layers       = 1,
        n_experts         = 4,
        n_shared_experts  = 1,
        n_experts_per_tok = 2,
        expert_dim        = 64,
        lora_rank         = 8,
        attn_type         = attn_type,
    )
    if attn_type == "gqa":
        # GQA: 4 query heads share 2 KV heads.
        cfg = MythosConfig(**base, n_kv_heads=2)
    else:
        # MLA: low-rank latents for K/V and Q, plus separate RoPE and non-RoPE head dims.
        cfg = MythosConfig(
            **base, n_kv_heads=4,
            kv_lora_rank=32, q_lora_rank=32,
            qk_rope_head_dim=16, qk_nope_head_dim=16, v_head_dim=16,
        )
    model = OpenMythos(cfg).to(device)
    return model, cfg


model_mla, cfg_mla = build_model("mla")
model_gqa, cfg_gqa = build_model("gqa")


def n_params(m):
    """Total number of parameters in a module."""
    return sum(p.numel() for p in m.parameters())


print(f"\n[MLA] params: {n_params(model_mla):>10,}")
print(f"[GQA] params: {n_params(model_gqa):>10,}")


def spectral_radius(model):
    """Largest eigenvalue magnitude of the recurrent injection matrix A."""
    A = model.recurrent.injection.get_A().detach().cpu()
    if A.dim() == 1:
        # 1-D A is a diagonal matrix: its eigenvalues are the entries themselves.
        rho = A.abs().max().item()
    else:
        rho = torch.linalg.eigvals(A.float()).abs().max().item()
    return rho


print(f"\nρ(A) MLA: {spectral_radius(model_mla):.4f}   (must be < 1)")
print(f"ρ(A) GQA: {spectral_radius(model_gqa):.4f}   (must be < 1)")

# Smoke test: one forward pass with 4 loops, then generation with 8 loops.
ids = torch.randint(0, cfg_mla.vocab_size, (2, 16), device=device)
with torch.no_grad():
    logits = model_mla(ids, n_loops=4)
    gen    = model_mla.generate(ids, max_new_tokens=4, n_loops=8)
print(f"\nForward logits shape:  {tuple(logits.shape)}")
print(f"Generation shape:      {tuple(gen.shape)}")
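
The "must be < 1" check on ρ(A) is easier to appreciate with a toy recurrence. The sketch below is plain Python and does not touch OpenMythos. It assumes the recurrent block contains a linear term of the form h ← A·h + b (suggested by the get_A() call, but the actual update inside the library is not shown here) and uses a diagonal A, matching the 1-D branch of spectral_radius. The helper names and the values in a_stable, a_unstable and b are hypothetical, chosen only for illustration.

```python
def run_diagonal_recurrence(a, b, n_loops):
    """Iterate h <- a * h + b elementwise, starting from h = 0."""
    h = [0.0] * len(a)
    for _ in range(n_loops):
        h = [ai * hi + bi for ai, hi, bi in zip(a, h, b)]
    return h


def spectral_radius_diag(a):
    """For a diagonal matrix, the spectral radius is the largest |entry|."""
    return max(abs(x) for x in a)


# Hypothetical diagonals: one contractive, one with a single entry above 1 in magnitude.
a_stable = [0.5, -0.9, 0.3]
a_unstable = [0.5, -1.1, 0.3]
b = [1.0, 1.0, 1.0]

print("rho stable:  ", spectral_radius_diag(a_stable))
print("rho unstable:", spectral_radius_diag(a_unstable))

for n_loops in (4, 8, 64):
    hs = run_diagonal_recurrence(a_stable, b, n_loops)
    hu = run_diagonal_recurrence(a_unstable, b, n_loops)
    print(f"loops={n_loops:>2}  max|h| stable={max(map(abs, hs)):.4f}  "
          f"unstable={max(map(abs, hu)):.4f}")

# With rho < 1 the iteration converges to the fixed point b / (1 - a).
fixed_point = [bi / (1.0 - ai) for ai, bi in zip(a_stable, b)]
print("fixed point:", [round(x, 4) for x in fixed_point])
```

With the contractive diagonal, the state settles toward a fixed point as the loop count grows, so running more loops at inference time stays well behaved. With one entry above 1 in magnitude, the state grows geometrically with loop count.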

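The two attention variants differ mainly in what they keep in the KV cache. As a rough back-of-the-envelope sketch using the config values above: GQA is assumed to cache full keys and values for each KV head with head_dim = dim // n_heads, and MLA is assumed to cache one compressed latent of size kv_lora_rank plus one shared RoPE key of size qk_rope_head_dim per token, as described for DeepSeek-V2. Whether OpenMythos lays out its cache exactly this way is not confirmed by the listing, and the helper names are hypothetical.

```python
def gqa_cache_per_token(dim, n_heads, n_kv_heads):
    """Cached values per token per layer: keys + values for each KV head."""
    head_dim = dim // n_heads  # assumption: head_dim is dim split evenly over Q heads
    return 2 * n_kv_heads * head_dim


def mla_cache_per_token(kv_lora_rank, qk_rope_head_dim):
    """Cached values per token per layer: compressed KV latent + shared RoPE key."""
    return kv_lora_rank + qk_rope_head_dim


# Values taken from the build_model config above.
mha = gqa_cache_per_token(dim=128, n_heads=4, n_kv_heads=4)  # full multi-head baseline
gqa = gqa_cache_per_token(dim=128, n_heads=4, n_kv_heads=2)
mla = mla_cache_per_token(kv_lora_rank=32, qk_rope_head_dim=16)

print(f"MHA baseline: {mha} values/token/layer")
print(f"GQA:          {gqa} values/token/layer")
print(f"MLA:          {mla} values/token/layer")
```

Under these assumptions, GQA halves the cache relative to a full multi-head baseline, and MLA shrinks it further by storing only the latent and the shared positional key.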


