TradePoint.io

Build Recurrent-Depth Transformers with OpenMythos for MLA, GQA, Sparse MoE, and Loop-Scaled Reasoning

May 22, 2026
in AI & Technology


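import torch
# Import path assumed from the OpenMythos package layout; adjust it if your install differs.
from open_mythos.main import MythosConfig, OpenMythos

# Run on the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"


def build_model(attn_type: str = "mla", max_loop_iters: int = 8) -> tuple:
    """Build a small OpenMythos model. Two attention variants are supported.

    MLA: Multi-head Latent Attention (compressed KV cache, DeepSeek-V2 style)
    GQA: Grouped-Query Attention (fewer KV heads than Q heads)
    """
    # Settings shared by both attention variants.
    base = dict(
        vocab_size=64,
        dim=128,
        n_heads=4,
        max_seq_len=32,
        max_loop_iters=max_loop_iters,
        prelude_layers=1,
        coda_layers=1,
        n_experts=4,
        n_shared_experts=1,
        n_experts_per_tok=2,
        expert_dim=64,
        lora_rank=8,
        attn_type=attn_type,
    )
    if attn_type == "gqa":
        # GQA: 4 query heads share 2 key/value heads.
        cfg = MythosConfig(**base, n_kv_heads=2)
    else:
        # MLA: low-rank query/KV projections plus separate RoPE and non-RoPE head dims.
        cfg = MythosConfig(
            **base, n_kv_heads=4,
            kv_lora_rank=32, q_lora_rank=32,
            qk_rope_head_dim=16, qk_nope_head_dim=16, v_head_dim=16,
        )
    model = OpenMythos(cfg).to(device)
    return model, cfg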
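
To make the difference between the two attention variants concrete, the sketch below counts how many values each one would cache per token and per layer for the configuration above. It assumes a head size of dim // n_heads for GQA and the DeepSeek-V2 caching scheme for MLA (one compressed latent of size kv_lora_rank plus one shared RoPE key of size qk_rope_head_dim). Whether OpenMythos caches exactly these tensors is an assumption, and the helper names are hypothetical.

```python
# Back-of-the-envelope KV-cache sizes per token, per layer.
# Helper names are hypothetical; the numbers come from the config above.

def mha_cache(n_heads: int, head_dim: int) -> int:
    # Standard multi-head attention caches one key and one value per head.
    return 2 * n_heads * head_dim


def gqa_cache(n_kv_heads: int, head_dim: int) -> int:
    # GQA caches keys and values only for the (fewer) KV heads.
    return 2 * n_kv_heads * head_dim


def mla_cache(kv_lora_rank: int, qk_rope_head_dim: int) -> int:
    # Assumed DeepSeek-V2 scheme: one compressed KV latent plus one shared RoPE key.
    return kv_lora_rank + qk_rope_head_dim


dim, n_heads = 128, 4
head_dim = dim // n_heads  # assumed per-head size for the GQA variant

sizes = {
    "mha": mha_cache(n_heads, head_dim),
    "gqa": gqa_cache(2, head_dim),
    "mla": mla_cache(32, 16),
}
for name, size in sizes.items():
    print(f"{name}: {size} cached values per token per layer")
```

Under these assumptions the GQA variant halves the cache of a full multi-head baseline, and the MLA variant shrinks it further.
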
model_mla, cfg_mla = build_model("mla")
model_gqa, cfg_gqa = build_model("gqa")


def n_params(m):
    """Count every parameter in a module."""
    return sum(p.numel() for p in m.parameters())


print(f"\n[MLA] params: {n_params(model_mla):>10,}")
print(f"[GQA] params: {n_params(model_gqa):>10,}")
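
The counts above include every expert, yet the config sends each token to only n_experts_per_tok = 2 of the n_experts = 4 routed experts, alongside the shared expert. The following is a minimal sketch of top-k routing in plain Python. It assumes a softmax router whose selected weights are renormalised to sum to 1, which is one common choice and not necessarily what OpenMythos implements; `route_token` is a hypothetical helper.

```python
import math


def route_token(router_logits, k):
    """Pick the top-k experts for one token and return {expert_index: weight}."""
    # Softmax over the router logits, shifted by the max for numerical stability.
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Indices of the k largest probabilities.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalise so the selected weights sum to 1.
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


# Four routed experts, two active per token, as in the config above.
weights = route_token([0.1, 2.0, -1.0, 1.5], k=2)
print(weights)
```

Only the selected experts run for that token, so the compute per token is smaller than the total parameter count suggests.
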
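def spectral_radius(model):
    """Return the spectral radius rho(A) of the recurrent injection matrix A."""
    A = model.recurrent.injection.get_A().detach().cpu()
    if A.dim() == 1:
        # A 1-D tensor is treated as a diagonal, so the radius is its largest absolute entry.
        rho = A.abs().max().item()
    else:
        # Full matrix: take the largest eigenvalue magnitude.
        rho = torch.linalg.eigvals(A.float()).abs().max().item()
    return rho


print(f"\nρ(A) MLA: {spectral_radius(model_mla):.4f}   (must be < 1)")
print(f"ρ(A) GQA: {spectral_radius(model_gqa):.4f}   (must be < 1)")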
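
The reason for the ρ(A) < 1 check can be seen in a toy version of the loop. Suppose the recurrent state is updated as h ← A·h + e on every iteration, with A diagonal. This is an assumed simplification for illustration; the real block also applies attention and MoE layers. The state then settles to a fixed point when every |a| < 1 and grows without bound otherwise. `run_loops` is a hypothetical helper.

```python
def run_loops(a_diag, e, n_loops):
    """Iterate h <- a * h + e per channel, starting from h = 0."""
    h = [0.0] * len(a_diag)
    for _ in range(n_loops):
        h = [a * x + inj for a, x, inj in zip(a_diag, h, e)]
    return h


e = [1.0, 1.0]
stable = run_loops([0.5, -0.9], e, n_loops=200)   # rho(A) = 0.9
unstable = run_loops([0.5, 1.1], e, n_loops=200)  # rho(A) = 1.1

print("stable:  ", stable)    # each channel approaches e / (1 - a)
print("unstable:", unstable)  # the second channel keeps growing
```

With ρ(A) below 1, running more loops at inference time refines the state instead of blowing it up, which is what makes loop counts larger than those used in training usable.
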
# Random token batch: 2 sequences of 16 tokens each.
ids = torch.randint(0, cfg_mla.vocab_size, (2, 16), device=device)
with torch.no_grad():
    # n_loops sets how many times the recurrent block runs and can differ between calls.
    logits = model_mla(ids, n_loops=4)
    gen = model_mla.generate(ids, max_new_tokens=4, n_loops=8)
print(f"\nForward logits shape:  {tuple(logits.shape)}")
print(f"Generation shape:      {tuple(gen.shape)}")
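
The two printed shapes can be predicted by hand. The sketch below does the bookkeeping, assuming the forward pass returns one logit vector per input position and that generate returns the prompt with the new tokens appended. Both are common conventions and are not confirmed by the source.

```python
# Values taken from the config and the calls above.
batch, seq_len = 2, 16
vocab_size, max_seq_len = 64, 32
max_new_tokens = 4

# Assumed conventions: per-position logits, and prompt + new tokens from generate.
expected_logits_shape = (batch, seq_len, vocab_size)
expected_gen_shape = (batch, seq_len + max_new_tokens)

# The generated sequence has to fit inside the model's context window.
fits_context = expected_gen_shape[1] <= max_seq_len

print("expected logits shape:    ", expected_logits_shape)
print("expected generation shape:", expected_gen_shape)
print("fits in max_seq_len:      ", fits_context)
```

If the real output differs from these shapes, the first thing to check is whether generate returns only the new tokens.
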

