New agentic memory framework uses 118K tokens per query. LangMem burns through 3.26M.

June 26, 2026
in AI & Technology

Long-horizon reasoning exposes a core weakness in AI agents: context windows fill up fast, and retrieval pipelines return noise instead of signal.

To address this, researchers at the National University of Singapore developed MRAgent, a framework that abandons the static “retrieve-then-reason” approach. Instead, it lets an agent build up its memory dynamically as evidence accumulates.

This multi-step memory reconstruction is integrated into the reasoning process of the large language model (LLM). While not the only framework in this space, MRAgent significantly reduces token consumption and runtime costs compared to other agentic memory management approaches.

The limits of passive retrieval in long-horizon tasks

In classic retrieval pipelines, documents are retrieved through vector search or graph traversal and passed on to an LLM for reasoning. This passive approach fails because it cannot combine reasoning with memory access, creating three major bottlenecks:

  • These systems cannot revise their retrieval strategy mid-reasoning. If an agent fetches a document and discovers a crucial missing cue, such as a specific date or person, it has no way to issue a new query based on that finding.

  • Fixed similarity scores and predefined graph expansions return surface-level matches that flood the LLM’s context window with irrelevant noise, degrading reasoning.

  • Current systems rely heavily on pre-constructed structures such as top-k results and static relevance functions, limiting the flexibility required to scale across unpredictable, long-horizon user interactions.
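
The second bottleneck can be illustrated with a minimal sketch of one-shot top-k retrieval. This is not any particular library's API: the word-overlap score stands in for a vector similarity function, and the function names and sample documents are hypothetical.

```python
def overlap(query: str, doc: str) -> float:
    """Jaccard word overlap, a stand-in for a fixed similarity score."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0


def retrieve_top_k(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank once by similarity and return the top k; no follow-up query is possible."""
    return sorted(docs, key=lambda doc: overlap(query, doc), reverse=True)[:k]


# Illustrative documents only (not from the paper or its benchmarks).
docs = [
    "nate played a video game tournament",
    "nate watched a video game tournament stream",
    "the prize money went into savings",
]
top = retrieve_top_k("nate video game tournament prize money", docs)
print(top)
```

In this toy case the two surface-level matches fill both slots, and the document that actually mentions the prize money is never passed to the model.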

The researchers argue that to overcome these limitations, developers must shift toward an “active and associative reconstruction process,” a concept inspired by cognitive neuroscience. 

Passive retrieval vs active memory reconstruction (source: arXiv)

Under this paradigm, memory recall unfolds sequentially rather than operating as a passive read-out of a static database. The system starts with small, specific triggers from the user’s prompt, such as a person’s name, an action, or a place. These initial hints point to connecting concepts or categories instead of massive blocks of text. 

By following these metadata stepping stones, the agent gathers small pieces of evidence one by one. It uses each new piece of information to guide its next step until it successfully pieces together the full, accurate story.

How MRAgent implements active memory reconstruction

Instead of viewing memory as a static database, MRAgent (Memory Reasoning Architecture for LLM Agents) treats it as an interactive environment. When processing a complex query, the agent uses the backbone LLM’s reasoning abilities to explore multiple candidate retrieval paths across a structured memory graph. 

At each step, the LLM evaluates the intermediate evidence it has gathered and uses it to iteratively optimize its search. It infers new search constraints, pursues the paths with the best information, and prunes irrelevant branches. This allows MRAgent to piece together deeply buried information without filling the LLM’s context with noise.

MRAgent architecture (source: arXiv)

To make this active exploration computationally efficient and scalable, the framework organizes its database using a “Cue-Tag-Content” mechanism. This operates as a multi-layered associative graph with three node types:

  • Cues: Fine-grained keywords, such as entities or contextual attributes extracted from user interactions.

  • Content: The actual stored memory units. These are divided into multi-granular layers, such as episodic memory for concrete events and semantic memory for stable facts and user preferences.

  • Tags: Semantic bridges that summarize the relational associations between specific Cues and Content.

This structure enables a highly efficient two-stage retrieval process. The LLM first navigates from Cues to candidate Tags. Because Tags explicitly expose the semantic relationships and structural associations of the data, the agent evaluates these short summaries to judge their relevance. The LLM identifies promising traversal paths and discards irrelevant branches before spending compute and prompt tokens to access the detailed, heavy memory contents.
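
A minimal sketch of how such a three-layer structure could be held in memory is shown below. It is an assumption for illustration, not the authors' released code: the class name `MemoryGraph`, its methods, and the sample episodes are all hypothetical, and a simple string check stands in for the LLM's relevance judgment.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryGraph:
    """Hypothetical Cue-Tag-Content store: cue -> tags -> content ids."""
    cue_to_tags: dict[str, set[str]] = field(default_factory=dict)
    tag_to_content: dict[tuple[str, str], list[str]] = field(default_factory=dict)
    content: dict[str, str] = field(default_factory=dict)

    def add(self, cue: str, tag: str, content_id: str, text: str) -> None:
        """Link a cue to a tag and attach one content unit to that pair."""
        self.cue_to_tags.setdefault(cue, set()).add(tag)
        self.tag_to_content.setdefault((cue, tag), []).append(content_id)
        self.content[content_id] = text

    def candidate_tags(self, cue: str) -> list[str]:
        """Stage 1: expose only the short tag labels attached to a cue."""
        return sorted(self.cue_to_tags.get(cue, set()))

    def fetch(self, cue: str, tag: str) -> list[str]:
        """Stage 2: load full content only for a (cue, tag) pair that was kept."""
        ids = self.tag_to_content.get((cue, tag), [])
        return [self.content[cid] for cid in ids]


# Illustrative data only.
graph = MemoryGraph()
graph.add("Nate", "Tournament Victory", "e1", "Nate won his third tournament.")
graph.add("Nate", "Tournament Participation", "e2", "Nate entered a regional tournament.")

tags = graph.candidate_tags("Nate")
kept = [t for t in tags if "Victory" in t]  # stand-in for the LLM's tag evaluation
episodes = graph.fetch("Nate", kept[0])
print(tags, episodes)
```

The point of the split is that `candidate_tags` returns only short labels, so the expensive `fetch` runs solely for branches that survive pruning.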

For example, a user might ask an AI agent, “How did Nate use the prize money when he won his third video game tournament?”

  • MRAgent first extracts fine-grained starting cues from the prompt, such as “Nate,” “video game tournament,” and “win.”

  • The agent maps these initial cues to the memory graph and looks at the available associative Tags connected to them. The agent sees tags like “Tournament Victory” and “Tournament Participation.” Since it is only concerned with what the person did after they won the championship, MRAgent drops the tournament participation tag and pursues the victory tag.

  • The agent retrieves the episodic content linked to the chosen Cue-Tag pair: three distinct memory episodes in which Nate won a tournament.

  • MRAgent looks at the three memories, decides one of them in particular is relevant to the query, and discards the other two.

  • With this information, it updates its cues and starts another round of discovery and pruning. From the new episodic memory it has retrieved, the agent adds “tournament earnings” to its cues and uses that to traverse new tags and home in on new memories. It repeats this process until it gathers enough information to answer the query, which could be something like “Nate saved the money.”
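
The steps above can be sketched as a loop. This is a simplified illustration under stated assumptions, not MRAgent's actual implementation: the `MEMORY` table, the `reconstruct` function, and its callback parameters are hypothetical, and plain Python callables stand in for the decisions the backbone LLM would make.

```python
from typing import Callable

# Hypothetical toy memory: (cue, tag) -> episodes. Contents are illustrative only.
MEMORY: dict[tuple[str, str], list[str]] = {
    ("Nate", "Tournament Victory"): [
        "Nate won his first tournament.",
        "Nate won his third tournament and received tournament earnings.",
    ],
    ("Nate", "Tournament Participation"): ["Nate signed up for a tournament."],
    ("tournament earnings", "Spending"): ["Nate saved the money."],
}


def reconstruct(
    cues: list[str],
    keep_tag: Callable[[str], bool],
    keep_episode: Callable[[str], bool],
    extract_cues: Callable[[str], list[str]],
    is_sufficient: Callable[[list[str]], bool],
    max_rounds: int = 5,
) -> list[str]:
    """Alternate tag pruning, episode selection, and cue updates until done."""
    evidence: list[str] = []
    frontier = list(cues)
    seen: set[str] = set()
    for _ in range(max_rounds):
        next_frontier: list[str] = []
        for cue in frontier:
            if cue in seen:
                continue
            seen.add(cue)
            for (c, tag), episodes in MEMORY.items():
                if c != cue or not keep_tag(tag):
                    continue  # prune at the tag level, before reading content
                for episode in episodes:
                    if keep_episode(episode):
                        evidence.append(episode)
                        next_frontier.extend(extract_cues(episode))
        if is_sufficient(evidence) or not next_frontier:
            break  # stop once the evidence answers the query or nothing is left
        frontier = next_frontier
    return evidence


evidence = reconstruct(
    cues=["Nate"],
    keep_tag=lambda tag: tag != "Tournament Participation",
    keep_episode=lambda ep: "first" not in ep,
    extract_cues=lambda ep: ["tournament earnings"] if "tournament earnings" in ep else [],
    is_sufficient=lambda ev: any("saved" in e for e in ev),
)
print(evidence)
```

Here the first round keeps one victory episode and derives the new cue “tournament earnings”; the second round follows that cue to the answer and the loop stops.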

MRAgent performance on industry benchmarks

MRAgent operates alongside several other frameworks addressing agentic memory building. Alternatives include A-MEM, a graph-based agentic memory framework, and MemoryOS, a hierarchical memory framework. Other persistent memory frameworks include LangMem and Mem0.

The researchers tested MRAgent on the LoCoMo and LongMemEval industry benchmarks. These test the abilities of agents to resolve queries on long-horizon tasks and conversations across dozens of sessions and hundreds of turns of dialogue. The backbone models used were Gemini 2.5 Flash and Claude Sonnet 4.5. The system was tested against standard RAG, A-MEM, MemoryOS, LangMem, and Mem0. 

MRAgent consistently outperformed every baseline across both models and all question types by a significant margin. 

However, for enterprise developers, the most critical metric is often computational cost. In the LongMemEval tests, MRAgent cut prompt token consumption to just 118k per sample. By comparison, A-MEM consumed 632k tokens, and LangMem burned through 3.26 million tokens per query. MRAgent also ran in roughly half the time of A-MEM: 586 seconds versus 1,122 seconds.
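
To put those figures in proportion, the short calculation below derives the ratios they imply. It uses only the numbers quoted above; the variable names are chosen here for illustration.

```python
# Prompt tokens reported for the LongMemEval tests.
mragent_tokens = 118_000
a_mem_tokens = 632_000
langmem_tokens = 3_260_000

# Runtime in seconds for A-MEM and MRAgent.
a_mem_seconds = 1_122
mragent_seconds = 586

langmem_ratio = langmem_tokens / mragent_tokens  # about 27.6x MRAgent's tokens
a_mem_ratio = a_mem_tokens / mragent_tokens      # about 5.4x MRAgent's tokens
runtime_ratio = mragent_seconds / a_mem_seconds  # about 0.52 of A-MEM's runtime

print(f"LangMem vs MRAgent tokens: {langmem_ratio:.1f}x")
print(f"A-MEM vs MRAgent tokens:   {a_mem_ratio:.1f}x")
print(f"MRAgent runtime as a share of A-MEM: {runtime_ratio:.0%}")
```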

MRAgent performance (source: arXiv)

What makes MRAgent efficient in practice is its on-demand behavior. Evaluating tags and pruning irrelevant paths before retrieval saves money and context space. The system also evaluates its accumulated context on its own and decides when to stop searching, which avoids redundant exploration.

Implementation and development catch

While MRAgent is highly effective, the Cue-Tag-Content structure needs to be prepared before the agent can query it. Developers must figure out how to architect the underlying memory database to enable the LLM to efficiently navigate associative items and prune irrelevant paths without exploding compute costs.

Fortunately, developers do not have to manually label or structure this data. The authors designed MRAgent with an automated distillation pipeline that uses LLMs to process raw interaction histories and automatically populate the memory graph. For a developer, the job is to implement and orchestrate this automated ingestion pipeline, rather than manually tag data.

In practice, this means setting up a background job or streaming pipeline that passes raw user interactions through prompt templates to extract the cues and tags before storing them in your graph database.
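
A minimal sketch of one ingestion step might look like the following. Everything here is an assumption for illustration: the prompt wording, the JSON record shape, the `ingest` function, and the `fake_llm` stub are hypothetical and do not reflect the authors' pipeline or any specific model API. A plain list stands in for the graph database.

```python
import json
from typing import Callable

# Hypothetical prompt template; doubled braces are literal braces for str.format.
PROMPT_TEMPLATE = (
    "Extract memory metadata from the interaction below.\n"
    'Reply with JSON: {{"cues": [...], "tag": "...", "content": "..."}}\n\n'
    "Interaction:\n{interaction}"
)

REQUIRED_KEYS = {"cues", "tag", "content"}


def ingest(interaction: str, call_llm: Callable[[str], str], store: list[dict]) -> dict:
    """Run one interaction through the template, validate the reply, and store it."""
    prompt = PROMPT_TEMPLATE.format(interaction=interaction)
    record = json.loads(call_llm(prompt))
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        raise ValueError(f"LLM reply is missing fields: {sorted(missing)}")
    store.append(record)
    return record


def fake_llm(prompt: str) -> str:
    """Stub standing in for a real model call; returns a fixed reply."""
    return json.dumps({
        "cues": ["Nate", "video game tournament"],
        "tag": "Tournament Victory",
        "content": "Nate won a tournament.",
    })


store: list[dict] = []
record = ingest("Nate: I just won the tournament!", fake_llm, store)
print(record["cues"], record["tag"])
```

Passing the model call in as a parameter keeps the step testable offline and lets the same function run inside a batch job or a streaming consumer.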

However, the authors emphasize that this is a lightweight construction phase and MRAgent intentionally keeps ingestion simple. 

The authors have released the code on GitHub.
