
Liquid AI Ships LFM2.5-230M with llama.cpp, MLX, vLLM, SGLang, and ONNX Support for On-Device Inference

June 28, 2026
in AI & Technology

Liquid AI has shipped LFM2.5-230M, the company’s smallest model to date. The release targets a specific job: running agentic tasks on phones, robots, and automation devices. Both the base and instruction-tuned checkpoints are open-weight on Hugging Face.

The pitch is narrow on purpose. This is not a general reasoning model. It is built for data extraction and tool use on edge hardware.

TL;DR

  • Liquid AI’s LFM2.5-230M is its smallest model yet: 230M params, open-weight, built on LFM2.
  • Runs on-device at 213 tok/s on a Galaxy S25 Ultra and 42 tok/s on a Raspberry Pi 5.
  • Beats larger models (Qwen3.5-0.8B, Gemma 3 1B) on instruction following and data extraction.
  • Tuned for tool use and extraction; not for math, code generation, or creative writing.
  • Day-one support across llama.cpp, MLX, vLLM, SGLang, and ONNX, with a 293–375 MB footprint.

What is LFM2.5-230M?

LFM2.5-230M is a 230-million-parameter, text-only model. It is built on the LFM2 architecture. The model has 14 layers total. Eight are double-gated LIV convolution blocks. The remaining six are grouped-query attention (GQA) blocks. The hybrid layout targets fast CPU inference.

The context length is 32,768 tokens. The vocabulary size is 65,536. The knowledge cutoff is mid-2024. It supports ten languages, including English, Chinese, Arabic, and Japanese.

Liquid AI ships two checkpoints. LFM2.5-230M-Base is the pre-trained model for fine-tuning. LFM2.5-230M is the general-purpose instruction-tuned version. The license is lfm1.0.

Training and Post-Training

The model was pre-trained on 19 trillion tokens. That total includes a 32K context extension phase. The post-training recipe then runs in three stages.

First comes supervised fine-tuning with distillation from the larger LFM2.5-350M. Second is direct preference optimization (DPO). Third is multi-domain reinforcement learning. This preserves flexibility for downstream specialization.

The distillation step is what keeps a 230M model competitive with larger checkpoints. It inherits behavior from the bigger LFM2.5-350M on targeted tasks.

Benchmarks

The Liquid AI team evaluated LFM2.5-230M across ten benchmarks. They span knowledge, instruction following, data extraction, and tool use.

Instruction following is where the model stands out against larger models. On IFEval, LFM2.5-230M scores 71.71. That beats Qwen3.5-0.8B (59.94) and Gemma 3 1B IT (63.49). On IFBench it scores 38.40, ahead of both. On CaseReportBench, a clinical data-extraction test, it scores 22.51.

Model                   | Params | IFEval | IFBench | CaseReportBench | BFCLv4 | MMLU-Pro
LFM2.5-230M             | 230M   | 71.71  | 38.40   | 22.51           | 21.03  | 20.25
LFM2.5-350M             | 350M   | 76.96  | 40.69   | 32.45           | 21.86  | 20.01
Granite 4.0-H-350M      | 350M   | 61.27  | 17.22   | 12.44           | 13.28  | 13.14
Qwen3.5-0.8B (Instruct) | 800M   | 59.94  | 22.87   | 13.83           | 18.70  | 37.42
Gemma 3 1B IT           | 1B     | 63.49  | 20.33   | 2.28            | 7.17   | 14.04

Against the other model families in the table, LFM2.5-230M leads on instruction following and data extraction; only its larger sibling, LFM2.5-350M, scores higher. It trails on broad knowledge: MMLU-Pro is 20.25, behind Qwen3.5-0.8B’s 37.42. It is also weak on some agentic tool use. On τ²-Bench Telecom it scores just 5.26.

Liquid AI is direct about the limits. It does not recommend the model for reasoning-heavy workloads. That means advanced math, code generation, and creative writing.

Use Cases With Examples

The model fits two jobs well.

  • The first is large-scale data extraction pipelines. Picture a pipeline parsing 100,000 clinical reports into structured fields. A 4-bit build with a 293–375 MB memory footprint runs that on commodity CPUs. You extract locally, with no per-token API bill.
  • The second job is lightweight on-device agentic workloads. Think a home automation hub that turns speech into tool calls. Or a phone assistant that routes a request to the right function.
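For the extraction pipeline, a back-of-the-envelope estimate gives a feel for the scale. The sketch below plugs the on-device throughput figures quoted in the TL;DR into a simple tokens-divided-by-speed formula. The 200 output tokens per report is an assumed figure for illustration, not something Liquid AI reports, and the phone and Raspberry Pi numbers only stand in for whatever CPU you would actually use. It also ignores prompt processing time, so treat the result as a rough lower bound.

```python
def generation_hours(num_docs: int, tokens_per_doc: int, tokens_per_second: float) -> float:
    """Hours needed to generate the output tokens for a batch of documents."""
    total_tokens = num_docs * tokens_per_doc
    return total_tokens / tokens_per_second / 3600


REPORTS = 100_000          # pipeline size from the example above
TOKENS_PER_REPORT = 200    # ASSUMPTION: structured output length per report

# Throughput figures quoted in the article (tokens per second).
DEVICES = {"Galaxy S25 Ultra": 213, "Raspberry Pi 5": 42}

for device, tok_per_s in DEVICES.items():
    hours = generation_hours(REPORTS, TOKENS_PER_REPORT, tok_per_s)
    print(f"{device}: about {hours:.1f} hours of generation on a single device")
```

Because the work is independent per report, the total divides roughly linearly across however many devices or CPU workers you run in parallel.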

As an early signal, Liquid AI deployed the model on a Unitree G1 humanoid robot. It ran entirely on the robot’s onboard NVIDIA Jetson Orin. There the model acted as a skill-selection layer. It turned one natural-language instruction into a sequence of tool calls. Those calls invoked low-level skills from NVIDIA’s SONIC framework.

LFM2.5 supports function calling in four steps. You define tools as JSON in the system prompt. The model writes a Pythonic function call between special tokens. You execute the call and return the result. The model then writes a plain-text answer.

By default the call is a Python list. It sits between the <|tool_call_start|> and <|tool_call_end|> tokens. Here is the documented pattern, with the tool JSON abbreviated:

<|im_start|>system
List of tools: [{"name": "get_candidate_status",
  "parameters": {"candidate_id": {"type": "string"}}}]<|im_end|>
<|im_start|>user
What is the current status of candidate ID 12345?<|im_end|>
<|im_start|>assistant
<|tool_call_start|>[get_candidate_status(candidate_id="12345")]<|tool_call_end|>Checking the current status of candidate ID 12345.<|im_end|>

You can also force JSON-formatted calls through the system prompt.
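Steps two and three of that loop, reading the call and executing it, can be sketched in a few lines of Python. This is a minimal illustration, not Liquid AI's reference parser. It assumes the default Pythonic format shown above with keyword arguments only, and the get_candidate_status implementation and its return value are hypothetical stand-ins for a real tool.

```python
import ast
import re

# Matches the payload between the documented special tokens.
TOOL_CALL_RE = re.compile(r"<\|tool_call_start\|>(.*?)<\|tool_call_end\|>", re.DOTALL)


def parse_tool_calls(text: str) -> list[tuple[str, dict]]:
    """Extract (function_name, kwargs) pairs from Pythonic tool-call lists."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        # Parse without executing anything: the payload is only inspected as syntax.
        tree = ast.parse(payload.strip(), mode="eval")
        if not isinstance(tree.body, ast.List):
            raise ValueError("expected a Python list of calls")
        for node in tree.body.elts:
            if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
                raise ValueError("expected plain function calls")
            if node.args:
                raise ValueError("this sketch handles keyword arguments only")
            # literal_eval accepts only literals, so no arbitrary code can run here.
            kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
            calls.append((node.func.id, kwargs))
    return calls


def get_candidate_status(candidate_id: str) -> dict:
    """HYPOTHETICAL tool: returns made-up data for illustration."""
    return {"candidate_id": candidate_id, "status": "interview scheduled"}


TOOLS = {"get_candidate_status": get_candidate_status}


def run_tool_calls(text: str, tools: dict) -> list:
    """Dispatch each parsed call to the matching registered function."""
    return [tools[name](**kwargs) for name, kwargs in parse_tool_calls(text)]


SAMPLE = (
    '<|tool_call_start|>[get_candidate_status(candidate_id="12345")]'
    "<|tool_call_end|>Checking the current status of candidate ID 12345."
)
print(parse_tool_calls(SAMPLE))
print(run_tool_calls(SAMPLE, TOOLS))
```

Parsing with ast rather than eval keeps model output from executing directly. Only functions you have registered in the dispatch table can be called.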

Running It: A Minimal Example

The model works with Transformers 5.0.0 and up. The recommended generation settings are temperature 0.1, top_k 50, and repetition_penalty 1.05. Note the do_sample=True flag: without it, generation is greedy and the temperature and top_k settings are ignored.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the instruction-tuned checkpoint and its tokenizer from Hugging Face.
model_id = "LiquidAI/LFM2.5-230M"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Format the conversation with the model's chat template and tokenize it.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is C. elegans?"}],
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Sample with the recommended settings.
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.1,
    top_k=50,
    repetition_penalty=1.05,
    max_new_tokens=512,
)

# Decode only the newly generated tokens, skipping the prompt.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True))

Liquid AI also publishes fine-tuning recipes. They cover SFT, DPO, and GRPO with LoRA, via Unsloth and TRL. Each ships as a Colab notebook.
