TradePoint.io

Definity embeds agents inside Spark pipelines to catch failures before they reach agentic AI systems

April 29, 2026
in AI & Technology

For most data engineering teams, managing pipeline reliability means waiting for an alert, manually tracing failures across distributed jobs and clusters, and fixing problems after they’ve already hit the business. Agentic AI needs the data to be there, clean and on time. A pipeline that fails silently or delivers stale data doesn’t just break a dashboard; it breaks the AI system that depends on it.

That gap is what Definity, a Chicago-based data pipeline operations startup, is targeting: it embeds agents directly inside the Spark or dbt driver so they can act during a pipeline run, not after it. One enterprise customer identified 33% of its optimization opportunities in the first week of deployment and cut troubleshooting and optimization effort by 70%, according to Definity. The company also claims customers are resolving complex Spark issues up to 10x faster.

“You need three big things for agentic data operations: full stack context that is real time and production aware. Control of the pipeline. And the ability to validate in a feedback loop. Without that, you can be outside looking in and read only,” Roy Daniel, CEO and co-founder of Definity, told VentureBeat in an exclusive interview.

The company on Wednesday announced that it has raised $12 million in Series A financing led by GreatPoint Ventures, with participation from Dynatrace and existing investors StageOne Ventures and Hyde Park Venture Partners.

Why existing pipeline monitoring breaks down at scale

Existing tools approach the problem from outside the execution layer — Datadog, which acquired data quality monitor Metaplane last year, Databricks system tables, and platforms like Unravel Data and Acceldata all read metrics after a job completes. Dynatrace has monitoring capabilities; it also participated in Definity’s Series A.

Definity’s differentiation is architectural. According to Daniel, because existing tools read metrics only after a job completes, by the time a platform monitoring tool surfaces a problem the pipeline has already run, and the failure, the wasted compute or the bad data is already downstream.

“It’s always after the fact,” Daniel said. “By the time you know something happened, it already happened.”

How Definity’s in-execution agents work

The core architectural difference is where the agent sits — inside the pipeline rather than watching from outside it.

Inline instrumentation. The Definity system installs a JVM agent directly inside the pipeline execution layer via a single line of code, running below the platform layer and pulling execution data directly from Spark.
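
For readers unfamiliar with JVM agents, the general attachment mechanism can be sketched as follows. This is only an illustration of the standard `-javaagent` JVM flag passed through spark-submit's `--driver-java-options` option; the article does not describe Definity's actual install step beyond "a single line of code", and the jar path, job script name and helper function here are hypothetical.

```python
# Sketch only: builds a spark-submit command that attaches a generic JVM agent
# to the Spark driver. The agent jar path and job name are hypothetical.
from typing import List


def spark_submit_with_agent(app: str, agent_jar: str) -> List[str]:
    """Return a spark-submit argv that loads `agent_jar` into the driver JVM."""
    return [
        "spark-submit",
        # --driver-java-options passes extra JVM flags to the driver;
        # -javaagent is the standard JVM flag for loading an agent jar.
        "--driver-java-options", f"-javaagent:{agent_jar}",
        app,
    ]


cmd = spark_submit_with_agent("etl_job.py", "/opt/agents/example-agent.jar")
print(" ".join(cmd))
```

The point of the sketch is that the agent is loaded by the driver JVM itself, so the job's own code does not need to change.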

Execution context during the run. The agent captures query execution behavior, memory pressure, data skew, shuffle patterns and infrastructure utilization as the pipeline runs. It also infers lineage between pipelines and tables dynamically — no predefined data catalog is required.
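
To make one of these signals concrete, data skew can be approximated by comparing the largest partition with the median partition. The sketch below is a generic heuristic, not Definity's method; the function names, the sample row counts and the threshold of 5 are assumptions chosen for illustration.

```python
# Generic skew heuristic (illustrative; not taken from Definity).
from statistics import median
from typing import Sequence


def skew_ratio(partition_rows: Sequence[int]) -> float:
    """Largest partition size divided by the median partition size."""
    if not partition_rows:
        raise ValueError("need at least one partition")
    mid = median(partition_rows)
    if mid == 0:
        # Median of zero: skewed only if some partition is non-empty.
        return float("inf") if max(partition_rows) > 0 else 1.0
    return max(partition_rows) / mid


def is_skewed(partition_rows: Sequence[int], threshold: float = 5.0) -> bool:
    """Flag the stage when the ratio reaches the (assumed) threshold."""
    return skew_ratio(partition_rows) >= threshold


balanced = [1000, 1100, 950, 1050]   # hypothetical row counts per partition
hot_key = [1000, 1100, 950, 48000]   # one partition holds a hot key
print(is_skewed(balanced), is_skewed(hot_key))  # prints: False True
```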

Intervention, not just observation. The agent can modify resource allocation mid-run, stop a job before bad data propagates or preempt a pipeline based on upstream data conditions. Daniel described one production deployment where the agent detected that an upstream job had been preempted and the input table it was supposed to write was stale — and stopped the downstream pipeline before it started, before bad data reached any dependent system.
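
The stale-input scenario Daniel describes amounts to a freshness gate that runs before the downstream job starts. A minimal sketch of that idea, assuming the last-write time of the input table is available as a timestamp; the function name, the two-hour limit and the sample timestamps are hypothetical and do not come from Definity.

```python
# Freshness gate sketch (illustrative; names and limits are assumptions).
from datetime import datetime, timedelta, timezone


def should_run_downstream(input_last_written: datetime,
                          now: datetime,
                          max_age: timedelta) -> bool:
    """Run only if the upstream table was written within `max_age`."""
    return (now - input_last_written) <= max_age


now = datetime(2026, 4, 29, 6, 0, tzinfo=timezone.utc)
fresh = datetime(2026, 4, 29, 5, 30, tzinfo=timezone.utc)   # written 30 min ago
stale = datetime(2026, 4, 28, 5, 30, tzinfo=timezone.utc)   # upstream job never reran
max_age = timedelta(hours=2)

for name, ts in [("fresh", fresh), ("stale", stale)]:
    action = "run" if should_run_downstream(ts, now, max_age) else "skip"
    print(name, action)
```

With these sample values the fresh input is allowed to run and the stale one is skipped, so no dependent system reads outdated data.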

What is and isn’t real time. Detection and prevention are real time. Root cause analysis and optimization recommendations run on demand when an engineer queries the assistant, with full execution context already assembled.

Overhead and data residency. The agent adds approximately one second of compute on an hour-long run. Only metadata transmits externally; full on-premises deployment is available for environments where no metadata can leave the perimeter.
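
For a sense of scale, taking the stated figures at face value (about one second of added compute on a 3,600-second run), the relative overhead works out to roughly:

```latex
\[
\text{overhead} \approx \frac{1\ \text{s}}{3600\ \text{s}} \approx 2.8 \times 10^{-4} \approx 0.03\%
\]
```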

What in-execution intelligence looks like in a production environment

One early user of the Definity platform is Nexxen, an ad tech platform that runs large-scale Spark pipelines on-premises for mission-critical advertising workloads.

Dennis Meyer, Director of Data Engineering at Nexxen, told VentureBeat that the core problem he was facing was not pipeline failures but the accumulating cost of inefficiency in an environment with no elastic cloud capacity to absorb waste.

“The main challenge wasn’t about pipelines breaking, but about managing an increasingly complex and large-scale environment,” Meyer said. “Because we operate on-prem, we don’t have the flexibility of instant elasticity, so inefficiencies have a direct cost impact.”

Existing monitoring tools gave Nexxen partial visibility but not enough to act on systematically. “We had existing monitoring tools in place, but needed full-stack visibility to understand workload behavior holistically and to systematically prioritize optimizations,” Meyer said.

Nexxen deployed Definity with no pipeline code changes. According to Meyer, the team identified 33% of its optimization opportunities within the first week, and engineering effort on troubleshooting and optimization dropped by 70%. The platform freed infrastructure capacity, allowing the team to support workload growth without additional hardware investment.

“The key shift was moving from reactive troubleshooting to proactive, continuous optimization,” Meyer said. “At scale, the biggest gap often isn’t tooling — it’s actionable visibility.”

What this means for enterprise data teams

For data engineering teams running production Spark environments, the shift from reactive monitoring to in-execution intelligence has architectural and organizational implications worth thinking through.

Pipeline ops is becoming an AI infrastructure problem. Data pipelines that previously supported analytics now carry AI workloads with direct business dependencies. Failures that were once an inconvenience are now blocking production AI delivery.

Troubleshooting time is a recoverable cost. According to Meyer, Nexxen cut engineering effort on troubleshooting and optimization by 70% after deploying Definity. For teams running lean, that time going back to the roadmap is the most direct near-term case for evaluating this category.
