Bio: Hamza Tahir is a software developer turned ML engineer. An indie hacker at heart, he loves ideating, implementing, and launching data-driven products. His previous projects include PicHance, Scrilys, BudgetML, and you-tldr. Based on his learnings from deploying ML in production for predictive maintenance use cases in his previous startup, he co-created ZenML, an open-source MLOps framework for creating production-grade ML pipelines on any infrastructure stack.
Question: From Early Projects to ZenML: Given your rich background in software development and ML engineering—from pioneering projects like BudgetML to co-founding ZenML and building production pipelines at maiot.io—how has your personal journey influenced your approach to creating an open-source ecosystem for production-ready AI?
My journey from early software development to co-founding ZenML has deeply shaped how I approach building open-source tools for AI production. Working on BudgetML taught me that accessibility in ML infrastructure is critical – not everyone has enterprise-level resources, yet everyone deserves access to robust tooling.
At my first startup maiot.io, I witnessed firsthand how fragmented the MLOps landscape was, with teams cobbling together solutions that often broke in production. This fragmentation creates real business pain points – for example, many enterprises struggle with lengthy time-to-market cycles for their ML models due to these exact challenges.
These experiences drove me to create ZenML with a focus on being production-first, not production-eventual. We built an ecosystem that brings structure to the chaos of managing models, ensuring that what works in your experimental environment transitions smoothly to production. Our approach has consistently helped organizations reduce deployment times and increase efficiency in their ML workflows.
The open-source approach wasn’t just a distribution strategy—it was foundational to our belief that MLOps should be democratized, allowing teams of all sizes to benefit from best practices developed across the industry. We’ve seen organizations from startups to enterprises accelerate their ML development cycles by 50-80% by adopting these standardized, production-first practices.
Question: From Lab to Launch: Could you share a pivotal moment or technical challenge that underscored the need for a robust MLOps framework in your transition from experimental models to production systems?
ZenML grew out of our experience working in predictive maintenance. We were essentially functioning as consultants, implementing solutions for various clients. When we started a little over four years ago, there were far fewer tools available, and those that existed lacked the maturity of today’s options.
We quickly discovered that different customers had vastly different needs—some wanted AWS, others preferred GCP. While Kubeflow was emerging as a solution that operated on top of Kubernetes, it wasn’t yet the robust MLOps framework that ZenML offers now.
The pivotal challenge was finding ourselves repeatedly writing custom glue code for each client implementation. This pattern of constantly developing similar but platform-specific solutions highlighted the clear need for a more unified approach. We initially built ZenML on top of TensorFlow’s TFX, but eventually removed that dependency to develop our own implementation that could better serve diverse production environments.
Question: Open-Source vs. Closed-Source in MLOps: While open-source solutions are celebrated for innovation, how do they compare with proprietary options in production AI workflows? Can you share how community contributions have enhanced ZenML’s capabilities in solving real MLOps challenges?
Proprietary MLOps solutions offer polished experiences but often lack adaptability. Their biggest drawback is the “black box” problem—when something breaks in production, teams are left waiting for vendor support. With open-source tools like ZenML, teams can inspect, debug, and extend the tooling themselves.
This transparency enables agility. Open-source frameworks incorporate innovations faster than quarterly releases from proprietary vendors. For LLMs, where best practices evolve weekly, this speed is invaluable.
The power of community-driven innovation is exemplified by one of our most transformative contributions—a developer who built the “Vertex” orchestrator integration for Google Cloud Platform. This wasn’t just another integration—it represented a completely new approach to orchestrating pipelines on GCP that opened up an entirely new market for us.
Prior to this contribution, our GCP users had limited options. The community member developed a comprehensive Vertex AI integration that enabled seamless orchestration on Google Cloud’s managed Vertex AI platform, giving GCP users a first-class alternative to running their own Kubernetes clusters.
Question: Integrating LLMs into Production: With the surge in generative AI and large language models, what are the key obstacles you’ve encountered in LLMOps, and how does ZenML help mitigate these challenges?
LLMOps presents unique challenges including prompt engineering management, complex evaluation metrics, escalating costs, and pipeline complexity.
ZenML helps by providing:
- Structured pipelines for LLM workflows, tracking all components from prompts to post-processing logic
- Integration with LLM-specific evaluation frameworks
- Caching mechanisms to control costs
- Lineage tracking for debugging complex LLM chains
Our approach bridges traditional MLOps and LLMOps, allowing teams to leverage established practices while addressing LLM-specific challenges. ZenML’s extensible architecture lets teams incorporate emerging LLMOps tools while maintaining reliability and governance.
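To make the pipeline structure concrete, here is a minimal sketch using ZenML’s `@step` and `@pipeline` decorators with caching enabled. Only the decorators and the `enable_cache` flag come from ZenML’s public API; the step names, prompt template, and `call_llm` stub are hypothetical illustrations:

```python
from zenml import pipeline, step


def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real provider call (e.g. an OpenAI or
    # Anthropic SDK); returns a canned string so the sketch stays runnable.
    return f"stub answer for: {prompt}"


@step(enable_cache=True)  # cached: unchanged inputs skip the (expensive) call
def generate_answers(prompt_template: str, questions: list[str]) -> list[str]:
    """Run the LLM over each question; the template is a tracked input."""
    return [call_llm(prompt_template.format(question=q)) for q in questions]


@step
def evaluate_answers(answers: list[str]) -> float:
    """Placeholder evaluation metric: fraction of non-empty answers."""
    return sum(len(a) > 0 for a in answers) / len(answers)


@pipeline
def llm_workflow(questions: list[str]):
    answers = generate_answers("Answer concisely: {question}", questions)
    evaluate_answers(answers)


if __name__ == "__main__":
    llm_workflow(questions=["What is MLOps?", "What is LLMOps?"])
```

Because the prompt template enters the pipeline as a tracked step input, it gets versioned alongside the outputs it produced, which is exactly the lineage needed for debugging LLM chains.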
Question: Streamlining MLOps Workflows: What best practices would you recommend for teams aiming to build secure, scalable ML pipelines using open-source tools, and how does ZenML facilitate this process?
For teams building ML pipelines with open-source tools, I recommend:
- Start with reproducibility through strict versioning
- Design for observability from day one
- Embrace modularity with interchangeable components
- Automate testing for data, models, and security
- Standardize environments through containerization
ZenML facilitates these practices with a Pythonic framework that enforces reproducibility, integrates with popular MLOps tools, supports modular pipeline steps, provides testing hooks, and enables seamless containerization.
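As one sketch of the containerization and caching points, ZenML’s `DockerSettings` lets a pipeline declare the image it runs in; the pinned requirement below is illustrative, not a recommendation:

```python
from zenml import pipeline, step
from zenml.config import DockerSettings

# Pin the runtime environment so local and remote runs share one image.
docker_settings = DockerSettings(
    requirements=["scikit-learn==1.4.2"],  # illustrative pin, not prescriptive
)


@step
def train() -> float:
    """Placeholder training step returning a dummy metric."""
    return 0.93


@pipeline(settings={"docker": docker_settings}, enable_cache=True)
def training_pipeline():
    train()


if __name__ == "__main__":
    training_pipeline()  # the same code runs locally or on a remote stack
```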
We’ve seen these principles transform organizations like Adeo Leroy Merlin. After implementing these best practices through ZenML, they reduced their ML development cycle by 80%, with their small team of data scientists now deploying new ML use cases from research to production in days rather than months, delivering tangible business value across multiple production models.
The key insight: MLOps isn’t a product you adopt, but a practice you implement. Our framework makes following best practices the path of least resistance while maintaining flexibility.
Question: Engineering Meets Data Science: Your career spans both software engineering and ML engineering—how has this dual expertise influenced your design of MLOps tools that cater to real-world production challenges?
My dual background has revealed a fundamental disconnect between data science and software engineering cultures. Data scientists prioritize experimentation and model performance, while software engineers focus on reliability and maintainability. This divide creates significant friction when deploying ML systems to production.
ZenML was designed specifically to bridge this gap by creating a unified framework where both disciplines can thrive. Our Python-first APIs provide the flexibility data scientists need while enforcing software engineering best practices like version control, modularity, and reproducibility. We’ve embedded these principles into the framework itself, making the right way the easy way.
This approach has proven particularly valuable for LLM projects, where the technical debt accumulated during prototyping can become crippling in production. By providing a common language and workflow for both researchers and engineers, we’ve helped organizations reduce their time-to-production while simultaneously improving system reliability and governance.
Question: MLOps vs. LLMOps: In your view, what distinct challenges do traditional MLOps face compared to LLMOps, and how should open-source frameworks evolve to address these differences?
Traditional MLOps focuses on feature engineering, model drift, and custom model training, while LLMOps deals with prompt engineering, context management, retrieval-augmented generation, subjective evaluation, and significantly higher inference costs.
Open-source frameworks need to evolve by providing:
- Consistent interfaces across both paradigms
- LLM-specific cost optimizations like caching and dynamic routing
- Support for both traditional and LLM-specific evaluation
- First-class prompt versioning and governance
ZenML addresses these needs by extending our pipeline framework for LLM workflows while maintaining compatibility with traditional infrastructure. The most successful teams don’t see MLOps and LLMOps as separate disciplines, but as points on a spectrum, using common infrastructure for both.
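As a rough, framework-agnostic sketch of the caching idea, a content-addressed cache keyed on the model name and prompt avoids paying twice for identical calls; `call_model` is a hypothetical stand-in for a provider SDK:

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path(".llm_cache")
CACHE_DIR.mkdir(exist_ok=True)


def call_model(model: str, prompt: str) -> str:
    # Hypothetical stand-in for a real provider SDK call.
    return f"[{model}] response to: {prompt}"


def cached_completion(model: str, prompt: str) -> str:
    """Return a cached response when this exact (model, prompt) pair was seen."""
    key = hashlib.sha256(json.dumps([model, prompt]).encode()).hexdigest()
    path = CACHE_DIR / f"{key}.txt"
    if path.exists():
        return path.read_text()           # cache hit: no API cost
    response = call_model(model, prompt)  # cache miss: pay once, then store
    path.write_text(response)
    return response
```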
Question: Security and Compliance in Production: With data privacy and security being critical, what measures does ZenML implement to ensure that production AI models are secure, especially when dealing with dynamic, data-intensive LLM operations?
ZenML implements robust security measures at every level:
- Granular pipeline-level access controls with role-based permissions
- Comprehensive artifact provenance tracking for complete auditability
- Secure handling of API keys and credentials through encrypted storage
- Data governance integrations for validation, compliance, and PII detection
- Containerization for deployment isolation and attack surface reduction
These measures enable teams to implement security by design, not as an afterthought. Our experience shows that embedding security into the workflow from the beginning dramatically reduces vulnerabilities compared to retrofitting security later. This proactive approach is particularly crucial for LLM applications, where complex data flows and potential prompt injection attacks create unique security challenges that traditional ML systems don’t face.
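To illustrate the credentials point, here is a hedged sketch of a step reading an API key from ZenML’s encrypted secret store instead of hard-coding it. The secret name and key are assumptions, and the secret itself would be registered beforehand (e.g. with `zenml secret create`):

```python
from zenml import step
from zenml.client import Client


@step
def query_llm(prompt: str) -> str:
    """Fetch the provider key from ZenML's encrypted secret store at runtime."""
    secret = Client().get_secret("llm_provider")  # assumed secret name
    api_key = secret.secret_values["api_key"]     # assumed key within the secret
    # ... pass api_key to the provider SDK; credentials never live in code.
    return "ok" if api_key else "missing key"
```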
Question: Future Trends in AI: What emerging trends for MLOps and LLMOps do you believe will redefine production workflows over the next few years, and how is ZenML positioning itself to lead these changes?
Agents and workflows represent a critical emerging trend in AI. Anthropic notably differentiated between the two approaches in their blog post on building effective agents, and ZenML is strategically focusing on workflows, primarily for reliability reasons.
While we may eventually reach a point where we can trust LLMs to autonomously generate plans and iteratively work toward goals, current production systems demand the deterministic reliability that well-defined workflows provide. We envision a future where workflows remain the backbone of production AI systems, with agents serving as carefully constrained components within a larger, more controlled process—combining the creativity of agents with the predictability of structured workflows.
The industry is witnessing unprecedented investment in LLMOps and LLM-driven projects, with organizations actively experimenting to establish best practices as models rapidly evolve. The definitive trend is the urgent need for systems that deliver both innovation and enterprise-grade reliability—precisely the intersection where ZenML is leveraging its years of battle-tested MLOps experience to create transformative solutions for our customers.
Question: Fostering Community Engagement: Open source thrives on collaboration—what initiatives or strategies have you found most effective in engaging the community around ZenML and encouraging contributions in MLOps and LLMOps?
We’ve implemented several high-impact community engagement initiatives that have yielded measurable results. Beyond actively soliciting and integrating open-source contributions for components and features, we hosted one of the first large-scale MLOps competitions in 2023, which attracted over 200 participants and generated dozens of innovative solutions to real-world MLOps challenges.
We’ve established multiple channels for technical collaboration, including an active Slack community, regular contributor meetings, and comprehensive documentation with clear contribution guidelines. Our community members regularly discuss implementation challenges, share production-tested solutions, and contribute to expanding the ecosystem through integrations and extensions. These strategic community initiatives have been instrumental in not only growing our user base substantially but also advancing the collective knowledge around MLOps and LLMOps best practices across the industry.
Question: Advice for Aspiring AI Engineers: Finally, what advice would you give to students and early-career professionals who are eager to dive into the world of open-source AI, MLOps and LLMOps, and what key skills should they focus on developing?
For those entering MLOps and LLMOps:
- Build complete systems, not just models—the challenges of production offer the most valuable learning
- Develop strong software engineering fundamentals
- Contribute to open-source projects to gain exposure to real-world problems
- Focus on data engineering—data quality issues cause more production failures than model problems
- Learn cloud infrastructure basics

Key skills to develop include Python proficiency, containerization, distributed systems concepts, and monitoring tools. For bridging roles, focus on communication skills and product thinking. Cultivate “systems thinking”—understanding component interactions is often more valuable than deep expertise in any single area. Remember that the field is evolving rapidly: being adaptable and committed to continuous learning is more important than mastering any particular tool or framework.
Question: How does ZenML’s approach to workflow orchestration differ from traditional ML pipelines when handling LLMs, and what specific challenges does it solve for teams implementing RAG or agent-based systems?
At ZenML, we believe workflow orchestration must be paired with robust evaluation systems—otherwise, teams are essentially flying blind. This is especially crucial for LLM workflows, where behavior can be much less predictable than that of traditional ML models.
Our approach emphasizes “eval-first development” as the cornerstone of effective LLM orchestration. This means evaluation runs as quality gates or as part of the outer development loop, incorporating user feedback and annotations to continually improve the system.
For RAG or agent-based systems specifically, this eval-first approach helps teams identify whether issues are coming from retrieval components, prompt engineering, or the foundation models themselves. ZenML’s orchestration framework makes it straightforward to implement these evaluation checkpoints throughout your workflow, giving teams confidence that their systems are performing as expected before reaching production.
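A minimal sketch of such an evaluation checkpoint as a ZenML step, assuming a hypothetical `score_against_references` metric and a threshold chosen purely for illustration:

```python
from zenml import step


def score_against_references(answers: list[str], references: list[str]) -> float:
    # Hypothetical metric: exact-match rate. Real systems might use
    # LLM-as-judge, semantic similarity, or task-specific checks instead.
    matches = sum(a.strip() == r.strip() for a, r in zip(answers, references))
    return matches / max(len(references), 1)


@step
def evaluation_gate(answers: list[str], references: list[str]) -> float:
    """Fail the pipeline run when quality drops below the agreed threshold."""
    score = score_against_references(answers, references)
    if score < 0.8:  # illustrative threshold, tuned per use case
        raise RuntimeError(f"Eval gate failed: score {score:.2f} < 0.80")
    return score
```

Raising inside the step halts the run before a degraded system reaches production, which is what makes the gate a gate rather than a passive report.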
Question: What patterns are you seeing emerge for successful hybrid systems that combine traditional ML models with LLMs, and how does ZenML support these architectures?
ZenML takes a deliberately unopinionated approach to architecture, allowing teams to implement patterns that work best for their specific use cases. Common hybrid patterns include RAG systems with custom-tuned embedding models and specialized language models for structured data extraction.
This hybrid approach—combining custom-trained models with foundation models—delivers superior results for domain-specific applications. ZenML supports these architectures by providing a consistent framework for orchestrating both traditional ML components and LLM components within a unified workflow.
Our platform enables teams to experiment with different hybrid architectures while maintaining governance and reproducibility across both paradigms, making the implementation and evaluation of these systems more manageable.
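To illustrate what such a hybrid workflow can look like under unified orchestration, here is a sketch where a classical routing step feeds an LLM extraction step in one pipeline; the routing heuristic and `extract_fields` helper are hypothetical stubs:

```python
from zenml import pipeline, step


def extract_fields(text: str) -> dict:
    # Hypothetical LLM-backed extraction; a stub keeps the sketch runnable.
    return {"summary": text[:40]}


@step
def route_documents(docs: list[str]) -> list[str]:
    """Classical ML routing (stubbed): keep only invoice-like documents."""
    return [d for d in docs if "invoice" in d.lower()]


@step
def llm_extract(invoices: list[str]) -> list[dict]:
    """LLM component: structured field extraction from the routed subset."""
    return [extract_fields(d) for d in invoices]


@pipeline
def hybrid_pipeline(docs: list[str]):
    invoices = route_documents(docs)
    llm_extract(invoices)
```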
Question: As organizations rush to implement LLM solutions, how does ZenML help teams maintain the right balance between experimentation speed and production governance?
ZenML handles best practices out of the box—tracking metadata, evaluations, and the code used to produce them without teams having to build this infrastructure themselves. This means governance doesn’t come at the expense of experimentation speed.
As your needs grow, ZenML grows with you. You might start with local orchestration during early experimentation phases, then seamlessly transition to cloud-based orchestrators and scheduled workflows as you move toward production—all without changing your core code.
Lineage tracking is a key feature that’s especially relevant given emerging regulations like the EU AI Act. ZenML captures the relationships between data, models, and outputs, creating an audit trail that satisfies governance requirements while still allowing teams to move quickly. This balance between flexibility and governance helps prevent organizations from ending up with “shadow AI” systems built outside official channels.
Question: What are the key integration challenges enterprises face when incorporating foundation models into existing systems, and how does ZenML’s workflow approach address these?
A key integration challenge for enterprises is tracking which foundation model (and which version) was used for specific evaluations or production outputs. This lineage and governance tracking is critical both for regulatory compliance and for debugging issues that arise in production.
ZenML addresses this by maintaining a clear lineage between model versions, prompts, inputs, and outputs across your entire workflow. This provides both technical and non-technical stakeholders with visibility into how foundation models are being used within enterprise systems.
Our workflow approach also helps teams manage environment consistency and version control as they move LLM applications from development to production. By containerizing workflows and tracking dependencies, ZenML reduces the “it works on my machine” problems that often plague complex integrations, ensuring that LLM applications behave consistently across environments.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.
