August 6, 2024

Why we invested in Merly and the future of Machine Programming

Caitlin Dullanty
Venture @ IBM

IBM Ventures recently announced a seed investment in the code quality startup Merly.

This investment comes at a time when software quality is under a microscope after the biggest IT failure in history. While we may not definitively know the origin of the CrowdStrike bug, we do know that it illustrates software fragility and underscores the importance of quality.

The investment announcement also comes at a time when the share of AI-generated code is ballooning. The AI unlock on developer productivity is meaningful in the short term, but what are the consequences in the long term? Even in the pre-LLM era, the majority of engineer time was spent on understanding code that was not self-composed. This dynamic is only exacerbated as AI-generated code becomes more prevalent, intensifying the interpretability (and potentially security) challenge.

Enter: Merly Mentor. Mentor is not an LLM for code—it’s an advanced AI system that can reason over the entire lifetime of a repository. No context window limits. No stochastic noise. Mentor’s purpose-built system is already performing inference on over a billion lines of code per week and doing so on only a handful of nodes.

Code understanding at scale -- Mentor goes where engineer eyes are not

LLMs are amazing generative AI tools—emphasis on generative. They work great for code gen (see here for IBM’s latest code models), but code gen is just one part of the software development lifecycle (SDLC).

What LLMs don’t do is understand or reason. They guess, based on things they’ve seen before. And every time they guess, they must be told exactly what to guess about (via context or a prompt). This means that the developer must point the system to an issue and ask it how to address the problem, after which the LLM gets the answer right (the new code works) or gets it wrong (the new code doesn’t work).

Mentor doesn’t guess and it doesn’t require developer guidance. It is a fully learned deterministic system trained on over a trillion lines of code to understand semantics. Not only does this allow Mentor to evaluate code at scale, meaning over the lifetime of a code repository, it also means you get the same output every time.

Mentor is also not bound to executable (i.e., compilable) code, as it knows that just because code works doesn’t mean it works correctly. Mentor sees around the corner for developers and engineering leaders, watching for anomalies, bugs, and unusual contributor behavior, so they’re not exposed to blind spots that could rear their heads, either through major security vulnerabilities or a slow drip of tech debt in legacy codebases. (See how Mentor could have retroactively spotted the XZ backdoor in Linux here.)

Mentor solves a really hard (and, as the CrowdStrike disaster reminded us, a really important) problem in the SDLC: code quality. After scoring the codebase and identifying issues, Mentor can call an LLM to generate remediation suggestions. But that is a small part of the Mentor value, and one we at IBM are excited to collaborate on with Merly.

Merly is the team to solve code quality -- Dr. Gottschlich leads the field

CEO, Chief Scientist, and Merly founder Dr. Justin Gottschlich is often noted as a creator of what we call Machine Programming. Machine Programming, broadly termed, is the field of automated software development. Not only did Dr. Gottschlich start the Machine Programming team at Intel Labs, but he also designed and teaches the graduate-level computer science course by the same name at Stanford University. Mentor’s novel AI architecture is not the product of a quick invention; it’s underpinned by Justin and team’s combined 50 years of large-scale software and AI system research and engineering experience. Plainly put: Merly is the team to watch in this space.

Expect vertically integrated AI apps to create differentiated value

One of our favorite things about Mentor is that it is custom-built for its use case. Justin and team understand the developer persona deeply and have designed Mentor’s technological fabric specifically to address real user needs. Instead of slapping a UI on top of an existing model, Mentor pairs left-brain thinking (analytical, computational) with right-brain thinking (generative, creative). In general, we think it is possible that a large portion of the enduring value in AI applications will accrue around vertically integrated solutions like Mentor.

Merly and IBM -- Excited to partner on the future of Machine Programming

Mentor is already being used by thousands of early adopters, including those at IBM, Red Hat, and the Cloud Native Computing Foundation (part of the Linux Foundation). We are beyond excited to partner with Justin and team on the future of Machine Programming and quality software development!

-----

Originally posted here on LinkedIn on July 30, 2024.