Authorization bugs are some of the most dangerous vulnerabilities in software. If an attacker finds a gap, they can access data they shouldn't, take destructive actions, or cross tenancy boundaries.

They're also among the hardest to find. Unlike injection bugs, there's no pattern to grep for. Authorization bugs are bugs of malformed logic: everything can look reasonable and still be wrong, because the failure mode is that a check didn't run on some path, or a check that did run is insufficient.

And unlike authentication, authorization has far less standardization. Authentication has well-worn protocols and mature libraries, along with full SaaS solutions like Clerk or WorkOS. Authorization, on the other hand, is very application-specific: roles, ownership, object hierarchies, and edge cases vary across products. Most teams roll their own.

This makes static automated detection unreliable. The usual source-to-sink taint model doesn't map cleanly to missing checks. Black-box scanners would need to discover the right users, roles, and state transitions, then generate specific request sequences that trigger a gap: a combinatorial problem that scales poorly.

This post walks through an authorization bypass that Tachyon found in the open-source MLflow tracking server (CVE-2025-14297), with details on how it surfaced a bug that traditional tooling would miss. This vulnerability affects self-hosted MLflow OSS deployments using the basic-auth mode; it does not apply to Databricks-managed MLflow.

MLflow and the Authorization Invariant

MLflow is an open-source platform (built by Databricks) for managing the machine learning lifecycle. It's used as a system of record for ML experimentation workflows. Teams track experiments, create runs, log parameters, and share trained models and eval outputs. They can also use it for inference, serving their models in production.
In a self-hosted deployment, MLflow typically sits at the center of an organization's ML workflow. Multiple users and teams interact with the same tracking server. CI pipelines and evaluation jobs write results back, and production systems may pull artifacts from MLflow and load them dynamically.

In this system, the key security invariant is straightforward: every read or write of a protected resource must be authorized at the object level. The system must authenticate the user, understand their identity, and then verify whether that user is allowed to perform that action on that specific experiment, run, or artifact.

Inferring Auth Invariants

As an aside, one would expect this to be a simple invariant to infer. One would be wrong. Many widely used open-source projects with clear multi-tenant usage ship without any built-in authorization and consider it outside their domain. For example, Prometheus has historically taken the stance that authentication and authorization are out of scope, leaving operators to secure it externally. This comes up often in community discussions. Similar assumptions appear elsewhere: Jaeger's UI does not provide native authentication and points users to reverse proxies instead, and Apache Spark is explicit in its documentation that its Web UIs do not include built-in auth filters, expecting operators to supply their own if needed. This means that CVE hunters, scanners, and anyone analyzing OSS projects have to do quite a bit of meta work to determine whether a bypass is an actual in-scope vulnerability. This process of inferring scope is one of the hardest parts of building a reliable scanner.

But the simplicity of the invariant belies deep complexity in implementation. MLflow has over fifty API endpoints across three different protocols: gRPC, HTTP/REST (via Flask), and GraphQL. These were each added to the project at different times, by different engineers, in the service of different features.
Each of those endpoints needs authorization enforcement; a single missing check is a security incident. That's where authorization bugs tend to hide: in the contextual and temporal gaps that emerge naturally as systems evolve.

Basic Auth, What Went Wrong, and Minimal Reproduction

MLflow's experimental basic-auth mode (enabled via mlflow server --app-name=basic-auth or mlflow ui --app-name=basic-auth) is meant for multi-user deployments where resource-level access controls matter. In this configuration, requests are authenticated via HTTP Basic Auth, and MLflow enforces a permissions model over experiments and related resources.

How the authorization layer is structured

The key implementation detail is that basic-auth's authorization is driven by an internal "validator" mechanism. A Flask before_request hook runs early in request handling:

1. Authenticate the incoming request (establish the user identity).
2. Attempt to look up an authorization validator for the request being made.
3. If a validator is found, perform per-object authorization checks (e.g., "can this user read artifacts for this...
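The hook steps described above can be modeled in a few lines of plain Python. This is a minimal sketch under stated assumptions, not MLflow's code: the `Request` type, the `USERS`, `READ_GRANTS`, and `VALIDATORS` tables, the route `/experiments/get`, and every function name are hypothetical. It mirrors the Flask convention that a before_request hook returning a non-None value short-circuits the request. The source does not say here what happens when no validator is found; the sketch denies by default, which is its own choice and not a description of MLflow's behavior.

```python
# Illustrative model of an authenticate -> look up validator -> check flow.
# All names, routes, and data below are hypothetical, not MLflow's API.
import base64
import hmac
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Request:
    method: str
    path: str
    authorization: Optional[str]  # raw Authorization header, if any
    resource_id: str              # object the request targets


USERS = {"alice": "pw-a", "bob": "pw-b"}   # toy credential store
READ_GRANTS = {("alice", "exp-1")}         # (user, experiment) read grants


def basic_header(username: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return f"Basic {token}"


def authenticate(request: Request) -> Optional[str]:
    """Step 1: return the username if the Basic credentials are valid."""
    scheme, _, encoded = (request.authorization or "").partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return None
    username, _, password = decoded.partition(":")
    expected = USERS.get(username)
    if expected is None:
        return None
    ok = hmac.compare_digest(expected.encode(), password.encode())
    return username if ok else None


def can_read_experiment(user: str, request: Request) -> bool:
    """Per-object check: may this user read this specific experiment?"""
    return (user, request.resource_id) in READ_GRANTS


Validator = Callable[[str, Request], bool]
VALIDATORS: Dict[Tuple[str, str], Validator] = {
    ("GET", "/experiments/get"): can_read_experiment,
}


def before_request(request: Request) -> Optional[Tuple[int, str]]:
    """Return None to let the request proceed, or (status, message) to stop it."""
    user = authenticate(request)
    if user is None:
        return 401, "authentication required"
    # Step 2: look up the validator registered for this route.
    validator = VALIDATORS.get((request.method, request.path))
    if validator is None:
        # Sketch's assumption: a route with no validator is denied, not allowed.
        return 403, "no authorization rule for this route"
    # Step 3: per-object authorization check.
    if not validator(user, request):
        return 403, "permission denied"
    return None
```

In this model, an endpoint that never gets an entry in `VALIDATORS` is the "check didn't run on some path" case from earlier in the post, and the handling of a missing validator decides whether that omission fails open or closed.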
Tachyon discovered an authorization bypass (CVE-2025-14297) in self-hosted MLflow tracking servers using basic-auth mode, where flawed logic allows unauthorized access to protected resources. The vulnerability affects MLflow OSS deployments; Databricks-managed MLflow is not impacted. The article details the bug's discovery but does not provide specific affected version ranges, a fixed version number, a CVSS score, or a workaround.