Unpatched RAGFlow Vulnerability Allows Post-Auth RCE

Bottom Line

A currently unpatched vulnerability in the most recent version of RAGFlow (0.24.0) allows low-privilege authenticated users to run arbitrary code. Only RAGFlow instances using Infinity for chunk storage are vulnerable. We have submitted a PR and expect that the issue will be patched soon.

Impacted Software

RAGFlow 0.24.0 (current version as of April 8th, 2026)

Video Walkthrough

[Embedded video]

RAGFlow Background

RAGFlow is a wildly popular project for Retrieval-Augmented Generation. It gives LLMs a structured library of documents they can refer to when responding to prompts. As of April 8th, 2026, the project had 77.5k stars on GitHub and was widely adopted across many companies.

[Figure: RAGFlow has over 77k stars on GitHub.]

Most users configure the service to listen on an internal network, but at least 1,918 instances are directly accessible on the public internet according to Shodan.

[Figure: At least 1,918 RAGFlow instances are exposed on the public internet.]

Disclosure Process

We created a GitHub security report (GHSA-vw46-rrp3-c99v) on March 3rd, 2026, and attempted to follow up with the project maintainers several times via email without success.

[Figure: Our initial bug report, filed March 3rd, 2026.]

Given how easily discoverable the flaw is, and in keeping with our outbound disclosure policy, after a month we decided that the most effective way to get the issue fixed was to submit a patch ourselves (https://github.com/infiniflow/ragflow/pull/14091). Unfortunately, creating this public PR means that any attackers monitoring the project are now aware of the vulnerability. We hope that this blog gets defenders up to speed as well, so that they can take appropriate countermeasures.

Releasing this post before remediation was not a decision we took lightly.
Given how rapidly vulnerability discovery is accelerating because of LLM-powered research flows, we believe that it will become more and more common for known, reported vulnerabilities to escape the attention of project maintainers. In these cases, it's better to make the issue public as responsibly as possible than to let attackers quietly discover and exploit it while legitimate users remain unaware that they're running vulnerable software.

The Flaw

Original Sin

While we were researching RAGFlow, the function _rank_feature_scores() initially caught our attention. It's invoked during the rerank phase of document retrieval and converts a value from a database search into a Python dict using eval(), like this:

    # rag/nlp/search.py
    def _rank_feature_scores(self, query_rfea, search_res):
        # ...
        for t, sc in eval(search_res.field[i].get(TAG_FLD, "{}")).items():

eval() will evaluate any Python expression, including function calls. If the value of TAG_FLD is a typical dictionary literal ({"foo": "bar"}) it works as expected, but if there were a way to corrupt its value in the datastore, we'd have an easy RCE.

Corrupting tag_feas

Anatomy of Retrieval

The property we were interested in is accessed during chunk retrieval through public API endpoints like /api/v1/retrieval, which take a question and return relevant documents. Retrieval starts with a query to the configured datastore to find a broad set of documents that may be relevant. This initial query can be, for example, a vector search or a full-text search. Depending on configuration, the engine then sometimes "reranks" the results of the initial search; that is, it performs potentially more expensive computations on them to better score relevance.

The vulnerable code is in this reranking process. This means that we needed to trace the application's data flow to find a way to get malicious data into the TAG_FLD property, and then craft a search that would access this malicious data during reranking.
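To make the eval() concern from the "Original Sin" section concrete, here is a minimal, harmless sketch of the difference between eval() and the standard library's ast.literal_eval() when parsing a stored dict string. The function names and sample strings are hypothetical and are not taken from RAGFlow; this illustrates one common replacement pattern, and nothing here is meant to describe what the submitted PR actually does.

```python
import ast


def parse_tag_features_unsafe(raw: str) -> dict:
    # Same pattern as the excerpt: eval() evaluates ANY Python expression,
    # so a stored string containing a function call gets executed.
    return eval(raw)


def parse_tag_features_safe(raw: str) -> dict:
    # ast.literal_eval() accepts only Python literals (dicts, lists,
    # strings, numbers, ...) and raises ValueError for anything else,
    # such as function calls or attribute access.
    value = ast.literal_eval(raw)
    if not isinstance(value, dict):
        raise ValueError("expected a dict literal")
    return value


# Hypothetical stored values, for illustration only.
BENIGN = '{"tag1": 0.1, "tag2": 0.3}'
# Not a literal: evaluating it calls dict() and len(). A real attacker
# would substitute something far less friendly.
NOT_A_LITERAL = 'dict(executed=len("abc"))'

if __name__ == "__main__":
    print(parse_tag_features_unsafe(BENIGN))
    print(parse_tag_features_unsafe(NOT_A_LITERAL))  # the calls run
    print(parse_tag_features_safe(BENIGN))
    try:
        parse_tag_features_safe(NOT_A_LITERAL)
    except ValueError as exc:
        print("rejected:", exc)
```

The safe variant fails closed: anything that is not a plain literal is rejected before any code runs, which is the property the eval() call in the excerpt lacks.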
Data Flow: Datastore To Eval()

We started by tracing the data flow, looking for a public-facing vector to corrupt the data, and trying to understand any sanitization or validation we'd have to dodge.

It turns out .get(TAG_FLD, "{}") reads the "tag_feas" property of a document chunk (where "feas" is short for "features"). This field is supposed to be an object that captures how relevant predefined tags are to the chunk, e.g.:

    {"tag1": 0.1, "tag2": 0.3}

When the retrieval API is invoked, if Infinity is the backend, RAGFlow searches for chunks like this:

    # rag/nlp/search.py, Dealer.search():
    # Fields to retrieve from chunks as part of search -- includes TAG_FLD, which is our target tag_feas property
    src = req.get("fields", [..., PAGERANK_FLD, TAG_FLD, "row_id()"])

    # rag/nlp/search.py, Dealer.retrieval():
    # Actual search invocation -- includes field list above (req)
    sres = await self.search(req, [index_name(tid) for tid in tenant_ids], kb_ids, embd_mdl, highlight, rank_feature=rank_feature)

    # Infinity-specific processing of tag_feas column in search result
    # rag/utils/infinity_conn.py:
    elif re.search(r"_feas$", k):
        res2[column...
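The last excerpt selects columns by a name-suffix test, re.search(r"_feas$", k). As a small sketch of what that test does and does not match, the snippet below applies the same pattern to a few column names. Only tag_feas comes from the text above; the other names and the helper function are hypothetical, and what RAGFlow does with a matching column is elided in the excerpt and is not reconstructed here.

```python
import re

# Same pattern as the excerpt: matches names that END in "_feas".
FEAS_SUFFIX = re.compile(r"_feas$")


def is_feas_column(name: str) -> bool:
    """Return True when the column name ends with the _feas suffix."""
    return FEAS_SUFFIX.search(name) is not None


# "tag_feas" is the field discussed in the post; the rest are
# hypothetical names used only to show the matching behaviour.
columns = ["tag_feas", "tag_feas_old", "content", "other_feas"]
matching = [name for name in columns if is_feas_column(name)]

if __name__ == "__main__":
    print(matching)
```

Because the test keys on the column name alone, any value stored under a matching column takes this code path regardless of its content.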
A currently unpatched post-authentication remote code execution vulnerability exists in RAGFlow version 0.24.0. It affects only instances using Infinity for chunk storage, where low-privilege authenticated users can exploit an unsafe `eval()` call on user-controlled data. No official patch is available at this time, but a public pull request has been submitted to address the flaw. IT administrators should immediately assess exposure by checking whether their RAGFlow instances are internet-accessible, and consider restricting access or monitoring for exploitation attempts until an official fix is released.