CyberTimes
CVE-2026-33626 · April 24, 2026 · CyberTimes Security Team

CVE-2026-33626: LMDeploy SSRF Flaw Exploited in 12 Hours — Attackers Stole AWS Cloud Credentials via AI Image Loader

TL;DR — 15 Second Read

  • CVE-2026-33626 is a Server-Side Request Forgery (SSRF) vulnerability in LMDeploy's vision-language load_image() function, which fetches arbitrary URLs without checking whether the destination is an internal or private IP address — allowing attackers to reach cloud metadata services, internal networks, and sensitive backend resources.
  • Sysdig's honeypot detected the first exploitation attempt just 12 hours and 31 minutes after the advisory was published on GitHub — with no proof-of-concept code publicly available at the time — confirming attackers built a working exploit directly from reading the advisory description.
  • The attacker conducted a methodical 8-minute session across 10 requests, probing the AWS Instance Metadata Service (IMDS) for cloud credentials, scanning Redis, MySQL, and internal HTTP admin interfaces, and confirming external DNS exfiltration capability — a complete cloud credential theft and internal reconnaissance operation.
  • This is part of an accelerating pattern: AI infrastructure tools including LMDeploy, Marimo, and inference servers are being weaponized within hours of disclosure regardless of install base size, as detailed advisory language effectively provides LLM-ready exploit blueprints to attackers.
Severity: 🟠 HIGH
CVSS Score: 7.5/10
Exploited: Yes — active
Fix Status: Patch available
lmdeploy, ssrf, ai-security, cloud-security, aws-credentials, server-side-request-forgery, network-security, cybersecurity, information-security, llm-security

A high-severity Server-Side Request Forgery vulnerability in LMDeploy — one of the most widely used open-source toolkits for deploying and serving large language models — was actively exploited in the wild within 12 hours and 31 minutes of its public disclosure. CVE-2026-33626, affecting all versions of LMDeploy 0.12.0 and earlier with vision language support, allows an attacker to weaponize the AI framework's image loading function as a generic HTTP request primitive to steal cloud credentials, probe internal networks, and reach backend services that were never intended to be accessible.

The speed of exploitation — under 13 hours, with no proof-of-concept code available — is the defining characteristic of this incident. The attacker needed only to read the GitHub Security Advisory, which accurately named the vulnerable file (lmdeploy/vl/utils.py), the vulnerable function (load_image()), the root cause, and included sample vulnerable code. For any modern LLM, that advisory is a complete exploit specification. For the cybersecurity and information security communities building and deploying AI infrastructure, this incident is a direct warning: AI inference servers are now a primary target class for cloud credential theft, and the window between disclosure and exploitation has effectively collapsed.


Affected products

  • LMDeploy — all versions 0.12.0 and earlier with vision language (VLM) support enabled. Affected models include deployments using internlm-xcomposer2, OpenGVLab/InternVL2-8B, and any other vision-language model served through LMDeploy's inference stack.

How to Fix

Step-by-step remediation

The primary remediation is upgrading to the patched LMDeploy release. Monitor the project's GitHub releases page for the fix and apply it as an emergency update. After upgrading, verify the patch by reviewing the load_image() function in lmdeploy/vl/utils.py — it should include IP address validation logic that checks destination addresses against private and reserved ranges before making any outbound request.
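As a point of comparison when reviewing the patched function, the kind of destination check it should contain can be sketched as follows. This is a minimal illustration, not LMDeploy's actual patch; a robust fix must also guard against DNS rebinding between the check and the fetch.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_url_destination_safe(url: str) -> bool:
    """Return False if the URL resolves to any private, loopback,
    link-local, or reserved address (e.g. AWS IMDS at 169.254.169.254)."""
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        # Resolve every address the hostname maps to; an attacker can
        # point a DNS name at an internal IP, so check all results.
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        ip = ipaddress.ip_address(info[4][0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True
```

If the upgraded load_image() performs no check along these lines before its outbound request, treat the deployment as still exposed.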

For immediate network-level mitigation while patching is in progress: implement an egress firewall rule on your LMDeploy host or container that blocks outbound requests to all RFC 1918 private address ranges and the IMDS address 169.254.169.254. On Linux systems, this can be implemented with iptables rules blocking OUTPUT traffic to these destinations from the process user running LMDeploy. On cloud platforms, security groups and VPC network ACLs can enforce the same restriction at the infrastructure level.
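The iptables approach can be sketched by generating the rule set in Python. The service account name "lmdeploy" is a placeholder; adapt the user and destinations to your deployment, and review the printed commands before running them as root.

```python
# Sketch: emit iptables rules blocking egress from the LMDeploy service
# account to RFC 1918 ranges and the AWS IMDS address.
BLOCKED_DESTINATIONS = [
    "169.254.169.254/32",  # AWS Instance Metadata Service
    "10.0.0.0/8",          # RFC 1918
    "172.16.0.0/12",       # RFC 1918
    "192.168.0.0/16",      # RFC 1918
]

def egress_block_rules(service_user: str = "lmdeploy") -> list[str]:
    # --uid-owner scopes the block to the LMDeploy process user so the
    # rest of the host can still reach internal services it needs.
    return [
        f"iptables -A OUTPUT -d {dest} -m owner --uid-owner {service_user} -j REJECT"
        for dest in BLOCKED_DESTINATIONS
    ]

for rule in egress_block_rules():
    print(rule)
```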

For AWS deployments specifically: switch from IMDSv1 to IMDSv2 immediately. IMDSv2 requires a PUT request to obtain a session token before metadata can be read — simple HTTP GET requests to 169.254.169.254 will be rejected. This single configuration change blocks the most common SSRF-to-IMDS attack pattern. Enable this via the AWS console under EC2 instance settings → Modify instance metadata options → IMDSv2: Required.
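The IMDSv2 handshake can be illustrated by constructing the two requests involved. This sketch only builds the request objects (it sends nothing); the point is that the plain GET used by SSRF attacks fails under IMDSv2 because it lacks the token obtained in step 1.

```python
import urllib.request

IMDS = "http://169.254.169.254"

def build_imdsv2_requests():
    # Step 1: IMDSv2 requires a PUT to mint a session token first.
    token_req = urllib.request.Request(
        f"{IMDS}/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "21600"},
    )
    # Step 2: the token must accompany every metadata read. A bare GET
    # without this header (the classic SSRF primitive) is rejected.
    metadata_req = urllib.request.Request(
        f"{IMDS}/latest/meta-data/iam/security-credentials/",
        headers={"X-aws-ec2-metadata-token": "<token from step 1>"},
    )
    return token_req, metadata_req
```

Because load_image() issues simple GETs, an attacker abusing the SSRF cannot complete step 1's PUT-with-header dance, which is exactly why enforcing IMDSv2 neutralises this attack path.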

Additionally, apply the principle of least privilege to the IAM role attached to your LMDeploy instance. If the inference server only needs to read from specific S3 model storage buckets, its IAM role should only have s3:GetObject on those specific buckets — nothing more. Over-permissioned IAM roles are what turn an SSRF into a full cloud account compromise.
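A suitably scoped policy document might look like the following sketch; the bucket name and path are placeholders for your actual model storage.

```python
import json

# Sketch of a least-privilege IAM policy for an inference server that
# only needs to read model artifacts. "example-model-bucket" is a
# placeholder; substitute your real bucket and prefix.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": ["arn:aws:s3:::example-model-bucket/models/*"],
        }
    ],
}

print(json.dumps(policy, indent=2))
```

With a role like this, even a successful IMDS credential theft yields read access to one bucket prefix rather than the keys to the account.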


What happened

LMDeploy's vision-language module includes a load_image() function in lmdeploy/vl/utils.py that fetches images from arbitrary URLs provided as inputs to vision-language models. The function was implemented without validation of whether the destination IP address is internal, private, or restricted — a fundamental SSRF mitigation that blocks requests to addresses like 169.254.169.254 (AWS IMDS), 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16.
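Reduced to its essence, the vulnerable shape is a fetch driven entirely by the caller's URL with no destination check. This is an illustrative reconstruction of the pattern, not LMDeploy's actual code.

```python
import urllib.request

def load_image_vulnerable(image_url: str) -> bytes:
    # SSRF-prone pattern: the URL is fetched exactly as supplied.
    # Nothing stops image_url from being http://169.254.169.254/...
    # or an internal host, so the server becomes a request proxy
    # for whoever controls the input.
    with urllib.request.urlopen(image_url) as resp:
        return resp.read()
```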

By providing a crafted URL as an image input to a vision-language model served by LMDeploy, an attacker instructs the server to make an outbound HTTP request to any destination — including internal cloud metadata services and private network resources. The server then returns the response content, effectively acting as a proxy for the attacker's requests. The attacker never needs direct access to these internal resources; the LMDeploy server fetches them on the attacker's behalf and returns the results.
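Concretely, in an OpenAI-compatible chat API of the kind LMDeploy exposes, the malicious input is just an ordinary multimodal request whose image URL points at the IMDS. The following is a hypothetical illustration of that shape, using one of the model names observed in the honeypot session.

```python
import json

# Hypothetical illustration: a normal-looking vision-language chat
# request whose "image_url" targets the AWS IMDS instead of an image.
malicious_request = {
    "model": "OpenGVLab/InternVL2-8B",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url",
             "image_url": {"url": "http://169.254.169.254/latest/meta-data/iam/"}},
        ],
    }],
}

print(json.dumps(malicious_request, indent=2))
```

Nothing about this payload is malformed, which is why signature-based filtering struggles here: the only anomaly is where the URL points.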

The Sysdig honeypot analysis reveals the attacker's methodology in precise detail. Over an 8-minute session beginning at 03:35 UTC on April 22, the attacker sent 10 distinct requests rotating between two different vision-language models — internlm-xcomposer2 and OpenGVLab/InternVL2-8B — likely to avoid triggering rate limiting or anomaly detection tied to a single model endpoint. The three-phase attack sequence targeted the AWS IMDS to retrieve instance credentials and IAM role information, tested external DNS exfiltration via a callback to requestrepo[.]com to confirm the SSRF could reach arbitrary external hosts, and then port-scanned the loopback interface and internal services including Redis and MySQL.

Real-World Impact

The AWS Instance Metadata Service is the primary credential store for EC2 instances and containerised workloads running on AWS. A successful IMDS request from a vulnerable LMDeploy instance returns the IAM role credentials attached to that instance — access key ID, secret access key, and session token — with whatever permissions that IAM role carries. In typical ML deployment environments, model inference servers are granted broad AWS permissions to access S3 buckets for model storage, DynamoDB for inference logging, and other services. A single successful IMDS credential theft via this SSRF can give an attacker full access to an organisation's entire AWS account scope permitted to that role.

The internal network reconnaissance phase of this attack is equally significant for network security teams. Redis and MySQL instances in AI deployment stacks commonly store inference caches, conversation histories, user session data, and application state. An SSRF-capable attacker who can reach these services from the LMDeploy process can read, write, or delete that data — without ever appearing in external network logs, because all requests originate from the legitimate LMDeploy server on the internal network.

Sysdig's broader observation is the most alarming for the AI infrastructure security community: this attack pattern — critical vulnerabilities in inference servers being weaponized within hours of advisory publication regardless of install base size — has been observed repeatedly over the past six months. CVE-2026-33626 joins Marimo (CVE-2026-39987, exploited in under 10 hours) as direct evidence that AI infrastructure tools are being treated by threat actors as a priority target class equivalent to traditional web servers and network devices.


🛡️ Prevention Tips

Never allow AI inference servers, model gateways, or agent orchestration tools to have unrestricted network egress. These tools process user-supplied or model-generated inputs that may contain crafted URLs, prompt injections, or other content designed to abuse their network access. Apply the same firewall rules to AI inference servers that you would apply to any untrusted application processing external input.

Treat detailed security advisories as exploitation timelines, not grace periods. The GHSA advisory for CVE-2026-33626 included the vulnerable file name, function name, root cause, and sample code — everything needed to build a working exploit. When an advisory is that specific, assume exploitation attempts will begin within hours of publication and prioritise accordingly.

Enforce IMDSv2 on all AWS EC2 instances and apply strict IAM role scoping to any instance running AI workloads. The credential theft potential of SSRF vulnerabilities in AI infrastructure is only as large as the permissions attached to the compromised instance's IAM role. Least-privilege IAM is the most effective control for limiting blast radius when SSRF attacks succeed.

Subscribe to security advisories for every open-source AI framework your team deploys in production. LMDeploy, vLLM, Ollama, LiteLLM, and similar inference tools are now targeted by threat actors with the same urgency as web application frameworks. Integrate these repositories into your dependency monitoring pipeline so advisory notifications arrive immediately rather than through third-party news coverage.


FAQs

What is a Server-Side Request Forgery (SSRF) vulnerability and why is it dangerous in AI tools?

SSRF occurs when an application fetches remote resources based on user-supplied input without validating the destination. In AI tools like LMDeploy that process URLs as image inputs, the server makes HTTP requests to attacker-controlled destinations — including internal cloud metadata services, databases, and admin interfaces that are normally unreachable from the internet. AI inference servers are particularly dangerous SSRF targets because they often run with broad cloud permissions and internal network access required for model loading and logging.


How do I check if my LMDeploy deployment was exploited before I could patch?

Review your application logs for image URL inputs containing 169.254.169.254, 127.0.0.1, or requestrepo.com, particularly in the period since April 22, 2026 at 03:35 UTC. Note that CloudTrail does not record IMDS reads themselves (the traffic is link-local and never leaves the host), so the LMDeploy request logs and host-level network monitoring are where the SSRF requests will appear. If you find any suspicious IMDS access, treat the instance's credentials as compromised: rotate the IAM role's credentials immediately and review CloudTrail for any subsequent AWS API calls made with those credentials.
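Along those lines, a small triage helper for scanning application logs can be sketched as follows; the indicator list is taken from this incident, and the log format is whatever your deployment emits.

```python
import re

# Indicators drawn from this incident: the IMDS address, loopback
# probing, and the attacker's DNS-callback domain.
INDICATORS = re.compile(r"169\.254\.169\.254|127\.0\.0\.1|requestrepo\.com")

def suspicious_log_lines(lines: list[str]) -> list[str]:
    """Return log lines containing any known indicator of this attack."""
    return [line for line in lines if INDICATORS.search(line)]
```

Feed it your inference server's access log lines and manually review every hit; a clean result does not prove absence of compromise, but any hit warrants credential rotation.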


Does this vulnerability affect vLLM, Ollama, or other LLM inference frameworks?

CVE-2026-33626 is specific to LMDeploy's load_image() function in its vision-language module. However, SSRF vulnerabilities in image URL processing are a common weakness pattern in any framework that fetches external URLs as part of multimodal input processing. If you use other inference frameworks with vision-language or image URL handling capabilities, audit their URL fetching code for equivalent missing validation — the pattern is the same even if the CVE is different.


Last updated: April 24, 2026