AI Flaws in Cloud Services and LLM Frameworks: New Attack Vectors
The digital operational environment continues to expand, integrating AI capabilities into critical infrastructure and enterprise workflows. This integration drives innovation but also introduces new cybersecurity threats. Security researchers recently disclosed concerning vulnerabilities and attack methods targeting AI code execution environments, observability platforms, and large language model (LLM) serving frameworks. These incidents show organizations must reassess their security posture, particularly concerning emerging technologies. This analysis details these recent findings and their implications, providing insights for technical and non-technical stakeholders.
Emerging AI Flaws in Cloud Services and LLM Frameworks
Rapid adoption of AI services in cloud ecosystems, along with the proliferation of LLM development and deployment frameworks, presents new attack surfaces. Threat actors adapt their techniques to exploit the characteristics of these environments, including their permissions structures and integration points. Understanding these new vectors is key for maintaining secure operations.
DNS-Based Data Exfiltration from AI Sandboxes
Cybersecurity researchers identified a method for exfiltrating sensitive data from AI code execution environments using domain name system (DNS) queries. BeyondTrust recently published findings detailing how Amazon Bedrock AgentCore Code Interpreter's sandbox mode permits outbound DNS queries, which an adversary can use to establish interactive shells and bypass network isolation. The issue has not been assigned a CVE identifier but carries a CVSS score of 7.5 out of 10.0.
Amazon Bedrock AgentCore Code Interpreter, launched in August 2025, is a fully managed service that enables AI agents to execute code within isolated sandbox environments. Its design intends to prevent agentic workloads from accessing external systems. However, the service permits DNS queries despite a "no network access" configuration, which poses a risk. Kinnaird McQuade, chief security architect at BeyondTrust, states this behavior can allow threat actors to establish command-and-control (C2) channels and conduct data exfiltration over DNS in certain scenarios, circumventing expected network isolation controls.
In experimental attack scenarios, an adversary could abuse this functionality to set up a bidirectional communication channel using DNS queries and responses. This allows the attacker to obtain an interactive reverse shell, execute commands, and exfiltrate sensitive information via DNS queries if the associated IAM role possesses permissions to access AWS resources (such as S3 buckets storing relevant data). The DNS channel can also deliver additional payloads to the Code Interpreter, prompting it to poll a DNS C2 server for commands stored in DNS A records, execute them, and return results via DNS subdomain queries.
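Exfiltration of this kind typically encodes stolen data into long, high-entropy subdomain labels. A minimal detection heuristic along those lines might look like the sketch below; the length and entropy thresholds are illustrative assumptions, not tuned production values:

```python
import math
from collections import Counter

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character in the string."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def looks_like_dns_tunnel(qname: str,
                          max_label_len: int = 40,
                          entropy_threshold: float = 3.5) -> bool:
    """Flag queries whose leftmost label is unusually long or high-entropy,
    a common signature of data encoded into DNS subdomains."""
    label = qname.split(".")[0]
    if len(label) > max_label_len:
        return True
    return len(label) >= 16 and shannon_entropy(label) > entropy_threshold

# An ordinary lookup passes; a 50-character base32-looking label is flagged.
print(looks_like_dns_tunnel("api.example.com"))   # -> False
print(looks_like_dns_tunnel(
    "mfrggzdfmztwq2lknnwg23tpobyxe43uoqwwc3tenfxgiidthe.c2.example.net"))  # -> True
```

A rule of this shape would run against DNS query logs (for AWS, Route 53 Resolver query logging) rather than inline.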
IAM role assignment is a key factor in this vulnerability's severity. While the Code Interpreter requires an IAM role to access AWS resources, a simple oversight can lead to an overprivileged role assignment. Such an assignment grants broad permissions to access sensitive data, expanding the potential impact of a compromise. BeyondTrust states this research shows how DNS resolution can undermine the network isolation guarantees of sandboxed code interpreters. Attackers could exfiltrate sensitive data from AWS resources accessible through the Code Interpreter's IAM role, potentially causing downtime, data breaches, or infrastructure deletion.
Amazon, following responsible disclosure in September 2025, determined this behavior is intended functionality rather than a defect. The company recommends customers use VPC mode instead of sandbox mode for complete network isolation. Amazon also suggests implementing a DNS firewall to filter outbound DNS traffic. Jason Soroko, a senior fellow at Sectigo, advises administrators to inventory all active AgentCore Code Interpreter instances and immediately migrate those handling critical data from Sandbox mode to VPC mode. Operating within a VPC provides the infrastructure for full network isolation, allowing teams to implement strict security groups, network ACLs, and Route53 Resolver DNS Firewalls for monitoring and blocking unauthorized DNS resolution. Security teams must also rigorously audit the IAM roles attached to these interpreters, strictly enforcing the principle of least privilege to restrict the blast radius of any potential compromise.
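To illustrate the least-privilege scoping described above, a Code Interpreter role's policy might grant only the single action and bucket the workload actually needs. The bucket name below is a hypothetical placeholder:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::example-agent-workspace/*"
    }
  ]
}
```

A role scoped this narrowly limits what a DNS-based exfiltration channel can actually reach, even if the channel itself is established.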
AI Observability and Development Platform Vulnerabilities
The security of platforms designed for AI observability and development is critical because these systems often have deep access to internal data sources and third-party services. Miggo Security recently disclosed a high-severity security flaw in LangSmith (CVE-2026-25750, CVSS score: 8.5), which exposed users to potential token theft and account takeover. This flaw affected both self-hosted and cloud deployments and was addressed in LangSmith version 0.12.71, released in December 2025.
The vulnerability stems from a URL parameter injection, specifically a lack of validation on the baseUrl parameter. This allows an attacker to steal a signed-in user's bearer token, user ID, and workspace ID, transmitting these to a server under their control. This is typically achieved through social engineering, such as deceiving a victim into clicking a specially crafted link. For example:
- Cloud deployments: smith.langchain[.]com/studio/?baseUrl=
- Self-hosted deployments: /studio/?baseUrl=
Successful exploitation could grant an attacker unauthorized access to the AI's trace history, internal SQL queries, CRM customer records, or proprietary source code by reviewing tool calls. Miggo researchers Liad Eliyahu and Eliana Vuijsje noted a logged-in LangSmith user could be compromised simply by accessing an attacker-controlled site or clicking a malicious link. This vulnerability shows AI observability platforms function as critical infrastructure. As these tools prioritize developer flexibility, they can bypass security guardrails, a risk compounded by their deep access to internal data and external services.
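The generic defense against this class of flaw is strict allow-listing of any client-supplied base URL before it is used. The sketch below illustrates the idea under assumed allow-list contents; it is not LangSmith's actual patch:

```python
from urllib.parse import urlparse

# Hypothetical allow-list; a real deployment would derive this from config.
ALLOWED_BASE_HOSTS = {"localhost", "127.0.0.1"}

def validate_base_url(base_url: str) -> str:
    """Allow-list check on a client-supplied base URL. Rejecting any
    scheme or host outside the expected set defeats this class of
    URL-parameter injection."""
    parsed = urlparse(base_url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    if parsed.hostname not in ALLOWED_BASE_HOSTS:
        raise ValueError(f"host not allow-listed: {parsed.hostname!r}")
    return base_url

validate_base_url("http://localhost:8000/api")      # accepted
# validate_base_url("https://evil.example/steal")   # raises ValueError
```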
Unsafe Deserialization Flaws in LLM Serving Frameworks
Security vulnerabilities have also been identified in SGLang, an open-source framework for serving large language models and multimodal AI models. Successful exploitation of these flaws could trigger unsafe pickle deserialization, leading to remote code execution. Discovered by Orca security researcher Igor Stepansky, these vulnerabilities remain unpatched.
The identified flaws include:
- CVE-2026-3059 (CVSS score: 9.8): An unauthenticated remote code execution vulnerability through the ZeroMQ (ZMQ) broker, which deserializes untrusted data using pickle.loads() without authentication. This affects SGLang's multimodal generation module.
- CVE-2026-3060 (CVSS score: 9.8): An unauthenticated remote code execution vulnerability through the disaggregation module, which deserializes untrusted data using pickle.loads() without authentication. This affects SGLang's encoder parallel disaggregation system.
- CVE-2026-3989 (CVSS score: 7.8): The use of the insecure pickle.load() function without validation in SGLang's replay_request_dump.py, which can be exploited by providing a malicious pickle file.
Stepansky stated the first two vulnerabilities allow unauthenticated remote code execution against any SGLang deployment exposing its multimodal generation or disaggregation features to the network. The third involves insecure deserialization within a crash dump replay utility. The CERT Coordination Center (CERT/CC) issued a coordinated advisory, confirming SGLang's susceptibility to CVE-2026-3059 when the multimodal generation system is enabled and to CVE-2026-3060 when the encoder parallel disaggregation system is enabled. If either condition is met and an attacker knows the TCP port on which the ZMQ broker is listening and can send requests to the server, they can exploit the vulnerability by sending a malicious pickle file to the broker, which will then deserialize it.
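The root problem is that pickle resolves and invokes arbitrary globals during deserialization. Where pickle cannot be replaced outright (e.g. with JSON), one mitigation is an unpickler that refuses all global lookups, limiting payloads to plain data. This is a generic sketch, not SGLang's fix; the class and function names are illustrative:

```python
import io
import pickle

class RestrictedUnpickler(pickle.Unpickler):
    """Refuse to resolve ANY global during unpickling, so a crafted
    payload cannot reach a callable (os.system, subprocess.Popen, ...)
    via __reduce__. Only plain data types survive."""
    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def safe_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain data round-trips normally:
safe_loads(pickle.dumps({"tokens": [1, 2, 3]}))

# A payload that smuggles a callable is rejected (print stands in for
# the os.system-style gadget a real exploit would carry):
class Exploit:
    def __reduce__(self):
        return (print, ("pwned",))
# safe_loads(pickle.dumps(Exploit()))  # raises pickle.UnpicklingError
```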
SGLang users should restrict access to service interfaces and ensure they are not exposed to untrusted networks. Implement adequate network segmentation and access controls to prevent unauthorized interaction with ZeroMQ endpoints. Although there is no evidence of these vulnerabilities being exploited, monitoring for the following indicators is crucial:
- unexpected inbound TCP connections to the ZeroMQ broker port
- unexpected child processes spawned by the SGLang Python process
- file creation in unusual locations by the SGLang process
- outbound connections from the SGLang process to unexpected destinations
Critical Security Implications for Enterprise AI
According to cybersecurity research, AI-specific vulnerabilities represent a fundamental shift in the threat landscape. The convergence of cloud computing, artificial intelligence, and traditional IT infrastructure creates complex attack surfaces that require specialized security approaches.
Research shows that organizations implementing AI services without proper security frameworks face significantly higher risks of data breaches and system compromises. The key challenge lies in understanding that traditional security models may not adequately protect AI workloads.
Attack Vector Analysis
The primary attack vectors targeting AI cloud services include:
- DNS Tunneling and Exfiltration: Exploiting DNS queries to bypass network isolation
- Parameter Injection: Manipulating URL parameters to steal authentication tokens
- Unsafe Deserialization: Leveraging pickle deserialization for remote code execution
- Privilege Escalation: Exploiting overprivileged IAM roles to access sensitive resources
Broader Cyber Threat Landscape
While AI-specific vulnerabilities represent a frontier of cybersecurity challenges, the wider threat environment continues to evolve with persistent and new attack campaigns targeting various sectors. These encompass sophisticated supply chain attacks, destructive ransomware operations, and state-sponsored cyber warfare activity.
Persistent Threats: Ransomware and Supply Chain Compromises
The Medusa ransomware gang continues to pose a significant threat, recently claiming responsibility for attacks on the University of Mississippi Medical Center (UMMC) and New Jersey's Passaic County. The attack on UMMC, a critical healthcare provider, caused a nine-day outage, forcing analog operations and rescheduling essential services. The group demanded an $800,000 ransom, threatening to leak stolen data. This shows the disruptive impact of ransomware on critical infrastructure and the need for real-time ransomware intelligence to anticipate and mitigate such threats. Effective strategies to protect against ransomware are essential. The continued activity of groups like Medusa shows the need for proactive breach detection and incident response capabilities to minimize operational disruption and data compromise.
The GlassWorm supply-chain campaign is a coordinated attack targeting hundreds of packages, repositories, and extensions across GitHub, npm, and VSCode/OpenVSX; researchers identified 433 compromised components. The campaign uses "invisible" Unicode characters to conceal malicious code designed to harvest cryptocurrency wallet data and developer credentials. Attackers first compromise GitHub accounts through account takeover to force-push malicious commits, then publish obfuscated packages and extensions. The use of the Solana blockchain for command-and-control (C2) activity indicates a sophisticated operational methodology. Such incidents highlight the importance of supply-chain risk monitoring to detect and prevent the introduction of malicious components into software development pipelines, and they show that organizations need comprehensive supply-chain security strategies.
Geopolitical Cyber Operations: Warfare Convergence
The recent Iran War demonstrated a convergence of kinetic, cyber, electronic, and psychological warfare. Electronic warfare and cyber activity surged after US-Israeli strikes on Iran, involving both Iranian-aligned and pro-Western hacktivist groups. These activities targeted critical infrastructure, military logistics, and symbolic websites. Cyber operations included reconnaissance for kinetic targeting and battle-damage assessment. The conflict also featured the most extensive use of GPS spoofing and jamming recorded, severely impacting maritime, aviation, and military operations across the Persian Gulf and adjacent airspaces.
Key cyber activities observed include Distributed Denial of Service (DDoS) attacks, website defacements, data theft, and data-wiping operations. Iranian-aligned groups often conduct these operations, sometimes using third-party underground services and compromised IoT devices to amplify attack volumes. The pro-Iranian hacking group Handala extended targeting to oil export infrastructure, Amazon data centers, and the US-based medical technology company Stryker Corporation. Such activities show the role of cyber threat intelligence platforms in understanding the motivations, capabilities, and targets of state-sponsored actors and hacktivist groups. Underground forum intelligence and Telegram threat monitoring are vital for tracking these groups, their recruitment efforts, propaganda, and intentions. In addition, the extensive use of misinformation, often employing AI-generated content, shows the challenges of discerning factual information during conflicts. Organizations also face risks of data leakage and reputational damage, making brand leak alerting a necessary component of modern security postures. Dark web monitoring services provide visibility into these clandestine activities.
Essential Security Measures for AI Infrastructure
Managing this complex threat environment requires a proactive and multi-layered security approach. For technical and non-technical stakeholders, specific actions can strengthen defenses against these varied attack vectors:
For AI and Cloud Environments:
- Principle of Least Privilege: Strictly enforce the principle of least privilege for all IAM roles associated with AI services and code interpreters. Permissions should be scoped narrowly to only what is essential for specific tasks.
- Network Isolation: For cloud AI services, prioritize VPC mode over sandbox environments for full network isolation. Implement DNS firewalls, security groups, and network ACLs to monitor and control outbound DNS traffic and other network communications.
- Input Validation and Secure Deserialization: Developers working with LLM frameworks must implement strong input validation and secure deserialization practices to prevent remote code execution vulnerabilities. Regularly update and patch frameworks to address known flaws.
- API Monitoring: Deploy application-layer SOAP API monitoring, particularly for seldom-used functions like GetScratchCodesRequest (2FA codes) or CreateAppSpecificPasswordRequest (app-specific passwords) in webmail environments, as these can indicate compromise.
General Cybersecurity Practices:
- Complete Threat Intelligence: Implement a strong cyber threat intelligence platform to gain insights into emerging threats, adversary tactics, techniques, and procedures (TTPs). This includes monitoring real-time ransomware intelligence and tracking supply chain vulnerabilities.
- Supply Chain Security: Institute strong supply-chain risk monitoring for all software components, libraries, and frameworks. This involves reviewing Git commit histories for anomalies and scanning for known malware indicators.
- Breach Detection and Response: Improve breach detection capabilities with advanced monitoring for unusual network activity, unexpected processes, and suspicious file creations. Develop and regularly test incident response plans.
- Security Audits and Testing: Conduct regular penetration testing and red team operations to identify and remediate vulnerabilities across your infrastructure, including AI deployments, cloud services, and traditional systems.
- Dark Web and Underground Forum Monitoring: Use dark web monitoring services and underground forum intelligence to track discussions, stolen credentials, and planned attacks related to your organization or industry. This includes Telegram threat monitoring for hacktivist groups.
- Employee Education: Educate staff about social engineering tactics, the risks of clicking malicious links, and the potential for executable payloads within HTML emails, even without attachments.
Advanced AI Security Configuration
Securing AI infrastructure requires specialized knowledge of cloud security configurations and AI-specific attack vectors. The key is implementing defense-in-depth strategies that account for the unique characteristics of AI workloads.
Cloud AI Service Hardening
When deploying AI services in cloud environments, security teams must configure multiple layers of protection. AWS Bedrock, Azure OpenAI, and Google Cloud AI Platform each have specific security controls that must be properly configured:
- Identity and Access Management (IAM): Configure fine-grained permissions for AI service access
- Network Segmentation: Implement VPC isolation and private endpoints for AI services
- Data Encryption: Enable encryption in transit and at rest for all AI data flows
- Audit Logging: Enable comprehensive logging for AI service interactions and model queries
ML Pipeline Security
Machine learning pipelines present unique attack surfaces that require specialized monitoring and protection. According to security research, the most critical vulnerabilities occur during model training, deployment, and inference phases.
Security teams should implement:
- Model Integrity Verification: Cryptographic signing of ML models to prevent tampering
- Training Data Validation: Scanning training datasets for malicious samples or data poisoning attempts
- Inference Monitoring: Real-time monitoring of model inputs and outputs for anomalous behavior
- Container Security: Hardening container images used for ML workload deployment
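The model integrity point above can be sketched as a keyed digest over the serialized artifact. The hard-coded key below is a placeholder assumption; in practice a managed secret (KMS) and a full signing scheme such as Sigstore would take its place:

```python
import hashlib
import hmac

# Placeholder key: in production this comes from a KMS or secret manager.
SIGNING_KEY = b"replace-with-a-managed-secret"

def sign_model(model_bytes: bytes) -> str:
    """HMAC-SHA256 signature over the serialized model artifact."""
    return hmac.new(SIGNING_KEY, model_bytes, hashlib.sha256).hexdigest()

def verify_model(model_bytes: bytes, expected_sig: str) -> bool:
    """Constant-time comparison against the recorded signature."""
    return hmac.compare_digest(sign_model(model_bytes), expected_sig)

sig = sign_model(b"model-weights-v1")
verify_model(b"model-weights-v1", sig)   # True: artifact untouched
verify_model(b"tampered-weights", sig)   # False: tampering detected
```

Verifying the signature at load time ensures a model pulled from storage has not been swapped or modified between training and deployment.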
Emerging Threat Intelligence for AI Systems
Threat intelligence specific to AI systems is rapidly evolving as attackers develop new techniques targeting machine learning infrastructure. Organizations must stay informed about emerging attack patterns and tactics.
AI-Specific Attack Techniques
Recent threat intelligence reveals several categories of AI-targeted attacks:
- Model Extraction Attacks: Stealing proprietary ML models through API queries
- Adversarial Input Attacks: Crafting inputs designed to fool AI models
- Training Data Poisoning: Injecting malicious samples into training datasets
- Infrastructure Compromise: Targeting the underlying compute and storage infrastructure
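As a toy illustration of the inference-monitoring idea, the class below baselines one scalar input feature (prompt length is a hypothetical choice) and flags large deviations. Real deployments would track many features and use dedicated out-of-distribution detection:

```python
import statistics

class InferenceMonitor:
    """Track a rolling baseline of one scalar input feature and flag
    requests that deviate sharply from it -- a simple stand-in for
    production anomaly detection on model inputs."""

    def __init__(self, z_threshold: float = 3.0, warmup: int = 30):
        self.samples: list[float] = []
        self.z_threshold = z_threshold
        self.warmup = warmup  # observations needed before alerting

    def observe(self, value: float) -> bool:
        """Record the value; return True if it is anomalous vs the baseline."""
        anomalous = False
        if len(self.samples) >= self.warmup:
            mean = statistics.fmean(self.samples)
            stdev = statistics.stdev(self.samples)
            if stdev > 0 and abs(value - mean) / stdev > self.z_threshold:
                anomalous = True
        self.samples.append(value)
        return anomalous

# Baseline on typical prompt lengths, then a wildly oversized input is flagged:
monitor = InferenceMonitor()
for i in range(40):
    monitor.observe(100 + (i % 5))
monitor.observe(10000)   # -> True (anomalous)
```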
Security teams should implement monitoring for these attack patterns and maintain updated threat intelligence feeds focused on AI security threats. Vulnerability assessment tools specifically designed for AI infrastructure can help identify potential weaknesses.
Integration with Security Operations
AI security must be integrated into existing Security Operations Center (SOC) workflows. This requires training security analysts on AI-specific threats and implementing specialized detection rules for AI infrastructure monitoring.
Key integration points include:
- SIEM Integration: Incorporating AI service logs into security information and event management systems
- Threat Hunting: Developing AI-specific threat hunting queries and playbooks
- Incident Response: Creating specialized incident response procedures for AI-related security events
- Compliance Monitoring: Ensuring AI deployments meet regulatory requirements and security standards
FAQ
What are the most critical AI vulnerabilities in cloud services?
The most critical AI vulnerabilities include DNS-based data exfiltration from sandbox environments, URL parameter injection leading to token theft, and unsafe deserialization in LLM frameworks. These vulnerabilities can lead to remote code execution, data breaches, and complete system compromise.
How can organizations protect against DNS exfiltration attacks in AI sandboxes?
Organizations should migrate from sandbox mode to VPC mode for complete network isolation, implement DNS firewalls to filter outbound traffic, and strictly audit IAM roles using the principle of least privilege. Regular monitoring of DNS queries and network activity is also essential.
What is unsafe deserialization and why is it dangerous in AI frameworks?
Unsafe deserialization occurs when applications load serialized data without proper validation, allowing attackers to execute arbitrary code. In AI frameworks like SGLang, this can lead to unauthenticated remote code execution with CVSS scores up to 9.8, making it extremely dangerous.
How should enterprises approach AI security in their cloud infrastructure?
Enterprises should implement a multi-layered security approach including network segmentation, strict access controls, continuous monitoring, and regular security audits. They should also ensure proper configuration of AI services and maintain updated threat intelligence.
What role does supply chain security play in AI vulnerability management?
Supply chain security is critical as AI frameworks often rely on numerous third-party components and libraries. Organizations must monitor for compromised packages, validate code integrity, and implement security scanning throughout their development pipeline to prevent supply chain attacks targeting AI systems.
How can organizations detect if their AI systems have been compromised?
Key indicators include unexpected network connections, unusual process spawning, file creation in abnormal locations, and suspicious DNS queries. Organizations should implement comprehensive logging and monitoring solutions specifically designed to detect AI-related attack patterns and anomalous behavior in their AI infrastructure.