For almost two decades, employees' web surfing has been monitored and controlled by Secure Web Gateways (SWGs) - the traffic police of the web. SWGs are designed to intercept and analyze data moving across a network, filtering out malicious files, content and websites. However, these solutions cannot protect against modern client-side web attacks, collectively known as 'Last Mile Reassembly Attacks'.
Last Mile Reassembly Attacks take advantage of the architectural limitations of SWGs, which traditionally inspect data at the network level but cannot monitor the intricate processes within the browser. Browser security researchers from SquareX presented these attacks at length at DEF CON 32. One limitation they identified is that SWGs do not scan various protocols and channels, leaving enterprises vulnerable to malicious content delivered through these unmonitored channels.
Exploiting Unmonitored Channels
Protocols such as WebRTC, WebSockets, WebTransport, and gRPC are increasingly used in modern web applications to enhance performance and real-time interaction. However, these protocols also present a unique opportunity for attackers, as SWGs typically do not monitor or analyze the traffic passing through them. Some vendors have acknowledged this publicly in their documentation and recommend blocking these channels entirely as a best practice. However, this is not practical, as many legitimate SaaS applications, such as Webex, Slack, and Zoom, require these channels to function properly.
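To make the "block these channels entirely" recommendation concrete, here is a minimal sketch of how a gateway could recognize two of these channels from request headers alone. It assumes the HTTP/1.1 WebSocket handshake (an `Upgrade: websocket` header plus a `Connection` header listing `upgrade`) and the gRPC convention of a content type beginning with `application/grpc`. The function names `classify_request` and `should_block` are hypothetical and do not come from any vendor's product.

```python
from typing import Mapping


def classify_request(headers: Mapping[str, str]) -> str:
    """Label a request by the channel it is trying to open, using headers only."""
    # HTTP header names are case-insensitive, so normalise them first.
    h = {k.lower(): v.strip().lower() for k, v in headers.items()}

    # WebSocket handshake: "Upgrade: websocket" plus a Connection header
    # whose comma-separated tokens include "upgrade".
    connection_tokens = [t.strip() for t in h.get("connection", "").split(",")]
    if h.get("upgrade") == "websocket" and "upgrade" in connection_tokens:
        return "websocket"

    # gRPC requests carry a content type starting with "application/grpc".
    if h.get("content-type", "").startswith("application/grpc"):
        return "grpc"

    return "http"


def should_block(headers: Mapping[str, str]) -> bool:
    """Hypothetical blanket policy: block every channel the gateway cannot inspect."""
    return classify_request(headers) in {"websocket", "grpc"}
```

A rule like this cannot tell a chat application's WebSocket from an attacker's, which is why blanket blocking breaks legitimate applications. WebRTC data channels do not begin with an HTTP request of this kind, so header matching would not cover them at all.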
Some unmonitored channels that are being used to smuggle malicious content:
- WebRTC (Web Real-Time Communication): Peer-to-peer communication that carries media over SRTP (Secure Real-Time Transport Protocol) and data channels over SCTP (Stream Control Transmission Protocol).
- WebSocket: Full-duplex communication protocol primarily used for real-time, low-latency communications.
- WebTransport: Supports low-latency data exchanges and can handle unordered/ordered streams, built on top of HTTP/3 with HTTP/2 fallback.
- gRPC: Uses protocol buffers to encode data and is built on HTTP/2, also supports bi-directional streaming.
- Server-Sent Events (SSE): A unidirectional data stream from the server to the client, used for sending notifications and updates.
- WebTorrent: A streaming torrent client that uses WebRTC for peer-to-peer communication.
- Firebase Cloud Messaging (FCM): Push notifications for mobile and web apps, leveraging the Web Push Protocol.
These channels allow attackers to smuggle malicious content without triggering any alarms. For example, an attacker can transmit malware through the WebRTC protocol. Since SWGs do not inspect WebRTC traffic, the malicious payload is allowed to pass freely. The malware is only assembled on the client side once it has bypassed the SWG entirely.
Since the malicious file is not fully formed until it reaches the client side, within the victim's browser, the SWG has no visibility into it. The SWG does not even detect a file download event! By the time the malware is assembled on the client side, it's too late for traditional network-based defenses to intervene.
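The reassembly idea can be illustrated with a small, harmless simulation. In the sketch below, `SIGNATURE` is a made-up placeholder string, and `naive_scan` is a hypothetical stand-in for a gateway scanner, not how any real SWG works. It shows only that a pattern spanning several chunks cannot be seen in any single chunk, while it is plainly visible once the pieces are joined on the client.

```python
from typing import List

SIGNATURE = b"EXAMPLE-MALWARE-SIGNATURE"  # harmless placeholder, not a real signature


def naive_scan(data: bytes) -> bool:
    """Stand-in for a gateway scanner: flags data that contains the signature."""
    return SIGNATURE in data


def split_into_chunks(payload: bytes, size: int) -> List[bytes]:
    """Cut the payload into fixed-size pieces, as a sender might before transmission."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


payload = b"header-bytes " + SIGNATURE + b" trailer-bytes"
chunks = split_into_chunks(payload, size=8)

# Each chunk is shorter than the signature, so no single chunk can match.
flagged_in_transit = any(naive_scan(chunk) for chunk in chunks)

# The receiving page concatenates the chunks; only then does the file exist.
reassembled = b"".join(chunks)
flagged_after_reassembly = naive_scan(reassembled)

print(flagged_in_transit, flagged_after_reassembly)  # prints: False True
```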
Even if cloud proxies were to monitor these channels, the data passing through them is not straightforward to analyze. Many of these protocols involve encrypted or encoded data that is difficult to decode, making it challenging to detect whether malicious content is present.
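As a simple illustration of the encoding problem, the sketch below uses Base64, chosen here only as a familiar example of an encoding. A scanner that matches raw bytes finds nothing in the encoded form and would have to decode every message correctly before it could match anything. `SIGNATURE` is again a harmless placeholder.

```python
import base64

SIGNATURE = b"EXAMPLE-MALWARE-SIGNATURE"  # harmless placeholder, not a real signature

# What travels over the channel is the encoded form of the content.
encoded = base64.b64encode(b"prefix " + SIGNATURE)

# The raw signature contains "-", which is not in the Base64 alphabet,
# so it can never appear verbatim in the encoded bytes.
signature_visible_encoded = SIGNATURE in encoded

# Only after decoding does the original content become matchable again.
signature_visible_decoded = SIGNATURE in base64.b64decode(encoded)

print(signature_visible_encoded, signature_visible_decoded)  # prints: False True
```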
Furthermore, some of these protocols, such as WebTransport and WebSockets, are designed for low-latency communications and real-time data streaming. Monitoring and analyzing this data in real time would introduce significant latency, severely degrading the user experience and rendering the application almost unusable.
For cloud proxies to monitor these additional channels effectively, they would need to overhaul their architecture, significantly increasing the complexity and processing power required. This would drive up the per-user cost, which is neither favorable for vendors nor scalable for their clients. Multiplied across thousands of users, such solutions become prohibitively expensive for many enterprises, making it impractical for organizations to rely on SWGs to defend against these advanced threats.
The more scalable and efficient solution is to have an endpoint agent or a browser security agent sitting directly on the browser and monitoring user activity. The only way to detect and block these complex attacks is to have access to rich browser data as input to detection algorithms, and the only way to do this is to have a browser-native product.
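As a rough sketch of why the vantage point matters, the example below simulates a check that runs where the file is finally assembled, so it sees the complete content regardless of which channel delivered the pieces. The hook name `on_file_assembled` and the hash blocklist `KNOWN_BAD_SHA256` are hypothetical. A real browser security product would rely on much richer signals than a digest lookup, and nothing here describes a specific vendor's implementation.

```python
import hashlib
from typing import Iterable, Set

# Hypothetical blocklist of SHA-256 digests, built from a harmless placeholder.
KNOWN_BAD_SHA256: Set[str] = {
    hashlib.sha256(b"EXAMPLE-MALICIOUS-FILE").hexdigest(),
}


def on_file_assembled(chunks: Iterable[bytes]) -> bool:
    """Hypothetical hook called once the page has built the complete file.

    Returns True if the download should be allowed, False if it should be blocked.
    """
    digest = hashlib.sha256()
    for chunk in chunks:
        # Feeding chunks in order yields the same digest as hashing the whole file.
        digest.update(chunk)
    return digest.hexdigest() not in KNOWN_BAD_SHA256


# The pieces arrive separately, but the check sees the assembled result.
print(on_file_assembled([b"EXAMPLE-", b"MALICIOUS-", b"FILE"]))  # prints: False
print(on_file_assembled([b"harmless ", b"document"]))  # prints: True
```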