GenAI bot traffic and cybersecurity: What IT leaders need to know
Australia's generative artificial intelligence (GenAI) market is still in its early days, yet it generated just shy of 922 million U.S. dollars in revenue in 2023. The market is rapidly evolving into a billion-dollar industry segment and is expected to climb to an estimated 4.2 billion U.S. dollars in 2030.[1]
The rapid advancement of GenAI is redefining the digital landscape, presenting both opportunities and challenges for IT leaders. As GenAI becomes increasingly integrated into various aspects of digital infrastructure and operations, its influence extends beyond mere automation. The way users interact with and process information from applications is changing. This evolution necessitates a strategic approach to understanding and managing GenAI bot traffic.
GenAI bot traffic, often credited with driving efficiency and innovation, also poses risks if not properly managed. It can disrupt the user experience, impact business revenues, and strain resources. IT leaders are tasked with a dual responsibility: leveraging the benefits of GenAI to enhance their systems while mitigating the risks associated with unauthorised or malicious GenAI bot activities. Effective management of this bot traffic is not just a technical necessity but a strategic imperative to maintain a competitive edge and operational integrity.
GenAI bot traffic varies in its functions and impact. It can broadly be organised into three categories based on its interaction with web applications.
1. GenAI Bot Traffic for AI Training
GenAI crawlers are programmed to scour the internet and collect vast amounts of data, generating significant bot traffic in the process. This data helps in building and refining GenAI models. Although this data collection is an essential step for GenAI development, these crawlers often do not direct traffic back to the websites from which they collect data.
Additionally, these GenAI models may use the content out of context or without proper attribution. Furthermore, this bot traffic could pose a competitive threat, as users might opt to interact with information provided through GenAI interfaces rather than visiting the original websites.
For these reasons, websites are advised to restrict GenAI crawlers' access. E-commerce platforms should block GenAI crawlers to protect their product catalogues, customer reviews, and pricing information. News portals and social media sites should restrict GenAI crawler access to protect their intellectual property.
However, allowing GenAI crawlers to access certain parts of a website, like marketing and sales information, can enhance a brand's visibility and influence. Additionally, engaging in licensing agreements with GenAI vendors can help monetise intellectual property while retaining control over its distribution.
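One common way to express this kind of selective access is a robots.txt policy. The Python sketch below builds such a policy and checks it with the standard library's urllib.robotparser. The crawler tokens (GPTBot, CCBot, Google-Extended) are examples that vendors have published and should be verified against current vendor documentation. The /marketing/ path and the example.com URLs are hypothetical. Note that robots.txt is advisory, so it only restrains crawlers that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# Assumption: example crawler tokens; confirm against current vendor documentation.
AI_TRAINING_CRAWLERS = ["GPTBot", "CCBot", "Google-Extended"]
# Assumption: hypothetical section of the site that may be crawled for visibility.
PUBLIC_PATHS = ["/marketing/"]


def build_robots_txt(crawlers, allowed_paths):
    """Return robots.txt text that limits the listed crawlers to allowed_paths."""
    lines = [f"User-agent: {name}" for name in crawlers]
    # Allow rules come first because Python's parser applies the first matching rule.
    lines += [f"Allow: {path}" for path in allowed_paths]
    lines.append("Disallow: /")
    # All other clients keep full access.
    lines += ["", "User-agent: *", "Allow: /"]
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, user_agent, url):
    """Check whether a compliant client with this user agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    policy = build_robots_txt(AI_TRAINING_CRAWLERS, PUBLIC_PATHS)
    print(policy)
    for agent, url in [
        ("GPTBot", "https://example.com/marketing/brand"),
        ("GPTBot", "https://example.com/products/1"),
        ("Mozilla/5.0", "https://example.com/products/1"),
    ]:
        print(agent, url, is_allowed(policy, agent, url))
```

Checking the generated policy before deploying it helps catch rule-ordering mistakes that could either expose protected pages or hide pages meant to be visible.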
2. GenAI Bot Traffic Triggered by User Prompts
This traffic is triggered by specific user inputs: the GenAI system performs actions in real time in response to user commands. For instance, a prompt such as "summarise the top news items on this website today" causes the GenAI system to fetch and process the site's content on the user's behalf, which can enhance the experience for the GenAI system's users.
However, this can reduce direct traffic and, subsequently, revenue for the original content providers. This is because the user receives the desired information through the GenAI interface without needing to visit the actual website.
Online platforms typically need to take proactive measures to block this type of GenAI bot traffic. These measures are intended to protect their web traffic, revenue, and content integrity by limiting how external GenAI systems interact with their sites.
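As one illustration of such a measure, the sketch below shows a small WSGI middleware in Python that refuses requests whose User-Agent header contains a token associated with user-triggered GenAI fetchers. The tokens shown (ChatGPT-User, Perplexity-User) are examples to be verified against current vendor documentation, and the class and function names are hypothetical. Because a User-Agent header can be spoofed or omitted, this is a first filter under those assumptions rather than a complete defence.

```python
# Assumption: example tokens for user-triggered GenAI fetchers; verify before use.
USER_TRIGGERED_AGENT_TOKENS = ("chatgpt-user", "perplexity-user")


def is_user_triggered_genai_agent(user_agent):
    """Return True if the User-Agent string contains a listed token."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in USER_TRIGGERED_AGENT_TOKENS)


class BlockGenAIFetchers:
    """WSGI middleware (hypothetical name) that answers listed agents with 403."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if is_user_triggered_genai_agent(environ.get("HTTP_USER_AGENT")):
            body = b"Automated GenAI access is not permitted.\n"
            headers = [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ]
            start_response("403 Forbidden", headers)
            return [body]
        # Every other request is passed through to the wrapped application.
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in application used only to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Top news items\n"]


protected_app = BlockGenAIFetchers(demo_app)
```

Wrapping the application at the middleware layer keeps the policy in one place, so the token list can be updated without touching individual request handlers.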
3. User-Driven GenAI Bot Traffic Through Plugins
Many third-party plugins or extensions on GenAI platforms facilitate user-initiated interactions with website content. For example, many custom GPTs available on OpenAI's GPT Store claim to help in scraping websites. Although these tools are presented as enhancing the user experience, they are mostly exploited for malicious purposes, such as unauthorised data scraping.
The impact of this category of GenAI bot traffic is particularly significant for online businesses. E-commerce platforms, for example, face the risk of unauthorised automated scraping. This can lead to competitive disadvantages as sensitive data like pricing and inventory levels could be accessed and utilised by competitors. Additionally, this kind of activity could skew analytics, affecting the accuracy of data-driven business decisions.
Content-rich sites also suffer, as these plugins can enable unauthorised access to and redistribution of copyrighted material. This not only infringes copyright but also potentially diminishes revenue streams that rely on original content and traffic to the site. Hence, it is important to block this kind of GenAI bot traffic originating from third-party plugins, extensions, or custom GPTs.
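Plugin-driven scrapers may not identify themselves, so a behavioural signal such as request rate is one possible complement to user-agent checks. The Python sketch below is a minimal sliding-window rate limiter, offered as an illustration under that assumption and not as a method described above. The class name, the client identifier, and the thresholds are all hypothetical.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps a client identifier to the timestamps of its accepted requests.
        self._hits = defaultdict(deque)

    def allow(self, client_id, now):
        """Record a request at time `now` (seconds) and report whether it is allowed."""
        hits = self._hits[client_id]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    import time

    # Assumption: illustrative thresholds of 3 requests per 10 seconds.
    limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10)
    for _ in range(5):
        print(limiter.allow("203.0.113.7", time.time()))
```

In practice the client identifier might be an IP address, a session token, or an API key, and the thresholds would be tuned against genuine user behaviour so that legitimate visitors are not blocked.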
The landscape of GenAI bot traffic is complex and continuously evolving. IT leaders must stay informed and proactive in developing strategies to effectively manage these three types of GenAI bot traffic. By doing so, they can protect their digital assets and business competitiveness while harnessing the potential of GenAI to drive business growth and innovation.
IT leaders should regularly review and adjust their bot management strategies in collaboration with cybersecurity experts. This proactive approach will help tailor solutions that address specific operational and security needs and ensure a secure and efficient digital environment for their organisations.