
Fastly launches AI Accelerator to boost developer efficiency

Mon, 17th Jun 2024

Fastly has introduced the Fastly AI Accelerator, aiming to improve the developer experience by boosting performance and reducing costs in applications that utilise large language models (LLMs). The new solution is designed to address the challenges posed by the high volume of similar prompts processed by popular AI applications.

"AI technologies generally, and large language models specifically, are aggressively reshaping the technology industry, and the way millions worldwide—developers included—work every day," stated Stephen O'Grady, Principal Analyst with RedMonk. He noted that while there is significant focus on the largest models, developers and enterprises are increasingly considering medium and smaller models for their cost-effectiveness, shorter training cycles, and compatibility with limited hardware profiles.

Fastly's new AI Accelerator utilises semantic caching to reduce the frequency of API calls necessary for retrieving similar information, thereby decreasing both costs and latency. This approach leverages Fastly's Edge Cloud Platform and its advanced caching technology, employing a specialised API gateway that substantially enhances performance. Initially, the AI Accelerator supports ChatGPT and will expand to include additional models.

Anil Dash, Vice President of Developer Experience at Fastly, remarked, "At Fastly, we're always listening to developers to understand both what they're excited about and what their biggest pain points are. Fastly AI Accelerator gives developers exactly what they want, by making the experience of their favourite LLMs a lot faster and more efficient, so they can focus on what makes their app or site unique, and what keeps their users happy."

The semantic caching function of Fastly AI Accelerator operates by providing cached responses for repeated or similar questions directly from Fastly's high-performance edge platform. This avoids the need to request the same information repeatedly from the AI provider, thus streamlining the process.
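To illustrate the general idea, the sketch below shows how a semantic cache differs from an exact-match cache: prompts are compared by meaning (via embeddings and cosine similarity) rather than by string equality, so a sufficiently similar question can be answered from the cache without a fresh call to the AI provider. This is a minimal, hypothetical illustration of the technique, not Fastly's implementation; the embedding function and similarity threshold are assumptions for the example.

```python
import numpy as np

class SemanticCache:
    """Minimal sketch of semantic caching: match prompts by meaning,
    not exact text. Illustrative only, not Fastly's implementation."""

    def __init__(self, embed_fn, threshold=0.9):
        self.embed_fn = embed_fn    # assumed: maps a string to a 1-D numpy vector
        self.threshold = threshold  # cosine-similarity cutoff for "similar enough"
        self.entries = []           # list of (embedding, cached_response)

    def lookup(self, prompt):
        query = self.embed_fn(prompt)
        for emb, response in self.entries:
            sim = float(np.dot(query, emb) /
                        (np.linalg.norm(query) * np.linalg.norm(emb)))
            if sim >= self.threshold:
                return response     # cache hit: skip the upstream LLM call
        return None                 # cache miss: caller queries the LLM as usual

    def store(self, prompt, response):
        self.entries.append((self.embed_fn(prompt), response))
```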

Developers looking to implement the Fastly AI Accelerator need to update their applications to use a new API endpoint, a process that typically involves modifying just one line of code. The solution is geared towards understanding the context of incoming requests and delivering similar responses when queries are alike, moving beyond traditional caching techniques.
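As a rough sketch of what such a one-line change could look like with the OpenAI Python SDK, pointing the client at a caching gateway is typically done by overriding its base URL. The endpoint below is a placeholder for illustration only, not Fastly's actual URL.

```python
from openai import OpenAI

# Hypothetical example: route requests through a caching gateway by overriding
# the client's base URL. The URL below is a placeholder, not a real endpoint.
client = OpenAI(
    base_url="https://example-ai-accelerator.invalid/v1",  # placeholder gateway endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is semantic caching?"}],
)
print(response.choices[0].message.content)
```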

In addition to launching the AI Accelerator, Fastly is broadening access for developers with an expanded free account tier. This allows developers to set up new sites, create applications, or launch services swiftly, supported by a range of features including access to Fastly's Content Delivery Network, substantial memory and storage allocations, and security tools such as TLS and continuous DDoS mitigation.

Through these initiatives, Fastly aims to empower developers to build more efficient, secure, and engaging online experiences. By leveraging its edge cloud platform, the company seeks to address some of the predominant challenges faced by developers using large language models, with a focus on improved performance and cost management.

Stephen O'Grady summed up the ongoing trend, highlighting the increasing relevance of medium and smaller models: "Whether it's to lower costs, to shorten training cycles, or to run on more limited hardware profiles, they're an increasingly important option." This underscores the importance of solutions like Fastly AI Accelerator in the evolving landscape of AI and software development.
