In 2025, a surge of AI-powered applications is expected to transform consumer and business experiences, finally delivering on the much-anticipated potential of generative AI. That outlook contrasts with the current landscape, where major players like OpenAI, Google, and xAI are locked in an expensive competition to develop the most advanced large language models (LLMs), with artificial general intelligence (AGI) as the goal. This ongoing "arms race" has concentrated resources and attention on a few companies, limiting broader innovation in the AI ecosystem.
Elon Musk exemplifies this trend: he raised $6 billion for xAI and acquired a substantial number of Nvidia H100 GPUs, at a cost of over $3 billion, to train the company's model, Grok. Spending at this level raises questions about the sustainability of LLM development and leaves smaller companies with a daunting hurdle. High inference costs compound the problem: when each generated response is expensive, developers cannot build financially viable applications on top of these models, much as a consumer device is of little use if it costs too much to run.
Forecasts suggest an imminent shift in the AI landscape. The PC and mobile eras offer a lesson: technological advances continually reduced costs and improved performance, and AI could follow the same path. The cost of AI inference is anticipated to fall tenfold per year, driven by new algorithms, better chip technologies, and more efficient models. For instance, a query using OpenAI’s premium model cost approximately $10 in mid-2023, and reports predicted that this figure would drop to around $1 per query by mid-2024, making advanced AI applications significantly more accessible.
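The arithmetic behind that forecast is simple compounding. As a rough sketch, assuming the tenfold annual decline holds steadily (the function name and the multi-year extrapolation are illustrative assumptions, not figures from any report), the projected cost per query can be computed like this:

```python
def projected_cost(initial_cost: float, years: float, annual_factor: float = 10.0) -> float:
    """Return the cost after `years`, assuming it falls by `annual_factor` each year.

    Hypothetical helper for illustration only; the steady tenfold decline
    is the article's forecast, not a measured trend.
    """
    if annual_factor <= 0:
        raise ValueError("annual_factor must be positive")
    return initial_cost / (annual_factor ** years)


if __name__ == "__main__":
    # Start from the article's figure of roughly $10 per query in mid-2023.
    for years in range(4):
        print(f"after {years} year(s): ${projected_cost(10.0, years):.4f} per query")
```

Under this assumption, a $10 query falls to $1 after one year, matching the mid-2024 figure above. Whether the rate persists beyond that is a forecast, not a given.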
The upcoming paradigm shift will encourage developers to prioritize lightweight models, which may not be quite as powerful as the largest models but are designed to be fast and cost-efficient. This would mark a departure from the prevailing focus on behemoth models and would encourage founders to innovate around specific applications. Rhymes.ai is an example of this new direction: it has created a model that rivals leading offerings at a fraction of the cost, $3 million as opposed to the more than $100 million typically spent by larger entities.
In summary, the AI landscape is set to evolve as new strategies emerge that focus on efficiency and cost-effectiveness, enabling a wider range of applications to flourish. By pairing lightweight models with innovative application designs, the industry can build a healthy and responsive AI ecosystem that benefits users and businesses alike.