The Cost of Building and Training LLMs
GPT-4, PaLM, Claude, Bard, LaMDA, Chinchilla, Sparrow – the list of large language models on the market continues to grow. Behind their remarkable capabilities, however, users are discovering substantial costs.
Building and training LLMs is an expensive endeavor. It requires thousands of GPUs, which offer the parallel processing power needed to handle the massive datasets these models learn from. The cost of the GPUs alone can amount to millions of dollars. According to a technical overview of OpenAI’s GPT-3 language model, training required at least $5 million worth of GPUs.
Training LLMs also incurs significant electricity costs. Training GPT-3 alone is estimated to have consumed roughly 1,300 MWh of electricity. Moreover, training is getting more expensive as larger models demand more computational power and longer training times. Training datasets are also growing, leading to longer, more power-hungry training runs. State-of-the-art models train for weeks or months to reach optimal performance, accumulating substantial costs along the way.
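A back-of-envelope estimate shows how these training costs arise. The sketch below uses the common approximation that training takes about 6 FLOPs per parameter per token; the throughput, utilization, and price figures are illustrative assumptions, not published numbers.

```python
# Back-of-envelope training-cost estimate. All hardware and price figures
# below are illustrative assumptions, not published numbers.

def training_cost_usd(params, tokens, flops_per_gpu_s, utilization, usd_per_gpu_hour):
    """Estimate training cost from the common ~6*N*D FLOPs rule of thumb."""
    total_flops = 6 * params * tokens                     # forward + backward pass
    gpu_seconds = total_flops / (flops_per_gpu_s * utilization)
    gpu_hours = gpu_seconds / 3600
    return gpu_hours * usd_per_gpu_hour

# GPT-3-scale example: 175B parameters, 300B tokens, a GPU sustaining
# 100 TFLOP/s at 40% utilization, priced at $2 per GPU-hour (all assumed).
cost = training_cost_usd(175e9, 300e9, 100e12, 0.40, 2.0)
print(f"~${cost / 1e6:.1f}M")
```

Under these assumed figures the estimate lands in the single-digit millions of dollars, the same ballpark as the GPU cost cited above.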
Running inference on the trained models is also expensive. In January 2023, ChatGPT used nearly 30,000 GPUs to handle hundreds of millions of daily user requests. The energy consumption for these queries is equivalent to the daily energy consumption of approximately 33,000 U.S. households.
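The household comparison can be sanity-checked with rough arithmetic. The per-GPU power draw (including a share of server and cooling overhead) and the average household figure below are assumptions for illustration only.

```python
# Rough plausibility check of the household-equivalence claim.
# All figures are assumptions: ~1.3 kW per GPU including server and
# cooling overhead, and ~29 kWh/day for an average U.S. household.
gpus = 30_000
watts_per_gpu = 1_300
hours_per_day = 24

kwh_per_day = gpus * watts_per_gpu * hours_per_day / 1_000
household_kwh_per_day = 29
households = kwh_per_day / household_kwh_per_day

print(f"{kwh_per_day / 1e6:.2f} GWh/day, roughly {households:,.0f} households")
```

With these assumptions the result comes out in the low tens of thousands of households, consistent with the ~33,000 figure above.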
Additionally, accessing these models through APIs comes at a significant cost. Providers typically bill per token, for both the prompt and the generated output. Because large models often produce verbose output, they can waste computational resources and inflate costs.
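The effect of verbosity on API spend is easy to quantify. The prices and token counts in this sketch are placeholder assumptions; check your provider's current pricing.

```python
# Per-request API cost at per-token rates. Prices and token counts are
# placeholder assumptions, not any provider's actual pricing.

def request_cost(prompt_tokens, completion_tokens, usd_per_1k_in, usd_per_1k_out):
    """Cost of one request when input and output tokens are billed separately."""
    return (prompt_tokens / 1000) * usd_per_1k_in + \
           (completion_tokens / 1000) * usd_per_1k_out

# A verbose answer vs. a concise one, at hypothetical rates of
# $0.01 per 1K input tokens and $0.03 per 1K output tokens:
verbose = request_cost(500, 800, 0.01, 0.03)
concise = request_cost(500, 200, 0.01, 0.03)
extra_per_million = (verbose - concise) * 1_000_000

print(f"verbose ${verbose:.4f}, concise ${concise:.4f}, "
      f"extra per 1M requests: ${extra_per_million:,.0f}")
```

At these assumed rates, trimming 600 output tokens per answer saves $18,000 per million requests, which is why prompt and output-length discipline matters at scale.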
Data Center Costs
All the computing required for LLMs takes place in data centers, where hundreds of thousands of processing units, along with memory and storage devices, are housed. The increasing demand for large models is driving up data center prices.
Sustainability and Waste
Recent research suggests that a significant portion of the computation in large models is wasted: in some measurements, over 99% of floating-point operations involve zero values and therefore contribute nothing to the result. Relying solely on massive models is unsustainable in the long term and inefficient in its use of resources.
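A toy illustration (not the cited study) shows the mechanism behind this waste: after a ReLU nonlinearity, many activations are exactly zero, yet a dense matrix multiply still spends a multiply-add on every one of them.

```python
# Toy illustration of zero-valued computation. Pre-activations are drawn
# from a standard normal, so a ReLU zeroes out roughly half of them.
import random

random.seed(0)
pre_activations = [random.gauss(0, 1) for _ in range(10_000)]
activations = [max(0.0, x) for x in pre_activations]  # ReLU

zero_fraction = sum(a == 0.0 for a in activations) / len(activations)
print(f"{zero_fraction:.0%} of activations are zero")

# Dense hardware still spends a multiply-add on each of those zeros;
# sparsity-aware kernels aim to skip them entirely.
```

In this toy case about half the activations are zero; the much higher fractions reported in the research reflect additional structured sparsity in real models, not this simple setup.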
Choosing the Right Approach
Considering the economic implications of large language models is crucial for businesses and individuals. Factors such as resources, data quality and size, technical expertise, and alignment with business strategy should be taken into account when deciding whether to adopt or build an LLM.
Companies should ensure their teams understand scaling laws, data curation, and techniques for mitigating training instability. It’s essential to strike a balance between exploiting the capabilities of LLMs and managing the costs associated with them. In some cases, alternative approaches may prove more cost-effective and efficient.
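Scaling laws have direct budget consequences. As one example, the Chinchilla result is often summarized as a rule of thumb of roughly 20 training tokens per parameter for compute-optimal training; the sketch below applies that heuristic (it is an approximation, not an exact law).

```python
# Chinchilla-style rule of thumb: compute-optimal training uses roughly
# 20 tokens per parameter. This is a heuristic, not an exact law.

def chinchilla_tokens(params, tokens_per_param=20):
    """Approximate compute-optimal training-token count for a model size."""
    return params * tokens_per_param

# A 7B-parameter model would want on the order of 140B training tokens:
tokens = chinchilla_tokens(7e9)
print(f"{tokens / 1e9:.0f}B tokens")
```

Teams that plan data collection and GPU budgets around such relationships avoid both undertrained large models and wasted compute.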
As AI continues to revolutionize industries, understanding the economic implications of large language models becomes paramount. Investing in LLMs should be a strategic decision that aligns with long-term business goals and considers the overall costs involved.