Meta has just released Meta AI, an assistant powered by its latest large language model, into the search bars of its major apps: Facebook, Messenger, Instagram, and WhatsApp, plus a standalone website at Meta.ai, sending a strong message that it will not concede the AI race to ChatGPT. But that's not all. Meta also announced Llama 3, the next major version of its foundational open-source model, claiming it outperforms competing models of its class on key benchmarks and is better overall at tasks like coding.
There are two versions: an 8-billion-parameter model and a 70-billion-parameter one, both of which will be accessible on all major cloud providers. (At a very high level, parameters dictate a model's complexity and its capacity to learn from training data.)
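To make the notion of "parameters" concrete, here is a toy illustration (not Meta's architecture): even a single dense layer of the width commonly used in transformer models contains millions of learnable numbers, and a full model stacks many such layers.

```python
# Toy illustration of what "parameters" means: the learnable numbers in a model.
# A single dense layer mapping 4096 inputs to 4096 outputs has a weight matrix
# plus one bias per output unit.
in_features, out_features = 4096, 4096
weights = in_features * out_features   # entries in the weight matrix
biases = out_features                  # one bias per output

print(weights + biases)  # 16781312 parameters in just one layer
```

An 8-billion-parameter model is, loosely, hundreds of such layers' worth of weights; more parameters mean more capacity to absorb patterns from training data, at the cost of more compute.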
Meta’s goal is ambitious: Meta AI aims to be “the most intelligent AI assistant that people can freely use across the world,” says CEO Mark Zuckerberg, adding modestly, “With Llama 3, we basically feel like we’re there.” What justifies this confidence? Meta has introduced additional features, including accelerated image generation and direct access to web search results.
The Meta AI assistant is the only chatbot I know of that now integrates real-time search results from both Bing and Google — Meta decides when either search engine is used to answer a prompt. Its image generation has also been upgraded to create animations (essentially GIFs), and high-res images now generate on the fly as you type. Meanwhile, a Perplexity-inspired panel of prompt suggestions when you first open a chat window is meant to “demystify what a general-purpose chatbot can do.”
Ahmad Al-Dahle, Meta’s head of generative AI
History tends to repeat itself, especially when it’s profitable, and that’s exactly the case with Meta, which has turned copying rivals’ ideas into a business strategy. Its signature social media formats, Stories and Reels, were pioneered by Snapchat and TikTok, respectively, but once incorporated into Meta’s ecosystem they became game changers.
Sure, it can be seen as an act of “evil genius,” but what is clear is that this strategy works thanks to Meta’s gargantuan reach, scale, and adaptability.
Now, the same could happen with AI.
Meta may have arrived late to the party, but the company is too big, powerful, and determined to be ignored now that it has joined. The introduction of Llama 3 is definitely a leap forward in the realm of Large Language Models. With its developer-friendly approach, epitomized by its open-source nature, Llama 3 can shake up the scene.
Edwin Lisowski, COO & co-founder at Addepto
Llama 3 shows how fast AI models are growing. Last year’s largest Llama 2 model had 70 billion parameters, but the largest Llama 3 still in training will have over 400 billion, and it learns from over 15 trillion tokens, compared to the 2 trillion Llama 2 was trained on.
Read more: Open Source LLM in 2024: Your Comprehensive Guide for Open-Source Large Language Models
A standout feature of Llama models is that they’re open-source. This allows developers to tweak them for different tasks, making them an excellent way to use powerful language models without costly licenses.
Training Llama 3 involved a huge dataset of over 15 trillion tokens from public sources, roughly seven times larger than Llama 2’s. It also contains four times more code.
Meta also tuned Llama 3 for real-life situations. They created a new evaluation set of 1,800 prompts covering 12 key use cases, such as advice, brainstorming, question answering, coding, creative writing, and more. They keep this test set secret, even from their own modeling teams, to prevent the model from overfitting to it.
Based on the data, Llama 3’s 70B model has excelled in real-life situations. Human evaluators frequently preferred it over other models, with Llama 3 achieving a win rate of 63.7% against Meta Llama 2, 63.2% against GPT-3.5, 59.3% against Mistral Medium, and 52.9% against Claude Sonnet. This ranking by human evaluators indicates that Meta’s pre-trained Llama 3 model has set a new benchmark for LLMs of this size.
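To make these evaluation numbers concrete, here is a minimal sketch of how a pairwise win rate like the 63.7% figure can be derived from human preference judgments. The tie-handling convention and the vote counts below are hypothetical illustrations, not Meta's actual methodology or data.

```python
def win_rate(wins: int, losses: int, ties: int = 0) -> float:
    """Fraction of pairwise comparisons won, counting each tie as half a win
    (one common convention; Meta's exact tie handling isn't public)."""
    total = wins + losses + ties
    return (wins + 0.5 * ties) / total

# Hypothetical tally of 1,000 human judgments comparing two models:
print(f"{win_rate(wins=600, losses=326, ties=74):.1%}")  # prints "63.7%"
```

In other words, each reported percentage summarizes how often human raters preferred Llama 3's answer when shown it side by side with a competitor's answer to the same prompt.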
Meta has grander ambitions in store. They’ve announced the development of a highly anticipated model with over 400 billion parameters, set to be released in the coming months. This new model promises to support additional languages and offer expanded functionalities, including larger context windows to handle more complex queries.