AI-Jungle Guide #5: Search
Welcome to the AI-Jungle Guide!
Today’s AI co-host is Meta Llama 3.1 405B Instruct FP8 [1].
[Me]: Write a funny 1-2 sentence introduction to the newest issue of the AI-Jungle Guide newsletter. It's about AI search.
[Llama 3.1]: Welcome to the latest issue of the AI-Jungle Guide, where we help you navigate the wilds of AI search without getting lost in a sea of irrelevant cat videos (although, let's be real, those can be pretty entertaining too). In this issue, we'll be exploring the latest developments in AI-powered search and how to tame the beast that is information overload.

Llama 3.1 according to leonardo.ai
News
SearchGPT: Because Who Needs Human Fact-Checkers Anyway?
[Me]: Write a title and a funny 1-2 sentence paragraph about SearchGPT:
- SearchGPT is a prototype for AI-based search by OpenAI
- looks strikingly similar to perplexity.ai
- open question whether it has the same weaknesses, like citing things that are not in the source material
- you can join the waitlist now
[Llama 3.1]: SearchGPT [2], OpenAI's new AI-powered search prototype, is like that one friend who sometimes makes stuff up but sounds really confident about it. You can join the waitlist [3] now and be one of the first to experience the thrill of possibly getting fake information cited from real sources!
[Me]: Well, Llama is probably not really neutral here :)
Llama 3.1: The Open-Source AI That's Almost as Smart as It Is Large (and That's Saying Something)
[Me]: Write a title and a funny 1-2 sentence paragraph about Llama 3.1 (yourself):
- latest open-source model by Meta
- up to 405B parameters for the largest version (which is huge!)
- 70B and 8B versions available to run at home
- not (yet) multimodal
- the biggest model can keep up with some of the commercial ones like GPT-4 and Claude
[Llama 3.1]: Meet Llama 3.1 [1], Meta's latest open-source model that's so big, it needs its own zip code (405B parameters, anyone?). But don't worry, it comes in smaller, more household-friendly sizes too (70B and 8B) [4], because who needs a social life when you can run a giant AI model from the comfort of your own home?
[Me]: Ouch, this hits close to home.
Transformers: The Attention-Seekers of the AI World
[Me]: Write a title and a funny 1-2 sentence paragraph about Transformers:
- architecture that is used by most recent LLMs
- Transformers bring "Attention" to the processing of input text
- they enable much better "prediction" than the simple approaches used before
- responsible for the GPT (Generative Pre-Trained Transformer) breakthrough
[Llama 3.1]: Transformers are the drama queens of the AI world, bringing "Attention" [5] to the processing of input text and stealing the spotlight with their impressive prediction skills. Thanks to them, we've gone from simple, clunky language models to the likes of GPT, which can generate text so human-like, you'll wonder if robots are secretly running the internet (spoiler alert: they probably are).
[Me]: I couldn't have said it better (but I'll try anyway).

What perplexity.ai knows about the AI-Jungle Guide!
Background
What is AI-based search and how does it work?
AI “search engines” like Perplexity.ai [6] or the upcoming SearchGPT [2] work roughly like this: first, they parse your query and turn it into an ordinary search term. Then they use a regular search engine like Google or Bing to find relevant websites. The content of those sites is scraped and put into a vector database, which the LLM draws on to include relevant information when it generates an answer to the original query. In short: the LLM searches the web and reads the results much like you would as a user. The problem of hallucinations is present in AI-based search too, and sometimes the LLM will cite a website that doesn’t actually support the fact being presented.
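To make the pipeline concrete, here is a minimal, runnable Python sketch of the same steps. The web-search, scraping, and LLM stages are stubbed with hard-coded snippets, and the “vector database” is just a bag-of-words cosine-similarity lookup; a real system would use a search API, a neural embedding model, and an actual LLM call:

```python
# Minimal sketch of the AI-search pipeline described above.
# Search, scraping, and the LLM are stubbed with hard-coded data.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words counts. Real systems use a
    # neural embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Steps 1-2: pretend the search engine already returned these scraped snippets.
scraped_pages = [
    ("llama.meta.com", "llama 3.1 is an open source model with up to 405b parameters"),
    ("openai.com", "searchgpt is a prototype for ai powered search by openai"),
]

# Step 3: the "vector database" is just a list of (source, embedding, text).
index = [(url, embed(text), text) for url, text in scraped_pages]

# Step 4: retrieve the snippet most similar to the user's query...
query = "how big is llama 3.1"
q = embed(query)
source, _, best = max(index, key=lambda item: cosine(q, item[1]))

# ...and hand it to the LLM as context (stubbed here as a printout).
print(f"Context for the LLM (from {source}): {best}")
```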
How big is Llama 3.1 405B?
The model file for the largest of the new Llamas is around 820GB [7], and loading it onto GPUs adds some overhead on top of that. Given that even the larger consumer GPUs have only about 16-24GB of VRAM, you would need quite a few of them. You can shrink the model with quantization (reducing the number of bits per weight), but that only gets you down to around 250GB for a 4-bit quant (down from 16) [8].
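The arithmetic is easy to check yourself. This little Python snippet estimates the raw weight footprint at different bit widths, ignoring loading overhead and assuming a hypothetical 24GB consumer card for the GPU count (real quantized files come out somewhat larger because some tensors stay at higher precision):

```python
# Back-of-the-envelope weight-storage math for Llama 3.1 405B.
PARAMS = 405e9  # number of weights

for bits in (16, 8, 4):
    gb = PARAMS * bits / 8 / 1e9   # bytes, expressed in decimal gigabytes
    gpus = gb / 24                 # hypothetical 24GB consumer GPU
    print(f"{bits:>2}-bit: ~{gb:,.0f} GB of weights (~{gpus:.0f} x 24GB GPUs)")
```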
What are transformers and why are they important?
In the context of large language models (LLMs), the word "transformer" doesn’t imply a truck that converts into a giant robot but rather a type of neural network architecture designed to handle sequential data, making it particularly effective for natural language processing tasks. Transformers use self-attention mechanisms [5] to weigh the importance of different words in a sentence, allowing them to capture long-range dependencies and context more effectively than previous models like RNNs or LSTMs. Their ability to process text in parallel and their scalability have made them fundamental to the development of state-of-the-art language models such as GPT and BERT.
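If you want to see the core idea in code, here is a toy, self-contained sketch of scaled dot-product self-attention with NumPy, the operation at the heart of "Attention Is All You Need" [5]. It uses random weights and tiny dimensions; real transformers use learned projections, many attention heads, and many stacked layers:

```python
# Toy scaled dot-product self-attention over a tiny "sequence".
import numpy as np

def self_attention(x: np.ndarray, wq, wk, wv) -> np.ndarray:
    q, k, v = x @ wq, x @ wk, x @ wv           # project tokens to queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])    # how strongly each token attends to each other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ v                         # each output is a weighted mix of values

rng = np.random.default_rng(0)
seq_len, d = 4, 8                              # 4 "tokens", 8-dimensional embeddings
x = rng.normal(size=(seq_len, d))
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(x, wq, wk, wv).shape)     # (4, 8): one output vector per token
```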

Let Grant Sanderson simplify that for you (no, really!)
Try it yourself:
Use AI-based search!
The easiest way to use AI for search right now is to head over to Perplexity.ai [6] and enter some search terms. You can also give Consensus.app [9] a try if your research is more academic. Alternatively, you can set up your own AI search at home, perhaps with a locally hosted model [10].
Try Llama 3.1!
For the largest version, you’ll probably want to use HuggingChat [11], unless you have 900GB of VRAM at home. The smaller models are also available for download, and the 8B version is perfect for homegrown AI [4].
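If you go the download route, the sketch below shows one plausible way to run the 8B model with the Hugging Face transformers library. It assumes you have been granted access to the gated model [4], are logged in with your Hugging Face token, and have enough memory (roughly 16GB of VRAM in bfloat16, or a lot of patience on CPU):

```python
# One way to try Llama 3.1 8B at home, sketched with Hugging Face transformers.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3.1-8B",  # gated model: request access first [4]
    torch_dtype=torch.bfloat16,            # halves memory vs. float32
    device_map="auto",                     # spread across available GPU(s)/CPU
)

print(generator("AI-based search works by", max_new_tokens=60)[0]["generated_text"])
```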
Learn about Transformers!
You use them daily, so it’s time to do some research! You can watch Grant Sanderson's YouTube videos [12] or read the original paper "Attention Is All You Need" [5].
Spread the Word!
Learned something new? Please do me a favour and forward this newsletter to someone else who could benefit from it! Or invite them to subscribe at: https://aijungle.guide!
[1] https://llama.meta.com/
[2] https://openai.com/index/searchgpt-prototype/
[3] https://chatgpt.com/search
[4] https://huggingface.co/meta-llama/Meta-Llama-3.1-8B
[5] https://arxiv.org/abs/1706.03762
[6] https://www.perplexity.ai/
[7] https://huggingface.co/meta-llama/Meta-Llama-3.1-405B
[8] https://huggingface.co/nisten/meta-405b-instruct-cpu-optimized-gguf
[9] https://consensus.app
[10] https://jointerminus.medium.com/building-a-local-perplexity-alternative-with-perplexica-ollama-and-searxng-71602523e256
[11] https://huggingface.co/chat/
[12] https://www.youtube.com/watch?v=wjZofJX0v4M