LLM Basic Terms
Planted March 21, 2024
With AI tools like ChatGPT and Gemini on the rise, there are a lot of terms floating around that can be confusing. Here’s a simple guide:
Large Language Model (LLM) These AI tools are trained on human languages and thus are called language models. They are called large because they are trained on an obscene amount of data. Imagine a large whale opening its mouth with everything going inside. That’s an LLM with text.
Fine-Tuning Just as French fries taste better when double-fried, LLMs perform better when they are trained more than once. Most models are pre-trained before they are released, and then people can fine-tune them for specific needs.
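To make the double-frying idea concrete, here is a toy sketch in Python. It is nowhere near a real LLM: the "model" is a single number, and the `train` function and both datasets are made up for illustration. It only shows the shape of the process: a long first round of training on general data, then a short second round on specialized data, starting from where the first round ended.

```python
def train(weight, examples, steps, learning_rate=0.05):
    """Nudge a single weight so that weight * x gets closer to y."""
    for _ in range(steps):
        # Average gradient of the squared error over all examples.
        gradient = sum(2 * (weight * x - y) * x for x, y in examples) / len(examples)
        weight -= learning_rate * gradient
    return weight

# Hypothetical data: the "general" task is y = 2x, the "special" task is y = 2.5x.
general_data = [(1, 2.0), (2, 4.0), (3, 6.0)]
special_data = [(1, 2.5), (2, 5.0), (3, 7.5)]

pretrained = train(0.0, general_data, steps=50)        # first fry: start from scratch
fine_tuned = train(pretrained, special_data, steps=5)  # second fry: start from the pre-trained weight

print(pretrained, fine_tuned)
```

The fine-tuning round needs far fewer steps because it starts from a weight that is already close to useful.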
Temperature This decides the randomness of the output. I call this the Hermione-Ron scale. Low temperature means asking a question to Hermione (accurate). High temperature means asking a question to Ron (creative). You’ll get an answer in both cases but in one case, you’ll trust the answer more.
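Under the hood, a model gives every possible next word a score, and temperature reshapes those scores before one word is picked. Here is a minimal sketch, assuming the common approach of dividing the scores by the temperature and then applying softmax. The function name and the example scores are made up for illustration.

```python
import math

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
scores = [2.0, 1.0, 0.5]

low = softmax_with_temperature(scores, 0.2)   # Hermione: almost always picks the top word
high = softmax_with_temperature(scores, 2.0)  # Ron: spreads the odds across all the words

print(low)
print(high)
```

At low temperature nearly all the probability lands on the highest-scoring word, so the output is predictable. At high temperature the probabilities even out, so less likely words get picked more often.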
Parameters A straight-line equation (y = mx + c) has 2 parameters: m and c. (x is the input, not a parameter.) Changing m and c changes which line you get. LLMs have billions (and in some cases trillions) of parameters. These parameters are what make LLMs so flexible in answering different types of questions.
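Here is the straight-line example as a small Python sketch, plus a rough illustration of how parameter counts grow. The `count_dense_parameters` helper is hypothetical and assumes a plain fully connected layer (one weight per input-output pair, plus one bias per output); real LLMs stack many layers of several kinds.

```python
def line(x, m, c):
    """A straight line: x is the input, m and c are the two parameters."""
    return m * x + c

def count_dense_parameters(inputs, outputs):
    """Parameters in one fully connected layer: a weight for every
    input-output pair, plus one bias per output."""
    return inputs * outputs + outputs

print(line(3, m=2, c=1))                   # same input, one choice of parameters
print(line(3, m=5, c=0))                   # same input, different parameters, different answer
print(count_dense_parameters(1, 1))        # the straight line again: 1 weight + 1 bias
print(count_dense_parameters(4096, 4096))  # a single wide layer already has millions
```

Stack enough layers like that last one and the total reaches the billions.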
Token Just as we can’t eat a whole cake in a single bite, LLMs need to divide the input into small chunks and consume it piece by piece. These pieces are called tokens. They are usually a word, a partial word, or a character. Example: “He is walking.” might be tokenized to “He” “is” “walk” “ing” “.” (the exact split depends on the tokenizer).
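Here is a toy tokenizer in Python that reproduces the example above. It is a sketch under simplifying assumptions: the five-piece vocabulary is made up, and the greedy "take the longest known piece first" rule is only one possible strategy. Real tokenizers learn vocabularies of tens of thousands of pieces from data.

```python
# Hypothetical, hand-picked vocabulary just for this example.
TOY_VOCAB = {"He", "is", "walk", "ing", "."}

def tokenize(text, vocab=TOY_VOCAB):
    """Split text on spaces, then break each word into the longest known pieces."""
    tokens = []
    for word in text.split():
        start = 0
        while start < len(word):
            # Try the longest remaining piece first, then shrink it.
            for end in range(len(word), start, -1):
                piece = word[start:end]
                # Fall back to a single character if nothing in the vocabulary matches.
                if piece in vocab or end == start + 1:
                    tokens.append(piece)
                    start = end
                    break
    return tokens

print(tokenize("He is walking."))
```

Anything the vocabulary doesn’t know gets broken down to single characters, which is why tokens can be a word, a partial word, or a character.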
Hope this helps demystify some of the mist around LLMs.