Understanding AI: A Simple Glossary

Want to know more about AI but don’t have much time to spend learning all the lingo? Read this instead.


This cheat sheet gives you quick, easy-to-understand explanations of key AI terms, so you can get up to speed fast.

  1. AI: AI, or Artificial Intelligence, is when machines are designed to perform tasks that typically require human thinking, such as speech recognition or decision-making. The theoretical foundation began in the 1950s with figures like Alan Turing, whose "Turing Test" examined whether a machine could mimic human intelligence. John McCarthy, another pioneer, coined the term "Artificial Intelligence" for the 1956 Dartmouth Conference, which marked the field's formal birth.
  2. Artificial Neural Networks: An Artificial Neural Network, or ANN for short, is a computer system inspired by how neurons in the brain communicate. It’s made up of layers of connected "neurons" that pass information to each other: each neuron looks at the data it receives, processes it, and passes the result on to the next layer.
    At first, the network doesn’t know much, but it learns by seeing lots of examples (i.e. training, see further down), adjusting itself to recognize patterns and make smarter decisions over time. This is how it helps with tasks like recognizing faces, understanding speech, or even driving cars—similar to how humans learn through experience. A toy forward pass appears in the sketches after this glossary.
  3. Attributes: Attributes refer to any key characteristics or properties that define data or an object in an AI system. They are the essential details that the AI system uses to process and make sense of data. For example, in an AI system identifying objects in an image, the attributes might be features like shape, colour, or size. These attributes help the AI recognize and work with the data.
  4. Chunk: In AI, particularly in the context of RAG (see further down), a chunk refers to a smaller piece of data that is broken down from a larger dataset for easier processing. These chunks are embedded (converted into vector representations, see Embedding further down) and then stored in a vector database, which helps the AI efficiently retrieve the relevant pieces during processing. A simple chunking sketch appears after this glossary.
  5. Classification: Classification happens when AI sorts things into categories. In technical terms, classification is a supervised machine learning method where the model tries to predict the correct label for a given input. This is an important part of AI systems because it allows them to interpret, understand, and respond to various types of data accordingly. A toy classifier appears in the sketches after this glossary.
  6. Context Window: A context window is the amount of information that a Large Language Model, or LLM (see further down), can process when generating a response. In simpler terms, it’s the amount of information AI looks at before giving an answer. A large context window can be really helpful because the AI has access to a lot of information, but it can also be problematic if the information fed to the AI includes lots of low-quality or irrelevant material (see Hallucination further below). A tiny truncation sketch appears after this glossary.
  7. Corpus: A corpus is a large collection of text or data that AI uses to learn how to perform its intended function.
  8. Data Labelling: Data Labelling is when humans help AI learn by tagging or labelling raw data. For example, we might label photos as “cat” or “dog” so AI knows what it’s looking at.
  9. Deep Learning: Deep learning is a type of AI that mimics how our brain processes information. It uses neural networks (see above) - layers of connected "neurons" - to learn from large amounts of data.
    Unlike traditional machine learning models, deep learning models can handle raw data, like images or text, and identify patterns without needing a lot of pre-labelled examples. As these models process more data, they keep improving, making deep learning powerful for tasks like recognizing faces, understanding speech, and even generating creative content like art or music.
  10. Embedding: Embedding is the way in which AI turns words or objects into numbers so it can understand and work with them. Objects like text, images, and audio are represented as vectors in a high-dimensional space where the distances between vectors are semantically meaningful - for example, “father” would live in much closer proximity to “dad” in the vector space (IBM). It’s like translating words into a language AI can read. A toy similarity example appears in the sketches after this glossary.
  11. Fine-tuning: Adjusting a pre-trained AI model to make it better at specific tasks. It’s like giving AI a final bit of polish to improve its performance in a particular area.
  12. Foundational Model: This is a large, general-purpose AI model that has been trained on a vast amount of data and can be adapted to various tasks, like a base model for many purposes.
  13. Generative AI: AI that creates new content, such as images, text, or music, from patterns it has learned.
  14. Grounding: Grounding is what happens when AI links abstract data (like words) to real-world meanings, helping it understand concepts in context.
  15. Hallucination: Hallucination is when AI generates information or answers that sound believable but are actually false or made up (when it invents citations that don’t exist, this is sometimes playfully called a ‘hallucitation’). It’s like AI confidently giving the wrong answer, and it often happens because the model is predicting plausible-sounding text rather than checking facts.
  16. Input: Input refers to the data or information that is fed into a model or system to be processed. This could be anything like images, text, numbers, or audio. The AI uses this input to analyze and understand patterns or relationships. Input is essentially the starting point for an AI model to begin its work—what you give it to understand, learn from, or make decisions about. The quality and type of input directly impact how well the model performs.
  17. LLMs: Short for Large Language Models, LLMs are advanced AI systems trained on vast amounts of text data to understand and generate human-like language. They use deep learning, particularly neural networks, to process and learn from patterns in language, allowing them to perform tasks like text generation, translation, summarization, and answering questions.
  18. Machine Learning: Machine Learning or ML, is a type of AI where computers learn from data instead of following strict rules. They recognize patterns in the data and use that knowledge to make predictions or decisions. As they are exposed to more data, they get better at their tasks, like identifying images or recommending products, without needing constant updates from humans (IBM).
  19. Model orchestration: Model orchestration is the process of coordinating different AI models, data processes, and systems to work together efficiently. It involves managing how each part interacts, ensuring tasks happen in the right order, and making sure everything runs smoothly.
  20. Natural Language Processing (NLP): This is a branch of AI that focuses on helping computers understand, interpret, and respond to human language, like Siri or chatbots.
  21. Output: The result or answer AI gives after processing the input. For example, if you ask AI a question, the output is the answer it provides.
  22. Parameters: Parameters are the adjustable elements within an AI model that influence its behaviour and decision-making processes, essentially shaping the model's learning and output generation capabilities. They are like dials that the system tunes to fit the data it’s learning from. For example, in a neural network, parameters help decide the strength of connections between neurons. The better the parameters are set, the more accurate the model’s predictions or decisions will be.
  23. Prompt: A question or command given to an LLM to generate a response. For example, when you ask an AI to write an essay, that request is the "prompt."
  24. Prompt Engineering: The art of crafting prompts to get the best possible results from an AI. It’s about knowing how to ask AI questions effectively (McKinsey & Company).
  25. RAG: Retrieval-Augmented Generation, or RAG, is an AI technique where the model pulls relevant information from a large database before answering, blending this data with what it already knows to give precise, context-rich responses. A minimal sketch of the retrieval step appears after this glossary.
  26. Self-supervised Learning: This is a method where AI models learn from unlabelled data by creating their own labels from the data itself. In this process, the system uses part of the data to predict another part, enabling it to learn patterns and relationships without needing humans to manually label the data. For example, the model might learn by predicting missing words in a sentence. A tiny example of a model creating its own labels appears after this glossary.
  27. Semantics: Semantics in AI refers to understanding the meaning behind words, phrases, or symbols, rather than just the literal text or data. It's about grasping the context and relationships between words to interpret their true meaning. For example, in language models, semantics enables AI to not only recognize a word but also understand how it relates to others in a sentence, capturing nuances like tone, intent, or emotion. This deeper understanding helps AI systems perform tasks like answering questions or translating languages more accurately.
  28. Sentiment: AI’s ability to detect emotions or attitudes in text, like whether a product review is positive, negative, or neutral.
  29. Temperature: In AI, temperature controls how random or creative the AI’s output is. A lower temperature means more predictable responses, while a higher temperature makes the output more varied or creative. A small worked example appears in the sketches after this glossary.
  30. Text Mining: This is when AI scans through large amounts of unstructured text to find useful information or patterns, like pulling out key facts from news articles.
  31. Tokens: Tokens are the basic units of data that models work with, especially in natural language processing. These can be individual words, parts of words, or even characters, depending on how the model breaks down the input text. For example, a sentence like "I love AI" might be broken into three tokens: "I", "love", and "AI". By working with tokens, the AI can analyze and understand language step by step, helping it process text and generate accurate responses or predictions. A toy tokenizer appears in the sketches after this glossary.
  32. Training: The process of teaching an AI model by feeding it data so it can learn patterns and improve its performance over time.
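
A few of the ideas above are easier to see in code. The sketches below are toy illustrations in Python, with invented numbers and simplified logic, not production code. First, a minimal artificial neural network (term 2): one hidden layer, with made-up random weights standing in for what training would learn.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # A simple activation: negative signals are zeroed out
    return np.maximum(0, x)

# 4 input attributes -> 3 hidden neurons -> 1 output
W1 = rng.normal(size=(4, 3))  # connection strengths into the hidden layer
b1 = np.zeros(3)
W2 = rng.normal(size=(3, 1))  # connection strengths into the output
b2 = np.zeros(1)

def forward(x):
    hidden = relu(x @ W1 + b1)  # each hidden neuron processes the input...
    return hidden @ W2 + b2     # ...and passes its result along

x = np.array([0.5, -1.2, 3.3, 0.0])  # one example with 4 attributes
print(forward(x))
```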
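
Next, chunking (term 4). This sketch splits a long text into fixed-size, overlapping pieces; the sizes here are arbitrary, and real pipelines often split on sentences or paragraphs instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of roughly chunk_size characters."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # overlap preserves context across boundaries
    return chunks

document = "AI systems break long documents into chunks before embedding them. " * 20
print(len(chunk_text(document)))     # number of chunks produced
print(chunk_text(document)[0][:60])  # start of the first chunk
```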
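
Classification (term 5), using scikit-learn. The features and labels are invented: each animal is described by two attributes, weight and height, and the model predicts a label for a new example it has never seen.

```python
from sklearn.neighbors import KNeighborsClassifier

# Labelled training data (invented): [weight_kg, height_cm] -> label
X = [[4.0, 25], [5.5, 30], [30.0, 60], [35.0, 65]]
y = ["cat", "cat", "dog", "dog"]

model = KNeighborsClassifier(n_neighbors=1)
model.fit(X, y)                    # learn from the labelled examples

print(model.predict([[6.0, 28]]))  # -> ['cat']
```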
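
The context window (term 6), reduced to its simplest form: if the conversation is longer than the window, the model only "sees" the most recent part. Real systems measure the window in tokens from the model's own tokenizer, not in whole words as here.

```python
def fit_context(tokens: list[str], window: int) -> list[str]:
    # Keep only the most recent tokens that fit in the window
    return tokens[-window:]

conversation = "hello how are you today I wanted to ask about embeddings".split()
print(fit_context(conversation, window=5))
# -> ['wanted', 'to', 'ask', 'about', 'embeddings']
```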
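
Embeddings (term 10). The three-dimensional vectors below are invented so that "dad" sits close to "father"; real embedding models produce vectors with hundreds or thousands of dimensions, and closeness is commonly measured with cosine similarity.

```python
import numpy as np

vectors = {
    "father": np.array([0.90, 0.10, 0.30]),
    "dad":    np.array([0.85, 0.15, 0.35]),
    "banana": np.array([0.10, 0.90, 0.70]),
}

def cosine_similarity(a, b):
    # 1.0 means pointing the same way; near 0 means unrelated
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vectors["father"], vectors["dad"]))     # close to 1
print(cosine_similarity(vectors["father"], vectors["banana"]))  # much lower
```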
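
The retrieval step of RAG (term 25). To keep the sketch self-contained, simple word overlap stands in for embedding similarity; a real system would embed the question and the chunks with the same model and compare vectors in a vector database.

```python
def score(question: str, chunk: str) -> int:
    # Crude relevance: how many words the question and chunk share
    return len(set(question.lower().split()) & set(chunk.lower().split()))

chunks = [
    "The context window is how much text the model can see at once.",
    "Temperature controls how random the model's output is.",
    "Tokens are the basic units a language model works with.",
]

question = "what does temperature control"
best = max(chunks, key=lambda c: score(question, c))

# The retrieved chunk is blended into the prompt the model finally sees
prompt = f"Answer using this context:\n{best}\n\nQuestion: {question}"
print(prompt)
```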
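
Self-supervised learning (term 26). No human labels anything here: the training pairs are manufactured from the raw sentence by hiding one word at a time and keeping the hidden word as the answer.

```python
sentence = "the cat sat on the mat".split()

pairs = []
for i, word in enumerate(sentence):
    context = sentence[:i] + ["[MASK]"] + sentence[i + 1:]
    pairs.append((" ".join(context), word))  # (input with a gap, correct answer)

for masked, target in pairs[:3]:
    print(f"{masked!r} -> {target!r}")
# '[MASK] cat sat on the mat' -> 'the'
# 'the [MASK] sat on the mat' -> 'cat'
# 'the cat [MASK] on the mat' -> 'sat'
```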
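
Temperature (term 29). The raw scores ("logits") below are invented; what matters is how dividing by the temperature changes the resulting probabilities before the model picks its next token.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = np.array(logits) / temperature
    exps = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    return exps / exps.sum()

logits = [2.0, 1.0, 0.5]  # scores for three candidate next tokens
print(softmax_with_temperature(logits, 0.5))  # low T: peaky, predictable
print(softmax_with_temperature(logits, 1.5))  # high T: flatter, more varied
```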
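
Finally, tokens (term 31). Real models use learned subword vocabularies (such as byte-pair encoding), but splitting on words and punctuation is enough to show the idea.

```python
import re

def tokenize(text: str) -> list[str]:
    # Words become tokens; punctuation becomes separate tokens
    return re.findall(r"\w+|[^\w\s]", text)

print(tokenize("I love AI"))     # -> ['I', 'love', 'AI']
print(tokenize("Don't panic!"))  # -> ['Don', "'", 't', 'panic', '!']
```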

Ready to learn more? Here's an easy-to-digest guide on LLMs, or, if you're ready to level up, our very own AI Engineer, Marcel Marais, debates RAG in this article.

Find out more about us here.