Explaining Large Language Models: The Challenges, the Benefits and the Fast-Approaching Future
It would have been relatively easy to ask ChatGPT to write this article, but for one problem: it’s not me. It doesn’t have my knowledge or my reference points. And it’s not inside my head. Yet. More on that shortly. First, let’s dig into what Large Language Models such as ChatGPT are, why they’re important and what the future of AI holds.
ChatGPT is possibly the world’s most famous Large Language Model (LLM), with more than 180 million users. The openai.com website - home to ChatGPT - received 1.5 billion visits in September 2023 alone. It’s so ubiquitous that my Uber driver planned his holiday to Florence using ChatGPT, and your coworker’s daughter probably asked it questions about Macbeth when she was prepping for a test.
But how do LLMs work? First, a language model is a machine learning model that aims to predict and generate plausible language. When your email provider autocompletes your sentences, that’s a language model at work. A large language model is simply one trained on massive amounts of data - and that training process is what gives it its effectiveness.
Training an LLM involves exposing it to huge amounts of data, enabling it to learn the patterns and connections between words, and then predict the next word in a sentence. For example, GPT-3 was trained using online text databases. This included a staggering 570 GB of data from books, online texts, Wikipedia entries, articles and other internet-based writing. In total, around 300 billion tokens were fed into it.
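To make “predict the next word” concrete, here’s a minimal sketch in Python. It uses simple bigram counts over a toy corpus rather than a neural network - real LLMs learn vastly richer statistics over billions of tokens, but the basic prediction idea is the same. All names and the toy corpus here are illustrative.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the corpus."""
    model = defaultdict(Counter)
    words = corpus.split()
    for current_word, next_word in zip(words, words[1:]):
        model[current_word][next_word] += 1
    return model

def predict_next(model, word):
    """Return the most frequent follower of `word` seen during training."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" more often than "mat"
```

A neural LLM replaces the raw counts with learned probabilities conditioned on the whole preceding context, which is what lets it generate fluent text rather than just echoing frequent pairs.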
Large Language Model Architecture
The most common architecture for an LLM is the transformer model. In its original form, this consists of an encoder, which converts input text into an intermediate representation, and a decoder, which converts that intermediate representation into useful output text. (Many modern LLMs, including the GPT family, use only the decoder stack.)
The model processes data by tokenizing the input and applying mathematical operations - most notably self-attention - to uncover relationships between tokens. The model can then pick up on the kinds of patterns a human would see - hence the term Artificial Intelligence (AI).
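As an illustration of what tokenizing the input means, here’s a deliberately simplified word-level tokenizer in Python. Production LLMs use subword schemes such as byte-pair encoding, so this is a sketch of the concept, not of any real tokenizer; the function names are made up for the example.

```python
import re

def tokenize(text):
    """Very rough word-level tokenizer: lowercase words and punctuation.
    Real LLMs split text into subword tokens instead."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

def build_vocab(tokens):
    """Assign each unique token a numeric ID, in order of first appearance."""
    vocab = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab

tokens = tokenize("Large language models predict the next token.")
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]  # the numeric sequence the model actually sees
```

The model never sees raw text - only sequences of IDs like `ids`, over which it computes the relationships described above.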
Generative AI is the overarching term for AI applications or models that can generate any kind of content - whether it’s the written word, code, music or visuals. All large language models are a form of generative AI, but not all generative AI tools are built on LLMs.
Use Cases for Large Language Models
The massive datasets ingested by LLMs mean they can recognize, translate, predict or generate text or other content. They can be trained to translate and write text, write software code, answer questions and summarize documents.
There are multiple potential use cases in a number of key fields, including:
Healthcare - Literature mining and text summarization for clinical research
Banking & Finance - Generating data for financial simulations
Retail & Marketing - Creating unique content for websites, product descriptions, brochures, ad campaigns etc.
Metaverse - Creating virtual worlds e.g. for medical training simulations
Data & Coding - Code generation and generating documentation
Currently, LLMs are making an impact in a number of fields, including customer service, translation and healthcare. There can’t be many people who haven’t been asked by their bank or another online business to take their customer service questions to a chatbot.
Many modern chatbots are LLMs that have been fine-tuned to handle specific tasks. Once a chatbot has been given the relevant knowledge, with the right prompts it will generate contextually relevant answers to customer questions. At least it will some of the time - if the questions aren’t too complicated or out of the ordinary.
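One common way to supply a chatbot with the relevant knowledge at answer time is to assemble it into the prompt alongside the customer’s question. The sketch below shows that prompt-assembly step in plain Python; the instruction wording, function name and example data are all illustrative, and the actual call to an LLM is deliberately out of scope.

```python
def build_support_prompt(knowledge: str, question: str) -> str:
    """Combine instructions, retrieved knowledge and the customer's
    question into a single prompt for an LLM-backed chatbot."""
    return (
        "You are a customer service assistant. "
        "Answer only using the context below; if the answer isn't there, "
        "say you don't know and offer to escalate to a human agent.\n\n"
        f"Context: {knowledge}\n"
        f"Customer question: {question}\n"
        "Answer:"
    )

prompt = build_support_prompt(
    knowledge="Refunds are available within 30 days of purchase.",
    question="Can I get my money back after two weeks?",
)
print(prompt)
```

Grounding the model in supplied context like this - and telling it to escalate when the context doesn’t cover the question - is one way teams try to keep answers on-topic for the “complicated or out of the ordinary” cases.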
Train an LLM in healthcare, and it could help to draft medical reports or answer patient questions. You’ll need a human to check for accuracy, but there’s no doubt these models can save us all a lot of time. As the oft-paraphrased quote goes, “AI won’t steal your job - a human using AI will”.
The Advantages and Challenges of Large Language Models
LLM Advantages
One of the biggest advantages of LLMs is the sheer volume of data they can process, which has huge benefits in areas such as communication. Imagine a world where people can communicate with each other almost instantaneously - even when they don’t speak the same language.
At this point, we don’t need to imagine it! Generative AI is now pervasive in software testing, whether it’s individuals using automated testing tools or entire applications undergoing rigorous testing processes. Testing runs faster, with less reliance on human intervention to obtain comprehensive results, and the efficiency gains - in both time and cost - are evident in this and other applications of LLMs. On a positive note for professionals in the software testing field, businesses are unlikely to eliminate human testers entirely: the errors an AI makes when it fails to grasp the intricacies of the testing process can be costly.
The observed time and cost efficiencies in software testing also play a crucial role in the advancement of Natural Language Processing (NLP) applications. Pre-trained language models, such as ChatGPT, are increasingly embraced by developers working on NLP apps. Many of us experience the benefits of NLP progress daily as we interact with smart assistants like Apple's Siri and Amazon's Alexa, relying on them to discern our speech patterns, comprehend our intentions, and furnish a helpful, pertinent response.
LLM Challenges
One of the major issues when it comes to training a large language model is the amount of computing power required. Training an LLM consumes a huge amount of energy - with corresponding carbon emissions. On the flip side, once the LLM is trained, answering your questions requires far less compute than training did, though serving millions of users still adds up.
Another issue is bias. Humans hold biases and prejudices, and since LLMs are fed texts written by humans, they can pick up and amplify human biases. According to a study carried out at the University of California, Berkeley, LLMs have been shown to amplify bias, especially when it comes to racial and gender biases. This means the training data needs to be carefully curated and models must be monitored so any biases can be mitigated.
There’s also a question mark around the trustworthiness of LLMs. They can sometimes generate nonsensical or inaccurate text, so they do need to be monitored. After all, you wouldn’t want to rely on an LLM to make an important medical or legal decision without input from a human doctor or lawyer, would you?
Eggplant’s Synthetic Domain Expert (SDE)
Eggplant is embracing the benefits of offline LLMs, and early in 2024 we’ll be releasing a generative AI platform. This will give you the ability to create your own synthetic domain expert. Give it sources of knowledge to ingest, and it can then supercharge and augment your testing, including generating test scripts and models.
The ability to train your own completely offline internal LLM removes the privacy and security issues around a platform like ChatGPT and allows you to create your own industry-specific version.
Later in the year, we’ll have a copilot assistant - much like Microsoft’s Clippy, except our version, Eggy, is less annoying - which will watch what you’re doing on screen and pop up with something you’ve missed or information about another user. It might tell you, “Bob is looking at this screen at the moment and he’s found a load of issues. Do you want to bring Bob in?”
Towards the end of 2024, we’ll be releasing Eggplant Test Studio, built on the Visual Studio Code integrated development environment (IDE). This will bring copilot-like code generation and, when combined with our model-based testing solution, will also recommend the best paths for testing - leading to higher coverage, uncovering dormant defects and delivering greater efficiency, with business testers finding it easier to adopt advanced features.
There’s a lot of noise in the industry right now about LLMs and generative AI, and hopefully this article - written by a human - makes it clear that there are real benefits fast approaching. Language understanding and language generation by AI are improving all the time. And these are benefits worth welcoming, so we can improve our work, save time on mundane tasks, and truly innovate to increase efficiency.