Author: Senior Data Scientist
In recent years, we have witnessed a remarkable transformation in the field of artificial intelligence, particularly in the domain of conversational agents. The emergence of Conversational AI, powered by the development of Large Language Models (LLMs), has opened new doors to more natural, engaging, and human-like interactions between humans and computers. This paradigm shift in technology holds the promise of scalable, asynchronous, and cost-effective communication with many applications.
By processing and generating text in a contextually relevant manner, LLMs simulate the dynamics of human interaction. They can craft creative responses, adapt to conversation context, and even adopt distinctive agent traits, imitating various dialects and verbosity levels.
This tutorial explains, step by step, how to build an LLM-based chatbot that has its own tone of voice and is equipped with unique knowledge beyond what ChatGPT can offer out of the box.
Rule-based chatbots are engineered to handle specific cases and well-defined tasks. Their responses are programmed using logic and heuristics. While these chatbots can be effective for closed sets of tasks, they often lack the depth and nuance of genuine conversation. Users may perceive them as artificial and mechanical, as their interactions are governed by predefined rules. While they serve well for tasks like basic customer support or information retrieval, they struggle to provide a seamless and immersive conversational experience.
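To make the contrast concrete, here is a minimal sketch of the rule-based approach; the keywords and canned replies are invented for illustration:

```python
def rule_based_reply(user_query: str) -> str:
    """Return a canned reply based on simple keyword rules."""
    query = user_query.lower()
    if "hours" in query:
        return "We are open from 9am to 5pm, Monday to Friday."
    if "refund" in query:
        return "To request a refund, please fill out the form on our website."
    if "hello" in query or "hi" in query:
        return "Hello! How can I help you?"
    # Fallback when no rule matches -- a common failure mode of rule-based bots
    return "Sorry, I didn't understand that. Could you rephrase?"
```

Every query outside the predefined rules falls through to the generic fallback, which is exactly the rigidity that conversational AI addresses.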
Conversational chatbots represent a significant advancement in the realm of AI-driven communication. These agents utilize machine learning techniques to mimic entire conversations, complete with contextual understanding. Unlike rule-based chatbots, conversational AI systems can adapt to the flow of dialogue, making interactions more natural and engaging. This is achieved by training the models on vast amounts of data, allowing them to grasp the nuances of language and context. Techniques like Retrieval-Augmented Generation (RAG) enable these models to integrate external knowledge and provide accurate responses even in complex scenarios.
One of the key differentiators of conversational AI is its ability to operate in a language abstraction space rather than relying solely on specific words. By understanding the underlying meaning and intent behind user prompts, these models can provide more contextually relevant and coherent responses. This enables them to bridge language barriers, support different proficiency levels, and adapt to various dialects seamlessly.
Conversational AI is not just about mimicking human-like conversations; it’s about redefining the way we interact with technology. The scalability and asynchronicity of conversational agents allow businesses and individuals to engage with users across different time zones and at their own convenience. Moreover, the cost-effectiveness of AI-driven interactions reduces the need for a large human support team, making it an appealing option for businesses aiming to streamline their operations.
There are several components of a conversational system, and they will be introduced in this tutorial. The main purpose is to familiarize you with those building blocks from an LLM interaction perspective.
In this tutorial, we will create a simple ChatBot system and extend it step-by-step with additional components to achieve conversational capabilities.
Many excellent libraries implement these components; LangChain, for example, provides high-level, ready-to-use modules for these functionalities.
However, in this tutorial, for educational purposes, we will build our own components from the ground up to show the general approach to conversational needs.
Our ChatBot prototype code:
```python
class ChatBot:
    ...

    def chat(self, user_query: str):
        """Input user query to a chatbot and print the response."""
        print(f">>> USER: {user_query}")
        answer = ...  # Get the answer
        print(f" <<< BOT: {answer}")
```
From the user’s perspective, we will use the chat method to input queries into the conversation. We will use the openai module as an example of LLM interaction; it provides Python structures for calling OpenAI endpoints. We have to import the necessary modules and set up the OpenAI API key. A common practice is to store such values in environment variables (use the export bash command before running the scripts).
```python
import os

import openai

openai.api_key = os.getenv("OPENAI_API_KEY")  # or provide the key directly
```
In the simplest architecture, we provide direct access to LLM for the user. This allows for free-form conversation with the knowledge already acquired by LLM. The service is de facto a proxy between the user and LLM API and does not provide additional features. The advantage of this solution is that the user is exposed to full LLM capabilities, and the implementation is trivial. The system is stateless, so no session information has to be stored. From a business perspective, the user’s token usage can be tracked.
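As a sketch of how token tracking could work: each API response reports token counts in its usage field, which can be accumulated per user. The accounting structure below is our own illustration, not part of the OpenAI SDK:

```python
from collections import defaultdict

# Running token totals per user; keys and structure are illustrative.
usage_per_user: dict[str, int] = defaultdict(int)

def track_usage(user_id: str, response: dict) -> int:
    """Add the total tokens reported by an API response to a user's tally."""
    total = response["usage"]["total_tokens"]
    usage_per_user[user_id] += total
    return usage_per_user[user_id]

# Example with a response shaped like the ChatCompletion payload:
fake_response = {"usage": {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}}
print(track_usage("bob", fake_response))  # 42
```

In production, these tallies would live in persistent storage rather than an in-memory dict, but the principle is the same.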
Let’s implement the _get_completion method of the ChatBot class according to the OpenAI documentation. We will use the GPT-4 model as an example, but other models can be used as well.
```python
def _get_completion(self, prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message["content"]
```
Our chat method needs to be updated with the new completion function:
```python
def chat(self, user_query: str):
    """Input user query to a chatbot and print the response."""
    print(f">>> USER: {user_query}")
    answer = self._get_completion(user_query)
    print(f" <<< BOT: {answer}")
```
This way, we can chat with the LLM and get the desired information:
```python
if __name__ == "__main__":
    my_bot = ChatBot()
    my_bot.chat("What is the highest building in the world?")
    my_bot.chat("Hi, my name is Bob!")
```
The output will be printed:
```
>>> USER: What is the highest building in the world?
 <<< BOT: The highest building in the world is the Burj Khalifa in Dubai, United Arab Emirates.
>>> USER: Hi, my name is Bob!
 <<< BOT: Hello Bob! How can I assist you today?
```
To add further features to our ChatBot, we will introduce a simple prompt engineering technique that provides context and additional guidance for the conversation. We will ask the chatbot to answer using more sophisticated language, to react when the user does not ask a question, and to avoid any numeric output. This approach also opens the door to more features.
We can define the behavior of our ChatBot using a class variable:
```python
class ChatBot:
    role_description = """
    You are an assistant that uses very sophisticated English to answer User's questions.
    Don't include any numbers in your answer.
    If the user's prompt does not contain a question, kindly inform about it.
    """
```
We also need to modify our _get_completion method:
```python
def _get_completion(self, prompt: str) -> str:
    messages = [
        {"role": "system", "content": self.role_description},
        {"role": "user", "content": prompt},
    ]
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message["content"]
```
This way, we get more interesting responses to the prompts used previously.
```
>>> USER: What is the highest building in the world?
 <<< BOT: The edifice that currently holds the distinction of being the tallest in the world is the Burj Khalifa, located in the United Arab Emirates. This architectural marvel, with its unparalleled height, is a testament to human ingenuity and the relentless pursuit of pushing boundaries in design and construction.
>>> USER: Hi, my name is Bob!
 <<< BOT: Greetings, Bob! It's a pleasure to make your acquaintance. However, I must inform you that your statement does not contain a question. How may I assist you today?
```
However, if we ask a question in the context of the conversation, the ChatBot will not be able to answer properly:
```
>>> USER: What is the highest building in the world?
 <<< BOT: The edifice that currently holds the title of the world's tallest building is the Burj Khalifa, located in Dubai, United Arab Emirates. This architectural marvel, with its unique design and impressive height, is a testament to human ingenuity and the advancements in construction technology.
>>> USER: In which continent is it?
 <<< BOT: I'm sorry, but your query is incomplete. Could you please specify the place or country you're referring to?
```
Integrating chat memory into Conversational AI enables more engaging and immersive conversations. By retaining previous messages within a dialogue, chatbots offer a deeper level of interaction.
This capability opens the door to applications such as dynamic surveys and step-by-step information acquisition. However, production-grade systems necessitate session management to link conversations with specific users, introducing scalability and design challenges.
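One way session management could be sketched is a registry that lazily creates one message history per session ID. The names and structure here are illustrative, not taken from any specific framework:

```python
# Map each session ID to its own message history.
sessions: dict[str, list[dict]] = {}

SYSTEM_PROMPT = {"role": "system", "content": "You are a helpful assistant."}

def get_history(session_id: str) -> list[dict]:
    """Return the message list for a session, creating it on first use."""
    if session_id not in sessions:
        sessions[session_id] = [SYSTEM_PROMPT.copy()]
    return sessions[session_id]

# Two users get independent conversation states:
get_history("user-1").append({"role": "user", "content": "Hi!"})
print(len(get_history("user-1")))  # 2
print(len(get_history("user-2")))  # 1
```

A production system would back this registry with a database and expire stale sessions, but the isolation of per-user state is the core idea.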
To add conversation memory, we introduce an instance variable, self.messages, which will store all of the messages from the conversation. In this simple approach, messages are stored as a list without more advanced management.
It is worth noticing that the LangChain framework provides many different ways to handle conversation memory.
```python
def __init__(self):
    self.messages = [{"role": "system", "content": self.role_description}]

def _get_completion(self, prompt: str) -> str:
    self.messages.append({"role": "user", "content": prompt})
    response = openai.ChatCompletion.create(
        model="gpt-4", messages=self.messages, temperature=0
    )
    answer = response.choices[0].message["content"]
    self.messages.append({"role": "assistant", "content": answer})
    return answer
```
After this addition, we can ask the chatbot follow-up questions while retaining the context of the conversation.
```
>>> USER: What is the highest building in the world?
 <<< BOT: The edifice that currently holds the distinction of being the tallest in the world is the Burj Khalifa, located in the United Arab Emirates. This architectural marvel, with its awe-inspiring height and grandeur, is a testament to human ingenuity and the advancements in construction technology.
>>> USER: In which continent is it?
 <<< BOT: The Burj Khalifa, the world's tallest building, graces the skyline of the city of Dubai, which is situated in the continent of Asia.
```
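One caveat of this simple approach: every stored message is resent on each request, so the prompt grows with the conversation. A simple illustrative safeguard, not part of the tutorial code above, is a sliding window over the history:

```python
def trim_history(messages: list[dict], max_messages: int = 10) -> list[dict]:
    """Keep the system prompt plus the most recent max_messages entries."""
    if len(messages) <= max_messages + 1:
        return messages
    # messages[0] is the system prompt; keep it and drop the oldest turns.
    return [messages[0]] + messages[-max_messages:]
```

Calling trim_history(self.messages) before each API request bounds token usage at the cost of forgetting older turns; frameworks like LangChain offer more refined strategies, such as summarizing the dropped messages.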
In previous architectures, the system relied solely on the internal knowledge of the LLM. This can pose limitations, especially when specific information needs to be sourced from external repositories such as document databases.
The Retrieval-Augmented Generation (RAG) technique marks a transformative step and enables the incorporation of context-rich information from knowledge management systems. However, it introduces additional requirements on the infrastructure, like vector databases, access control, and others. We will develop a very simple example of RAG using a similarity search method.
Let’s consider that we have the following facts that we want to use as a knowledge base for our LLM:
```python
facts = [
    "Sarah has three sisters.",
    "The weather is sunny right now.",
    "Sarah lives in Burj Khalifa",
]
```
We want to add a specific fact that corresponds to the user’s query to enrich the response with information. To do this, we need to calculate the embedding vector for each fact in order to encode its meaning in mathematical form.
```python
class ChatBot:
    ...

    def __init__(self):
        self.messages = [{"role": "system", "content": self.role_description}]
        self.facts = []
        self.fact_embeddings = []

    @staticmethod
    def _get_embedding(string: str) -> list[float]:
        response = openai.Embedding.create(
            model="text-embedding-ada-002", input=string
        )
        return response["data"][0]["embedding"]

    def add_facts(self, facts: list[str]):
        self.facts += facts
        self.fact_embeddings += [self._get_embedding(fact) for fact in facts]
```
This allows us to use a technique called similarity search, in which we look for the fact that best matches the user’s query. The algorithm iterates through all the facts and selects the one most relevant to the query.
```python
import numpy as np

...

class ChatBot:
    ...

    def _similarity_search(self, string: str) -> str:
        input_embedding = self._get_embedding(string)
        similarity = np.dot(self.fact_embeddings, input_embedding)
        facts_index = np.argmax(similarity)
        return self.facts[facts_index]
```
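A plain dot product works as the similarity score here because OpenAI's text-embedding-ada-002 vectors are normalized to unit length, so the dot product equals cosine similarity. A toy pure-Python illustration with hand-made three-dimensional "embeddings" (real ones have 1536 dimensions):

```python
import math

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length, like the API's embeddings."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

facts = ["Sarah has three sisters.", "The weather is sunny right now."]
fact_vecs = [normalize([1.0, 0.2, 0.0]), normalize([0.0, 0.1, 1.0])]

# Pretend embedding of a query about sisters: close to the first fact vector.
query_vec = normalize([0.9, 0.3, 0.1])
scores = [dot(f, query_vec) for f in fact_vecs]
best = facts[scores.index(max(scores))]
print(best)  # Sarah has three sisters.
```

The fact whose vector points in nearly the same direction as the query vector wins, which is exactly what np.argmax over the dot products computes in _similarity_search.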
Now we can add the most relevant fact as a context to our conversation:
```python
class ChatBot:
    ...

    def _get_completion(self, prompt: str) -> str:
        self.messages.append({"role": "user", "content": prompt})
        current_messages = self.messages.copy()
        if self.facts:
            similar_fact = self._similarity_search(prompt)
            current_messages.append(
                {
                    "role": "system",
                    "content": f"Use the following information to answer the question: {similar_fact}",
                }
            )
        response = openai.ChatCompletion.create(
            model="gpt-4", messages=current_messages, temperature=0
        )
        answer = response.choices[0].message["content"]
        self.messages.append({"role": "assistant", "content": answer})
        return answer
```
As a result, we can ask the chatbot about the facts beyond common knowledge already internalized by the LLM.
```python
if __name__ == "__main__":
    my_bot = ChatBot()
    facts = [
        "Sarah has three sisters.",
        "The weather is sunny right now.",
        "Sarah lives in Burj Khalifa",
    ]
    my_bot.add_facts(facts)
    my_bot.chat("What is the biggest building in the world?")
    my_bot.chat("Who lives there?")
    my_bot.chat("How many sisters does she have?")
```
```
>>> USER: What is the biggest building in the world?
 <<< BOT: The tallest edifice in the world is the Burj Khalifa, located in Dubai, United Arab Emirates. This architectural marvel holds numerous records, including the highest occupied floor, the highest outdoor observation deck, and the tallest service elevator. Its construction is a testament to human ingenuity and the relentless pursuit of pushing boundaries.
>>> USER: Who lives there?
 <<< BOT: The Burj Khalifa is a residential and commercial building, housing a multitude of individuals and businesses. Among its many residents is a lady named Sarah. She, along with others, enjoys the luxury and prestige that comes with residing in such a renowned structure.
>>> USER: How many sisters does she have?
 <<< BOT: Sarah is blessed with the companionship of three sisters. The bond they share is undoubtedly strong and filled with love and mutual respect.
```
Undoubtedly, adding more context and internal knowledge to the conversation gives it a more natural, immersive flow.
The development of Conversational AI, facilitated by the rise of Large Language Models, marks a significant turning point in the technology landscape. These AI-driven chatbots are not only capable of understanding and responding to human conversations, but also of fostering engaging, context-aware, and natural interactions.
As we stand on the cusp of this paradigm shift, we are witnessing the transformation of human-computer communication into a more scalable, asynchronous, and cost-effective endeavor. Conversational AI is not just a technological advancement; it’s a glimpse into a future where machines truly understand and engage with us in a remarkably human-like manner. If you’re interested in realizing the potential of Conversational AI for your business, discover more about AI consulting services. Our team of experts is ready to assist you on this transformative journey.