Build a Customer Support Bot in 20 Minutes With Tanuki + GPT4

By Martbakler, published in Towards AI on Dec 10, 2023 (12 min read)

A programmer Tanuki — created with DALL·E 3

TLDR: This workflow responds to customer feedback messages and parses them into prioritized support tickets using GPT4 + Tanuki (open-source).

Who is this useful for? Anyone interested in creating applications that integrate GPT4 capabilities into complex workflows with little to no effort.

How advanced is this post? Basic knowledge of Python is required, but that’s about it. Tanuki takes care of the rest.

What this post doesn’t do. We don’t get Tweets (see their API) or post to your customer support service (depends heavily on the tech stack).

Introduction

GPT4 has taken the world by storm. Potential use cases range from chatbots to classifiers, to data analytics, and now to vision.

One remaining challenge is creating applications that require structured outputs, as language models return outputs in natural text.

In addition, LLMs used in a zero-shot fashion are known to be misaligned (the model may understand the task differently from what was intended) and to produce hallucinations. To counter this, we have created Tanuki, a simple way to create aligned LLM-powered applications and ensure structured and typed outputs.

Here, I will demonstrate how to use Tanuki in 20 minutes to create:

  1. A (Customer Support) chatbot
  2. A (Customer support) classifier, and
  3. Reliable structured objects for logging into databases.

Use-case in a nutshell

In this example use case, we will create a workflow that both acts as a customer support chatbot and creates potential issues in an internal system. In particular, we assume a tweet to a company’s support account as input. The workflow will create an empathetic response to the tweet and classify it to determine whether it requires further action from the company’s support team. If the tweet does require further action, then a support ticket is created that you can save to your internal database for further action.

Tanuki will be used in this example to create the response, classify the tweet, and create the support ticket if needed. We were able to build this Python app in 20 minutes. Now, we want to show you how to do the same. If you’re impatient, you can find the repo and use-case in the following links:

What is Tanuki, and why is it useful for this use case?

Tanuki is an open-source library for easy and seamless building of LLM-powered apps with little to no required LLM or prompting knowledge. The useful features of Tanuki, especially for this use case, are:

  • Type-Awareness: The outputs will always follow the type hinting structure that the user specifies (a nightmare to deal with using LLMs directly, as their outputs are free-form text). The outputs can be specified as Python base types (ints, lists, dictionaries) or more complex types such as Pydantic classes or Literals.
  • Language model behavior is easily alignable through demonstrative mock examples (follows the idea of In-Context Learning). Adding these align statements is not required but usually significantly improves model performance. Adding mock examples is as easy as writing them down in a notepad.
  • Tanuki carries out automatic model distillation, i.e., the teacher model (GPT-4) is automatically distilled down to a smaller fine-tuned model giving up to 15x reductions in both cost and latency without sacrificing performance.

These features are relevant for this use-case as:

  1. The outputs of the LLM will be used downstream by other systems. This requires the outputs to always be typed, so that nothing breaks and there are no data-induced runtime bugs. This is crucial, as the workflow creates structured ticket objects that need to be logged to databases that expect data in a given schema.
  2. What should (and shouldn’t) be acted upon is subjective and not obvious. Moreover, how the language model responds to clients' feedback is of huge importance (the tone and message need to be correct to not anger customers who are already having issues). Thus, alignment is crucial, as you wish to ensure that the LLMs understand what is an appropriate response and that any further action (you don’t want to miss a customer request) is aligned with how an employee would address these problems. Aligning the language model to these issues is the only way to ensure performance suitable for production use.
  3. The number of potential support tickets and feedback messages is colossal in production settings. Thus, reducing costs by a factor of up to 15 can represent a huge saving and a motivation for long-term use, especially as the performance will remain constant and the workflow will not be affected by potential future version changes to GPT4.
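To get a feel for what a 15x cost reduction means at volume, here is a small back-of-the-envelope calculation. The monthly call volume and the per-call price are made-up numbers chosen only for illustration; the factor of 15 is the "up to 15x" figure quoted above, so treat the result as a best case rather than a guarantee.

```python
# Hypothetical numbers, chosen only to illustrate the arithmetic.
def monthly_cost(calls_per_month: int, cost_per_call: float) -> float:
    """Total monthly spend for a given call volume and per-call price."""
    return calls_per_month * cost_per_call


calls = 100_000               # assumed monthly volume of feedback messages
teacher_cost_per_call = 0.03  # assumed price per call of the teacher model, in USD
reduction_factor = 15         # the "up to 15x" figure quoted above

teacher_total = monthly_cost(calls, teacher_cost_per_call)
distilled_total = monthly_cost(calls, teacher_cost_per_call / reduction_factor)

print(f"teacher:   ${teacher_total:,.2f}")
print(f"distilled: ${distilled_total:,.2f}")
print(f"saved:     ${teacher_total - distilled_total:,.2f}")
```

Under these assumptions the monthly bill drops from roughly $3,000 to roughly $200.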

To scope out the project, we’ll use the following requirements

First, we assume the following general workflow:

  • A user feedback message regarding a product or service is sent to the Twitter account.
  • The LLM analyzes the feedback and responds empathetically (or as best as it can), i.e., the chatbot aspect.
  • Given the customer feedback, the LLM will classify the feedback as “requires action” or not, i.e., the classifier aspect.
  • If action is required, the chatbot will create a customer ticket object to be used later in downstream applications.
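Before involving any LLM, the control flow of the steps above can be sketched in plain Python. Everything in this snippet is a stand-in: the `Stub*` classes, the keyword rule in `stub_classify_and_respond`, and the fixed urgency are hypothetical placeholders for what the language model will do later in the post.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class StubResponse:
    requires_ticket: bool
    response: str


@dataclass
class StubTicket:
    issue: str
    urgency: str


def stub_classify_and_respond(text: str) -> StubResponse:
    # Placeholder rule standing in for the LLM: flag questions and breakages.
    needs_action = "?" in text or "broke" in text.lower()
    return StubResponse(needs_action, "Thanks for reaching out!")


def stub_create_ticket(text: str) -> StubTicket:
    # Placeholder: the real version would summarise the issue and pick an urgency.
    return StubTicket(issue=text[:60], urgency="medium")


def handle_feedback(text: str) -> Tuple[StubResponse, Optional[StubTicket]]:
    # Always respond; only create a ticket when action is required.
    response = stub_classify_and_respond(text)
    ticket = stub_create_ticket(text) if response.requires_ticket else None
    return response, ticket


reply, ticket = handle_feedback("The handle broke after 2 days. Can I get a replacement?")
print(reply.requires_ticket, ticket is not None)

reply, ticket = handle_feedback("Loving the new product so far!")
print(reply.requires_ticket, ticket)
```

The point of the sketch is the shape of the return value: a response is always produced, while the ticket is optional.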

We will use OpenAI’s GPT4 for this use-case. To start, we’ll need to set an environment variable with your OpenAI API key. For this, create a .env file in the project directory. The .env file will be read later, and the environment variables it defines will be loaded.

OPENAI_API_KEY=sk-XXX
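Optionally, a small startup check can catch a missing key early with a readable error instead of a failed API call later on. The `require_env` helper below is a hypothetical convenience written for this post, not part of Tanuki or the OpenAI client; it only reads the process environment, so it should be called after the .env file has been loaded.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment"
        )
    return value


# Example: call this once at startup, after the .env file has been loaded.
# api_key = require_env("OPENAI_API_KEY")
```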

And this is all you need to configure to get started with Tanuki! Next, let's see how to build the use case.

Building the workflow

As mentioned before, you could use GPT4 with prompts if cost was not a factor, although getting typed outputs would require additional work. If you had a few weeks, you could even fine-tune an open-source LLM to handle this task. Instead, we’ll use Tanuki to do this in 20 minutes.

First, we need to install Tanuki:

pip install tanuki.py

Then let’s lay out some groundwork. We can assume the input tweet objects will be as follows:

from pydantic import BaseModel

class Tweet(BaseModel):
    """
    Tweet object
    The name is the account of the user
    The text is the tweet they sent
    id is a unique identifier
    """
    name: str
    text: str
    id: str

Then, given this tweet, we will create a Response Pydantic object to post back to the user. If a human needs to act on the input message, a SupportTicket object also needs to be created so it can be saved to the database.

from typing import Literal, Optional

class Response(BaseModel):
    """
    Response object, where the response attribute is the response sent to the customer
    requires_ticket is a boolean indicating whether the incoming tweet was a question or a direct issue
    that would require human intervention and action
    """
    requires_ticket: bool
    response: str

class SupportTicket(BaseModel):
    """
    Support ticket object, where
    issue is a brief description of the issue customer had
    urgency conveys how urgently the team should respond to the issue
    """
    issue: str
    urgency: Literal["low", "medium", "high"]
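To see why the Literal type hint matters for downstream systems, here is a short check of how Pydantic itself treats the urgency field. This is independent of Tanuki and of any LLM call: it only shows that a value outside the three allowed options is rejected when the object is constructed. The class is repeated without its docstring so the snippet runs on its own.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class SupportTicket(BaseModel):
    issue: str
    urgency: Literal["low", "medium", "high"]


# A value inside the Literal is accepted.
ok = SupportTicket(issue="Handle broke", urgency="high")
print(ok.urgency)

# A value outside the Literal is rejected at construction time.
try:
    SupportTicket(issue="Handle broke", urgency="urgent")
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

Any code that receives a SupportTicket can therefore rely on urgency being one of "low", "medium", or "high".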

Now, having laid out the groundwork, we can start creating the actual functions that will do all the hard work for us.

First, we create the function that creates the Response object from the incoming tweet. We specify the Tweet as the input and the Response as the output type hint. Specifying the type hints is crucial, as they tell the language model carrying out the function what to create as the final output. Tanuki will also always ensure that the outputs adhere to the type hints, so nothing downstream will break due to incorrect objects or unreliable outputs.

Next, we add a summary in the function docstring of what the LLM needs to do and add the @tanuki.patch decorator. This ensures that the outputs from classify_and_respond are well-typed to be analyzed downstream.

import tanuki

@tanuki.patch
def classify_and_respond(tweet: Tweet) -> Response:
    """
    Respond to the customer support tweet text empathetically and nicely.
    Convey that you care about the issue and if the problem was a direct issue that the support team should fix or a question, the team will respond to it.
    """

To make performance more reliable and steer the LLM towards the outputs we want, we add align statements in another function, align_respond, marked with the @tanuki.align decorator.

In align_respond, we align the LLM outputs by showing examples of valid inputs and outputs. This alignment will:

  1. Show it how to respond to the customer's request
  2. Show which customer requests need to be logged (so that an internal ticket is created).

These align statements are written as mock inputs and desired outputs using Python assert statements. Below are a few examples of align statements for the chatbot output objects:

@tanuki.align
def align_respond():
    input_tweet_1 = Tweet(name="Laia Johnson",
                          text="I really like the new shovel but the handle broke after 2 days of use. Can I get a replacement?",
                          id="123")
    assert classify_and_respond(input_tweet_1) == Response(
        requires_ticket=True,
        response="Hi, we are sorry to hear that. We will get back to you with a replacement as soon as possible, can you send us your order nr?"
    )
    input_tweet_2 = Tweet(name="Keira Townsend",
                          text="I hate the new design of the iphone. It is so ugly. I am switching to Samsung",
                          id="10pa")
    assert classify_and_respond(input_tweet_2) == Response(
        requires_ticket=False,
        response="Hi, we are sorry to hear that. We will take this into consideration and let the product team know of the feedback"
    )
    input_tweet_3 = Tweet(name="Thomas Bell",
                          text="@Amazonsupport. I have a question about ordering, do you deliver to Finland?",
                          id="test")
    assert classify_and_respond(input_tweet_3) == Response(
        requires_ticket=True,
        response="Hi, thanks for reaching out. The question will be sent to our support team and they will get back to you as soon as possible"
    )
    input_tweet_4 = Tweet(name="Jillian Murphy",
                          text="Just bought the new goodybox and so far I'm loving it!",
                          id="009")
    assert classify_and_respond(input_tweet_4) == Response(
        requires_ticket=False,
        response="Hi, thanks for reaching out. We are happy to hear that you are enjoying the product"
    )

Asserts like the above greatly reduce the likelihood of hallucinations and unexpected failures by aligning the LLM to the intended behavior. I like to think of this as “test-driven alignment” (test-driven development for LLMs).
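For intuition about why demonstrative examples help, the general idea of in-context learning can be shown with a toy prompt builder. To be clear, this is not how Tanuki constructs its prompts internally; `build_few_shot_prompt` and the "Input:/Output:" layout are assumptions made purely for illustration.

```python
from typing import List, Tuple


def build_few_shot_prompt(instruction: str,
                          examples: List[Tuple[str, str]],
                          new_input: str) -> str:
    """Assemble an instruction, worked examples, and a new input into one prompt."""
    parts = [instruction, ""]
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}")
        parts.append(f"Output: {example_output}")
        parts.append("")
    # The new input comes last, with the output left open for the model to complete.
    parts.append(f"Input: {new_input}")
    parts.append("Output:")
    return "\n".join(parts)


examples = [
    ("The handle broke after 2 days. Can I get a replacement?", "requires_ticket=True"),
    ("Just bought the new goodybox and so far I'm loving it!", "requires_ticket=False"),
]
prompt = build_few_shot_prompt(
    "Decide whether the tweet needs a support ticket.",
    examples,
    "Why did my order not arrive?",
)
print(prompt)
```

The model sees concrete input and output pairs before the new input, which is the same role the assert statements play for the patched function.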

Exactly the same needs to be done for the second part, i.e., the creation of the support ticket for logging. Following the same structure of patch and align functions, we have the following:

@tanuki.patch
def create_support_ticket(tweet_text: str) -> SupportTicket:
    """
    Using the tweet text create a support ticket for saving to the internal database
    Create a short summary of action that needs to be taken and the urgency of the issue
    """

@tanuki.align
def align_supportticket():
    input_tweet_1 = "I really like the new shovel but the handle broke after 2 days of use. Can I get a replacement?"
    assert create_support_ticket(input_tweet_1) == SupportTicket(
        issue="Needs a replacement product because the handle broke",
        urgency="high"
    )
    input_tweet_2 = "@Amazonsupport. I have a question about ordering, do you deliver to Finland?"
    assert create_support_ticket(input_tweet_2) == SupportTicket(
        issue="Find out and answer whether we currently deliver to Finland",
        urgency="low"
    )
    input_tweet_3 = "Just bought the new goodybox and so far I'm loving it! The cream package was slightly damaged however, would need that to be replaced"
    assert create_support_ticket(input_tweet_3) == SupportTicket(
        issue="Needs a new cream as package was slightly damaged",
        urgency="medium"
    )

To bring it all together, we create an analyse_and_respond() function to create the response and, if needed, the support ticket, and we’re done!

def analyse_and_respond(tweet: Tweet) -> tuple[Response, Optional[SupportTicket]]:
    # get the response
    response = classify_and_respond(tweet)
    # if the response requires a ticket, create a ticket
    if response.requires_ticket:
        support_ticket = create_support_ticket(tweet.text)
        return response, support_ticket
    # no ticket needed: return the response with None in the ticket slot
    return response, None
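Saving tickets is outside the scope of this post, but a rough sketch shows why the typed urgency field pays off once a database schema is involved. The table name, the columns, and the `save_ticket` helper are all hypothetical, and an in-memory SQLite database stands in for whatever store your support team actually uses.

```python
import sqlite3


def save_ticket(conn: sqlite3.Connection, tweet_id: str, issue: str, urgency: str) -> None:
    """Insert one ticket; the CHECK constraint rejects unknown urgency values."""
    conn.execute(
        "INSERT INTO support_tickets (tweet_id, issue, urgency) VALUES (?, ?, ?)",
        (tweet_id, issue, urgency),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute(
    """
    CREATE TABLE support_tickets (
        tweet_id TEXT PRIMARY KEY,
        issue    TEXT NOT NULL,
        urgency  TEXT NOT NULL CHECK (urgency IN ('low', 'medium', 'high'))
    )
    """
)

save_ticket(conn, "1", "Customer's order did not arrive after 2 weeks", "high")
rows = conn.execute("SELECT tweet_id, urgency FROM support_tickets").fetchall()
print(rows)

# A free-form urgency value would violate the schema and be refused.
try:
    save_ticket(conn, "2", "Some issue", "urgent")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Because SupportTicket.urgency is restricted to the same three values, a ticket produced by the workflow should never trip this constraint.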

The full and final code for all of this should look like this:

from dotenv import load_dotenv
load_dotenv()

from pydantic import BaseModel
from typing import Literal, Optional
import tanuki

class Tweet(BaseModel):
    """
    Tweet object
    The name is the account of the user
    The text is the tweet they sent
    id is a unique identifier
    """
    name: str
    text: str
    id: str

class Response(BaseModel):
    """
    Response object, where the response attribute is the response sent to the customer
    requires_ticket is a boolean indicating whether the incoming tweet was a question or a direct issue
    that would require human intervention and action
    """
    requires_ticket: bool
    response: str

class SupportTicket(BaseModel):
    """
    Support ticket object, where
    issue is a brief description of the issue customer had
    urgency conveys how urgently the team should respond to the issue
    """
    issue: str
    urgency: Literal["low", "medium", "high"]

# response creation
@tanuki.patch
def classify_and_respond(tweet: Tweet) -> Response:
    """
    Respond to the customer support tweet text empathetically and nicely.
    Convey that you care about the issue and if the problem was a direct issue that the support team should fix or a question, the team will respond to it.
    """

@tanuki.align
def align_respond():
    input_tweet_1 = Tweet(name="Laia Johnson",
                          text="I really like the new shovel but the handle broke after 2 days of use. Can I get a replacement?",
                          id="123")
    assert classify_and_respond(input_tweet_1) == Response(
        requires_ticket=True,
        response="Hi, we are sorry to hear that. We will get back to you with a replacement as soon as possible, can you send us your order nr?"
    )
    input_tweet_2 = Tweet(name="Keira Townsend",
                          text="I hate the new design of the iphone. It is so ugly. I am switching to Samsung",
                          id="10pa")
    assert classify_and_respond(input_tweet_2) == Response(
        requires_ticket=False,
        response="Hi, we are sorry to hear that. We will take this into consideration and let the product team know of the feedback"
    )
    input_tweet_3 = Tweet(name="Thomas Bell",
                          text="@Amazonsupport. I have a question about ordering, do you deliver to Finland?",
                          id="test")
    assert classify_and_respond(input_tweet_3) == Response(
        requires_ticket=True,
        response="Hi, thanks for reaching out. The question will be sent to our support team and they will get back to you as soon as possible"
    )
    input_tweet_4 = Tweet(name="Jillian Murphy",
                          text="Just bought the new goodybox and so far I'm loving it!",
                          id="009")
    assert classify_and_respond(input_tweet_4) == Response(
        requires_ticket=False,
        response="Hi, thanks for reaching out. We are happy to hear that you are enjoying the product"
    )

# support ticket creation
@tanuki.patch
def create_support_ticket(tweet_text: str) -> SupportTicket:
    """
    Using the tweet text create a support ticket for saving to the internal database
    Create a short summary of action that needs to be taken and the urgency of the issue
    """

@tanuki.align
def align_supportticket():
    input_tweet_1 = "I really like the new shovel but the handle broke after 2 days of use. Can I get a replacement?"
    assert create_support_ticket(input_tweet_1) == SupportTicket(
        issue="Needs a replacement product because the handle broke",
        urgency="high"
    )
    input_tweet_2 = "@Amazonsupport. I have a question about ordering, do you deliver to Finland?"
    assert create_support_ticket(input_tweet_2) == SupportTicket(
        issue="Find out and answer whether we currently deliver to Finland",
        urgency="low"
    )
    input_tweet_3 = "Just bought the new goodybox and so far I'm loving it! The cream package was slightly damaged however, would need that to be replaced"
    assert create_support_ticket(input_tweet_3) == SupportTicket(
        issue="Needs a new cream as package was slightly damaged",
        urgency="medium"
    )

# final function for the workflow
def analyse_and_respond(tweet: Tweet) -> tuple[Response, Optional[SupportTicket]]:
    # get the response
    response = classify_and_respond(tweet)
    # if the response requires a ticket, create a ticket
    if response.requires_ticket:
        support_ticket = create_support_ticket(tweet.text)
        return response, support_ticket
    # no ticket needed: return the response with None in the ticket slot
    return response, None

And that’s it! Now, we can test the workflow out with a couple of examples.

def main():
    """
    This function analyses the incoming tweet and returns a response output and if needed a ticket output
    """
    # start with calling aligns to register the align statements
    align_respond()
    align_supportticket()

    input_tweet_1 = Tweet(name="Jack Bell",
                          text="Bro @Argos why did my order not arrive? I ordered it 2 weeks ago. Horrible service",
                          id="1")
    response, ticket = analyse_and_respond(input_tweet_1)

    print(response)
    # requires_ticket=True
    # response="Hi Jack, we're really sorry to hear about this. We'll look into it right away and get back to you as soon as possible."

    print(ticket)
    # issue="Customer's order did not arrive after 2 weeks"
    # urgency='high'

    input_tweet_2 = Tweet(name="Casey Montgomery",
                          text="@Argos The delivery time was 3 weeks but was promised 1. Not a fan. ",
                          id="12")
    response, ticket = analyse_and_respond(input_tweet_2)

    print(response)
    # requires_ticket=True
    # response="Hi Casey, we're really sorry to hear about the delay in your delivery. We'll look into this issue and get back to you as soon as possible."

    print(ticket)
    # issue='Delivery time was longer than promised'
    # urgency='medium'

    input_tweet_3 = Tweet(name="Jacks Parrow",
                          text="@Argos The new logo looks quite ugly, wonder why they changed it",
                          id="1123")
    response, ticket = analyse_and_respond(input_tweet_3)

    print(response)
    # requires_ticket=False
    # response="Hi Jacks Parrow, we're sorry to hear that you're not a fan of the new logo. We'll pass your feedback on to the relevant team. Thanks for letting us know."

    print(ticket)
    # None

# run the examples when the file is executed as a script
if __name__ == "__main__":
    main()

Seems like it’s working as expected! The outputs look good and follow the tone we aligned the model to use, and tickets are created when needed with appropriate urgency.

And all this took us under half an hour to create.

What’s Next

While this was just a small example, it demonstrates just how easily developers can create LLM-powered functions and apps using Tanuki.

If this sounds interesting and you’d love to learn more (or, better yet, get involved), please join our Discord. This was just one of the use-cases we have created; you can find others here (for instance, creating structured data from web scraping, creating a To-Do list app from natural text describing to-do items, blocking profane language, and many more).

If you have any questions, let me know here in the comments or at our Discord. Talk to you guys soon!
