```
tools = [
    {
        "name": "article_analysis",
        "description": "Analyzes an academic article abstract and provides key information extracted from it. This includes named entities mentioned in the abstract along with their types and context, the main topics covered in the abstract, an assessment of the overall quality of the abstract on a 0-1 scale, and a 0-1 score indicating the novelty or originality of the work described in the abstract compared to existing literature. The tool takes the raw text of the article title, abstract, and authors as input. It should be used to quickly summarize and assess an academic article when only the abstract is available. The tool does not analyze the full article text, only the title and abstract.",
        "input_schema": {
            "type": "object",
            "properties": {
                # The preview cuts off after "entities"; the fields below are
                # reconstructed from the description (field names are assumed).
                "entities": {"type": "array", "items": {"type": "object", "properties": {
                    "name": {"type": "string"}, "type": {"type": "string"}, "context": {"type": "string"}}}},
                "topics": {"type": "array", "items": {"type": "string"}},
                "quality_score": {"type": "number"},
                "novelty_score": {"type": "number"},
            },
        },
    }
]
```
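
A definition in this shape matches what Anthropic's Messages API expects for tool use; a minimal sketch of forcing the model to call it (the model name and article text are placeholders):

```
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # any tool-capable model
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "tool", "name": "article_analysis"},  # force this tool
    messages=[{"role": "user", "content": "Title: ...\nAuthors: ...\nAbstract: ..."}],
)
analysis = response.content[0].input  # structured fields matching input_schema
```
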
Conversation Design

Today, creating chatbots that handle complex tasks naturally yet remain under our control is a significant challenge. Traditional graph-based chatbots offer control but can feel rigid and unnatural, while modern LLM (Large Language Model) chatbots excel in naturalness but can veer off course in complex interactions. This article introduces a hybrid approach, combining the best of both worlds: the precision of graph-based systems with the fluid conversation style of LLMs. Our goal is to provide a solution that makes chatbots both more engaging for users and easier to manage for developers, especially in intricate tasks. We'll explore how this innovative method can transform chatbot interactions, making them feel more natural without sacrificing control.

Old school: Graph-based Chatbots

In the pre-LLM days, conversations were built using intent detection and directed graphs. It starts with a conversation designer building a graph structure where all of the nodes are the things you want the bot to say, as sketched below.
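
The gist preview cuts off here, but the mechanics are straightforward; a minimal sketch of a graph-based dialog flow (node names and intents are illustrative):

```
# Nodes are bot utterances; edges are user intents detected by a classifier.
dialog_graph = {
    "greet":  {"say": "Hi! Do you want to check an order or open a ticket?",
               "edges": {"check_order": "order", "open_ticket": "ticket"}},
    "order":  {"say": "What's your order number?", "edges": {}},
    "ticket": {"say": "Briefly describe the issue.", "edges": {}},
}

def step(node, intent):
    """Follow the edge matching the detected intent, or stay on the current node."""
    return dialog_graph[node]["edges"].get(intent, node)

node = "greet"
print(dialog_graph[node]["say"])
node = step(node, "check_order")  # intent would come from an intent-detection model
print(dialog_graph[node]["say"])
```

The control comes from the graph itself: the bot can only ever say what a node contains, which is exactly the rigidity the hybrid approach above aims to soften.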

mcminis1 / Llama-2-prompting-example.md
Last active September 7, 2023 19:50
Notes on prompting with llama-2

Get the model file from Hugging Face (TheBloke). I chose llama-2-13b-chat.ggmlv3.q4_K_S.bin: it's a good tradeoff between size, speed, and quality. It takes about 12 GB on my card, and I get 60 tokens/second throughput.

Install llama.cpp and start its server:

```
$ llama.cpp/server -t 8 -ngl 128 -m ./llama-2-13b-chat.ggmlv3.q4_K_S.bin -eps 1e-5 -c 4096 -b 1024 --port 8009
```
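
Once the server is up, you can hit its HTTP completion endpoint; a minimal sketch using Python's requests (the prompt and sampling parameters are illustrative):

```
import requests

# llama.cpp's server exposes a JSON completion endpoint on the port chosen above.
resp = requests.post(
    "http://localhost:8009/completion",
    json={"prompt": "Q: What is a llama?\nA:", "n_predict": 64, "temperature": 0.7},
)
print(resp.json()["content"])
```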

Here's the data format for the examples I want to use in my few-shot prompt (the gist preview ends here; a hypothetical sketch follows).
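
The gist's actual format isn't shown in this preview; purely as an illustration, a few-shot example file might look like this (the field names and rendering are hypothetical, not the gist's):

```
# Hypothetical few-shot records -- NOT necessarily the gist's actual format.
examples = [
    {"input": "The stock fell 20% after earnings.", "output": "negative"},
    {"input": "The team shipped the feature early.", "output": "positive"},
]

# Render the records into llama-2's [INST] chat blocks ahead of the real query.
prompt = "".join(f"[INST] {ex['input']} [/INST] {ex['output']} " for ex in examples)
```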

mcminis1 / VectorDatabase.py
Created May 1, 2023 20:39
An example of a hand-rolled vector database, along with a test set.

```
import numpy as np
from InstructorEmbedding import INSTRUCTOR

default_embed_instruction = "Represent the paragraph for retrieval"
default_query_instruction = "Represent the paragraph for retrieving supporting documents:"

class VectorDatabase:
    def __init__(self, embed_instruction=default_embed_instruction, query_instruction=default_query_instruction):
        # INSTRUCTOR embeds [instruction, text] pairs; keep the instructions for later use.
        self.__embedding_model = INSTRUCTOR('hkunlp/instructor-large')
        self.__embed_instruction = embed_instruction
        self.__query_instruction = query_instruction
        self.__data = dict()
```
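
The preview stops partway through the constructor; for context, a minimal sketch of the embed-and-retrieve step such a class builds on (the encode call is INSTRUCTOR's documented usage; everything else is illustrative):

```
import numpy as np
from InstructorEmbedding import INSTRUCTOR

model = INSTRUCTOR('hkunlp/instructor-large')

# INSTRUCTOR encodes [instruction, text] pairs.
docs = ["Llamas are camelids.", "RNNs process sequences one step at a time."]
doc_vecs = model.encode([["Represent the paragraph for retrieval", d] for d in docs])
q_vec = model.encode([["Represent the paragraph for retrieving supporting documents:",
                       "What is a llama?"]])[0]

# Cosine similarity against every stored vector, then take the best match.
sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
print(docs[int(np.argmax(sims))])
```
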
mcminis1 / summary_qa_example.py
Created March 21, 2023 19:51
An example LLM workflow using mr_graph

```
import openai
import asyncio
import sys
import uvloop
from mr_graph.graph import Graph

# mr_graph >= 0.2.5
default_model_name = 'gpt-3.5-turbo'
```
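
The preview ends just past the imports, so mr_graph's wiring isn't visible; for context, the underlying call such a workflow orchestrates, in the pre-1.0 openai SDK of that era, looks like this (the prompt is a placeholder):

```
import asyncio
import openai  # pre-1.0 SDK, matching the gist's March 2023 vintage

default_model_name = 'gpt-3.5-turbo'

async def summarize(text: str) -> str:
    # Async chat completion; mr_graph composes steps like this into a graph.
    resp = await openai.ChatCompletion.acreate(
        model=default_model_name,
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return resp["choices"][0]["message"]["content"]

print(asyncio.run(summarize("Llamas are domesticated South American camelids.")))
```
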
mcminis1 / gpt_index.py
Created February 1, 2023 19:44
Create and query a gpt_index using the documentation from gpt_index and LangChain.

```
from bs4 import BeautifulSoup
import re
from gpt_index import Document
from manifest import Manifest
from langchain.llms.manifest import ManifestWrapper
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from gpt_index import LangchainEmbedding
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader, LLMPredictor, PromptHelper
```
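
Only the imports survive the preview; the core build-and-query flow in gpt_index of that era looked roughly like this (the directory path and question are placeholders, and the Manifest/LangChain wiring from the imports above is omitted):

```
from gpt_index import GPTSimpleVectorIndex, SimpleDirectoryReader

# Load the scraped documentation pages from a local directory, then index them.
documents = SimpleDirectoryReader("docs/").load_data()
index = GPTSimpleVectorIndex(documents)

# Ask the index a question; the LLM answers from the retrieved chunks.
response = index.query("How do I create and query an index?")
print(response)
```
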
mcminis1 / pytorch_rnn.py
Created January 17, 2023 21:46
RNN using PyTorch

```
import numpy as np
import matplotlib.pyplot as plt
import torch
from torch.nn import RNNCell, Linear, Sequential, HuberLoss, Flatten, Module
from torch.utils.data import DataLoader

# Configuration
## RNN definition
### maximum length of sequence
```
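
The preview cuts off inside the configuration block; as a reference for the torch.nn.RNNCell pattern the gist builds on, a minimal unrolled loop (batch size and dimensions are illustrative):

```
import torch
from torch.nn import RNNCell

T, input_dim, hidden_dim = 6, 1, 32   # sequence length and layer sizes
cell = RNNCell(input_dim, hidden_dim)

x = torch.randn(8, T, input_dim)      # a batch of 8 sequences
h = torch.zeros(8, hidden_dim)        # initial hidden state
for t in range(T):
    h = cell(x[:, t, :], h)           # one recurrent update per time step
```
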
mcminis1 / rnn.py
Last active January 17, 2023 21:47
RNN from scratch

```
import numpy as np
import matplotlib.pyplot as plt
import pickle

# Configuration
## RNN definition
### maximum length of sequence
T = 6
### hidden state vector dimension
hidden_dim = 32
```
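
The preview stops at the configuration, but the recurrence a from-scratch RNN implements is compact; a sketch using the T and hidden_dim above (weights randomly initialized for illustration):

```
import numpy as np

T, input_dim, hidden_dim = 6, 1, 32
rng = np.random.default_rng(0)

# Parameters for the update h_t = tanh(W_x x_t + W_h h_{t-1} + b)
W_x = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_h = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

x = rng.normal(size=(T, input_dim))  # one input sequence
h = np.zeros(hidden_dim)
for t in range(T):
    h = np.tanh(W_x @ x[t] + W_h @ h + b)  # recurrent update
```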