Portkey Integration

Portkey.ai is a full-stack LLMOps platform that monitors and logs all your Anyscale requests for latency, cost, and accuracy. It also lets you use multiple models from different providers and set up load balancing, fallbacks, and A/B tests between them.

To use Portkey with Anyscale endpoints, install the Python SDK.

!pip install portkey-ai

Store your Portkey API key in an environment variable so that later calls can use it.

import os

os.environ["PORTKEY_API_KEY"] = "YOUR_PORTKEY_API_KEY"
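
Setting the key inline is fine for a quick notebook run. In shared code it is usually safer to export the key beforehand and fail early when it is missing. A minimal sketch of that check follows; `require_env` is a hypothetical helper written for this example and is not part of the Portkey SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Calling `require_env("PORTKEY_API_KEY")` before the first request surfaces a missing key immediately instead of as a failed API call.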

Now, let's make a chat completion request using Llama-2-70b.

import portkey
from portkey import Config, LLMOptions

# Set the Anyscale API Token & model
anyscale_token = "YOUR_ANYSCALE_TOKEN"
model = "meta-llama/Llama-2-70b-chat-hf"  # Pick any Anyscale model

# Construct the Portkey Config
llm = LLMOptions(provider="anyscale", api_key=anyscale_token)
portkey.config = Config(mode="single", llms=[llm])

# Your Prompt
messages = [{"role": "user", "content": "What is the meaning of life, universe and everything?"}]

response = portkey.ChatCompletions.create(model=model, messages=messages)
print(response)

Streaming also works out of the box:

stream_response = portkey.ChatCompletions.create(model=model, messages=messages, stream=True)

for i in stream_response:
    print(i.delta, end="", flush=True)
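
If you need the complete text once streaming finishes, you can accumulate the chunks as they arrive. The sketch below assumes, as the loop above suggests, that each chunk exposes its text through a `delta` attribute. `collect_stream` and the `fake_stream` stand-in are hypothetical names used only so the example runs without a network call.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Print each chunk's text as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        # Treat a missing or empty delta as an empty string.
        text = getattr(chunk, "delta", None) or ""
        print(text, end="", flush=True)
        parts.append(text)
    return "".join(parts)


# Stand-in for a real stream: objects with a `delta` attribute (an assumption).
fake_stream = [SimpleNamespace(delta=t) for t in ("For", "ty-", "two")]
full_text = collect_stream(fake_stream)
```

With a real request, you would pass `stream_response` in place of `fake_stream`.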

All requests and responses are logged to your Portkey dashboard, which also provides comprehensive analytics.

--GIF OF PORTKEY WITH ANYSCALE--

Advanced Use Cases

Can I use Portkey via the OpenAI SDK / LangChain / LlamaIndex?

Yes, you can use Portkey with OpenAI, LangChain, and LlamaIndex through our native integrations. Choose "anyscale" as the provider and you're good to go.

I want to fall back to another model if Llama 2 fails

Portkey's AI Gateway lets you set up fallbacks between open-source and closed-source models using the fallback mode in the config object.

# Construct Portkey Config
oai_token = "YOUR_OPENAI_API_KEY"
anyscale = LLMOptions(provider="anyscale", api_key=anyscale_token)
openai = LLMOptions(provider="openai", api_key=oai_token, model="gpt-4")
portkey.config = Config(mode="fallback", llms=[anyscale, openai])

# Make the completion call as before
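
The general idea behind a fallback is to try targets in order and move on to the next one when a request fails. The pure-Python sketch below illustrates that idea only. It is not how Portkey's gateway is implemented, and every name in it (`call_with_fallback`, `failing_primary`, `working_backup`) is hypothetical.

```python
from typing import Callable, Sequence


def call_with_fallback(targets: Sequence[Callable[[], str]]) -> str:
    """Try each callable in order and return the first successful result."""
    errors = []
    for target in targets:
        try:
            return target()
        except Exception as exc:  # a real gateway would likely filter on error type
            errors.append(exc)
    raise RuntimeError(f"All {len(targets)} targets failed: {errors}")


def failing_primary() -> str:
    # Simulated failure of the first model.
    raise TimeoutError("primary model unavailable")


def working_backup() -> str:
    # Simulated success of the second model.
    return "answer from backup model"


result = call_with_fallback([failing_primary, working_backup])
print(result)
```

With Portkey the gateway handles this for you, so your application code keeps making a single completion call.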

How can I A/B test multiple models or prompts?

Use the ab_test mode with weights to run an experiment. The config below splits traffic 50-50 between the 13B and 7B Llama 2 models on Anyscale.

# Construct Portkey Config
modelA = LLMOptions(provider="anyscale", api_key=anyscale_token, model="meta-llama/Llama-2-13b-chat-hf", weight=0.5)
modelB = LLMOptions(provider="anyscale", api_key=anyscale_token, model="meta-llama/Llama-2-7b-chat-hf", weight=0.5)
portkey.config = Config(mode="ab_test", llms=[modelA, modelB])

# Make the completion call as before

Your Portkey dashboard will capture the results of both models to help you decide which one to use.
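
To build intuition for what a weighted split means, the sketch below simulates a 50-50 assignment locally. It illustrates weighted random selection in general and is an assumption about the concept, not a description of how Portkey routes traffic. `pick_variant` is a hypothetical helper.

```python
import random
from collections import Counter


def pick_variant(weights: dict, rng: random.Random) -> str:
    """Pick one variant name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


weights = {
    "meta-llama/Llama-2-13b-chat-hf": 0.5,
    "meta-llama/Llama-2-7b-chat-hf": 0.5,
}
rng = random.Random(0)  # fixed seed so the run is repeatable
counts = Counter(pick_variant(weights, rng) for _ in range(10_000))
print(counts)
```

Over many requests each model receives roughly half of the traffic, while any individual request can land on either one.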

More?

Head over to the Portkey docs to set up Virtual Keys, Semantic Caching, and Feedback collection.