Portkey.ai is a full-stack LLMOps platform that monitors and logs all Anyscale requests for latency, cost, and accuracy. It also lets you use multiple models from different providers and set up load balancing, fallbacks, and A/B tests between them.
To use Portkey with Anyscale endpoints, install the Python SDK:
!pip install portkey-ai
Set your Portkey API key as an environment variable so that later calls can use it.
import os
os.environ["PORTKEY_API_KEY"] = "YOUR_PORTKEY_API_KEY"
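Because every later call depends on this key, it can help to fail fast when it is missing. The following is a minimal sketch, assuming a hypothetical helper named `require_env` that is not part of the Portkey SDK:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank.

    Hypothetical helper for illustration; not part of the Portkey SDK.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

Calling `require_env("PORTKEY_API_KEY")` before the first request surfaces a missing key immediately instead of as an authentication error later.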
Now, let's make a completion request using Llama-2-70b.
import portkey
from portkey import Config, LLMOptions
# Set the Anyscale API Token & model
anyscale_token = "YOUR_ANYSCALE_TOKEN"
model = "meta-llama/Llama-2-70b-chat-hf" # Pick any Anyscale model
# Construct the Portkey Config
llm = LLMOptions(provider="anyscale", api_key=anyscale_token)
portkey.config = Config(mode="single", llms=[llm])
# Your Prompt
messages = [{"role": "user", "content": "What is the meaning of life, universe and everything?"}]
response = portkey.ChatCompletions.create(model=model, messages=messages)
print(response)
Streaming also works out of the box:
stream_response = portkey.ChatCompletions.create(model=model, messages=messages, stream=True)
# Print each chunk's text as it arrives
for chunk in stream_response:
    print(chunk.delta, end="", flush=True)
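If you also want the full text once the stream ends, the chunks can be joined as they arrive. This is a rough sketch, assuming each chunk exposes its text as a plain string on a `delta` attribute (as the loop above suggests). The `collect_stream` helper and the stand-in chunks are hypothetical and only illustrate the pattern:

```python
from types import SimpleNamespace
from typing import Any, Iterable


def collect_stream(chunks: Iterable[Any]) -> str:
    """Join the text deltas from a stream of chunks into one string."""
    parts = []
    for chunk in chunks:
        delta = getattr(chunk, "delta", None)
        if isinstance(delta, str):  # skip chunks that carry no text
            parts.append(delta)
    return "".join(parts)


# Stand-in chunks for illustration; a real stream would come from the SDK call.
fake_stream = [
    SimpleNamespace(delta="Forty"),
    SimpleNamespace(delta="-two"),
    SimpleNamespace(delta=None),
]
print(collect_stream(fake_stream))
```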
All requests and responses are logged to your Portkey dashboard, which provides comprehensive analytics.
--GIF OF PORTKEY WITH ANYSCALE--
Portkey also works with OpenAI, LangChain, and LlamaIndex through native integrations. Choose "anyscale" as the provider and you are good to go.
Portkey's AI Gateway lets you set up fallbacks between open- and closed-source models using the fallback mode in the config object.
# Set your OpenAI API key for the fallback model
oai_token = "YOUR_OPENAI_API_KEY"
# Construct Portkey Config
anyscale = LLMOptions(provider="anyscale", api_key=anyscale_token)
openai = LLMOptions(provider="openai", api_key=oai_token, model="gpt-4")
portkey.config = Config(mode="fallback", llms=[anyscale, openai])
# Make the completion call as before
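Conceptually, a fallback tries each configured model in order and returns the first successful response. The sketch below shows that idea in plain Python with stand-in provider functions. It is not Portkey's implementation, and every name in it is hypothetical:

```python
from typing import Callable, List, Sequence


def call_with_fallback(providers: Sequence[Callable[[str], str]], prompt: str) -> str:
    """Try each provider in order and return the first successful response."""
    errors: List[Exception] = []
    for provider in providers:
        try:
            return provider(prompt)
        except Exception as exc:  # remember the failure, then try the next provider
            errors.append(exc)
    raise RuntimeError(f"All {len(providers)} providers failed: {errors}")


def flaky_primary(prompt: str) -> str:
    # Stand-in for a primary model that is currently unavailable
    raise TimeoutError("primary unavailable")


def backup(prompt: str) -> str:
    # Stand-in for the secondary model
    return f"backup answer to: {prompt}"


print(call_with_fallback([flaky_primary, backup], "hello"))
```

The order of the list matters: the first entry is the primary model, and later entries are used only when the earlier ones fail.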
You can use the ab_test mode with weights to run an experiment. Here is how to split traffic 50-50 between the 13B and 7B Llama 2 models on Anyscale.
# Construct Portkey Config
modelA = LLMOptions(provider="anyscale", api_key=anyscale_token, model="meta-llama/Llama-2-13b-chat-hf", weight=0.5)
modelB = LLMOptions(provider="anyscale", api_key=anyscale_token, model="meta-llama/Llama-2-7b-chat-hf", weight=0.5)
portkey.config = Config(mode="ab_test", llms=[modelA, modelB])
# Make the completion call as before
Your Portkey dashboard will capture the results of both models to help you decide which one to use.
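To build intuition for what the weights mean, here is a small simulation of weighted routing using Python's `random.choices`. It is only an illustration of a 50-50 split, not how Portkey routes requests internally, and the `pick_model` helper is hypothetical:

```python
import random
from collections import Counter
from typing import Mapping


def pick_model(weights: Mapping[str, float], rng: random.Random) -> str:
    """Pick one model name at random, in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


weights = {
    "meta-llama/Llama-2-13b-chat-hf": 0.5,
    "meta-llama/Llama-2-7b-chat-hf": 0.5,
}

rng = random.Random(0)  # fixed seed so the simulation is repeatable
counts = Counter(pick_model(weights, rng) for _ in range(10_000))
for name, count in counts.items():
    print(f"{name}: {count / 10_000:.1%} of simulated requests")
```

Over many requests, each model receives roughly half of the traffic. Changing the weights to 0.8 and 0.2 would shift the split accordingly.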
Head over to the Portkey docs to set up Virtual Keys, Semantic Caching, and Feedback collection.