@bobmayuze
Last active July 23, 2024 22:46
Llama 3.1 405B hosted by Lepton AI for the Emergency Llama 3.1 Hacknight
import os

import openai

# Create a client pointed at Lepton AI's hosted Llama 3.1 405B endpoint.
client = openai.OpenAI(
    base_url="https://llama3-405b-hackathon.lepton.run/api/v1/",
    # Get your token from lepton.ai under Settings -> Tokens -> Workspace Token,
    # then export it as LEPTON_API_TOKEN (or paste it in place of the fallback).
    api_key=os.environ.get("LEPTON_API_TOKEN", "YOUR_LEPTONAI_TOKEN"),
)

completion = client.chat.completions.create(
    model="llama3-405b",
    messages=[
        {"role": "user", "content": "Which one is bigger? 3.9 or 3.11"},
    ],
    max_tokens=128,
    stream=True,
)

# Print the response as it streams in, chunk by chunk.
for chunk in completion:
    if not chunk.choices:
        continue
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="")
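The streaming loop above relies only on each chunk exposing `choices[0].delta.content`. A minimal offline sketch of the same accumulation pattern, using `SimpleNamespace` stand-ins for the real chunk objects (purely illustrative, no network call):

```python
from types import SimpleNamespace

def fake_chunk(text):
    """Mimic the shape of a streamed chat-completion chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

# Some chunks carry no choices or an empty delta; the loop must skip both.
stream = [
    fake_chunk("3.11 is "),
    SimpleNamespace(choices=[]),   # keep-alive chunk with no choices
    fake_chunk("bigger as a version number."),
    fake_chunk(None),              # delta with no content
]

parts = []
for chunk in stream:
    if not chunk.choices:
        continue
    content = chunk.choices[0].delta.content
    if content:
        parts.append(content)

answer = "".join(parts)
print(answer)  # -> 3.11 is bigger as a version number.
```

Collecting the pieces into a list and joining at the end is handy when you need the full reply after streaming finishes (e.g. to log it).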
# Get your token from lepton.ai under Settings -> Tokens -> Workspace Token
export LEPTON_API_TOKEN=YOUR_LEPTONAI_TOKEN

curl https://llama3-405b-hackathon.lepton.run/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LEPTON_API_TOKEN" \
  -d '{
    "model": "llama3-405b",
    "messages": [{"role": "user", "content": "Which one is bigger? 3.9 or 3.11"}],
    "temperature": 0.7
  }'
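The curl call posts a plain JSON body, so any HTTP client works. A small sketch of assembling that same payload with only the Python standard library (the field values mirror the request above; no network call is made):

```python
import json

# The same request body curl sends with -d.
payload = {
    "model": "llama3-405b",
    "messages": [
        {"role": "user", "content": "Which one is bigger? 3.9 or 3.11"}
    ],
    "temperature": 0.7,
}

body = json.dumps(payload)
print(body)
```

Pass `body` as the request data with a `Content-Type: application/json` header and a `Authorization: Bearer $LEPTON_API_TOKEN` header, exactly as in the curl flags above.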