@timroster
Last active August 20, 2024 23:42
Getting started with watsonx.ai lightweight engine

The watsonx.ai lightweight engine allows organizations to deploy an optimized stack for inference and embedding only, using IBM foundation models and third-party custom foundation models.

This platform includes the HAP (hate, abuse, profanity) and personally identifiable information filtering (AI guardrails) capabilities of watsonx.ai.

The lightweight engine does not have a UI like the full version of watsonx.ai, which includes the Prompt Lab, so it can be difficult to identify how to get started after the initial product installation completes.

This guide is intended to ease that transition.

Lightweight Engine is only the API

Since the lightweight engine is only an API, the first question is how to obtain an API key. The final step of the installation provides the default credentials and URL.

The default user API key can be created with the following steps. The output of the cpd-cli manage get instance details command will look like:

CPD Url: cpd-cpdproject.apps.clustername.cluster.domain
CPD Username: cpadmin 
CPD Password: <random-character-password-with-32-characters>

Use these values to set up environment variables:

export CPDUSER=cpadmin
export CPDUSER_PASSWORD=<random-character-password-with-32-characters>
export CPD_URL=cpd-cpdproject.apps.clustername.cluster.domain

Use curl with these environment variables (or your favorite API tool) to exchange the credentials for a bearer token. Note: if you do not have jq installed, adjust the curl command to print the raw output and extract the token value from the JSON object manually.

TOKEN=$(curl --request POST \
  --url https://${CPD_URL}/icp4d-api/v1/authorize \
  --header 'Content-Type: application/json' \
  --data "{
  \"username\":\"${CPDUSER}\",
  \"password\": \"${CPDUSER_PASSWORD}\"
}" | jq -r '.token')
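If jq is not available, the token can also be pulled out of the response with standard text tools. A minimal sketch using sed, shown here against a hypothetical response body (it assumes the token value contains no escaped double quotes):

```shell
# Hypothetical response body; in practice this comes from the authorize endpoint
RESPONSE='{"_messageCode_":"200","message":"Success","token":"eyJhbGciOiJSUzI1NiJ9.example"}'

# Extract the value of the "token" field without jq
# (assumes the token itself contains no double quotes)
TOKEN=$(printf '%s' "$RESPONSE" | sed -E 's/.*"token" *: *"([^"]*)".*/\1/')
echo "$TOKEN"
```

This is only a fallback; jq handles escaping and nested structures correctly and is the safer choice when available.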

The bearer token can be used to create an API key with:

APIKEY=$(curl -s -X GET \
    --url https://${CPD_URL}/usermgmt/v1/user/apiKey \
    --header "Authorization: Bearer ${TOKEN}" \
    | jq -r '.apiKey' )

When the usermgmt/v1/user/apiKey endpoint is called, a new API key is created and the previous key (if one exists) is expired. This can also be used as a technique to rotate the API key as needed for security reasons.
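The rotation described above can be wrapped in a small helper for convenience. This is only a sketch, not part of the product tooling; it assumes CPD_URL and a valid TOKEN are already set as in the earlier steps:

```shell
# Sketch of an API key rotation helper (assumes CPD_URL and TOKEN are set).
# Each call issues a fresh key and expires the previous one.
rotate_apikey() {
  curl -s -X GET \
    --url "https://${CPD_URL}/usermgmt/v1/user/apiKey" \
    --header "Authorization: Bearer ${TOKEN}" \
    | jq -r '.apiKey'
}

# Example usage: APIKEY=$(rotate_apikey)
```

Because rotation expires the old key, any other clients holding the previous key will need the new value distributed to them.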

Using the API Key

The API key is a long-lived credential that can be used with the Foundation Model REST API to list foundation models, perform text inference with a model, or generate embeddings.

When using the REST API, the first step is to generate a bearer token. This is very similar to the previous example, except that it uses the API key:

TOKEN=$(curl --request POST \
  --url https://${CPD_URL}/icp4d-api/v1/authorize \
  --header 'Content-Type: application/json' \
  --data "{
  \"username\":\"${CPDUSER}\",
  \"api_key\": \"${APIKEY}\"
}" | jq -r '.token')
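With a bearer token in hand, a quick way to see which models are deployed is the foundation model specs endpoint. A sketch follows; the jq path assumes models are listed under a resources array, and the curl is only issued when CPD_URL is set:

```shell
# List the foundation models available on this instance
# (skipped when CPD_URL is not set, e.g. when trying this snippet offline)
if [ -n "${CPD_URL:-}" ]; then
  curl -s --request GET \
    --url "https://${CPD_URL}/ml/v1/foundation_model_specs?version=2023-05-29" \
    --header "Accept: application/json" \
    --header "Authorization: Bearer ${TOKEN}" \
    | jq -r '.resources[].model_id'
fi
```

The model_id values returned here are what you pass in the inference and embeddings requests below.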

Then, when using the text inference or generate embeddings endpoints, omit the project_id that is shown in the API documentation. Here is an example that uses the ibm/granite-13b-chat-v2 foundation model:

curl --request POST \
  --url "https://${CPD_URL}/ml/v1/text/generation?version=2023-05-29" \
  --header "Accept: application/json" \
  --header "Authorization: Bearer ${TOKEN}" \
  --header "Content-Type: application/json" \
  --data '{
	"input": "What does the acronym IBM mean?\n\n",
	"parameters": {
		"decoding_method": "greedy",
		"max_new_tokens": 200,
		"min_new_tokens": 0,
		"stop_sequences": [
			"\n\n"
		],
		"repetition_penalty": 1
	},
	"model_id": "ibm/granite-13b-chat-v2"
}'
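Generating embeddings follows the same pattern, again omitting the project_id. A sketch against the embeddings endpoint; the model_id shown (ibm/slate-125m-english-rtrvr) is an assumption and should be replaced with an embedding model actually installed on the instance:

```shell
# Request payload for the embeddings endpoint; the model_id is a placeholder,
# substitute an embedding model installed on your instance
EMBED_PAYLOAD='{
  "inputs": ["What does the acronym IBM mean?"],
  "model_id": "ibm/slate-125m-english-rtrvr"
}'

# Issue the request only when CPD_URL is configured
if [ -n "${CPD_URL:-}" ]; then
  curl --request POST \
    --url "https://${CPD_URL}/ml/v1/text/embeddings?version=2023-10-25" \
    --header "Accept: application/json" \
    --header "Authorization: Bearer ${TOKEN}" \
    --header "Content-Type: application/json" \
    --data "$EMBED_PAYLOAD"
fi
```

The response contains one embedding vector per entry in the inputs array.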

Additionally, with the API key it may also be possible to use the watsonx.ai Python SDK. However, it is important to note that the lightweight engine supports only a subset of the SDK's functionality; specifically, limit use to foundation model inference and embedding generation.
