The watsonx.ai lightweight engine allows organizations to deploy an optimized stack for inference and embedding only, using IBM and third-party custom foundation models.
This platform does include the HAP (hate, abuse, profanity) and personally identifiable information (PII) filtering (AI guardrails) capabilities of watsonx.ai.
Unlike the full version of watsonx.ai, which includes the Prompt Lab, the lightweight engine does not have a UI, so it can be difficult to identify how to get started after the initial product installation completes.
This guide is intended to ease that transition.
Because the lightweight engine is API-only, the first question is how to obtain an API key. The final step of the installation provides the default credentials and URL.
The default user API key can be created with the following steps. The output of the cpd-cli manage get instance details command will look like:
CPD Url: cpd-cpdproject.apps.clustername.cluster.domain
CPD Username: cpadmin
CPD Password: <random-character-password-with-32-characters>
Use these values to set up environment variables:
export CPDUSER=cpadmin
export CPDUSER_PASSWORD=<random-character-password-with-32-characters>
export CPD_URL=cpd-cpdproject.apps.clustername.cluster.domain
Use curl (or your favorite API tool) with these environment variables to create a bearer token. Note: if you do not have jq installed, adjust the curl command to print the full response and extract the token value from the JSON object.
TOKEN=$(curl --request POST \
--url https://${CPD_URL}/icp4d-api/v1/authorize \
--header 'Content-Type: application/json' \
--data "{
\"username\":\"${CPDUSER}\",
\"password\": \"${CPDUSER_PASSWORD}\"
}" | jq -r '.token')
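For systems without jq, the token can be pulled out of the response with standard tools. The following is a minimal sketch, not a general JSON parser: json_string_field and get_token are hypothetical helper names, and the sed expression assumes the response is a flat JSON object whose token value contains no escaped double quotes.

```shell
# Hypothetical helper: read JSON on stdin and print the value of one
# top-level string field. Assumes key and value sit on the same line
# and the value contains no escaped double quotes.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical wrapper around the authorize call shown above.
# Relies on the CPD_URL, CPDUSER and CPDUSER_PASSWORD variables set earlier.
get_token() {
  curl --silent --request POST \
    --url "https://${CPD_URL}/icp4d-api/v1/authorize" \
    --header 'Content-Type: application/json' \
    --data "{\"username\":\"${CPDUSER}\",\"password\":\"${CPDUSER_PASSWORD}\"}" \
  | json_string_field token
}
```

With these functions defined, TOKEN=$(get_token) would take the place of the jq-based command.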
The bearer token can be used to create an API key with:
APIKEY=$(curl -s -X GET \
--url "https://${CPD_URL}/usermgmt/v1/user/apiKey" \
--header "Authorization: Bearer ${TOKEN}" \
| jq -r '.apiKey' )
When the usermgmt/v1/user/apiKey endpoint is called, a new API key is created and the previous key (if one exists) is expired. The same call can therefore be used to rotate the API key as needed for security reasons.
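Because each call replaces the previous key, it can help to confirm that a key actually came back before relying on it. The check below is a small sketch; require_value is a hypothetical helper name, and the "null" comparison covers the case where jq -r prints the literal string null because the requested field was missing from the response.

```shell
# Hypothetical helper: fail fast when a captured credential is unusable.
# $1 = label used in the error message, $2 = value to check.
require_value() {
  if [ -z "$2" ] || [ "$2" = "null" ]; then
    echo "error: $1 is empty or null; inspect the raw curl response" >&2
    return 1
  fi
}
```

For example, require_value APIKEY "${APIKEY}" returns a non-zero status and prints a message when the key request did not succeed.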
The API key is a long-lived credential that can be used with the Foundation Model REST API to list foundation models, perform text inference with a model, or generate embeddings.
When using the REST API, the first step is to generate a bearer token. This is very similar to the previous example, except that it uses the API key instead of the password:
TOKEN=$(curl --request POST \
--url https://${CPD_URL}/icp4d-api/v1/authorize \
--header 'Content-Type: application/json' \
--data "{
\"username\":\"${CPDUSER}\",
\"api_key\": \"${APIKEY}\"
}" | jq -r '.token')
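With a bearer token available, listing the foundation models is a quick way to confirm that the credentials work. This is a sketch under assumptions: it presumes the lightweight engine serves the same /ml/v1/foundation_model_specs path as the public watsonx.ai REST API and accepts the version date used elsewhere in this guide. wx_url, list_models and WX_API_VERSION are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: build a versioned REST URL from a path.
# WX_API_VERSION is an optional override; the default matches this guide.
wx_url() {
  printf 'https://%s%s?version=%s' "${CPD_URL}" "$1" "${WX_API_VERSION:-2023-05-29}"
}

# Hypothetical wrapper: request the list of foundation model specs.
# Relies on the CPD_URL and TOKEN variables set earlier.
list_models() {
  curl --silent --request GET \
    --url "$(wx_url /ml/v1/foundation_model_specs)" \
    --header "Accept: application/json" \
    --header "Authorization: Bearer ${TOKEN}"
}
```

If the response follows the public API's shape, list_models | jq -r '.resources[].model_id' should print one model ID per line.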
Then, when using the text inference or generate embeddings endpoints, omit the project_id that is shown in the API documentation. Here is an example that uses the ibm/granite-13b-chat-v2 foundation model:
curl --request POST \
--url "https://${CPD_URL}/ml/v1/text/generation?version=2023-05-29" \
--header "Accept: application/json" \
--header "Authorization: Bearer ${TOKEN}" \
--header "Content-Type: application/json" \
--data '{
"input": "What does the acronym IBM mean?\n\n",
"parameters": {
"decoding_method": "greedy",
"max_new_tokens": 200,
"min_new_tokens": 0,
"stop_sequences": [
"\n\n"
],
"repetition_penalty": 1
},
"model_id": "ibm/granite-13b-chat-v2"
}'
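Embedding generation follows the same pattern, again without a project_id. The sketch below assumes the /ml/v1/text/embeddings path from the public watsonx.ai REST API; embed_payload and generate_embeddings are hypothetical helper names, and the model ID in the usage note is only a placeholder for whichever embedding model is deployed in your installation.

```shell
# Hypothetical helper: build the request body for a single input string.
# $1 = embedding model ID, $2 = text to embed.
# This sketch does no JSON escaping, so the text must not contain
# double quotes or backslashes.
embed_payload() {
  printf '{"model_id": "%s", "inputs": ["%s"]}' "$1" "$2"
}

# Hypothetical wrapper: POST the payload to the embeddings endpoint.
# Relies on the CPD_URL and TOKEN variables set earlier.
generate_embeddings() {
  curl --silent --request POST \
    --url "https://${CPD_URL}/ml/v1/text/embeddings?version=2023-05-29" \
    --header "Accept: application/json" \
    --header "Authorization: Bearer ${TOKEN}" \
    --header "Content-Type: application/json" \
    --data "$(embed_payload "$1" "$2")"
}
```

A call such as generate_embeddings "ibm/slate-125m-english-rtrvr" "What does the acronym IBM mean?" should return a JSON object; in the public API the vector is found under results[0].embedding.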
Additionally, with the API key, it may also be possible to use the watsonx.ai Python SDK. However, the lightweight engine supports only a subset of the SDK's functionality: limit use to foundation model inference and embedding generation.