
Deploy and Inference DeepSeek-R1 Distilled Models with IBM watsonx.ai

Introduction:


IBM watsonx.ai is an enterprise-grade studio for building generative AI solutions and deploying them into the applications of your choice. Our priority is to provide users with a set of open, trusted, and performant models to power their generative AI applications.


The power of open-source:


Open-source is a key driver of innovation in the AI community. By making high-quality models like DeepSeek-R1 available, we can accelerate the development of AI applications and foster a culture of collaboration and knowledge-sharing. The release of DeepSeek-R1, an open-source reasoning model on par with OpenAI's o1 series of models, is a significant step in this direction. We hope that it will inspire other model providers to follow suit and contribute to the growth of the open-source AI ecosystem.


On watsonx.ai, you can use our Custom Foundation Models feature to deploy distilled variants of DeepSeek-R1 based on the Llama or Qwen architectures, such as DeepSeek-R1-Distill-Llama-8B or DeepSeek-R1-Distill-Qwen-32B.



Getting started with DeepSeek on watsonx.ai


To deploy a distilled variant of DeepSeek-R1 based on the Llama or Qwen architecture, follow these steps:


Step 1: Prepare your model:


Make sure you have the files required to bring the model into IBM Cloud Object Storage. The two main requirements are:


  1. The file list for the model must contain a config.json file.

  2. The model must be saved in the safetensors format, built with the supported transformers library, and include a tokenizer.json file (a minimal upload sketch follows this list).
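
If you are staging the files yourself, the sketch below shows one way to pull a distilled variant from Hugging Face and copy it into IBM Cloud Object Storage. This is a minimal sketch, assuming the huggingface_hub and ibm-cos-sdk packages; the model ID, bucket name, endpoint URL, and environment variable names are placeholders to swap for your own values.

import os

import ibm_boto3
from huggingface_hub import snapshot_download
from ibm_botocore.client import Config

# Download the model snapshot locally; it includes config.json,
# the *.safetensors shards, and tokenizer.json.
local_dir = snapshot_download("deepseek-ai/DeepSeek-R1-Distill-Llama-8B")

# Connect to IBM Cloud Object Storage; credentials come from
# environment variables here purely for illustration.
cos = ibm_boto3.client(
    "s3",
    ibm_api_key_id=os.environ["COS_API_KEY"],
    ibm_service_instance_id=os.environ["COS_INSTANCE_CRN"],
    config=Config(signature_version="oauth"),
    endpoint_url="https://s3.us-south.cloud-object-storage.appdomain.cloud",
)

# Upload each file, preserving the snapshot's relative layout.
for root, _, files in os.walk(local_dir):
    for name in files:
        path = os.path.join(root, name)
        key = os.path.relpath(path, local_dir)
        cos.upload_file(path, "my-models-bucket", f"deepseek-r1-distill-llama-8b/{key}")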


Step 2: Import and deploy your model:


  1. In your deployment space or project, go to the Assets tab.

  2. Find your model in the asset list, click the Menu icon, and select Deploy.

  3. Enter a name for your deployment and optionally enter a serving name, description, and tags.

  4. Select a configuration and a software specification for your model.
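
The same import-and-deploy flow can also be scripted. The snippet below is a rough sketch using the ibm-watsonx-ai Python SDK, assuming the model has already been imported as an asset in your deployment space; the URL, API key, space ID, asset ID, and hardware specification name are placeholders.

from ibm_watsonx_ai import APIClient, Credentials

# Placeholder credentials and IDs; substitute your own values.
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="<your API key>")
client = APIClient(credentials, space_id="<your space ID>")

# Deploy the previously imported custom foundation model asset online.
meta_props = {
    client.deployments.ConfigurationMetaNames.NAME: "deepseek-r1-distill-llama-8b",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "<hardware spec>"},
}
deployment = client.deployments.create("<your model asset ID>", meta_props)
print(deployment["metadata"]["id"])  # the deployment ID used for inference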


Step 3: Start prompting:


Use the watsonx.ai API, Python client SDK, or the UI to prompt your deployed model.

curl -X POST "https://<your cloud hostname>/ml/v1/deployments/<your deployment ID>/text/generation?version=2024-01-29" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data '{
    "input": "Hello, what is your name",
    "parameters": {
      "max_new_tokens": 200,
      "min_new_tokens": 20
    }
  }'

Note: Replace the bearer token, cloud hostname, and deployment ID with the credentials and values for your account.
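
If you prefer the Python client SDK mentioned above, a roughly equivalent call looks like the sketch below; the URL, API key, space ID, and deployment ID are placeholders for your own values.

from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

# Placeholder credentials; substitute your own values.
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="<your API key>")
client = APIClient(credentials, space_id="<your space ID>")

# Point at the deployment created in Step 2 and generate text.
model = ModelInference(deployment_id="<your deployment ID>", api_client=client)
print(model.generate_text(
    prompt="Hello, what is your name",
    params={"max_new_tokens": 200, "min_new_tokens": 20},
))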


Conclusion:


Using the guide above, you can quickly get started deploying distilled variants of DeepSeek-R1 for inference in a secure manner, both on SaaS and on-premises software. For a more detailed walkthrough of the deployment process using the Custom Foundation Models feature on watsonx.ai, please refer to our documentation.



Source: IBM Community

