
Deploy and Inference DeepSeek-R1 Distilled Models with IBM watsonx.ai

Introduction:


IBM watsonx.ai is an enterprise-grade studio for building generative AI solutions and deploying them into the applications of your choice. Our priority is to provide users with a set of open, trusted, and performant models to power their generative AI applications.


The power of open-source:


Open-source is a key driver of innovation in the AI community. By making high-quality models like DeepSeek-R1 available, we can accelerate the development of AI applications and foster a culture of collaboration and knowledge-sharing. The release of DeepSeek-R1, an open-source reasoning model on par with OpenAI's o1 series of models, is a significant step in this direction. We hope that it will inspire other model providers to follow suit and contribute to the growth of the open-source AI ecosystem.


On watsonx.ai, you can use our Custom Foundation Models feature to deploy distilled variants of DeepSeek-R1 based on the Llama or Qwen architectures, such as DeepSeek-R1-Distill-Llama-8B or DeepSeek-R1-Distill-Qwen-32B.



Getting started with DeepSeek on watsonx.ai


To deploy a distilled variant of DeepSeek-R1 based on the Llama or Qwen architecture, follow these steps:


Step 1: Prepare your model:


Make sure you have the files required to bring the model into IBM Cloud Object Storage. The two main requirements are:


  1. The file list for the model must contain a config.json file.

  2. The model must be saved in the safetensors format, built with the supported transformers library, and include a tokenizer.json file (a minimal upload sketch follows this list).
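
If you are staging the files yourself, the sketch below shows one way to pull a distilled variant from Hugging Face and copy it into IBM Cloud Object Storage. This is a minimal sketch, assuming the huggingface_hub and ibm-cos-sdk packages; the model ID, bucket name, endpoint URL, and environment variable names are placeholders to swap for your own values.

import os

import ibm_boto3
from huggingface_hub import snapshot_download
from ibm_botocore.client import Config

# Download the model snapshot locally; it includes config.json,
# the *.safetensors shards, and tokenizer.json.
local_dir = snapshot_download("deepseek-ai/DeepSeek-R1-Distill-Llama-8B")

# Connect to IBM Cloud Object Storage; credentials come from
# environment variables here purely for illustration.
cos = ibm_boto3.client(
    "s3",
    ibm_api_key_id=os.environ["COS_API_KEY"],
    ibm_service_instance_id=os.environ["COS_INSTANCE_CRN"],
    config=Config(signature_version="oauth"),
    endpoint_url="https://s3.us-south.cloud-object-storage.appdomain.cloud",
)

# Upload each file, preserving the snapshot's relative layout.
for root, _, files in os.walk(local_dir):
    for name in files:
        path = os.path.join(root, name)
        key = os.path.relpath(path, local_dir)
        cos.upload_file(path, "my-models-bucket", f"deepseek-r1-distill-llama-8b/{key}")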


Step 2: Import and deploy your model:


  1. In your deployment space or project, go to the Assets tab.

  2. Find your model in the asset list, click the Menu icon, and select Deploy.

  3. Enter a name for your deployment and optionally enter a serving name, description, and tags.

  4. Select a configuration and a software specification for your model.
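
The same import-and-deploy flow can also be scripted. The snippet below is a rough sketch using the ibm-watsonx-ai Python SDK, assuming the model has already been imported as an asset in your deployment space; the URL, API key, space ID, asset ID, and hardware specification name are placeholders.

from ibm_watsonx_ai import APIClient, Credentials

# Placeholder credentials and IDs; substitute your own values.
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="<your API key>")
client = APIClient(credentials, space_id="<your space ID>")

# Deploy the previously imported custom foundation model asset online.
meta_props = {
    client.deployments.ConfigurationMetaNames.NAME: "deepseek-r1-distill-llama-8b",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "<hardware spec>"},
}
deployment = client.deployments.create("<your model asset ID>", meta_props)
print(deployment["metadata"]["id"])  # the deployment ID used for inference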


Step 3: Start prompting:


Use the watsonx.ai API, Python client SDK, or the UI to prompt your deployed model.

curl -X POST "https://<your cloud hostname>/ml/v1/deployments/<your deployment ID>/text/generation?version=2024-01-29" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  --data '{
    "input": "Hello, what is your name",
    "parameters": {
      "max_new_tokens": 200,
      "min_new_tokens": 20
    }
  }'

Note: Replace the bearer token, cloud hostname, and deployment ID with the credentials and values for your account.
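
If you prefer the Python client SDK mentioned above, a roughly equivalent call looks like the sketch below; the URL, API key, space ID, and deployment ID are placeholders for your own values.

from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

# Placeholder credentials; substitute your own values.
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="<your API key>")
client = APIClient(credentials, space_id="<your space ID>")

# Point at the deployment created in Step 2 and generate text.
model = ModelInference(deployment_id="<your deployment ID>", api_client=client)
print(model.generate_text(
    prompt="Hello, what is your name",
    params={"max_new_tokens": 200, "min_new_tokens": 20},
))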


Conclusion:


Using the guide above, you can quickly get started deploying distilled variants of DeepSeek-R1 for inference in a secure manner, both on SaaS and on-premises software. For a more detailed walkthrough of the deployment process using the Custom Foundation Models feature on watsonx.ai, please refer to our documentation.



Source: IBM Community

