Integrating DeepSeek with Spring Boot


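Introduction

DeepSeek offers strong reasoning capability and flexible model endpoints, which makes it a common choice for adding an AI assistant to enterprise applications. Using a real project as the example, this article integrates the DeepSeek API into Spring Boot and covers blocking and streaming chat, multi-turn context, structured JSON output, and calls to the reasoning model.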

Requesting and testing an API key

Top up your balance on the DeepSeek open platform: https://platform.deepseek.com/
Create a new API key on the API keys page: https://platform.deepseek.com/api_keys

Once the key has been created, check that it works:

curl -X POST "https://api.deepseek.com/chat/completions" \
  -H "Authorization: Bearer {your API key}" \
  -H "Content-Type: application/json" \
  --data-raw "{\"messages\":[{\"content\":\"Hello\",\"role\":\"user\"}],\"model\":\"deepseek-reasoner\",\"stream\":false}"

If the response is a JSON document similar to the following, the API key is valid:

{
    "id": "1f3e100c-8cbc-4185-b905-5e60285c0543",
    "object": "chat.completion",
    "created": 1751619898,
    "model": "deepseek-reasoner",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! xxxxxxxxxxxxxxxxxxxx"
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 6,
        "completion_tokens": 208,
        "total_tokens": 214,
        "prompt_tokens_details": {
            "cached_tokens": 0
        },
        "completion_tokens_details": {
            "reasoning_tokens": 163
        },
        "prompt_cache_hit_tokens": 0,
        "prompt_cache_miss_tokens": 6
    },
    "system_fingerprint": "fp_393bca965e_prod0623_fp8_kvcache"
}

Official DeepSeek API documentation: https://api-docs.deepseek.com/zh-cn/
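The same check can be written in plain Java. The following is a minimal sketch using the JDK's built-in java.net.http client; the class name ApiKeyCheckRequest and the environment variable DEEPSEEK_API_KEY are assumptions made for illustration. It only builds the request, so it can be inspected without spending any balance.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the same POST as the curl command above.
class ApiKeyCheckRequest {

    static HttpRequest build(String apiKey) {
        String body = "{\"messages\":[{\"content\":\"Hello\",\"role\":\"user\"}],"
                + "\"model\":\"deepseek-reasoner\",\"stream\":false}";
        return HttpRequest.newBuilder()
                .uri(URI.create("https://api.deepseek.com/chat/completions"))
                .header("Authorization", "Bearer " + apiKey)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        // DEEPSEEK_API_KEY is an assumed variable name; a placeholder is used when it is unset.
        String key = System.getenv().getOrDefault("DEEPSEEK_API_KEY", "sk-placeholder");
        HttpRequest request = build(key);
        System.out.println(request.method() + " " + request.uri());
        // To actually send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

A 200 response with a choices array means the key works; a 401 points at the key itself.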

Basic parameters

Application configuration

Add the commonly used settings to the Spring Boot application:

application.yaml

deepseek:
  auth-prefix: Bearer
  api-key: ${DEEPSEEK_API_KEY}  # resolved from an environment variable; never commit a real key
  base-url: https://api.deepseek.com
  chat-path: /chat/completions
  models-path: /models
  balance-path: /user/balance

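The corresponding configuration class:

import lombok.Data;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Configuration;

@Data
@Configuration
@ConfigurationProperties(prefix = "deepseek")
public class DeepSeekProperties {
    // Authentication scheme, normally "Bearer" (the separating space is added when the header is built)
    private String authPrefix;
    // API key used for authentication
    private String apiKey;
    // Base URL of the DeepSeek API
    private String baseUrl;
    // Chat path, normally "/chat/completions"
    private String chatPath;
    // Model list path, normally "/models"
    private String modelsPath;
    // Balance query path, normally "/user/balance"
    private String balancePath;
}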
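Every call later combines these properties in the same two ways: the base URL is joined with a path, and the auth prefix is joined with the key. A small sketch of that logic in plain Java follows; DeepSeekEndpoints is a hypothetical helper, not part of the project, and it tolerates a stray trailing or leading slash in the configured values.

```java
// Hypothetical helper showing how the configured values combine.
class DeepSeekEndpoints {
    private final String baseUrl;
    private final String authPrefix;
    private final String apiKey;

    DeepSeekEndpoints(String baseUrl, String authPrefix, String apiKey) {
        this.baseUrl = baseUrl;
        this.authPrefix = authPrefix;
        this.apiKey = apiKey;
    }

    // Joins the base URL and a path with exactly one slash between them.
    String url(String path) {
        String base = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
        String tail = path.startsWith("/") ? path : "/" + path;
        return base + tail;
    }

    // Builds the Authorization header value, e.g. "Bearer sk-...".
    String authorizationHeader() {
        return authPrefix.trim() + " " + apiKey.trim();
    }

    public static void main(String[] args) {
        DeepSeekEndpoints e = new DeepSeekEndpoints("https://api.deepseek.com", "Bearer", "sk-placeholder");
        System.out.println(e.url("/chat/completions"));
        System.out.println(e.authorizationHeader());
    }
}
```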

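Models

At the time of writing, DeepSeek exposes two models:

deepseek-chat: the chat model, backed by DeepSeek-V3-0324; 64K context length; output length 4K by default, 8K maximum.
deepseek-reasoner: the reasoning model, backed by DeepSeek-R1-0528; 64K context length; output length 32K by default, 64K maximum (including the chain of thought).

For model details, see https://api-docs.deepseek.com/zh-cn/quick_start/pricing.

Define the model enum as follows: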

@Getter
@AllArgsConstructor
public enum ModelEnum {

    CHAT("deepseek-chat", "Chat model"),
    REASONER("deepseek-reasoner", "Reasoning model");

    private final String code;
    private final String desc;

}

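Roles

system
- System role: sets the initial instructions.
- Defines the behavior or persona the model should adopt.
- Usually appears once, at the very start of messages.
- The model treats it as the ground rules for the whole conversation.
Example:
{ "role": "system", "content": "You are an experienced Java backend developer. Please answer in Chinese." }
user
- User role: the person asking.
- Carries the input typed by the human user.
- In a multi-turn conversation, every question uses this role.
Example:
{ "role": "user", "content": "Give me sample code for a paginated MyBatis query." }
assistant
- Assistant role: the reply generated by the model.
- Carries the model's answer.
- In a multi-turn conversation, every reply uses this role.
- Replies generated through the completion endpoint (including streaming output) are also assigned this role.
Example:
{ "role": "assistant", "content": "Here is an example of a paginated MyBatis query..." }
tool
- Tool role: the result of a function call.
- Carries the return value of a tool call, used together with function/tool calling.
- tool_call_id specifies which function call the result belongs to.
- Only used with models that support function calling (this article has not confirmed whether deepseek-chat supports it).
Example:
{ "role": "tool", "tool_call_id": "call_123", "content": "{ \"weather\": \"sunny\", \"temperature\": 28 }" }

Define the role enum as follows: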

@Getter
@AllArgsConstructor
public enum RoleEnum {

    SYSTEM("system", "System"),
    USER("user", "User"),
    ASSISTANT("assistant", "Assistant"),
    TOOL("tool", "Tool");

    private final String code;
    private final String desc;

}

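Temperature

Temperature takes values in [0, 2.0] and defaults to 1.0; higher values make the output more random.

Recommended values:

Scenario	Temperature
Code generation / math problems	0.0
Data extraction / analysis	1.0
General conversation	1.3
Translation	1.3
Creative writing / poetry	1.5

Define the temperature enum as follows: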

@Getter
@AllArgsConstructor
public enum TemperatureEnum {

    CODE_GENERATION(0.0, "Code generation / math problems"),
    DATA_EXTRACTION(1.0, "Data extraction / analysis"),
    GENERAL_DIALOGUE(1.3, "General conversation"),
    TRANSLATION(1.3, "Translation"),
    CREATIVE_WRITING(1.5, "Creative writing / poetry");

    private final double value;
    private final String desc;

}
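When the temperature comes from user input rather than from the enum, it can be worth normalizing it to the documented range before it goes into a request. A minimal sketch, assuming out-of-range values should be clamped rather than rejected (TemperatureGuard is a hypothetical name):

```java
// Hypothetical helper: keeps a requested temperature inside the documented [0, 2.0] range.
class TemperatureGuard {
    static final double MIN = 0.0;
    static final double MAX = 2.0;
    static final double DEFAULT = 1.0;

    // Returns the default for missing or NaN input, otherwise clamps to [MIN, MAX].
    static double resolve(Double requested) {
        if (requested == null || requested.isNaN()) {
            return DEFAULT;
        }
        return Math.max(MIN, Math.min(MAX, requested));
    }

    public static void main(String[] args) {
        System.out.println(resolve(3.5));
        System.out.println(resolve(null));
    }
}
```

Rejecting the request with a validation error is an equally reasonable design; clamping simply keeps the call from failing on a parameter error.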

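Error codes

Calls to the DeepSeek API may return the following errors:

Error code	Cause	Solution
400 - Invalid format	The request body is malformed	Fix the request body according to the error message
401 - Authentication failed	The API key is wrong	Check that the API key is correct; create one first if you do not have one
402 - Insufficient balance	The account balance is exhausted	Check the account balance and top up on the billing page
422 - Invalid parameters	The request body contains invalid parameters	Fix the parameters according to the error message
429 - Rate limit reached	The request rate (TPM or RPM) hit the limit	Pace your requests sensibly
500 - Server error	Internal server failure	Wait and retry; if the problem persists, contact DeepSeek
503 - Server overloaded	The server is under heavy load	Retry the request later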
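The table splits into errors that need a fix on the caller's side (400, 401, 402, 422) and errors where waiting and retrying is the suggested remedy (429, 500, 503). A minimal sketch of that split with an exponential backoff delay follows; RetryPolicy, the base delay, and the cap are illustrative assumptions, not values from the DeepSeek documentation.

```java
import java.time.Duration;

// Hypothetical retry helper derived from the error table above.
class RetryPolicy {

    // Only rate-limit and server-side errors are worth retrying unchanged.
    static boolean isRetryable(int status) {
        return status == 429 || status == 500 || status == 503;
    }

    // Exponential backoff: base * 2^attempt, never more than cap.
    static Duration backoff(int attempt, Duration base, Duration cap) {
        int shift = Math.max(0, Math.min(attempt, 20)); // bound the shift to avoid overflow
        long millis = base.toMillis() * (1L << shift);
        return Duration.ofMillis(Math.min(millis, cap.toMillis()));
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.println(backoff(attempt, Duration.ofMillis(500), Duration.ofSeconds(10)));
        }
    }
}
```

Adding random jitter to the delay is a common refinement when many clients may retry at the same moment.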

Typical scenarios

Simple chat

Based on the official DeepSeek documentation (Chat Completion | DeepSeek API Docs), the following data structures can be defined:

RequestDTO:

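import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.experimental.Accessors;

import java.util.List;

// Note: @JsonProperty (not @JsonAlias) is required here, because @JsonAlias only affects
// deserialization and the request must be serialized with snake_case field names.
@Data
@NoArgsConstructor
@AllArgsConstructor
@Accessors(chain = true)
public class RequestDTO {
    // Model name: deepseek-chat or deepseek-reasoner
    private String model;
    // chat: at most 8K tokens, 4K by default; reasoner: at most 64K tokens, 32K by default
    @JsonProperty("max_tokens")
    private Integer maxTokens = 4096;
    // Request message list
    private List<Message> messages;
    // Whether to stream the response (SSE); the stream ends with data: [DONE]
    private boolean stream;
    // Streaming options
    @JsonProperty("stream_options")
    private StreamOptions streamOptions;
    // Response format
    @JsonProperty("response_format")
    private ResponseFormat responseFormat;
    // Temperature: 0 to 2, default 1; higher values make the output more random
    private Double temperature = 1.0;
    // Nucleus sampling: 0 to 1, default 1.0; only tokens within the top_p probability mass are considered
    @JsonProperty("top_p")
    private Double topP = 1.0;
    // Frequency penalty: -2.0 to 2.0, default 0.0; higher values make repetition less likely
    @JsonProperty("frequency_penalty")
    private Double frequencyPenalty = 0.0;
    // Presence penalty: -2.0 to 2.0, default 0.0; higher values make new topics more likely
    @JsonProperty("presence_penalty")
    private Double presencePenalty = 0.0;
    // Stop sequences: generation stops when the model produces one of these strings
    @JsonProperty("stop")
    private List<String> stopStringList;
    // Tool list
    private List<Tool> tools;
    /**
     * Tool-calling behavior
     * 1. none: the model does not call a tool and generates a message instead
     * 2. auto: the model may generate a message or call one or more tools
     * 3. required: the model must call one or more tools
     */
    @JsonProperty("tool_choice")
    private String toolChoice;
    // Whether to return log probabilities of the output tokens
    private boolean logprobs;
    // Integer from 0 to 20: number of most likely tokens to return at each output position,
    // each with its log probability (logprobs must be true)
    @JsonProperty("top_logprobs")
    private Integer topLogprobs;

    @Data
    @NoArgsConstructor
    @AllArgsConstructor
    @Accessors(chain = true)
    public static class Message {
        // Role: system, user, assistant, or tool
        private String role;
        // Content
        private String content;
        // Participant name, to tell apart participants that share a role
        private String name;
    }

    @Data
    public static class ResponseFormat {
        // Format: text (default) or json_object (the system or user message must also ask for JSON)
        private String type;
    }

    @Data
    public static class StreamOptions {
        // Whether to include usage information (emitted before data: [DONE])
        @JsonProperty("include_usage")
        private boolean includeUsage = false;
    }

    @Data
    public static class Tool {
        // Type; only "function" is supported at present
        private String type;
        // Function
        private Function function;
    }

    @Data
    public static class Function {
        // Function name
        private String name;
        // Function description
        private String description;
        // Function parameters, described as a JSON Schema object
        private Object parameters;
    }
}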

Non-streaming response, BlockingResponseDTO:

import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.Data;

import java.util.List;

@Data
public class BlockingResponseDTO {
    // Unique identifier of the conversation
    private String id;
    // Choice list; each choice holds a generated message and related information
    private List<Choice> choices;
    // Creation timestamp (seconds)
    private Long created;
    // Model name
    private String model;
    // Backend configuration the model ran with
    @JsonProperty("system_fingerprint")
    private String systemFingerprint;
    // Object type, always "chat.completion"
    private String object;
    // Usage information
    private Usage usage;

    @Data
    public static class Choice {
        // Index in the choice list, starting at 0
        private int index;
        // Generated message
        private Message message;
        /**
         * Reason generation stopped
         * 1. stop: natural stop, or a string from the request's stop field was hit
         * 2. length: the model's context limit or the requested max tokens was reached
         * 3. content_filter: the content filter was triggered
         * 4. insufficient_system_resource: generation was interrupted for lack of inference resources
         */
        @JsonProperty("finish_reason")
        private String finishReason;
        // Log probability information for this choice
        private Logprobs logprobs;
    }

    @Data
    public static class Message {
        // Role of the generated message
        private String role;
        // Content of the generated message
        private String content;
        // Reasoning content (deepseek-reasoner only)
        @JsonProperty("reasoning_content")
        private String reasoningContent;
        // Tool call list
        @JsonProperty("tool_calls")
        private List<Tool> toolCalls;
    }

    // Public so that callers outside this class can read the tool calls
    @Data
    public static class Tool {
        // ID of the tool call
        private String id;
        // Tool type; only "function" is supported at present
        private String type;
        // Function being called
        private Function function;
    }

    @Data
    public static class Function {
        // Function name
        private String name;
        // Arguments for the call, as a JSON-formatted string generated by the model
        private String arguments;
    }

    @Data
    public static class Logprobs {
        // List with log probability information for each output token
        @JsonProperty("content")
        private List<Content> content;
    }

    // @Data is needed so Jackson can populate the private fields through setters
    @Data
    public static class Content {
        // Output token
        private String token;
        // Log probability of the token
        private Double logprob;
        /**
         * UTF-8 byte representation of the token as a list of integers.
         * Useful when one UTF-8 character is split across several tokens.
         * null if the token has no byte representation.
         */
        @JsonProperty("bytes")
        private List<Integer> bytes;
        // The top N most likely tokens at this output position, with their log probabilities
        @JsonProperty("top_logprobs")
        private List<InnerContent> topLogprobs;
    }

    @Data
    public static class InnerContent {
        // Output token
        private String token;
        // Log probability of the token
        private Double logprob;
        /**
         * UTF-8 byte representation of the token as a list of integers.
         * Useful when one UTF-8 character is split across several tokens.
         * null if the token has no byte representation.
         */
        @JsonProperty("bytes")
        private List<Integer> bytes;
    }

    @Data
    public static class Usage {
        // Number of input tokens that hit the cache
        @JsonProperty("prompt_cache_hit_tokens")
        private int promptCacheHitTokens;
        // Number of input tokens that missed the cache
        @JsonProperty("prompt_cache_miss_tokens")
        private int promptCacheMissTokens;
        // Number of input tokens
        @JsonProperty("prompt_tokens")
        private int promptTokens;
        // Number of tokens generated by the model
        @JsonProperty("completion_tokens")
        private int completionTokens;
        // Total number of tokens (input + generated)
        @JsonProperty("total_tokens")
        private int totalTokens;
        // Breakdown of the completion tokens
        @JsonProperty("completion_tokens_details")
        private CompletionTokensDetails completionTokensDetails;
        // Breakdown of the prompt tokens
        @JsonProperty("prompt_tokens_details")
        private PromptTokensDetails promptTokensDetails;
    }

    @Data
    public static class CompletionTokensDetails {
        // Number of chain-of-thought tokens produced by the reasoning model
        @JsonProperty("reasoning_tokens")
        private int reasoningTokens;
    }

    @Data
    public static class PromptTokensDetails {
        // Number of input tokens that hit the cache
        @JsonProperty("cached_tokens")
        private int cachedTokens;
    }

}

Streaming response, StreamingResponseDTO:

import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.Data;

import java.util.List;

@Data
public class StreamingResponseDTO {
    // Unique identifier of the conversation
    private String id;
    // Choice list; each choice holds a message delta and related information
    private List<Choice> choices;
    // Creation timestamp (seconds)
    private Long created;
    // Model name
    private String model;
    // Backend configuration the model ran with
    @JsonProperty("system_fingerprint")
    private String systemFingerprint;
    // Object type, "chat.completion.chunk" for streamed chunks
    private String object;
    // Usage information
    private Usage usage;

    @Data
    public static class Choice {
        // Index in the choice list, starting at 0
        private int index;
        // Message delta
        private Delta delta;
        /**
         * Reason generation stopped
         * 1. stop: natural stop, or a string from the request's stop field was hit
         * 2. length: the model's context limit or the requested max tokens was reached
         * 3. content_filter: the content filter was triggered
         * 4. insufficient_system_resource: generation was interrupted for lack of inference resources
         */
        @JsonProperty("finish_reason")
        private String finishReason;
    }

    @Data
    public static class Delta {
        // Role of the generated message
        private String role;
        // Content of the generated message
        private String content;
        // Reasoning content (deepseek-reasoner only)
        @JsonProperty("reasoning_content")
        private String reasoningContent;
    }

    @Data
    public static class Usage {
        // Number of input tokens that hit the cache
        @JsonProperty("prompt_cache_hit_tokens")
        private int promptCacheHitTokens;
        // Number of input tokens that missed the cache
        @JsonProperty("prompt_cache_miss_tokens")
        private int promptCacheMissTokens;
        // Number of input tokens
        @JsonProperty("prompt_tokens")
        private int promptTokens;
        // Number of tokens generated by the model
        @JsonProperty("completion_tokens")
        private int completionTokens;
        // Total number of tokens (input + generated)
        @JsonProperty("total_tokens")
        private int totalTokens;
        // Breakdown of the completion tokens
        @JsonProperty("completion_tokens_details")
        private CompletionTokensDetails completionTokensDetails;
        // Breakdown of the prompt tokens
        @JsonProperty("prompt_tokens_details")
        private PromptTokensDetails promptTokensDetails;
    }

    @Data
    public static class CompletionTokensDetails {
        // Number of chain-of-thought tokens produced by the reasoning model
        @JsonProperty("reasoning_tokens")
        private int reasoningTokens;
    }

    @Data
    public static class PromptTokensDetails {
        // Number of input tokens that hit the cache
        @JsonProperty("cached_tokens")
        private int cachedTokens;
    }

}

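Create the DeepSeekUtil utility class and inject the DeepSeekProperties, WebClient, and ObjectMapper components it needs. Spring Boot auto-configures a WebClient.Builder rather than a WebClient, so a WebClient bean has to be defined in the application (for example, built from the injected WebClient.Builder).

import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.http.MediaType;
import org.springframework.stereotype.Component;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

import java.util.Collections;

@Slf4j
@Component
@RequiredArgsConstructor
public class DeepSeekUtil {

    private final DeepSeekProperties deepSeekProperties;
    private final WebClient webClient;
    private final ObjectMapper objectMapper;

}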

Blocking chat: every DeepSeek request parameter can be customized, and the model's result is returned in one piece.

public BlockingResponseDTO doBlockingRequest(RequestDTO requestDTO) {
    try {
        requestDTO.setStream(false);
        BlockingResponseDTO resp = webClient.post()
                // The chat endpoint is the base URL plus the chat path
                .uri(deepSeekProperties.getBaseUrl() + deepSeekProperties.getChatPath())
                .header("Content-Type", MediaType.APPLICATION_JSON_VALUE)
                .header("Authorization", deepSeekProperties.getAuthPrefix() + " " + deepSeekProperties.getApiKey())
                .bodyValue(requestDTO)
                .retrieve()
                .bodyToMono(BlockingResponseDTO.class)
                .block();  // kept as a blocking call for now
        log.info("[API call] response: {}", resp);
        return resp;
    } catch (Exception e) {
        log.error("[API call] request failed: {}", e.getMessage(), e);
        return null;
    }
}

A simple text-chat wrapper: apart from the user's input, every request parameter keeps its default, and the method returns the text generated by the model.

public String blockingRequest(String content) {
    RequestDTO requestDTO = new RequestDTO()
            .setModel(ModelEnum.CHAT.getCode())
            .setMessages(Collections.singletonList(new RequestDTO.Message().setRole(RoleEnum.USER.getCode()).setContent(content)))
            .setStream(false);
    BlockingResponseDTO resp = doBlockingRequest(requestDTO);
    // Guard against a failed call (null) and a missing or empty choice list
    if (resp != null && resp.getChoices() != null && !resp.getChoices().isEmpty()) {
        return resp.getChoices().get(0).getMessage().getContent();
    }
    return null;
}

Streaming chat: every DeepSeek request parameter can be customized, and the model's result is received as a stream.

public Flux<StreamingResponseDTO> doStreamingRequest(RequestDTO requestDTO) {
    // Enable streaming output
    requestDTO.setStream(true);
    return webClient.post()
            // The chat endpoint is the base URL plus the chat path
            .uri(deepSeekProperties.getBaseUrl() + deepSeekProperties.getChatPath())
            .header("Authorization", deepSeekProperties.getAuthPrefix() + " " + deepSeekProperties.getApiKey())
            .contentType(MediaType.APPLICATION_JSON)
            .accept(MediaType.TEXT_EVENT_STREAM)  // the API streams Server-Sent Events
            .bodyValue(requestDTO)
            .retrieve()
            .bodyToFlux(String.class)
            .flatMap(line -> {
                // Skip blank lines and the terminating [DONE] marker
                line = line.trim();
                if (line.isEmpty() || line.equals("data: [DONE]") || line.equals("[DONE]")) {
                    return Mono.empty();
                }
                // Strip the "data:" prefix if the decoder left it in place
                if (line.startsWith("data:")) {
                    line = line.substring(5).trim();
                }
                try {
                    return Mono.just(objectMapper.readValue(line, StreamingResponseDTO.class));
                } catch (Exception e) {
                    return Mono.error(e);
                }
            })
            .doOnError(e -> log.error("[Stream API] error", e));
}

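Likewise, a simple text-chat wrapper that takes the user's input and streams the text generated by the model:

public Flux<String> streamingRequest(String content) {
    RequestDTO requestDTO = new RequestDTO()
            .setModel(ModelEnum.CHAT.getCode())
            .setMessages(Collections.singletonList(new RequestDTO.Message().setRole(RoleEnum.USER.getCode()).setContent(content)))
            .setStream(true);
    // SSE events are left to the controller
    return doStreamingRequest(requestDTO)
            // Chunks may carry no choices (e.g. a usage-only chunk) or no delta
            .filter(e -> e.getChoices() != null && !e.getChoices().isEmpty()
                    && e.getChoices().get(0).getDelta() != null)
            // content can be null; a mapper that returns null would fail with a NullPointerException
            .concatMap(e -> Mono.justOrEmpty(e.getChoices().get(0).getDelta().getContent()));
}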
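The line handling inside doStreamingRequest can be isolated and tried without any network call. The sketch below is a plain-Java stand-in for that step, not the project's code: SseLines is a hypothetical name, and skipping lines that start with ":" (SSE comment lines, which servers may send as keep-alives) is an addition the original method does not make.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper mirroring the per-line filtering of the streaming method.
class SseLines {

    // Returns the JSON payload of one SSE line, or empty for blanks, comments and [DONE].
    static Optional<String> payload(String rawLine) {
        if (rawLine == null) {
            return Optional.empty();
        }
        String line = rawLine.trim();
        if (line.isEmpty() || line.startsWith(":")) {
            return Optional.empty();
        }
        if (line.startsWith("data:")) {
            line = line.substring(5).trim();
        }
        if (line.isEmpty() || line.equals("[DONE]")) {
            return Optional.empty();
        }
        return Optional.of(line);
    }

    // Keeps only the payloads of a batch of lines, in order.
    static List<String> payloads(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            payload(line).ifPresent(out::add);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(payload("data: {\"id\":\"1\"}"));
        System.out.println(payload("data: [DONE]"));
    }
}
```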

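Multi-turn conversation

The DeepSeek chat API is stateless: the server does not keep the context of earlier requests. On every request the caller must assemble the whole conversation history (each role with its text, maintained in the messages parameter) and send it to the chat API.

Scenario: asking about the length of rivers in Guangdong Province.

First question:

{
    "model": "deepseek-chat",
    "max_tokens": 4096,
    "messages": [
        {
            "role": "user",
            "content": "Which is the longest river in Guangdong Province? (Briefly give just the name and length.)",
            "name": null
        }
    ],
    "stream": false,
    // other fields
}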

First answer:

{
    "id": "4761872d-3ac6-4b68-83fb-fe6318a2f814",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The longest river in Guangdong Province is the **Pearl River**; its main stream is about **2,320 km** long (including the Xijiang section).",
                "reasoning_content": null,
                "tool_calls": null
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "created": 1751563029,
    "model": "deepseek-chat",
    // other fields
}

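Second question:

{
    "model": "deepseek-chat",
    "max_tokens": 4096,
    "messages": [
        {
            "role": "user",
            "content": "Which is the longest river in Guangdong Province? (Briefly give just the name and length.)",
            "name": null
        },
        {
            "role": "assistant",
            "content": "The longest river in Guangdong Province is the **Pearl River**; its main stream is about **2,320 km** long (including the Xijiang section).",
            "name": null
        },
        {
            "role": "user",
            "content": "What about the second and third longest?",
            "name": null
        }
    ],
    "stream": false,
    // other fields
}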

Second answer:

{
    "id": "c4ead835-0e9b-42ca-adee-65fe71c3dce9",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The second and third longest rivers in Guangdong Province are:  \n\n1. **Han River** (about 470 km)  \n2. **Bei River** (about 468 km)  \n\n(Note: lengths may differ slightly depending on the measurement standard.)",
                "reasoning_content": null,
                "tool_calls": null
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "created": 1751563087,
    "model": "deepseek-chat",
    // other fields
}

As shown above, the messages parameter of the second request contains the previous user question and assistant answer, with the new user question appended after those two messages.
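Because the API is stateless, the caller has to own the history. The following is a minimal in-memory sketch of that bookkeeping in plain Java; Conversation is a hypothetical class, and in the project itself the entries would be RequestDTO.Message objects rather than maps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical history holder: each turn is appended, and the full list is sent every time.
class Conversation {
    private final List<Map<String, String>> messages = new ArrayList<>();

    Conversation user(String content) {
        return add("user", content);
    }

    Conversation assistant(String content) {
        return add("assistant", content);
    }

    private Conversation add(String role, String content) {
        Map<String, String> message = new LinkedHashMap<>();
        message.put("role", role);
        message.put("content", content);
        messages.add(message);
        return this;
    }

    // Read-only copy of the history, suitable as the messages field of the next request.
    List<Map<String, String>> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(messages));
    }

    public static void main(String[] args) {
        Conversation c = new Conversation()
                .user("Which is the longest river in Guangdong Province?")
                .assistant("The Pearl River.")
                .user("What about the second and third longest?");
        System.out.println(c.snapshot());
    }
}
```

After each call, the assistant's reply is appended with assistant(...) before the next user(...) turn, which reproduces the second request shown above.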

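JSON output

In many scenarios the model has to answer strictly in JSON so that the output is structured and easy for downstream logic to parse. DeepSeek provides a JSON Output feature that makes the model emit a valid JSON string.

When calling the chat API:

Set the response_format parameter to {'type': 'json_object'};
The system or user prompt must contain the word json and give a sample of the JSON format the model should produce;
Set max_tokens sensibly so that the JSON string is not cut off midway.

Example: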

RequestDTO requestDTO = new RequestDTO();
requestDTO.setModel(ModelEnum.CHAT.getCode());
requestDTO.setMaxTokens(8192);
requestDTO.setMessages(
        Arrays.asList(
                new RequestDTO.Message()
                        .setRole("system")
                        .setContent("""
                                Respond with a plain-text JSON string.
                                The JSON contains one field named answer, which holds your answer.
                                Example:
                                User question: What is 1+1?
                                If you understand, the reply should be: {"answer": "2"}
                            """
                        ),
                new RequestDTO.Message()
                        .setRole("user")
                        .setContent("What is 2+3?")
        )
);
// ResponseFormat is a plain @Data class, so its setter returns void and cannot be chained inline
RequestDTO.ResponseFormat responseFormat = new RequestDTO.ResponseFormat();
responseFormat.setType("json_object");
requestDTO.setResponseFormat(responseFormat);
requestDTO.setStream(false);
return deepSeekUtil.doBlockingRequest(requestDTO).getChoices().get(0).getMessage().getContent();

Running this, the model outputs:

{
    "answer": "5"
}
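Two of the requirements above can be checked mechanically: before the call, that some prompt mentions json; after the call, that the output was not truncated. The sketch below assumes a finish_reason of "stop" plus matching outer braces is a good enough signal. It is a cheap pre-check, not a JSON parser, and JsonOutputGuard is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical guard for JSON Output calls.
class JsonOutputGuard {

    // True if at least one prompt contains the word "json" (case-insensitive).
    static boolean mentionsJson(List<String> prompts) {
        for (String prompt : prompts) {
            if (prompt != null && prompt.toLowerCase(Locale.ROOT).contains("json")) {
                return true;
            }
        }
        return false;
    }

    // Cheap truncation check: generation ended normally and the text is wrapped in braces.
    static boolean looksComplete(String finishReason, String content) {
        if (!"stop".equals(finishReason) || content == null) {
            return false;
        }
        String text = content.trim();
        return text.startsWith("{") && text.endsWith("}");
    }

    public static void main(String[] args) {
        System.out.println(mentionsJson(Arrays.asList("Respond with a JSON string.", "What is 2+3?")));
        System.out.println(looksComplete("length", "{\"answer\": \"5"));
    }
}
```

A response that passes this check should still be parsed with a real JSON library such as the injected ObjectMapper before it is used.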

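Reasoning model

The reasoning model differs from the chat model mainly in the following ways:

The input parameter model is set to deepseek-reasoner.
The input parameter max_tokens is the maximum length of a single answer (including the chain-of-thought output); it defaults to 32K and is capped at 64K.
The output field message.reasoning_content holds the chain of thought.
The API supports a context of up to 64K; the length of the returned reasoning_content does not count toward that 64K.
Unsupported feature: FIM completion (Beta).
Unsupported parameters: temperature, top_p, presence_penalty, frequency_penalty, logprobs, top_logprobs. Of these, setting logprobs or top_logprobs causes an error.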
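One way to respect the unsupported-parameter list is to drop those keys before a request goes to deepseek-reasoner. The sketch below works on a generic map representation of the request body; ReasonerParams is a hypothetical helper, and removing the keys (rather than rejecting the request) is a design assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: removes parameters the reasoning model does not support.
class ReasonerParams {
    static final List<String> UNSUPPORTED = Arrays.asList(
            "temperature", "top_p", "presence_penalty",
            "frequency_penalty", "logprobs", "top_logprobs");

    // Returns a cleaned copy; the input map is left untouched.
    static Map<String, Object> sanitize(Map<String, Object> request) {
        Map<String, Object> copy = new LinkedHashMap<>(request);
        if ("deepseek-reasoner".equals(copy.get("model"))) {
            copy.keySet().removeAll(UNSUPPORTED);
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> request = new LinkedHashMap<>();
        request.put("model", "deepseek-reasoner");
        request.put("temperature", 1.3);
        request.put("max_tokens", 8192);
        System.out.println(sanitize(request));
    }
}
```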

Non-streaming

With a non-streaming call, the model returns the reasoning content and the answer together, for example:

{
    "id": "8c1a2be9-ac0e-4f4f-bd50-800785489b09",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "answer text xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
                "reasoning_content": "reasoning text xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
                "tool_calls": null
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    // other fields
}

Streaming

With a streaming call, the model first returns chunks that carry reasoning content while content is null, for example:

{
    "id": "45113af0-4011-406e-a3f6-a0dc7bacfa2b",
    "choices": [
        {
            "index": 0,
            "delta": {
                "role": null,
                "content": null,
                "reasoning_content": "reasoning text xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
            },
            "finish_reason": null
        }
    ],
    // other fields
}

Once reasoning is finished, the answer is returned; in these chunks the reasoning content is null, for example:

{
    "id": "45113af0-4011-406e-a3f6-a0dc7bacfa2b",
    "choices": [
        {
            "index": 0,
            "delta": {
                "role": null,
                "content": "answer text xxxxxxxxxxxxxxxxxxxxxxxxxxxx",
                "reasoning_content": null
            },
            "finish_reason": null
        }
    ],
    // other fields
}
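Since each chunk carries either reasoning_content or content, a consumer that wants both texts in full has to accumulate them separately. A minimal sketch follows; StreamAccumulator is a hypothetical class, and in the project the two arguments would come from StreamingResponseDTO.Delta.

```java
// Hypothetical accumulator for streamed reasoning-model output.
class StreamAccumulator {
    private final StringBuilder reasoning = new StringBuilder();
    private final StringBuilder answer = new StringBuilder();

    // Appends whichever part of the delta is present; null parts are ignored.
    void accept(String reasoningContent, String content) {
        if (reasoningContent != null) {
            reasoning.append(reasoningContent);
        }
        if (content != null) {
            answer.append(content);
        }
    }

    String reasoning() {
        return reasoning.toString();
    }

    String answer() {
        return answer.toString();
    }

    public static void main(String[] args) {
        StreamAccumulator acc = new StreamAccumulator();
        acc.accept("First, add the numbers. ", null);
        acc.accept(null, "The answer is 5.");
        System.out.println(acc.reasoning() + "| " + acc.answer());
    }
}
```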

Listing models

The response DTO for this endpoint:

@Data
public class ModelResponseDTO {
    // Always "list"
    private String object;
    // Model list
    private List<Model> data;

    @Data
    public static class Model {
        // Identifier of the model
        private String id;
        // Object type, always "model"
        private String object;
        // Organization that owns the model
        @JsonProperty("owned_by")
        private String ownedBy;
    }

}

Define the method that fetches the model list in DeepSeekUtil:

public ModelResponseDTO listModels() {
    try {
        ModelResponseDTO resp = webClient.get()
                .uri(deepSeekProperties.getBaseUrl() + deepSeekProperties.getModelsPath())
                .header("Authorization", deepSeekProperties.getAuthPrefix() + " " + deepSeekProperties.getApiKey())
                .retrieve()
                .bodyToMono(ModelResponseDTO.class)
                .block();
        log.info("[API call] response: {}", resp);
        return resp;
    } catch (Exception e) {
        log.error("[API call] request failed: {}", e.getMessage(), e);
        return null;
    }
}

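Querying the balance

The response DTO for this endpoint:

@Data
public class BalanceResponseDTO {
    // Whether there is balance available for API calls.
    // Named "available" rather than "isAvailable" so that Lombok's isAvailable()/setAvailable()
    // accessors line up with the field and Jackson sees a single property.
    @JsonProperty("is_available")
    private boolean available;
    // Balance information
    @JsonProperty("balance_infos")
    private List<BalanceInfo> balanceInfos;

    @Data
    public static class BalanceInfo {
        // Currency, e.g. CNY or USD
        @JsonProperty("currency")
        private String currency;
        // Available balance (granted plus topped-up)
        @JsonProperty("total_balance")
        private String totalBalance;
        // Granted balance that has not expired
        @JsonProperty("granted_balance")
        private String grantedBalance;
        // Topped-up balance
        @JsonProperty("topped_up_balance")
        private String toppedUpBalance;
    }
}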

Define the method that fetches the balance in DeepSeekUtil:

public BalanceResponseDTO getBalance() {
    try {
        BalanceResponseDTO resp = webClient.get()
                .uri(deepSeekProperties.getBaseUrl() + deepSeekProperties.getBalancePath())
                .header("Authorization", deepSeekProperties.getAuthPrefix() + " " + deepSeekProperties.getApiKey())
                .retrieve()
                .bodyToMono(BalanceResponseDTO.class)
                .block();
        log.info("[API call] response: {}", resp);
        return resp;
    } catch (Exception e) {
        log.error("[API call] request failed: {}", e.getMessage(), e);
        return null;
    }
}
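The balance fields are modeled as strings in the DTO, so comparing them requires a numeric conversion. A small sketch of a low-balance check with BigDecimal follows; BalanceCheck and the threshold values are illustrative assumptions.

```java
import java.math.BigDecimal;

// Hypothetical helper: compares a balance string against a threshold.
class BalanceCheck {

    // True if totalBalance is strictly below threshold; both are decimal strings.
    static boolean isBelow(String totalBalance, String threshold) {
        return new BigDecimal(totalBalance).compareTo(new BigDecimal(threshold)) < 0;
    }

    public static void main(String[] args) {
        System.out.println(isBelow("4.50", "5"));
        System.out.println(isBelow("110.00", "50"));
    }
}
```

BigDecimal.compareTo ignores scale, so "5.00" and "5" compare as equal, which is why it is used here instead of equals.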

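Summary

This article walked through calling DeepSeek from Spring Boot:

Request a key ➜ configure YAML ➜ write the utility class ➜ blocking/streaming chat
The more advanced multi-turn context, strict JSON output, and reasoning model
The model list and balance query endpoints for operations

From here, DeepSeekUtil can be wrapped in a business-layer Service, exposed through WebFlux SSE, or combined with a database that persists the conversation context, to support richer AI scenarios.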


Unless otherwise stated, all articles on this blog are licensed under CC BY-SA 4.0; please credit the source when reposting.
