news 2026/4/29 23:33:24

Enterprise Java + LangChain4j RAG System: Rate Limiting, Circuit Breaking, and Degradation


张小明

Front-end developer


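1. Document Overview

This document is based on an enterprise AI knowledge-base RAG project built with SpringBoot3 + LangChain4j + Milvus/Chroma + MySQL + Redis. It brings together the mainstream approaches to API rate limiting, circuit breaking, and degradation currently used in industry, with complete runnable source code, configuration, guidance on choosing a solution per scenario, production rollout standards, and the key points that come up in interviews.

All of the code is meant as a drop-in replacement for Sentinel, with no conflicts, and can be deployed directly. It covers every business scenario in the project: AI Q&A, document parsing, and chunked upload of large files.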

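2. Core Concepts (Production Essentials)

2.1 Rate Limiting (RateLimit)

Caps an endpoint's QPS so that excess requests cannot overwhelm the service. It addresses traffic spikes and malicious hammering of endpoints.

Applies to: high-frequency AI Q&A, document upload, and batch parsing endpoints.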
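One common way to cap QPS is a token bucket. The following is a minimal stdlib-only sketch of the idea, not the implementation used by any of the libraries discussed here; the class name `TokenBucket` and the injectable clock are assumptions made for illustration and testability.

```java
import java.util.function.LongSupplier;

/** Illustrative token bucket: holds up to `capacity` tokens, refilled continuously. */
final class TokenBucket {
    private final long capacity;
    private final double refillPerNano;
    private final LongSupplier nanoClock;   // injectable so the behaviour can be tested
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity;             // start full, so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Non-blocking: takes one token if available, otherwise rejects immediately. */
    synchronized boolean tryConsume() {
        long now = nanoClock.getAsLong();
        // Add the tokens earned since the last call, never exceeding the capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In real code the clock would typically be `System::nanoTime`, for example `new TokenBucket(5, 5.0, System::nanoTime)` for roughly 5 requests per second with bursts of up to 5.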

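2.2 Circuit Breaking (CircuitBreaker)

When a dependency (LLM API, vector store, database) times out or its error rate climbs too high, the circuit breaker automatically cuts off requests to it. This keeps requests from piling up and threads from blocking, and prevents a cascading failure (service avalanche).

A hard requirement for AI projects: LLM APIs are unstable, slow to respond, and prone to timeouts.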
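The state machine behind a circuit breaker can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not Resilience4j's implementation: the class name `SimpleCircuitBreaker` is hypothetical, the window is count-based, any failure during the half-open trial reopens the circuit, and the number of concurrent trial calls is not limited.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final boolean[] outcomes;          // ring buffer, true = failed call
    private final double failureRateThreshold; // 0.5 means 50%
    private final long openNanos;              // how long to stay OPEN before probing
    private final int halfOpenTrials;          // successes needed to close again
    private final LongSupplier nanoClock;

    private State state = State.CLOSED;
    private int recorded, next, failures, trialSuccesses;
    private long openedAtNanos;

    SimpleCircuitBreaker(int windowSize, double failureRateThreshold,
                         long openNanos, int halfOpenTrials, LongSupplier nanoClock) {
        this.outcomes = new boolean[windowSize];
        this.failureRateThreshold = failureRateThreshold;
        this.openNanos = openNanos;
        this.halfOpenTrials = halfOpenTrials;
        this.nanoClock = nanoClock;
    }

    synchronized State state() { return state; }

    /** Runs the action unless the circuit is open; failures are recorded and rethrown. */
    <T> T call(Supplier<T> action) {
        if (!permit()) {
            throw new IllegalStateException("circuit open, call rejected");
        }
        try {
            T result = action.get();
            record(false);
            return result;
        } catch (RuntimeException e) {
            record(true);
            throw e;
        }
    }

    private synchronized boolean permit() {
        if (state == State.OPEN) {
            if (nanoClock.getAsLong() - openedAtNanos < openNanos) return false;
            state = State.HALF_OPEN;           // wait time elapsed: allow trial calls
            trialSuccesses = 0;
        }
        return true;
    }

    private synchronized void record(boolean failure) {
        if (state == State.HALF_OPEN) {
            if (failure) trip();
            else if (++trialSuccesses >= halfOpenTrials) reset();
            return;
        }
        if (state == State.OPEN) return;       // late result of a call admitted earlier
        if (recorded == outcomes.length) {
            if (outcomes[next]) failures--;    // evict the oldest outcome
        } else {
            recorded++;
        }
        outcomes[next] = failure;
        if (failure) failures++;
        next = (next + 1) % outcomes.length;
        // Evaluate the failure rate only once the window is full.
        if (recorded == outcomes.length
                && (double) failures / recorded >= failureRateThreshold) trip();
    }

    private void trip() {
        state = State.OPEN;
        openedAtNanos = nanoClock.getAsLong();
    }

    private void reset() {
        state = State.CLOSED;
        Arrays.fill(outcomes, false);
        recorded = next = failures = 0;
    }
}
```

With a window of 10 and a threshold of 0.5, five failures among the last ten calls open the circuit; calls are then rejected without touching the dependency until the wait time has passed.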

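2.3 Degradation (Fallback)

When a circuit is open or a call fails, return a predefined fallback result so that the service stays available, returns a normal response instead of an error, and does not cascade into an avalanche.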
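Stripped of any framework, degradation amounts to catching the failure and substituting a prepared answer. A minimal sketch, assuming a hypothetical helper class `Fallbacks` (not part of any library mentioned here):

```java
import java.util.function.Function;
import java.util.function.Supplier;

final class Fallbacks {
    private Fallbacks() {}

    /** Returns the primary result, or the fallback's result if the primary call throws. */
    static <T> T withFallback(Supplier<T> primary, Function<Throwable, T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            // The caller gets a usable value; the exception is passed on for logging or branching.
            return fallback.apply(e);
        }
    }
}
```

A call such as `Fallbacks.withFallback(() -> llm.ask(q), e -> "AI service is busy, please retry later")` (where `llm.ask` stands in for the real model call) keeps the endpoint responding even when the model API fails.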

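3. Capability Comparison of Mainstream Solutions

| Solution | Rate limiting | Circuit breaking / degradation | Distributed cluster | Performance | Ops cost | Typical use |
|---|---|---|---|---|---|---|
| Resilience4j | | ✅ all-round | needs Redis alongside | very high | very low | monolith / microservices; first choice for AI projects |
| Sentinel | | | | | medium (needs a console) | Alibaba ecosystem; dynamic rules with a visual console |
| Redisson | ✅ distributed | | ✅ strong fit | | very low | cluster-wide global rate limiting |
| Bucket4j | ✅ high performance | | supports Redis | extremely high | very low | large file uploads, high-concurrency endpoints |
| Guava RateLimiter | ✅ single instance | | | very high | 0 | small internal projects, simple anti-abuse |
| Redis+Lua | ✅ native | | | | very low | no framework, minimal stack |
| Hystrix | | | | | | legacy projects only; do not use in new projects |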

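4. Solution Selection Flowchart (Reusable)

Start → decide whether you need circuit breaking/degradation and timeout protection against cascading failure

4.1 No circuit breaking needed (rate limiting only)

  • Small single-instance project → Guava RateLimiter

  • High concurrency / large file uploads → Bucket4j

  • Multi-instance cluster → Redisson / Redis+Lua

  • Home-grown, no framework → Redis+Lua

4.2 Circuit breaking and degradation needed (mandatory for AI projects)

  • Alibaba ecosystem, visual console required → Sentinel

  • Outside the Alibaba ecosystem, lightweight, no lock-in → Resilience4j

  • Single-instance deployment → Resilience4j for everything

  • Cluster deployment → Resilience4j (circuit breaking) + Redisson (distributed rate limiting)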
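The flow above can be written down as a small function. This is only a sketch: the class and parameter names are hypothetical, and because the listed branches are not mutually exclusive, the order in which they are checked here is an assumption rather than something the flow prescribes.

```java
final class LimiterSelector {
    enum Deployment { SINGLE_INSTANCE, CLUSTER }

    private LimiterSelector() {}

    static String recommend(boolean needsCircuitBreaking,
                            boolean alibabaStackWithConsole,
                            Deployment deployment,
                            boolean highConcurrencyOrLargeUploads) {
        if (needsCircuitBreaking) {
            // Section 4.2: circuit breaking and degradation are required.
            if (alibabaStackWithConsole) return "Sentinel";
            return deployment == Deployment.CLUSTER
                    ? "Resilience4j (circuit breaking) + Redisson (distributed rate limiting)"
                    : "Resilience4j";
        }
        // Section 4.1: rate limiting only.
        if (deployment == Deployment.CLUSTER) return "Redisson or Redis+Lua";
        if (highConcurrencyOrLargeUploads) return "Bucket4j";
        return "Guava RateLimiter";
    }
}
```

For the RAG project described here (circuit breaking needed, no Alibaba stack, clustered), the function returns the Resilience4j plus Redisson combination.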

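5. Final Production Stack for This RAG Project

A fixed choice, so selection does not have to be revisited

  • Circuit breaking, degradation, timeouts, avalanche protection: Resilience4j (supported by Spring Cloud Circuit Breaker, no vendor lock-in)

  • Single-instance endpoint QPS limiting: Resilience4j's built-in rate limiter

  • Cluster-wide distributed rate limiting: Redisson

  • High-concurrency rate limiting for chunked large-file uploads: Bucket4j

  • Minimal alternatives: Guava, Redis+Lua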

6. Full Project Dependencies (pom.xml)

Combines every rate limiting and circuit breaking option with the core RAG dependencies; can replace the existing pom directly

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.6</version>
        <relativePath/>
    </parent>
    <groupId>com.ai</groupId>
    <artifactId>langchain4j-enterprise-rag</artifactId>
    <version>1.0.0</version>
    <name>LangChain4j Enterprise RAG System</name>
    <properties>
        <java.version>17</java.version>
        <langchain4j.version>0.32.0</langchain4j.version>
        <mybatis-plus.version>3.5.5</mybatis-plus.version>
        <fastjson2.version>2.0.52</fastjson2.version>
    </properties>
    <dependencies>
        <!-- Spring Boot basics -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- Required for the Resilience4j annotations, which are applied through Spring AOP -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-aop</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <!-- Database -->
        <dependency>
            <groupId>com.mysql</groupId>
            <artifactId>mysql-connector-j</artifactId>
            <scope>runtime</scope>
        </dependency>
        <!-- Spring Boot 3 needs the boot3 starter; mybatis-plus-boot-starter targets Spring Boot 2 -->
        <dependency>
            <groupId>com.baomidou</groupId>
            <artifactId>mybatis-plus-spring-boot3-starter</artifactId>
            <version>${mybatis-plus.version}</version>
        </dependency>
        <!-- LangChain4j RAG core -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-spring-boot-starter</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <!-- Tongyi Qianwen (Qwen) models are provided by the DashScope module -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-dashscope</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-milvus</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-chroma</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-document-parser-apache-tika</artifactId>
            <version>${langchain4j.version}</version>
        </dependency>
        <!-- Rate limiting and circuit breaking: all options -->
        <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-starter-circuitbreaker-resilience4j</artifactId>
            <version>3.2.1</version>
        </dependency>
        <dependency>
            <groupId>org.redisson</groupId>
            <artifactId>redisson-spring-boot-starter</artifactId>
            <version>3.27.0</version>
        </dependency>
        <dependency>
            <groupId>com.github.vladimir-bukhtoyarov</groupId>
            <artifactId>bucket4j-core</artifactId>
            <version>7.6.0</version>
        </dependency>
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>32.1.3-jre</version>
        </dependency>
        <!-- Utilities -->
        <dependency>
            <groupId>org.projectlombok</groupId>
            <artifactId>lombok</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>com.alibaba.fastjson2</groupId>
            <artifactId>fastjson2</artifactId>
            <version>${fastjson2.version}</version>
        </dependency>
        <dependency>
            <groupId>cn.hutool</groupId>
            <artifactId>hutool-all</artifactId>
            <version>5.8.30</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <configuration>
                    <excludes>
                        <exclude>
                            <groupId>org.projectlombok</groupId>
                            <artifactId>lombok</artifactId>
                        </exclude>
                    </excludes>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

7. Global Configuration File (application-dev.yml)

spring:
  datasource:
    url: jdbc:mysql://127.0.0.1:3306/rag_db?useUnicode=true&characterEncoding=utf8&serverTimezone=Asia/Shanghai&allowMultiQueries=true
    username: root
    password: 123456
    driver-class-name: com.mysql.cj.jdbc.Driver
  data:
    redis:
      host: 127.0.0.1
      port: 6379
      password:
      database: 0

# Tongyi Qianwen (Qwen) LLM settings
langchain4j:
  tongyi:
    api-key: sk-xxx-your-key-xxx
    model-name: qwen-turbo
    timeout: 60s

# Vector store settings (shown at top level here; match the prefix your configuration classes bind to)
milvus:
  host: 127.0.0.1
  port: 19530
  collection-name: enterprise_knowledge
chroma:
  host: 127.0.0.1
  port: 8000

# Resilience4j circuit breaker + rate limiter settings
resilience4j:
  circuitbreaker:
    instances:
      aiChatCircuit:
        slidingWindowSize: 10                      # evaluate the last 10 calls
        failureRateThreshold: 50                   # open at a failure rate of 50% or more
        waitDurationInOpenState: 10s               # stay open for 10 s before probing
        permittedNumberOfCallsInHalfOpenState: 3   # trial calls allowed while half-open
      uploadCircuit:
        slidingWindowSize: 10
        failureRateThreshold: 50
  ratelimiter:
    instances:
      aiChatLimit:
        limitForPeriod: 5         # 5 calls per refresh period
        limitRefreshPeriod: 1s
        timeoutDuration: 2s       # wait up to 2 s for a permit before rejecting
      uploadLimit:
        limitForPeriod: 2
        limitRefreshPeriod: 1s

8. Configuration Class Source

8.1 Bucket4jConfig.java: high-performance rate limiting configuration

package com.ai.rag.config;

import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.Refill;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import java.time.Duration;

@Configuration
public class Bucket4jConfig {

    // AI chat: bucket of 5 tokens, refilled at 5 tokens per second
    @Bean
    public Bucket aiChatBucket() {
        Bandwidth bandwidth = Bandwidth.classic(5, Refill.greedy(5, Duration.ofSeconds(1)));
        return Bucket.builder().addLimit(bandwidth).build();
    }

    // Upload: bucket of 2 tokens, refilled at 2 tokens per second
    @Bean
    public Bucket uploadBucket() {
        Bandwidth bandwidth = Bandwidth.classic(2, Refill.greedy(2, Duration.ofSeconds(1)));
        return Bucket.builder().addLimit(bandwidth).build();
    }
}

8.2 GuavaLimitConfig.java: lightweight single-instance rate limiting configuration

package com.ai.rag.config;

import com.google.common.util.concurrent.RateLimiter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class GuavaLimitConfig {

    // AI chat: 5 permits per second
    @Bean
    public RateLimiter aiGuavaLimiter() {
        return RateLimiter.create(5.0);
    }

    // Upload: 2 permits per second
    @Bean
    public RateLimiter uploadGuavaLimiter() {
        return RateLimiter.create(2.0);
    }
}

9. Rate Limiting Utility Classes

9.1 RedisLimitUtil.java: distributed rate limiting with Redisson

package com.ai.rag.util;

import jakarta.annotation.Resource;
import org.redisson.api.RRateLimiter;
import org.redisson.api.RateIntervalUnit;
import org.redisson.api.RateType;
import org.redisson.api.RedissonClient;
import org.springframework.stereotype.Component;

@Component
public class RedisLimitUtil {

    @Resource
    private RedissonClient redissonClient;

    public boolean tryLimit(String key, int qps) {
        RRateLimiter limiter = redissonClient.getRateLimiter(key);
        // trySetRate only applies when no rate is stored for this key yet,
        // so calling it on every request does not reset the limiter.
        limiter.trySetRate(RateType.OVERALL, qps, 1, RateIntervalUnit.SECONDS);
        // Non-blocking: returns false immediately when no permit is available.
        return limiter.tryAcquire(1);
    }
}

9.2 LuaLimitUtil.java: native rate limiting with Redis + Lua

package com.ai.rag.util;

import jakarta.annotation.Resource;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;
import org.springframework.stereotype.Component;

import java.util.List;

@Component
public class LuaLimitUtil {

    @Resource
    private StringRedisTemplate stringRedisTemplate;

    // Fixed one-second window: the TTL is set only when the counter is created,
    // so later requests in the same window do not extend it.
    private static final String LUA_SCRIPT = """
            local key = KEYS[1]
            local limit = tonumber(ARGV[1])
            local curr = tonumber(redis.call('get', key) or '0')
            if curr + 1 > limit then
                return 0
            end
            curr = redis.call('incr', key)
            if curr == 1 then
                redis.call('expire', key, 1)
            end
            return 1
            """;

    // Built once and reused, so the script is not re-created on every request.
    private static final DefaultRedisScript<Long> SCRIPT =
            new DefaultRedisScript<>(LUA_SCRIPT, Long.class);

    public boolean tryLimit(String key, int limit) {
        Long result = stringRedisTemplate.execute(SCRIPT, List.of(key), String.valueOf(limit));
        return result != null && result == 1;
    }
}
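The counting logic of a Redis counter with a TTL can be modeled in memory to make its semantics easier to see. This is a stdlib-only sketch, not a replacement for the Redis script: the class name `FixedWindowLimiter` and the injectable clock are assumptions, and the window is taken to start at the first request and never be extended, which is an assumption about the intended behaviour.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

final class FixedWindowLimiter {
    private static final class Window {
        long count;
        long expiresAtMillis;
    }

    private final Map<String, Window> windows = new HashMap<>();
    private final long windowMillis;
    private final LongSupplier clockMillis;   // injectable so expiry can be tested

    FixedWindowLimiter(long windowMillis, LongSupplier clockMillis) {
        this.windowMillis = windowMillis;
        this.clockMillis = clockMillis;
    }

    synchronized boolean tryAcquire(String key, int limit) {
        long now = clockMillis.getAsLong();
        Window w = windows.get(key);
        if (w == null || now >= w.expiresAtMillis) {
            // Key missing or expired (the role the TTL plays in Redis): start a new window.
            w = new Window();
            w.expiresAtMillis = now + windowMillis;
            windows.put(key, w);
        }
        if (w.count + 1 > limit) {
            return false;                      // over the limit for this window
        }
        w.count++;
        return true;
    }
}
```

A fixed window is simple, but it allows up to twice the limit across a window boundary (a burst at the end of one window followed by a burst at the start of the next); token-bucket limiters smooth this out.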

10. All-in-One Controller (All 5 Options Covered)

Resilience4j circuit breaking and rate limiting is enabled by default; endpoints for the other four options are kept alongside so you can switch between them

package com.ai.rag.controller;

import com.ai.rag.common.R;
import com.ai.rag.service.DocumentService;
import com.ai.rag.service.RagQaService;
import com.ai.rag.util.LuaLimitUtil;
import com.ai.rag.util.RedisLimitUtil;
import com.google.common.util.concurrent.RateLimiter;
import io.github.bucket4j.Bucket;
import jakarta.annotation.Resource;
import lombok.RequiredArgsConstructor;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/api/rag")
@RequiredArgsConstructor
public class RagController {

    private final DocumentService documentService;
    private final RagQaService ragQaService;

    // @Resource injects by field name, which picks the right bean of each type
    @Resource
    private Bucket aiChatBucket;
    @Resource
    private Bucket uploadBucket;
    @Resource
    private RateLimiter aiGuavaLimiter;
    @Resource
    private RateLimiter uploadGuavaLimiter;
    @Resource
    private RedisLimitUtil redisLimitUtil;
    @Resource
    private LuaLimitUtil luaLimitUtil;

    // 1. Resilience4j circuit breaking + rate limiting (default production option)
    @GetMapping("/chat")
    @io.github.resilience4j.ratelimiter.annotation.RateLimiter(name = "aiChatLimit")
    @io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker(name = "aiChatCircuit", fallbackMethod = "chatFallback")
    public R<String> chat(@RequestParam String sessionId, @RequestParam String question) {
        return R.ok(ragQaService.chat(sessionId, question));
    }

    @PostMapping("/upload")
    @io.github.resilience4j.ratelimiter.annotation.RateLimiter(name = "uploadLimit")
    @io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker(name = "uploadCircuit", fallbackMethod = "uploadFallback")
    public R<String> upload(@RequestParam MultipartFile file) throws Exception {
        documentService.uploadAndEmbed(file);
        return R.ok("Document uploaded and embedded");
    }

    // 2. Bucket4j high-performance rate limiting
    @GetMapping("/chat/bucket4j")
    public R<String> chatByBucket4j(@RequestParam String sessionId, @RequestParam String question) {
        if (!aiChatBucket.tryConsume(1)) {
            return R.fail("[Bucket4j] AI Q&A endpoint is rate limited");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 3. Guava lightweight rate limiting
    @GetMapping("/chat/guava")
    public R<String> chatByGuava(@RequestParam String sessionId, @RequestParam String question) {
        if (!aiGuavaLimiter.tryAcquire()) {
            return R.fail("[Guava] Too many AI Q&A requests, please try again later");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 4. Redisson distributed rate limiting
    @GetMapping("/chat/redisson")
    public R<String> chatByRedisson(@RequestParam String sessionId, @RequestParam String question) {
        if (!redisLimitUtil.tryLimit("ai:chat:cluster:limit", 5)) {
            return R.fail("[Redisson] Cluster-wide rate limit reached");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    // 5. Redis + Lua native rate limiting
    @GetMapping("/chat/lua")
    public R<String> chatByLua(@RequestParam String sessionId, @RequestParam String question) {
        if (!luaLimitUtil.tryLimit("ai:chat:lua:limit", 5)) {
            return R.fail("[Lua] Request limit reached");
        }
        return R.ok(ragQaService.chat(sessionId, question));
    }

    @DeleteMapping("/clear")
    public R<String> clear() {
        documentService.clearVectorStore();
        return R.ok("Vector store cleared");
    }

    // Fallback methods: same parameters as the protected method, plus the exception
    public R<String> chatFallback(String sessionId, String question, Throwable e) {
        return R.fail("AI service is busy and has been degraded by the circuit breaker, please try again later");
    }

    public R<String> uploadFallback(MultipartFile file, Throwable e) {
        return R.fail("Document upload service error, degraded");
    }
}

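11. Production Pitfalls and Rules

  • ❌ Do not use rate limiting without circuit breaking: LLM timeouts pile up and will cascade into an avalanche

  • ❌ Do not use Guava or local (in-memory) Bucket4j in a cluster: each instance counts only its own traffic, so the global limit is not enforced

  • ❌ Do not use Hystrix or old home-grown circuit breakers in new projects

  • ✅ Split responsibilities: circuit breaking goes through Resilience4j everywhere, rate limiting is chosen per scenario

  • ✅ Give large-file uploads their own limiter so they do not eat into the Q&A endpoints' quota

  • ✅ In a cluster, pair with Redisson for global traffic control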
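The cluster pitfall comes down to simple arithmetic: an in-process limiter only sees the traffic of its own instance, so the total admitted rate scales with the number of instances. A tiny illustration, with a hypothetical helper name:

```java
final class ClusterLimitMath {
    private ClusterLimitMath() {}

    /** Upper bound on cluster-wide QPS when every instance enforces its own local limit. */
    static long worstCaseClusterQps(int instances, long perInstanceQps) {
        if (instances < 1 || perInstanceQps < 0) {
            throw new IllegalArgumentException("instances must be >= 1 and perInstanceQps >= 0");
        }
        return Math.multiplyExact((long) instances, perInstanceQps);
    }
}
```

With 4 instances each allowing 5 QPS locally, the LLM API behind them can receive up to 20 QPS, which is why a shared limiter backed by Redis is needed to hold a global cap of 5.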

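12. Startup Order

  1. Start Redis, MySQL, and the Milvus/Chroma vector store

  2. Set your Tongyi Qianwen (Qwen) API key in the yml file

  3. Refresh the Maven dependencies

  4. Start the SpringBoot application

  5. The default endpoint /api/rag/chat comes with circuit breaking, rate limiting, and degradation built in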

13. Endpoint Test Checklist

  • Resilience4j (default): GET /api/rag/chat

  • Bucket4j: GET /api/rag/chat/bucket4j

  • Guava: GET /api/rag/chat/guava

  • Redisson (distributed): GET /api/rag/chat/redisson

  • Lua (native): GET /api/rag/chat/lua

  • Document upload: POST /api/rag/upload

  • Clear vector store: DELETE /api/rag/clear
