feat(api-server): add request backpressure controls

2026-05-17 04:56:45 +08:00
parent fb23ee79d8
commit 02271e6c73
11 changed files with 478 additions and 2 deletions
--- a/docs/【开发运维】本地开发验证与生产运维-2026-05-15.md
+++ b/docs/【开发运维】本地开发验证与生产运维-2026-05-15.md
@@ -154,6 +154,7 @@ Jenkins is split by web / api / Spacetime module / build / deploy / publish
 First-pass load-test tuning baseline for 50 HTTP req/s:

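 - The `api-server` production template defaults to `GENARRATIVE_API_LISTEN_BACKLOG=1024` and `GENARRATIVE_API_WORKER_THREADS=4`; when worker threads are not set locally, the Tokio default continues to apply.
+- `GENARRATIVE_API_MAX_CONCURRENT_REQUESTS=512` enables in-application HTTP concurrency backpressure: once the concurrency permits are exhausted, requests are rejected immediately with `429 Too Many Requests` and `Retry-After: 1`; `/healthz` is exempt from the limit. The value is not an RPS rate limit. If 429s rise during a load test while memory and p95 converge, backpressure is protecting the process, and the threshold should be tuned against real capacity or rate limiting should be added in front at Nginx. If an extremely high-RPS load test sent directly to `api-server` sees `connection refused`, the load has usually reached the TCP listen / accept layer; check the backlog, Nginx upstream keepalive, and front-end rate limiting together.
 - `genarrative-api.service` sets `LimitNOFILE=65535` and `TasksMax=2048`; after deployment, verify with `systemctl show genarrative-api.service -p LimitNOFILE -p TasksMax` and `cat /proc/$(pidof api-server)/limits`.
 - Nginx proxies `/api/` and `/admin/api/` through the `genarrative_api` upstream to `127.0.0.1:8082`, with upstream keepalive set to 64; during load tests, inspect `request_time`, `upstream_connect_time`, `upstream_header_time`, `upstream_response_time`, `upstream_status`, and `request_id` in `/var/log/nginx/genarrative.access.log`.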
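 - The works-list K6 script requests two public endpoints per iteration by default, so the command targeting roughly 50 HTTP req/s (25 iterations/s at peak × 2 requests per iteration) uses `SCENARIO=spike START_RPS=5 PEAK_RPS=25 HOLD=60s END_RPS=5 DETAIL_RATIO=0 npm run loadtest:k6:works`.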
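
The concurrency backpressure described for `GENARRATIVE_API_MAX_CONCURRENT_REQUESTS` can be illustrated with a minimal, std-only Rust sketch. This is not the code from this commit, which is not shown here and may well be built on Tokio or middleware primitives instead. The names `ConcurrencyLimiter`, `Permit`, `Admission`, and `admit` are hypothetical. The sketch only shows the documented behaviour: a fixed number of in-flight permits, immediate rejection with `429` and `Retry-After: 1` when none are free, and `/healthz` bypassing the limit.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical limiter: counts in-flight requests against a fixed cap.
/// It bounds concurrency, not requests per second.
pub struct ConcurrencyLimiter {
    in_flight: AtomicUsize,
    max: usize,
}

/// RAII permit; dropping it frees the slot for another request.
pub struct Permit {
    limiter: Arc<ConcurrencyLimiter>,
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.limiter.in_flight.fetch_sub(1, Ordering::AcqRel);
    }
}

impl ConcurrencyLimiter {
    pub fn new(max: usize) -> Arc<Self> {
        Arc::new(Self { in_flight: AtomicUsize::new(0), max })
    }

    /// Non-blocking acquire: returns `None` immediately when the cap is reached.
    pub fn try_acquire(self: &Arc<Self>) -> Option<Permit> {
        let mut current = self.in_flight.load(Ordering::Acquire);
        loop {
            if current >= self.max {
                return None;
            }
            // Retry the increment if another thread changed the counter.
            match self.in_flight.compare_exchange_weak(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Some(Permit { limiter: Arc::clone(self) }),
                Err(observed) => current = observed,
            }
        }
    }

    pub fn in_flight(&self) -> usize {
        self.in_flight.load(Ordering::Acquire)
    }
}

/// Outcome of the admission check for one request.
pub enum Admission {
    /// Handle the request; the permit (if any) is held until the response is done.
    Proceed(Option<Permit>),
    /// Reject right away with the given status and `Retry-After` seconds.
    Reject { status: u16, retry_after_secs: u64 },
}

/// `/healthz` is exempt; every other path needs a permit.
pub fn admit(limiter: &Arc<ConcurrencyLimiter>, path: &str) -> Admission {
    if path == "/healthz" {
        return Admission::Proceed(None);
    }
    match limiter.try_acquire() {
        Some(permit) => Admission::Proceed(Some(permit)),
        None => Admission::Reject { status: 429, retry_after_secs: 1 },
    }
}
```

In this sketch a handler would call `admit` before doing any work and keep the returned `Permit` alive until the response has been written. Because the permit is released on drop, slow handlers hold slots longer, which is why rising 429s under load indicate saturation of in-flight capacity rather than a fixed request rate.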