feat(deploy): prepare offline provision tools and container loadtest

kdletters
2026-05-18 16:58:48 +08:00
parent 4f6c97ae92
commit 3eb292b403
12 changed files with 443 additions and 61 deletions

View File

@@ -19,12 +19,22 @@
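## 2026-05-17 Containerized setup serves only as an isolated load-test and pre-release simulation path
- Background: running extremely high-VU load tests directly from a local Windows machine amplifies local connection and send-buffer behavior and does not match the production Linux + Nginx + systemd topology. A simulation closer to the production network layer is needed, but it must not disturb the current production release pipeline.
- Decision: add a containerized setup under `deploy/container/` that uses Docker Compose to combine the Linux release `api-server`, a containerized Nginx, the `otelcol-contrib` debug exporter, and optional k6. It is used only for local or pre-release load-test simulation and does not replace the current production `systemd + Nginx + Jenkins` path.
- Decision: add a containerized setup under `deploy/container/` that uses Docker Compose to combine the Linux release `api-server`, a containerized SpacetimeDB, a containerized Nginx, the `otelcol-contrib` debug exporter, and optional k6. It is used only for local or pre-release load-test simulation and does not replace the current production `systemd + Nginx + Jenkins` path.
- Server simulation parameters: sampled on 2026-05-18 via `ssh genarrative-release`; the target machine has 2 vCPU / about 2 GiB RAM / Ubuntu 24.04 / Nginx `worker_connections=768`. The container setup uses `nofile=4096` to match the pending-release runtime profile, and the compose file limits `spacetimedb cpus=1.0 mem_limit=768m`, `api-server cpus=2.0 mem_limit=1g`, `nginx cpus=0.25 mem_limit=128m`, `otelcol cpus=0.25 mem_limit=128m`, and `k6 cpus=0.5 mem_limit=512m`. The Collector image defaults to `otel/opentelemetry-collector-contrib:0.151.0`.
- Isolation boundary: the container setup uses its own `deploy/container/api-server.env`, its own Nginx config, its own compose commands, and the default port `18080`. Real tokens never enter the image and are never committed to Git. The production systemd units, Jenkins release scripts, and `deploy/nginx/` templates remain the authoritative production sources.
- Production Collector: server-provision can install `otelcol-contrib.service` and a local debug exporter config, but the binary is first prepared on the Jenkins build machine as `provision-tools/otelcol-contrib` and then uploaded to the release deploy agent; the target machine does not download from GitHub. Whether api-server sends OTLP is still controlled by `GENARRATIVE_OTEL_ENABLED`.
- Scope of impact: `deploy/container/`, `scripts/container-compose.mjs`, the container commands in `package.json`, the dev/ops documentation, and the container build-context exclusion rules.
- Verification: run `npm run container:config` to expand the compose config; when a real run is needed, run `npm run container:build`, `npm run container:up`, and `npm run container:k6`, then use the container Nginx log and the OTLP debug exporter to locate the bottleneck.
- Related docs: `deploy/container/README.md`, `docs/【开发运维】本地开发验证与生产运维-2026-05-15.md`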
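## 2026-05-18 Production provision now has the build machine prepare a tool bundle that is uploaded and installed
- Background: the target release server cannot reach GitHub. The previous server provision still assumed by default that `spacetime` and `otelcol-contrib` already existed at local paths on the target machine, which does not match real operating conditions.
- Decision: Jenkins adds a `Prepare Provision Tools` stage that runs `scripts/prepare-server-provision-tools.sh` on the `linux && genarrative-build` build machine, generates `provision-tools/` from the official SpacetimeDB installer entry point and the OpenTelemetry release package, and carries it to the release deploy agent with `stash/unstash`. `scripts/jenkins-server-provision.sh` only copies and installs from the workspace tool bundle and no longer requires the target machine to download or preinstall binaries.
- Scope of impact: `jenkins/Jenkinsfile.production-server-provision`, `scripts/prepare-server-provision-tools.sh`, `scripts/jenkins-server-provision.sh`, and the production operations documentation.
- Verification: the Jenkins build machine can complete tool bundle preparation; the release deploy agent only consumes workspace files; the target machine no longer depends on downloads from GitHub over the public network.
- Related docs: `docs/【开发运维】本地开发验证与生产运维-2026-05-15.md`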
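The bundle hand-off described above can be sanity-checked before the install script runs. The following is a minimal sketch, not part of this commit: `check_provision_tools` is a hypothetical helper name, and the four relative paths are the ones the Jenkinsfile makes executable after `unstash`. It checks only that the files exist, on the assumption that the executable bit is re-applied afterwards.

```shell
# Hypothetical helper (not in the repository): confirm that a provision-tools
# bundle contains the files the provision stage expects after unstash.
check_provision_tools() {
  local dir="${1:-provision-tools}"
  local missing=0
  local rel
  for rel in \
    "otelcol-contrib" \
    "spacetime/spacetime" \
    "spacetime/bin/current/spacetimedb-cli" \
    "spacetime/bin/current/spacetimedb-standalone"; do
    # Existence only: the pipeline runs chmod +x on these paths after unstash.
    if [ ! -f "${dir}/${rel}" ]; then
      echo "missing: ${dir}/${rel}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

Calling `check_provision_tools provision-tools` returns non-zero and lists every missing path, so a broken bundle is reported in full rather than one file at a time.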
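## 2026-05-16 Public work list is cached short-term by the BFF through a subscribed read model
- Background: during work list load testing and the discussion about real-time behavior, we considered letting the browser frontend subscribe directly to the public work list to reduce HTTP polling and BFF load.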
@@ -35,8 +45,6 @@
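- Verification: when adding public work list subscription capability, check that the frontend consumes only the dedicated public read model or the BFF HTTP DTO, and that source-table row shapes, permission checks, and cross-gameplay aggregation have not leaked down into frontend pages.
- Related docs: `docs/【后端架构】server-rs与SpacetimeDB数据契约-2026-05-15.md`, `docs/【开发运维】本地开发验证与生产运维-2026-05-15.md`
## 2026-05-16 api-server OpenTelemetry covers traces, metrics, and logs together
- Background: load testing and runtime observability need to tie HTTP, SpacetimeDB calls, and application logs together while keeping local `journalctl` / file logs for troubleshooting.
- Decision: `api-server` sends traces, metrics, and logs through an OTLP HTTP base endpoint. The Collector is always `otelcol-contrib`: `npm run otel:debug` handles debug collection and `npm run otel:rider` forwards to Rider. Rider is only a receiving and visualization endpoint and does not replace the Collector.
- Logging convention: the Rider Logs panel shows only the fields of the log event itself, so request-completion logs must directly carry `request_id`, HTTP method, normalized route, scheme, path, status, status_class, latency, and slow_request. Fuller request attributes remain authoritative in the trace/span.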

View File

@@ -6,20 +6,27 @@
```text
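Docker Compose
├─ spacetimedb :3101, dedicated data volume, serves api-server connections
├─ nginx :80 -> api-server:8082, handles the static site, /admin/, /api/ reverse proxy, upstream timing log, and connection limits
├─ api-server :8082, Linux release build, connects to an external SpacetimeDB
├─ api-server :8082, Linux release build, connects to the SpacetimeDB inside compose
├─ otelcol :4317/4318, debug exporter, receives traces / metrics / logs
└─ k6 started temporarily when profile=loadtest, load-tests nginx from inside the compose network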
```
The current container simulation parameters are pinned to values sampled from the `genarrative-release` server: 2 vCPU / 2 GiB RAM / 4096 soft nofile / 768 worker_connections. Compose enforces them as `spacetimedb cpus=1.0 mem_limit=768m`, `api-server cpus=2.0 mem_limit=1g`, `nginx cpus=0.25 mem_limit=128m`, `otelcol cpus=0.25 mem_limit=128m`, and `k6 cpus=0.5 mem_limit=512m`.
The Collector image is `otel/opentelemetry-collector-contrib:0.151.0`.
If the production server enables the Collector, it is managed by `deploy/systemd/otelcol-contrib.service` and `deploy/otelcol/genarrative-debug.yaml`, not by the container image.
Default host ports:
- `http://127.0.0.1:13101`: container SpacetimeDB.
- `http://127.0.0.1:18080`: container Nginx.
- `127.0.0.1:4317` / `127.0.0.1:4318`: container Collector OTLP gRPC / HTTP.
If a port conflicts, set:
```powershell
$env:GENARRATIVE_CONTAINER_SPACETIME_PORT="13102"
$env:GENARRATIVE_CONTAINER_HTTP_PORT="18081"
$env:GENARRATIVE_CONTAINER_OTLP_HTTP_PORT="14318"
$env:GENARRATIVE_CONTAINER_OTLP_GRPC_PORT="14317"
@@ -33,21 +40,25 @@ npm run container:init
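This command generates a local `deploy/container/api-server.env` from `deploy/container/api-server.env.example`. Real tokens, database names, and external service secrets go only in the local env file and are never committed to Git.
Under Docker Desktop, the default is to connect through `host.docker.internal:3101` to the SpacetimeDB started by `npm run dev` on the host:
Under Docker Desktop, the default is to connect through `http://spacetimedb:3101` to the SpacetimeDB inside compose; the host is only responsible for publishing the module with the CLI: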
```env
GENARRATIVE_SPACETIME_SERVER_URL=http://host.docker.internal:3101
GENARRATIVE_SPACETIME_SERVER_URL=http://spacetimedb:3101
GENARRATIVE_SPACETIME_DATABASE=genarrative-loadtest
GENARRATIVE_SPACETIME_TOKEN=
```
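On Linux Docker Engine, if `host.docker.internal` does not resolve, Compose already configures `host-gateway`; if it still fails, change `GENARRATIVE_SPACETIME_SERVER_URL` to the host gateway IP or to a SpacetimeDB address on the same network.
When publishing the module from the host, first use the CLI to publish to `genarrative-loadtest` at `http://127.0.0.1:13101`, then run `npm run container:up`.
On Linux Docker Engine, to reach the in-container service from the host CLI, use `http://127.0.0.1:13101` directly; services inside the compose network always talk to each other via `http://spacetimedb:3101`.
## Startup and verification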
```bash
npm run container:config
npm run container:build
npm run container:up -- spacetimedb
spacetime publish genarrative-loadtest --server http://127.0.0.1:13101 --module-path server-rs/crates/spacetime-module --yes --build-options="--debug"
npm run container:up
npm run container:ps
curl -sS http://127.0.0.1:18080/api/runtime/puzzle/gallery
@@ -103,6 +114,17 @@ $env:DETAIL_RATIO="0"
npm run container:k6
```
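The container `api-server` resource limits and the Nginx connection model are already pinned to `genarrative-release`'s 2C / 2G / `nofile=4096` / `worker_connections=768`; to target a different machine, re-sample it first and then change these values.
The SpacetimeDB container provides only the runtime by default and does not publish the module automatically. On first start, or after clearing the `spacetime-data` volume, start only the `spacetimedb` service first, then publish the module: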
```bash
npm run container:up -- spacetimedb
spacetime publish genarrative-loadtest --server http://127.0.0.1:13101 --module-path server-rs/crates/spacetime-module --yes --build-options="--debug"
```
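After publishing, run `npm run container:up` and `npm run container:k6`. If `GENARRATIVE_SPACETIME_DATABASE` in `deploy/container/api-server.env` has been changed to another database name, update the database name in the publish command to match.
To load-test at 1000 HTTP req/s, set `PEAK_RPS` to `500`; to load-test at 5000 HTTP req/s, set `PEAK_RPS` to `2500` and raise `PREALLOCATED_VUS` / `MAX_VUS` at the same time, then watch whether bandwidth, Nginx `limit_conn`, or api-server backpressure becomes the limit first.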
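The 1000 -> 500 and 5000 -> 2500 mappings imply that each k6 iteration issues two HTTP requests. A small sketch of that arithmetic follows; `peak_rps_for_target` is a hypothetical helper name, and the default of two requests per iteration is an assumption inferred from the numbers above rather than read from the k6 script.

```shell
# Hypothetical helper: derive PEAK_RPS (iterations per second) from a target
# HTTP request rate. The default of 2 requests per iteration is an assumption.
peak_rps_for_target() {
  local target_http_rps="$1"
  local requests_per_iteration="${2:-2}"
  # Ceiling division, so the resulting HTTP rate never undershoots the target.
  echo $(( (target_http_rps + requests_per_iteration - 1) / requests_per_iteration ))
}

peak_rps_for_target 1000   # prints 500
peak_rps_for_target 5000   # prints 2500
```

If the request mix changes so that an iteration issues a different number of requests, pass that count as the second argument instead of relying on the default.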
## OTLP

View File

@@ -7,7 +7,7 @@ GENARRATIVE_API_HOST=0.0.0.0
GENARRATIVE_API_PORT=8082
GENARRATIVE_API_LOG=info,tower_http=info
GENARRATIVE_API_LISTEN_BACKLOG=1024
GENARRATIVE_API_WORKER_THREADS=4
GENARRATIVE_API_WORKER_THREADS=2
GENARRATIVE_API_MAX_CONCURRENT_REQUESTS=512
GENARRATIVE_OTEL_ENABLED=false
@@ -21,9 +21,8 @@ GENARRATIVE_JWT_SECRET=CHANGE_ME_FOR_CONTAINER
AUTH_REFRESH_COOKIE_SECURE=false
GENARRATIVE_AUTH_STORE_PATH=/var/lib/genarrative/auth/auth-store.json
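# Under Docker Desktop, connect to the SpacetimeDB started by npm run dev on the host.
# On Linux Docker Engine, switch to the host gateway IP, or attach to a SpacetimeDB on the same network in compose.
GENARRATIVE_SPACETIME_SERVER_URL=http://host.docker.internal:3101
# Connects to the compose-internal SpacetimeDB by default; publish the module from the host via 127.0.0.1:13101.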
GENARRATIVE_SPACETIME_SERVER_URL=http://spacetimedb:3101
GENARRATIVE_SPACETIME_DATABASE=genarrative-loadtest
GENARRATIVE_SPACETIME_TOKEN=
GENARRATIVE_SPACETIME_POOL_SIZE=8

View File

@@ -1,11 +1,47 @@
name: genarrative-container-loadtest
services:
  spacetimedb:
    image: clockworklabs/spacetime:v2.2.0
    command:
      [
        "start",
        "--listen-addr",
        "0.0.0.0:3101",
        "--data-dir",
        "/var/lib/spacetimedb",
        "--page_pool_max_size",
        "536870912",
        "--non-interactive",
      ]
    cpus: "1.0"
    mem_limit: 768m
    ports:
      - "${GENARRATIVE_CONTAINER_SPACETIME_PORT:-13101}:3101"
    volumes:
      - spacetime-data:/var/lib/spacetimedb
    ulimits:
      nofile:
        soft: 4096
        hard: 4096
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "spacetime server ping http://127.0.0.1:3101 >/dev/null 2>&1",
        ]
      interval: 10s
      timeout: 5s
      retries: 12
      start_period: 20s
  api-server:
    build:
      context: ../..
      dockerfile: deploy/container/api-server.Dockerfile
      target: api-runtime
    cpus: "2.0"
    mem_limit: 1g
    env_file:
      - ./api-server.env
    environment:
@@ -16,7 +52,13 @@ services:
      - "host.docker.internal:host-gateway"
    volumes:
      - api-auth-store:/var/lib/genarrative/auth
    ulimits:
      nofile:
        soft: 4096
        hard: 4096
    depends_on:
      spacetimedb:
        condition: service_healthy
      otelcol:
        condition: service_started
    healthcheck:
@@ -31,15 +73,23 @@ services:
      context: ../..
      dockerfile: deploy/container/api-server.Dockerfile
      target: nginx-runtime
    cpus: "0.25"
    mem_limit: 128m
    depends_on:
      api-server:
        condition: service_healthy
      spacetimedb:
        condition: service_healthy
    ports:
      - "${GENARRATIVE_CONTAINER_HTTP_PORT:-18080}:80"
    extra_hosts:
      - "host.docker.internal:host-gateway"
    volumes:
      - nginx-logs:/var/log/nginx
    ulimits:
      nofile:
        soft: 4096
        hard: 4096
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://127.0.0.1/api/runtime/puzzle/gallery"]
      interval: 10s
@@ -48,8 +98,10 @@ services:
      start_period: 20s
  otelcol:
    image: otel/opentelemetry-collector-contrib:0.125.0
    image: otel/opentelemetry-collector-contrib:0.151.0
    command: ["--config=/etc/otelcol/config.yaml"]
    cpus: "0.25"
    mem_limit: 128m
    volumes:
      - ./otelcol.yaml:/etc/otelcol/config.yaml:ro
    ports:
@@ -59,6 +111,8 @@ services:
  k6:
    image: grafana/k6:0.52.0
    profiles: ["loadtest"]
    cpus: "0.5"
    mem_limit: 512m
    depends_on:
      nginx:
        condition: service_healthy
@@ -81,5 +135,6 @@ services:
    command: ["run", "k6-works-list.js"]
volumes:
  spacetime-data:
  api-auth-store:
  nginx-logs:

View File

@@ -1,7 +1,7 @@
worker_processes auto;
events {
worker_connections 4096;
worker_connections 768;
}
http {
@@ -106,7 +106,7 @@ http {
}
location ~ ^/v1/database/[^/]+/subscribe$ {
proxy_pass http://host.docker.internal:3101;
proxy_pass http://spacetimedb:3101;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "Upgrade";
@@ -115,7 +115,7 @@ http {
}
location ^~ /v1/identity {
proxy_pass http://host.docker.internal:3101;
proxy_pass http://spacetimedb:3101;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "Upgrade";

View File

@@ -0,0 +1,23 @@
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 127.0.0.1:4317
      http:
        endpoint: 127.0.0.1:4318

exporters:
  debug:
    verbosity: normal

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
    metrics:
      receivers: [otlp]
      exporters: [debug]
    logs:
      receivers: [otlp]
      exporters: [debug]

View File

@@ -0,0 +1,22 @@
[Unit]
Description=Genarrative OpenTelemetry Collector Contrib
After=network-online.target
Wants=network-online.target
[Service]
Type=simple
User=otelcol
Group=otelcol
WorkingDirectory=/etc/otelcol
ExecStart=/usr/local/bin/otelcol-contrib --config=/etc/otelcol/genarrative-debug.yaml
Restart=always
RestartSec=5
LimitNOFILE=65535
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=full
ReadWritePaths=/etc/otelcol /var/log/genarrative
[Install]
WantedBy=multi-user.target

View File

@@ -156,12 +156,14 @@ Jenkins 按 web / api / Spacetime module / build / deploy / publish 拆分
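- The `api-server` production template defaults to `GENARRATIVE_API_LISTEN_BACKLOG=1024` and `GENARRATIVE_API_WORKER_THREADS=4`; when worker threads are not set locally, the Tokio default is still used.
- `GENARRATIVE_API_MAX_CONCURRENT_REQUESTS=512` enables in-application HTTP concurrency backpressure: requests beyond the concurrency permits immediately receive `429 Too Many Requests` with `Retry-After: 1`, and `/healthz` is exempt. This value is not an RPS rate limit. If 429s rise during a load test while memory and p95 converge, backpressure is protecting the process; tune the threshold against real capacity or add rate limiting in front at Nginx. If an extremely high-RPS load test that hits `api-server` directly produces `connection refused`, it has usually reached the TCP listen / accept layer, so check the backlog, Nginx upstream keepalive, and front-side rate limiting together.
- `genarrative-api.service` sets `LimitNOFILE=65535` and `TasksMax=2048`; after going live, verify with `systemctl show genarrative-api.service -p LimitNOFILE -p TasksMax` and `cat /proc/$(pidof api-server)/limits`.
- Server provision does not download SpacetimeDB or `otelcol-contrib` on the target machine. Jenkins' `Prepare Provision Tools` stage runs `scripts/prepare-server-provision-tools.sh` on the `linux && genarrative-build` build machine, generates `provision-tools/` from the official SpacetimeDB installer entry point `https://install.spacetimedb.com` and the OpenTelemetry release package, and uploads it to the release deploy agent via `stash/unstash`. On the target machine, `scripts/jenkins-server-provision.sh` installs `/stdb/spacetime`, `/stdb/bin/current/*`, and `/usr/local/bin/otelcol-contrib` only from that workspace tool bundle.
- `otelcol-contrib.service` joins provision as an optional system service; by default it listens on `127.0.0.1:4317/4318` and uses `deploy/otelcol/genarrative-debug.yaml`. Whether api-server sends OTLP is still controlled by `GENARRATIVE_OTEL_ENABLED`; the service unit is `deploy/systemd/otelcol-contrib.service`.
- Nginx proxies `/api/` and `/admin/api/` through the `genarrative_api` upstream to `127.0.0.1:8082` with an upstream keepalive of 64. During load tests, watch `request_time`, `upstream_connect_time`, `upstream_header_time`, `upstream_response_time`, `upstream_status`, and `request_id` in `/var/log/nginx/genarrative.access.log`.
- The work list k6 script requests two public endpoints per iteration by default, so the command for a target of about 50 HTTP req/s is `SCENARIO=spike START_RPS=5 PEAK_RPS=25 HOLD=60s END_RPS=5 DETAIL_RATIO=0 npm run loadtest:k6:works`.
- In the short term the work list continues to be served by `api-server` / BFF, which subscribes to the SpacetimeDB public read model and reads its local cache; the browser frontend does not subscribe to the full list directly. If a dedicated public work list read model such as `public_work_gallery_entry` is added later, the frontend may subscribe only to stable, low-cardinality, public dedicated projections. Subscribing to gameplay source tables such as `puzzle_work_profile` or `custom_world_profile` and then joining, aggregating, or deciding permissions in the frontend is forbidden. Before direct frontend subscription ships, permissions, field contracts, sorting / pagination, instrumentation, and the BFF fallback strategy must all be in place.
- The acceptance target at 50 HTTP req/s is `http_req_failed < 1%`, `p95 < 2s`, and `dropped_iterations = 0`, with no new Nginx 502s during the load-test window.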
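The acceptance thresholds in the last bullet can be turned into a pass/fail gate. This is a minimal sketch under stated assumptions: `check_loadtest_acceptance` is a hypothetical helper, and the four inputs (failure percentage, p95 in milliseconds, dropped iterations, count of new Nginx 502s) are assumed to have been extracted from the k6 summary and the Nginx access log by some other means.

```shell
# Hypothetical helper: apply the 50 HTTP req/s acceptance thresholds.
# Arguments: http_req_failed (percent), p95 (ms), dropped_iterations, new 502 count.
check_loadtest_acceptance() {
  local failed_pct="$1"
  local p95_ms="$2"
  local dropped="$3"
  local new_502="$4"
  # awk handles the fractional comparison; exit status 0 means all thresholds hold.
  awk -v f="${failed_pct}" -v p="${p95_ms}" -v d="${dropped}" -v n="${new_502}" \
    'BEGIN { exit !(f < 1 && p < 2000 && d == 0 && n == 0) }'
}
```

For example, `check_loadtest_acceptance 0.4 1500 0 0 && echo pass` prints `pass`, while any single violated threshold makes the function return non-zero.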
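The containerized load-test and isolated deployment setup lives separately under `deploy/container/`. It simulates the Linux release + Nginx + OTLP Collector topology locally or in pre-release and does not replace the current production `systemd + Nginx + Jenkins` release path:
The containerized load-test and isolated deployment setup lives separately under `deploy/container/`. It simulates the Linux release + Nginx + OTLP Collector topology locally or in pre-release and does not replace the current production `systemd + Nginx + Jenkins` release path. The current container simulation parameters are pinned to values sampled from `genarrative-release`: 2 vCPU / 2 GiB RAM / `nofile=4096` / `worker_connections=768`, enforced in compose as `spacetimedb cpus=1.0 mem_limit=768m`, `api-server cpus=2.0 mem_limit=1g`, `nginx cpus=0.25 mem_limit=128m`, `otelcol cpus=0.25 mem_limit=128m`, and `k6 cpus=0.5 mem_limit=512m`.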
```bash
npm run container:init
@@ -172,7 +174,7 @@ npm run container:k6
npm run container:down
```
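The container setup exposes `http://127.0.0.1:18080` by default. `api-server` listens on `0.0.0.0:8082` inside its container, and Nginx reverse-proxies `/api/` and `/admin/api/` through the `api-server:8082` upstream. SpacetimeDB still connects to the host at `http://host.docker.internal:3101` by default. Real database names, tokens, and external service secrets go only in the local `deploy/container/api-server.env` and are not committed to Git. For the full topology, ports, k6 parameters, and OTLP debug exporter usage, see `deploy/container/README.md`.
The container setup exposes `http://127.0.0.1:18080` by default. `api-server` listens on `0.0.0.0:8082` inside its container, and Nginx reverse-proxies `/api/` and `/admin/api/` through the `api-server:8082` upstream. SpacetimeDB is also part of compose: inside the containers it is served at `spacetimedb:3101`, and the host publishes the module through `http://127.0.0.1:13101`. The Collector image is `otel/opentelemetry-collector-contrib:0.151.0`. On the production provision side, the local `otelcol-contrib.service` is installed from the `provision-tools/otelcol-contrib` prepared on the Jenkins build machine. Real database names, tokens, and external service secrets go only in the local `deploy/container/api-server.env` and are not committed to Git. For the full topology, ports, k6 parameters, and OTLP debug exporter usage, see `deploy/container/README.md`.
`npm run container:config` performs only a quiet validation by default so that tokens in the local env are not expanded to the terminal; pass `-- --print` only when the full compose output is really needed for troubleshooting.
OpenTelemetry can currently send OTLP traces / metrics / logs as an option, but local logs and Nginx file logs are still kept: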

View File

@@ -22,7 +22,8 @@ pipeline {
string(name: 'COMMIT_HASH', defaultValue: '', description: 'Source commit for the deploy scripts')
string(name: 'SERVER_NAME', defaultValue: 'genarrative.example.com', description: 'Primary certificate domain; also the first Nginx server_name')
string(name: 'SERVER_ALIASES', defaultValue: '', description: 'Optional extra Nginx server_name entries; separate multiple values with spaces or commas, e.g. www.genarrative.world')
string(name: 'SPACETIME_BIN_SOURCE', defaultValue: '/usr/local/bin/spacetime', description: 'Path to an existing spacetime CLI on the server')
string(name: 'PROVISION_TOOLS_DIR', defaultValue: 'provision-tools', description: 'Tool bundle directory prepared on the build machine and uploaded to the target workspace')
string(name: 'SPACETIME_DOWNLOAD_ROOT', defaultValue: 'https://github.com/clockworklabs/SpacetimeDB/releases/latest/download', description: 'Root URL from which the build machine downloads the official SpacetimeDB install artifacts; the target machine never accesses it')
string(name: 'SPACETIME_ROOT', defaultValue: '/stdb', description: 'SpacetimeDB root-dir')
string(name: 'RELEASE_ROOT', defaultValue: '/opt/genarrative/releases', description: 'Release root directory')
string(name: 'CURRENT_LINK', defaultValue: '/opt/genarrative/current', description: 'Symlink to the current release')
@@ -31,6 +32,8 @@ pipeline {
string(name: 'API_PORT', defaultValue: '8082', description: 'Local listen port for api-server')
choice(name: 'NGINX_CONFIG_MODE', choices: ['none', 'production-https', 'development-http'], description: 'Nginx config mode; choose development-http for a dev server without a domain, production-https for the official release entry point')
booleanParam(name: 'ENABLE_SERVICES', defaultValue: true, description: 'Enable and start the spacetimedb and api-server systemd services')
booleanParam(name: 'ENABLE_OTELCOL', defaultValue: true, description: 'Install and enable the local OpenTelemetry Collector; whether api-server sends OTLP is still controlled by environment variables')
string(name: 'OTELCOL_VERSION', defaultValue: '0.151.0', description: 'otelcol-contrib version')
}
stages {
@@ -60,8 +63,17 @@ pipeline {
}
}
}
if (!params.SPACETIME_BIN_SOURCE?.trim()) {
error('SPACETIME_BIN_SOURCE must not be empty.')
if (!params.PROVISION_TOOLS_DIR?.trim()) {
error('PROVISION_TOOLS_DIR must not be empty.')
}
// The forward slash is escaped because an unescaped "/" would end the Groovy slashy string.
if (!(params.PROVISION_TOOLS_DIR.trim() ==~ /^[0-9A-Za-z._\/-]+$/) || params.PROVISION_TOOLS_DIR.startsWith('/') || params.PROVISION_TOOLS_DIR.contains('..')) {
error("PROVISION_TOOLS_DIR must be a relative directory inside the workspace and must not contain an absolute path or consecutive dots: ${params.PROVISION_TOOLS_DIR}")
}
if (!(params.OTELCOL_VERSION?.trim() ==~ /^[0-9]+\.[0-9]+\.[0-9]+$/)) {
error("OTELCOL_VERSION must have the form x.y.z: ${params.OTELCOL_VERSION}")
}
if (!params.SPACETIME_DOWNLOAD_ROOT?.trim()) {
error('SPACETIME_DOWNLOAD_ROOT must not be empty.')
}
def nginxMode = params.NGINX_CONFIG_MODE?.trim()
if (!(nginxMode in ['none', 'production-https', 'development-http'])) {
@@ -77,6 +89,58 @@ pipeline {
}
}
stage('Prepare Provision Tools') {
agent {
label 'linux && genarrative-build'
}
steps {
script {
def checkoutFromRemote = { String remoteUrl ->
checkout([
$class: 'GitSCM',
branches: [[name: "*/${params.SOURCE_BRANCH}"]],
doGenerateSubmoduleConfigurations: false,
extensions: [
[$class: 'CleanBeforeCheckout'],
[$class: 'CloneOption', shallow: true, depth: 1, noTags: true, timeout: 30, honorRefspec: true],
],
userRemoteConfigs: [[url: remoteUrl, refspec: "+refs/heads/${params.SOURCE_BRANCH}:refs/remotes/origin/${params.SOURCE_BRANCH}"]],
])
}
try {
checkoutFromRemote(env.GIT_REMOTE_URL)
env.EFFECTIVE_GIT_REMOTE_URL = env.GIT_REMOTE_URL
} catch (error) {
echo "Primary Git remote fetch failed: ${env.GIT_REMOTE_URL}; falling back to: ${env.GIT_REMOTE_FALLBACK_URL}"
checkoutFromRemote(env.GIT_REMOTE_FALLBACK_URL)
env.EFFECTIVE_GIT_REMOTE_URL = env.GIT_REMOTE_FALLBACK_URL
}
}
sh '''
bash <<'BASH'
set -euo pipefail
chmod +x scripts/jenkins-checkout-source.sh scripts/prepare-server-provision-tools.sh
SOURCE_BRANCH="${SOURCE_BRANCH:-master}" \
COMMIT_HASH="${COMMIT_HASH:-}" \
GIT_REMOTE_URL="${EFFECTIVE_GIT_REMOTE_URL:-${GIT_REMOTE_URL}}" \
GIT_REMOTE_FALLBACK_URL="${GIT_REMOTE_FALLBACK_URL:-}" \
SOURCE_COMMIT_FILE=".jenkins-source-commit" \
scripts/jenkins-checkout-source.sh
PROVISION_TOOLS_DIR="${PROVISION_TOOLS_DIR:-provision-tools}" \
OTELCOL_VERSION="${OTELCOL_VERSION:-0.151.0}" \
SPACETIME_DOWNLOAD_ROOT="${SPACETIME_DOWNLOAD_ROOT:-https://github.com/clockworklabs/SpacetimeDB/releases/latest/download}" \
scripts/prepare-server-provision-tools.sh
BASH
'''
script {
env.SOURCE_COMMIT = readFile('.jenkins-source-commit').trim()
echo "Provision tool bundle prepared, source commit=${env.SOURCE_COMMIT}"
}
stash name: 'server-provision-tools', includes: "${params.PROVISION_TOOLS_DIR}/**", useDefaultExcludes: false
}
}
stage('Checkout Provision Files') {
agent {
label "${params.DEPLOY_TARGET == 'development' ? 'linux && genarrative-build' : 'linux && genarrative-release-deploy'}"
@@ -109,7 +173,7 @@ pipeline {
set -euo pipefail
chmod +x scripts/jenkins-checkout-source.sh
SOURCE_BRANCH="${SOURCE_BRANCH:-master}" \
COMMIT_HASH="${COMMIT_HASH:-}" \
COMMIT_HASH="${COMMIT_HASH:-${SOURCE_COMMIT:-}}" \
GIT_REMOTE_URL="${EFFECTIVE_GIT_REMOTE_URL:-${GIT_REMOTE_URL}}" \
GIT_REMOTE_FALLBACK_URL="${GIT_REMOTE_FALLBACK_URL:-}" \
SOURCE_COMMIT_FILE=".jenkins-source-commit" \
@@ -124,10 +188,18 @@ BASH
label "${params.DEPLOY_TARGET == 'development' ? 'linux && genarrative-build' : 'linux && genarrative-release-deploy'}"
}
steps {
unstash 'server-provision-tools'
sh '''
bash <<'BASH'
set -euo pipefail
chmod +x "${PROVISION_TOOLS_DIR:-provision-tools}/otelcol-contrib" \
"${PROVISION_TOOLS_DIR:-provision-tools}/spacetime/spacetime" \
"${PROVISION_TOOLS_DIR:-provision-tools}/spacetime/bin/current/spacetimedb-cli" \
"${PROVISION_TOOLS_DIR:-provision-tools}/spacetime/bin/current/spacetimedb-standalone"
chmod +x scripts/jenkins-server-provision.sh
PROVISION_TOOLS_DIR="${PROVISION_TOOLS_DIR:-provision-tools}" \
SPACETIME_BIN_SOURCE="${PROVISION_TOOLS_DIR:-provision-tools}/spacetime/spacetime" \
OTELCOL_BIN_SOURCE="${PROVISION_TOOLS_DIR:-provision-tools}/otelcol-contrib" \
scripts/jenkins-server-provision.sh
BASH
'''

View File

@@ -89,7 +89,7 @@ function printHelp(isError) {
Commands:
container:init Generate deploy/container/api-server.env
container:build Build the api-server container image
container:up Start api-server + nginx + otelcol in the background
container:up Start spacetimedb + api-server + nginx + otelcol in the background
container:down Stop and clean up the containers
container:logs Show container logs
container:ps Show container status

View File

@@ -1,6 +1,24 @@
#!/usr/bin/env bash
set -euo pipefail
PROVISION_TOOLS_DIR="${PROVISION_TOOLS_DIR:-provision-tools}"
SPACETIME_BIN_SOURCE="${SPACETIME_BIN_SOURCE:-${PROVISION_TOOLS_DIR}/spacetime/spacetime}"
OTELCOL_BIN_SOURCE="${OTELCOL_BIN_SOURCE:-${PROVISION_TOOLS_DIR}/otelcol-contrib}"
require_non_root_relative_path() {
local label="$1"
local path="$2"
if [[ -z "${path}" ]]; then
echo "[server-provision] ${label} must not be empty." >&2
exit 1
fi
if [[ "${path}" == /* || "${path}" == *..* ]]; then
echo "[server-provision] ${label} must be a relative path inside the workspace: ${path}" >&2
exit 1
fi
}
require_path() {
local path="$1"
if [[ ! -e "${path}" ]]; then
@@ -81,16 +99,16 @@ install_sccache() {
fi
echo "[server-provision] sccache not found; installing it via cargo install sccache."
if ! command -v cargo >/dev/null 2>&1; then
echo "[server-provision] cargo not found, so sccache cannot be installed automatically. Install the Rust toolchain first, then rerun Server-Provision." >&2
exit 1
fi
if [[ "${DRY_RUN}" == "true" ]]; then
echo "+ cargo install sccache --locked"
return
fi
if ! command -v cargo >/dev/null 2>&1; then
echo "[server-provision] cargo not found, so sccache cannot be installed automatically. Install the Rust toolchain first, then rerun Server-Provision." >&2
exit 1
fi
cargo install sccache --locked
if ! command -v sccache >/dev/null 2>&1 && [[ ! -x /root/.cargo/bin/sccache ]]; then
echo "[server-provision] sccache is still unavailable after installation; check whether the cargo bin directory is on PATH." >&2
@@ -98,6 +116,42 @@ install_sccache() {
fi
}
sync_otelcol_install() {
local target_bin="/usr/local/bin/otelcol-contrib"
local source_bin="${OTELCOL_BIN_SOURCE}"
local version="${OTELCOL_VERSION:-0.151.0}"
local resolved_source="${source_bin}"
if [[ "${ENABLE_OTELCOL:-true}" != "true" ]]; then
echo "[server-provision] ENABLE_OTELCOL=${ENABLE_OTELCOL:-}; skipping otelcol-contrib setup."
return
fi
if command -v readlink >/dev/null 2>&1; then
resolved_source="$(readlink -f "${source_bin}" 2>/dev/null || echo "${source_bin}")"
fi
if [[ ! -x "${resolved_source}" ]]; then
echo "[server-provision] otelcol-contrib is missing or not executable: ${source_bin}" >&2
echo "[server-provision] Prepare otelcol-contrib ${version} on the build machine first, then upload it to the target machine via provision-tools." >&2
exit 1
fi
if [[ "${DRY_RUN}" == "true" ]]; then
echo "+ install -m 0755 ${resolved_source} ${target_bin}"
return
fi
install -m 0755 "${resolved_source}" "${target_bin}"
if ! "${target_bin}" --version >/dev/null 2>&1; then
echo "[server-provision] otelcol-contrib cannot be executed after installation: ${target_bin}" >&2
exit 1
fi
# -F matches the version as a fixed string, so the dots are not treated as regex wildcards.
if ! "${target_bin}" --version 2>/dev/null | grep -qF -- "${version}"; then
echo "[server-provision] Warning: otelcol-contrib is not the expected version ${version}: $("${target_bin}" --version 2>/dev/null || true)" >&2
fi
}
sync_spacetime_install() {
local root_dir="$1"
local target_bin_dir="${root_dir}/bin/current"
@@ -106,14 +160,6 @@ sync_spacetime_install() {
local resolved_command="${SPACETIME_BIN_SOURCE}"
local install_dir=""
local root_bin="${root_dir}/bin"
local share_bin_dir=""
local version_dir=""
local parent_dir=""
if [[ -x "${target_cli}" && -x "${target_standalone}" ]]; then
echo "[server-provision] SpacetimeDB current directory already exists: ${target_bin_dir}"
return
fi
echo "[server-provision] Syncing the SpacetimeDB current directory to ${target_bin_dir}"
if [[ "${DRY_RUN}" == "true" ]]; then
@@ -128,26 +174,10 @@ sync_spacetime_install() {
install_dir="$(cd -- "$(dirname -- "${resolved_command}")" && pwd)"
mkdir -p "${root_bin}"
for share_bin_dir in \
"/usr/.local/share/spacetime/bin" \
"/root/.local/share/spacetime/bin" \
"${HOME:-}/.local/share/spacetime/bin"; do
if [[ -d "${share_bin_dir}" ]]; then
version_dir="$(find "${share_bin_dir}" -mindepth 1 -maxdepth 1 -type d | sort -V | tail -n 1)"
if [[ -n "${version_dir}" && -x "${version_dir}/spacetimedb-cli" && -x "${version_dir}/spacetimedb-standalone" ]]; then
echo "[server-provision] Syncing SpacetimeDB install: ${version_dir} -> ${target_bin_dir}"
rm -rf "${target_bin_dir}"
mkdir -p "${target_bin_dir}"
cp -a "${version_dir}/." "${target_bin_dir}/"
chmod +x "${target_cli}" "${target_standalone}"
chown -R spacetimedb:spacetimedb "${root_bin}"
return
fi
fi
done
if [[ -d "${install_dir}/bin" ]]; then
echo "[server-provision] Syncing SpacetimeDB install: ${install_dir}/bin -> ${root_bin}"
rm -rf "${root_bin}"
mkdir -p "${root_bin}"
cp -a "${install_dir}/bin/." "${root_bin}/"
elif [[ -x "${install_dir}/spacetimedb-cli" && -x "${install_dir}/spacetimedb-standalone" ]]; then
echo "[server-provision] Syncing SpacetimeDB install: ${install_dir} -> ${target_bin_dir}"
@@ -156,14 +186,8 @@ sync_spacetime_install() {
cp -f "${install_dir}/spacetimedb-cli" "${target_cli}"
cp -f "${install_dir}/spacetimedb-standalone" "${target_standalone}"
chmod +x "${target_cli}" "${target_standalone}"
elif [[ -f "${resolved_command}" ]]; then
parent_dir="$(cd -- "${install_dir}/.." && pwd)"
if [[ -d "${parent_dir}/bin" && -x "${parent_dir}/bin/current/spacetimedb-cli" && -x "${parent_dir}/bin/current/spacetimedb-standalone" ]]; then
echo "[server-provision] Syncing SpacetimeDB install: ${parent_dir}/bin -> ${root_bin}"
cp -a "${parent_dir}/bin/." "${root_bin}/"
else
echo "[server-provision] Could not infer a complete SpacetimeDB install directory from the spacetime command path: ${resolved_command}" >&2
fi
else
echo "[server-provision] Could not infer a complete install directory from the SpacetimeDB delivery bundle: ${resolved_command}" >&2
fi
if [[ ! -x "${target_cli}" || ! -x "${target_standalone}" ]]; then
@@ -387,6 +411,10 @@ render_api_env_example() {
deploy/env/api-server.env.example
}
render_otelcol_service() {
cat deploy/systemd/otelcol-contrib.service
}
validate_nginx_tls() {
local cert_dir="/etc/letsencrypt/live/${SERVER_NAME}"
if [[ "${SERVER_NAME}" == "genarrative.example.com" ]]; then
@@ -523,6 +551,8 @@ render_api_service() {
require_path deploy/systemd/spacetimedb.service
require_path deploy/systemd/genarrative-api.service
require_path deploy/systemd/otelcol-contrib.service
require_path deploy/otelcol/genarrative-debug.yaml
require_path deploy/nginx/genarrative.conf
require_path deploy/nginx/genarrative-dev-http.conf
require_path deploy/nginx/snippets/genarrative-maintenance.conf
@@ -532,6 +562,7 @@ require_path scripts/deploy/maintenance-off.sh
require_path scripts/deploy/maintenance-status.sh
validate_server_names
require_non_root_relative_path "PROVISION_TOOLS_DIR" "${PROVISION_TOOLS_DIR}"
echo "[server-provision] target=${DEPLOY_TARGET}, dry_run=${DRY_RUN}, nginx_config_mode=${NGINX_CONFIG_MODE}, source_commit=$(cat .jenkins-source-commit)"
@@ -585,6 +616,16 @@ else
echo "[server-provision] Environment file already exists; keeping it without overwriting: ${API_ENV_FILE}"
fi
if [[ "${ENABLE_OTELCOL:-true}" == "true" ]]; then
sync_otelcol_install
otelcol_service="$(mktemp)"
render_otelcol_service >"${otelcol_service}"
install_file "${otelcol_service}" /etc/systemd/system/otelcol-contrib.service 0644
rm -f "${otelcol_service}"
else
echo "[server-provision] ENABLE_OTELCOL=${ENABLE_OTELCOL:-}; skipping otelcol-contrib service installation."
fi
if [[ "${NGINX_CONFIG_MODE}" != "none" ]]; then
install_nginx_config_with_rollback
else
@@ -593,7 +634,13 @@ fi
run_cmd systemctl daemon-reload
if [[ "${ENABLE_SERVICES}" == "true" ]]; then
if [[ "${ENABLE_OTELCOL:-true}" == "true" ]]; then
run_cmd systemctl enable otelcol-contrib.service
fi
run_cmd systemctl enable spacetimedb.service genarrative-api.service
if [[ "${ENABLE_OTELCOL:-true}" == "true" ]]; then
run_cmd systemctl restart otelcol-contrib.service
fi
run_cmd systemctl restart spacetimedb.service
wait_for_spacetimedb_service
ensure_spacetime_owner_client_token

View File

@@ -0,0 +1,132 @@
#!/usr/bin/env bash
set -euo pipefail
PROVISION_TOOLS_DIR="${PROVISION_TOOLS_DIR:-provision-tools}"
OTELCOL_VERSION="${OTELCOL_VERSION:-0.151.0}"
OTELCOL_DOWNLOAD_ROOT="${OTELCOL_DOWNLOAD_ROOT:-https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download}"
SPACETIME_INSTALLER_URL="${SPACETIME_INSTALLER_URL:-https://install.spacetimedb.com}"
SPACETIME_DOWNLOAD_ROOT="${SPACETIME_DOWNLOAD_ROOT:-https://github.com/clockworklabs/SpacetimeDB/releases/latest/download}"
PROVISION_TOOLS_TMP_PARENT="${PROVISION_TOOLS_TMP_PARENT:-${WORKSPACE:-$(pwd)}/.tmp/server-provision-tools}"
TMP_DIR_TO_CLEAN=""
cleanup_tmp_dir() {
if [[ -n "${TMP_DIR_TO_CLEAN}" ]]; then
rm -rf "${TMP_DIR_TO_CLEAN}"
fi
}
require_cmd() {
local name="$1"
if ! command -v "${name}" >/dev/null 2>&1; then
echo "[prepare-provision-tools] Missing command: ${name}" >&2
exit 1
fi
}
download_file() {
local url="$1"
local output="$2"
if command -v curl >/dev/null 2>&1; then
curl -fsSL --retry 3 --retry-delay 2 "${url}" -o "${output}"
elif command -v wget >/dev/null 2>&1; then
wget -O "${output}" "${url}"
else
echo "[prepare-provision-tools] curl or wget is required to download: ${url}" >&2
exit 1
fi
}
make_spacetime_wrapper() {
local target="$1"
cat >"${target}" <<'EOF'
#!/usr/bin/env sh
set -eu
SELF_DIR=$(CDPATH= cd -- "$(dirname -- "$0")" && pwd)
exec "$SELF_DIR/bin/current/spacetimedb-cli" "$@"
EOF
chmod 0755 "${target}"
}
prepare_otelcol() {
local tmp_dir="$1"
local archive="${tmp_dir}/otelcol-contrib.tar.gz"
local extract_dir="${tmp_dir}/otelcol-contrib"
local url="${OTELCOL_DOWNLOAD_ROOT}/v${OTELCOL_VERSION}/otelcol-contrib_${OTELCOL_VERSION}_linux_amd64.tar.gz"
local target="${PROVISION_TOOLS_DIR}/otelcol-contrib"
require_cmd tar
echo "[prepare-provision-tools] Downloading otelcol-contrib: ${url}"
mkdir -p "${extract_dir}"
download_file "${url}" "${archive}"
tar -xzf "${archive}" -C "${extract_dir}"
if [[ ! -x "${extract_dir}/otelcol-contrib" ]]; then
echo "[prepare-provision-tools] The otelcol-contrib package does not contain the executable." >&2
exit 1
fi
install -m 0755 "${extract_dir}/otelcol-contrib" "${target}"
"${target}" --version >/dev/null
}
prepare_spacetime() {
local tmp_dir="$1"
local install_root="${tmp_dir}/spacetime-root"
local target_dir="${PROVISION_TOOLS_DIR}/spacetime"
echo "[prepare-provision-tools] Preparing SpacetimeDB with the official installer: ${SPACETIME_INSTALLER_URL}"
mkdir -p "${install_root}"
download_file "${SPACETIME_INSTALLER_URL}" "${tmp_dir}/spacetime-install.sh"
chmod 0755 "${tmp_dir}/spacetime-install.sh"
TMPDIR="${tmp_dir}" SPACETIME_DOWNLOAD_ROOT="${SPACETIME_DOWNLOAD_ROOT}" sh "${tmp_dir}/spacetime-install.sh" --root-dir "${install_root}" -y
if [[ ! -x "${install_root}/bin/current/spacetimedb-cli" ]]; then
echo "[prepare-provision-tools] The SpacetimeDB install result is missing bin/current/spacetimedb-cli." >&2
exit 1
fi
if [[ ! -x "${install_root}/bin/current/spacetimedb-standalone" ]]; then
echo "[prepare-provision-tools] The SpacetimeDB install result is missing bin/current/spacetimedb-standalone." >&2
exit 1
fi
mkdir -p "${target_dir}"
cp -a "${install_root}/bin" "${target_dir}/bin"
make_spacetime_wrapper "${target_dir}/spacetime"
"${target_dir}/spacetime" --version >/dev/null
}
main() {
local tmp_dir
require_cmd chmod
require_cmd cp
require_cmd install
require_cmd mktemp
require_cmd rm
mkdir -p "${PROVISION_TOOLS_TMP_PARENT}"
tmp_dir="$(mktemp -d "${PROVISION_TOOLS_TMP_PARENT%/}/run.XXXXXX")"
TMP_DIR_TO_CLEAN="${tmp_dir}"
trap cleanup_tmp_dir EXIT
rm -rf "${PROVISION_TOOLS_DIR}"
mkdir -p "${PROVISION_TOOLS_DIR}"
prepare_otelcol "${tmp_dir}"
prepare_spacetime "${tmp_dir}"
cat >"${PROVISION_TOOLS_DIR}/MANIFEST.txt" <<EOF
otelcol-contrib ${OTELCOL_VERSION}
spacetime installer ${SPACETIME_INSTALLER_URL}
spacetime download root ${SPACETIME_DOWNLOAD_ROOT}
EOF
echo "[prepare-provision-tools] Tool bundle prepared: ${PROVISION_TOOLS_DIR}"
find "${PROVISION_TOOLS_DIR}" -maxdepth 5 \( -type f -o -type l \) | sort
}
main "$@"