diff --git a/docs/technical/README.md b/docs/technical/README.md
index 209e3bd5..cab7e99d 100644
--- a/docs/technical/README.md
+++ b/docs/technical/README.md
@@ -4,6 +4,7 @@
 
 ## Document list
 
+- [SPACETIMEDB_START_SH_EARLY_EXIT_DIAGNOSTICS_2026-04-27.md](./SPACETIMEDB_START_SH_EARLY_EXIT_DIAGNOSTICS_2026-04-27.md): Documents the diagnostics added for the case where the release package's `start.sh` prints only "SpacetimeDB process exited before becoming ready"; on startup failure or timeout the script now automatically echoes `logs/spacetimedb.log`, the `server ping` result, the port listeners, and the processes associated with the root-dir.
 - [RPG_AND_AGENT_CHAT_TRUE_SSE_STREAMING_2026-04-26.md](./RPG_AND_AGENT_CHAT_TRUE_SSE_STREAMING_2026-04-26.md): Documents how the backend switched RPG runtime NPC chat, the RPG/custom-world Agent, and the Big Fish (大鱼) Agent from "assemble the complete SSE string, then return it in one shot" to true streaming output via `mpsc + Sse`.
 - [SPACETIMEDB_START_SH_ROOT_OWNER_FALSE_POSITIVE_FIX_2026-04-27.md](./SPACETIMEDB_START_SH_ROOT_OWNER_FALSE_POSITIVE_FIX_2026-04-27.md): Documents the root cause of the release package's `start.sh` root-dir occupancy check mistaking `grep -F .../.spacetimedb` for a SpacetimeDB instance, the script fix, and the on-site remediation.
 - [RPG_BATTLE_HEALTHBAR_AND_ACTION_PRESENTATION_FIX_2026-04-26.md](./RPG_BATTLE_HEALTHBAR_AND_ACTION_PRESENTATION_FIX_2026-04-26.md): Documents the fixes for the RPG battle health-bar safe anchor, the short front-end presentation of server battle responses, and the fallback settlement of the skill specified by `battle_use_skill`.
diff --git a/docs/technical/SPACETIMEDB_START_SH_EARLY_EXIT_DIAGNOSTICS_2026-04-27.md b/docs/technical/SPACETIMEDB_START_SH_EARLY_EXIT_DIAGNOSTICS_2026-04-27.md
new file mode 100644
index 00000000..5b49aa99
--- /dev/null
+++ b/docs/technical/SPACETIMEDB_START_SH_EARLY_EXIT_DIAGNOSTICS_2026-04-27.md
@@ -0,0 +1,75 @@
+# start.sh: Diagnostics for SpacetimeDB Exiting Before Ready
+
+Date: `2026-04-27`
+
+## 1. Problem
+
+When running `start.sh` from the release package, you may see only:
+
+```text
+[start] Starting spacetimedb
+[start] SpacetimeDB process exited before becoming ready.
+```
+
+This message only tells you that the `spacetime start` child process exited before `server ping` confirmed readiness; it does not identify the root cause. The actual error has usually already been written to `logs/spacetimedb.log` under the release directory.
+
+## 2. Common root causes
+
+1. The port given by `GENARRATIVE_SPACETIME_PORT` is already in use by another process.
+2. The `.spacetimedb/` root-dir has incorrect permissions, so the current user cannot write to the data, bin, or log directories.
+3. The `spacetime` installation on the target machine is incomplete, so the release package cannot sync an executable `bin/current/spacetimedb-cli`.
+4. The `spacetime` version on the target machine is incompatible with the startup arguments used by the script.
+5. An old SpacetimeDB process still holds the same root-dir or data lock, but the port that the current `GENARRATIVE_SPACETIME_SERVER_URL` points to is not ready.
+6. `.spacetimedb/bin/current/` contains only `spacetimedb-cli` and is missing `spacetimedb-standalone`; the log then shows `exec failed for .../spacetimedb-standalone`.
+
+## 3. Fix as shipped
+
+The `start.sh` generated for the release package now:
+
+1. Runs `server ping` first in the wait loop and only then checks whether the startup process has exited, which avoids a false failure when the target service is already ready but the startup wrapper process has exited.
+2. No longer checks for or copies only `spacetimedb-cli` in `sync_ubuntu_spacetime_install`; it requires both `bin/current/spacetimedb-cli` and `bin/current/spacetimedb-standalone` to exist, and when syncing from the Ubuntu user-level install directory it copies the complete version directory.
+3. Automatically prints the following when the SpacetimeDB process exits early or the wait times out:
+   - The target address from `GENARRATIVE_SPACETIME_SERVER_URL`.
+   - The listen address from `GENARRATIVE_SPACETIME_HOST:GENARRATIVE_SPACETIME_PORT`.
+   - The current `GENARRATIVE_SPACETIME_ROOT_DIR`.
+   - The last 120 lines of `logs/spacetimedb.log`.
+   - The raw output of `spacetime server ping`.
+   - The listeners on the current port as reported by `ss` or `netstat`.
+   - Any SpacetimeDB processes still running under the same root-dir.
+
+## 4. On-site troubleshooting
+
+Run the following in the release directory:
+
+```bash
+tail -n 120 logs/spacetimedb.log
+spacetime --root-dir ./.spacetimedb server ping "${GENARRATIVE_SPACETIME_SERVER_URL:-http://127.0.0.1:3101}"
+ss -ltnp | grep ':3101' || true
+```
+
+If the log contains:
+
+```text
+exec failed for /var/lib/jenkins/deploy/Genarrative/.spacetimedb/bin/current/spacetimedb-standalone
+No such file or directory (os error 2)
+```
+
+This means the CLI was synced into the release directory's SpacetimeDB root-dir but the standalone binary was not. As a first step on site, run:
+
+```bash
+cd /var/lib/jenkins/deploy/Genarrative
+SPACETIME_VERSION_DIR="$(find /usr/.local/share/spacetime/bin "$HOME/.local/share/spacetime/bin" -mindepth 1 -maxdepth 1 -type d 2>/dev/null | sort -V | tail -n 1)"
+test -x "${SPACETIME_VERSION_DIR}/spacetimedb-cli"
+test -x "${SPACETIME_VERSION_DIR}/spacetimedb-standalone"
+rm -rf .spacetimedb/bin/current
+mkdir -p .spacetimedb/bin/current
+cp -a "${SPACETIME_VERSION_DIR}/." .spacetimedb/bin/current/
+chmod +x .spacetimedb/bin/current/spacetimedb-cli .spacetimedb/bin/current/spacetimedb-standalone
+./start.sh
+```
+
+If the log shows that the port is in use, first confirm whether the process holding it is the old SpacetimeDB. To reuse it, change `GENARRATIVE_SPACETIME_PORT` or `GENARRATIVE_SPACETIME_SERVER_URL` to the actual port; to restart it, prefer running `./stop.sh` from the same directory.
+
+If the log shows a permission problem, first fix the ownership of the release directory, `.spacetimedb/`, and `logs/`; do not work around it by deleting `.spacetimedb/`. Deleting `.spacetimedb/` affects both the local SpacetimeDB data and the CLI identity, so do it only after confirming that the local database is disposable.
+
+If the log shows an identity error or `403 Forbidden`, continue with `SPACETIMEDB_START_SH_PUBLISH_403_IDENTITY_FIX_2026-04-26.md`; such errors occur during the publish phase and are not a case of the startup process exiting early.
diff --git a/scripts/deploy-rust-remote.sh b/scripts/deploy-rust-remote.sh
index 24f6bb32..a8b8b822 100644
--- a/scripts/deploy-rust-remote.sh
+++ b/scripts/deploy-rust-remote.sh
@@ -548,19 +548,25 @@ wait_for_spacetime() {
   local deadline=$((SECONDS + SPACETIME_TIMEOUT_SECONDS))
 
   while ((SECONDS < deadline)); do
-    if [[ -n "${process_pid}" ]] && ! kill -0 "${process_pid}" 2>/dev/null; then
-      echo "[start] SpacetimeDB process exited before becoming ready." >&2
-      exit 1
-    fi
-
     if is_spacetime_ready; then
       return
     fi
 
+    if [[ -n "${process_pid}" ]] && ! kill -0 "${process_pid}" 2>/dev/null; then
+      echo "[start] SpacetimeDB process exited before becoming ready." >&2
+      print_spacetime_start_diagnostics
+      exit 1
+    fi
+
     sleep 0.5
   done
 
+  if is_spacetime_ready; then
+    return
+  fi
+
   echo "[start] Timed out waiting for SpacetimeDB to become ready: ${SPACETIME_SERVER_URL}" >&2
+  print_spacetime_start_diagnostics
   exit 1
 }
 
@@ -575,6 +581,42 @@ is_spacetime_ready() {
   [[ "${output}" == *"Server is online:"* ]]
 }
 
+print_spacetime_start_diagnostics() {
+  local log_file="${LOG_DIR}/spacetimedb.log"
+  local root_owner=""
+
+  # SpacetimeDB startup logs are redirected to a file by default; on failure, echo the key on-site state so the user sees more than just "exited before becoming ready".
+  echo "[start] SpacetimeDB startup diagnostics:" >&2
+  echo "[start] - server: ${SPACETIME_SERVER_URL}" >&2
+  echo "[start] - listen: ${SPACETIME_HOST}:${SPACETIME_PORT}" >&2
+  echo "[start] - root-dir: ${SPACETIME_ROOT_DIR}" >&2
+  echo "[start] - log: ${log_file}" >&2
+
+  if [[ -f "${log_file}" ]]; then
+    echo "[start] Last 120 lines of ${log_file}:" >&2
+    tail -n 120 "${log_file}" >&2 || true
+  else
+    echo "[start] ${log_file} has not been created yet" >&2
+  fi
+
+  echo "[start] server ping result:" >&2
+  spacetime --root-dir="${SPACETIME_ROOT_DIR}" server ping "${SPACETIME_SERVER_URL}" >&2 || true
+
+  if command -v ss >/dev/null 2>&1; then
+    echo "[start] Listener check for port ${SPACETIME_PORT}:" >&2
+    ss -ltnp 2>/dev/null | awk -v listen=":${SPACETIME_PORT}" 'NR == 1 || index($0, listen) > 0 { print }' >&2 || true
+  elif command -v netstat >/dev/null 2>&1; then
+    echo "[start] Listener check for port ${SPACETIME_PORT}:" >&2
+    netstat -ltnp 2>/dev/null | awk -v listen=":${SPACETIME_PORT}" 'NR == 1 || index($0, listen) > 0 { print }' >&2 || true
+  fi
+
+  root_owner="$(describe_spacetime_root_owner || true)"
+  if [[ -n "${root_owner}" ]]; then
+    echo "[start] SpacetimeDB processes associated with root-dir:" >&2
+    echo "${root_owner}" >&2
+  fi
+}
+
 describe_spacetime_root_owner() {
   if command -v ps >/dev/null 2>&1; then
     ps -eo user=,pid=,ppid=,stat=,comm=,args= 2>/dev/null | awk -v root_dir="${SPACETIME_ROOT_DIR}" '
@@ -600,7 +642,9 @@ describe_spacetime_root_owner() {
 
 sync_ubuntu_spacetime_install() {
   local root_dir="$1"
-  local target_cli="${root_dir}/bin/current/spacetimedb-cli"
+  local target_bin_dir="${root_dir}/bin/current"
+  local target_cli="${target_bin_dir}/spacetimedb-cli"
+  local target_standalone="${target_bin_dir}/spacetimedb-standalone"
   local spacetime_command=""
   local resolved_command=""
   local install_dir=""
@@ -609,7 +653,7 @@ sync_ubuntu_spacetime_install() {
   local share_bin_dir=""
   local version_dir=""
 
-  if [[ -x "${target_cli}" ]]; then
+  if [[ -x "${target_cli}" && -x "${target_standalone}" ]]; then
     return
   fi
 
@@ -632,11 +676,12 @@ sync_ubuntu_spacetime_install() {
     "${HOME:-}/.local/share/spacetime/bin"; do
     if [[ -d "${share_bin_dir}" ]]; then
       version_dir="$(find "${share_bin_dir}" -mindepth 1 -maxdepth 1 -type d | sort -V | tail -n 1)"
-      if [[ -n "${version_dir}" && -x "${version_dir}/spacetimedb-cli" ]]; then
-        echo "[start] Syncing Ubuntu SpacetimeDB CLI: ${version_dir}/spacetimedb-cli -> ${target_cli}"
-        mkdir -p "${root_bin}/current"
-        cp -f "${version_dir}/spacetimedb-cli" "${target_cli}"
-        chmod +x "${target_cli}"
+      if [[ -n "${version_dir}" && -x "${version_dir}/spacetimedb-cli" && -x "${version_dir}/spacetimedb-standalone" ]]; then
+        echo "[start] Syncing Ubuntu SpacetimeDB installation: ${version_dir} -> ${target_bin_dir}"
+        rm -rf "${target_bin_dir}"
+        mkdir -p "${target_bin_dir}"
+        cp -a "${version_dir}/." "${target_bin_dir}/"
+        chmod +x "${target_cli}" "${target_standalone}"
         return
       fi
     fi
@@ -648,27 +693,27 @@ sync_ubuntu_spacetime_install() {
   elif [[ -x "${install_dir}/current/spacetimedb-cli" ]]; then
     echo "[start] Syncing Ubuntu SpacetimeDB installation: ${install_dir} -> ${root_bin}"
     cp -a "${install_dir}/." "${root_bin}/"
-  elif [[ -x "${install_dir}/spacetimedb-cli" ]]; then
-    echo "[start] Syncing Ubuntu SpacetimeDB CLI: ${install_dir}/spacetimedb-cli -> ${target_cli}"
-    mkdir -p "${root_bin}/current"
+  elif [[ -x "${install_dir}/spacetimedb-cli" && -x "${install_dir}/spacetimedb-standalone" ]]; then
+    echo "[start] Syncing Ubuntu SpacetimeDB installation: ${install_dir} -> ${target_bin_dir}"
+    rm -rf "${target_bin_dir}"
+    mkdir -p "${target_bin_dir}"
     cp -f "${install_dir}/spacetimedb-cli" "${target_cli}"
-    chmod +x "${target_cli}"
+    cp -f "${install_dir}/spacetimedb-standalone" "${target_standalone}"
+    chmod +x "${target_cli}" "${target_standalone}"
   elif [[ -f "${resolved_command}" ]]; then
     parent_dir="$(cd -- "${install_dir}/.." && pwd)"
-    if [[ -d "${parent_dir}/bin" && -x "${parent_dir}/bin/current/spacetimedb-cli" ]]; then
+    if [[ -d "${parent_dir}/bin" && -x "${parent_dir}/bin/current/spacetimedb-cli" && -x "${parent_dir}/bin/current/spacetimedb-standalone" ]]; then
       echo "[start] Syncing Ubuntu SpacetimeDB installation: ${parent_dir}/bin -> ${root_bin}"
       cp -a "${parent_dir}/bin/." "${root_bin}/"
     else
-      echo "[start] Syncing Ubuntu SpacetimeDB command: ${resolved_command} -> ${target_cli}"
-      mkdir -p "${root_bin}/current"
-      cp -f "${resolved_command}" "${target_cli}"
-      chmod +x "${target_cli}"
+      echo "[start] Could not infer a complete SpacetimeDB installation directory from the spacetime command path: ${resolved_command}" >&2
     fi
   fi
 
-  if [[ ! -x "${target_cli}" ]]; then
-    echo "[start] Still could not find ${target_cli} after syncing the SpacetimeDB installation." >&2
-    echo "[start] Make sure the spacetime installation directory on Ubuntu contains bin/current/spacetimedb-cli, or provide an executable spacetime command." >&2
+  if [[ ! -x "${target_cli}" || ! -x "${target_standalone}" ]]; then
+    echo "[start] Still could not find a complete current directory after syncing the SpacetimeDB installation." >&2
+    echo "[start] Both must exist: ${target_cli} and ${target_standalone}" >&2
+    echo "[start] Make sure the spacetime installation directory on Ubuntu contains both spacetimedb-cli and spacetimedb-standalone." >&2
     exit 1
   fi
 }