diff --git a/.codex/skills/gpt-image-2-apimart/SKILL.md b/.codex/skills/gpt-image-2-apimart/SKILL.md index a29d0889..543ed3be 100644 --- a/.codex/skills/gpt-image-2-apimart/SKILL.md +++ b/.codex/skills/gpt-image-2-apimart/SKILL.md @@ -43,6 +43,7 @@ Default body: "model": "gpt-image-2", "prompt": "", "n": 1, + "official_fallback": true, "size": "1:1" } ``` diff --git a/.codex/skills/gpt-image-2-apimart/scripts/generate-template-samples.mjs b/.codex/skills/gpt-image-2-apimart/scripts/generate-template-samples.mjs index c8732519..dbecf356 100644 --- a/.codex/skills/gpt-image-2-apimart/scripts/generate-template-samples.mjs +++ b/.codex/skills/gpt-image-2-apimart/scripts/generate-template-samples.mjs @@ -237,6 +237,7 @@ async function generateOne(env, template, outDir) { model: 'gpt-image-2', prompt: buildPrompt(template), n: 1, + official_fallback: true, size: '1:1', }; const payload = await fetchJson( @@ -304,6 +305,7 @@ if (dryRun) { model: 'gpt-image-2', prompt: buildPrompt(template), n: 1, + official_fallback: true, size: '1:1', }, })), diff --git a/.hermes/shared-memory/decision-log.md b/.hermes/shared-memory/decision-log.md index 72bddd80..e11246e2 100644 --- a/.hermes/shared-memory/decision-log.md +++ b/.hermes/shared-memory/decision-log.md @@ -16,6 +16,14 @@ --- +## 2026-05-08 APIMart 接口统一携带 `official_fallback` + +- 背景:APIMart 的图片生成和 Responses 接口在仓库内分散于 `api-server`、`platform-llm` 和本地 skill 脚本,若只修单点,容易出现不同入口的上游请求体不一致。 +- 决策:凡是仓库内调用 APIMart 的 OpenAI 兼容接口,请求体统一携带 `official_fallback: true`;其中图片生成请求直接固定写入,`platform-llm` 的 APIMart GPT-5 client 通过显式开关开启,不默认扩散到 Ark 等其它 provider。 +- 影响范围:`server-rs/crates/api-server/src/openai_image_generation.rs`、`server-rs/crates/api-server/src/puzzle.rs`、`server-rs/crates/api-server/src/state.rs`、`server-rs/crates/platform-llm/src/lib.rs`、`.codex/skills/gpt-image-2-apimart/` 和相关技术文档。 +- 验证方式:图片生成与 creative-agent APIMart 路径的单测都应断言 `official_fallback` 已写入请求 JSON;编码检查和相关 Rust 测试应持续通过。 +- 
关联文档:`docs/technical/PUZZLE_APIMART_IMAGE_MODEL_ROUTING_2026-05-01.md`、`docs/technical/RPG_IMAGE_GENERATION_GPT_IMAGE_2_MIGRATION_2026-05-02.md`、`docs/technical/CREATIVE_INTERACTIVE_CONTENT_AGENT_TECHNICAL_SOLUTION_2026-05-05.md`。 + ## 2026-05-07 server-rs Cargo 依赖集中到 workspace - 背景:`server-rs` 多 crate 已稳定成 DDD workspace,成员 `Cargo.toml` 中重复散写第三方版本和本地 path 依赖,升级 SpacetimeDB SDK、`serde`、`reqwest`、`tokio` 等依赖时容易漂移。 diff --git a/.hermes/shared-memory/pitfalls.md b/.hermes/shared-memory/pitfalls.md index 31554084..6c86f356 100644 --- a/.hermes/shared-memory/pitfalls.md +++ b/.hermes/shared-memory/pitfalls.md @@ -77,13 +77,30 @@ - 验证:不打印密钥内容,只检查 `APIMART_API_KEY` 非空;重启后触发拼图生成不再返回本地配置缺失的 503。 - 关联:`docs/technical/PUZZLE_APIMART_IMAGE_MODEL_ROUTING_2026-05-01.md`、`.codex/skills/gpt-image-2-apimart/SKILL.md`。 +## 拼图图片生成 98% 后报 OSS V4 签名时间格式化失败 + +- 现象:拼图创作表单生成进度卡在 98%,`POST /api/runtime/puzzle/agent/sessions/{sessionId}/actions` 返回 `502 Bad Gateway`,前端提示 `拼图图片生成失败:OSS V4 签名时间格式化失败`。 +- 原因:`platform-oss` 曾用 `OffsetDateTime::time().to_string()` 拼接 `x-oss-date`,UTC 小时、分钟或秒为个位数时可能缺少前导零,导致 V4 签名时间不是固定 `YYYYMMDDTHHMMSSZ`。 +- 处理:OSS V4 签名日期统一显式补零格式化;签名 scope 用 `YYYYMMDD`,完整签名时间用 `YYYYMMDDTHHMMSSZ`,不要再依赖 `time().to_string()`。 +- 验证:运行 `cargo test -p platform-oss` 和 `cargo check -p api-server`;重启 `npm run api-server` 后检查 `/healthz`,再重新触发拼图生成。 +- 关联:`server-rs/crates/platform-oss/src/lib.rs`、`server-rs/crates/api-server/src/assets.rs`、`docs/technical/M6_OSS_SERVER_UPLOAD_AND_STS_POLICY_2026-04-21.md`。 + +## 拼图生成完成后图片只显示破图或 alt 文案 + +- 现象:拼图结果页生成完成后,“画面图”区域出现破图图标和作品名,图片无法正常预览;但打开历史拼图素材时同一张图可能可以正常预览。 +- 原因:拼图正式图保存为 `/generated-puzzle-assets/*` 兼容标识,旧 `/generated-*` 直读代理已删除;如果前端没有通过 `ResolvedAssetImage` / `/api/assets/read-url` 换签,或收到无前导斜杠的 `generated-puzzle-assets/*` object key 后未识别为 generated 私有资源,浏览器会直接请求裸路径并失败。生成完成后的结果图还会传入 `refreshKey`,它只能用于重新请求 `/api/assets/read-url`,不能给 OSS V4 签名 URL 追加 `_v`;OSS 会把 query 纳入签名,额外参数会让签名失效,历史素材常因未传 `refreshKey` 而表现正常。 +- 处理:拼图结果页、发布预览、运行态和历史素材预览都走 
`ResolvedAssetImage` 或 `useResolvedAssetReadUrl`;`isGeneratedLegacyPath(...)` 必须同时识别 `/generated-*` 和 `generated-*`;`refreshKey` 只绕过前端签名缓存并重新换签,不修改已返回的 OSS 签名 URL;禁止恢复 `/generated-puzzle-assets` 直读代理。 +- 验证:运行 `npm run test -- src\services\assetReadUrlService.test.ts src\hooks\useResolvedAssetReadUrl.test.tsx src\components\puzzle-result\PuzzleResultView.test.tsx`,再触发一次真实生成确认 Network 中先请求 `/api/assets/read-url`,图片 `src` 为未追加 `_v` 的签名 URL。 +- 关联:`src/services/assetReadUrlService.ts`、`src/components/ResolvedAssetImage.tsx`、`docs/technical/PUZZLE_IMAGE_ASSET_PROXY_FIX_2026-04-27.md`。 + ## 本地短信登录页签突然消失 - 现象:登录弹窗只剩密码登录,短信登录页签看起来像被删掉,但 `LoginScreen` 中手机号验证码表单仍存在。 - 原因:前端根据 `GET /api/auth/login-options` 返回的 `availableLoginMethods` 渲染页签;常见根因有两类: - 本地启动脚本没有让 `.env.local` 覆盖 `.env`,`SMS_AUTH_ENABLED=true` 不生效,后端只返回 `["password"]`。 - Rust API 直连已返回 `["phone","password"]`,但 Vite 代理目标指向未监听端口,导致 3000 域名下的 `login-options` 返回 `500`,`AuthGate` 降级成 `["password"]`。 -- 处理:优先用 `npm run api-server`、`npm run dev:rust` 或 `npm run dev` 启动,这些入口应保持 shell 环境变量最高优先级,并允许 `.env.local` 覆盖 `.env`;完整栈启动时还要确保脚本计算出的 `RUST_SERVER_TARGET` 不被 `.env.local` 里的旧值覆盖。排查时先请求 3000 域名下的 `/api/auth/login-options`,再直连 Rust API 目标,并核对 `.env.local` 的 `SMS_AUTH_ENABLED` 与代理端口。 + - 3000 端口被旧 `dev:web` 占用后,新的完整栈 Vite 自动漂移到 3001/3002;浏览器仍打开旧 3000 页面,旧页面继续代理到已经下线的端口。 +- 处理:优先用 `npm run api-server`、`npm run dev:rust` 或 `npm run dev` 启动,这些入口应保持 shell 环境变量最高优先级,并允许 `.env.local` 覆盖 `.env`;完整栈启动时还要确保脚本计算出的 `RUST_SERVER_TARGET` 不被 `.env.local` 里的旧值覆盖。排查时先请求 3000 域名下的 `/api/auth/login-options`,再直连 Rust API 目标,并核对 `.env.local` 的 `SMS_AUTH_ENABLED` 与代理端口;若 3001/3002 才返回正确结果,说明当前 3000 是旧前端进程,应清理旧进程后重启。 - 验证:`http://127.0.0.1:3000/api/auth/login-options` 返回至少 `{"availableLoginMethods":["phone","password"]}` 后,登录弹窗会恢复短信登录页签和“获取验证码”按钮。 - 关联:`scripts/api-server-dev.mjs`、`scripts/api-server-maincloud.mjs`、`scripts/dev-rust-stack.sh`、`scripts/dev-web-rust.mjs`、`docs/technical/AUTH_LOGIN_OPTIONS_DESIGN_2026-04-21.md`。 diff --git 
a/deploy/env/api-server.env.example b/deploy/env/api-server.env.example index bd5af9d9..943d9a48 100644 --- a/deploy/env/api-server.env.example +++ b/deploy/env/api-server.env.example @@ -39,6 +39,20 @@ APIMART_BASE_URL= APIMART_API_KEY= APIMART_IMAGE_REQUEST_TIMEOUT_MS=180000 +VECTOR_ENGINE_BASE_URL= +VECTOR_ENGINE_API_KEY= +VECTOR_ENGINE_AUDIO_REQUEST_TIMEOUT_MS=180000 + +VOLCENGINE_SPEECH_API_KEY= +VOLCENGINE_SPEECH_APP_ID= +VOLCENGINE_SPEECH_ACCESS_KEY= +VOLCENGINE_SPEECH_ASR_RESOURCE_ID=volc.seedasr.sauc.concurrent +VOLCENGINE_SPEECH_TTS_RESOURCE_ID=seed-tts-2.0 +VOLCENGINE_SPEECH_REQUEST_TIMEOUT_MS=180000 +VOLCENGINE_SPEECH_ASR_WS_URL=wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async +VOLCENGINE_SPEECH_TTS_BIDIRECTION_WS_URL=wss://openspeech.bytedance.com/api/v3/tts/bidirection +VOLCENGINE_SPEECH_TTS_SSE_URL=https://openspeech.bytedance.com/api/v3/tts/unidirectional/sse + ARK_CHARACTER_VIDEO_BASE_URL= ARK_CHARACTER_VIDEO_API_KEY= ARK_CHARACTER_VIDEO_MODEL= diff --git a/docs/design/MOBILE_CREATION_NEW_WORK_COMPACT_LAYOUT_2026-04-24.md b/docs/design/MOBILE_CREATION_NEW_WORK_COMPACT_LAYOUT_2026-04-24.md index 0911539f..35ba0e79 100644 --- a/docs/design/MOBILE_CREATION_NEW_WORK_COMPACT_LAYOUT_2026-04-24.md +++ b/docs/design/MOBILE_CREATION_NEW_WORK_COMPACT_LAYOUT_2026-04-24.md @@ -37,3 +37,4 @@ 4. 移动端卡片仍以紧凑横滑为主,参考图使用暗色遮罩承接标题,文本不得溢出卡片。 5. 当前创作 Tab 的可见玩法卡必须真实渲染 `img`,不能只在隐藏弹窗或旧入口中配置图片。 6. 参考图卡片上的标题和副标题必须显式使用白色文字,并配合底部加深渐变与文字阴影;禁止依赖 `text-inherit`,避免黑字叠在暗蒙版上。 +7. 当前创作 Tab 顶部不再保留“10分钟创作一个精品互动玩法”标题,玩法参考图卡带直接作为首屏入口;移动端卡带必须支持横向拖动滑动。 diff --git a/docs/design/PLATFORM_MOBILE_RECOMMEND_DISCOVER_DRAFT_TAB_REDESIGN_2026-05-05.md b/docs/design/PLATFORM_MOBILE_RECOMMEND_DISCOVER_DRAFT_TAB_REDESIGN_2026-05-05.md index 7d900162..87807c71 100644 --- a/docs/design/PLATFORM_MOBILE_RECOMMEND_DISCOVER_DRAFT_TAB_REDESIGN_2026-05-05.md +++ b/docs/design/PLATFORM_MOBILE_RECOMMEND_DISCOVER_DRAFT_TAB_REDESIGN_2026-05-05.md @@ -27,12 +27,15 @@ ## 3. 
推荐页 -移动端推荐页默认不展示搜索和频道横滑条,进入一级“推荐”后直接渲染公开作品流: +移动端推荐页默认不展示搜索和频道横滑条,进入一级“推荐”后直接渲染公开作品启动后的内容: - 数据来源沿用 `featuredEntries + latestEntries` 去重后的公开作品列表。 -- 卡片保留现有作品读模型字段:封面、作者、游玩、改造、点赞、标签。 -- 移动端推荐卡使用近全屏大画面比例,底部展示互动指标、作者和主操作,不写规则说明类文案。 -- 无数据、加载中和错误态沿用短状态文案。 +- 首个可运行作品自动进入推荐页内嵌运行态,主视口不再展示作品封面卡。 +- 主视口占据顶部栏与底部作品区之间的主要空间,保持黑色运行容器、圆角边界和短加载态,直接承载作品启动后的玩法画面。 +- 主视口下方展示当前作品的游玩、点赞、评论/改造等紧凑指标、作者头像、作者名与作品名,不写规则说明类文案。 +- 底部作品区使用横向滑动切换器,条目只展示作品名、类型和核心指标;点击或滑动到其他作品时切换上方运行内容。 +- 点击作品元信息仍可进入既有详情页,点赞、改造、复制作品号等完整操作继续收敛在详情页。 +- 无数据、加载中、启动失败和暂不支持内嵌运行的作品沿用短状态文案。 桌面端仍保持现有首页布局,只把一级导航文案从“首页”改为“推荐”。 @@ -72,12 +75,13 @@ ## 7. 验收 1. 移动端底部导航显示“推荐 / 发现 / 创作 / 草稿 / 我的”,未登录时显示“推荐 / 创作 / 发现”。 -2. 点击“推荐”直接看到公开作品推荐流,不再先看到搜索框和频道 Tab。 +2. 点击“推荐”直接看到公开作品启动后的内容,不再先看到搜索框、频道 Tab 或封面卡流。 3. 点击“发现”可看到搜索、推荐、今日、分类、排行子 Tab。 4. 点击“草稿”看到原创作页作品列表。 5. 点击“创作”只看到新建创作入口。 6. “我的”里的“玩过”弹层包含原存档列表入口,点击存档能继续恢复。 7. 移动端底部导航为悬浮胶囊样式,保留当前明暗主题色变量,不新增第三套主题。 +8. 推荐页底部作品区可横向滑动并切换作品,切换后上方运行视口同步进入对应作品内容。 ## 8. 2026-05-07 未登录三栏补充 diff --git a/docs/design/PUZZLE_RUNTIME_HEADER_AND_CLEAR_EFFECT_DESIGN_2026-04-27.md b/docs/design/PUZZLE_RUNTIME_HEADER_AND_CLEAR_EFFECT_DESIGN_2026-04-27.md index 71574a80..a062a099 100644 --- a/docs/design/PUZZLE_RUNTIME_HEADER_AND_CLEAR_EFFECT_DESIGN_2026-04-27.md +++ b/docs/design/PUZZLE_RUNTIME_HEADER_AND_CLEAR_EFFECT_DESIGN_2026-04-27.md @@ -45,6 +45,7 @@ 1. 玩家需要优先依赖画面主体、构图和色块识别位置。 2. 编号覆盖会削弱“完整图片被逐步复原”的视觉奖励。 +3. 合成后的拼图块只保留原图切片、外轮廓描边和必要的拖拽层级,不叠加额外的色块、暗层或蒙版,避免破坏原图识别。 ### 3. 设置能力 diff --git a/docs/experience/AGENT_UI_CHANGELOG.md b/docs/experience/AGENT_UI_CHANGELOG.md index 7fa7c705..83c84c05 100644 --- a/docs/experience/AGENT_UI_CHANGELOG.md +++ b/docs/experience/AGENT_UI_CHANGELOG.md @@ -118,6 +118,11 @@ - 早期方案曾在 `AuthGate` 层提供右上角全局账号信息条,并在 `GameShellRuntime` 中临时隐藏。 - 2026-04-20 起,这个全局悬浮入口已整体下线,不再区分“平台显示 / 冒险隐藏”。 - 原因是右上角高频观察区不适合承载账号入口,且平台内已经有更明确的页面内入口。 + +## 9. 
2026-05-08 创作首页通知入口下线 + +- `CreativeAgentHome` 顶栏右上角不再展示“通知与账户”按钮,避免创作首页把通知入口放在首屏高频区域。 +- 账号入口仍保留在侧边栏底部,创作首页顶栏维持左侧菜单、居中品牌的轻量结构。 - 当前账号相关入口统一保留在平台首页受保护动作、个人页、存档页与账号弹窗,不再占用全局悬浮层。 --- diff --git a/docs/prd/AI_NATIVE_CUSTOM_WORLD_CREATION_HUB_PRD_2026-04-13.md b/docs/prd/AI_NATIVE_CUSTOM_WORLD_CREATION_HUB_PRD_2026-04-13.md index 58044fec..7f2592db 100644 --- a/docs/prd/AI_NATIVE_CUSTOM_WORLD_CREATION_HUB_PRD_2026-04-13.md +++ b/docs/prd/AI_NATIVE_CUSTOM_WORLD_CREATION_HUB_PRD_2026-04-13.md @@ -468,8 +468,9 @@ interface CustomWorldCoverProfile { 1. 上传后先进入独立裁剪面板 2. 裁剪框比例固定为 `16:9` -3. 作者只能平移和缩放,不允许自由改比例 -4. 裁剪完成后,再提交给后端保存 +3. 作者直接在图片上拖拽裁剪框内部移动区域,拖拽四边或四角调整裁剪范围,不再通过参数滑杆调整 +4. 裁剪框调整过程中必须持续锁定 `16:9`,不允许自由改比例 +5. 裁剪完成后,再提交给后端保存 ### 上传大小与格式限制 diff --git a/docs/prd/AI_NATIVE_VISUAL_NOVEL_TEMPLATE_PRD_2026-05-05.md b/docs/prd/AI_NATIVE_VISUAL_NOVEL_TEMPLATE_PRD_2026-05-05.md index dae3ca55..2d817456 100644 --- a/docs/prd/AI_NATIVE_VISUAL_NOVEL_TEMPLATE_PRD_2026-05-05.md +++ b/docs/prd/AI_NATIVE_VISUAL_NOVEL_TEMPLATE_PRD_2026-05-05.md @@ -1536,6 +1536,14 @@ VN-10 实施收口记录(2026-05-07): 4. 运行时背景图与角色立绘改用 `ResolvedAssetImage`,私有 generated 路径通过 `/api/assets/read-url` 换签后渲染。 5. 本阶段未接入可选图片生成 action,保留既有 action 契约;未改 SpacetimeDB schema,未保存 Data URL、二进制对象或外部 R2 路径。 +VN-10 音频生成补充(2026-05-08): + +1. 场景编辑器在既有 `musicSrc` / `ambientSoundSrc` 槽位上接入音频生成,不新增视觉小说专属音频表或第二套资产系统。 +2. 背景音乐走 VectorEngine Suno `POST /suno/submit/music` 与 `GET /suno/fetch/{task_id}`,完成后写入 `visual_novel_music` 平台资产并绑定到 `visual_novel_scene / music`。 +3. 场景音效走 VectorEngine Vidu `POST /ent/v2/text2audio` 与 `GET /ent/v2/tasks/{id}/creations`,完成后写入 `visual_novel_ambient_sound` 平台资产并绑定到 `visual_novel_scene / ambient_sound`。 +4. 前端只提交提示词、标题、标签、时长等生成参数;供应商 base URL 和 API key 只在 `api-server` 环境变量中读取。 +5. 
生成参数使用独立弹层承载,不在场景面板下方展开规则说明。 + ### VN-11:回放删除与外部平台功能负向扫描 负责范围: diff --git a/docs/prd/FIRST_LAUNCH_PUZZLE_ONBOARDING_PRD_2026-05-05.md b/docs/prd/FIRST_LAUNCH_PUZZLE_ONBOARDING_PRD_2026-05-05.md index dd2406d5..03e45196 100644 --- a/docs/prd/FIRST_LAUNCH_PUZZLE_ONBOARDING_PRD_2026-05-05.md +++ b/docs/prd/FIRST_LAUNCH_PUZZLE_ONBOARDING_PRD_2026-05-05.md @@ -37,7 +37,7 @@ ## 5. 范围边界 -1. 不增加跳过入口。 +1. 首屏右上角提供 `跳过` 入口,点击后写入首次访问标记并回到产品首页。 2. 不定义额外功能说明文案。 3. 不扩展拼图为多关。 4. 不调整注册/登录后的去向,当前进入产品首页。 @@ -47,10 +47,12 @@ 1. 未登录首次访问产品时,进入新手引导首屏。 2. 首屏展示确认文案、输入框和生成按钮。 -3. 用户输入内容并点击生成后,系统生成 1 关拼图。 -4. 生成完成后,用户可以进入该拼图并完成第 1 关。 -5. 第 1 关完成后,页面展示注册/登录引导文案和登录模块。 -6. 用户完成注册或登录后,进入产品首页。 +3. 首屏右上角展示跳过按钮;点击后本次和后续访问不再自动展示新手引导。 +4. 用户输入内容并点击生成后,系统生成 1 关拼图。 +5. 若临时生成接口返回 `404 / 资源不存在`,前端使用本地临时拼图兜底继续进入试玩,不把错误直接展示给用户。 +6. 生成完成后,用户可以进入该拼图并完成第 1 关。 +7. 第 1 关完成后,页面展示注册/登录引导文案和登录模块。 +8. 用户完成注册或登录后,进入产品首页。 ## 7. 落地接口与状态 @@ -59,3 +61,4 @@ 3. 登录后保存入口为 `POST /api/runtime/puzzle/onboarding/save`,要求登录;服务端为当前用户创建拼图 agent session,并把临时 1 关拼图保存为当前用户作品草稿。 4. 新手引导游玩阶段复用现有本地拼图运行时,不新增 SpacetimeDB 表、reducer 或运行时真相。 5. 保存完成后清空新手引导临时态,刷新拼图作品架,并回到产品首页。 +6. 
跳过新手引导只更新本地首次访问标记和界面状态,不创建临时作品、不调用保存接口。 diff --git a/docs/technical/API_SERVER_EXTERNAL_SERVICE_ENV_CONFIG_2026-05-07.md b/docs/technical/API_SERVER_EXTERNAL_SERVICE_ENV_CONFIG_2026-05-07.md index a89274c3..f727a5e8 100644 --- a/docs/technical/API_SERVER_EXTERNAL_SERVICE_ENV_CONFIG_2026-05-07.md +++ b/docs/technical/API_SERVER_EXTERNAL_SERVICE_ENV_CONFIG_2026-05-07.md @@ -39,6 +39,22 @@ APIMART_BASE_URL= APIMART_API_KEY= APIMART_IMAGE_REQUEST_TIMEOUT_MS=180000 +# VectorEngine / Suno / Vidu 音频生成网关 +VECTOR_ENGINE_BASE_URL= +VECTOR_ENGINE_API_KEY= +VECTOR_ENGINE_AUDIO_REQUEST_TIMEOUT_MS=180000 + +# 火山引擎豆包语音 ASR / TTS +VOLCENGINE_SPEECH_API_KEY= +VOLCENGINE_SPEECH_APP_ID= +VOLCENGINE_SPEECH_ACCESS_KEY= +VOLCENGINE_SPEECH_ASR_RESOURCE_ID=volc.seedasr.sauc.concurrent +VOLCENGINE_SPEECH_TTS_RESOURCE_ID=seed-tts-2.0 +VOLCENGINE_SPEECH_REQUEST_TIMEOUT_MS=180000 +VOLCENGINE_SPEECH_ASR_WS_URL=wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async +VOLCENGINE_SPEECH_TTS_BIDIRECTION_WS_URL=wss://openspeech.bytedance.com/api/v3/tts/bidirection +VOLCENGINE_SPEECH_TTS_SSE_URL=https://openspeech.bytedance.com/api/v3/tts/unidirectional/sse + # DashScope 图片模型名 DASHSCOPE_SCENE_IMAGE_MODEL= DASHSCOPE_REFERENCE_IMAGE_MODEL= @@ -65,6 +81,9 @@ DASHSCOPE_COVER_IMAGE_MODEL / DASHSCOPE_IMAGE_MODEL ARK_CHARACTER_VIDEO_BASE_URL / ARK_BASE_URL / GENARRATIVE_LLM_BASE_URL / LLM_BASE_URL ARK_CHARACTER_VIDEO_API_KEY / ARK_API_KEY / GENARRATIVE_LLM_API_KEY / LLM_API_KEY ARK_CHARACTER_VIDEO_MODEL / DASHSCOPE_CHARACTER_VIDEO_MODEL +VOLCENGINE_SPEECH_API_KEY / VOLCENGINE_API_KEY +VOLCENGINE_SPEECH_APP_ID / VOLCENGINE_ACCESS_KEY_ID +VOLCENGINE_SPEECH_ACCESS_KEY / VOLCENGINE_SECRET_ACCESS_KEY ``` ## 运行时行为 @@ -74,6 +93,8 @@ ARK_CHARACTER_VIDEO_MODEL / DASHSCOPE_CHARACTER_VIDEO_MODEL 3. 文本 LLM provider 为 `ark` 且未配置 `GENARRATIVE_LLM_BASE_URL` 时,仍回退到 Ark 公开基础 URL。 4. 角色视频 provider 复用 Ark 且未配置 `ARK_CHARACTER_VIDEO_BASE_URL` 时,仍回退到 Ark 公开基础 URL。 5. 具体模型名缺失时不在配置层伪造默认模型,调用到对应能力时由下游配置校验返回缺配置错误。 +6. 
VectorEngine 音频生成只读取 `VECTOR_ENGINE_BASE_URL` / `VECTOR_ENGINE_API_KEY`,不复用 `APIMART_*`、`GENARRATIVE_LLM_*` 或前端变量。 +7. 火山引擎语音能力由 `platform-speech` 收口协议帧与上游鉴权,`api-server` 只暴露平台鉴权后的代理路由,不向前端返回任何密钥字段。 ## 示例文件 diff --git a/docs/technical/AUTH_LOGIN_OPTIONS_DESIGN_2026-04-21.md b/docs/technical/AUTH_LOGIN_OPTIONS_DESIGN_2026-04-21.md index 987c113b..c90935f2 100644 --- a/docs/technical/AUTH_LOGIN_OPTIONS_DESIGN_2026-04-21.md +++ b/docs/technical/AUTH_LOGIN_OPTIONS_DESIGN_2026-04-21.md @@ -135,4 +135,5 @@ 2. 再请求当前 Rust API 目标,例如 `http://127.0.0.1:3100/api/auth/login-options` 或 `http://127.0.0.1:8082/api/auth/login-options`。 3. 若直连 API 成功而 3000 返回 `500`,检查 `RUST_SERVER_TARGET`、`GENARRATIVE_API_TARGET`、`GENARRATIVE_RUNTIME_SERVER_TARGET` 是否指向仍在监听的 API 端口。 4. `npm run dev` / `npm run dev:rust` 完整栈默认由脚本计算 API 端口;加载 `.env.local` 给后端使用后,脚本必须重新固定 `RUST_SERVER_TARGET`,避免 `.env.local` 中的旧代理目标覆盖本次启动的实际 API 端口。 -5. `npm run dev:web` 只启动前端,不会自动拉起 Rust API;如果单独使用它,必须同时确认其打印的 backend target 已有 `api-server` 正在监听。 +5. `npm run dev:web` 只启动前端,不会自动拉起 Rust API;如果单独使用它,脚本会先探测 `.env.local` / 当前环境里声明的目标,再回退到本机常见端口,最终只会接入一个真实可用的 `api-server`。 +6. 
如果 `3000` 仍然返回 `500`,先确认浏览器是不是还开着旧的前端进程。当前脚本如果因为端口占用漂移到 `3001` / `3002`,应直接关掉旧进程后重启,而不是继续用旧的 3000 页面判断登录入口状态。 diff --git a/docs/technical/CREATIVE_INTERACTIVE_CONTENT_AGENT_TECHNICAL_SOLUTION_2026-05-05.md b/docs/technical/CREATIVE_INTERACTIVE_CONTENT_AGENT_TECHNICAL_SOLUTION_2026-05-05.md index f6e7caaf..82633389 100644 --- a/docs/technical/CREATIVE_INTERACTIVE_CONTENT_AGENT_TECHNICAL_SOLUTION_2026-05-05.md +++ b/docs/technical/CREATIVE_INTERACTIVE_CONTENT_AGENT_TECHNICAL_SOLUTION_2026-05-05.md @@ -28,6 +28,7 @@ ```text POST https://api.apimart.ai/v1/responses model: gpt-5 +official_fallback: true input[].content[].type: input_text | input_image ``` @@ -42,6 +43,7 @@ export interface CreativeAgentMultimodalInputPart { export interface CreativeAgentGpt5Request { model: 'gpt-5'; + official_fallback: true; input: Array<{ role: 'system' | 'user' | 'assistant'; content: CreativeAgentMultimodalInputPart[]; @@ -55,7 +57,7 @@ export interface CreativeAgentGpt5Request { 1. Agent 入口支持文本 + 图片多模态输入,首版至少支持 1 张图片,协议层预留多图。 2. 图片必须先进入资产系统,Agent 请求使用可访问的 `readUrl` 或受控 Data URI;SpacetimeDB 不保存大图 base64。 -3. `platform-llm` 当前已有 Responses 协议骨架,但 Phase 1 需要把 content part 从纯文本扩展成 `input_text` / `input_image` 两类。 +3. `platform-llm` 当前已有 Responses 协议骨架,但 Phase 1 需要把 content part 从纯文本扩展成 `input_text` / `input_image` 两类;APIMart GPT-5 client 必须显式开启 `official_fallback = true`,该供应商字段不默认扩散到 Ark 等非 APIMart provider。 4. `CREATION_TEMPLATE_LLM_MODEL` 等旧文本创作模型不能作为创意互动内容 Agent 的默认模型;本 Agent 必须显式使用 `gpt-5`。 5. 如果 LangChain-Rust adapter 暂时无法直接表达多模态 Responses 请求,应在 `platform-agent` 内桥接到 `platform-llm` 的多模态 Responses client,不能退回纯文本摘要替代图片理解。 6. 
模型工具调用可用 APIMart Responses 的 `tools` 能力承载;工具真正执行仍由 `platform-agent` 注册表和后端 typed Tool 控制。 diff --git a/docs/technical/M6_CUSTOM_WORLD_ASSET_OSS_INTEGRATION_STAGE2_2026-04-22.md b/docs/technical/M6_CUSTOM_WORLD_ASSET_OSS_INTEGRATION_STAGE2_2026-04-22.md index 6594ee30..bfe68535 100644 --- a/docs/technical/M6_CUSTOM_WORLD_ASSET_OSS_INTEGRATION_STAGE2_2026-04-22.md +++ b/docs/technical/M6_CUSTOM_WORLD_ASSET_OSS_INTEGRATION_STAGE2_2026-04-22.md @@ -108,6 +108,12 @@ Node 旧链路对上传封面有明确处理: Rust 本批必须保持这组兼容约束。 +2026-05-08 交互补充: + +1. 前端裁剪面板不再展示 `缩放 / 左右位置 / 上下位置` 参数滑杆。 +2. 作者直接在图片上拖拽裁剪框内部移动区域,或拖拽四边、四角调整裁剪范围。 +3. 调整过程中前端持续锁定 `16:9`,确认时仍只提交后端兼容的 `cropRect`。 + ## 4. 请求与响应 contract ### 4.1 `POST /api/custom-world/scene-image` diff --git a/docs/technical/M6_OSS_SERVER_UPLOAD_AND_STS_POLICY_2026-04-21.md b/docs/technical/M6_OSS_SERVER_UPLOAD_AND_STS_POLICY_2026-04-21.md index 6db24370..b94c5620 100644 --- a/docs/technical/M6_OSS_SERVER_UPLOAD_AND_STS_POLICY_2026-04-21.md +++ b/docs/technical/M6_OSS_SERVER_UPLOAD_AND_STS_POLICY_2026-04-21.md @@ -122,6 +122,12 @@ 2. AI worker 绕过确认链路写出不完整记录 3. 把 OSS 响应中的派生 URL 当成对象真相 +### 5.1 OSS V4 签名时间格式 + +`platform-oss` 的 OSS V4 签名时间必须显式格式化为 `YYYYMMDDTHHMMSSZ`,签名 scope 日期必须显式格式化为 `YYYYMMDD`。实现中不要依赖 `OffsetDateTime::time().to_string()` 或 `date().to_string()` 再替换字符,因为 UTC 小时、分钟、秒为个位数时可能不会保留前导零,导致 AI 生图已完成但上传 OSS 阶段报 `OSS V4 签名时间格式化失败`。 + +拼图、视觉小说、自定义世界等所有服务端生成图片上传链路都复用同一套 `platform-oss` 签名 helper;新增签名逻辑时必须覆盖个位时间分量的测试样例,例如 `05:03:09` 应输出 `T050309Z`。 + ## 6. 与 Web 端的边界 Web 端当前只允许: diff --git a/docs/technical/PUZZLE_APIMART_IMAGE_MODEL_ROUTING_2026-05-01.md b/docs/technical/PUZZLE_APIMART_IMAGE_MODEL_ROUTING_2026-05-01.md index e297bc97..b1732ad7 100644 --- a/docs/technical/PUZZLE_APIMART_IMAGE_MODEL_ROUTING_2026-05-01.md +++ b/docs/technical/PUZZLE_APIMART_IMAGE_MODEL_ROUTING_2026-05-01.md @@ -9,7 +9,7 @@ 1. `https://docs.apimart.ai/cn/api-reference/images/gpt-image-2/generation` 2. 
`https://docs.apimart.ai/cn/api-reference/images/gemini-3.1-flash/generation` -两条文档均指向 OpenAI 兼容风格的图片生成入口:`POST https://api.apimart.ai/v1/images/generations`,头部使用 `Authorization: Bearer {APIMART_API_KEY}`。请求体至少包含 `model`、`prompt`、`n`、`size`。返回体按 OpenAI images 兼容格式优先读取 `data[].url`,若供应商返回异步任务结构,则继续按 `task_id` / `tasks/{task_id}` 轮询并提取图片 URL。 +两条文档均指向 OpenAI 兼容风格的图片生成入口:`POST https://api.apimart.ai/v1/images/generations`,头部使用 `Authorization: Bearer {APIMART_API_KEY}`。请求体至少包含 `model`、`prompt`、`n`、`official_fallback = true`、`size`。返回体按 OpenAI images 兼容格式优先读取 `data[].url`,若供应商返回异步任务结构,则继续按 `task_id` / `tasks/{task_id}` 轮询并提取图片 URL。 ## 模型选项 @@ -43,7 +43,7 @@ - `gpt-image-2` 走 APIMart; - `gemini-3.1-flash-image-preview` 走 APIMart,前端显示名为 `nanobanana2`。 4. APIMart 文生图和图生图共用 `POST /v1/images/generations`。有参考图时,后端将参考图 Data URL 作为 `image_urls` 数组传入;若上游不接受该字段,错误按上游失败返回,不在前端降级伪造结果。 -5. APIMart 尺寸使用文档要求的比例写法 `1:1`。`gemini-3.1-flash-image-preview` 额外带 `resolution = "1K"`,对齐约 1024px 的拼图正方形素材。 +5. APIMart 尺寸使用文档要求的比例写法 `1:1`,所有 APIMart 图片请求体固定携带 `official_fallback = true`。`gemini-3.1-flash-image-preview` 额外带 `resolution = "1K"`,对齐约 1024px 的拼图正方形素材。 6. APIMart 生成成功后仍下载远程图片,沿用现有 OSS 私有对象、`asset_object` 和 `asset_entity_binding` 写入流程。若图片已成功上传 OSS,但 Maincloud / SpacetimeDB 短暂返回 `503 Service Unavailable`,资产索引写入允许降级跳过,并返回本次生成图片;日志必须记录 `拼图图片资产索引写入因 SpacetimeDB 连接不可用而降级跳过`。 7. `save_puzzle_generated_images` 写回草稿时若遇到 Maincloud 连接级 `503` 或断线,API 层基于本次生成结果合成 session 快照返回给前端,避免 APIMart 已成功出图却被后置持久化误报成服务不可用。余额不足、参数错误、上游生图失败仍按原错误返回,不做伪成功。 8. 结果页 `generate_puzzle_images` 会携带当前作品信息和 `levelsJson`。当 Maincloud / SpacetimeDB 在读取 session 阶段就返回连接级 `503` 或断线时,后端必须先用这份结果页快照构造最小内存 session,再继续调用 APIMart;外部图片已经生成后仍按第 6、7 条处理持久化降级。余额不足、参数错误、缺少草稿快照、关卡不存在等业务错误不走此降级。 @@ -51,6 +51,14 @@ 10. APIMart 错误统一映射为 `502 UPSTREAM_ERROR`,`details.provider = "apimart"`,保留上游状态码、业务 message 和截断后的 raw excerpt。 11. 
拼图首图生成 `compile_puzzle_draft` 与关卡图片生成 `generate_puzzle_images` 每次预扣 `2` 光点;余额不足仍返回 `409 CONFLICT`,Maincloud 连接级 503 仍按既有降级策略处理。 +## 关卡名多模态生成 + +1. 第一关和结果页关卡重新生图的最终关卡名统一由 APIMart Chat Completions `gpt-4o-mini` 生成。 +2. 输入必须同时包含生成完成后的正式图片和当前关卡 `pictureDescription`;图片由 `api-server` 从生成结果字节压缩为最多 768 边长的 PNG Data URL 后,以 OpenAI 兼容 `messages[].content[]` 的 `image_url` 形式传入。 +3. 文本模型仍只输出 `{"levelName":"..."}`,并继续复用现有 2 到 8 个中文字符、禁用“画面 / 拼图 / 作品”等泛词的解析与归一化规则。 +4. `gpt-4o-mini` 调用失败、返回非法或 APIMart 文本配置缺失时,不阻断图片生成;后端保留图片生成前的文本关卡名或确定性兜底名。 +5. 关卡名与候选图在同一次 `save_puzzle_generated_images` 中写回 `levels_json` 和正式候选图,避免图片与关卡名不同步。 + ## 环境变量 新增服务端环境变量: @@ -69,7 +77,8 @@ APIMART_IMAGE_REQUEST_TIMEOUT_MS=180000 2. 点击“生成草稿”时,后端首图生成使用当前表单选择的模型。 3. 点击“生成画面 / 重新生成画面”时,后端当前关卡图片生成使用关卡详情选择的模型。 4. 历史 `original` 或空模型值不会再触发 DashScope,统一按 `gpt-image-2` 请求 APIMart。 -5. 选择 APIMart 模型时,请求 `POST {APIMART_BASE_URL}/images/generations`,使用 `Authorization: Bearer {APIMART_API_KEY}`,`model` 等于请求值,`size = 1:1`。 -6. “生成草稿”和关卡详情生图按钮展示 `消耗2光点`;关卡详情确认后展示 30 秒预计剩余进度条。 -7. 不改 SpacetimeDB 表结构,因此无需更新 `migration.rs` 或重新生成 bindings。 -8. 后端改动后运行对应 Rust 测试,并按项目约束用 `npm run api-server` 重启验证。 +5. 选择 APIMart 模型时,请求 `POST {APIMART_BASE_URL}/images/generations`,使用 `Authorization: Bearer {APIMART_API_KEY}`,`model` 等于请求值,`official_fallback = true`,`size = 1:1`。 +6. 首图和结果页关卡重新生图成功后,Network 中应先完成图片生成,再调用 APIMart `POST {APIMART_BASE_URL}/chat/completions`,请求模型为 `gpt-4o-mini`,消息同时包含画面描述文本和正式图 `image_url` Data URL。 +7. “生成草稿”和关卡详情生图按钮展示 `消耗2光点`;关卡详情确认后展示 30 秒预计剩余进度条。 +8. 不改 SpacetimeDB 表结构,因此无需更新 `migration.rs` 或重新生成 bindings。 +9. 
后端改动后运行对应 Rust 测试,并按项目约束用 `npm run api-server` 重启验证。 diff --git a/docs/technical/PUZZLE_IMAGE_ASSET_PROXY_FIX_2026-04-27.md b/docs/technical/PUZZLE_IMAGE_ASSET_PROXY_FIX_2026-04-27.md index 97eb1f04..714bc2d3 100644 --- a/docs/technical/PUZZLE_IMAGE_ASSET_PROXY_FIX_2026-04-27.md +++ b/docs/technical/PUZZLE_IMAGE_ASSET_PROXY_FIX_2026-04-27.md @@ -1,31 +1,40 @@ -# 拼图生成图片资源代理修复 +# 拼图生成图片读取链路修复 日期:`2026-04-27` +更新:`2026-05-08` + ## 背景 拼图结果页的“生成或更换图片”会在 `api-server` 中调用 DashScope 生成图片,再把候选图上传到 OSS,最终以 `/generated-puzzle-assets/...` 旧兼容路径写回 `PuzzleGeneratedImageCandidate.image_src` 与草稿封面字段。 -本次排查发现拼图图片写入路径已经进入 `platform-oss::LegacyAssetPrefix::PuzzleAssets`,但后端 Axum 旧资源代理和 Vite 本地代理没有挂载 `/generated-puzzle-assets`。这会导致候选图或正式图无法读取;后续如果把已有候选图作为参考图继续更换图片,也会让参考图读取链路失效。 +历史排查曾发现拼图图片写入路径已经进入 `platform-oss::LegacyAssetPrefix::PuzzleAssets`,但后端 Axum 旧资源代理和 Vite 本地代理没有挂载 `/generated-puzzle-assets`。当时的处理口径是补旧资源代理。 + +当前 `WP-DEL` 后,旧 `/generated-*` 直读代理已经物理删除;`/generated-puzzle-assets/...` 只允许作为 `legacyPublicPath` / OSS object key 标识。浏览器不能再直接请求该路径,必须通过 `/api/assets/read-url?legacyPublicPath=...` 换取短期签名 URL 后预览。 ## 修复口径 -1. `server-rs/crates/api-server/src/legacy_generated_assets.rs` 增加 `proxy_generated_puzzle_assets(...)`,复用统一的 OSS 签名读取逻辑。 -2. `server-rs/crates/api-server/src/app.rs` 挂载 `/generated-puzzle-assets/{*path}`,与角色、大鱼、自定义世界图片资源前缀保持一致。 -3. `vite.config.ts` 增加 `/generated-puzzle-assets` dev proxy,保证本地网页端不会因为 Vite 代理缺口读不到后端资源。 +1. `platform-oss::LEGACY_PUBLIC_PREFIXES` 必须包含 `generated-puzzle-assets`,保持直传票据、服务端上传、read-url 支持列表和错误提示同一口径。 +2. `src/services/assetReadUrlService.ts` 的 `isGeneratedLegacyPath(...)` 需要同时识别 `/generated-puzzle-assets/...` 和 `generated-puzzle-assets/...`。后者可能来自 object key 形态的历史字段。 +3. `ResolvedAssetImage` / `useResolvedAssetReadUrl` 在签名 URL 返回前不能把裸 `/generated-*` 或 `generated-*` 写进 `<img>`。 +4. `refreshKey` 只能用于跳过前端签名 URL 缓存并重新请求 `/api/assets/read-url`,不能再给 OSS V4 签名 URL 追加 `_v` 等额外 query;OSS 会把 query 纳入签名,额外参数会让新生成图变成 403/破图。 +5. 
历史素材被选为参考图后,参考图小预览也必须走 `ResolvedAssetImage`,不能使用裸 ``。 +6. 禁止恢复 `/generated-puzzle-assets/{*path}` Axum 路由或 Vite 直读代理;正式读取统一走 `/api/assets/read-url`。 ## 后续约束 1. 任何新增 `LegacyAssetPrefix` 都必须同时检查: - `platform-oss` 前缀枚举 - - `api-server` 旧资源代理路由 - - Vite dev proxy + - `platform-oss::LEGACY_PUBLIC_PREFIXES` + - `/api/assets/read-url` 入参解析 - 前端 `isGeneratedLegacyPath(...)` 是否能识别 2. 拼图候选图 JSON 仍保持 SpacetimeDB 持久化结构 `PuzzleGeneratedImageCandidate` 的 snake_case 字段,不把 HTTP camelCase 响应结构写入 `draft_json`。 3. 图片生成、OSS 读写和外部参考图解析继续留在 `api-server`,不能下沉到 SpacetimeDB reducer。 +4. 如果图片组件需要刷新 generated 私有资源,优先让 `refreshKey` 触发重新换签;不要修改已返回的 `signedUrl`。 ## 验收 -1. `npm run check:encoding` -2. `cargo check -p api-server --manifest-path server-rs/Cargo.toml` -3. `npm run api-server` 重启后,点击拼图结果页“生成或更换图片”,候选图应能写回并正常展示。 +1. `npm run test -- src\services\assetReadUrlService.test.ts src\hooks\useResolvedAssetReadUrl.test.tsx src\components\puzzle-result\PuzzleResultView.test.tsx` +2. `cargo test -p platform-oss --manifest-path server-rs\Cargo.toml` +3. `npm run check:encoding` +4. `npm run api-server` 重启后检查 `/healthz`,再点击拼图结果页“生成或更换图片”,候选图应能写回并正常展示。 diff --git a/docs/technical/PUZZLE_PICTURE_ONLY_CREATION_AND_AI_TAGS_2026-05-03.md b/docs/technical/PUZZLE_PICTURE_ONLY_CREATION_AND_AI_TAGS_2026-05-03.md index f5ed9549..fb62e4ec 100644 --- a/docs/technical/PUZZLE_PICTURE_ONLY_CREATION_AND_AI_TAGS_2026-05-03.md +++ b/docs/technical/PUZZLE_PICTURE_ONLY_CREATION_AND_AI_TAGS_2026-05-03.md @@ -19,10 +19,18 @@ 3. `puzzle-select-image` 展示为“写入正式草稿”:把首图设为第一关正式图,并同步到结果页草稿。 4. `ready` 文案提示进入结果页补作品信息;不得暗示作品名称、作品描述或作品标签已经完整生成。 +### 2026-05-08 进度页预计等待与步骤动效补充 + +1. 拼图草稿生成进度页预计等待时间固定按 `60` 秒展示和倒计时,后端真实完成后立即进入结果页,不强制等满 60 秒。 +2. 60 秒进度拆成三段:`compile` 约 12 秒,`puzzle-images` 约 42 秒,`puzzle-select-image` 约 6 秒。 +3. 生成中即使后端 `compile_puzzle_draft` 仍是一次同步 action,前端也必须按本地计时推进总进度和当前步骤进度,避免页面停在静态等待态。 +4. 每个步骤卡片都展示独立进度条;已完成步骤显示 100%,当前步骤按该段预计时长推进,后续步骤保留 0% 待处理状态。 +5. 
后端未返回前总进度最多推进到 98%,防止 UI 提前宣称生成完成;只有 action 成功并写回 `ready` 后才显示 100%。 + ## 草稿默认值 -1. 后端先由 `module-puzzle` 生成可回滚的确定性草稿,再由 `api-server` 基于画面描述调用文本模型生成第一关关卡名;模型不可用或返回非法时才降级到确定性兜底名。 -2. 第一关关卡名生成后,必须写回首关 `levelName`,并在入口直创默认场景下作为 `workTitle` 同步写入草稿和作品草稿卡。 +1. 后端先由 `module-puzzle` 生成可回滚的确定性草稿,再由 `api-server` 生成第一关关卡名。图片生成前可先基于画面描述生成临时关卡名;正式图片生成完成后,必须使用 APIMart Chat Completions 的 `gpt-4o-mini`,把正式图片 data URL 与画面描述一起传入模型,生成最终关卡名。 +2. 最终关卡名生成后,必须写回首关 `levelName`,并在入口直创默认场景下作为 `workTitle` 同步写入草稿和作品草稿卡;模型不可用、图片压缩失败或返回非法时,才保留前一步文本名或确定性兜底名。 3. `workDescription` 默认保持空字符串,不再回退为画面描述。 4. `themeTags` 默认保持空数组,不再由入口画面描述自动推断为正式作品标签。 5. `formDraft` 只保留 `pictureDescription`,`workTitle` 与 `workDescription` 为空。 diff --git a/docs/technical/PUZZLE_SINGLE_PLAYER_AND_REAL_IMAGE_PLAN_2026-04-24.md b/docs/technical/PUZZLE_SINGLE_PLAYER_AND_REAL_IMAGE_PLAN_2026-04-24.md index 258a4cb6..939461ab 100644 --- a/docs/technical/PUZZLE_SINGLE_PLAYER_AND_REAL_IMAGE_PLAN_2026-04-24.md +++ b/docs/technical/PUZZLE_SINGLE_PLAYER_AND_REAL_IMAGE_PLAN_2026-04-24.md @@ -48,7 +48,7 @@ 不能继续写到仓库本地 `public/generated-puzzle-covers/*`。 -这些路径只是前后端 DTO 里的兼容标识,不是浏览器可以直接裸读的公开资源地址。实际图片对象存放在私有 OSS 中,前端渲染前必须先通过 `/api/assets/read-url?legacyPublicPath=...` 换取签名读 URL;签名 URL 未返回或换签失败时,图片组件不能把 `/generated-puzzle-assets/*` 直接写入 `<img>`,避免浏览器发起无签名、无鉴权请求。 +这些路径只是前后端 DTO 里的兼容标识,不是浏览器可以直接裸读的公开资源地址。实际图片对象存放在私有 OSS 中,前端渲染前必须先通过 `/api/assets/read-url?legacyPublicPath=...` 换取签名读 URL;签名 URL 未返回或换签失败时,图片组件不能把 `/generated-puzzle-assets/*` 或无前导斜杠的 `generated-puzzle-assets/*` 直接写入 `<img>`,避免浏览器发起无签名、无鉴权请求。 ### 4.2 运行态边界 diff --git a/docs/technical/PUZZLE_TEMPLATE_FORM_AND_GPT_IMAGE_SKILL_2026-05-03.md b/docs/technical/PUZZLE_TEMPLATE_FORM_AND_GPT_IMAGE_SKILL_2026-05-03.md index cb7882d1..af43dbe1 100644 --- a/docs/technical/PUZZLE_TEMPLATE_FORM_AND_GPT_IMAGE_SKILL_2026-05-03.md +++ b/docs/technical/PUZZLE_TEMPLATE_FORM_AND_GPT_IMAGE_SKILL_2026-05-03.md @@ -32,14 +32,18 @@ - 上传区自身就是图片卡片,不再额外封装为 `platform-subpanel` 模块壳。 - 亮色主题下上传卡片必须使用白色或暖浅色卡面,不得显示整块黑色底。 - 
上传卡片固定为 1:1 正方形,避免拼图主画面在首屏出现非正方形预期。 - - 上传卡片底部不再叠加文件名 bar;卡片下方只保留 `点击上传拼图图片` 纯文字入口。 + - 移动端表单主体不可依赖纵向拖动查看核心控件;玩法卡带、描述输入框和底部生成按钮占位固定后,上传卡片必须按剩余高度等比例缩放,仍保持 1:1。 + - 上传卡片底部不再叠加文件名 bar;`点击上传拼图图片` 入口必须显示在拼图画面卡片内部。 - 上传卡片上方固定展示 `拼图画面` 标题。 - - 叠在上传卡片上的 `AI重绘` 和图标必须和卡面保持足够对比,避免浅色主题重映射后不可读。 -3. 画面描述输入框高度约为旧版大输入框的 1/2,避免移动端把上传参考图和提交区挤出首屏。 -4. 输入区保留: + - 无图状态下,上传卡片内部、`点击上传拼图图片` 按钮上方展示 11px 级辅助提示 `若没有合适的图片可以通过填写画面描述生成画面`,提示用户可不上传图片、直接填写画面描述生成画面。 + - 上传成功后,`AI重绘` 开关显示在卡片左下角,右上角显示移除拼图图片图标按钮;移除必须先弹出二次确认。 + - 叠在上传卡片上的 `AI重绘`、移除图标和上传入口必须和卡面保持足够对比,避免浅色主题重映射后不可读。 +3. 画面描述输入框高度固定,移动端保持约 `6rem`,不随剩余屏幕高度变大或变小,避免把上传参考图和提交区挤出首屏。 +4. 创作 Tab 顶部玩法卡带的选中态只使用卡内暗色蒙版、细描边或内描边,不使用粉色外发光、外扩阴影或会从卡片边缘突出的高饱和边。 +5. 输入区保留: - 上传拼图图片按钮。 - 图片模型切换按钮。 -5. 输入区不保留: +6. 输入区不保留: - `try` 文本。 - 示例 prompt chip。 - 画面描述输入框默认提示词或占位示例。 @@ -73,8 +77,9 @@ Skill 封装仓库现有后端口径: POST {APIMART_BASE_URL}/images/generations Authorization: Bearer {APIMART_API_KEY} model = gpt-image-2 -size = 1:1 n = 1 +official_fallback = true +size = 1:1 ``` 响应兼容: @@ -87,7 +92,7 @@ n = 1 ## 2026-05-07 AI 重绘与上传直用 -拼图入口上传区右上角新增 `AI重绘` 开关,默认打开;未上传拼图图片前不显示开关,上传成功后才显示。 +拼图入口上传区左下角展示 `AI重绘` 开关,默认打开;未上传拼图图片前不显示开关,上传成功后才显示。上传成功后右上角展示移除图标按钮,点击后必须二次确认。 1. `AI重绘=true` - 上传区文案为 `点击上传拼图图片`,上传图作为生图参考图。 @@ -104,6 +109,7 @@ n = 1 3. 上传裁剪 - 前端读取上传图原始宽高。 - 非 1:1 图片必须先弹出正方形裁剪工具,裁剪完成后再进入表单状态和提交 payload。 + - 裁剪工具必须在完整原图上展示正方形裁剪框,支持拖拽框内区域移动,以及拖拽四边或四角调整裁剪边界,不再展示 `缩放 / 横向 / 纵向` 参数滑杆。 - 裁剪输出仍按参考图体积预算压缩,避免上传图撑爆 JSON body。 契约字段同步: @@ -117,13 +123,13 @@ Rust 共享契约使用 `ai_redraw: Option` 并按 camelCase 序列化为 ## 验收 -1. 点击拼图创作后,表单首屏呈现大参考图区和半高文本输入框。 +1. 点击拼图创作后,移动端表单无需纵向拖动即可看到大参考图区、固定高度文本输入框和 `生成拼图游戏草稿` 按钮。 2. 输入框里没有 `try` 示例功能。 3. 图片模型切换仍可打开并选择 `gpt-image-2` / `nanobanana2`。 4. 历史模板样例图文件可保留,但不出现在拼图入口表单。 5. 当前创作 Tab 顶部的拼图、方洞挑战、视觉小说和 AIRP 卡片能看到对应 `creation-type-references` 图片。 6. 默认 `AI重绘` 打开时,无图状态展示 `画面描述` 与 `消耗2光点`;上传图片后输入框标题改为 `画面AI重绘要求(提示词)`。 7. 关闭 `AI重绘` 后隐藏画面描述输入框,生成按钮不展示 `消耗2光点`,后端直接应用上传图片为第一关图片。 -8. 上传非 1:1 图片时必须先完成正方形裁剪。 +8. 上传非 1:1 图片时必须先通过拖拽裁剪框完成正方形裁剪。 9. 
gpt-image-2 Skill 校验通过,且脚本 dry-run 能输出计划请求而不泄露密钥。 10. `npm run check:encoding` 通过。 diff --git a/docs/technical/README.md b/docs/technical/README.md index 76fe422e..08ea4d7a 100644 --- a/docs/technical/README.md +++ b/docs/technical/README.md @@ -5,6 +5,8 @@ ## 文档列表 - [RUST_WORKSPACE_DEPENDENCY_CONSOLIDATION_2026-05-07.md](./RUST_WORKSPACE_DEPENDENCY_CONSOLIDATION_2026-05-07.md):记录 `server-rs` Cargo 依赖集中配置口径,第三方版本和 workspace 内部 crate path 统一维护在根 `server-rs/Cargo.toml`,成员 crate 只保留 feature/optional 差异。 +- [VOLCENGINE_SPEECH_STREAMING_INTEGRATION_2026-05-08.md](./VOLCENGINE_SPEECH_STREAMING_INTEGRATION_2026-05-08.md):记录火山引擎大模型 ASR 双向流式、TTS WebSocket 双向流式和 TTS HTTP SSE 单向流式的后端代理、环境变量、协议帧和验收边界。 +- [VECTOR_ENGINE_AUDIO_GENERATION_SUNO_VIDU_2026-05-08.md](./VECTOR_ENGINE_AUDIO_GENERATION_SUNO_VIDU_2026-05-08.md):记录视觉小说结果页接入 VectorEngine Suno 文生背景音乐与 Vidu 文生音效的接口、环境变量、后端路由、OSS 资产回写和前端弹层交互边界。 - [API_SERVER_EXTERNAL_SERVICE_ENV_CONFIG_2026-05-07.md](./API_SERVER_EXTERNAL_SERVICE_ENV_CONFIG_2026-05-07.md):冻结 api-server 外部服务配置边界,公共服务 URL 可保留代码默认值,非公共模型名和私有网关 URL 统一通过环境变量注入。 - [CREATIVE_INTERACTIVE_CONTENT_AGENT_TECHNICAL_SOLUTION_2026-05-05.md](./CREATIVE_INTERACTIVE_CONTENT_AGENT_TECHNICAL_SOLUTION_2026-05-05.md):冻结基于 LangChain-Rust 的创意互动内容生成 Agent 技术方案,明确首版只支持拼图模板、必须显式展示模板选择和积分范围,通过拼图模块 Tool/模板协议填充同一份草稿字段,支持单关卡与多关卡图片生成、立即试玩、表单化编辑和 Agent 自然语言修订草稿字段。 - [VISUAL_NOVEL_PROMPT_AND_LLM_TOOLS_VN03_2026-05-05.md](./VISUAL_NOVEL_PROMPT_AND_LLM_TOOLS_VN03_2026-05-05.md):记录视觉小说模板 `VN-03` Prompt / LLM 工具落地,包含创作底稿 Prompt、运行时 GM Prompt、repair Prompt、工具参数 schema、Responses 请求口径和定向验证结果。 diff --git a/docs/technical/RPG_IMAGE_GENERATION_GPT_IMAGE_2_MIGRATION_2026-05-02.md b/docs/technical/RPG_IMAGE_GENERATION_GPT_IMAGE_2_MIGRATION_2026-05-02.md index 8040f66e..276adfcf 100644 --- a/docs/technical/RPG_IMAGE_GENERATION_GPT_IMAGE_2_MIGRATION_2026-05-02.md +++ b/docs/technical/RPG_IMAGE_GENERATION_GPT_IMAGE_2_MIGRATION_2026-05-02.md @@ -36,8 +36,9 @@ model = gpt-image-2 1. `model` 2. `prompt` 3. 
`n` -4. `size` -5. 有参考图时增加 `image_urls` +4. `official_fallback = true` +5. `size` +6. 有参考图时增加 `image_urls` 尺寸归一规则: @@ -67,8 +68,8 @@ APIMART_IMAGE_REQUEST_TIMEOUT_MS=180000 ## 验收 -1. 角色主图生成请求上游 `model` 为 `gpt-image-2`。 -2. 场景图生成请求上游 `model` 为 `gpt-image-2`。 +1. 角色主图生成请求上游 `model` 为 `gpt-image-2`,且携带 `official_fallback = true`。 +2. 场景图生成请求上游 `model` 为 `gpt-image-2`,且携带 `official_fallback = true`。 3. 旧前端或历史草稿传 `wan2.7-image-pro` 时不会回退旧模型。 4. 场景参考图生成仍能把参考图 Data URL 放入 `image_urls`。 5. 角色主图生成后仍执行原有 PNG 透明背景处理与 OSS 写入。 diff --git a/docs/technical/VECTOR_ENGINE_AUDIO_GENERATION_SUNO_VIDU_2026-05-08.md b/docs/technical/VECTOR_ENGINE_AUDIO_GENERATION_SUNO_VIDU_2026-05-08.md new file mode 100644 index 00000000..98e0389d --- /dev/null +++ b/docs/technical/VECTOR_ENGINE_AUDIO_GENERATION_SUNO_VIDU_2026-05-08.md @@ -0,0 +1,156 @@ +# VectorEngine 音频生成接入方案 2026-05-08 + +## 1. 范围 + +本方案用于把 VectorEngine / Apifox 文档中的 Suno 文生背景音乐与 Vidu 文生音效接入视觉小说结果页。 + +本次只接入 `visual-novel` 现有场景资产槽位,不新增独立音频系统、不新增 SpacetimeDB 表、不把供应商密钥下发到前端。 + +## 2. 
参考接口 + +### 2.1 Suno 文生背景音乐 + +参考文档: + +- `https://vectorengine.apifox.cn/api-349239190` +- `https://vectorengine.apifox.cn/api-349239199` + +接口: + +```text +POST /suno/submit/music +GET /suno/fetch/{task_id} +``` + +提交请求头: + +```text +Content-Type: application/json +Accept: application/json +Authorization: Bearer {VECTOR_ENGINE_API_KEY} +``` + +自定义模式请求体: + +```json +{ + "prompt": "音乐描述或歌词", + "mv": "chirp-v4", + "title": "曲名", + "tags": "风格标签", + "continue_at": 120, + "continue_clip_id": "", + "task": "" +} +``` + +首版只使用 `prompt`、`mv`、`title`、`tags`。返回体按 `code = success` 且 `data` 为供应商任务 ID 处理。 + +### 2.2 Vidu 文生音效 + +参考文档: + +- `https://vectorengine.apifox.cn/api-417728889` +- `https://vectorengine.apifox.cn/api-417728893` + +接口: + +```text +POST /ent/v2/text2audio +GET /ent/v2/tasks/{id}/creations +``` + +提交请求体: + +```json +{ + "model": "audio1.0", + "prompt": "雨滴落在窗户上的声音,伴随着轻柔的雷声", + "duration": 5, + "seed": 0 +} +``` + +约束: + +- `prompt` 最长 1500 字符。 +- `duration` 范围为 2 到 10 秒,默认 5 秒。 +- `model` 首版固定为 `audio1.0`。 + +## 3. 环境变量 + +```text +VECTOR_ENGINE_BASE_URL= +VECTOR_ENGINE_API_KEY= +VECTOR_ENGINE_AUDIO_REQUEST_TIMEOUT_MS=180000 +``` + +说明: + +1. `VECTOR_ENGINE_BASE_URL` 只保存供应商代理 API 基础地址,不在代码中写死私有网关。 +2. `VECTOR_ENGINE_API_KEY` 只能进入本地或生产私密环境文件,不提交到 Git。 +3. 缺少任一必配项时,后端返回 `503 SERVICE_UNAVAILABLE`,前端沿用现有错误展示。 + +## 4. 后端路由 + +视觉小说创作链新增 4 个鉴权路由: + +| 方法 | 路由 | 用途 | +| --- | --- | --- | +| `POST` | `/api/creation/visual-novel/audio/background-music` | 提交 Suno 背景音乐任务 | +| `POST` | `/api/creation/visual-novel/audio/background-music/{task_id}/asset` | 查询 Suno 任务,完成后下载并写入平台资产 | +| `POST` | `/api/creation/visual-novel/audio/sound-effect` | 提交 Vidu 音效任务 | +| `POST` | `/api/creation/visual-novel/audio/sound-effect/{task_id}/asset` | 查询 Vidu 任务,完成后下载并写入平台资产 | + +生成资产回包写入既有视觉小说字段: + +- Suno 背景音乐:`VisualNovelSceneDraft.musicSrc` +- Vidu 文生音效:`VisualNovelSceneDraft.ambientSoundSrc` + +## 5. 
资产落点 + +音频文件由后端下载后通过 `OssClient::put_object` 写入平台 OSS,并确认 `asset_object` 与 `asset_entity_binding`。 + +对象规划: + +| 类型 | `assetKind` | `entityKind` | `slot` | 旧路径前缀 | +| --- | --- | --- | --- | --- | +| 背景音乐 | `visual_novel_music` | `visual_novel_scene` | `music` | `generated-custom-world-scenes` | +| 场景音效 | `visual_novel_ambient_sound` | `visual_novel_scene` | `ambient_sound` | `generated-custom-world-scenes` | + +确认后的 `audioSrc` 使用 OSS 返回的 legacy public path,继续由前端 `resolveAssetReadUrl` 换签播放。 + +## 6. 前端交互 + +视觉小说场景编辑弹层新增两类音频能力: + +1. `音乐` 保留上传能力,并新增 Suno 生成按钮。 +2. `音效` 使用 `ambientSoundSrc`,支持上传和 Vidu 生成。 + +交互要求: + +1. 生成参数放在独立弹层中,不在当前场景面板下方展开。 +2. 弹层只保留必要字段、提交、关闭和状态反馈,不展示供应商规则说明。 +3. 任务提交成功后前端轮询资产接口;若供应商仍在处理,保持弹层状态。 +4. 资产生成完成后自动写回当前场景字段。 + +## 7. 验收 + +建议执行: + +```bash +npm run check:encoding +npm run typecheck + +cd server-rs +cargo test -p shared-contracts visual_novel +cargo check -p api-server +``` + +涉及真实 API smoke 时: + +1. 只在本地私密环境设置 `VECTOR_ENGINE_API_KEY`。 +2. 使用 `npm run api-server` 重启后端。 +3. 确认 `/healthz`。 +4. 在视觉小说结果页提交背景音乐或音效生成,生成完成后确认场景音频槽位可播放。 + diff --git a/docs/technical/VOLCENGINE_SPEECH_STREAMING_INTEGRATION_2026-05-08.md b/docs/technical/VOLCENGINE_SPEECH_STREAMING_INTEGRATION_2026-05-08.md new file mode 100644 index 00000000..eda265b1 --- /dev/null +++ b/docs/technical/VOLCENGINE_SPEECH_STREAMING_INTEGRATION_2026-05-08.md @@ -0,0 +1,225 @@ +# 火山引擎大模型语音流式接入 2026-05-08 + +## 背景 + +本次接入火山引擎豆包语音能力,覆盖两类运行态语音链路: + +1. 大模型流式语音识别 ASR,使用 WebSocket 双向流式优化模式。 +2. 
大模型语音合成 TTS,使用实时交互场景的 WebSocket 双向流式接口,并提供一次性文本输入的 HTTP SSE 单向流式接口。 + +语音能力属于外部副作用,按 server-rs DDD 分层落在 `platform-speech`;`api-server` 只负责平台账号鉴权、环境配置校验、协议代理和错误映射。前端不得直接持有火山引擎密钥。 + +## 官方文档依据 + +- ASR:`https://www.volcengine.com/docs/6561/1354869?lang=zh` +- TTS WebSocket 双向流式:`https://www.volcengine.com/docs/6561/1329505?lang=zh` +- TTS WebSocket 单向流式:`https://www.volcengine.com/docs/6561/1719100?lang=zh` +- TTS HTTP Chunked / SSE 单向流式:`https://www.volcengine.com/docs/6561/1598757?lang=zh` + +## 环境变量 + +真实值只能放在本地未提交的 `.env.local` / `.env.secrets.local` 或生产服务器环境文件,禁止提交到仓库。 + +```text +VOLCENGINE_SPEECH_API_KEY= +VOLCENGINE_SPEECH_APP_ID= +VOLCENGINE_SPEECH_ACCESS_KEY= +VOLCENGINE_SPEECH_ASR_RESOURCE_ID=volc.seedasr.sauc.concurrent +VOLCENGINE_SPEECH_TTS_RESOURCE_ID=seed-tts-2.0 +VOLCENGINE_SPEECH_REQUEST_TIMEOUT_MS=180000 +VOLCENGINE_SPEECH_ASR_WS_URL=wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async +VOLCENGINE_SPEECH_TTS_BIDIRECTION_WS_URL=wss://openspeech.bytedance.com/api/v3/tts/bidirection +VOLCENGINE_SPEECH_TTS_SSE_URL=https://openspeech.bytedance.com/api/v3/tts/unidirectional/sse +``` + +配置规则: + +1. 优先使用新版控制台 `VOLCENGINE_SPEECH_API_KEY`,上游请求头写 `X-Api-Key`。 +2. 若只配置旧版控制台信息,则使用 `VOLCENGINE_SPEECH_APP_ID` 和 `VOLCENGINE_SPEECH_ACCESS_KEY`。 +3. ASR 默认资源 ID 选 ASR 2.0 并发版;如账号是小时版,部署时改成 `volc.seedasr.sauc.duration`。 +4. TTS 默认资源 ID 选 `seed-tts-2.0`;旧音色或 1.0 计费资源由部署环境覆盖。 + +## ASR 协议边界 + +客户端连接: + +```text +GET /api/speech/volcengine/asr/stream +Authorization: Bearer +``` + +浏览器与 `api-server` 使用 WebSocket 二进制帧透传: + +1. 首包必须是 JSON 文本,表示 ASR full client request 的业务参数。 +2. 后续二进制帧是音频分片。 +3. 浏览器发送文本帧 `{"type":"finish"}` 时,后端把最后一个空音频包按负包发送给火山。 +4. 后端把火山 full server response 解析成 JSON 文本帧发回浏览器。 + +ASR 上游连接: + +```text +wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async +X-Api-Key: +X-Api-Resource-Id: +X-Api-Request-Id: +X-Api-Sequence: -1 +``` + +ASR 二进制协议: + +1. 4 字节 header,大端整数。 +2. full client request:message type `0b0001`,JSON 序列化,gzip 压缩。 +3. 
audio only request:message type `0b0010`,raw payload,gzip 压缩。 +4. 最后一包音频使用 flags `0b0010`。 +5. full server response:message type `0b1001`,payload 为 gzip JSON。 +6. error response:message type `0b1111`,payload 为错误 JSON 或 UTF-8 文本。 + +首包参数由前端传入,但后端会兜底: + +```json +{ + "user": { "uid": "current-user-id" }, + "audio": { + "format": "pcm", + "codec": "raw", + "rate": 16000, + "bits": 16, + "channel": 1 + }, + "request": { + "model_name": "bigmodel", + "enable_itn": true, + "enable_punc": true, + "show_utterances": true, + "result_type": "full" + } +} +``` + +## TTS 协议边界 + +### WebSocket 双向流式 + +客户端连接: + +```text +GET /api/speech/volcengine/tts/bidirection +Authorization: Bearer +``` + +浏览器向后端发送 JSON 文本帧: + +```json +{ "type": "start_connection" } +{ "type": "start_session", "sessionId": "...", "payload": { "user": {}, "req_params": {} } } +{ "type": "task_request", "sessionId": "...", "payload": { "req_params": { "text": "..." } } } +{ "type": "finish_session", "sessionId": "..." } +{ "type": "finish_connection" } +``` + +后端转成火山 WebSocket V3 二进制帧,并把上游返回帧统一解析成 JSON 文本或音频二进制帧回传浏览器。 + +TTS 双向上游连接: + +```text +wss://openspeech.bytedance.com/api/v3/tts/bidirection +X-Api-Key: +X-Api-Resource-Id: +X-Api-Connect-Id: +``` + +V3 事件帧: + +1. Full-client request + event number 用于 `StartConnection`、`StartSession`、`TaskRequest`、`FinishSession`、`FinishConnection`。 +2. Full-server response + event number 用于 `ConnectionStarted`、`SessionStarted`、`SessionFinished` 等状态事件。 +3. Audio-only response + event number 用于返回音频二进制。 +4. 
错误帧必须转成结构化 JSON 错误,不把上游密钥或完整请求头写入日志。 + +### HTTP SSE 单向流式 + +客户端请求: + +```text +POST /api/speech/volcengine/tts/sse +Authorization: Bearer +Content-Type: application/json +Accept: text/event-stream +``` + +请求体: + +```json +{ + "text": "你好,欢迎来到百梦。", + "speaker": "zh_female_cancan_mars_bigtts", + "audioParams": { + "format": "mp3", + "sampleRate": 24000 + } +} +``` + +后端转换为火山 HTTP SSE 请求体: + +```json +{ + "user": { "uid": "current-user-id" }, + "req_params": { + "text": "...", + "speaker": "...", + "audio_params": { + "format": "mp3", + "sample_rate": 24000 + } + } +} +``` + +上游 SSE 的常见事件: + +1. `352`:TTSResponse,`data` 为 base64 音频片段。 +2. `351`:TTSSentenceEnd,`sentence` 为字幕或时间戳数据。 +3. `152`:SessionFinish,合成完成,可含 `usage.text_words`。 +4. `153`:SessionFailed,合成失败。 + +后端保持 SSE 形态透传,但会补齐平台 `requestId` 与上游 `X-Tt-Logid` 作为排障信息。 + +## api-server 路由 + +| 方法 | 路由 | 说明 | +|---|---|---| +| `GET` | `/api/speech/volcengine/config` | 返回前端可见的默认资源和推荐音频参数,不返回密钥 | +| `GET` | `/api/speech/volcengine/asr/stream` | ASR WebSocket 双向流式代理 | +| `GET` | `/api/speech/volcengine/tts/bidirection` | TTS WebSocket 双向流式代理 | +| `POST` | `/api/speech/volcengine/tts/sse` | TTS HTTP SSE 单向流式代理 | + +所有路由必须走 `require_bearer_auth`。 + +## 验收 + +代码级验收: + +```bash +cargo fmt --manifest-path server-rs/Cargo.toml --all --check +cargo test --manifest-path server-rs/Cargo.toml -p platform-speech +cargo test --manifest-path server-rs/Cargo.toml -p api-server volcengine_speech +cargo check --manifest-path server-rs/Cargo.toml -p api-server +npm run check:encoding +``` + +联调验收: + +1. 启动 `npm run api-server`。 +2. 检查 `/healthz` 返回 200。 +3. 未登录访问语音路由返回 401。 +4. 已登录后 `/api/speech/volcengine/config` 不返回任何密钥字段。 +5. ASR WebSocket 发送首包和 200ms PCM 分片后能收到识别 JSON。 +6. TTS SSE 能收到 `352` 音频事件与最终 `152` 完成事件。 +7. TTS 双向 WebSocket 能复用连接完成至少一个 session。 + +## 注意事项 + +1. 不把 `VOLCENGINE_ACCESS_KEY_ID`、`VOLCENGINE_SECRET_ACCESS_KEY`、API Key、Access Token 或完整 Authorization 写入文档、日志、测试快照或前端状态。 +2. 
中文语音默认使用 16k 单声道 PCM ASR;TTS 默认使用 24k mp3,运行时可按玩法需要改为 pcm。 +3. 火山返回的 `X-Tt-Logid` 是排障关键信息,应记录 logid,但不能记录密钥。 +4. 语音流式能力是平台副作用,不涉及 SpacetimeDB 表结构变更,本次无需修改 `migration.rs`。 diff --git a/packages/shared/src/contracts/visualNovel.ts b/packages/shared/src/contracts/visualNovel.ts index 1ad3e0a2..167824ab 100644 --- a/packages/shared/src/contracts/visualNovel.ts +++ b/packages/shared/src/contracts/visualNovel.ts @@ -7,7 +7,14 @@ export type VisualNovelCharacterRole = | 'antagonist' | 'background'; -export type VisualNovelAssetSource = 'platform_asset' | 'generated' | 'external'; +export type VisualNovelAssetSource = + | 'platform_asset' + | 'generated' + | 'external'; + +export type VisualNovelAudioGenerationKind = + | 'background_music' + | 'sound_effect'; export type VisualNovelSceneAvailability = | 'opening' @@ -257,6 +264,41 @@ export interface VisualNovelCompileResponse { work: VisualNovelWorkDetail; } +export interface CreateVisualNovelBackgroundMusicRequest { + prompt: string; + title: string; + tags?: string | null; + model?: string | null; +} + +export interface CreateVisualNovelSoundEffectRequest { + prompt: string; + duration?: number | null; + seed?: number | null; +} + +export interface VisualNovelAudioGenerationTaskResponse { + kind: VisualNovelAudioGenerationKind; + taskId: string; + provider: string; + status: string; +} + +export interface PublishVisualNovelGeneratedAudioAssetRequest { + sceneId: string; + profileId?: string | null; +} + +export interface VisualNovelGeneratedAudioAssetResponse { + kind: VisualNovelAudioGenerationKind; + taskId: string; + provider: string; + status: string; + assetObjectId?: string | null; + assetKind?: string | null; + audioSrc?: string | null; +} + export interface SendVisualNovelMessageRequest { clientMessageId: string; text: string; diff --git a/public/ui-previews/puzzle-image-compact-ui-2026-05-08.png b/public/ui-previews/puzzle-image-compact-ui-2026-05-08.png new file mode 100644 index 00000000..5f9ca4e0 Binary 
files /dev/null and b/public/ui-previews/puzzle-image-compact-ui-2026-05-08.png differ diff --git a/scripts/dev-web-rust.mjs b/scripts/dev-web-rust.mjs index 2a92a4a8..98d1ed10 100644 --- a/scripts/dev-web-rust.mjs +++ b/scripts/dev-web-rust.mjs @@ -37,22 +37,93 @@ loadEnvFile(resolve(repoRoot, '.env'), fileEnv); loadEnvFile(resolve(repoRoot, '.env.local'), fileEnv); loadEnvFile(resolve(repoRoot, '.env.secrets.local'), fileEnv); +function buildTargetCandidates() { + const candidates = [ + fileEnv.GENARRATIVE_RUNTIME_SERVER_TARGET, + fileEnv.RUST_SERVER_TARGET, + fileEnv.GENARRATIVE_API_TARGET, + `http://127.0.0.1:${fileEnv.GENARRATIVE_API_PORT || '3100'}`, + 'http://127.0.0.1:8082', + 'http://127.0.0.1:3100', + ].filter(Boolean); + + return Array.from(new Set(candidates)); +} + +async function isTargetReachable(target) { + const healthUrl = new URL('/healthz', target); + const controller = new AbortController(); + const timeoutId = setTimeout(() => controller.abort(), 1200); + + try { + const response = await fetch(healthUrl, { + method: 'GET', + signal: controller.signal, + }); + + return response.ok; + } catch { + return false; + } finally { + clearTimeout(timeoutId); + } +} + +async function resolveRuntimeTarget() { + const candidates = buildTargetCandidates(); + const reachableTargets = []; + + for (const target of candidates) { + if (await isTargetReachable(target)) { + reachableTargets.push(target); + if ( + target === fileEnv.GENARRATIVE_RUNTIME_SERVER_TARGET || + target === fileEnv.RUST_SERVER_TARGET || + target === fileEnv.GENARRATIVE_API_TARGET + ) { + return { + target, + fallbackUsed: false, + }; + } + } + } + + if (reachableTargets.length > 0) { + return { + target: reachableTargets[0], + fallbackUsed: true, + }; + } + + return { + target: + fileEnv.GENARRATIVE_RUNTIME_SERVER_TARGET || + fileEnv.RUST_SERVER_TARGET || + fileEnv.GENARRATIVE_API_TARGET || + `http://127.0.0.1:${fileEnv.GENARRATIVE_API_PORT || '3100'}`, + fallbackUsed: false, + }; +} + 
+const runtimeTarget = await resolveRuntimeTarget(); +if (runtimeTarget.fallbackUsed) { + console.warn( + `[dev:web] 配置的 Rust target 不可用,已切换到 ${runtimeTarget.target}`, + ); +} + const mergedEnv = { ...fileEnv, - RUST_SERVER_TARGET: - fileEnv.RUST_SERVER_TARGET || - fileEnv.GENARRATIVE_API_TARGET || - `http://127.0.0.1:${fileEnv.GENARRATIVE_API_PORT || '3100'}`, + RUST_SERVER_TARGET: runtimeTarget.target, + GENARRATIVE_RUNTIME_SERVER_TARGET: runtimeTarget.target, }; -mergedEnv.GENARRATIVE_RUNTIME_SERVER_TARGET = - fileEnv.GENARRATIVE_RUNTIME_SERVER_TARGET || mergedEnv.RUST_SERVER_TARGET; - console.log(`[dev:web] backend=rust target=${mergedEnv.GENARRATIVE_RUNTIME_SERVER_TARGET}`); const child = spawn( 'node', - ['scripts/vite-cli.mjs', '--port=3000', '--host=0.0.0.0'], + ['scripts/vite-cli.mjs', '--port=3000', '--host=0.0.0.0', '--strictPort'], { cwd: process.cwd(), env: mergedEnv, diff --git a/server-rs/Cargo.lock b/server-rs/Cargo.lock index d9833b19..c809ecda 100644 --- a/server-rs/Cargo.lock +++ b/server-rs/Cargo.lock @@ -82,6 +82,7 @@ dependencies = [ "axum", "base64 0.22.1", "dotenvy", + "futures-util", "hmac", "http-body-util", "image", @@ -106,6 +107,7 @@ dependencies = [ "platform-auth", "platform-llm", "platform-oss", + "platform-speech", "reqwest 0.12.28", "serde", "serde_json", @@ -117,6 +119,7 @@ dependencies = [ "time", "tokio", "tokio-stream", + "tokio-tungstenite 0.27.0", "tower", "tower-http", "tracing", @@ -221,6 +224,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "31b698c5f9a010f6573133b09e0de5408834d0c82f8d7475a89fc1867a71cd90" dependencies = [ "axum-core", + "base64 0.22.1", "bytes", "form_urlencoded", "futures-util", @@ -239,8 +243,10 @@ dependencies = [ "serde_json", "serde_path_to_error", "serde_urlencoded", + "sha1", "sync_wrapper 1.0.2", "tokio", + "tokio-tungstenite 0.29.0", "tower", "tower-layer", "tower-service", @@ -1220,7 +1226,7 @@ dependencies = [ "tokio", "tokio-rustls", "tower-service", - 
"webpki-roots", + "webpki-roots 1.0.7", ] [[package]] @@ -2204,6 +2210,22 @@ dependencies = [ "tokio", ] +[[package]] +name = "platform-speech" +version = "0.1.0" +dependencies = [ + "base64 0.22.1", + "bytes", + "flate2", + "futures-util", + "reqwest 0.12.28", + "serde", + "serde_json", + "tokio", + "tokio-tungstenite 0.27.0", + "uuid", +] + [[package]] name = "png" version = "0.18.1" @@ -2620,7 +2642,7 @@ dependencies = [ "wasm-bindgen-futures", "wasm-streams", "web-sys", - "webpki-roots", + "webpki-roots 1.0.7", ] [[package]] @@ -3378,7 +3400,7 @@ dependencies = [ "spacetimedb-schema", "thiserror 1.0.69", "tokio", - "tokio-tungstenite", + "tokio-tungstenite 0.27.0", ] [[package]] @@ -3699,9 +3721,25 @@ dependencies = [ "futures-util", "log", "native-tls", + "rustls", + "rustls-pki-types", "tokio", "tokio-native-tls", - "tungstenite", + "tokio-rustls", + "tungstenite 0.27.0", + "webpki-roots 0.26.11", +] + +[[package]] +name = "tokio-tungstenite" +version = "0.29.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "8f72a05e828585856dacd553fba484c242c46e391fb0e58917c942ee9202915c" +dependencies = [ + "futures-util", + "log", + "tokio", + "tungstenite 0.29.0", ] [[package]] @@ -3884,12 +3922,30 @@ dependencies = [ "log", "native-tls", "rand 0.9.4", + "rustls", + "rustls-pki-types", "sha1", "thiserror 2.0.18", "url", "utf-8", ] +[[package]] +name = "tungstenite" +version = "0.29.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "6c01152af293afb9c7c2a57e4b559c5620b421f6d133261c60dd2d0cdb38e6b8" +dependencies = [ + "bytes", + "data-encoding", + "http 1.4.0", + "httparse", + "log", + "rand 0.9.4", + "sha1", + "thiserror 2.0.18", +] + [[package]] name = "type1-encoding-parser" version = "0.1.1" @@ -4169,6 +4225,15 @@ dependencies = [ "libwebp-sys", ] +[[package]] +name = "webpki-roots" +version = "0.26.11" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = 
"521bc38abb08001b01866da9f51eb7c5d647a19260e00054a8c7fd5f9e57f7a9" +dependencies = [ + "webpki-roots 1.0.7", +] + [[package]] name = "webpki-roots" version = "1.0.7" diff --git a/server-rs/Cargo.toml b/server-rs/Cargo.toml index c85b4ff5..0e744652 100644 --- a/server-rs/Cargo.toml +++ b/server-rs/Cargo.toml @@ -30,6 +30,7 @@ members = [ "crates/platform-oss", "crates/platform-auth", "crates/platform-llm", + "crates/platform-speech", "crates/platform-agent", "crates/shared-contracts", "crates/shared-kernel", @@ -69,6 +70,7 @@ platform-agent = { path = "crates/platform-agent", default-features = false } platform-auth = { path = "crates/platform-auth", default-features = false } platform-llm = { path = "crates/platform-llm", default-features = false } platform-oss = { path = "crates/platform-oss", default-features = false } +platform-speech = { path = "crates/platform-speech", default-features = false } shared-contracts = { path = "crates/shared-contracts", default-features = false } shared-kernel = { path = "crates/shared-kernel", default-features = false } shared-logging = { path = "crates/shared-logging", default-features = false } @@ -79,7 +81,10 @@ async-stream = "0.3" async-trait = "0.1" axum = "0.8" base64 = "0.22" +bytes = "1" dotenvy = "0.15" +flate2 = "1" +futures-util = "0.3" hmac = "0.12" http-body-util = "0.1" image = { version = "0.25", default-features = false } @@ -98,6 +103,7 @@ spacetimedb-lib = { version = "2.2.0", default-features = false } time = "0.3" tokio = "1" tokio-stream = "0.1" +tokio-tungstenite = "0.27" tower = "0.5" tower-http = "0.6" tracing = "0.1" diff --git a/server-rs/crates/api-server/Cargo.toml b/server-rs/crates/api-server/Cargo.toml index 1d987c59..117b95b0 100644 --- a/server-rs/crates/api-server/Cargo.toml +++ b/server-rs/crates/api-server/Cargo.toml @@ -6,7 +6,7 @@ license.workspace = true [dependencies] async-stream = { workspace = true } -axum = { workspace = true } +axum = { workspace = true, features = ["ws"] } base64 = { 
workspace = true } dotenvy = { workspace = true } image = { workspace = true, features = ["jpeg", "png", "webp"] } @@ -33,6 +33,7 @@ platform-agent = { workspace = true } platform-auth = { workspace = true } platform-llm = { workspace = true } platform-oss = { workspace = true } +platform-speech = { workspace = true } serde = { workspace = true } serde_json = { workspace = true } shared-contracts = { workspace = true } @@ -41,6 +42,8 @@ shared-logging = { workspace = true } spacetime-client = { workspace = true } tokio = { workspace = true, features = ["macros", "rt-multi-thread", "net", "time"] } tokio-stream = { workspace = true } +tokio-tungstenite = { workspace = true } +futures-util = { workspace = true } time = { workspace = true, features = ["formatting"] } tower-http = { workspace = true, features = ["trace"] } tracing = { workspace = true } diff --git a/server-rs/crates/api-server/src/app.rs b/server-rs/crates/api-server/src/app.rs index 8941d551..6f92d589 100644 --- a/server-rs/crates/api-server/src/app.rs +++ b/server-rs/crates/api-server/src/app.rs @@ -145,6 +145,10 @@ use crate::{ begin_story_runtime_session, begin_story_session, continue_story, get_story_runtime_projection, get_story_session_state, resolve_story_runtime_action, }, + vector_engine_audio_generation::{ + create_visual_novel_background_music_task, create_visual_novel_sound_effect_task, + publish_visual_novel_background_music_asset, publish_visual_novel_sound_effect_asset, + }, visual_novel::{ compile_visual_novel_session, create_visual_novel_session, delete_visual_novel_work, execute_visual_novel_action, get_visual_novel_run, get_visual_novel_session, @@ -153,6 +157,10 @@ use crate::{ start_visual_novel_run, stream_visual_novel_action, stream_visual_novel_message, submit_visual_novel_message, update_visual_novel_work, }, + volcengine_speech::{ + get_volcengine_speech_config, stream_volcengine_asr, stream_volcengine_tts_bidirection, + stream_volcengine_tts_sse, + }, 
wechat_auth::{bind_wechat_phone, handle_wechat_callback, start_wechat_login}, }; @@ -312,6 +320,34 @@ pub fn build_router(state: AppState) -> Router { require_bearer_auth, )), ) + .route( + "/api/speech/volcengine/config", + get(get_volcengine_speech_config).route_layer(middleware::from_fn_with_state( + state.clone(), + require_bearer_auth, + )), + ) + .route( + "/api/speech/volcengine/asr/stream", + get(stream_volcengine_asr).route_layer(middleware::from_fn_with_state( + state.clone(), + require_bearer_auth, + )), + ) + .route( + "/api/speech/volcengine/tts/bidirection", + get(stream_volcengine_tts_bidirection).route_layer(middleware::from_fn_with_state( + state.clone(), + require_bearer_auth, + )), + ) + .route( + "/api/speech/volcengine/tts/sse", + post(stream_volcengine_tts_sse).route_layer(middleware::from_fn_with_state( + state.clone(), + require_bearer_auth, + )), + ) .route( "/api/runtime/chat/character/suggestions", post(generate_runtime_character_chat_suggestions).route_layer( @@ -1571,6 +1607,30 @@ fn visual_novel_router(state: AppState) -> Router { require_bearer_auth, )), ) + .route( + "/api/creation/visual-novel/audio/background-music", + post(create_visual_novel_background_music_task).route_layer( + middleware::from_fn_with_state(state.clone(), require_bearer_auth), + ), + ) + .route( + "/api/creation/visual-novel/audio/background-music/{task_id}/asset", + post(publish_visual_novel_background_music_asset).route_layer( + middleware::from_fn_with_state(state.clone(), require_bearer_auth), + ), + ) + .route( + "/api/creation/visual-novel/audio/sound-effect", + post(create_visual_novel_sound_effect_task).route_layer( + middleware::from_fn_with_state(state.clone(), require_bearer_auth), + ), + ) + .route( + "/api/creation/visual-novel/audio/sound-effect/{task_id}/asset", + post(publish_visual_novel_sound_effect_asset).route_layer( + middleware::from_fn_with_state(state.clone(), require_bearer_auth), + ), + ) .route( "/api/runtime/visual-novel/gallery", 
get(list_visual_novel_gallery), diff --git a/server-rs/crates/api-server/src/assets.rs b/server-rs/crates/api-server/src/assets.rs index 6dcc4828..76fcbd96 100644 --- a/server-rs/crates/api-server/src/assets.rs +++ b/server-rs/crates/api-server/src/assets.rs @@ -1458,7 +1458,7 @@ mod tests { endpoint: &str, signed_at: time::OffsetDateTime, ) -> Result> { - let date = signed_at.date().to_string().replace('-', ""); + let date = format_oss_v4_signature_scope_date(signed_at); let region = endpoint .trim() .split('.') @@ -1470,17 +1470,22 @@ mod tests { } fn build_oss_v4_signature_date(signed_at: time::OffsetDateTime) -> String { - let date = signed_at.date().to_string().replace('-', ""); - let time = signed_at - .time() - .to_string() - .split('.') - .next() - .unwrap_or("00:00:00") - .replace(':', ""); + format!( + "{}T{:02}{:02}{:02}Z", + format_oss_v4_signature_scope_date(signed_at), + signed_at.hour(), + signed_at.minute(), + signed_at.second() + ) + } - debug_assert_eq!(time.len(), 6); - format!("{date}T{time}Z") + fn format_oss_v4_signature_scope_date(signed_at: time::OffsetDateTime) -> String { + format!( + "{:04}{:02}{:02}", + signed_at.year(), + signed_at.month() as u8, + signed_at.day() + ) } fn build_oss_v4_canonical_uri(bucket: &str, object_key: Option<&str>) -> String { diff --git a/server-rs/crates/api-server/src/config.rs b/server-rs/crates/api-server/src/config.rs index 12bc685e..6bfeca45 100644 --- a/server-rs/crates/api-server/src/config.rs +++ b/server-rs/crates/api-server/src/config.rs @@ -4,6 +4,11 @@ use platform_llm::{ DEFAULT_ARK_BASE_URL, DEFAULT_MAX_RETRIES, DEFAULT_REQUEST_TIMEOUT_MS, DEFAULT_RETRY_BACKOFF_MS, LlmProvider, }; +use platform_speech::{ + DEFAULT_ASR_RESOURCE_ID, DEFAULT_ASR_WS_URL, + DEFAULT_REQUEST_TIMEOUT_MS as DEFAULT_SPEECH_REQUEST_TIMEOUT_MS, + DEFAULT_TTS_BIDIRECTION_WS_URL, DEFAULT_TTS_RESOURCE_ID, DEFAULT_TTS_SSE_URL, +}; const DEFAULT_INTERNAL_API_SECRET: &str = "genarrative-dev-internal-bridge"; const 
DEFAULT_AUTH_STORE_PATH: &str = "server-rs/.data/auth-store.json"; @@ -92,6 +97,18 @@ pub struct AppConfig { pub apimart_base_url: String, pub apimart_api_key: Option, pub apimart_image_request_timeout_ms: u64, + pub vector_engine_base_url: String, + pub vector_engine_api_key: Option, + pub vector_engine_audio_request_timeout_ms: u64, + pub volcengine_speech_api_key: Option, + pub volcengine_speech_app_id: Option, + pub volcengine_speech_access_key: Option, + pub volcengine_speech_asr_resource_id: String, + pub volcengine_speech_tts_resource_id: String, + pub volcengine_speech_asr_ws_url: String, + pub volcengine_speech_tts_bidirection_ws_url: String, + pub volcengine_speech_tts_sse_url: String, + pub volcengine_speech_request_timeout_ms: u64, pub draft_asset_generation_max_concurrent_requests: usize, pub ark_character_video_base_url: String, pub ark_character_video_api_key: Option, @@ -187,6 +204,18 @@ impl Default for AppConfig { apimart_base_url: String::new(), apimart_api_key: None, apimart_image_request_timeout_ms: 180_000, + vector_engine_base_url: String::new(), + vector_engine_api_key: None, + vector_engine_audio_request_timeout_ms: 180_000, + volcengine_speech_api_key: None, + volcengine_speech_app_id: None, + volcengine_speech_access_key: None, + volcengine_speech_asr_resource_id: DEFAULT_ASR_RESOURCE_ID.to_string(), + volcengine_speech_tts_resource_id: DEFAULT_TTS_RESOURCE_ID.to_string(), + volcengine_speech_asr_ws_url: DEFAULT_ASR_WS_URL.to_string(), + volcengine_speech_tts_bidirection_ws_url: DEFAULT_TTS_BIDIRECTION_WS_URL.to_string(), + volcengine_speech_tts_sse_url: DEFAULT_TTS_SSE_URL.to_string(), + volcengine_speech_request_timeout_ms: DEFAULT_SPEECH_REQUEST_TIMEOUT_MS, draft_asset_generation_max_concurrent_requests: 4, ark_character_video_base_url: String::new(), ark_character_video_api_key: None, @@ -544,6 +573,54 @@ impl AppConfig { config.apimart_image_request_timeout_ms = apimart_image_request_timeout_ms; } + if let 
Some(vector_engine_base_url) = read_first_non_empty_env(&["VECTOR_ENGINE_BASE_URL"]) + { + config.vector_engine_base_url = vector_engine_base_url; + } + + config.vector_engine_api_key = read_first_non_empty_env(&["VECTOR_ENGINE_API_KEY"]); + + if let Some(vector_engine_audio_request_timeout_ms) = + read_first_positive_u64_env(&["VECTOR_ENGINE_AUDIO_REQUEST_TIMEOUT_MS"]) + { + config.vector_engine_audio_request_timeout_ms = vector_engine_audio_request_timeout_ms; + } + + config.volcengine_speech_api_key = + read_first_non_empty_env(&["VOLCENGINE_SPEECH_API_KEY", "VOLCENGINE_API_KEY"]); + config.volcengine_speech_app_id = + read_first_non_empty_env(&["VOLCENGINE_SPEECH_APP_ID", "VOLCENGINE_ACCESS_KEY_ID"]); + config.volcengine_speech_access_key = read_first_non_empty_env(&[ + "VOLCENGINE_SPEECH_ACCESS_KEY", + "VOLCENGINE_SECRET_ACCESS_KEY", + ]); + if let Some(asr_resource_id) = + read_first_non_empty_env(&["VOLCENGINE_SPEECH_ASR_RESOURCE_ID"]) + { + config.volcengine_speech_asr_resource_id = asr_resource_id; + } + if let Some(tts_resource_id) = + read_first_non_empty_env(&["VOLCENGINE_SPEECH_TTS_RESOURCE_ID"]) + { + config.volcengine_speech_tts_resource_id = tts_resource_id; + } + if let Some(asr_ws_url) = read_first_non_empty_env(&["VOLCENGINE_SPEECH_ASR_WS_URL"]) { + config.volcengine_speech_asr_ws_url = asr_ws_url; + } + if let Some(tts_bidirection_ws_url) = + read_first_non_empty_env(&["VOLCENGINE_SPEECH_TTS_BIDIRECTION_WS_URL"]) + { + config.volcengine_speech_tts_bidirection_ws_url = tts_bidirection_ws_url; + } + if let Some(tts_sse_url) = read_first_non_empty_env(&["VOLCENGINE_SPEECH_TTS_SSE_URL"]) { + config.volcengine_speech_tts_sse_url = tts_sse_url; + } + if let Some(request_timeout_ms) = + read_first_positive_u64_env(&["VOLCENGINE_SPEECH_REQUEST_TIMEOUT_MS"]) + { + config.volcengine_speech_request_timeout_ms = request_timeout_ms; + } + if let Some(max_concurrent_requests) = read_first_usize_env(&[ 
"GENARRATIVE_DRAFT_ASSET_GENERATION_MAX_CONCURRENT_REQUESTS", "DRAFT_ASSET_GENERATION_MAX_CONCURRENT_REQUESTS", @@ -831,6 +908,7 @@ mod tests { assert!(config.llm_model.is_empty()); assert!(config.llm_base_url.is_empty()); assert!(config.apimart_base_url.is_empty()); + assert!(config.vector_engine_base_url.is_empty()); assert!(config.ark_character_video_base_url.is_empty()); assert!(config.ark_character_video_model.is_empty()); assert!(config.dashscope_scene_image_model.is_empty()); @@ -859,6 +937,7 @@ mod tests { std::env::remove_var("GENARRATIVE_LLM_BASE_URL"); std::env::remove_var("GENARRATIVE_LLM_MODEL"); std::env::remove_var("APIMART_BASE_URL"); + std::env::remove_var("VECTOR_ENGINE_BASE_URL"); std::env::remove_var("DASHSCOPE_SCENE_IMAGE_MODEL"); std::env::remove_var("DASHSCOPE_REFERENCE_IMAGE_MODEL"); std::env::remove_var("DASHSCOPE_COVER_IMAGE_MODEL"); @@ -871,6 +950,7 @@ mod tests { ); std::env::set_var("GENARRATIVE_LLM_MODEL", "internal-text-model"); std::env::set_var("APIMART_BASE_URL", "https://image.internal.example/v1"); + std::env::set_var("VECTOR_ENGINE_BASE_URL", "https://audio.internal.example"); std::env::set_var("DASHSCOPE_SCENE_IMAGE_MODEL", "scene-model"); std::env::set_var("DASHSCOPE_REFERENCE_IMAGE_MODEL", "reference-model"); std::env::set_var("DASHSCOPE_COVER_IMAGE_MODEL", "cover-model"); @@ -886,6 +966,10 @@ mod tests { assert_eq!(config.llm_base_url, "https://llm.internal.example/v1"); assert_eq!(config.llm_model, "internal-text-model"); assert_eq!(config.apimart_base_url, "https://image.internal.example/v1"); + assert_eq!( + config.vector_engine_base_url, + "https://audio.internal.example" + ); assert_eq!(config.dashscope_scene_image_model, "scene-model"); assert_eq!(config.dashscope_reference_image_model, "reference-model"); assert_eq!(config.dashscope_cover_image_model, "cover-model"); @@ -900,6 +984,7 @@ mod tests { std::env::remove_var("GENARRATIVE_LLM_BASE_URL"); std::env::remove_var("GENARRATIVE_LLM_MODEL"); 
         std::env::remove_var("APIMART_BASE_URL");
+        std::env::remove_var("VECTOR_ENGINE_BASE_URL");
         std::env::remove_var("DASHSCOPE_SCENE_IMAGE_MODEL");
         std::env::remove_var("DASHSCOPE_REFERENCE_IMAGE_MODEL");
         std::env::remove_var("DASHSCOPE_COVER_IMAGE_MODEL");
diff --git a/server-rs/crates/api-server/src/llm_model_routing.rs b/server-rs/crates/api-server/src/llm_model_routing.rs
index 78d5cdda..4fd986fd 100644
--- a/server-rs/crates/api-server/src/llm_model_routing.rs
+++ b/server-rs/crates/api-server/src/llm_model_routing.rs
@@ -1,2 +1,3 @@
 pub(crate) const RPG_STORY_LLM_MODEL: &str = "doubao-seed-character-251128";
 pub(crate) const CREATION_TEMPLATE_LLM_MODEL: &str = "deepseek-v3-2-251201";
+pub(crate) const PUZZLE_LEVEL_NAME_VISION_LLM_MODEL: &str = "gpt-4o-mini";
diff --git a/server-rs/crates/api-server/src/main.rs b/server-rs/crates/api-server/src/main.rs
index 74463ae8..396cd0aa 100644
--- a/server-rs/crates/api-server/src/main.rs
+++ b/server-rs/crates/api-server/src/main.rs
@@ -68,7 +68,9 @@ mod square_hole_agent_turn;
 mod state;
 mod story_battles;
 mod story_sessions;
+mod vector_engine_audio_generation;
 mod visual_novel;
+mod volcengine_speech;
 mod wechat_auth;
 mod wechat_provider;
 mod work_author;
diff --git a/server-rs/crates/api-server/src/openai_image_generation.rs b/server-rs/crates/api-server/src/openai_image_generation.rs
index b3322fea..fd55bdf9 100644
--- a/server-rs/crates/api-server/src/openai_image_generation.rs
+++ b/server-rs/crates/api-server/src/openai_image_generation.rs
@@ -177,6 +177,7 @@ pub(crate) fn build_openai_image_request_body(
             Value::String(build_prompt_with_negative(prompt, negative_prompt)),
         ),
         ("n".to_string(), json!(candidate_count.clamp(1, 4))),
+        ("official_fallback".to_string(), Value::Bool(true)),
         (
             "size".to_string(),
             Value::String(normalize_image_size(size)),
@@ -613,6 +614,7 @@ mod tests {
         assert_eq!(body["model"], GPT_IMAGE_2_MODEL);
         assert_eq!(body["size"], "16:9");
         assert_eq!(body["n"], 2);
+        assert_eq!(body["official_fallback"], true);
         assert_eq!(body["image_urls"][0], "data:image/png;base64,abcd");
         assert!(body["prompt"].as_str().unwrap_or_default().contains("避免"));
     }
diff --git a/server-rs/crates/api-server/src/prompt/puzzle/level_name.rs b/server-rs/crates/api-server/src/prompt/puzzle/level_name.rs
index a18a15e7..3bee8568 100644
--- a/server-rs/crates/api-server/src/prompt/puzzle/level_name.rs
+++ b/server-rs/crates/api-server/src/prompt/puzzle/level_name.rs
@@ -3,7 +3,7 @@
 /// 模型只负责把画面描述压缩成可直接展示的中文关卡名;写回草稿和作品卡由业务路由处理。
 pub(crate) const PUZZLE_FIRST_LEVEL_NAME_SYSTEM_PROMPT: &str = r#"你是一个中文拼图关卡命名编辑。
-你会收到拼图第一关的画面描述。请生成 1 个适合直接展示在游戏关卡卡片上的中文关卡名。
+你会收到拼图第一关的画面描述,部分请求还会附带已经生成完成的正式图片。请综合图片内容和画面描述,生成 1 个适合直接展示在游戏关卡卡片上的中文关卡名。
 
 硬约束:
 1. 只输出 JSON,不要输出 Markdown、解释或代码块。
@@ -21,6 +21,13 @@ pub(crate) fn build_puzzle_first_level_name_user_prompt(picture_description: &st
     )
 }
 
+pub(crate) fn build_puzzle_first_level_name_vision_user_text(picture_description: &str) -> String {
+    format!(
+        "画面描述:{picture_description}\n\n请观察随消息附带的正式拼图图片,生成第一关关卡名。",
+        picture_description = picture_description.trim(),
+    )
+}
+
 #[cfg(test)]
 mod tests {
     use super::*;
@@ -32,4 +39,12 @@ mod tests {
         assert!(prompt.contains("画面描述:一只猫在雨夜灯牌下回头。"));
         assert!(prompt.contains("第一关关卡名"));
     }
+
+    #[test]
+    fn level_name_vision_prompt_mentions_generated_image() {
+        let prompt = build_puzzle_first_level_name_vision_user_text("一只猫在雨夜灯牌下回头。");
+
+        assert!(prompt.contains("画面描述:一只猫在雨夜灯牌下回头。"));
+        assert!(prompt.contains("正式拼图图片"));
+    }
 }
diff --git a/server-rs/crates/api-server/src/puzzle.rs b/server-rs/crates/api-server/src/puzzle.rs
index 97f2c24e..10a8fb72 100644
--- a/server-rs/crates/api-server/src/puzzle.rs
+++ b/server-rs/crates/api-server/src/puzzle.rs
@@ -13,12 +13,13 @@ use axum::{
     },
 };
 use base64::{Engine as _, engine::general_purpose::STANDARD as BASE64_STANDARD};
+use image::ImageFormat;
 use module_assets::{
     AssetObjectAccessPolicy, AssetObjectFieldError, build_asset_entity_binding_input,
     build_asset_object_upsert_input, generate_asset_binding_id, generate_asset_object_id,
 };
 use module_puzzle::{PuzzleGeneratedImageCandidate, PuzzleRuntimeLevelStatus};
-use platform_llm::{LlmMessage, LlmTextRequest};
+use platform_llm::{LlmMessage, LlmMessageContentPart, LlmTextRequest};
 use platform_oss::{
     LegacyAssetPrefix, OssHeadObjectRequest, OssObjectAccess, OssPutObjectRequest,
     OssSignedGetObjectUrlRequest,
@@ -78,7 +79,7 @@ use crate::{
     },
     auth::AuthenticatedAccessToken,
     http_error::AppError,
-    llm_model_routing::CREATION_TEMPLATE_LLM_MODEL,
+    llm_model_routing::{CREATION_TEMPLATE_LLM_MODEL, PUZZLE_LEVEL_NAME_VISION_LLM_MODEL},
     platform_errors::map_oss_error,
     prompt::puzzle::{
         draft::{
@@ -88,6 +89,7 @@ use crate::{
         image::{PUZZLE_DEFAULT_NEGATIVE_PROMPT, build_puzzle_image_prompt},
         level_name::{
             PUZZLE_FIRST_LEVEL_NAME_SYSTEM_PROMPT, build_puzzle_first_level_name_user_prompt,
+            build_puzzle_first_level_name_vision_user_text,
         },
         tags::{PUZZLE_TAG_GENERATION_SYSTEM_PROMPT, build_puzzle_tag_generation_user_prompt},
     },
@@ -112,6 +114,7 @@ const PUZZLE_ENTITY_KIND: &str = "puzzle_work";
 const PUZZLE_GENERATED_IMAGE_SIZE: &str = "1024*1024";
 const PUZZLE_APIMART_GENERATED_IMAGE_SIZE: &str = "1:1";
 const PUZZLE_APIMART_GEMINI_RESOLUTION: &str = "1K";
+const PUZZLE_LEVEL_NAME_VISION_IMAGE_MAX_SIDE: u32 = 768;
 
 pub async fn create_puzzle_agent_session(
     State(state): State<AppState>,
@@ -204,7 +207,8 @@ pub async fn generate_puzzle_onboarding_work(
                 PUZZLE_AGENT_API_BASE_PROVIDER,
                 map_puzzle_generation_endpoint_error(error),
             )
-        })?;
+        })?
+        .into_records();
     let selected = candidates.first().cloned().ok_or_else(|| {
         puzzle_error_response(
             &request_context,
@@ -864,8 +868,9 @@ pub async fn execute_puzzle_agent_action(
             if let Some(levels_json) = levels_json.as_ref() {
                 draft.levels = parse_puzzle_level_records_from_module_json(levels_json)?;
             }
-            let target_level =
+            let mut target_level =
                 select_puzzle_level_for_api(&draft, target_level_id.as_deref())?;
+            let fallback_level_name = target_level.level_name.clone();
             let prompt = resolve_puzzle_level_image_prompt(
                 payload.prompt_text.as_deref(),
                 &target_level.picture_description,
@@ -886,10 +891,32 @@ pub async fn execute_puzzle_agent_action(
             )
             .await
             .map_err(map_puzzle_generation_endpoint_error)?;
+            if candidates.is_empty() {
+                return Err(AppError::from_status(StatusCode::BAD_GATEWAY).with_details(
+                    json!({
+                        "provider": PUZZLE_AGENT_API_BASE_PROVIDER,
+                        "message": "拼图候选图生成结果为空",
+                    }),
+                ));
+            }
+            if let Some(refined_level_name) = generate_puzzle_first_level_name_from_image(
+                &state,
+                target_level.picture_description.as_str(),
+                &candidates[0].downloaded_image,
+            )
+            .await
+            {
+                target_level.level_name = refined_level_name;
+            }
+            let generated_level_name = target_level.level_name.clone();
+            let levels_json_with_generated_name =
+                Some(serialize_puzzle_level_records_for_module(
+                    &build_puzzle_levels_with_primary_name(&draft, &target_level),
+                )?);
             let candidates_json = serde_json::to_string(
                 &candidates
                     .iter()
-                    .map(to_puzzle_generated_image_candidate)
+                    .map(|candidate| to_puzzle_generated_image_candidate(&candidate.record))
                     .collect::<Vec<_>>(),
             )
             .map_err(|error| {
@@ -904,7 +931,7 @@ pub async fn execute_puzzle_agent_action(
                 session_id: session.session_id.clone(),
                 owner_user_id: owner_user_id.clone(),
                 level_id: Some(target_level.level_id.clone()),
-                levels_json,
+                levels_json: levels_json_with_generated_name,
                 candidates_json,
                 saved_at_micros: now,
             })
@@ -925,9 +952,15 @@ pub async fn execute_puzzle_agent_action(
             let fallback_session =
                 replace_puzzle_session_draft_snapshot(session, draft, now);
             Ok(apply_generated_puzzle_candidates_to_session_snapshot(
-                fallback_session,
+                apply_generated_puzzle_first_level_name_to_session_snapshot(
+                    fallback_session,
+                    target_level.level_id.as_str(),
+                    generated_level_name.as_str(),
+                    fallback_level_name.as_str(),
+                    now,
+                ),
                 target_level.level_id.as_str(),
-                candidates,
+                candidates.into_records(),
                 now,
             ))
         }
@@ -2830,6 +2863,91 @@ async fn generate_puzzle_first_level_name(state: &AppState, picture_description:
     build_fallback_puzzle_first_level_name(picture_description)
 }
 
+async fn generate_puzzle_first_level_name_from_image(
+    state: &AppState,
+    picture_description: &str,
+    image: &PuzzleDownloadedImage,
+) -> Option<String> {
+    let Some(llm_client) = state.creative_agent_gpt5_client() else {
+        return None;
+    };
+    let Some(image_data_url) = build_puzzle_level_name_image_data_url(image) else {
+        tracing::warn!(
+            provider = PUZZLE_AGENT_API_BASE_PROVIDER,
+            picture_chars = picture_description.chars().count(),
+            "拼图首关名图片输入压缩失败,保留文本关卡名"
+        );
+        return None;
+    };
+    let user_text = build_puzzle_first_level_name_vision_user_text(picture_description);
+    let response = llm_client
+        .request_text(
+            LlmTextRequest::new(vec![
+                LlmMessage::system(PUZZLE_FIRST_LEVEL_NAME_SYSTEM_PROMPT),
+                LlmMessage::user_multimodal(vec![
+                    LlmMessageContentPart::InputText { text: user_text },
+                    LlmMessageContentPart::InputImage {
+                        image_url: image_data_url,
+                    },
+                ]),
+            ])
+            .with_model(PUZZLE_LEVEL_NAME_VISION_LLM_MODEL)
+            .with_max_tokens(80),
+        )
+        .await;
+
+    match response {
+        Ok(response) => {
+            parse_puzzle_first_level_name_from_text(response.content.as_str()).or_else(|| {
+                tracing::warn!(
+                    provider = PUZZLE_AGENT_API_BASE_PROVIDER,
+                    model = PUZZLE_LEVEL_NAME_VISION_LLM_MODEL,
+                    picture_chars = picture_description.chars().count(),
+                    "拼图首关名视觉模型返回非法,保留文本关卡名"
+                );
+                None
+            })
+        }
+        Err(error) => {
+            tracing::warn!(
+                provider = PUZZLE_AGENT_API_BASE_PROVIDER,
+                model = PUZZLE_LEVEL_NAME_VISION_LLM_MODEL,
+                picture_chars = picture_description.chars().count(),
+                error = %error,
+                "拼图首关名视觉生成失败,保留文本关卡名"
+            );
+            None
+        }
+    }
+}
+
+fn build_puzzle_level_name_image_data_url(image: &PuzzleDownloadedImage) -> Option<String> {
+    let bytes = resize_puzzle_level_name_image_bytes(image.bytes.as_slice())
+        .unwrap_or_else(|| image.bytes.clone());
+    let mime_type = if bytes.starts_with(b"\x89PNG\r\n\x1A\n") {
+        "image/png"
+    } else {
+        image.mime_type.as_str()
+    };
+    Some(format!(
+        "data:{};base64,{}",
+        normalize_puzzle_downloaded_image_mime_type(mime_type),
+        BASE64_STANDARD.encode(bytes)
+    ))
+}
+
+fn resize_puzzle_level_name_image_bytes(bytes: &[u8]) -> Option<Vec<u8>> {
+    let image = image::load_from_memory(bytes).ok()?;
+    let resized = image.resize(
+        PUZZLE_LEVEL_NAME_VISION_IMAGE_MAX_SIDE,
+        PUZZLE_LEVEL_NAME_VISION_IMAGE_MAX_SIDE,
+        image::imageops::FilterType::Triangle,
+    );
+    let mut cursor = std::io::Cursor::new(Vec::new());
+    resized.write_to(&mut cursor, ImageFormat::Png).ok()?;
+    Some(cursor.into_inner())
+}
+
 fn parse_puzzle_first_level_name_from_text(text: &str) -> Option<String> {
     let trimmed = text.trim();
     let json_text = if let Some(start) = trimmed.find('{')
@@ -2985,9 +3103,6 @@ async fn compile_puzzle_draft_with_initial_cover(
     let generated_level_name =
         generate_puzzle_first_level_name(state, &target_level.picture_description).await;
     target_level.level_name = generated_level_name.clone();
-    let levels_json_with_generated_name = Some(serialize_puzzle_level_records_for_module(
-        &build_puzzle_levels_with_primary_name(&draft, &target_level),
-    )?);
     let image_prompt = resolve_puzzle_draft_cover_prompt(
         prompt_text,
         &target_level.picture_description,
@@ -3008,19 +3123,32 @@ async fn compile_puzzle_draft_with_initial_cover(
     .await?;
     let selected_candidate_id = candidates
         .iter()
-        .find(|candidate| candidate.selected)
+        .find(|candidate| candidate.record.selected)
         .or_else(|| candidates.first())
-        .map(|candidate| candidate.candidate_id.clone())
+        .map(|candidate| candidate.record.candidate_id.clone())
        .ok_or_else(|| {
             AppError::from_status(StatusCode::BAD_GATEWAY).with_details(json!({
                 "provider": PUZZLE_AGENT_API_BASE_PROVIDER,
                 "message": "拼图候选图生成结果为空",
             }))
         })?;
+    if let Some(refined_level_name) = generate_puzzle_first_level_name_from_image(
+        state,
+        target_level.picture_description.as_str(),
+        &candidates[0].downloaded_image,
+    )
+    .await
+    {
+        target_level.level_name = refined_level_name;
+    }
+    let generated_level_name = target_level.level_name.clone();
+    let levels_json_with_generated_name = Some(serialize_puzzle_level_records_for_module(
+        &build_puzzle_levels_with_primary_name(&draft, &target_level),
+    )?);
     let candidates_json = serde_json::to_string(
         &candidates
             .iter()
-            .map(to_puzzle_generated_image_candidate)
+            .map(|candidate| to_puzzle_generated_image_candidate(&candidate.record))
             .collect::<Vec<_>>(),
     )
     .map_err(|error| {
@@ -3061,7 +3189,7 @@ async fn compile_puzzle_draft_with_initial_cover(
             now,
         ),
         target_level.level_id.as_str(),
-        candidates.clone(),
+        candidates.into_records(),
         now,
     );
     Ok((session, true))
@@ -3138,9 +3266,6 @@ async fn compile_puzzle_draft_with_uploaded_cover(
     let generated_level_name =
         generate_puzzle_first_level_name(state, &target_level.picture_description).await;
     target_level.level_name = generated_level_name.clone();
-    let levels_json_with_generated_name = Some(serialize_puzzle_level_records_for_module(
-        &build_puzzle_levels_with_primary_name(&draft, &target_level),
-    )?);
     let image_prompt = resolve_puzzle_draft_cover_prompt(
         prompt_text,
         &target_level.picture_description,
@@ -3152,6 +3277,24 @@ async fn compile_puzzle_draft_with_uploaded_cover(
         compiled_session.session_id,
         target_level.candidates.len() + 1
     );
+    let uploaded_downloaded_image = PuzzleDownloadedImage {
+        extension: puzzle_mime_to_extension(uploaded_image.mime_type.as_str()).to_string(),
+        mime_type: normalize_puzzle_downloaded_image_mime_type(uploaded_image.mime_type.as_str()),
+        bytes: uploaded_image.bytes,
+    };
+    if let Some(refined_level_name) = generate_puzzle_first_level_name_from_image(
+        state,
+        target_level.picture_description.as_str(),
+        &uploaded_downloaded_image,
+    )
+    .await
+    {
+        target_level.level_name = refined_level_name;
+    }
+    let generated_level_name = target_level.level_name.clone();
+    let levels_json_with_generated_name = Some(serialize_puzzle_level_records_for_module(
+        &build_puzzle_levels_with_primary_name(&draft, &target_level),
+    )?);
     let persisted_upload = persist_puzzle_generated_asset(
         state,
         owner_user_id.as_str(),
@@ -3159,13 +3302,7 @@ async fn compile_puzzle_draft_with_uploaded_cover(
         &target_level.level_name,
         candidate_id.as_str(),
         "uploaded-direct",
-        PuzzleDownloadedImage {
-            extension: puzzle_mime_to_extension(uploaded_image.mime_type.as_str()).to_string(),
-            mime_type: normalize_puzzle_downloaded_image_mime_type(
-                uploaded_image.mime_type.as_str(),
-            ),
-            bytes: uploaded_image.bytes,
-        },
+        uploaded_downloaded_image,
         current_utc_micros(),
     )
     .await?;
@@ -3865,7 +4002,7 @@ async fn generate_puzzle_image_candidates(
     image_model: Option<&str>,
     candidate_count: u32,
     candidate_start_index: usize,
-) -> Result<Vec<PuzzleGeneratedImageCandidateRecord>, AppError> {
+) -> Result<Vec<GeneratedPuzzleImageCandidate>, AppError> {
     let count = candidate_count.clamp(1, 1);
     let resolved_model = resolve_puzzle_image_model(image_model);
     let actual_prompt = build_puzzle_image_prompt(level_name, prompt);
@@ -3914,6 +4051,7 @@ async fn generate_puzzle_image_candidates(
             "{session_id}-candidate-{}",
             candidate_start_index + index + 1
         );
+        let downloaded_image = image.clone();
         let asset = persist_puzzle_generated_asset(
             state,
             owner_user_id,
@@ -3926,30 +4064,22 @@ async fn generate_puzzle_image_candidates(
         )
         .await
         .map_err(map_puzzle_generation_endpoint_error)?;
-        items.push(PuzzleGeneratedImageCandidateResponse {
-            candidate_id,
-            image_src: asset.image_src,
-            asset_id: asset.asset_id,
-            prompt: prompt.to_string(),
-            actual_prompt: Some(actual_prompt.clone()),
-            source_type: resolved_model.candidate_source_type().to_string(),
-            // 单图生成结果总是直接成为当前正式图。
-            selected: index == 0,
+        items.push(GeneratedPuzzleImageCandidate {
+            record: PuzzleGeneratedImageCandidateRecord {
+                candidate_id,
+                image_src: asset.image_src,
+                asset_id: asset.asset_id,
+                prompt: prompt.to_string(),
+                actual_prompt: Some(actual_prompt.clone()),
+                source_type: resolved_model.candidate_source_type().to_string(),
+                // 单图生成结果总是直接成为当前正式图。
+                selected: index == 0,
+            },
+            downloaded_image,
         });
     }
 
-    Ok(items
-        .into_iter()
-        .map(|candidate| PuzzleGeneratedImageCandidateRecord {
-            candidate_id: candidate.candidate_id,
-            image_src: candidate.image_src,
-            asset_id: candidate.asset_id,
-            prompt: candidate.prompt,
-            actual_prompt: candidate.actual_prompt,
-            source_type: candidate.source_type,
-            selected: candidate.selected,
-        })
-        .collect())
+    Ok(items)
 }
 
 #[cfg(test)]
@@ -3977,6 +4107,7 @@ mod tests {
         assert_eq!(body["size"], PUZZLE_APIMART_GENERATED_IMAGE_SIZE);
         assert_eq!(body["resolution"], PUZZLE_APIMART_GEMINI_RESOLUTION);
         assert_eq!(body["n"], 1);
+        assert_eq!(body["official_fallback"], true);
         assert_eq!(body["image_urls"][0], "data:image/png;base64,abcd");
         assert!(
             body["prompt"]
@@ -4014,6 +4145,7 @@ mod tests {
             prompt_text: None,
             reference_image_src: None,
             image_model: Some(PUZZLE_IMAGE_MODEL_GPT_IMAGE_2.to_string()),
+            ai_redraw: None,
             candidate_count: Some(1),
             candidate_id: None,
             level_id: Some("puzzle-level-1".to_string()),
@@ -4073,6 +4205,26 @@ mod tests {
         );
     }
 
+    #[test]
+    fn puzzle_level_name_image_data_url_downsizes_generated_image() {
+        let image = image::DynamicImage::ImageRgb8(image::RgbImage::new(4, 4));
+        let mut cursor = std::io::Cursor::new(Vec::new());
+        image
+            .write_to(&mut cursor, ImageFormat::Png)
+            .expect("test image should encode");
+        let downloaded = PuzzleDownloadedImage {
+            extension: "png".to_string(),
+            mime_type: "image/png".to_string(),
+            bytes: cursor.into_inner(),
+        };
+
+        let data_url = build_puzzle_level_name_image_data_url(&downloaded)
+            .expect("data url should be generated");
+
+        assert!(data_url.starts_with("data:image/png;base64,"));
+        assert!(data_url.len() > "data:image/png;base64,".len());
+    }
+
     #[test]
     fn puzzle_first_level_name_snapshot_defaults_work_title() {
         let levels_json = serde_json::to_string(&vec![json!({
@@ -4091,6 +4243,7 @@ mod tests {
             prompt_text: None,
             reference_image_src: None,
             image_model: Some(PUZZLE_IMAGE_MODEL_GPT_IMAGE_2.to_string()),
+            ai_redraw: None,
             candidate_count: Some(1),
             candidate_id: None,
             level_id: Some("puzzle-level-1".to_string()),
@@ -4181,6 +4334,30 @@ struct PuzzleGeneratedImages {
     images: Vec<PuzzleDownloadedImage>,
 }
 
+struct GeneratedPuzzleImageCandidate {
+    record: PuzzleGeneratedImageCandidateRecord,
+    downloaded_image: PuzzleDownloadedImage,
+}
+
+impl GeneratedPuzzleImageCandidate {
+    fn into_record(self) -> PuzzleGeneratedImageCandidateRecord {
+        self.record
+    }
+}
+
+trait GeneratedPuzzleImageCandidatesExt {
+    fn into_records(self) -> Vec<PuzzleGeneratedImageCandidateRecord>;
+}
+
+impl GeneratedPuzzleImageCandidatesExt for Vec<GeneratedPuzzleImageCandidate> {
+    fn into_records(self) -> Vec<PuzzleGeneratedImageCandidateRecord> {
+        self.into_iter()
+            .map(GeneratedPuzzleImageCandidate::into_record)
+            .collect()
+    }
+}
+
+#[derive(Clone)]
 struct PuzzleDownloadedImage {
     extension: String,
     mime_type: String,
@@ -4361,6 +4538,7 @@ fn build_puzzle_apimart_image_request_body(
             Value::String(build_puzzle_apimart_prompt(prompt, negative_prompt)),
         ),
         ("n".to_string(), json!(candidate_count.clamp(1, 1))),
+        ("official_fallback".to_string(), Value::Bool(true)),
         ("size".to_string(), Value::String(size.to_string())),
     ]);
     body.insert(
diff --git a/server-rs/crates/api-server/src/state.rs b/server-rs/crates/api-server/src/state.rs
index 2848178d..47be20e9 100644
--- a/server-rs/crates/api-server/src/state.rs
+++ b/server-rs/crates/api-server/src/state.rs
@@ -787,7 +787,8 @@ fn build_creative_agent_gpt5_client(
         config.apimart_image_request_timeout_ms,
         0,
         config.llm_retry_backoff_ms,
-    )?;
+    )?
+    .with_official_fallback(true);
     Ok(Some(LlmClient::new(llm_config)?))
 }
 
@@ -888,5 +889,6 @@ mod tests {
             client.config().responses_url(),
             "https://api.apimart.test/v1/responses"
         );
+        assert!(client.config().official_fallback());
     }
 }
diff --git a/server-rs/crates/api-server/src/vector_engine_audio_generation.rs b/server-rs/crates/api-server/src/vector_engine_audio_generation.rs
new file mode 100644
index 00000000..c14eb1ef
--- /dev/null
+++ b/server-rs/crates/api-server/src/vector_engine_audio_generation.rs
@@ -0,0 +1,973 @@
+use std::{collections::BTreeMap, time::Duration};
+
+use axum::{
+    Json,
+    extract::{Path, State, rejection::JsonRejection},
+    http::StatusCode,
+    response::Response,
+};
+use module_assets::{
+    AssetObjectAccessPolicy, build_asset_entity_binding_input, build_asset_object_upsert_input,
+    generate_asset_binding_id, generate_asset_object_id,
+};
+use platform_oss::{LegacyAssetPrefix, OssObjectAccess, OssPutObjectRequest};
+use reqwest::header;
+use serde_json::{Map, Value, json};
+use shared_contracts::visual_novel as contract;
+
+use crate::{
+    api_response::json_success_body, auth::AuthenticatedAccessToken, http_error::AppError,
+    platform_errors::map_oss_error, request_context::RequestContext, state::AppState,
+};
+
+const VECTOR_ENGINE_PROVIDER: &str = "vector-engine";
+const VECTOR_ENGINE_SUNO_PROVIDER: &str = "vector-engine-suno";
+const VECTOR_ENGINE_VIDU_PROVIDER: &str = "vector-engine-vidu";
+const SUNO_DEFAULT_MODEL: &str = "chirp-v4";
+const VIDU_AUDIO_MODEL: &str = "audio1.0";
+const AUDIO_ENTITY_KIND: &str = "visual_novel_scene";
+const MUSIC_ASSET_KIND: &str = "visual_novel_music";
+const AMBIENT_SOUND_ASSET_KIND: &str = "visual_novel_ambient_sound";
+const MUSIC_SLOT: &str = "music";
+const AMBIENT_SOUND_SLOT: &str = "ambient_sound";
+const SUNO_PROMPT_MAX_CHARS: usize = 5_000;
+const SUNO_TITLE_MAX_CHARS: usize = 80;
+const SUNO_TAGS_MAX_CHARS: usize = 160;
+const VIDU_PROMPT_MAX_CHARS: usize = 1_500;
+const DEFAULT_SOUND_EFFECT_DURATION_SECONDS: u8 = 5;
+const MAX_GENERATED_AUDIO_BYTES: usize = 40 * 1024 * 1024;
+
+#[derive(Clone, Debug)]
+struct VectorEngineAudioSettings {
+    base_url: String,
+    api_key: String,
+    request_timeout_ms: u64,
+}
+
+#[derive(Clone, Debug)]
+struct DownloadedAudio {
+    bytes: Vec<u8>,
+    mime_type: String,
+    extension: String,
+}
+
+#[derive(Clone, Copy, Debug, PartialEq, Eq)]
+enum AudioAssetSlot {
+    BackgroundMusic,
+    SoundEffect,
+}
+
+impl AudioAssetSlot {
+    fn contract_kind(self) -> contract::VisualNovelAudioGenerationKind {
+        match self {
+            Self::BackgroundMusic => contract::VisualNovelAudioGenerationKind::BackgroundMusic,
+            Self::SoundEffect => contract::VisualNovelAudioGenerationKind::SoundEffect,
+        }
+    }
+
+    fn provider(self) -> &'static str {
+        match self {
+            Self::BackgroundMusic => VECTOR_ENGINE_SUNO_PROVIDER,
+            Self::SoundEffect => VECTOR_ENGINE_VIDU_PROVIDER,
+        }
+    }
+
+    fn asset_kind(self) -> &'static str {
+        match self {
+            Self::BackgroundMusic => MUSIC_ASSET_KIND,
+            Self::SoundEffect => AMBIENT_SOUND_ASSET_KIND,
+        }
+    }
+
+    fn slot(self) -> &'static str {
+        match self {
+            Self::BackgroundMusic => MUSIC_SLOT,
+            Self::SoundEffect => AMBIENT_SOUND_SLOT,
+        }
+    }
+
+    fn file_stem(self) -> &'static str {
+        match self {
+            Self::BackgroundMusic => "background-music",
+            Self::SoundEffect => "sound-effect",
+        }
+    }
+}
+
+pub async fn create_visual_novel_background_music_task(
+    State(state): State<AppState>,
+    axum::extract::Extension(request_context): axum::extract::Extension<RequestContext>,
+    payload: Result, JsonRejection>,
+) -> Result<Json<Value>, Response> {
+    let Json(payload) = parse_json_payload(&request_context, payload)?;
+    let settings = require_vector_engine_audio_settings(&state)?;
+    let http_client = build_vector_engine_audio_http_client(&settings)?;
+    let prompt = normalize_limited_text(&payload.prompt, "prompt", SUNO_PROMPT_MAX_CHARS)?;
+    let title = normalize_limited_text(&payload.title, "title", SUNO_TITLE_MAX_CHARS)?;
+    let tags = payload
+        .tags
+        .as_deref()
+        .map(|value| normalize_limited_text(value, "tags", SUNO_TAGS_MAX_CHARS))
+        .transpose()?;
+    let model = normalize_optional_text(payload.model.as_deref())
+        .unwrap_or_else(|| SUNO_DEFAULT_MODEL.to_string());
+
+    let mut body = Map::from_iter([
+        ("prompt".to_string(), Value::String(prompt)),
+        ("mv".to_string(), Value::String(model)),
+        ("title".to_string(), Value::String(title)),
+        ("task".to_string(), Value::String("generate".to_string())),
+    ]);
+    if let Some(tags) = tags {
+        body.insert("tags".to_string(), Value::String(tags));
+    }
+
+    let response = post_vector_engine_json(
+        &http_client,
+        &settings,
+        "/suno/submit/music",
+        Value::Object(body),
+        "提交 Suno 背景音乐任务失败",
+    )
+    .await?;
+    let task_id = extract_string_by_path(&response, &["data"])
+        .or_else(|| find_first_string_by_key(&response, "task_id"))
+        .or_else(|| find_first_string_by_key(&response, "taskId"))
+        .ok_or_else(|| {
+            vector_engine_bad_gateway("提交 Suno 背景音乐任务失败:上游未返回任务 ID")
+        })?;
+
+    Ok(json_success_body(
+        Some(&request_context),
+        contract::VisualNovelAudioGenerationTaskResponse {
+            kind: contract::VisualNovelAudioGenerationKind::BackgroundMusic,
+            task_id,
+            provider: VECTOR_ENGINE_SUNO_PROVIDER.to_string(),
+            status: "submitted".to_string(),
+        },
+    ))
+}
+
+pub async fn create_visual_novel_sound_effect_task(
+    State(state): State<AppState>,
+    axum::extract::Extension(request_context): axum::extract::Extension<RequestContext>,
+    payload: Result, JsonRejection>,
+) -> Result<Json<Value>, Response> {
+    let Json(payload) = parse_json_payload(&request_context, payload)?;
+    let settings = require_vector_engine_audio_settings(&state)?;
+    let http_client = build_vector_engine_audio_http_client(&settings)?;
+    let prompt = normalize_limited_text(&payload.prompt, "prompt", VIDU_PROMPT_MAX_CHARS)?;
+    let duration = payload
+        .duration
+        .unwrap_or(DEFAULT_SOUND_EFFECT_DURATION_SECONDS)
+        .clamp(2, 10);
+
+    let mut body = Map::from_iter([
+        (
+            "model".to_string(),
+            Value::String(VIDU_AUDIO_MODEL.to_string()),
+        ),
+        ("prompt".to_string(), Value::String(prompt)),
+        ("duration".to_string(), json!(duration)),
+    ]);
+    if let Some(seed) = payload.seed {
+        body.insert("seed".to_string(), json!(seed));
+    }
+
+    let response = post_vector_engine_json(
+        &http_client,
+        &settings,
+        "/ent/v2/text2audio",
+        Value::Object(body),
+        "提交 Vidu 音效任务失败",
+    )
+    .await?;
+    let task_id = find_first_string_by_key(&response, "task_id")
+        .or_else(|| find_first_string_by_key(&response, "taskId"))
+        .ok_or_else(|| vector_engine_bad_gateway("提交 Vidu 音效任务失败:上游未返回任务 ID"))?;
+    let status = find_first_string_by_key(&response, "state").unwrap_or_else(|| "created".into());
+
+    Ok(json_success_body(
+        Some(&request_context),
+        contract::VisualNovelAudioGenerationTaskResponse {
+            kind: contract::VisualNovelAudioGenerationKind::SoundEffect,
+            task_id,
+            provider: VECTOR_ENGINE_VIDU_PROVIDER.to_string(),
+            status,
+        },
+    ))
+}
+
+pub async fn publish_visual_novel_background_music_asset(
+    State(state): State<AppState>,
+    Path(task_id): Path<String>,
+    axum::extract::Extension(request_context): axum::extract::Extension<RequestContext>,
+    axum::extract::Extension(authenticated): axum::extract::Extension<AuthenticatedAccessToken>,
+    payload: Result<Json<contract::PublishVisualNovelGeneratedAudioAssetRequest>, JsonRejection>,
+) -> Result<Json<Value>, Response> {
+    publish_generated_audio_asset(
+        &state,
+        &request_context,
+        authenticated.claims().user_id(),
+        task_id,
+        parse_json_payload(&request_context, payload)?.0,
+        AudioAssetSlot::BackgroundMusic,
+    )
+    .await
+    .map(|payload| json_success_body(Some(&request_context), payload))
+    .map_err(|error| error.into_response_with_context(Some(&request_context)))
+}
+
+pub async fn publish_visual_novel_sound_effect_asset(
+    State(state): State<AppState>,
+    Path(task_id): Path<String>,
+    axum::extract::Extension(request_context): axum::extract::Extension<RequestContext>,
+    axum::extract::Extension(authenticated): axum::extract::Extension<AuthenticatedAccessToken>,
+    payload: Result<Json<contract::PublishVisualNovelGeneratedAudioAssetRequest>, JsonRejection>,
+) -> Result<Json<Value>, Response> {
+    publish_generated_audio_asset(
+        &state,
+        &request_context,
+        authenticated.claims().user_id(),
+        task_id,
+        parse_json_payload(&request_context, payload)?.0,
+        AudioAssetSlot::SoundEffect,
+    )
+    .await
+    .map(|payload| json_success_body(Some(&request_context), payload))
+    .map_err(|error| error.into_response_with_context(Some(&request_context)))
+}
+
+async fn publish_generated_audio_asset(
+    state: &AppState,
+    _request_context: &RequestContext,
+    owner_user_id: &str,
+    task_id: String,
+    payload: contract::PublishVisualNovelGeneratedAudioAssetRequest,
+    slot: AudioAssetSlot,
+) -> Result<contract::VisualNovelGeneratedAudioAssetResponse, AppError> {
+    let task_id = normalize_limited_text(&task_id, "taskId", 160)?;
+    let scene_id = normalize_limited_text(&payload.scene_id, "sceneId", 160)?;
+    let profile_id = normalize_optional_text(payload.profile_id.as_deref());
+    let settings = require_vector_engine_audio_settings(state)?;
+    let http_client = build_vector_engine_audio_http_client(&settings)?;
+    let task_payload = fetch_audio_task_payload(&http_client, &settings, slot, &task_id).await?;
+    let status = normalize_task_status(
+        find_first_string_by_key(&task_payload, "status")
+            .or_else(|| find_first_string_by_key(&task_payload, "state"))
+            .or_else(|| find_first_string_by_key(&task_payload, "Status"))
+            .as_deref()
+            .unwrap_or(""),
+    );
+
+    let mut audio_urls = extract_audio_urls(&task_payload);
+    if slot == AudioAssetSlot::BackgroundMusic && audio_urls.is_empty() {
+        if let Some(clip_id) = extract_string_by_path(&task_payload, &["data"])
+            .filter(|value| !value.trim().is_empty())
+        {
+            let wav_payload = get_vector_engine_json(
+                &http_client,
+                &settings,
+                &format!("/suno/act/wav/{}", encode_path_segment(clip_id.as_str())),
+                "获取 Suno wav 音频失败",
+            )
+            .await?;
+            audio_urls = extract_audio_urls(&wav_payload);
+        }
+    }
+
+    if is_pending_task_status(&status) && audio_urls.is_empty() {
+        return Ok(contract::VisualNovelGeneratedAudioAssetResponse {
+            kind: slot.contract_kind(),
+            task_id,
+            provider: slot.provider().to_string(),
+            status,
+            asset_object_id: None,
+            asset_kind: None,
+            audio_src: None,
+        });
+    }
+
+    if is_failed_task_status(&status) {
+        return Err(vector_engine_bad_gateway(
+            "音频生成任务失败,请调整提示词后重试",
+        ));
+    }
+
+    let audio_url = audio_urls
+        .into_iter()
+        .next()
+        .ok_or_else(|| vector_engine_bad_gateway("音频生成尚未返回可下载地址"))?;
+    let audio = download_generated_audio(&http_client, &audio_url, slot.provider()).await?;
+    let persisted = persist_generated_audio_asset(
+        state,
+        &http_client,
+        owner_user_id,
+        profile_id,
+        scene_id,
+        &task_id,
+        slot,
+        audio,
+    )
+    .await?;
+
+    Ok(contract::VisualNovelGeneratedAudioAssetResponse {
+        kind: slot.contract_kind(),
+        task_id,
+        provider: slot.provider().to_string(),
+        status: "completed".to_string(),
+        asset_object_id: Some(persisted.asset_object_id),
+        asset_kind: Some(slot.asset_kind().to_string()),
+        audio_src: Some(persisted.audio_src),
+    })
+}
+
+async fn fetch_audio_task_payload(
+    http_client: &reqwest::Client,
+    settings: &VectorEngineAudioSettings,
+    slot: AudioAssetSlot,
+    task_id: &str,
+) -> Result<Value, AppError> {
+    match slot {
+        AudioAssetSlot::BackgroundMusic => {
+            get_vector_engine_json(
+                http_client,
+                settings,
+                &format!("/suno/fetch/{}", encode_path_segment(task_id)),
+                "查询 Suno 背景音乐任务失败",
+            )
+            .await
+        }
+        AudioAssetSlot::SoundEffect => {
+            get_vector_engine_json(
+                http_client,
+                settings,
+                &format!("/ent/v2/tasks/{}/creations", encode_path_segment(task_id)),
+                "查询 Vidu 音效任务失败",
+            )
+            .await
+        }
+    }
+}
+
+#[derive(Clone, Debug)]
+struct PersistedAudioAsset {
+    asset_object_id: String,
+    audio_src: String,
+}
+
+async fn persist_generated_audio_asset(
+    state: &AppState,
+    http_client: &reqwest::Client,
+    owner_user_id: &str,
+    profile_id: Option<String>,
+    scene_id: String,
+    task_id: &str,
+    slot: AudioAssetSlot,
+    audio: DownloadedAudio,
+) -> Result<PersistedAudioAsset, AppError> {
+    let oss_client = state.oss_client().ok_or_else(|| {
+        AppError::from_status(StatusCode::SERVICE_UNAVAILABLE).with_details(json!({
+            "provider": "aliyun-oss",
+            "reason": "OSS 未完成环境变量配置",
+        }))
+    })?;
+
+    let file_name = format!("{}-{}.{}", slot.file_stem(), task_id, audio.extension);
+    let put_result = oss_client
+        .put_object(
+            http_client,
+            OssPutObjectRequest {
+                prefix: LegacyAssetPrefix::CustomWorldScenes,
+                path_segments: vec![
+                    "visual-novel".to_string(),
+                    profile_id.clone().unwrap_or_else(|| "draft".to_string()),
+                    scene_id.clone(),
+                    slot.slot().to_string(),
+                ],
+                file_name,
+                content_type: Some(audio.mime_type.clone()),
+                access: OssObjectAccess::Private,
+                metadata: build_audio_asset_metadata(
+                    owner_user_id,
+                    profile_id.as_deref(),
+                    &scene_id,
+                    slot,
+                ),
+                body: audio.bytes,
+            },
+        )
+        .await
+        .map_err(|error| map_oss_error(error, "aliyun-oss"))?;
+    let head = oss_client
+        .head_object(
+            http_client,
+            platform_oss::OssHeadObjectRequest {
+                object_key: put_result.object_key.clone(),
+            },
+        )
+        .await
+        .map_err(|error| map_oss_error(error, "aliyun-oss"))?;
+    let now_micros = current_utc_micros();
+    let asset_object = state
+        .spacetime_client()
+        .confirm_asset_object(
+            build_asset_object_upsert_input(
+                generate_asset_object_id(now_micros),
+                head.bucket,
+                head.object_key,
+                AssetObjectAccessPolicy::Private,
+                head.content_type.or(Some(audio.mime_type)),
+                head.content_length,
+                head.etag,
+                slot.asset_kind().to_string(),
+                Some(task_id.to_string()),
+                Some(owner_user_id.to_string()),
+                profile_id.clone(),
+                Some(scene_id.clone()),
+                now_micros,
+            )
+            .map_err(map_asset_field_error)?,
+        )
+        .await
+        .map_err(map_spacetime_error)?;
+    state
+        .spacetime_client()
+        .bind_asset_object_to_entity(
+            build_asset_entity_binding_input(
+                generate_asset_binding_id(now_micros),
+                asset_object.asset_object_id.clone(),
+                AUDIO_ENTITY_KIND.to_string(),
+                scene_id,
+                slot.slot().to_string(),
+                slot.asset_kind().to_string(),
+                Some(owner_user_id.to_string()),
+                profile_id,
+                now_micros,
+            )
+            .map_err(map_asset_field_error)?,
+        )
+        .await
+        .map_err(map_spacetime_error)?;
+
+    Ok(PersistedAudioAsset {
+        asset_object_id: asset_object.asset_object_id,
+        audio_src: put_result.legacy_public_path,
+    })
+}
+
+fn build_audio_asset_metadata(
+    owner_user_id: &str,
+    profile_id: Option<&str>,
+    scene_id: &str,
+    slot: AudioAssetSlot,
+) -> BTreeMap<String, String> {
+    let mut metadata = BTreeMap::from([
+        ("asset-kind".to_string(), slot.asset_kind().to_string()),
+        ("owner-user-id".to_string(), owner_user_id.to_string()),
+        ("entity-kind".to_string(), AUDIO_ENTITY_KIND.to_string()),
+        ("entity-id".to_string(), scene_id.to_string()),
+        ("slot".to_string(), slot.slot().to_string()),
+        ("provider".to_string(), slot.provider().to_string()),
+    ]);
+    if let Some(profile_id) = profile_id {
+        metadata.insert("profile-id".to_string(), profile_id.to_string());
+    }
+    metadata
+}
+
+fn require_vector_engine_audio_settings(
+    state: &AppState,
+) -> Result<VectorEngineAudioSettings, AppError> {
+    let base_url = state
+        .config
+        .vector_engine_base_url
+        .trim()
+        .trim_end_matches('/');
+    if base_url.is_empty() {
+        return Err(
+            AppError::from_status(StatusCode::SERVICE_UNAVAILABLE).with_details(json!({
+                "provider": VECTOR_ENGINE_PROVIDER,
+                "reason": "VECTOR_ENGINE_BASE_URL 未配置",
+            })),
+        );
+    }
+
+    let api_key = state
+        .config
+        .vector_engine_api_key
+        .as_deref()
+        .map(str::trim)
+        .filter(|value| !value.is_empty())
+        .ok_or_else(|| {
+            AppError::from_status(StatusCode::SERVICE_UNAVAILABLE).with_details(json!({
+                "provider": VECTOR_ENGINE_PROVIDER,
+                "reason": "VECTOR_ENGINE_API_KEY 未配置",
+            }))
+        })?;
+
+    Ok(VectorEngineAudioSettings {
+        base_url: base_url.to_string(),
+        api_key: api_key.to_string(),
+        request_timeout_ms: state.config.vector_engine_audio_request_timeout_ms.max(1),
+    })
+}
+
+fn build_vector_engine_audio_http_client(
+    settings: &VectorEngineAudioSettings,
+) -> Result<reqwest::Client, AppError> {
+    reqwest::Client::builder()
+        .timeout(Duration::from_millis(settings.request_timeout_ms))
+        .build()
+        .map_err(|error| {
+            AppError::from_status(StatusCode::INTERNAL_SERVER_ERROR).with_details(json!({
+                "provider": VECTOR_ENGINE_PROVIDER,
+                "message": format!("构造 VectorEngine 音频生成 HTTP 客户端失败:{error}"),
+            }))
+        })
+}
+
+async fn post_vector_engine_json(
+    http_client: &reqwest::Client,
+    settings: &VectorEngineAudioSettings,
+    path: &str,
+    body: Value,
+    failure_context: &str,
+) -> Result<Value, AppError> {
+    let response = http_client
+        .post(format!("{}{}", settings.base_url, path))
+        .header(
+            header::AUTHORIZATION,
+            format!("Bearer {}", settings.api_key),
+        )
+        .header(header::ACCEPT, "application/json")
+        .header(header::CONTENT_TYPE, "application/json")
+        .json(&body)
+        .send()
+        .await
+        .map_err(|error| vector_engine_bad_gateway(format!("{failure_context}:{error}")))?;
+    parse_vector_engine_response(response, failure_context).await
+}
+
+async fn get_vector_engine_json(
+    http_client: &reqwest::Client,
+    settings: &VectorEngineAudioSettings,
+    path: &str,
+    failure_context: &str,
+) -> Result<Value, AppError> {
+    let response = http_client
+        .get(format!("{}{}", settings.base_url, path))
+        .header(
+            header::AUTHORIZATION,
+            format!("Bearer {}", settings.api_key),
+        )
+        .header(header::ACCEPT, "application/json")
+        .send()
+        .await
+        .map_err(|error| vector_engine_bad_gateway(format!("{failure_context}:{error}")))?;
+    parse_vector_engine_response(response, failure_context).await
+}
+
+async fn parse_vector_engine_response(
+    response: reqwest::Response,
+    failure_context: &str,
+) -> Result<Value, AppError> {
+    let status = response.status();
+    let raw_text = response.text().await.map_err(|error| {
+        vector_engine_bad_gateway(format!("{failure_context}:读取响应失败:{error}"))
+    })?;
+    if !status.is_success() {
+        return Err(
+            AppError::from_status(StatusCode::BAD_GATEWAY).with_details(json!({
+                "provider": VECTOR_ENGINE_PROVIDER,
+                "message": failure_context,
+                "status": status.as_u16(),
+                "rawExcerpt": truncate_raw(raw_text.as_str()),
+            })),
+        );
+    }
+
+    let payload = serde_json::from_str::<Value>(&raw_text).map_err(|error| {
+        AppError::from_status(StatusCode::BAD_GATEWAY).with_details(json!({
+            "provider": VECTOR_ENGINE_PROVIDER,
+            "message": format!("{failure_context}:解析响应失败:{error}"),
+            "rawExcerpt": truncate_raw(raw_text.as_str()),
})) + })?; + if let Some(code) = payload.get("code").and_then(Value::as_str) + && !matches!( + code.trim().to_ascii_lowercase().as_str(), + "success" | "succeeded" | "ok" + ) + { + return Err( + AppError::from_status(StatusCode::BAD_GATEWAY).with_details(json!({ + "provider": VECTOR_ENGINE_PROVIDER, + "message": payload + .get("message") + .and_then(Value::as_str) + .unwrap_or(failure_context), + "code": code, + })), + ); + } + + Ok(payload) +} + +async fn download_generated_audio( + http_client: &reqwest::Client, + audio_url: &str, + provider: &str, +) -> Result { + let response = http_client + .get(audio_url) + .send() + .await + .map_err(|error| vector_engine_bad_gateway(format!("下载生成音频失败:{error}")))?; + let status = response.status(); + let content_type = response + .headers() + .get(header::CONTENT_TYPE) + .and_then(|value| value.to_str().ok()) + .unwrap_or("audio/mpeg") + .to_string(); + let body = response + .bytes() + .await + .map_err(|error| vector_engine_bad_gateway(format!("读取生成音频内容失败:{error}")))?; + if !status.is_success() { + return Err( + AppError::from_status(StatusCode::BAD_GATEWAY).with_details(json!({ + "provider": provider, + "message": "下载生成音频失败", + "status": status.as_u16(), + })), + ); + } + if body.is_empty() || body.len() > MAX_GENERATED_AUDIO_BYTES { + return Err( + AppError::from_status(StatusCode::BAD_GATEWAY).with_details(json!({ + "provider": provider, + "message": "生成音频内容为空或超过大小上限", + })), + ); + } + + let mime_type = normalize_audio_mime_type(&content_type, audio_url); + Ok(DownloadedAudio { + extension: audio_mime_to_extension(&mime_type).to_string(), + mime_type, + bytes: body.to_vec(), + }) +} + +fn extract_audio_urls(payload: &Value) -> Vec { + let mut urls = Vec::new(); + collect_audio_url_strings(payload, &mut urls); + let mut deduped = Vec::new(); + for url in urls { + if !deduped.contains(&url) { + deduped.push(url); + } + } + deduped +} + +fn collect_audio_url_strings(value: &Value, output: &mut Vec) { + match value { + 
Value::Object(object) => { + for (key, value) in object { + if let Some(raw) = value.as_str() + && looks_like_audio_url_key(key) + && looks_like_http_url(raw) + { + output.push(raw.trim().to_string()); + } + collect_audio_url_strings(value, output); + } + } + Value::Array(items) => { + for item in items { + collect_audio_url_strings(item, output); + } + } + Value::String(raw) if looks_like_http_url(raw) && looks_like_audio_url(raw) => { + output.push(raw.trim().to_string()); + } + _ => {} + } +} + +fn looks_like_audio_url_key(key: &str) -> bool { + let normalized = key.trim().to_ascii_lowercase(); + normalized.contains("audio") + || normalized.contains("wav") + || normalized.contains("mp3") + || normalized.contains("fileurl") + || normalized == "url" + || normalized.ends_with("_url") + || normalized.ends_with("url") +} + +fn looks_like_http_url(value: &str) -> bool { + let value = value.trim().to_ascii_lowercase(); + value.starts_with("http://") || value.starts_with("https://") +} + +fn looks_like_audio_url(value: &str) -> bool { + let value = value + .trim() + .split('?') + .next() + .unwrap_or_default() + .to_ascii_lowercase(); + value.ends_with(".mp3") + || value.ends_with(".wav") + || value.ends_with(".m4a") + || value.ends_with(".aac") + || value.ends_with(".ogg") + || value.ends_with(".webm") + || value.ends_with(".flac") +} + +fn normalize_audio_mime_type(content_type: &str, audio_url: &str) -> String { + let mime_type = content_type + .split(';') + .next() + .map(str::trim) + .filter(|value| value.starts_with("audio/")) + .unwrap_or(""); + match mime_type { + "audio/mpeg" | "audio/mp3" => "audio/mpeg".to_string(), + "audio/wav" | "audio/wave" | "audio/x-wav" => "audio/wav".to_string(), + "audio/ogg" => "audio/ogg".to_string(), + "audio/webm" => "audio/webm".to_string(), + "audio/aac" => "audio/aac".to_string(), + "audio/flac" => "audio/flac".to_string(), + "audio/mp4" | "audio/x-m4a" => "audio/mp4".to_string(), + _ => mime_type_from_audio_url(audio_url), + 
} +} + +fn mime_type_from_audio_url(audio_url: &str) -> String { + let path = audio_url + .split('?') + .next() + .unwrap_or_default() + .to_ascii_lowercase(); + if path.ends_with(".wav") { + "audio/wav".to_string() + } else if path.ends_with(".ogg") { + "audio/ogg".to_string() + } else if path.ends_with(".webm") { + "audio/webm".to_string() + } else if path.ends_with(".aac") { + "audio/aac".to_string() + } else if path.ends_with(".flac") { + "audio/flac".to_string() + } else if path.ends_with(".m4a") { + "audio/mp4".to_string() + } else { + "audio/mpeg".to_string() + } +} + +fn audio_mime_to_extension(mime_type: &str) -> &'static str { + match mime_type { + "audio/wav" => "wav", + "audio/ogg" => "ogg", + "audio/webm" => "webm", + "audio/aac" => "aac", + "audio/flac" => "flac", + "audio/mp4" => "m4a", + _ => "mp3", + } +} + +fn normalize_limited_text( + value: &str, + field: &'static str, + max_chars: usize, +) -> Result { + let normalized = value.trim().to_string(); + if normalized.is_empty() { + return Err( + AppError::from_status(StatusCode::BAD_REQUEST).with_details(json!({ + "provider": VECTOR_ENGINE_PROVIDER, + "field": field, + "message": format!("{field} 不能为空"), + })), + ); + } + if normalized.chars().count() > max_chars { + return Err( + AppError::from_status(StatusCode::BAD_REQUEST).with_details(json!({ + "provider": VECTOR_ENGINE_PROVIDER, + "field": field, + "message": format!("{field} 超过 {} 字符", max_chars), + })), + ); + } + Ok(normalized) +} + +fn normalize_optional_text(value: Option<&str>) -> Option { + value + .map(str::trim) + .filter(|value| !value.is_empty()) + .map(ToOwned::to_owned) +} + +fn normalize_task_status(status: &str) -> String { + let normalized = status.trim().to_ascii_lowercase().replace(' ', "_"); + match normalized.as_str() { + "finish" | "finished" | "complete" | "completed" | "success" | "succeeded" => { + "completed".to_string() + } + "" => "processing".to_string(), + value => value.to_string(), + } +} + +fn 
is_pending_task_status(status: &str) -> bool { + matches!( + status, + "created" | "pending" | "queued" | "processing" | "running" | "submitted" | "started" + ) +} + +fn is_failed_task_status(status: &str) -> bool { + matches!( + status, + "failed" | "error" | "canceled" | "cancelled" | "rejected" | "expired" + ) +} + +fn find_first_string_by_key(value: &Value, target_key: &str) -> Option { + match value { + Value::Object(object) => { + for (key, value) in object { + if key.eq_ignore_ascii_case(target_key) + && let Some(text) = value.as_str() + { + return Some(text.trim().to_string()); + } + if let Some(found) = find_first_string_by_key(value, target_key) { + return Some(found); + } + } + None + } + Value::Array(items) => items + .iter() + .find_map(|item| find_first_string_by_key(item, target_key)), + _ => None, + } +} + +fn extract_string_by_path(value: &Value, path: &[&str]) -> Option { + let mut current = value; + for key in path { + current = current.get(*key)?; + } + current.as_str().map(str::trim).map(ToOwned::to_owned) +} + +fn encode_path_segment(value: &str) -> String { + urlencoding::encode(value).into_owned() +} + +fn truncate_raw(raw_text: &str) -> String { + raw_text.chars().take(800).collect() +} + +fn current_utc_micros() -> i64 { + shared_kernel::offset_datetime_to_unix_micros(time::OffsetDateTime::now_utc()) +} + +fn map_asset_field_error(error: module_assets::AssetObjectFieldError) -> AppError { + AppError::from_status(StatusCode::BAD_REQUEST).with_details(json!({ + "provider": "asset-object", + "message": error.to_string(), + })) +} + +fn map_spacetime_error(error: spacetime_client::SpacetimeClientError) -> AppError { + AppError::from_status(StatusCode::BAD_GATEWAY).with_details(json!({ + "provider": "spacetimedb", + "message": error.to_string(), + })) +} + +fn vector_engine_bad_gateway(message: impl Into) -> AppError { + AppError::from_status(StatusCode::BAD_GATEWAY).with_details(json!({ + "provider": VECTOR_ENGINE_PROVIDER, + "message": 
message.into(), + })) +} + +fn parse_json_payload( + request_context: &RequestContext, + payload: Result, JsonRejection>, +) -> Result, Response> { + payload.map_err(|rejection| { + AppError::from_status(StatusCode::BAD_REQUEST) + .with_message(format!("请求体 JSON 不合法:{rejection}")) + .into_response_with_context(Some(request_context)) + }) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn normalizes_audio_mime_type_from_content_type_and_url() { + assert_eq!( + normalize_audio_mime_type("audio/x-wav; charset=utf-8", "https://x/a.bin"), + "audio/wav" + ); + assert_eq!( + normalize_audio_mime_type("application/octet-stream", "https://x/a.m4a?token=1"), + "audio/mp4" + ); + assert_eq!(audio_mime_to_extension("audio/mp4"), "m4a"); + } + + #[test] + fn extracts_nested_audio_urls() { + let payload = json!({ + "Response": { + "Status": "FINISH", + "Task": { + "Output": { + "FileInfos": [ + { "FileUrl": "https://cdn.example.test/audio.wav" } + ] + } + } + } + }); + + assert_eq!( + extract_audio_urls(&payload), + vec!["https://cdn.example.test/audio.wav".to_string()] + ); + } + + #[test] + fn vector_engine_task_status_is_stable() { + assert_eq!(normalize_task_status("FINISH"), "completed"); + assert!(is_pending_task_status("processing")); + assert!(is_failed_task_status("failed")); + } + + #[test] + fn validates_prompt_length() { + let prompt = "声".repeat(VIDU_PROMPT_MAX_CHARS + 1); + let error = normalize_limited_text(&prompt, "prompt", VIDU_PROMPT_MAX_CHARS) + .expect_err("long prompt should fail"); + assert_eq!(error.status_code(), StatusCode::BAD_REQUEST); + } +} diff --git a/server-rs/crates/api-server/src/visual_novel.rs b/server-rs/crates/api-server/src/visual_novel.rs index e7468465..4b137b0f 100644 --- a/server-rs/crates/api-server/src/visual_novel.rs +++ b/server-rs/crates/api-server/src/visual_novel.rs @@ -1532,7 +1532,10 @@ mod tests { let summary = resolve_document_summary_for_prompt(&record, None) .expect("document session should build summary"); - 
assert_eq!(summary.chars().count(), VISUAL_NOVEL_DOCUMENT_SUMMARY_MAX_CHARS); + assert_eq!( + summary.chars().count(), + VISUAL_NOVEL_DOCUMENT_SUMMARY_MAX_CHARS + ); assert!(summary.contains("旧书店")); } @@ -1598,7 +1601,8 @@ async fn create_or_update_creation_draft( latest_user_text: Option, ) -> Result { let now_iso = current_utc_iso(); - let document_summary = resolve_document_summary_for_prompt(session, latest_user_text.as_deref()); + let document_summary = + resolve_document_summary_for_prompt(session, latest_user_text.as_deref()); if let Some(llm_client) = state.llm_client() { let current_draft = session.draft.as_ref(); let recent_messages = session @@ -1682,7 +1686,12 @@ fn resolve_document_summary_for_prompt( (!seed_text.is_empty()).then_some(seed_text) })?; - Some(source.chars().take(VISUAL_NOVEL_DOCUMENT_SUMMARY_MAX_CHARS).collect()) + Some( + source + .chars() + .take(VISUAL_NOVEL_DOCUMENT_SUMMARY_MAX_CHARS) + .collect(), + ) } async fn compile_visual_novel_session_inner( diff --git a/server-rs/crates/api-server/src/volcengine_speech.rs b/server-rs/crates/api-server/src/volcengine_speech.rs new file mode 100644 index 00000000..cb631dcb --- /dev/null +++ b/server-rs/crates/api-server/src/volcengine_speech.rs @@ -0,0 +1,552 @@ +use axum::{ + Json, + body::Body, + extract::{ + State, + ws::{Message as ClientWsMessage, WebSocket, WebSocketUpgrade}, + }, + http::{HeaderValue, StatusCode, header}, + response::{IntoResponse, Response}, +}; +use futures_util::{SinkExt, StreamExt, TryStreamExt}; +use platform_speech::{ + AsrAudioConfig, AsrFrameKind, PublicSpeechConfig, PublicSpeechEndpoints, SpeechError, + TtsAudioParams, TtsBidirectionClientEvent, TtsSseRequest, VolcengineSpeechClient, + VolcengineSpeechConfig, build_asr_frame, build_asr_full_client_request, + build_tts_bidirection_frame_from_client_event, default_asr_request_payload, + parse_asr_response_frame, parse_tts_response_frame, tts_response_to_client_value, +}; +use serde_json::{Value, json}; +use 
tokio_tungstenite::tungstenite::Message as UpstreamWsMessage; +use tracing::{info, warn}; + +use crate::{ + api_response::json_success_body, auth::AuthenticatedAccessToken, http_error::AppError, + request_context::RequestContext, state::AppState, +}; + +const PROVIDER: &str = "volcengine-speech"; + +pub async fn get_volcengine_speech_config( + State(state): State, + axum::extract::Extension(request_context): axum::extract::Extension, +) -> Json { + json_success_body(Some(&request_context), public_speech_config(&state)) +} + +pub async fn stream_volcengine_asr( + State(state): State, + axum::extract::Extension(authenticated): axum::extract::Extension, + ws: WebSocketUpgrade, +) -> Result { + let client = build_speech_client(&state) + .map_err(|error| map_speech_error(error).into_response_with_context(None))?; + let user_id = authenticated.claims().user_id().to_string(); + Ok(ws.on_upgrade(move |socket| proxy_asr_websocket(socket, client, user_id))) +} + +pub async fn stream_volcengine_tts_bidirection( + State(state): State, + ws: WebSocketUpgrade, +) -> Result { + let client = build_speech_client(&state) + .map_err(|error| map_speech_error(error).into_response_with_context(None))?; + Ok(ws.on_upgrade(move |socket| proxy_tts_bidirection_websocket(socket, client))) +} + +pub async fn stream_volcengine_tts_sse( + State(state): State, + axum::extract::Extension(request_context): axum::extract::Extension, + axum::extract::Extension(authenticated): axum::extract::Extension, + payload: Result, axum::extract::rejection::JsonRejection>, +) -> Result { + let Json(payload) = payload.map_err(|rejection| { + AppError::from_status(StatusCode::BAD_REQUEST) + .with_message(format!("请求体 JSON 不合法:{rejection}")) + .into_response_with_context(Some(&request_context)) + })?; + let client = build_speech_client(&state).map_err(|error| { + map_speech_error(error).into_response_with_context(Some(&request_context)) + })?; + let upstream_request = client + 
.build_tts_sse_upstream_request(payload, authenticated.claims().user_id()) + .map_err(|error| { + map_speech_error(error).into_response_with_context(Some(&request_context)) + })?; + let http_client = reqwest::Client::builder() + .timeout(upstream_request.timeout) + .build() + .map_err(|error| { + AppError::from_status(StatusCode::INTERNAL_SERVER_ERROR) + .with_details(json!({ + "provider": PROVIDER, + "message": format!("构造火山语音 HTTP 客户端失败:{error}"), + })) + .into_response_with_context(Some(&request_context)) + })?; + let upstream_response = http_client + .post(upstream_request.url) + .headers(upstream_request.headers) + .json(&upstream_request.body) + .send() + .await + .map_err(|error| { + AppError::from_status(StatusCode::BAD_GATEWAY) + .with_details(json!({ + "provider": PROVIDER, + "message": format!("请求火山 TTS SSE 失败:{error}"), + })) + .into_response_with_context(Some(&request_context)) + })?; + let status = upstream_response.status(); + let log_id = upstream_response + .headers() + .get("X-Tt-Logid") + .and_then(|value| value.to_str().ok()) + .map(ToOwned::to_owned); + if !status.is_success() { + let raw_text = upstream_response.text().await.unwrap_or_default(); + return Err(AppError::from_status(StatusCode::BAD_GATEWAY) + .with_details(json!({ + "provider": PROVIDER, + "status": status.as_u16(), + "logId": log_id, + "rawExcerpt": raw_text.chars().take(800).collect::(), + })) + .into_response_with_context(Some(&request_context))); + } + + let byte_stream = upstream_response + .bytes_stream() + .map_err(std::io::Error::other); + let mut response = Response::new(Body::from_stream(byte_stream)); + *response.status_mut() = StatusCode::OK; + response.headers_mut().insert( + header::CONTENT_TYPE, + HeaderValue::from_static("text/event-stream; charset=utf-8"), + ); + response + .headers_mut() + .insert(header::CACHE_CONTROL, HeaderValue::from_static("no-cache")); + if let Some(log_id) = log_id.and_then(|value| HeaderValue::from_str(&value).ok()) { + 
response.headers_mut().insert("x-volcengine-logid", log_id); + } + Ok(response) +} + +async fn proxy_asr_websocket(socket: WebSocket, client: VolcengineSpeechClient, user_id: String) { + let (mut browser_sender, mut browser_receiver) = socket.split(); + let Ok((upstream, response_headers)) = client.connect_asr().await else { + let _ = browser_sender + .send(ClientWsMessage::Text( + json!({ + "type": "error", + "provider": PROVIDER, + "message": "连接火山 ASR WebSocket 失败", + }) + .to_string() + .into(), + )) + .await; + return; + }; + if let Some(log_id) = response_headers.get("x-tt-logid") { + info!(%log_id, "火山 ASR WebSocket 已连接"); + } + let (mut upstream_sender, mut upstream_receiver) = upstream.split(); + let mut has_sent_start = false; + let mut last_audio_sent = false; + + let browser_to_upstream = async { + while let Some(message) = browser_receiver.next().await { + match message { + Ok(ClientWsMessage::Text(text)) => { + let value = serde_json::from_str::(text.as_str()).unwrap_or_else(|_| { + json!({ + "request": { + "context": text.as_str(), + } + }) + }); + if value + .get("type") + .and_then(Value::as_str) + .is_some_and(|kind| kind.eq_ignore_ascii_case("finish")) + { + let frame = build_asr_frame(AsrFrameKind::LastAudio, &[])?; + upstream_sender + .send(UpstreamWsMessage::Binary(frame.into())) + .await + .map_err(map_ws_send_error)?; + last_audio_sent = true; + continue; + } + if !has_sent_start { + let payload = default_asr_request_payload(&user_id, Some(value)); + let frame = build_asr_full_client_request(&payload)?; + upstream_sender + .send(UpstreamWsMessage::Binary(frame.into())) + .await + .map_err(map_ws_send_error)?; + has_sent_start = true; + } + } + Ok(ClientWsMessage::Binary(bytes)) => { + if !has_sent_start { + let payload = default_asr_request_payload(&user_id, None); + let frame = build_asr_full_client_request(&payload)?; + upstream_sender + .send(UpstreamWsMessage::Binary(frame.into())) + .await + .map_err(map_ws_send_error)?; + 
has_sent_start = true; + } + let frame = build_asr_frame(AsrFrameKind::Audio, &bytes)?; + upstream_sender + .send(UpstreamWsMessage::Binary(frame.into())) + .await + .map_err(map_ws_send_error)?; + } + Ok(ClientWsMessage::Close(_)) => break, + Ok(ClientWsMessage::Ping(bytes)) => { + upstream_sender + .send(UpstreamWsMessage::Ping(bytes)) + .await + .map_err(map_ws_send_error)?; + } + Ok(ClientWsMessage::Pong(_)) => {} + Err(error) => { + return Err(SpeechError::Upstream(format!( + "读取浏览器 ASR WebSocket 失败:{error}" + ))); + } + } + } + if has_sent_start && !last_audio_sent { + let frame = build_asr_frame(AsrFrameKind::LastAudio, &[])?; + let _ = upstream_sender + .send(UpstreamWsMessage::Binary(frame.into())) + .await; + } + Ok::<(), SpeechError>(()) + }; + + let upstream_to_browser = async { + while let Some(message) = upstream_receiver.next().await { + match message { + Ok(UpstreamWsMessage::Binary(bytes)) => { + let parsed = parse_asr_response_frame(&bytes)?; + let value = json!({ + "type": "asr_response", + "sequence": parsed.sequence, + "payload": parsed.payload, + "errorCode": parsed.error_code, + }); + browser_sender + .send(ClientWsMessage::Text(value.to_string().into())) + .await + .map_err(map_client_ws_send_error)?; + } + Ok(UpstreamWsMessage::Text(text)) => { + browser_sender + .send(ClientWsMessage::Text(text)) + .await + .map_err(map_client_ws_send_error)?; + } + Ok(UpstreamWsMessage::Close(close)) => { + let _ = browser_sender.send(ClientWsMessage::Close(close)).await; + break; + } + Ok(UpstreamWsMessage::Ping(bytes)) => { + browser_sender + .send(ClientWsMessage::Ping(bytes)) + .await + .map_err(map_client_ws_send_error)?; + } + Ok(UpstreamWsMessage::Pong(_)) => {} + Ok(UpstreamWsMessage::Frame(_)) => {} + Err(error) => { + return Err(SpeechError::Upstream(format!( + "读取火山 ASR WebSocket 失败:{error}" + ))); + } + } + } + Ok::<(), SpeechError>(()) + }; + + let mut browser_to_upstream = Box::pin(browser_to_upstream); + let mut upstream_to_browser = 
Box::pin(upstream_to_browser); + let result = tokio::select! { + result = &mut browser_to_upstream => result, + result = &mut upstream_to_browser => result, + }; + if let Err(error) = result { + warn!(error = %error, "火山 ASR WebSocket 代理中断"); + } +} + +async fn proxy_tts_bidirection_websocket(socket: WebSocket, client: VolcengineSpeechClient) { + let (mut browser_sender, mut browser_receiver) = socket.split(); + let Ok((upstream, response_headers)) = client.connect_tts_bidirection().await else { + let _ = browser_sender + .send(ClientWsMessage::Text( + json!({ + "type": "error", + "provider": PROVIDER, + "message": "连接火山 TTS WebSocket 失败", + }) + .to_string() + .into(), + )) + .await; + return; + }; + if let Some(log_id) = response_headers.get("x-tt-logid") { + info!(%log_id, "火山 TTS WebSocket 已连接"); + } + let (mut upstream_sender, mut upstream_receiver) = upstream.split(); + + let browser_to_upstream = async { + while let Some(message) = browser_receiver.next().await { + match message { + Ok(ClientWsMessage::Text(text)) => { + let event = serde_json::from_str::(text.as_str()) + .map_err(|error| { + SpeechError::InvalidFrame(format!( + "TTS 浏览器事件 JSON 不合法:{error}" + )) + })?; + let frame = build_tts_bidirection_frame_from_client_event(event)?; + upstream_sender + .send(UpstreamWsMessage::Binary(frame.into())) + .await + .map_err(map_ws_send_error)?; + } + Ok(ClientWsMessage::Close(_)) => break, + Ok(ClientWsMessage::Ping(bytes)) => { + upstream_sender + .send(UpstreamWsMessage::Ping(bytes)) + .await + .map_err(map_ws_send_error)?; + } + Ok(ClientWsMessage::Binary(_)) | Ok(ClientWsMessage::Pong(_)) => {} + Err(error) => { + return Err(SpeechError::Upstream(format!( + "读取浏览器 TTS WebSocket 失败:{error}" + ))); + } + } + } + Ok::<(), SpeechError>(()) + }; + + let upstream_to_browser = async { + while let Some(message) = upstream_receiver.next().await { + match message { + Ok(UpstreamWsMessage::Binary(bytes)) => { + let parsed = parse_tts_response_frame(&bytes)?; + if let 
Some(audio) = parsed.audio.clone() { + browser_sender + .send(ClientWsMessage::Binary(audio.into())) + .await + .map_err(map_client_ws_send_error)?; + } + if parsed.payload.is_some() || parsed.error_code.is_some() { + browser_sender + .send(ClientWsMessage::Text( + tts_response_to_client_value(&parsed).to_string().into(), + )) + .await + .map_err(map_client_ws_send_error)?; + } + } + Ok(UpstreamWsMessage::Text(text)) => { + browser_sender + .send(ClientWsMessage::Text(text)) + .await + .map_err(map_client_ws_send_error)?; + } + Ok(UpstreamWsMessage::Close(close)) => { + let _ = browser_sender.send(ClientWsMessage::Close(close)).await; + break; + } + Ok(UpstreamWsMessage::Ping(bytes)) => { + browser_sender + .send(ClientWsMessage::Ping(bytes)) + .await + .map_err(map_client_ws_send_error)?; + } + Ok(UpstreamWsMessage::Pong(_)) => {} + Ok(UpstreamWsMessage::Frame(_)) => {} + Err(error) => { + return Err(SpeechError::Upstream(format!( + "读取火山 TTS WebSocket 失败:{error}" + ))); + } + } + } + Ok::<(), SpeechError>(()) + }; + + let mut browser_to_upstream = Box::pin(browser_to_upstream); + let mut upstream_to_browser = Box::pin(upstream_to_browser); + let result = tokio::select! 
{ + result = &mut browser_to_upstream => result, + result = &mut upstream_to_browser => result, + }; + if let Err(error) = result { + warn!(error = %error, "火山 TTS WebSocket 代理中断"); + } +} + +fn build_speech_client(state: &AppState) -> Result { + Ok(VolcengineSpeechClient::new(VolcengineSpeechConfig::new( + state.config.volcengine_speech_api_key.clone(), + state.config.volcengine_speech_app_id.clone(), + state.config.volcengine_speech_access_key.clone(), + state.config.volcengine_speech_asr_resource_id.clone(), + state.config.volcengine_speech_tts_resource_id.clone(), + state.config.volcengine_speech_asr_ws_url.clone(), + state + .config + .volcengine_speech_tts_bidirection_ws_url + .clone(), + state.config.volcengine_speech_tts_sse_url.clone(), + state.config.volcengine_speech_request_timeout_ms, + )?)) +} + +fn public_speech_config(state: &AppState) -> PublicSpeechConfig { + PublicSpeechConfig { + asr_resource_id: state.config.volcengine_speech_asr_resource_id.clone(), + tts_resource_id: state.config.volcengine_speech_tts_resource_id.clone(), + asr_audio: AsrAudioConfig::default(), + tts_audio: TtsAudioParams::default(), + endpoints: PublicSpeechEndpoints { + asr_stream: "/api/speech/volcengine/asr/stream", + tts_bidirection: "/api/speech/volcengine/tts/bidirection", + tts_sse: "/api/speech/volcengine/tts/sse", + }, + } +} + +fn map_speech_error(error: SpeechError) -> AppError { + match error { + SpeechError::InvalidConfig(message) => { + AppError::from_status(StatusCode::SERVICE_UNAVAILABLE).with_details(json!({ + "provider": PROVIDER, + "message": message, + })) + } + SpeechError::InvalidHeader(message) + | SpeechError::InvalidFrame(message) + | SpeechError::Serialize(message) + | SpeechError::Io(message) + | SpeechError::Upstream(message) => AppError::from_status(StatusCode::BAD_GATEWAY) + .with_details(json!({ + "provider": PROVIDER, + "message": message, + })), + } +} + +fn map_ws_send_error(error: tokio_tungstenite::tungstenite::Error) -> SpeechError { + 
SpeechError::Upstream(format!("发送火山语音 WebSocket 帧失败:{error}")) +} + +fn map_client_ws_send_error(error: axum::Error) -> SpeechError { + SpeechError::Upstream(format!("发送浏览器语音 WebSocket 帧失败:{error}")) +} + +#[cfg(test)] +mod tests { + use axum::{ + body::Body, + http::{Request, StatusCode}, + }; + use http_body_util::BodyExt; + use serde_json::Value; + use tower::ServiceExt; + + use super::*; + use crate::{app::build_router, config::AppConfig, state::AppState}; + + #[tokio::test] + async fn speech_config_route_requires_authentication() { + let app = build_router(AppState::new(AppConfig::default()).expect("state should build")); + + let response = app + .oneshot( + Request::builder() + .uri("/api/speech/volcengine/config") + .body(Body::empty()) + .expect("request should build"), + ) + .await + .expect("request should complete"); + + assert_eq!(response.status(), StatusCode::UNAUTHORIZED); + } + + #[tokio::test] + async fn speech_config_route_returns_no_secret_fields() { + let mut config = AppConfig::default(); + config.volcengine_speech_api_key = Some("secret-key".to_string()); + let state = AppState::new(config).expect("state should build"); + state + .seed_test_phone_user_with_password("13800138088", "Password123") + .await; + let app = build_router(state); + let login_response = app + .clone() + .oneshot( + Request::builder() + .method("POST") + .uri("/api/auth/entry") + .header("content-type", "application/json") + .body(Body::from( + json!({ + "phone": "13800138088", + "password": "Password123" + }) + .to_string(), + )) + .expect("login request should build"), + ) + .await + .expect("login should complete"); + let login_body = login_response + .into_body() + .collect() + .await + .expect("login body should collect") + .to_bytes(); + let login_payload: Value = + serde_json::from_slice(&login_body).expect("login body should be json"); + let token = login_payload["token"].as_str().expect("token should exist"); + + let response = app + .oneshot( + 
Request::builder() + .uri("/api/speech/volcengine/config") + .header("authorization", format!("Bearer {token}")) + .body(Body::empty()) + .expect("request should build"), + ) + .await + .expect("request should complete"); + + assert_eq!(response.status(), StatusCode::OK); + let body = response + .into_body() + .collect() + .await + .expect("body should collect") + .to_bytes(); + let payload_text = String::from_utf8_lossy(&body); + assert!(!payload_text.contains("secret-key")); + assert!(!payload_text.contains("apiKey")); + assert!(payload_text.contains("asrResourceId")); + } +} diff --git a/server-rs/crates/module-puzzle/src/creative_templates.rs b/server-rs/crates/module-puzzle/src/creative_templates.rs index 06dc7185..b933647e 100644 --- a/server-rs/crates/module-puzzle/src/creative_templates.rs +++ b/server-rs/crates/module-puzzle/src/creative_templates.rs @@ -198,11 +198,7 @@ mod tests { .iter() .any(|template| template.template_id == PUZZLE_TRAVEL_MEMORY_TEMPLATE_ID) ); - assert!( - catalog - .iter() - .all(|template| template.supported_level_mode - == PuzzleCreativeSupportedLevelMode::SingleOrMulti) - ); + assert!(catalog.iter().all(|template| template.supported_level_mode + == PuzzleCreativeSupportedLevelMode::SingleOrMulti)); } } diff --git a/server-rs/crates/module-puzzle/src/creative_tools.rs b/server-rs/crates/module-puzzle/src/creative_tools.rs index dd0c9e00..5bbbc0e5 100644 --- a/server-rs/crates/module-puzzle/src/creative_tools.rs +++ b/server-rs/crates/module-puzzle/src/creative_tools.rs @@ -472,8 +472,7 @@ mod tests { min_points: 4, max_points: 16, pricing_unit: PuzzleCreativePricingUnit::Point, - reason: "按旅行节点和每关图片生成次数估算,实际扣费以后端任务结算为准" - .to_string(), + reason: "按旅行节点和每关图片生成次数估算,实际扣费以后端任务结算为准".to_string(), }, work_title: "旅行记忆".to_string(), work_description: "把旅行照片做成系列拼图。".to_string(), diff --git a/server-rs/crates/platform-llm/src/lib.rs b/server-rs/crates/platform-llm/src/lib.rs index a4a9a4f6..0f51347c 100644 --- 
a/server-rs/crates/platform-llm/src/lib.rs +++ b/server-rs/crates/platform-llm/src/lib.rs @@ -42,6 +42,7 @@ pub struct LlmConfig { request_timeout_ms: u64, max_retries: u32, retry_backoff_ms: u64, + official_fallback: bool, } // 首版只冻结当前项目已稳定使用的 system/user/assistant 三种消息角色。 @@ -161,9 +162,11 @@ enum LlmRequestBody { #[derive(Serialize)] struct ChatCompletionsRequestBody { model: String, - messages: Vec, + messages: Vec, stream: bool, #[serde(skip_serializing_if = "Option::is_none")] + official_fallback: Option, + #[serde(skip_serializing_if = "Option::is_none")] max_tokens: Option, #[serde(skip_serializing_if = "Option::is_none")] web_search_options: Option, @@ -172,12 +175,41 @@ struct ChatCompletionsRequestBody { #[derive(Serialize)] struct ChatCompletionsWebSearchOptions {} +#[derive(Serialize)] +struct ChatCompletionsInputMessage { + role: &'static str, + content: ChatCompletionsInputContent, +} + +#[derive(Serialize)] +#[serde(untagged)] +enum ChatCompletionsInputContent { + Text(String), + Parts(Vec), +} + +#[derive(Serialize)] +#[serde(tag = "type")] +enum ChatCompletionsInputContentPart { + #[serde(rename = "text")] + Text { text: String }, + #[serde(rename = "image_url")] + ImageUrl { image_url: ChatCompletionsImageUrl }, +} + +#[derive(Serialize)] +struct ChatCompletionsImageUrl { + url: String, +} + #[derive(Serialize)] struct ResponsesRequestBody { model: String, stream: bool, input: Vec, #[serde(skip_serializing_if = "Option::is_none")] + official_fallback: Option, + #[serde(skip_serializing_if = "Option::is_none")] max_output_tokens: Option, #[serde(skip_serializing_if = "Option::is_none")] tools: Option>, @@ -215,6 +247,15 @@ struct LlmRawFailureInputLog<'a> { messages: &'a [LlmMessage], } +#[derive(Deserialize)] +#[serde(untagged)] +enum ChatCompletionsResponsePayload { + Direct(ChatCompletionsResponseEnvelope), + Wrapped { + data: ChatCompletionsResponseEnvelope, + }, +} + #[derive(Deserialize)] struct ChatCompletionsResponseEnvelope { id: Option, 
@@ -344,9 +385,15 @@ impl LlmConfig {
             request_timeout_ms,
             max_retries,
             retry_backoff_ms,
+            official_fallback: false,
         })
     }
 
+    pub fn with_official_fallback(mut self, official_fallback: bool) -> Self {
+        self.official_fallback = official_fallback;
+        self
+    }
+
     pub fn ark_default(api_key: String, model: String) -> Result {
         Self::new(
             LlmProvider::Ark,
@@ -387,6 +434,10 @@ impl LlmConfig {
         self.retry_backoff_ms
     }
 
+    pub fn official_fallback(&self) -> bool {
+        self.official_fallback
+    }
+
     pub fn chat_completions_url(&self) -> String {
         format!(
             "{}/{}",
@@ -886,7 +937,7 @@ impl LlmClient {
         request: &LlmTextRequest,
         stream: bool,
     ) -> Result {
-        let request_body = build_request_body(request, self.config.model(), stream);
+        let request_body = build_request_body(request, &self.config, stream);
         let model = request.resolved_model(self.config.model());
         let url = match request.protocol {
             LlmTextProtocol::ChatCompletions => self.config.chat_completions_url(),
@@ -1097,15 +1148,18 @@ fn normalize_non_empty(value: String, error_message: &str) -> Result
 fn build_request_body(
     request: &LlmTextRequest,
-    fallback_model: &str,
+    config: &LlmConfig,
     stream: bool,
 ) -> LlmRequestBody {
+    let fallback_model = config.model();
+    let official_fallback = config.official_fallback().then_some(true);
     match request.protocol {
         LlmTextProtocol::ChatCompletions => {
             LlmRequestBody::ChatCompletions(ChatCompletionsRequestBody {
                 model: request.resolved_model(fallback_model).to_string(),
-                messages: request.messages.clone(),
+                messages: map_chat_completions_input_messages(request.messages.as_slice()),
                 stream,
+                official_fallback,
                 max_tokens: request.max_tokens,
                 web_search_options: request
                     .enable_web_search
@@ -1116,6 +1170,7 @@
                 model: request.resolved_model(fallback_model).to_string(),
                 stream,
                 input: map_responses_input_messages(request.messages.as_slice()),
+                official_fallback,
                 max_output_tokens: request.max_tokens,
                 tools: request.enable_web_search.then(|| {
                     vec![ResponsesWebSearchTool {
@@ -1127,20 +1182,61 @@
     }
 }
 
+fn map_chat_completions_input_messages(
+    messages: 
&[LlmMessage],
+) -> Vec<ChatCompletionsInputMessage> {
+    messages
+        .iter()
+        .map(|message| ChatCompletionsInputMessage {
+            role: map_llm_message_role(message.role),
+            content: map_chat_completions_content(message),
+        })
+        .collect()
+}
+
+fn map_chat_completions_content(message: &LlmMessage) -> ChatCompletionsInputContent {
+    if message.content_parts.is_empty() {
+        return ChatCompletionsInputContent::Text(message.content.clone());
+    }
+
+    ChatCompletionsInputContent::Parts(
+        message
+            .content_parts
+            .iter()
+            .map(|part| match part {
+                LlmMessageContentPart::InputText { text } => {
+                    ChatCompletionsInputContentPart::Text { text: text.clone() }
+                }
+                LlmMessageContentPart::InputImage { image_url } => {
+                    ChatCompletionsInputContentPart::ImageUrl {
+                        image_url: ChatCompletionsImageUrl {
+                            url: image_url.clone(),
+                        },
+                    }
+                }
+            })
+            .collect(),
+    )
+}
+
 fn map_responses_input_messages(messages: &[LlmMessage]) -> Vec<ResponsesInputMessage> {
     messages
         .iter()
         .map(|message| ResponsesInputMessage {
-            role: match message.role {
-                LlmMessageRole::System => "system",
-                LlmMessageRole::User => "user",
-                LlmMessageRole::Assistant => "assistant",
-            },
+            role: map_llm_message_role(message.role),
             content: map_responses_content_parts(message),
         })
         .collect()
 }
 
+fn map_llm_message_role(role: LlmMessageRole) -> &'static str {
+    match role {
+        LlmMessageRole::System => "system",
+        LlmMessageRole::User => "user",
+        LlmMessageRole::Assistant => "assistant",
+    }
+}
+
 fn map_responses_content_parts(message: &LlmMessage) -> Vec<ResponsesInputContentPart> {
     if message.content_parts.is_empty() {
         return vec![ResponsesInputContentPart::InputText {
@@ -1265,8 +1361,12 @@ fn parse_chat_completions_response(
     fallback_model: &str,
     raw_text: &str,
 ) -> Result {
-    let parsed: ChatCompletionsResponseEnvelope = serde_json::from_str(raw_text)
+    let parsed: ChatCompletionsResponsePayload = serde_json::from_str(raw_text)
         .map_err(|error| LlmError::Deserialize(format!("解析 LLM JSON 响应失败:{error}")))?;
+    let parsed = match parsed {
+        
ChatCompletionsResponsePayload::Direct(envelope) => envelope,
+        ChatCompletionsResponsePayload::Wrapped { data } => data,
+    };
 
     let first_choice = parsed
         .choices
@@ -1590,6 +1690,71 @@ mod tests {
         assert_eq!(config.responses_url(), "https://example.com/base/responses");
     }
 
+    #[test]
+    fn llm_config_official_fallback_is_opt_in() {
+        let config = LlmConfig::new(
+            LlmProvider::OpenAiCompatible,
+            "https://example.com/base".to_string(),
+            "secret".to_string(),
+            "model-a".to_string(),
+            DEFAULT_REQUEST_TIMEOUT_MS,
+            DEFAULT_MAX_RETRIES,
+            DEFAULT_RETRY_BACKOFF_MS,
+        )
+        .expect("config should be valid");
+
+        assert!(!config.official_fallback());
+        assert!(config.with_official_fallback(true).official_fallback());
+    }
+
+    #[tokio::test]
+    async fn request_text_sends_official_fallback_for_openai_compatible_clients() {
+        let listener = TcpListener::bind("127.0.0.1:0").expect("listener should bind");
+        let address = listener.local_addr().expect("listener should have addr");
+        let server_handle = thread::spawn(move || {
+            let (mut stream, _) = listener.accept().expect("request should connect");
+            let request_text = read_request(&mut stream);
+            write_response(
+                &mut stream,
+                MockResponse {
+                    status_line: "200 OK",
+                    content_type: "application/json; charset=utf-8",
+                    body: r#"{"id":"resp_openai_compatible","model":"gpt-5","output_text":"兼容成功","status":"completed"}"#.to_string(),
+                    extra_headers: Vec::new(),
+                },
+            );
+            request_text
+        });
+
+        let config = LlmConfig::new(
+            LlmProvider::OpenAiCompatible,
+            format!("http://{address}"),
+            "test-key".to_string(),
+            "gpt-5".to_string(),
+            DEFAULT_REQUEST_TIMEOUT_MS,
+            0,
+            1,
+        )
+        .expect("config should be valid")
+        .with_official_fallback(true);
+        let client = LlmClient::new(config).expect("client should be created");
+        let response = client
+            .request_text(LlmTextRequest::single_turn("系统", "用户").with_responses_api())
+            .await
+            .expect("request_text should succeed");
+
+        let request_text = server_handle.join().expect("server 
thread should join");
+        let request_body = request_text
+            .split("\r\n\r\n")
+            .nth(1)
+            .expect("request body should exist");
+        let request_json: serde_json::Value =
+            serde_json::from_str(request_body).expect("request body should be json");
+
+        assert_eq!(response.content, "兼容成功");
+        assert_eq!(request_json["official_fallback"], serde_json::json!(true));
+    }
+
     #[test]
     fn sse_parser_handles_split_chunks_and_done_marker() {
         let mut parser = OpenAiCompatibleSseParser::new(LlmTextProtocol::ChatCompletions);
@@ -1711,8 +1876,9 @@ mod tests {
                 MockResponse {
                     status_line: "200 OK",
                     content_type: "application/json; charset=utf-8",
-                    body: r#"{"choices":[{"message":{"content":"too late"},"finish_reason":"stop"}]}"#
-                        .to_string(),
+                    body:
+                        r#"{"choices":[{"message":{"content":"too late"},"finish_reason":"stop"}]}"#
+                            .to_string(),
                     extra_headers: Vec::new(),
                 },
             );
@@ -1731,9 +1897,7 @@ mod tests {
         let client = LlmClient::new(config).expect("client should be created");
         let error = client
-            .request_text(
-                LlmTextRequest::single_turn("系统", "用户").with_request_timeout_ms(20),
-            )
+            .request_text(LlmTextRequest::single_turn("系统", "用户").with_request_timeout_ms(20))
             .await
             .expect_err("request override should timeout before the global timeout");
@@ -1779,6 +1943,75 @@ mod tests {
         assert_eq!(response.content, "搜索成功");
         assert_eq!(request_json["web_search_options"], serde_json::json!({}));
+        assert!(request_json.get("official_fallback").is_none());
     }
 
+    #[tokio::test]
+    async fn chat_completions_multimodal_request_sends_text_and_image_url_parts() {
+        let listener = TcpListener::bind("127.0.0.1:0").expect("listener should bind");
+        let address = listener.local_addr().expect("listener should have addr");
+        let server_handle = thread::spawn(move || {
+            let (mut stream, _) = listener.accept().expect("request should connect");
+            let request_text = read_request(&mut stream);
+            write_response(
+                &mut stream,
+                MockResponse {
+                    status_line: "200 OK",
+                    content_type: "application/json; 
charset=utf-8",
+                    body: r#"{"id":"chat_multimodal","model":"gpt-4o-mini","choices":[{"message":{"content":"{\"levelName\":\"雨夜猫街\"}"},"finish_reason":"stop"}]}"#.to_string(),
+                    extra_headers: Vec::new(),
+                },
+            );
+            request_text
+        });
+
+        let config = LlmConfig::new(
+            LlmProvider::OpenAiCompatible,
+            format!("http://{address}"),
+            "test-key".to_string(),
+            "gpt-4o-mini".to_string(),
+            DEFAULT_REQUEST_TIMEOUT_MS,
+            0,
+            1,
+        )
+        .expect("config should be valid")
+        .with_official_fallback(true);
+        let client = LlmClient::new(config).expect("client should be created");
+        let response = client
+            .request_text(LlmTextRequest::new(vec![
+                LlmMessage::system("你是拼图关卡命名编辑"),
+                LlmMessage::user_multimodal(vec![
+                    LlmMessageContentPart::InputText {
+                        text: "画面描述:一只猫在雨夜灯牌下回头。".to_string(),
+                    },
+                    LlmMessageContentPart::InputImage {
+                        image_url: "data:image/png;base64,abcd".to_string(),
+                    },
+                ]),
+            ]))
+            .await
+            .expect("request_text should succeed");
+
+        let request_text = server_handle.join().expect("server thread should join");
+        let request_line = request_text.lines().next().unwrap_or_default();
+        let request_body = request_text
+            .split("\r\n\r\n")
+            .nth(1)
+            .expect("request body should exist");
+        let request_json: serde_json::Value =
+            serde_json::from_str(request_body).expect("request body should be json");
+
+        assert!(request_line.contains("POST /chat/completions HTTP/1.1"));
+        assert_eq!(response.model, "gpt-4o-mini");
+        assert_eq!(response.content, r#"{"levelName":"雨夜猫街"}"#);
+        assert_eq!(request_json["official_fallback"], serde_json::json!(true));
+        assert_eq!(
+            request_json["messages"][1]["content"],
+            serde_json::json!([
+                { "type": "text", "text": "画面描述:一只猫在雨夜灯牌下回头。" },
+                { "type": "image_url", "image_url": { "url": "data:image/png;base64,abcd" } }
+            ])
+        );
+    }
+
     #[tokio::test]
@@ -1841,6 +2074,7 @@
         assert_eq!(
             request_json["tools"],
             serde_json::json!([{ "type": "web_search", "max_keyword": 3 }])
         );
+        assert!(request_json.get("official_fallback").is_none());
         assert_eq!(
             request_json["input"][0]["content"][0],
             serde_json::json!({ "type": "input_text", "text": "系统" })
@@ -1896,6 +2130,7 @@
 
         assert_eq!(response.model, "gpt-5");
         assert_eq!(request_json["model"], serde_json::json!("gpt-5"));
+        assert!(request_json.get("official_fallback").is_none());
         assert_eq!(
             request_json["input"][1]["content"],
             serde_json::json!([
diff --git a/server-rs/crates/platform-oss/src/lib.rs b/server-rs/crates/platform-oss/src/lib.rs
index 54c401d6..ce869402 100644
--- a/server-rs/crates/platform-oss/src/lib.rs
+++ b/server-rs/crates/platform-oss/src/lib.rs
@@ -20,12 +20,13 @@ const OSS_V4_REQUEST: &str = "aliyun_v4_request";
 const OSS_V4_SERVICE: &str = "oss";
 const OSS_UNSIGNED_PAYLOAD: &str = "UNSIGNED-PAYLOAD";
 
-pub const LEGACY_PUBLIC_PREFIXES: [&str; 8] = [
+pub const LEGACY_PUBLIC_PREFIXES: [&str; 9] = [
     "generated-character-drafts",
     "generated-characters",
     "generated-animations",
     "generated-big-fish-assets",
     "generated-square-hole-assets",
+    "generated-puzzle-assets",
     "generated-custom-world-scenes",
     "generated-custom-world-covers",
     "generated-qwen-sprites",
@@ -419,8 +420,11 @@ impl OssClient {
         let policy = serde_json::to_string(&policy_json)
             .map_err(|error| OssError::SerializePolicy(format!("序列化 policy 失败:{error}")))?;
         let encoded_policy = BASE64_STANDARD.encode(policy.as_bytes());
-        let signature =
-            sign_v4_content(&self.config.access_key_secret, &signature_scope, &encoded_policy)?;
+        let signature = sign_v4_content(
+            &self.config.access_key_secret,
+            &signature_scope,
+            &encoded_policy,
+        )?;
 
         Ok(OssPostObjectResponse {
             signature_version: "v4",
@@ -492,11 +496,8 @@ impl OssClient {
         let canonical_uri = build_v4_canonical_uri(&self.config.bucket, Some(&object_key));
         let object_url_path = format!("/{}", encode_url_path(&object_key));
         let additional_headers = "host";
-        let canonical_headers = format!(
-            "host:{}.{}\n",
-            self.config.bucket(),
-            self.config.endpoint()
-        );
+        let canonical_headers =
+            format!("host:{}.{}\n", 
self.config.bucket(), self.config.endpoint());
         let canonical_query = build_canonical_query_string(&query);
         let canonical_request = build_v4_canonical_request(
             Method::GET.as_str(),
@@ -506,10 +507,16 @@
             additional_headers,
             OSS_UNSIGNED_PAYLOAD,
         );
-        let string_to_sign =
-            build_v4_string_to_sign(query["x-oss-date"].as_str(), &signature_scope, &canonical_request);
-        let signature =
-            sign_v4_content(&self.config.access_key_secret, &signature_scope, &string_to_sign)?;
+        let string_to_sign = build_v4_string_to_sign(
+            query["x-oss-date"].as_str(),
+            &signature_scope,
+            &canonical_request,
+        );
+        let signature = sign_v4_content(
+            &self.config.access_key_secret,
+            &signature_scope,
+            &string_to_sign,
+        )?;
         query.insert("x-oss-signature".to_string(), signature);
         let signed_url = format!(
             "{}{}?{}",
@@ -1036,8 +1043,13 @@ fn signed_request_builder(
         additional_headers,
         &body_sha256,
     );
-    let string_to_sign = build_v4_string_to_sign(&signed_at_text, &signature_scope, &canonical_request);
-    let signature = sign_v4_content(config.access_key_secret(), &signature_scope, &string_to_sign)?;
+    let string_to_sign =
+        build_v4_string_to_sign(&signed_at_text, &signature_scope, &canonical_request);
+    let signature = sign_v4_content(
+        config.access_key_secret(),
+        &signature_scope,
+        &string_to_sign,
+    )?;
     let mut builder = client
         .request(method, target_url)
         .header("x-oss-content-sha256", body_sha256)
@@ -1065,33 +1077,29 @@
 }
 
 fn build_v4_signature_scope(endpoint: &str, signed_at: OffsetDateTime) -> Result<String, OssError> {
-    let date = signed_at
-        .date()
-        .to_string()
-        .replace('-', "");
+    let date = format_v4_signature_scope_date(signed_at);
     let region = extract_oss_region(endpoint)?;
     Ok(format!("{date}/{region}/{OSS_V4_SERVICE}/{OSS_V4_REQUEST}"))
 }
 
 fn build_v4_signature_date(signed_at: OffsetDateTime) -> Result<String, OssError> {
-    let date = signed_at
-        .date()
-        .to_string()
-        .replace('-', "");
-    let time = signed_at
-        .time()
-        .to_string()
-        .split('.')
-        
.next()
-        .unwrap_or("00:00:00")
-        .replace(':', "");
+    Ok(format!(
+        "{}T{:02}{:02}{:02}Z",
+        format_v4_signature_scope_date(signed_at),
+        signed_at.hour(),
+        signed_at.minute(),
+        signed_at.second()
+    ))
+}
 
-    if time.len() != 6 {
-        return Err(OssError::Sign("OSS V4 签名时间格式化失败".to_string()));
-    }
-
-    Ok(format!("{date}T{time}Z"))
+fn format_v4_signature_scope_date(signed_at: OffsetDateTime) -> String {
+    format!(
+        "{:04}{:02}{:02}",
+        signed_at.year(),
+        signed_at.month() as u8,
+        signed_at.day()
+    )
 }
 
 fn build_v4_canonical_uri(bucket: &str, object_key: Option<&str>) -> String {
@@ -1116,9 +1124,7 @@
         .map(str::to_string)
         .filter(|region| !region.is_empty())
         .ok_or_else(|| {
-            OssError::InvalidConfig(format!(
-                "OSS endpoint 无法解析 region,当前值:{endpoint}"
-            ))
+            OssError::InvalidConfig(format!("OSS endpoint 无法解析 region,当前值:{endpoint}"))
         })
 }
 
@@ -1131,7 +1137,10 @@ fn sign_v4_content(
     Ok(hex_sha256_hmac(&signing_key, content.as_bytes()))
 }
 
-fn build_v4_signing_key(access_key_secret: &str, signature_scope: &str) -> Result<Vec<u8>, OssError> {
+fn build_v4_signing_key(
+    access_key_secret: &str,
+    signature_scope: &str,
+) -> Result<Vec<u8>, OssError> {
     let mut parts = signature_scope.split('/');
     let date = parts
         .next()
@@ -1160,8 +1169,7 @@ fn hmac_sha256_raw(key: &[u8], content: &str) -> Result<Vec<u8>, OssError> {
 }
 
 fn hex_sha256_hmac(key: &[u8], content: &[u8]) -> String {
-    let mut signer = HmacSha256::new_from_slice(key)
-        .expect("HMAC-SHA256 accepts keys of any size");
+    let mut signer = HmacSha256::new_from_slice(key).expect("HMAC-SHA256 accepts keys of any size");
     signer.update(content);
     hex_lower(&signer.finalize().into_bytes())
 }
@@ -1213,7 +1221,13 @@ fn build_v4_canonical_headers(headers: &BTreeMap<String, String>) -> String {
 fn build_canonical_query_string(params: &BTreeMap<String, String>) -> String {
     params
         .iter()
-        .map(|(key, value)| format!("{}={}", encode_url_query_value(key), encode_url_query_value(value)))
+        .map(|(key, value)| {
+            format!(
+                "{}={}",
+                encode_url_query_value(key),
+                encode_url_query_value(value)
+            )
+        })
         .collect::<Vec<_>>()
         .join("&")
 }
 
@@ -1286,9 +1300,30 @@ mod tests {
             LegacyAssetPrefix::parse("/generated-characters/*"),
             Some(LegacyAssetPrefix::Characters)
         );
+        assert_eq!(
+            LegacyAssetPrefix::parse("/generated-puzzle-assets/*"),
+            Some(LegacyAssetPrefix::PuzzleAssets)
+        );
+        assert!(LEGACY_PUBLIC_PREFIXES.contains(&"generated-puzzle-assets"));
         assert_eq!(LegacyAssetPrefix::parse("unknown"), None);
     }
 
+    #[test]
+    fn build_v4_signature_date_zero_pads_single_digit_time_parts() {
+        let signed_at =
+            OffsetDateTime::from_unix_timestamp(1_771_477_389).expect("timestamp should be valid");
+
+        assert_eq!(
+            build_v4_signature_date(signed_at).expect("date should format"),
+            "20260219T050309Z"
+        );
+        assert_eq!(
+            build_v4_signature_scope("oss-cn-shanghai.aliyuncs.com", signed_at)
+                .expect("scope should format"),
+            "20260219/cn-shanghai/oss/aliyun_v4_request"
+        );
+    }
+
     #[test]
     fn sign_post_object_returns_bucket_and_object_key_for_private_storage_truth() {
         let client = build_client();
@@ -1327,18 +1362,19 @@ mod tests {
             response.form_fields.signature_version,
             OSS_V4_ALGORITHM.to_string()
         );
-        assert!(response
-            .form_fields
-            .credential
-            .starts_with("test-access-key-id/"));
-        assert!(response
-            .form_fields
-            .credential
-            .ends_with("/cn-shanghai/oss/aliyun_v4_request"));
-        assert_eq!(
-            response.form_fields.date.len(),
-            "20260507T120000Z".len()
+        assert!(
+            response
+                .form_fields
+                .credential
+                .starts_with("test-access-key-id/")
         );
+        assert!(
+            response
+                .form_fields
+                .credential
+                .ends_with("/cn-shanghai/oss/aliyun_v4_request")
+        );
+        assert_eq!(response.form_fields.date.len(), "20260507T120000Z".len());
         assert_eq!(
             response.form_fields.metadata.get("x-oss-meta-asset-kind"),
             Some(&"character-visual".to_string())
@@ -1441,9 +1477,11 @@ mod tests {
                 .signed_url
                 .contains("x-oss-signature-version=OSS4-HMAC-SHA256")
         );
-        assert!(response
-            .signed_url
-            
.contains("x-oss-credential=test-access-key-id%2F"));
+        assert!(
+            response
+                .signed_url
+                .contains("x-oss-credential=test-access-key-id%2F")
+        );
         assert!(response.signed_url.contains("&x-oss-expires=300"));
         assert!(response.signed_url.contains("&x-oss-signature="));
     }
 }
diff --git a/server-rs/crates/platform-speech/Cargo.toml b/server-rs/crates/platform-speech/Cargo.toml
new file mode 100644
index 00000000..5e7ebe15
--- /dev/null
+++ b/server-rs/crates/platform-speech/Cargo.toml
@@ -0,0 +1,20 @@
+[package]
+name = "platform-speech"
+edition.workspace = true
+version.workspace = true
+license.workspace = true
+
+[dependencies]
+base64 = { workspace = true }
+bytes = { workspace = true }
+flate2 = { workspace = true }
+futures-util = { workspace = true }
+reqwest = { workspace = true, features = ["json", "rustls-tls", "stream"] }
+serde = { workspace = true }
+serde_json = { workspace = true }
+tokio = { workspace = true, features = ["net", "time"] }
+tokio-tungstenite = { workspace = true, features = ["rustls-tls-webpki-roots"] }
+uuid = { workspace = true, features = ["v4"] }
+
+[dev-dependencies]
+tokio = { workspace = true, features = ["macros", "rt"] }
diff --git a/server-rs/crates/platform-speech/src/lib.rs b/server-rs/crates/platform-speech/src/lib.rs
new file mode 100644
index 00000000..2904554e
--- /dev/null
+++ b/server-rs/crates/platform-speech/src/lib.rs
@@ -0,0 +1,1203 @@
+use std::{
+    collections::BTreeMap,
+    error::Error,
+    fmt,
+    io::{Read, Write},
+    time::Duration,
+};
+
+use flate2::{Compression, read::GzDecoder, write::GzEncoder};
+use futures_util::{SinkExt, StreamExt};
+use reqwest::header::{HeaderMap, HeaderValue};
+use serde::{Deserialize, Serialize};
+use serde_json::{Value, json};
+use tokio_tungstenite::{
+    MaybeTlsStream, WebSocketStream, connect_async,
+    tungstenite::{Message, client::IntoClientRequest, http::Uri},
+};
+use uuid::Uuid;
+
+pub const DEFAULT_ASR_WS_URL: &str = "wss://openspeech.bytedance.com/api/v3/sauc/bigmodel_async";
+pub const DEFAULT_TTS_BIDIRECTION_WS_URL: &str =
+    "wss://openspeech.bytedance.com/api/v3/tts/bidirection";
+pub const DEFAULT_TTS_SSE_URL: &str =
+    "https://openspeech.bytedance.com/api/v3/tts/unidirectional/sse";
+pub const DEFAULT_ASR_RESOURCE_ID: &str = "volc.seedasr.sauc.concurrent";
+pub const DEFAULT_TTS_RESOURCE_ID: &str = "seed-tts-2.0";
+pub const DEFAULT_REQUEST_TIMEOUT_MS: u64 = 180_000;
+
+const PROTOCOL_VERSION: u8 = 0b0001;
+const HEADER_SIZE_FOUR_BYTES: u8 = 0b0001;
+const SERIALIZATION_NONE: u8 = 0b0000;
+const SERIALIZATION_JSON: u8 = 0b0001;
+const COMPRESSION_NONE: u8 = 0b0000;
+const COMPRESSION_GZIP: u8 = 0b0001;
+const MESSAGE_FULL_CLIENT_REQUEST: u8 = 0b0001;
+const MESSAGE_AUDIO_ONLY_REQUEST: u8 = 0b0010;
+pub const MESSAGE_FULL_SERVER_RESPONSE: u8 = 0b1001;
+const MESSAGE_AUDIO_ONLY_RESPONSE: u8 = 0b1011;
+const MESSAGE_ERROR: u8 = 0b1111;
+const FLAG_NONE: u8 = 0b0000;
+const FLAG_WITH_SEQUENCE: u8 = 0b0001;
+const FLAG_LAST_PACKET: u8 = 0b0010;
+const FLAG_WITH_NEGATIVE_SEQUENCE: u8 = 0b0011;
+const FLAG_WITH_EVENT: u8 = 0b0100;
+
+const EVENT_START_CONNECTION: i32 = 1;
+const EVENT_FINISH_CONNECTION: i32 = 2;
+const EVENT_CONNECTION_STARTED: i32 = 50;
+const EVENT_CONNECTION_FAILED: i32 = 51;
+const EVENT_CONNECTION_FINISHED: i32 = 52;
+const EVENT_START_SESSION: i32 = 100;
+const EVENT_CANCEL_SESSION: i32 = 101;
+const EVENT_FINISH_SESSION: i32 = 102;
+const EVENT_SESSION_STARTED: i32 = 150;
+const EVENT_SESSION_CANCELED: i32 = 151;
+const EVENT_SESSION_FINISHED: i32 = 152;
+const EVENT_SESSION_FAILED: i32 = 153;
+const EVENT_TASK_REQUEST: i32 = 200;
+const EVENT_TTS_SENTENCE_END: i32 = 351;
+const EVENT_TTS_RESPONSE: i32 = 352;
+const EVENT_TTS_SUBTITLE: i32 = 353;
+
+pub type SpeechWsStream = WebSocketStream<MaybeTlsStream<tokio::net::TcpStream>>;
+
+#[derive(Clone, Debug, PartialEq, Eq)]
+pub struct VolcengineSpeechConfig {
+    pub api_key: Option<String>,
+    pub app_id: Option<String>,
+    pub access_key: Option<String>,
+    pub asr_resource_id: String,
+    pub tts_resource_id: String,
+    pub 
asr_ws_url: String,
+    pub tts_bidirection_ws_url: String,
+    pub tts_sse_url: String,
+    pub request_timeout_ms: u64,
+}
+
+#[derive(Clone, Debug, PartialEq, Eq, Serialize)]
+#[serde(rename_all = "camelCase")]
+pub struct PublicSpeechConfig {
+    pub asr_resource_id: String,
+    pub tts_resource_id: String,
+    pub asr_audio: AsrAudioConfig,
+    pub tts_audio: TtsAudioParams,
+    pub endpoints: PublicSpeechEndpoints,
+}
+
+#[derive(Clone, Debug, PartialEq, Eq, Serialize)]
+#[serde(rename_all = "camelCase")]
+pub struct PublicSpeechEndpoints {
+    pub asr_stream: &'static str,
+    pub tts_bidirection: &'static str,
+    pub tts_sse: &'static str,
+}
+
+#[derive(Clone, Debug, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "camelCase")]
+pub struct AsrAudioConfig {
+    pub format: String,
+    pub codec: String,
+    pub rate: u32,
+    pub bits: u8,
+    pub channel: u8,
+}
+
+#[derive(Clone, Debug, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "camelCase")]
+pub struct TtsAudioParams {
+    pub format: String,
+    pub sample_rate: u32,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    pub bit_rate: Option<u32>,
+}
+
+#[derive(Clone, Debug, PartialEq, Eq)]
+pub struct VolcengineSpeechClient {
+    config: VolcengineSpeechConfig,
+}
+
+#[derive(Debug)]
+pub enum SpeechError {
+    InvalidConfig(String),
+    InvalidHeader(String),
+    InvalidFrame(String),
+    Serialize(String),
+    Io(String),
+    Upstream(String),
+}
+
+#[derive(Clone, Copy, Debug, PartialEq, Eq)]
+pub enum VolcengineSpeechAuthMode {
+    ApiKey,
+    LegacyApp,
+}
+
+#[derive(Clone, Copy, Debug, PartialEq, Eq)]
+pub enum AsrFrameKind {
+    FullClientRequest,
+    Audio,
+    LastAudio,
+}
+
+#[derive(Clone, Copy, Debug, PartialEq, Eq)]
+pub struct SpeechFrameHeader {
+    pub version: u8,
+    pub header_size_words: u8,
+    pub message_type: u8,
+    pub flags: u8,
+    pub serialization: u8,
+    pub compression: u8,
+}
+
+#[derive(Clone, Debug, PartialEq)]
+pub struct ParsedAsrResponse {
+    pub header: SpeechFrameHeader,
+    pub sequence: Option<i32>,
+    pub 
event: Option<i32>,
+    pub payload: Value,
+    pub error_code: Option<u32>,
+}
+
+#[derive(Clone, Debug, PartialEq)]
+pub struct ParsedTtsResponse {
+    pub header: SpeechFrameHeader,
+    pub event: Option<TtsEvent>,
+    pub session_id: Option<String>,
+    pub connection_id: Option<String>,
+    pub payload: Option<Value>,
+    pub audio: Option<Vec<u8>>,
+    pub error_code: Option<u32>,
+}
+
+#[derive(Clone, Copy, Debug, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "snake_case")]
+pub enum TtsEvent {
+    StartConnection,
+    FinishConnection,
+    ConnectionStarted,
+    ConnectionFailed,
+    ConnectionFinished,
+    StartSession,
+    CancelSession,
+    FinishSession,
+    SessionStarted,
+    SessionCanceled,
+    SessionFinished,
+    SessionFailed,
+    TaskRequest,
+    TtsSentenceEnd,
+    TtsResponse,
+    TtsSubtitle,
+    Unknown(i32),
+}
+
+#[derive(Clone, Debug, PartialEq, Eq, Deserialize)]
+#[serde(rename_all = "snake_case", tag = "type")]
+pub enum TtsBidirectionClientEvent {
+    StartConnection {
+        #[serde(default)]
+        payload: Option<Value>,
+    },
+    FinishConnection {
+        #[serde(default)]
+        payload: Option<Value>,
+    },
+    StartSession {
+        #[serde(rename = "sessionId")]
+        session_id: Option<String>,
+        #[serde(default)]
+        payload: Value,
+    },
+    FinishSession {
+        #[serde(rename = "sessionId")]
+        session_id: String,
+        #[serde(default)]
+        payload: Option<Value>,
+    },
+    CancelSession {
+        #[serde(rename = "sessionId")]
+        session_id: String,
+        #[serde(default)]
+        payload: Option<Value>,
+    },
+    TaskRequest {
+        #[serde(rename = "sessionId")]
+        session_id: String,
+        #[serde(default)]
+        payload: Value,
+    },
+}
+
+#[derive(Clone, Debug, PartialEq, Eq, Deserialize)]
+#[serde(rename_all = "camelCase")]
+pub struct TtsSseRequest {
+    pub text: String,
+    pub speaker: String,
+    #[serde(default)]
+    pub model: Option<String>,
+    #[serde(default)]
+    pub audio_params: Option<TtsAudioParams>,
+    #[serde(default)]
+    pub additions: Option<Value>,
+    #[serde(default)]
+    pub ssml: Option<String>,
+}
+
+#[derive(Clone, Debug, PartialEq, Eq)]
+pub struct TtsSseUpstreamRequest {
+    pub url: String,
+    pub headers: HeaderMap,
+    pub body: Value,
+    pub timeout: 
Duration,
+}
+
+#[derive(Clone, Copy, Debug, PartialEq, Eq)]
+enum SpeechCompression {
+    None,
+    Gzip,
+}
+
+impl Default for AsrAudioConfig {
+    fn default() -> Self {
+        Self {
+            format: "pcm".to_string(),
+            codec: "raw".to_string(),
+            rate: 16_000,
+            bits: 16,
+            channel: 1,
+        }
+    }
+}
+
+impl Default for TtsAudioParams {
+    fn default() -> Self {
+        Self {
+            format: "mp3".to_string(),
+            sample_rate: 24_000,
+            bit_rate: None,
+        }
+    }
+}
+
+impl VolcengineSpeechConfig {
+    pub fn new(
+        api_key: Option<String>,
+        app_id: Option<String>,
+        access_key: Option<String>,
+        asr_resource_id: String,
+        tts_resource_id: String,
+        asr_ws_url: String,
+        tts_bidirection_ws_url: String,
+        tts_sse_url: String,
+        request_timeout_ms: u64,
+    ) -> Result<Self, SpeechError> {
+        let config = Self {
+            api_key: normalize_optional_secret(api_key),
+            app_id: normalize_optional_secret(app_id),
+            access_key: normalize_optional_secret(access_key),
+            asr_resource_id: default_if_empty(asr_resource_id, DEFAULT_ASR_RESOURCE_ID),
+            tts_resource_id: default_if_empty(tts_resource_id, DEFAULT_TTS_RESOURCE_ID),
+            asr_ws_url: default_if_empty(asr_ws_url, DEFAULT_ASR_WS_URL),
+            tts_bidirection_ws_url: default_if_empty(
+                tts_bidirection_ws_url,
+                DEFAULT_TTS_BIDIRECTION_WS_URL,
+            ),
+            tts_sse_url: default_if_empty(tts_sse_url, DEFAULT_TTS_SSE_URL),
+            request_timeout_ms: request_timeout_ms.max(1),
+        };
+        config.auth_mode()?;
+        Ok(config)
+    }
+
+    pub fn auth_mode(&self) -> Result<VolcengineSpeechAuthMode, SpeechError> {
+        if self.api_key.as_ref().is_some_and(|value| !value.is_empty()) {
+            return Ok(VolcengineSpeechAuthMode::ApiKey);
+        }
+
+        if self.app_id.as_ref().is_some_and(|value| !value.is_empty())
+            && self
+                .access_key
+                .as_ref()
+                .is_some_and(|value| !value.is_empty())
+        {
+            return Ok(VolcengineSpeechAuthMode::LegacyApp);
+        }
+
+        Err(SpeechError::InvalidConfig(
+            "火山语音密钥未配置:需要 VOLCENGINE_SPEECH_API_KEY,或旧版 VOLCENGINE_SPEECH_APP_ID + VOLCENGINE_SPEECH_ACCESS_KEY"
+                .to_string(),
+        ))
+    }
+
+    pub fn public_config(&self) -> PublicSpeechConfig {
+        PublicSpeechConfig {
+            
asr_resource_id: self.asr_resource_id.clone(),
+            tts_resource_id: self.tts_resource_id.clone(),
+            asr_audio: AsrAudioConfig::default(),
+            tts_audio: TtsAudioParams::default(),
+            endpoints: PublicSpeechEndpoints {
+                asr_stream: "/api/speech/volcengine/asr/stream",
+                tts_bidirection: "/api/speech/volcengine/tts/bidirection",
+                tts_sse: "/api/speech/volcengine/tts/sse",
+            },
+        }
+    }
+
+    fn auth_headers(&self, resource_id: &str) -> Result<HeaderMap, SpeechError> {
+        let mut headers = HeaderMap::new();
+        match self.auth_mode()? {
+            VolcengineSpeechAuthMode::ApiKey => {
+                headers.insert(
+                    "X-Api-Key",
+                    header_value(self.api_key.as_deref().unwrap_or(""))?,
+                );
+            }
+            VolcengineSpeechAuthMode::LegacyApp => {
+                headers.insert(
+                    "X-Api-App-Id",
+                    header_value(self.app_id.as_deref().unwrap_or(""))?,
+                );
+                headers.insert(
+                    "X-Api-Access-Key",
+                    header_value(self.access_key.as_deref().unwrap_or(""))?,
+                );
+            }
+        }
+        headers.insert("X-Api-Resource-Id", header_value(resource_id)?);
+        Ok(headers)
+    }
+}
+
+impl VolcengineSpeechClient {
+    pub fn new(config: VolcengineSpeechConfig) -> Self {
+        Self { config }
+    }
+
+    pub fn config(&self) -> &VolcengineSpeechConfig {
+        &self.config
+    }
+
+    pub async fn connect_asr(
+        &self,
+    ) -> Result<(SpeechWsStream, BTreeMap<String, String>), SpeechError> {
+        let request_id = Uuid::new_v4().to_string();
+        let mut headers = self.config.auth_headers(&self.config.asr_resource_id)?;
+        headers.insert("X-Api-Request-Id", header_value(&request_id)?);
+        headers.insert("X-Api-Sequence", HeaderValue::from_static("-1"));
+        self.connect_ws(&self.config.asr_ws_url, headers).await
+    }
+
+    pub async fn connect_tts_bidirection(
+        &self,
+    ) -> Result<(SpeechWsStream, BTreeMap<String, String>), SpeechError> {
+        let connect_id = Uuid::new_v4().to_string();
+        let mut headers = self.config.auth_headers(&self.config.tts_resource_id)?;
+        headers.insert("X-Api-Connect-Id", header_value(&connect_id)?);
+        self.connect_ws(&self.config.tts_bidirection_ws_url, headers)
+            .await
+    }
+
+    pub fn 
build_tts_sse_upstream_request(
+        &self,
+        request: TtsSseRequest,
+        user_id: &str,
+    ) -> Result<TtsSseUpstreamRequest, SpeechError> {
+        let mut headers = self.config.auth_headers(&self.config.tts_resource_id)?;
+        headers.insert(
+            "X-Api-Request-Id",
+            header_value(&Uuid::new_v4().to_string())?,
+        );
+        headers.insert(
+            reqwest::header::ACCEPT,
+            HeaderValue::from_static("text/event-stream"),
+        );
+        headers.insert(
+            reqwest::header::CONTENT_TYPE,
+            HeaderValue::from_static("application/json"),
+        );
+        let body = build_tts_sse_body(request, user_id)?;
+        Ok(TtsSseUpstreamRequest {
+            url: self.config.tts_sse_url.clone(),
+            headers,
+            body,
+            timeout: Duration::from_millis(self.config.request_timeout_ms),
+        })
+    }
+
+    async fn connect_ws(
+        &self,
+        url: &str,
+        headers: HeaderMap,
+    ) -> Result<(SpeechWsStream, BTreeMap<String, String>), SpeechError> {
+        let uri: Uri = url.parse().map_err(|error| {
+            SpeechError::InvalidConfig(format!("火山语音 WebSocket URL 非法:{error}"))
+        })?;
+        let mut request = uri.into_client_request().map_err(|error| {
+            SpeechError::InvalidConfig(format!("构造火山语音 WebSocket 请求失败:{error}"))
+        })?;
+        for (name, value) in headers {
+            if let Some(name) = name {
+                request.headers_mut().insert(name, value);
+            }
+        }
+
+        let (stream, response) = connect_async(request).await.map_err(|error| {
+            SpeechError::Upstream(format!("连接火山语音 WebSocket 失败:{error}"))
+        })?;
+        let response_headers = response
+            .headers()
+            .iter()
+            .filter_map(|(name, value)| {
+                value
+                    .to_str()
+                    .ok()
+                    .map(|value| (name.as_str().to_string(), value.to_string()))
+            })
+            .collect();
+        Ok((stream, response_headers))
+    }
+}
+
+pub fn default_asr_request_payload(user_id: &str, override_payload: Option<Value>) -> Value {
+    let mut payload = json!({
+        "user": {
+            "uid": user_id,
+        },
+        "audio": {
+            "format": AsrAudioConfig::default().format,
+            "codec": AsrAudioConfig::default().codec,
+            "rate": AsrAudioConfig::default().rate,
+            "bits": AsrAudioConfig::default().bits,
+            "channel": AsrAudioConfig::default().channel,
+        },
+        "request": {
+            
"model_name": "bigmodel",
+            "enable_itn": true,
+            "enable_punc": true,
+            "show_utterances": true,
+            "result_type": "full",
+        }
+    });
+    if let Some(override_payload) = override_payload {
+        merge_json_object(&mut payload, override_payload);
+    }
+    payload
+}
+
+pub fn build_asr_frame(kind: AsrFrameKind, payload: &[u8]) -> Result<Vec<u8>, SpeechError> {
+    match kind {
+        AsrFrameKind::FullClientRequest => build_sized_frame(
+            MESSAGE_FULL_CLIENT_REQUEST,
+            FLAG_NONE,
+            SERIALIZATION_JSON,
+            COMPRESSION_GZIP,
+            &gzip_bytes(payload)?,
+        ),
+        AsrFrameKind::Audio => build_sized_frame(
+            MESSAGE_AUDIO_ONLY_REQUEST,
+            FLAG_NONE,
+            SERIALIZATION_NONE,
+            COMPRESSION_GZIP,
+            &gzip_bytes(payload)?,
+        ),
+        AsrFrameKind::LastAudio => build_sized_frame(
+            MESSAGE_AUDIO_ONLY_REQUEST,
+            FLAG_LAST_PACKET,
+            SERIALIZATION_NONE,
+            COMPRESSION_GZIP,
+            &gzip_bytes(payload)?,
+        ),
+    }
+}
+
+pub fn build_asr_full_client_request(payload: &Value) -> Result<Vec<u8>, SpeechError> {
+    let bytes = serde_json::to_vec(payload)
+        .map_err(|error| SpeechError::Serialize(format!("序列化 ASR 请求失败:{error}")))?;
+    build_asr_frame(AsrFrameKind::FullClientRequest, &bytes)
+}
+
+pub fn parse_asr_response_frame(bytes: &[u8]) -> Result<ParsedAsrResponse, SpeechError> {
+    let header = parse_header(bytes)?;
+    let mut offset = usize::from(header.header_size_words) * 4;
+    let sequence = if matches!(
+        header.flags,
+        FLAG_WITH_SEQUENCE | FLAG_WITH_NEGATIVE_SEQUENCE
+    ) && bytes.len() >= offset + 4
+    {
+        let sequence = read_i32(bytes, &mut offset)?;
+        Some(sequence)
+    } else {
+        None
+    };
+
+    if header.message_type == MESSAGE_ERROR {
+        let error_code = read_u32(bytes, &mut offset).ok();
+        let payload = read_payload_value(bytes, &mut offset, SpeechCompression::None)
+            .unwrap_or_else(|_| json!({ "message": decode_lossy(&bytes[offset..]) }));
+        return Ok(ParsedAsrResponse {
+            header,
+            sequence,
+            event: None,
+            payload,
+            error_code,
+        });
+    }
+
+    let payload = read_payload_value(bytes, &mut offset, header.compression())?;
+    Ok(ParsedAsrResponse {
+        header,
+        
sequence, + event: None, + payload, + error_code: None, + }) +} + +pub fn build_tts_bidirection_frame( + event: TtsEvent, + session_id: Option<&str>, + payload: Option<&Value>, +) -> Result, SpeechError> { + let payload_bytes = serde_json::to_vec(payload.unwrap_or(&json!({}))) + .map_err(|error| SpeechError::Serialize(format!("序列化 TTS 事件失败:{error}")))?; + let event_number = event.to_i32(); + let message_type = match event { + TtsEvent::TaskRequest => MESSAGE_FULL_CLIENT_REQUEST, + _ => MESSAGE_FULL_CLIENT_REQUEST, + }; + build_tts_event_frame( + message_type, + SERIALIZATION_JSON, + COMPRESSION_NONE, + event_number, + session_id, + &payload_bytes, + ) +} + +pub fn build_tts_bidirection_frame_from_client_event( + event: TtsBidirectionClientEvent, +) -> Result, SpeechError> { + match event { + TtsBidirectionClientEvent::StartConnection { payload } => { + build_tts_bidirection_frame(TtsEvent::StartConnection, None, payload.as_ref()) + } + TtsBidirectionClientEvent::FinishConnection { payload } => { + build_tts_bidirection_frame(TtsEvent::FinishConnection, None, payload.as_ref()) + } + TtsBidirectionClientEvent::StartSession { + session_id, + payload, + } => { + let session_id = session_id.unwrap_or_else(|| Uuid::new_v4().to_string()); + build_tts_bidirection_frame(TtsEvent::StartSession, Some(&session_id), Some(&payload)) + } + TtsBidirectionClientEvent::FinishSession { + session_id, + payload, + } => build_tts_bidirection_frame( + TtsEvent::FinishSession, + Some(&session_id), + payload.as_ref(), + ), + TtsBidirectionClientEvent::CancelSession { + session_id, + payload, + } => build_tts_bidirection_frame( + TtsEvent::CancelSession, + Some(&session_id), + payload.as_ref(), + ), + TtsBidirectionClientEvent::TaskRequest { + session_id, + payload, + } => build_tts_bidirection_frame(TtsEvent::TaskRequest, Some(&session_id), Some(&payload)), + } +} + +pub fn parse_tts_response_frame(bytes: &[u8]) -> Result { + let header = parse_header(bytes)?; + let mut offset = 
usize::from(header.header_size_words) * 4;
+    if header.message_type == MESSAGE_ERROR {
+        let error_code = read_u32(bytes, &mut offset).ok();
+        let payload = read_payload_value(bytes, &mut offset, header.compression())
+            .unwrap_or_else(|_| json!({ "message": decode_lossy(&bytes[offset..]) }));
+        return Ok(ParsedTtsResponse {
+            header,
+            event: None,
+            session_id: None,
+            connection_id: None,
+            payload: Some(payload),
+            audio: None,
+            error_code,
+        });
+    }
+
+    let event = if header.flags == FLAG_WITH_EVENT {
+        Some(TtsEvent::from_i32(read_i32(bytes, &mut offset)?))
+    } else {
+        None
+    };
+    let session_or_connection_id = read_optional_length_prefixed_string(bytes, &mut offset)?;
+    let payload_bytes = read_payload_bytes(bytes, &mut offset)?;
+    let payload_bytes = match header.compression() {
+        SpeechCompression::None => payload_bytes,
+        SpeechCompression::Gzip => ungzip_bytes(&payload_bytes)?,
+    };
+    let is_audio = header.message_type == MESSAGE_AUDIO_ONLY_RESPONSE
+        || event == Some(TtsEvent::TtsResponse) && header.serialization == SERIALIZATION_NONE;
+    let payload = if is_audio {
+        None
+    } else if payload_bytes.is_empty() {
+        Some(json!({}))
+    } else {
+        Some(serde_json::from_slice(&payload_bytes).map_err(|error| {
+            SpeechError::InvalidFrame(format!("解析 TTS JSON 响应失败:{error}"))
+        })?)
+    };
+    let audio = if is_audio { Some(payload_bytes) } else { None };
+    let (connection_id, session_id) = match event {
+        Some(TtsEvent::ConnectionStarted)
+        | Some(TtsEvent::ConnectionFailed)
+        | Some(TtsEvent::ConnectionFinished) => (session_or_connection_id, None),
+        _ => (None, session_or_connection_id),
+    };
+
+    Ok(ParsedTtsResponse {
+        header,
+        event,
+        session_id,
+        connection_id,
+        payload,
+        audio,
+        error_code: None,
+    })
+}
+
+pub fn tts_response_to_client_value(response: &ParsedTtsResponse) -> Value {
+    json!({
+        "event": response.event.map(|event| event.name()),
+        "eventCode": response.event.map(|event| event.to_i32()),
+        "sessionId": response.session_id,
+        "connectionId": response.connection_id,
+        "payload": response.payload,
+        "audioBytes": response.audio.as_ref().map(Vec::len),
+        "errorCode": response.error_code,
+    })
+}
+
+pub fn build_tts_sse_body(request: TtsSseRequest, user_id: &str) -> Result<Value, SpeechError> {
+    let text = request.text.trim();
+    let speaker = request.speaker.trim();
+    if text.is_empty() && request.ssml.as_deref().unwrap_or("").trim().is_empty() {
+        return Err(SpeechError::InvalidConfig("TTS 文本不能为空".to_string()));
+    }
+    if speaker.is_empty() {
+        return Err(SpeechError::InvalidConfig(
+            "TTS speaker 不能为空".to_string(),
+        ));
+    }
+
+    let mut req_params = json!({
+        "text": text,
+        "speaker": speaker,
+        "audio_params": {
+            "format": request.audio_params.clone().unwrap_or_default().format,
+            "sample_rate": request.audio_params.clone().unwrap_or_default().sample_rate,
+        },
+    });
+    if let Some(bit_rate) = request
+        .audio_params
+        .as_ref()
+        .and_then(|params| params.bit_rate)
+    {
+        req_params["audio_params"]["bit_rate"] = json!(bit_rate);
+    }
+    if let Some(model) = normalize_optional_secret(request.model) {
+        req_params["model"] = json!(model);
+    }
+    if let Some(ssml) = normalize_optional_secret(request.ssml) {
+        req_params["ssml"] = json!(ssml);
+    }
+    if let Some(additions) = request.additions {
+        req_params["additions"] = additions;
+    }
+
+    Ok(json!({
+        "user": {
+            "uid": user_id,
+        },
+        "req_params": req_params,
+    }))
+}
+
+pub async fn send_binary(ws: &mut SpeechWsStream, bytes: Vec<u8>) -> Result<(), SpeechError> {
+    ws.send(Message::Binary(bytes.into()))
+        .await
+        .map_err(|error| SpeechError::Upstream(format!("发送火山语音 WebSocket 帧失败:{error}")))
+}
+
+pub async fn recv_binary(ws: &mut SpeechWsStream) -> Result<Option<Vec<u8>>, SpeechError> {
+    while let Some(message) = ws.next().await {
+        match message {
+            Ok(Message::Binary(bytes)) => return Ok(Some(bytes.to_vec())),
+            Ok(Message::Text(text)) => return Ok(Some(text.as_bytes().to_vec())),
+            Ok(Message::Close(_)) => return Ok(None),
+            Ok(Message::Ping(_)) | Ok(Message::Pong(_)) => {}
+            Ok(Message::Frame(_)) => {}
+            Err(error) => {
+                return Err(SpeechError::Upstream(format!(
+                    "读取火山语音 WebSocket 帧失败:{error}"
+                )));
+            }
+        }
+    }
+    Ok(None)
+}
+
+impl SpeechFrameHeader {
+    fn compression(self) -> SpeechCompression {
+        match self.compression {
+            COMPRESSION_GZIP => SpeechCompression::Gzip,
+            _ => SpeechCompression::None,
+        }
+    }
+}
+
+impl TtsEvent {
+    pub fn from_i32(value: i32) -> Self {
+        match value {
+            EVENT_START_CONNECTION => Self::StartConnection,
+            EVENT_FINISH_CONNECTION => Self::FinishConnection,
+            EVENT_CONNECTION_STARTED => Self::ConnectionStarted,
+            EVENT_CONNECTION_FAILED => Self::ConnectionFailed,
+            EVENT_CONNECTION_FINISHED => Self::ConnectionFinished,
+            EVENT_START_SESSION => Self::StartSession,
+            EVENT_CANCEL_SESSION => Self::CancelSession,
+            EVENT_FINISH_SESSION => Self::FinishSession,
+            EVENT_SESSION_STARTED => Self::SessionStarted,
+            EVENT_SESSION_CANCELED => Self::SessionCanceled,
+            EVENT_SESSION_FINISHED => Self::SessionFinished,
+            EVENT_SESSION_FAILED => Self::SessionFailed,
+            EVENT_TASK_REQUEST => Self::TaskRequest,
+            EVENT_TTS_SENTENCE_END => Self::TtsSentenceEnd,
+            EVENT_TTS_RESPONSE => Self::TtsResponse,
+            EVENT_TTS_SUBTITLE => Self::TtsSubtitle,
+            other => Self::Unknown(other),
+        }
+    }
+
+    pub fn to_i32(self) -> i32 {
+        match self {
+            Self::StartConnection => EVENT_START_CONNECTION,
+            Self::FinishConnection => EVENT_FINISH_CONNECTION,
+            Self::ConnectionStarted => EVENT_CONNECTION_STARTED,
+            Self::ConnectionFailed => EVENT_CONNECTION_FAILED,
+            Self::ConnectionFinished => EVENT_CONNECTION_FINISHED,
+            Self::StartSession => EVENT_START_SESSION,
+            Self::CancelSession => EVENT_CANCEL_SESSION,
+            Self::FinishSession => EVENT_FINISH_SESSION,
+            Self::SessionStarted => EVENT_SESSION_STARTED,
+            Self::SessionCanceled => EVENT_SESSION_CANCELED,
+            Self::SessionFinished => EVENT_SESSION_FINISHED,
+            Self::SessionFailed => EVENT_SESSION_FAILED,
+            Self::TaskRequest => EVENT_TASK_REQUEST,
+            Self::TtsSentenceEnd => EVENT_TTS_SENTENCE_END,
+            Self::TtsResponse => EVENT_TTS_RESPONSE,
+            Self::TtsSubtitle => EVENT_TTS_SUBTITLE,
+            Self::Unknown(value) => value,
+        }
+    }
+
+    pub fn name(self) -> &'static str {
+        match self {
+            Self::StartConnection => "start_connection",
+            Self::FinishConnection => "finish_connection",
+            Self::ConnectionStarted => "connection_started",
+            Self::ConnectionFailed => "connection_failed",
+            Self::ConnectionFinished => "connection_finished",
+            Self::StartSession => "start_session",
+            Self::CancelSession => "cancel_session",
+            Self::FinishSession => "finish_session",
+            Self::SessionStarted => "session_started",
+            Self::SessionCanceled => "session_canceled",
+            Self::SessionFinished => "session_finished",
+            Self::SessionFailed => "session_failed",
+            Self::TaskRequest => "task_request",
+            Self::TtsSentenceEnd => "tts_sentence_end",
+            Self::TtsResponse => "tts_response",
+            Self::TtsSubtitle => "tts_subtitle",
+            Self::Unknown(_) => "unknown",
+        }
+    }
+}
+
+impl fmt::Display for SpeechError {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        match self {
+            Self::InvalidConfig(message)
+            | Self::InvalidHeader(message)
+            | Self::InvalidFrame(message)
+            | Self::Serialize(message)
+            | Self::Io(message)
+            | Self::Upstream(message) => write!(f, "{message}"),
+        }
+    }
+}
+
+impl Error for SpeechError {}
+
+fn build_sized_frame(
+    message_type: u8,
+    flags: u8,
+    serialization: u8,
+    compression: u8,
+    payload: &[u8],
+) -> Result<Vec<u8>, SpeechError> {
+    let payload_len = u32::try_from(payload.len())
+        .map_err(|_| SpeechError::InvalidFrame("语音帧 payload 超过 u32 上限".to_string()))?;
+    let mut frame = Vec::with_capacity(8 + payload.len());
+    frame.push((PROTOCOL_VERSION << 4) | HEADER_SIZE_FOUR_BYTES);
+    frame.push((message_type << 4) | flags);
+    frame.push((serialization << 4) | compression);
+    frame.push(0);
+    frame.extend_from_slice(&payload_len.to_be_bytes());
+    frame.extend_from_slice(payload);
+    Ok(frame)
+}
+
+fn build_tts_event_frame(
+    message_type: u8,
+    serialization: u8,
+    compression: u8,
+    event: i32,
+    session_id: Option<&str>,
+    payload: &[u8],
+) -> Result<Vec<u8>, SpeechError> {
+    let id = session_id.unwrap_or("");
+    let id_bytes = id.as_bytes();
+    let id_len = u32::try_from(id_bytes.len())
+        .map_err(|_| SpeechError::InvalidFrame("TTS session id 超过 u32 上限".to_string()))?;
+    let payload_len = u32::try_from(payload.len())
+        .map_err(|_| SpeechError::InvalidFrame("TTS payload 超过 u32 上限".to_string()))?;
+    let mut frame = Vec::with_capacity(12 + id_bytes.len() + payload.len());
+    frame.push((PROTOCOL_VERSION << 4) | HEADER_SIZE_FOUR_BYTES);
+    frame.push((message_type << 4) | FLAG_WITH_EVENT);
+    frame.push((serialization << 4) | compression);
+    frame.push(0);
+    frame.extend_from_slice(&event.to_be_bytes());
+    if session_id.is_some() {
+        frame.extend_from_slice(&id_len.to_be_bytes());
+        frame.extend_from_slice(id_bytes);
+    }
+    frame.extend_from_slice(&payload_len.to_be_bytes());
+    frame.extend_from_slice(payload);
+    Ok(frame)
+}
+
+fn parse_header(bytes: &[u8]) -> Result<SpeechFrameHeader, SpeechError> {
+    if bytes.len() < 4 {
+        return Err(SpeechError::InvalidFrame(
+            "语音帧长度不足 4 字节".to_string(),
+        ));
+    }
+    Ok(SpeechFrameHeader {
+        version: bytes[0] >> 4,
+        header_size_words: bytes[0] & 0x0f,
+        message_type: bytes[1] >> 4,
+        flags: bytes[1] & 0x0f,
+        serialization: bytes[2] >> 4,
+        compression: bytes[2] & 0x0f,
+    })
+}
+
+fn read_payload_value(
+    bytes: &[u8],
+    offset: &mut usize,
+    compression: SpeechCompression,
+) -> Result<Value, SpeechError> {
+    let payload = read_payload_bytes(bytes, offset)?;
+    let payload = match compression {
+        SpeechCompression::None => payload,
+        SpeechCompression::Gzip => ungzip_bytes(&payload)?,
+    };
+    if payload.is_empty() {
+        return Ok(json!({}));
+    }
+    serde_json::from_slice(&payload)
+        .map_err(|error| SpeechError::InvalidFrame(format!("解析语音 JSON 帧失败:{error}")))
+}
+
+fn read_payload_bytes(bytes: &[u8], offset: &mut usize) -> Result<Vec<u8>, SpeechError> {
+    let payload_len = read_u32(bytes, offset)? as usize;
+    if bytes.len() < *offset + payload_len {
+        return Err(SpeechError::InvalidFrame(
+            "语音帧 payload 长度超过实际数据".to_string(),
+        ));
+    }
+    let payload = bytes[*offset..*offset + payload_len].to_vec();
+    *offset += payload_len;
+    Ok(payload)
+}
+
+fn read_optional_length_prefixed_string(
+    bytes: &[u8],
+    offset: &mut usize,
+) -> Result<Option<String>, SpeechError> {
+    if bytes.len() < *offset + 4 {
+        return Ok(None);
+    }
+    let saved_offset = *offset;
+    let len = read_u32(bytes, offset)? as usize;
+    if bytes.len() < *offset + len {
+        *offset = saved_offset;
+        return Ok(None);
+    }
+    let text = decode_lossy(&bytes[*offset..*offset + len]);
+    *offset += len;
+    if text.is_empty() {
+        Ok(None)
+    } else {
+        Ok(Some(text))
+    }
+}
+
+fn read_i32(bytes: &[u8], offset: &mut usize) -> Result<i32, SpeechError> {
+    if bytes.len() < *offset + 4 {
+        return Err(SpeechError::InvalidFrame("语音帧缺少 i32 字段".to_string()));
+    }
+    let value = i32::from_be_bytes([
+        bytes[*offset],
+        bytes[*offset + 1],
+        bytes[*offset + 2],
+        bytes[*offset + 3],
+    ]);
+    *offset += 4;
+    Ok(value)
+}
+
+fn read_u32(bytes: &[u8], offset: &mut usize) -> Result<u32, SpeechError> {
+    if bytes.len() < *offset + 4 {
+        return Err(SpeechError::InvalidFrame("语音帧缺少 u32 字段".to_string()));
+    }
+    let value = u32::from_be_bytes([
+        bytes[*offset],
+        bytes[*offset + 1],
+        bytes[*offset + 2],
+        bytes[*offset + 3],
+    ]);
+    *offset += 4;
+    Ok(value)
+}
+
+fn gzip_bytes(payload: &[u8]) -> Result<Vec<u8>, SpeechError> {
+    let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
+    encoder
+        .write_all(payload)
+        .map_err(|error| SpeechError::Io(format!("gzip 压缩语音帧失败:{error}")))?;
+    encoder
+        .finish()
+        .map_err(|error| SpeechError::Io(format!("完成 gzip 压缩语音帧失败:{error}")))
+}
+
+fn ungzip_bytes(payload: &[u8]) -> Result<Vec<u8>, SpeechError> {
+    let mut decoder = GzDecoder::new(payload);
+    let mut output = Vec::new();
+    decoder
+        .read_to_end(&mut output)
+        .map_err(|error| SpeechError::Io(format!("gzip 解压语音帧失败:{error}")))?;
+    Ok(output)
+}
+
+fn header_value(value: &str) -> Result<HeaderValue, SpeechError> {
+    HeaderValue::from_str(value)
+        .map_err(|error| SpeechError::InvalidHeader(format!("构造火山语音请求头失败:{error}")))
+}
+
+fn normalize_optional_secret(value: Option<String>) -> Option<String> {
+    value
+        .map(|value| value.trim().to_string())
+        .filter(|value| !value.is_empty())
+}
+
+fn default_if_empty(value: String, default_value: &str) -> String {
+    let value = value.trim();
+    if value.is_empty() {
+        default_value.to_string()
+    } else {
+        value.to_string()
+    }
+}
+
+fn
merge_json_object(target: &mut Value, source: Value) {
+    match (target, source) {
+        (Value::Object(target), Value::Object(source)) => {
+            for (key, value) in source {
+                merge_json_object(target.entry(key).or_insert(Value::Null), value);
+            }
+        }
+        (target, source) => *target = source,
+    }
+}
+
+fn decode_lossy(bytes: &[u8]) -> String {
+    String::from_utf8_lossy(bytes).trim().to_string()
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    fn test_config_with_api_key() -> VolcengineSpeechConfig {
+        VolcengineSpeechConfig::new(
+            Some("api-key".to_string()),
+            None,
+            None,
+            String::new(),
+            String::new(),
+            String::new(),
+            String::new(),
+            String::new(),
+            DEFAULT_REQUEST_TIMEOUT_MS,
+        )
+        .expect("config should build")
+    }
+
+    #[test]
+    fn config_prefers_api_key_auth_and_exposes_no_secret_in_public_config() {
+        let config = test_config_with_api_key();
+
+        assert_eq!(
+            config.auth_mode().unwrap(),
+            VolcengineSpeechAuthMode::ApiKey
+        );
+        let public_config = config.public_config();
+        assert_eq!(public_config.asr_resource_id, DEFAULT_ASR_RESOURCE_ID);
+        assert_eq!(public_config.tts_resource_id, DEFAULT_TTS_RESOURCE_ID);
+        assert_eq!(
+            public_config.endpoints.asr_stream,
+            "/api/speech/volcengine/asr/stream"
+        );
+    }
+
+    #[test]
+    fn config_accepts_legacy_auth_when_api_key_missing() {
+        let config = VolcengineSpeechConfig::new(
+            None,
+            Some("app-id".to_string()),
+            Some("access-key".to_string()),
+            String::new(),
+            String::new(),
+            String::new(),
+            String::new(),
+            String::new(),
+            DEFAULT_REQUEST_TIMEOUT_MS,
+        )
+        .expect("legacy config should build");
+
+        assert_eq!(
+            config.auth_mode().unwrap(),
+            VolcengineSpeechAuthMode::LegacyApp
+        );
+    }
+
+    #[test]
+    fn asr_frame_roundtrip_parses_gzip_json_response() {
+        let payload = json!({ "result": { "text": "你好" } });
+        let payload_bytes = serde_json::to_vec(&payload).unwrap();
+        let compressed = gzip_bytes(&payload_bytes).unwrap();
+        let mut frame = vec![
+            (PROTOCOL_VERSION << 4) | HEADER_SIZE_FOUR_BYTES,
+            (MESSAGE_FULL_SERVER_RESPONSE << 4) | FLAG_WITH_SEQUENCE,
+            (SERIALIZATION_JSON << 4) | COMPRESSION_GZIP,
+            0,
+        ];
+        frame.extend_from_slice(&7_i32.to_be_bytes());
+        frame.extend_from_slice(&(compressed.len() as u32).to_be_bytes());
+        frame.extend_from_slice(&compressed);
+
+        let parsed = parse_asr_response_frame(&frame).expect("asr response should parse");
+
+        assert_eq!(parsed.sequence, Some(7));
+        assert_eq!(parsed.payload["result"]["text"], "你好");
+    }
+
+    #[test]
+    fn asr_full_request_frame_uses_expected_header() {
+        let frame = build_asr_full_client_request(&default_asr_request_payload("user-1", None))
+            .expect("frame should build");
+
+        assert_eq!(frame[0], 0x11);
+        assert_eq!(frame[1], 0x10);
+        assert_eq!(frame[2], 0x11);
+        assert!(frame.len() > 8);
+    }
+
+    #[test]
+    fn tts_start_session_frame_contains_event_and_session_id() {
+        let frame = build_tts_bidirection_frame(
+            TtsEvent::StartSession,
+            Some("session-1"),
+            Some(&json!({ "req_params": { "speaker": "voice" } })),
+        )
+        .expect("tts frame should build");
+
+        assert_eq!(frame[0], 0x11);
+        assert_eq!(frame[1], 0x14);
+        assert_eq!(frame[2], 0x10);
+        assert_eq!(
+            i32::from_be_bytes([frame[4], frame[5], frame[6], frame[7]]),
+            100
+        );
+        assert_eq!(
+            u32::from_be_bytes([frame[8], frame[9], frame[10], frame[11]]),
+            9
+        );
+    }
+
+    #[test]
+    fn tts_response_frame_parses_json_event() {
+        let payload = json!({ "status_code": 20000000, "message": "ok" });
+        let payload_bytes = serde_json::to_vec(&payload).unwrap();
+        let frame = build_tts_event_frame(
+            MESSAGE_FULL_SERVER_RESPONSE,
+            SERIALIZATION_JSON,
+            COMPRESSION_NONE,
+            TtsEvent::SessionFinished.to_i32(),
+            Some("session-1"),
+            &payload_bytes,
+        )
+        .expect("response frame should build");
+
+        let parsed = parse_tts_response_frame(&frame).expect("tts response should parse");
+
+        assert_eq!(parsed.event, Some(TtsEvent::SessionFinished));
+        assert_eq!(parsed.session_id.as_deref(), Some("session-1"));
+        assert_eq!(parsed.payload.unwrap()["status_code"], 20000000);
+    }
+
+    #[test]
+    fn tts_sse_body_uses_snake_case_audio_params() {
+        let body = build_tts_sse_body(
+            TtsSseRequest {
+                text: "你好".to_string(),
+                speaker: "voice".to_string(),
+                model: None,
+                audio_params: Some(TtsAudioParams {
+                    format: "mp3".to_string(),
+                    sample_rate: 24_000,
+                    bit_rate: Some(64_000),
+                }),
+                additions: None,
+                ssml: None,
+            },
+            "user-1",
+        )
+        .expect("sse body should build");
+
+        assert_eq!(body["user"]["uid"], "user-1");
+        assert_eq!(body["req_params"]["audio_params"]["sample_rate"], 24_000);
+        assert_eq!(body["req_params"]["audio_params"]["bit_rate"], 64_000);
+    }
+}
diff --git a/server-rs/crates/shared-contracts/src/creative_agent.rs b/server-rs/crates/shared-contracts/src/creative_agent.rs
index 5b61a148..87976d36 100644
--- a/server-rs/crates/shared-contracts/src/creative_agent.rs
+++ b/server-rs/crates/shared-contracts/src/creative_agent.rs
@@ -520,8 +520,7 @@ mod tests {
             })
             .expect("event data should serialize"),
         };
-        let catalog_payload =
-            serde_json::to_value(catalog_event).expect("event should serialize");
+        let catalog_payload = serde_json::to_value(catalog_event).expect("event should serialize");
         assert_eq!(catalog_payload["event"], json!("puzzle_template_catalog"));
         assert_eq!(
             catalog_payload["data"]["templates"][0]["templateId"],
diff --git a/server-rs/crates/shared-contracts/src/visual_novel.rs b/server-rs/crates/shared-contracts/src/visual_novel.rs
index 43065e33..f373495a 100644
--- a/server-rs/crates/shared-contracts/src/visual_novel.rs
+++ b/server-rs/crates/shared-contracts/src/visual_novel.rs
@@ -28,6 +28,13 @@ pub enum VisualNovelAssetSource {
     External,
 }
 
+#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
+#[serde(rename_all = "snake_case")]
+pub enum VisualNovelAudioGenerationKind {
+    BackgroundMusic,
+    SoundEffect,
+}
+
 #[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)]
 #[serde(rename_all = "snake_case")]
 pub enum VisualNovelSceneAvailability {
@@ -407,6 +414,59 @@ pub struct
VisualNovelCompileResponse { pub work: VisualNovelWorkDetail, } +#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)] +#[serde(rename_all = "camelCase")] +pub struct CreateVisualNovelBackgroundMusicRequest { + pub prompt: String, + pub title: String, + #[serde(default)] + pub tags: Option, + #[serde(default)] + pub model: Option, +} + +#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)] +#[serde(rename_all = "camelCase")] +pub struct CreateVisualNovelSoundEffectRequest { + pub prompt: String, + #[serde(default)] + pub duration: Option, + #[serde(default)] + pub seed: Option, +} + +#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)] +#[serde(rename_all = "camelCase")] +pub struct VisualNovelAudioGenerationTaskResponse { + pub kind: VisualNovelAudioGenerationKind, + pub task_id: String, + pub provider: String, + pub status: String, +} + +#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)] +#[serde(rename_all = "camelCase")] +pub struct PublishVisualNovelGeneratedAudioAssetRequest { + pub scene_id: String, + #[serde(default)] + pub profile_id: Option, +} + +#[derive(Clone, Debug, Serialize, Deserialize, PartialEq, Eq)] +#[serde(rename_all = "camelCase")] +pub struct VisualNovelGeneratedAudioAssetResponse { + pub kind: VisualNovelAudioGenerationKind, + pub task_id: String, + pub provider: String, + pub status: String, + #[serde(default)] + pub asset_object_id: Option, + #[serde(default)] + pub asset_kind: Option, + #[serde(default)] + pub audio_src: Option, +} + #[derive(Clone, Debug, Serialize, Deserialize, PartialEq)] #[serde(rename_all = "camelCase")] pub struct SendVisualNovelMessageRequest { @@ -671,6 +731,36 @@ mod tests { assert_eq!(step_payload["sceneId"], json!("scene-1")); } + #[test] + fn audio_generation_contracts_use_camel_case_fields() { + let request = CreateVisualNovelSoundEffectRequest { + prompt: "雨声".to_string(), + duration: Some(5), + seed: Some(12), + }; + let payload = 
serde_json::to_value(request).expect("request should serialize"); + assert_eq!(payload["duration"], json!(5)); + assert_eq!(payload["seed"], json!(12)); + + let response = VisualNovelGeneratedAudioAssetResponse { + kind: VisualNovelAudioGenerationKind::SoundEffect, + task_id: "task-1".to_string(), + provider: "vector-engine-vidu".to_string(), + status: "completed".to_string(), + asset_object_id: Some("assetobj_1".to_string()), + asset_kind: Some("visual_novel_ambient_sound".to_string()), + audio_src: Some("/generated-custom-world-scenes/a.wav".to_string()), + }; + let payload = serde_json::to_value(response).expect("response should serialize"); + assert_eq!(payload["kind"], json!("sound_effect")); + assert_eq!(payload["taskId"], json!("task-1")); + assert_eq!(payload["assetObjectId"], json!("assetobj_1")); + assert_eq!( + payload["audioSrc"], + json!("/generated-custom-world-scenes/a.wav") + ); + } + #[test] fn runtime_stream_event_uses_tagged_envelope() { let event = VisualNovelRuntimeStreamEvent::Step { diff --git a/server-rs/crates/spacetime-client/src/mapper.rs b/server-rs/crates/spacetime-client/src/mapper.rs index d4278833..73e54930 100644 --- a/server-rs/crates/spacetime-client/src/mapper.rs +++ b/server-rs/crates/spacetime-client/src/mapper.rs @@ -1682,14 +1682,10 @@ pub(crate) fn map_visual_novel_agent_session_procedure_result( let session_json = result .session_json .ok_or_else(|| SpacetimeClientError::missing_snapshot("visual novel agent session 快照"))?; - let session = - serde_json::from_str::(&session_json).map_err( - |error| { - SpacetimeClientError::Runtime(format!( - "visual novel session_json 非法: {error}" - )) - }, - )?; + let session = serde_json::from_str::(&session_json) + .map_err(|error| { + SpacetimeClientError::Runtime(format!("visual novel session_json 非法: {error}")) + })?; Ok(map_visual_novel_agent_session_snapshot(session)) } @@ -1721,13 +1717,15 @@ pub(crate) fn map_visual_novel_works_procedure_result( let items_json = result .items_json 
.ok_or_else(|| SpacetimeClientError::missing_snapshot("visual novel works 快照"))?; - let items = serde_json::from_str::>(&items_json).map_err( - |error| { + let items = + serde_json::from_str::>(&items_json).map_err(|error| { SpacetimeClientError::Runtime(format!("visual novel works items_json 非法: {error}")) - }, - )?; + })?; - Ok(items.into_iter().map(map_visual_novel_work_snapshot).collect()) + Ok(items + .into_iter() + .map(map_visual_novel_work_snapshot) + .collect()) } pub(crate) fn map_visual_novel_run_procedure_result( @@ -1762,7 +1760,10 @@ pub(crate) fn map_visual_novel_history_procedure_result( SpacetimeClientError::Runtime(format!("visual novel history items_json 非法: {error}")) })?; - Ok(items.into_iter().map(map_visual_novel_history_entry).collect()) + Ok(items + .into_iter() + .map(map_visual_novel_history_entry) + .collect()) } pub(crate) fn map_visual_novel_runtime_event_procedure_result( @@ -1776,9 +1777,7 @@ pub(crate) fn map_visual_novel_runtime_event_procedure_result( .event_json .ok_or_else(|| SpacetimeClientError::missing_snapshot("visual novel runtime event 快照"))?; let event = serde_json::from_str::(&event_json).map_err( - |error| { - SpacetimeClientError::Runtime(format!("visual novel event_json 非法: {error}")) - }, + |error| SpacetimeClientError::Runtime(format!("visual novel event_json 非法: {error}")), )?; Ok(map_visual_novel_runtime_event(event)) diff --git a/server-rs/crates/spacetime-client/src/visual_novel.rs b/server-rs/crates/spacetime-client/src/visual_novel.rs index ad4b544a..bbc8226a 100644 --- a/server-rs/crates/spacetime-client/src/visual_novel.rs +++ b/server-rs/crates/spacetime-client/src/visual_novel.rs @@ -29,15 +29,14 @@ impl SpacetimeClient { }; self.call_after_connect(move |connection, sender| { - connection.procedures().create_visual_novel_agent_session_then( - procedure_input, - move |_, result| { + connection + .procedures() + .create_visual_novel_agent_session_then(procedure_input, move |_, result| { let mapped = result 
.map_err(|error| SpacetimeClientError::Procedure(error.to_string())) .and_then(map_visual_novel_agent_session_procedure_result); send_once(&sender, mapped); - }, - ); + }); }) .await } @@ -79,15 +78,14 @@ impl SpacetimeClient { }; self.call_after_connect(move |connection, sender| { - connection.procedures().submit_visual_novel_agent_message_then( - procedure_input, - move |_, result| { + connection + .procedures() + .submit_visual_novel_agent_message_then(procedure_input, move |_, result| { let mapped = result .map_err(|error| SpacetimeClientError::Procedure(error.to_string())) .and_then(map_visual_novel_agent_session_procedure_result); send_once(&sender, mapped); - }, - ); + }); }) .await } @@ -112,12 +110,15 @@ impl SpacetimeClient { self.call_after_connect(move |connection, sender| { connection .procedures() - .finalize_visual_novel_agent_message_turn_then(procedure_input, move |_, result| { - let mapped = result - .map_err(|error| SpacetimeClientError::Procedure(error.to_string())) - .and_then(map_visual_novel_agent_session_procedure_result); - send_once(&sender, mapped); - }); + .finalize_visual_novel_agent_message_turn_then( + procedure_input, + move |_, result| { + let mapped = result + .map_err(|error| SpacetimeClientError::Procedure(error.to_string())) + .and_then(map_visual_novel_agent_session_procedure_result); + send_once(&sender, mapped); + }, + ); }) .await } @@ -140,15 +141,14 @@ impl SpacetimeClient { }; self.call_after_connect(move |connection, sender| { - connection.procedures().compile_visual_novel_work_profile_then( - procedure_input, - move |_, result| { + connection + .procedures() + .compile_visual_novel_work_profile_then(procedure_input, move |_, result| { let mapped = result .map_err(|error| SpacetimeClientError::Procedure(error.to_string())) .and_then(map_visual_novel_agent_session_procedure_result); send_once(&sender, mapped); - }, - ); + }); }) .await } @@ -312,14 +312,15 @@ impl SpacetimeClient { }; self.call_after_connect(move 
|connection, sender| { - connection - .procedures() - .start_visual_novel_run_then(procedure_input, move |_, result| { + connection.procedures().start_visual_novel_run_then( + procedure_input, + move |_, result| { let mapped = result .map_err(|error| SpacetimeClientError::Procedure(error.to_string())) .and_then(map_visual_novel_run_procedure_result); send_once(&sender, mapped); - }); + }, + ); }) .await } @@ -367,15 +368,14 @@ impl SpacetimeClient { }; self.call_after_connect(move |connection, sender| { - connection.procedures().upsert_visual_novel_run_snapshot_then( - procedure_input, - move |_, result| { + connection + .procedures() + .upsert_visual_novel_run_snapshot_then(procedure_input, move |_, result| { let mapped = result .map_err(|error| SpacetimeClientError::Procedure(error.to_string())) .and_then(map_visual_novel_run_procedure_result); send_once(&sender, mapped); - }, - ); + }); }) .await } @@ -400,12 +400,15 @@ impl SpacetimeClient { self.call_after_connect(move |connection, sender| { connection .procedures() - .append_visual_novel_runtime_history_entry_then(procedure_input, move |_, result| { - let mapped = result - .map_err(|error| SpacetimeClientError::Procedure(error.to_string())) - .and_then(map_visual_novel_history_procedure_result); - send_once(&sender, mapped); - }); + .append_visual_novel_runtime_history_entry_then( + procedure_input, + move |_, result| { + let mapped = result + .map_err(|error| SpacetimeClientError::Procedure(error.to_string())) + .and_then(map_visual_novel_history_procedure_result); + send_once(&sender, mapped); + }, + ); }) .await } @@ -421,15 +424,14 @@ impl SpacetimeClient { }; self.call_after_connect(move |connection, sender| { - connection.procedures().list_visual_novel_runtime_history_then( - procedure_input, - move |_, result| { + connection + .procedures() + .list_visual_novel_runtime_history_then(procedure_input, move |_, result| { let mapped = result .map_err(|error| 
SpacetimeClientError::Procedure(error.to_string())) .and_then(map_visual_novel_history_procedure_result); send_once(&sender, mapped); - }, - ); + }); }) .await } @@ -451,15 +453,14 @@ impl SpacetimeClient { }; self.call_after_connect(move |connection, sender| { - connection.procedures().record_visual_novel_runtime_event_then( - procedure_input, - move |_, result| { + connection + .procedures() + .record_visual_novel_runtime_event_then(procedure_input, move |_, result| { let mapped = result .map_err(|error| SpacetimeClientError::Procedure(error.to_string())) .and_then(map_visual_novel_runtime_event_procedure_result); send_once(&sender, mapped); - }, - ); + }); }) .await } diff --git a/src/components/CustomWorldEntityEditorModal.test.tsx b/src/components/CustomWorldEntityEditorModal.test.tsx index 08717ec0..904d793d 100644 --- a/src/components/CustomWorldEntityEditorModal.test.tsx +++ b/src/components/CustomWorldEntityEditorModal.test.tsx @@ -1423,6 +1423,11 @@ test('作品封面上传会先进入 16:9 裁剪面板再提交到后端', async await waitFor(() => { expect(screen.getByText('裁剪上传封面')).toBeTruthy(); }); + expect( + screen.getByRole('button', { name: '拖拽右下角裁剪边界' }), + ).toBeTruthy(); + expect(screen.queryByText('左右位置')).toBeNull(); + expect(screen.queryByText('上下位置')).toBeNull(); await user.click(screen.getByRole('button', { name: '确认裁剪并上传' })); diff --git a/src/components/CustomWorldGenerationView.tsx b/src/components/CustomWorldGenerationView.tsx index bbbe826c..a6f58ddb 100644 --- a/src/components/CustomWorldGenerationView.tsx +++ b/src/components/CustomWorldGenerationView.tsx @@ -47,6 +47,37 @@ function getProgressPercentage(progress: CustomWorldGenerationProgress | null) { return Math.max(0, Math.min(100, progress?.overallProgress ?? 
0)); } +function getStepProgressPercentage(step: { + completed: number; + total: number; + status: string; +}) { + if (step.status === 'completed') { + return 100; + } + + if (step.total <= 0) { + return 0; + } + + return Math.max( + 0, + Math.min(100, Math.round((step.completed / step.total) * 100)), + ); +} + +function getStepStatusLabel(step: { status: string }) { + if (step.status === 'completed') { + return '完成'; + } + + if (step.status === 'active') { + return '进行中'; + } + + return '待处理'; +} + function buildFallbackRenderKey( value: string | null | undefined, fallback: string, @@ -177,30 +208,47 @@ export function CustomWorldGenerationView({
- {steps.map((step, index) => ( -
-
-
- {step.label} + {steps.map((step, index) => { + const stepProgress = getStepProgressPercentage(step); + + return ( +
+
+
+ {step.label} +
+
+ {getStepStatusLabel(step)} {stepProgress}% +
-
- {step.completed}/{step.total} +
+ +
+
+ {step.detail}
-
- {step.detail} -
-
- ))} + ); + })}
{error ? ( diff --git a/src/components/big-fish-runtime/BigFishRuntimeShell.tsx b/src/components/big-fish-runtime/BigFishRuntimeShell.tsx index 111b4427..e2062f0e 100644 --- a/src/components/big-fish-runtime/BigFishRuntimeShell.tsx +++ b/src/components/big-fish-runtime/BigFishRuntimeShell.tsx @@ -27,6 +27,7 @@ type BigFishRuntimeShellProps = { sharePublicWorkCode?: string | null; isBusy?: boolean; error?: string | null; + embedded?: boolean; onBack: () => void; onRestart?: () => void; onSubmitInput: (payload: SubmitBigFishInputRequest) => void; @@ -227,6 +228,7 @@ export function BigFishRuntimeShell({ sharePublicWorkCode = null, isBusy = false, error = null, + embedded = false, onBack, onRestart, onSubmitInput, @@ -360,7 +362,9 @@ export function BigFishRuntimeShell({ if (!run) { return ( -
+
正在进入玩法 @@ -376,7 +380,9 @@ export function BigFishRuntimeShell({ findBigFishAssetSlot(assetSlots, 'stage_background')?.assetUrl?.trim() || null; return ( -
+
-
diff --git a/src/components/match3d-runtime/Match3DRuntimeShell.tsx b/src/components/match3d-runtime/Match3DRuntimeShell.tsx index a4d173a5..8c2515eb 100644 --- a/src/components/match3d-runtime/Match3DRuntimeShell.tsx +++ b/src/components/match3d-runtime/Match3DRuntimeShell.tsx @@ -40,6 +40,7 @@ type Match3DRuntimeShellProps = { run: Match3DRunSnapshot | null; isBusy?: boolean; error?: string | null; + embedded?: boolean; onBack: () => void; onRestart: () => void; onOptimisticRunChange: (run: Match3DRunSnapshot) => void; @@ -301,6 +302,7 @@ export function Match3DRuntimeShell({ run, isBusy = false, error = null, + embedded = false, onBack, onRestart, onOptimisticRunChange, @@ -429,17 +431,21 @@ export function Match3DRuntimeShell({ if (!run) { return ( -
+
{isBusy ? '载入中' : (error ?? '暂无运行态')}
); } return ( -
+
/gu, '>') + .replace(/"/gu, '"'); +} + +function buildPuzzleOnboardingFallbackImage(promptText: string) { + const trimmedPrompt = promptText.trim(); + const displayPrompt = escapePuzzleOnboardingSvgText( + trimmedPrompt.slice(0, 12) || '百梦拼图', + ); + return ( + 'data:image/svg+xml;utf8,' + + encodeURIComponent(` + + + + + + + + + + + + + + + + + + + + ${displayPrompt} +`) + ); +} + +function buildPuzzleOnboardingFallbackWork( + promptText: string, +): PuzzleWorkSummary { + const now = new Date().toISOString(); + const seed = Date.now(); + const coverImageSrc = buildPuzzleOnboardingFallbackImage(promptText); + const level: PuzzleDraftLevel = { + levelId: 'onboarding-local-level-1', + levelName: '梦境拼图', + pictureDescription: promptText, + pictureReference: null, + candidates: [ + { + candidateId: 'onboarding-local-candidate-1', + imageSrc: coverImageSrc, + assetId: 'onboarding-local-asset-1', + prompt: promptText, + actualPrompt: promptText, + sourceType: 'generated', + selected: true, + }, + ], + selectedCandidateId: 'onboarding-local-candidate-1', + coverImageSrc, + coverAssetId: 'onboarding-local-asset-1', + generationStatus: 'ready', + }; + + return { + workId: `onboarding-local-work-${seed}`, + profileId: `onboarding-local-profile-${seed}`, + ownerUserId: 'onboarding-guest', + sourceSessionId: null, + authorDisplayName: '百梦主', + workTitle: '梦境拼图', + workDescription: promptText, + levelName: level.levelName, + summary: promptText, + themeTags: ['新手引导', '拼图'], + coverImageSrc, + coverAssetId: level.coverAssetId, + publicationStatus: 'draft', + updatedAt: now, + publishedAt: null, + playCount: 0, + remixCount: 0, + likeCount: 0, + publishReady: true, + levels: [level], + }; +} + +function shouldUseLocalPuzzleOnboardingFallback(error: unknown) { + return ( + error instanceof ApiClientError && + error.status === 404 && + (error.code === 'NOT_FOUND' || error.message.includes('资源不存在')) + ); +} + function hasSeenPuzzleOnboarding() { if (typeof window === 'undefined') { 
return true; @@ -741,12 +878,14 @@ function PuzzleOnboardingView({ error, onPromptChange, onSubmit, + onSkip, }: { prompt: string; phase: PuzzleOnboardingPhase; error: string | null; onPromptChange: (value: string) => void; onSubmit: () => void; + onSkip: () => void; }) { const isGenerating = phase === 'generating'; const isGenerated = phase === 'generated'; @@ -755,6 +894,14 @@ function PuzzleOnboardingView({ return (
+
{isGenerating ? ( @@ -1390,6 +1537,15 @@ export function PlatformEntryFlowShellImpl({ const [isBigFishLoadingLibrary, setIsBigFishLoadingLibrary] = useState(false); const [bigFishGenerationState, setBigFishGenerationState] = useState(null); + const [activeRecommendEntryKey, setActiveRecommendEntryKey] = useState< + string | null + >(null); + const [activeRecommendRuntimeKind, setActiveRecommendRuntimeKind] = + useState(null); + const [activeRecommendRuntimeError, setActiveRecommendRuntimeError] = + useState(null); + const [isStartingRecommendEntry, setIsStartingRecommendEntry] = + useState(false); const [, setPuzzleOperation] = useState( null, ); @@ -1426,6 +1582,8 @@ export function PlatformEntryFlowShellImpl({ ); const [puzzleGenerationState, setPuzzleGenerationState] = useState(null); + const [puzzleGenerationProgressNowMs, setPuzzleGenerationProgressNowMs] = + useState(() => Date.now()); const [puzzleFormDraftPayload, setPuzzleFormDraftPayload] = useState(null); const [puzzleOnboardingPrompt, setPuzzleOnboardingPrompt] = useState(''); @@ -1953,6 +2111,16 @@ export function PlatformEntryFlowShellImpl({ visualNovelGalleryEntries, ], ); + const recommendRuntimeEntries = useMemo( + () => { + const entryMap = new Map(); + [...featuredGalleryEntries, ...latestGalleryEntries].forEach((entry) => { + entryMap.set(getPlatformPublicGalleryEntryKey(entry), entry); + }); + return Array.from(entryMap.values()); + }, + [featuredGalleryEntries, latestGalleryEntries], + ); const creationHubItems = useMemo( () => @@ -1991,6 +2159,25 @@ export function PlatformEntryFlowShellImpl({ setSelectionStage, ]); + useEffect(() => { + const shouldTickPuzzleProgress = + selectionStage === 'puzzle-generating' && + puzzleGenerationState != null && + puzzleGenerationState.phase !== 'ready' && + puzzleGenerationState.phase !== 'failed'; + + if (!shouldTickPuzzleProgress) { + return undefined; + } + + setPuzzleGenerationProgressNowMs(Date.now()); + const timerId = window.setInterval(() => { + 
setPuzzleGenerationProgressNowMs(Date.now()); + }, 500); + + return () => window.clearInterval(timerId); + }, [puzzleGenerationState, selectionStage]); + const runProtectedAction = useCallback( (action: () => void) => { if (!authUi?.requireAuth) { @@ -2049,6 +2236,18 @@ export function PlatformEntryFlowShellImpl({ }); }, [authUi, isPuzzleOnboardingSaving, savePuzzleOnboardingDraft]); + const skipPuzzleOnboarding = useCallback(() => { + markPuzzleOnboardingSeen(); + setPuzzleOnboardingDraft(null); + setPuzzleOnboardingPrompt(''); + setPuzzleOnboardingPhase('input'); + setPuzzleOnboardingError(null); + setPuzzleRun(null); + setSelectedPuzzleDetail(null); + platformBootstrap.setPlatformTab('home'); + setSelectionStage('platform'); + }, [platformBootstrap, setSelectionStage]); + useEffect(() => { if ( !authUi || @@ -2073,14 +2272,24 @@ export function PlatformEntryFlowShellImpl({ setPuzzleOnboardingPhase('generating'); setPuzzleOnboardingError(null); try { - const response = await generatePuzzleOnboardingWork({ promptText }); - const item: PuzzleWorkSummary = { - ...response.item, - levels: - response.item.levels && response.item.levels.length > 0 - ? response.item.levels - : [response.level], - }; + let item: PuzzleWorkSummary; + try { + const response = await generatePuzzleOnboardingWork({ promptText }); + item = { + ...response.item, + levels: + response.item.levels && response.item.levels.length > 0 + ? 
response.item.levels + : [response.level], + }; + } catch (error) { + if (!shouldUseLocalPuzzleOnboardingFallback(error)) { + throw error; + } + + // Note: when the legacy backend or asset routes are not yet updated, the first-visit experience keeps falling back to a frontend-generated placeholder image and does not write to the works library. + item = buildPuzzleOnboardingFallbackWork(promptText); + } setPuzzleOnboardingDraft({ promptText, item }); setSelectedPuzzleDetail(item); setPuzzleOnboardingPhase('generated'); @@ -2728,7 +2937,7 @@ export function PlatformEntryFlowShellImpl({ if (nextSession) { void refreshPuzzleShelf(); } - }, [puzzleFlow, refreshPuzzleShelf]); + }, [puzzleFlow, refreshPuzzleShelf, sessionController]); const openVisualNovelAgentWorkspace = useCallback(() => { setVisualNovelWork(null); @@ -3250,10 +3459,11 @@ export function PlatformEntryFlowShellImpl({ async ( profileId: string, returnStage: VisualNovelRuntimeReturnStage = 'work-detail', + options: { embedded?: boolean } = {}, ) => { const targetProfileId = profileId.trim(); if (!targetProfileId) { - return; + return false; } setVisualNovelError(null); @@ -3275,17 +3485,21 @@ setVisualNovelWork(workDetail); setVisualNovelRun(run); setVisualNovelRuntimeReturnStage(returnStage); - setSelectionStage('visual-novel-runtime'); - pushAppHistoryPath( - buildPublicWorkStagePath( - 'visual-novel-runtime', - buildVisualNovelPublicWorkCode(targetProfileId), - ), - ); + if (!options.embedded) { + setSelectionStage('visual-novel-runtime'); + pushAppHistoryPath( + buildPublicWorkStagePath( + 'visual-novel-runtime', + buildVisualNovelPublicWorkCode(targetProfileId), + ), + ); + } + return true; } catch (error) { setVisualNovelError( resolvePuzzleErrorMessage(error, '启动视觉小说试玩失败。'), ); + return false; } finally { setIsVisualNovelBusy(false); } @@ -3758,9 +3972,10 @@ export function PlatformEntryFlowShellImpl({ detailItem?: PuzzleWorkSummary, mirrorErrorToPublicDetail = false, levelId?: string | null, + options: { embedded?: boolean } = {}, ) => { if (isPuzzleBusy) { - return; + return false; }
setIsPuzzleBusy(true); @@ -3776,19 +3991,23 @@ export function PlatformEntryFlowShellImpl({ setSelectedPuzzleDetail(item); setPuzzleRun(run); setPuzzleRuntimeReturnStage(returnStage); - setSelectionStage('puzzle-runtime'); - pushAppHistoryPath( - buildPublicWorkStagePath( - 'puzzle-runtime', - buildPuzzlePublicWorkCode(item.profileId), - ), - ); + if (!options.embedded) { + setSelectionStage('puzzle-runtime'); + pushAppHistoryPath( + buildPublicWorkStagePath( + 'puzzle-runtime', + buildPuzzlePublicWorkCode(item.profileId), + ), + ); + } + return true; } catch (error) { const message = resolvePuzzleErrorMessage(error, '启动拼图玩法失败。'); setPuzzleError(message); if (mirrorErrorToPublicDetail) { setPublicWorkDetailError(message); } + return false; } finally { setIsPuzzleBusy(false); } @@ -3799,6 +4018,7 @@ export function PlatformEntryFlowShellImpl({ setIsPuzzleBusy, setPuzzleError, setSelectionStage, + startPuzzleRun, ], ); @@ -3807,9 +4027,10 @@ export function PlatformEntryFlowShellImpl({ profile: Match3DWorkProfile | Match3DWorkSummary, returnStage: 'match3d-result' | 'work-detail' = 'match3d-result', mirrorErrorToPublicDetail = false, + options: { embedded?: boolean } = {}, ) => { if (isMatch3DBusy) { - return; + return false; } match3dFlow.setIsBusy(true); @@ -3819,8 +4040,10 @@ export function PlatformEntryFlowShellImpl({ const { run } = await startMatch3DRun(profile.profileId); setMatch3DRun(run); setMatch3DRuntimeReturnStage(returnStage); - setSelectionStage('match3d-runtime'); - if (profile.publicationStatus === 'published') { + if (!options.embedded) { + setSelectionStage('match3d-runtime'); + } + if (!options.embedded && profile.publicationStatus === 'published') { pushAppHistoryPath( buildPublicWorkStagePath( 'work-detail', @@ -3828,6 +4051,7 @@ export function PlatformEntryFlowShellImpl({ ), ); } + return true; } catch (error) { const message = resolveMatch3DErrorMessage( error, @@ -3837,6 +4061,7 @@ export function PlatformEntryFlowShellImpl({ if 
(mirrorErrorToPublicDetail) { setPublicWorkDetailError(message); } + return false; } finally { match3dFlow.setIsBusy(false); } @@ -3855,9 +4080,10 @@ export function PlatformEntryFlowShellImpl({ profile: SquareHoleWorkProfile | SquareHoleWorkSummary, returnStage: SquareHoleRuntimeReturnStage = 'square-hole-result', mirrorErrorToPublicDetail = false, + options: { embedded?: boolean } = {}, ) => { if (isSquareHoleBusy) { - return; + return false; } squareHoleFlow.setIsBusy(true); @@ -3867,8 +4093,10 @@ export function PlatformEntryFlowShellImpl({ const { run } = await startSquareHoleRun(profile.profileId); setSquareHoleRun(run); setSquareHoleRuntimeReturnStage(returnStage); - setSelectionStage('square-hole-runtime'); - if (profile.publicationStatus === 'published') { + if (!options.embedded) { + setSelectionStage('square-hole-runtime'); + } + if (!options.embedded && profile.publicationStatus === 'published') { pushAppHistoryPath( buildPublicWorkStagePath( 'work-detail', @@ -3876,6 +4104,7 @@ export function PlatformEntryFlowShellImpl({ ), ); } + return true; } catch (error) { const message = resolveSquareHoleErrorMessage( error, @@ -3885,6 +4114,7 @@ export function PlatformEntryFlowShellImpl({ if (mirrorErrorToPublicDetail) { setPublicWorkDetailError(message); } + return false; } finally { squareHoleFlow.setIsBusy(false); } @@ -4424,7 +4654,6 @@ export function PlatformEntryFlowShellImpl({ setPuzzleError, ], ); - const remodelCurrentPuzzleRuntimeWork = useCallback( (profileId: string) => { const targetProfileId = profileId.trim(); @@ -5280,6 +5509,45 @@ export function PlatformEntryFlowShellImpl({ ], ); + const openRecommendGalleryDetail = useCallback( + (entry: PlatformPublicGalleryCard) => { + if (isBigFishGalleryEntry(entry)) { + openPublicWorkDetail(entry); + return; + } + + if (isPuzzleGalleryEntry(entry)) { + void openPuzzlePublicWorkDetail(entry.profileId, { + tab: platformBootstrap.platformTab, + }); + return; + } + + if (isMatch3DGalleryEntry(entry)) { + 
openPublicWorkDetail(entry); + return; + } + + if (isSquareHoleGalleryEntry(entry)) { + openPublicWorkDetail(entry); + return; + } + + if (isVisualNovelGalleryEntry(entry)) { + void openVisualNovelPublicWorkDetail(entry.profileId); + return; + } + + void openRpgPublicWorkDetail(entry); + }, + [ + openPuzzlePublicWorkDetail, + openPublicWorkDetail, + openRpgPublicWorkDetail, + openVisualNovelPublicWorkDetail, + platformBootstrap.platformTab, + ], + ); const openPuzzleDetail = useCallback( async ( profileId: string, @@ -5510,11 +5778,12 @@ export function PlatformEntryFlowShellImpl({ async ( item: BigFishWorkSummary, returnStage: BigFishRuntimeReturnStage = 'work-detail', + options: { embedded?: boolean } = {}, ) => { const sessionId = item.sourceSessionId?.trim(); if (!sessionId) { setBigFishError('当前作品缺少会话信息,暂时无法进入玩法。'); - return; + return false; } const publicWorkCode = buildBigFishPublicWorkCode(item.sourceSessionId); @@ -5533,19 +5802,23 @@ export function PlatformEntryFlowShellImpl({ const { run } = await startBigFishRuntimeRun(sessionId); setBigFishRuntimeStartedAt(Date.now()); setBigFishRun(run); - setSelectionStage('big-fish-runtime'); - pushAppHistoryPath( - buildPublicWorkStagePath('big-fish-runtime', publicWorkCode), - ); + if (!options.embedded) { + setSelectionStage('big-fish-runtime'); + pushAppHistoryPath( + buildPublicWorkStagePath('big-fish-runtime', publicWorkCode), + ); + } void recordBigFishPlay(sessionId, { elapsedMs: 0 }).catch((error) => { setBigFishError( resolveBigFishErrorMessage(error, '记录大鱼吃小鱼游玩失败。'), ); }); + return true; } catch (error) { setBigFishError( resolveBigFishErrorMessage(error, '启动大鱼吃小鱼玩法失败。'), ); + return false; } }, [ @@ -5675,6 +5948,419 @@ export function PlatformEntryFlowShellImpl({ startVisualNovelRunFromProfile, ]); + const selectRecommendRuntimeEntry = useCallback( + async (entry: PlatformPublicGalleryCard) => { + const entryKey = getPlatformPublicGalleryEntryKey(entry); + const runtimeKind = 
getPlatformRecommendRuntimeKind(entry); + setActiveRecommendEntryKey(entryKey); + setActiveRecommendRuntimeKind(runtimeKind); + setActiveRecommendRuntimeError(null); + setIsStartingRecommendEntry(true); + + try { + let started = false; + if (isBigFishGalleryEntry(entry)) { + const work = mapPublicWorkDetailToBigFishWork(entry); + if (!work) { + setBigFishError('当前作品缺少会话信息,暂时无法进入玩法。'); + } else { + started = await startBigFishRunFromWork(work, 'platform', { + embedded: true, + }); + } + } else if (isPuzzleGalleryEntry(entry)) { + const work = + selectedPuzzleDetail?.profileId === entry.profileId + ? selectedPuzzleDetail + : mapPublicWorkDetailToPuzzleWork(entry); + if (!work) { + setPuzzleError('当前拼图作品信息不完整,暂时无法进入玩法。'); + } else { + started = await startPuzzleRunFromProfile( + work.profileId, + 'platform', + work, + false, + null, + { embedded: true }, + ); + } + } else if (isMatch3DGalleryEntry(entry)) { + const work = mapPublicWorkDetailToMatch3DWork(entry); + if (!work) { + setMatch3DError('当前抓大鹅作品信息不完整,暂时无法进入玩法。'); + } else { + started = await startMatch3DRunFromProfile( + work, + 'work-detail', + false, + { embedded: true }, + ); + } + } else if (isSquareHoleGalleryEntry(entry)) { + const work = mapPublicWorkDetailToSquareHoleWork(entry); + if (!work) { + setSquareHoleError('当前方洞挑战作品信息不完整,暂时无法进入玩法。'); + } else { + started = await startSquareHoleRunFromProfile( + work, + 'platform', + false, + { embedded: true }, + ); + } + } else if (isVisualNovelGalleryEntry(entry)) { + started = await startVisualNovelRunFromProfile( + entry.profileId, + 'platform', + { embedded: true }, + ); + } else { + started = true; + } + + setActiveRecommendRuntimeKind(started ? 
runtimeKind : null); + } finally { + setIsStartingRecommendEntry(false); + } + }, + [ + selectedPuzzleDetail, + setBigFishError, + setMatch3DError, + setPuzzleError, + setSquareHoleError, + startBigFishRunFromWork, + startMatch3DRunFromProfile, + startPuzzleRunFromProfile, + startSquareHoleRunFromProfile, + startVisualNovelRunFromProfile, + ], + ); + + const recommendRuntimeContent = useMemo(() => { + if ( + selectionStage !== 'platform' || + platformBootstrap.platformTab !== 'home' || + !activeRecommendRuntimeKind + ) { + return null; + } + + const activeEntry = + recommendRuntimeEntries.find( + (entry) => + getPlatformPublicGalleryEntryKey(entry) === activeRecommendEntryKey, + ) ?? null; + if (!activeEntry) { + return null; + } + + if (activeRecommendRuntimeKind === 'big-fish') { + return ( + { + reportBigFishObservedPlayTime(); + setActiveRecommendRuntimeKind(null); + }} + onRestart={() => { + reportBigFishObservedPlayTime(); + void restartBigFishRun(); + }} + onSubmitInput={submitBigFishInput} + /> + ); + } + + if (activeRecommendRuntimeKind === 'match3d') { + return ( + { + setActiveRecommendRuntimeKind(null); + }} + onRestart={() => { + if (!match3dRun?.runId || isMatch3DBusy) { + return; + } + + match3dFlow.setIsBusy(true); + setMatch3DError(null); + void restartMatch3DRun(match3dRun.runId) + .then(({ run }) => { + setMatch3DRun(run); + }) + .catch((error) => { + setMatch3DError( + resolveMatch3DErrorMessage( + error, + '重新开始抓大鹅玩法失败。', + ), + ); + }) + .finally(() => { + match3dFlow.setIsBusy(false); + }); + }} + onOptimisticRunChange={setMatch3DRun} + onClickItem={(payload) => { + const runId = payload.runId ?? 
match3dRun?.runId; + if (!runId) { + return Promise.reject(new Error('抓大鹅运行态缺少 runId。')); + } + return clickMatch3DItem(runId, payload); + }} + onTimeExpired={() => { + if (!match3dRun?.runId) { + return; + } + + void finishMatch3DTimeUp(match3dRun.runId) + .then(({ run }) => { + setMatch3DRun(run); + }) + .catch((error) => { + setMatch3DError( + resolveMatch3DErrorMessage( + error, + '同步抓大鹅倒计时失败。', + ), + ); + }); + }} + /> + ); + } + + if (activeRecommendRuntimeKind === 'puzzle') { + return ( + { + setActiveRecommendRuntimeKind(null); + }} + onRemodelWork={ + selectedPuzzleDetail?.publicationStatus === 'published' + ? remodelCurrentPuzzleRuntimeWork + : undefined + } + onSwapPieces={(payload) => { + void swapPuzzlePiecesInRun(payload); + }} + onDragPiece={(payload) => { + void dragPuzzlePiece(payload); + }} + onAdvanceNextLevel={(target) => { + void advancePuzzleLevel(target); + }} + onRestartLevel={() => { + void restartPuzzleCurrentLevel(); + }} + onPauseChange={setPuzzleRuntimePaused} + onUseProp={usePuzzleProp} + onTimeExpired={syncPuzzleRuntimeTimeout} + /> + ); + } + + if (activeRecommendRuntimeKind === 'square-hole') { + return ( + { + if ( + squareHoleRun?.runId && + squareHoleRun.status.toLowerCase() === 'running' + ) { + void stopSquareHoleRun(squareHoleRun.runId).catch( + () => undefined, + ); + } + setActiveRecommendRuntimeKind(null); + }} + onRestart={() => { + if (!squareHoleRun?.runId || isSquareHoleBusy) { + return; + } + + squareHoleFlow.setIsBusy(true); + setSquareHoleError(null); + void restartSquareHoleRun(squareHoleRun.runId) + .then(({ run }) => { + setSquareHoleRun(run); + }) + .catch((error) => { + setSquareHoleError( + resolveSquareHoleErrorMessage( + error, + '重新开始方洞挑战失败。', + ), + ); + }) + .finally(() => { + squareHoleFlow.setIsBusy(false); + }); + }} + onOptimisticRunChange={setSquareHoleRun} + onDropShape={(payload) => { + const runId = payload.runId ?? 
squareHoleRun?.runId; + if (!runId) { + return Promise.reject(new Error('方洞挑战运行态缺少 runId。')); + } + return dropSquareHoleShape(runId, payload); + }} + onTimeExpired={() => { + if (!squareHoleRun?.runId) { + return; + } + + void finishSquareHoleTimeUp(squareHoleRun.runId) + .then(({ run }) => { + setSquareHoleRun(run); + }) + .catch((error) => { + setSquareHoleError( + resolveSquareHoleErrorMessage( + error, + '同步方洞挑战倒计时失败。', + ), + ); + }); + }} + /> + ); + } + + if (activeRecommendRuntimeKind === 'visual-novel') { + return ( + { + setActiveRecommendRuntimeKind(null); + }} + onSubmitAction={(payload) => { + void submitVisualNovelRuntimeAction(payload); + }} + /> + ); + } + + return ( +
+ 正在读取世界 +
+ ); + }, [ + activeRecommendEntryKey, + activeRecommendRuntimeKind, + bigFishError, + bigFishRun, + bigFishRuntimeShare, + bigFishSession?.assetSlots, + isBigFishBusy, + isMatch3DBusy, + isPuzzleBusy, + isPuzzleLeaderboardBusy, + isPuzzleNextLevelGenerating, + isSquareHoleBusy, + isVisualNovelBusy, + match3dError, + match3dFlow, + match3dRun, + puzzleError, + puzzleRun, + recommendRuntimeEntries, + remodelCurrentPuzzleRuntimeWork, + reportBigFishObservedPlayTime, + restartBigFishRun, + selectedPuzzleDetail, + selectionStage, + setMatch3DError, + setMatch3DRun, + setPuzzleRuntimePaused, + setSquareHoleRun, + squareHoleError, + squareHoleFlow, + squareHoleRun, + submitBigFishInput, + submitVisualNovelRuntimeAction, + advancePuzzleLevel, + dragPuzzlePiece, + restartPuzzleCurrentLevel, + setSquareHoleError, + swapPuzzlePiecesInRun, + syncPuzzleRuntimeTimeout, + usePuzzleProp, + visualNovelError, + visualNovelRun, + visualNovelSession, + visualNovelWork, + ]); + + useEffect(() => { + if ( + selectionStage !== 'platform' || + platformBootstrap.platformTab !== 'home' || + platformBootstrap.isLoadingPlatform + ) { + return; + } + + if (recommendRuntimeEntries.length === 0) { + setActiveRecommendEntryKey(null); + setActiveRecommendRuntimeKind(null); + setActiveRecommendRuntimeError(null); + return; + } + + const hasActiveEntry = + activeRecommendEntryKey && + recommendRuntimeEntries.some( + (entry) => + getPlatformPublicGalleryEntryKey(entry) === activeRecommendEntryKey, + ); + if (hasActiveEntry || isStartingRecommendEntry) { + return; + } + + const firstRecommendEntry = recommendRuntimeEntries[0]; + if (firstRecommendEntry) { + void selectRecommendRuntimeEntry(firstRecommendEntry); + } + }, [ + activeRecommendEntryKey, + isStartingRecommendEntry, + platformBootstrap.isLoadingPlatform, + platformBootstrap.platformTab, + recommendRuntimeEntries, + selectRecommendRuntimeEntry, + selectionStage, + ]); + const remixPublicWork = useCallback( (entry: PlatformPublicGalleryCard) 
=> { if (isPublicWorkDetailBusy) { @@ -6492,14 +7178,11 @@ export function PlatformEntryFlowShellImpl({ ); const creationStartContent = ( -
+
-

- 10分钟创作一个精品互动玩法 -

@@ -6531,9 +7214,9 @@ export function PlatformEntryFlowShellImpl({ } handleCreationHubCreateType(item.id); }} - className={`platform-creation-reference-card platform-interactive-card relative flex min-h-[4.9rem] w-[10.75rem] shrink-0 snap-start flex-col overflow-hidden rounded-[1.1rem] border p-0 text-left transition sm:min-h-[7.25rem] sm:w-[13.5rem] sm:rounded-[1.35rem] ${ + className={`platform-creation-reference-card platform-interactive-card relative flex min-h-[4.75rem] w-[10.75rem] shrink-0 snap-start flex-col overflow-hidden rounded-[1.1rem] border p-0 text-left transition sm:min-h-[6.85rem] sm:w-[13.25rem] sm:rounded-[1.28rem] ${ selected - ? 'border-[var(--platform-primary)] bg-[var(--platform-primary)] text-white shadow-[0_10px_24px_rgba(255,79,139,0.2)]' + ? 'border-white/48 bg-black/12 text-white shadow-none ring-1 ring-inset ring-white/34' : 'border-[var(--platform-subpanel-border)] bg-[var(--platform-subpanel-fill)] text-white hover:border-[var(--platform-surface-hover-border)]' } ${disabled ? 'cursor-not-allowed opacity-55' : ''}`} > @@ -6546,7 +7229,7 @@ export function PlatformEntryFlowShellImpl({ )} - + {item.title} @@ -6575,7 +7258,7 @@ export function PlatformEntryFlowShellImpl({
-
+
}> { - if (isBigFishGalleryEntry(entry)) { - openPublicWorkDetail(entry); - return; - } - - if (isPuzzleGalleryEntry(entry)) { - void openPuzzlePublicWorkDetail(entry.profileId, { - tab: platformBootstrap.platformTab, - }); - return; - } - - if (isMatch3DGalleryEntry(entry)) { - openPublicWorkDetail(entry); - return; - } - - if (isSquareHoleGalleryEntry(entry)) { - openPublicWorkDetail(entry); - return; - } - - if (isVisualNovelGalleryEntry(entry)) { - void openVisualNovelPublicWorkDetail(entry.profileId); - return; - } - - void openRpgPublicWorkDetail(entry); + onOpenGalleryDetail={openRecommendGalleryDetail} + recommendRuntimeContent={recommendRuntimeContent} + activeRecommendEntryKey={activeRecommendEntryKey} + isStartingRecommendEntry={ + isStartingRecommendEntry || + isBigFishBusy || + isPuzzleBusy || + isMatch3DBusy || + isSquareHoleBusy || + isVisualNovelBusy + } + recommendRuntimeError={activeRecommendRuntimeError} + onSelectRecommendEntry={(entry) => { + void selectRecommendRuntimeEntry(entry); }} onOpenLibraryDetail={(entry) => { runProtectedAction(() => { @@ -7513,6 +8181,7 @@ export function PlatformEntryFlowShellImpl({ onSubmit={() => { void submitPuzzleOnboardingPrompt(); }} + onSkip={skipPuzzleOnboarding} /> )} @@ -7538,6 +8207,7 @@ export function PlatformEntryFlowShellImpl({ )} progress={buildMiniGameDraftGenerationProgress( puzzleGenerationState, + puzzleGenerationProgressNowMs, )} isGenerating={isPuzzleBusy} error={puzzleError} diff --git a/src/components/puzzle-agent/PuzzleAgentWorkspace.interaction.test.tsx b/src/components/puzzle-agent/PuzzleAgentWorkspace.interaction.test.tsx index 3a60f4dd..cad64ead 100644 --- a/src/components/puzzle-agent/PuzzleAgentWorkspace.interaction.test.tsx +++ b/src/components/puzzle-agent/PuzzleAgentWorkspace.interaction.test.tsx @@ -145,7 +145,7 @@ test('puzzle workspace submits the work form instead of agent chat', () => { fireEvent.change(screen.getByLabelText('画面描述'), { target: { value: '一只猫在雨夜灯牌下回头。' }, }); - 
fireEvent.click(screen.getByRole('button', { name: /生成草稿/u })); + fireEvent.click(screen.getByRole('button', { name: /生成拼图游戏草稿/u })); expect(onCreateFromForm).toHaveBeenCalledWith({ seedText: '一只猫在雨夜灯牌下回头。', @@ -178,9 +178,20 @@ test('puzzle workspace keeps the reference image upload as a primary panel', () expect(uploadCard).not.toBeNull(); expect(uploadCard?.closest('.platform-subpanel')).toBeNull(); expect(container.querySelector('.puzzle-image-upload-card')).toBeTruthy(); + expect(container.querySelector('.puzzle-creation-form-body')?.className).toContain( + 'overflow-hidden', + ); + expect(container.querySelector('.puzzle-image-field')?.className).toContain( + 'flex-1', + ); expect(screen.getByText('拼图画面')).toBeTruthy(); - expect(screen.getByText('点击上传拼图图片')).toBeTruthy(); + expect( + screen + .getByText('若没有合适的图片可以通过填写画面描述生成画面') + .closest('.puzzle-image-upload-card'), + ).toBeTruthy(); + expect(screen.getByText('点击上传拼图图片').closest('.puzzle-image-upload-card')).toBeTruthy(); expect(screen.queryByRole('switch', { name: 'AI重绘' })).toBeNull(); expect(screen.queryByLabelText('拼图创作模板')).toBeNull(); expect( @@ -190,15 +201,19 @@ test('puzzle workspace keeps the reference image upload as a primary panel', () (screen.getByLabelText('画面描述') as HTMLTextAreaElement).placeholder, ).toBe(''); expect(screen.queryByText(/一只猫在雨夜灯牌下回头/u)).toBeNull(); - expect(screen.getByLabelText('画面描述').className).toContain( - 'min-h-[clamp(5rem,15svh,7rem)]', - ); + expect(screen.getByLabelText('画面描述').className).toContain('h-[6rem]'); expect(uploadCard?.className).toContain('aspect-square'); + expect(uploadCard?.className).toContain('h-full'); + expect( + screen + .getByRole('button', { name: /生成拼图游戏草稿/u }) + .parentElement?.className, + ).toContain('justify-center'); fireEvent.change(screen.getByLabelText('画面描述'), { target: { value: '一只猫在阳光窗台上看着毛线球。' }, }); - fireEvent.click(screen.getByRole('button', { name: /生成草稿/u })); + fireEvent.click(screen.getByRole('button', { name: /生成拼图游戏草稿/u 
})); expect(onCreateFromForm).toHaveBeenCalledWith( expect.objectContaining({ pictureDescription: '一只猫在阳光窗台上看着毛线球。', @@ -221,7 +236,7 @@ test('puzzle upload card stays light in light theme', () => { expect(container.querySelector('.puzzle-image-upload-card')).toBeTruthy(); const uploadLabel = screen.getByText('点击上传拼图图片'); expect(uploadLabel).toBeTruthy(); - expect(uploadLabel.closest('.puzzle-image-upload-card')).toBeNull(); + expect(uploadLabel.closest('.puzzle-image-upload-card')).toBeTruthy(); expect(screen.queryByText('AI重绘')).toBeNull(); expect(container.querySelector('.puzzle-image-upload-card')?.className).toContain( 'bg-white/90', @@ -245,7 +260,7 @@ test('puzzle workspace falls back to compile action for restored sessions', () = />, ); - fireEvent.click(screen.getByRole('button', { name: /生成草稿/u })); + fireEvent.click(screen.getByRole('button', { name: /生成拼图游戏草稿/u })); expect(onCreateFromForm).not.toHaveBeenCalled(); expect(onExecuteAction).toHaveBeenCalledWith({ @@ -278,7 +293,7 @@ test('puzzle workspace switches the image model from the description box', () => fireEvent.click(screen.getByRole('button', { name: '图片模型' })); expect(screen.queryByRole('menuitemradio', { name: '原模型' })).toBeNull(); fireEvent.click(screen.getByRole('menuitemradio', { name: 'nanobanana2' })); - fireEvent.click(screen.getByRole('button', { name: /生成草稿/u })); + fireEvent.click(screen.getByRole('button', { name: /生成拼图游戏草稿/u })); expect(onCreateFromForm).toHaveBeenCalledWith( expect.objectContaining({ @@ -390,7 +405,7 @@ test('puzzle workspace hides prompt and cost when AI redraw is off', async () => expect(screen.queryByLabelText('画面AI重绘要求(提示词)')).toBeNull(); expect(screen.queryByText('消耗2光点')).toBeNull(); - fireEvent.click(screen.getByRole('button', { name: /生成草稿/u })); + fireEvent.click(screen.getByRole('button', { name: /生成拼图游戏草稿/u })); expect(onCreateFromForm).toHaveBeenCalledWith({ seedText: 'first-level.png', @@ -425,6 +440,51 @@ test('puzzle workspace shows AI redraw switch 
only after upload', async () => { await waitFor(() => { expect(screen.getByRole('switch', { name: 'AI重绘' })).toBeTruthy(); }); + expect( + screen.getByRole('switch', { name: 'AI重绘' }).closest('.puzzle-image-upload-card'), + ).toBeTruthy(); + expect(screen.getByRole('button', { name: '移除拼图图片' })).toBeTruthy(); + expect(screen.queryByText('点击上传拼图图片')).toBeNull(); +}); + +test('puzzle workspace confirms before removing uploaded image', async () => { + const uploadedDataUrl = 'data:image/png;base64,uploaded-square'; + stubReferenceImageUpload(uploadedDataUrl); + + render( + {}} + onSubmitMessage={() => {}} + onExecuteAction={() => {}} + onCreateFromForm={() => {}} + />, + ); + + fireEvent.change(screen.getByLabelText('上传拼图图片', { selector: 'input' }), { + target: { + files: [new File(['x'], 'first-level.png', { type: 'image/png' })], + }, + }); + + await waitFor(() => { + expect(screen.getByAltText('拼图图片')).toBeTruthy(); + }); + fireEvent.click(screen.getByRole('button', { name: '移除拼图图片' })); + expect( + screen.getByRole('dialog', { name: '移除拼图图片?' }), + ).toBeTruthy(); + expect(screen.getByAltText('拼图图片')).toBeTruthy(); + + fireEvent.click(screen.getByRole('button', { name: '取消' })); + expect(screen.queryByRole('dialog', { name: '移除拼图图片?' 
})).toBeNull(); + expect(screen.getByAltText('拼图图片')).toBeTruthy(); + + fireEvent.click(screen.getByRole('button', { name: '移除拼图图片' })); + fireEvent.click(screen.getByRole('button', { name: '移除' })); + expect(screen.queryByAltText('拼图图片')).toBeNull(); + expect(screen.queryByRole('switch', { name: 'AI重绘' })).toBeNull(); + expect(screen.getByText('点击上传拼图图片')).toBeTruthy(); }); test('puzzle workspace opens crop tool for non-square uploads', async () => { @@ -455,6 +515,12 @@ test('puzzle workspace opens crop tool for non-square uploads', async () => { await waitFor(() => { expect(screen.getByRole('dialog', { name: '裁剪拼图图片' })).toBeTruthy(); }); + expect( + screen.getByRole('button', { name: '拖拽右下角裁剪边界' }), + ).toBeTruthy(); + expect(screen.queryByText('缩放')).toBeNull(); + expect(screen.queryByText('横向')).toBeNull(); + expect(screen.queryByText('纵向')).toBeNull(); fireEvent.click(screen.getByRole('button', { name: '应用' })); await waitFor(() => { diff --git a/src/components/puzzle-agent/PuzzleAgentWorkspace.tsx b/src/components/puzzle-agent/PuzzleAgentWorkspace.tsx index 4db8a473..adaf2a46 100644 --- a/src/components/puzzle-agent/PuzzleAgentWorkspace.tsx +++ b/src/components/puzzle-agent/PuzzleAgentWorkspace.tsx @@ -1,6 +1,7 @@ -import { ArrowLeft, ImagePlus, Loader2, Sparkles, X } from 'lucide-react'; +import { ArrowLeft, ImagePlus, Loader2, Sparkles, Trash2 } from 'lucide-react'; import { type ChangeEvent, + type CSSProperties, type PointerEvent, useEffect, useMemo, @@ -62,11 +63,79 @@ type PuzzleImageCropState = { imageSize: { width: number; height: number }; cropX: number; cropY: number; - scale: number; + cropSize: number; error: string | null; isSaving: boolean; }; +type PuzzleCropDragHandle = + | 'move' + | 'north' + | 'northEast' + | 'east' + | 'southEast' + | 'south' + | 'southWest' + | 'west' + | 'northWest'; + +type PuzzleCropDragSnapshot = { + pointerId: number; + handle: PuzzleCropDragHandle; + clientX: number; + clientY: number; + cropRect: { x: number; y: 
number; size: number };
+  previewWidth: number;
+  previewHeight: number;
+};
+
+const PUZZLE_CROP_RESIZE_HANDLES: Array<{
+  handle: Exclude<PuzzleCropDragHandle, 'move'>;
+  label: string;
+  className: string;
+}> = [
+  {
+    handle: 'northWest',
+    label: '拖拽左上角裁剪边界',
+    className: 'left-0 top-0 -translate-x-1/2 -translate-y-1/2 cursor-nwse-resize',
+  },
+  {
+    handle: 'north',
+    label: '拖拽上边裁剪边界',
+    className: 'left-1/2 top-0 -translate-x-1/2 -translate-y-1/2 cursor-ns-resize',
+  },
+  {
+    handle: 'northEast',
+    label: '拖拽右上角裁剪边界',
+    className: 'right-0 top-0 translate-x-1/2 -translate-y-1/2 cursor-nesw-resize',
+  },
+  {
+    handle: 'east',
+    label: '拖拽右边裁剪边界',
+    className: 'right-0 top-1/2 -translate-y-1/2 translate-x-1/2 cursor-ew-resize',
+  },
+  {
+    handle: 'southEast',
+    label: '拖拽右下角裁剪边界',
+    className: 'bottom-0 right-0 translate-x-1/2 translate-y-1/2 cursor-nwse-resize',
+  },
+  {
+    handle: 'south',
+    label: '拖拽下边裁剪边界',
+    className: 'bottom-0 left-1/2 -translate-x-1/2 translate-y-1/2 cursor-ns-resize',
+  },
+  {
+    handle: 'southWest',
+    label: '拖拽左下角裁剪边界',
+    className: 'bottom-0 left-0 -translate-x-1/2 translate-y-1/2 cursor-nesw-resize',
+  },
+  {
+    handle: 'west',
+    label: '拖拽左边裁剪边界',
+    className: 'left-0 top-1/2 -translate-x-1/2 -translate-y-1/2 cursor-ew-resize',
+  },
+];
+
 function resolveInitialFormState(
   session: PuzzleAgentSessionSnapshot | null,
   initialFormPayload: CreatePuzzleAgentSessionRequest | null = null,
number; y: number }, + crop: { x: number; y: number; size: number }, ) { - const cropSize = Math.min(imageSize.width, imageSize.height) / scale; - const maxCropX = Math.max(0, imageSize.width - cropSize); - const maxCropY = Math.max(0, imageSize.height - cropSize); + const { minSize, maxSize } = getPuzzleCropSizeBounds(imageSize); + const size = clampNumber(crop.size, minSize, maxSize); return { - x: Math.max(0, Math.min(maxCropX, crop.x)), - y: Math.max(0, Math.min(maxCropY, crop.y)), + x: clampNumber(crop.x, 0, Math.max(0, imageSize.width - size)), + y: clampNumber(crop.y, 0, Math.max(0, imageSize.height - size)), + size, }; } +function buildPuzzleCropPreviewStyle( + crop: { x: number; y: number; size: number }, + imageSize: { width: number; height: number }, +) { + return { + left: `${(crop.x / imageSize.width) * 100}%`, + top: `${(crop.y / imageSize.height) * 100}%`, + width: `${(crop.size / imageSize.width) * 100}%`, + height: `${(crop.size / imageSize.height) * 100}%`, + } satisfies CSSProperties; +} + +function resizePuzzleCropRectFromHandle( + snapshot: PuzzleCropDragSnapshot, + deltaX: number, + deltaY: number, + imageSize: { width: number; height: number }, +) { + const start = snapshot.cropRect; + const startRight = start.x + start.size; + const startBottom = start.y + start.size; + const startCenterX = start.x + start.size / 2; + const startCenterY = start.y + start.size / 2; + const { minSize, maxSize } = getPuzzleCropSizeBounds(imageSize); + const chooseSize = (sizeFromX: number, sizeFromY: number) => { + const xDistance = Math.abs(sizeFromX - start.size); + const yDistance = Math.abs(sizeFromY - start.size); + + return xDistance >= yDistance ? 
sizeFromX : sizeFromY; + }; + const clampSize = (size: number, maxByAnchor = maxSize) => + clampNumber(size, minSize, Math.max(minSize, Math.min(maxSize, maxByAnchor))); + + if (snapshot.handle === 'move') { + return clampPuzzleImageCropRect(imageSize, { + ...start, + x: start.x + deltaX, + y: start.y + deltaY, + }); + } + + if (snapshot.handle === 'east' || snapshot.handle === 'west') { + const isEast = snapshot.handle === 'east'; + const anchorX = isEast ? start.x : startRight; + const maxByAnchorX = isEast ? imageSize.width - anchorX : anchorX; + const maxByCenterY = + 2 * Math.min(startCenterY, imageSize.height - startCenterY); + const size = clampSize( + start.size + (isEast ? deltaX : -deltaX), + Math.min(maxByAnchorX, maxByCenterY), + ); + + return clampPuzzleImageCropRect(imageSize, { + x: isEast ? anchorX : anchorX - size, + y: startCenterY - size / 2, + size, + }); + } + + if (snapshot.handle === 'north' || snapshot.handle === 'south') { + const isSouth = snapshot.handle === 'south'; + const anchorY = isSouth ? start.y : startBottom; + const maxByAnchorY = isSouth ? imageSize.height - anchorY : anchorY; + const maxByCenterX = + 2 * Math.min(startCenterX, imageSize.width - startCenterX); + const size = clampSize( + start.size + (isSouth ? deltaY : -deltaY), + Math.min(maxByAnchorY, maxByCenterX), + ); + + return clampPuzzleImageCropRect(imageSize, { + x: startCenterX - size / 2, + y: isSouth ? anchorY : anchorY - size, + size, + }); + } + + const isEast = snapshot.handle === 'northEast' || snapshot.handle === 'southEast'; + const isSouth = snapshot.handle === 'southEast' || snapshot.handle === 'southWest'; + const anchorX = isEast ? start.x : startRight; + const anchorY = isSouth ? start.y : startBottom; + const maxByAnchorX = isEast ? imageSize.width - anchorX : anchorX; + const maxByAnchorY = isSouth ? imageSize.height - anchorY : anchorY; + const sizeFromX = start.size + (isEast ? deltaX : -deltaX); + const sizeFromY = start.size + (isSouth ? 
deltaY : -deltaY);
+  const size = clampSize(
+    chooseSize(sizeFromX, sizeFromY),
+    Math.min(maxByAnchorX, maxByAnchorY),
+  );
+
+  return clampPuzzleImageCropRect(imageSize, {
+    x: isEast ? anchorX : anchorX - size,
+    y: isSouth ? anchorY : anchorY - size,
+    size,
+  });
+}
+
 function PuzzleImageCropModal({
   state,
-  onScaleChange,
-  onCropChange,
+  onCropRectChange,
   onClose,
   onSubmit,
 }: {
   state: PuzzleImageCropState;
-  onScaleChange: (value: number) => void;
-  onCropChange: (nextCrop: { x: number; y: number }) => void;
+  onCropRectChange: (nextCrop: { x: number; y: number; size: number }) => void;
   onClose: () => void;
   onSubmit: () => void;
 }) {
   const previewRef = useRef<HTMLDivElement | null>(null);
-  const dragStartRef = useRef<{
-    pointerId: number;
-    clientX: number;
-    clientY: number;
-    cropX: number;
-    cropY: number;
-  } | null>(null);
-  const [isDragging, setIsDragging] = useState(false);
-  const cropSize = Math.min(state.imageSize.width, state.imageSize.height) /
-    state.scale;
-  const maxCropX = Math.max(0, state.imageSize.width - cropSize);
-  const maxCropY = Math.max(0, state.imageSize.height - cropSize);
-  const backgroundSize = `${(state.imageSize.width / cropSize) * 100}% ${(state.imageSize.height / cropSize) * 100}%`;
-  const backgroundPosition = `${maxCropX > 0 ? (state.cropX / maxCropX) * 100 : 50}% ${maxCropY > 0 ? (state.cropY / maxCropY) * 100 : 50}%`;
-  const updateDragCrop = (event: PointerEvent) => {
-    const dragStart = dragStartRef.current;
+  const dragSnapshotRef = useRef<PuzzleCropDragSnapshot | null>(null);
+  const [activeDragHandle, setActiveDragHandle] =
+    useState<PuzzleCropDragHandle | null>(null);
+  const cropRect = useMemo(
+    () =>
+      clampPuzzleImageCropRect(state.imageSize, {
+        x: state.cropX,
+        y: state.cropY,
+        size: state.cropSize,
+      }),
+    [state.cropSize, state.cropX, state.cropY, state.imageSize],
+  );
+  const previewStyle = useMemo(
+    () => buildPuzzleCropPreviewStyle(cropRect, state.imageSize),
+    [cropRect, state.imageSize],
+  );
+  const editorPreviewStyle = useMemo(
+    () =>
+      ({
+        aspectRatio: `${state.imageSize.width} / ${state.imageSize.height}`,
+        width: `min(100%, calc(min(52vh, 22rem) * ${
+          state.imageSize.width / Math.max(1, state.imageSize.height)
+        }))`,
+      }) satisfies CSSProperties,
+    [state.imageSize],
+  );
+
+  const beginCropDrag = (
+    handle: PuzzleCropDragHandle,
+    event: PointerEvent,
+  ) => {
+    if (state.isSaving) {
+      return;
+    }
+
     const preview = previewRef.current;
-    if (!dragStart || !preview || event.pointerId !== dragStart.pointerId) {
+    if (!preview) {
       return;
     }
     const rect = preview.getBoundingClientRect();
-    const sourcePixelsPerPreviewPixel = cropSize / Math.max(1, rect.width);
-    onCropChange({
-      x:
-        dragStart.cropX -
-        (event.clientX - dragStart.clientX) * sourcePixelsPerPreviewPixel,
-      y:
-        dragStart.cropY -
-        (event.clientY - dragStart.clientY) * sourcePixelsPerPreviewPixel,
-    });
+    dragSnapshotRef.current = {
+      pointerId: event.pointerId,
+      handle,
+      clientX: event.clientX,
+      clientY: event.clientY,
+      cropRect,
+      previewWidth: rect.width,
+      previewHeight: rect.height,
+    };
+    setActiveDragHandle(handle);
+    event.preventDefault();
+    event.stopPropagation();
+    event.currentTarget.setPointerCapture(event.pointerId);
   };
-  const stopDragging = (event: PointerEvent) => {
-    if (dragStartRef.current?.pointerId === event.pointerId) {
-      dragStartRef.current = null;
-      setIsDragging(false);
-      event.currentTarget.releasePointerCapture(event.pointerId);
+
+  const updateCropDrag = (event: PointerEvent) => {
+    const snapshot = dragSnapshotRef.current;
+    if (!snapshot || snapshot.pointerId !== event.pointerId) {
+      return;
     }
+
+    const deltaX =
+      ((event.clientX - snapshot.clientX) * state.imageSize.width) /
+      Math.max(1, snapshot.previewWidth);
+    const deltaY =
+      ((event.clientY - snapshot.clientY) * state.imageSize.height) /
+      Math.max(1, snapshot.previewHeight);
+    onCropRectChange(
+      resizePuzzleCropRectFromHandle(snapshot, deltaX, deltaY, state.imageSize),
+    );
+  };
+
+  const stopCropDrag = (event: PointerEvent) => {
+    if (dragSnapshotRef.current?.pointerId !== event.pointerId) {
+      return;
+    }
+
+    dragSnapshotRef.current = null;
+    setActiveDragHandle(null);
+    event.currentTarget.releasePointerCapture(event.pointerId);
   };

   return (
@@ -218,85 +435,53 @@ function PuzzleImageCropModal(
{ - dragStartRef.current = { - pointerId: event.pointerId, - clientX: event.clientX, - clientY: event.clientY, - cropX: state.cropX, - cropY: state.cropY, - }; - setIsDragging(true); - event.currentTarget.setPointerCapture(event.pointerId); - }} - onPointerMove={updateDragCrop} - onPointerUp={stopDragging} - onPointerCancel={stopDragging} - /> -
- -
-
-
+
+ +
+
+
+ ) : null}
); } diff --git a/src/components/puzzle-result/PuzzleResultView.tsx b/src/components/puzzle-result/PuzzleResultView.tsx index 6f4357c1..2ed3bdc1 100644 --- a/src/components/puzzle-result/PuzzleResultView.tsx +++ b/src/components/puzzle-result/PuzzleResultView.tsx @@ -864,7 +864,7 @@ function PuzzleLevelDetailDialog({ {referenceImageSrc ? (
- 拼图参考图 { expect(outlineStroke).toBeTruthy(); expect(outlineStroke?.getAttribute('d')).toContain('Q 2 1 1.84 1'); expect(outlineStroke?.getAttribute('d')).toContain('Q 1 1 1 1.16'); + expect( + container + .querySelector('[data-merged-group-outline="true"]') + ?.getAttribute('fill'), + ).toBe('transparent'); expect((outlinedPieces[0] as HTMLElement).style.clipPath).toBe(''); + for (const outlinedPiece of outlinedPieces) { + const outlinedPieceElement = outlinedPiece as HTMLElement; + expect(outlinedPieceElement.className).not.toContain('bg-emerald-300/10'); + expect( + outlinedPieceElement.querySelector('.absolute.inset-0.bg-black\\/8'), + ).toBeNull(); + } const clippedLayer = container.querySelector( '[style*="clip-path"]', ) as HTMLElement | null; diff --git a/src/components/puzzle-runtime/PuzzleRuntimeShell.tsx b/src/components/puzzle-runtime/PuzzleRuntimeShell.tsx index 64984fa9..f0aeb93f 100644 --- a/src/components/puzzle-runtime/PuzzleRuntimeShell.tsx +++ b/src/components/puzzle-runtime/PuzzleRuntimeShell.tsx @@ -41,6 +41,7 @@ type PuzzleRuntimeShellProps = { isBusy?: boolean; error?: string | null; hideBackButton?: boolean; + embedded?: boolean; onBack: () => void; onRemodelWork?: (profileId: string) => void | Promise; onSwapPieces: (payload: SwapPuzzlePiecesRequest) => void; @@ -308,6 +309,7 @@ export function PuzzleRuntimeShell({ isBusy = false, error = null, hideBackButton = false, + embedded = false, onBack, onRemodelWork, onSwapPieces, @@ -787,7 +789,9 @@ export function PuzzleRuntimeShell({ if (!run || !currentLevel || !board) { return ( -
+
正在进入拼图关卡 @@ -1079,7 +1083,9 @@ export function PuzzleRuntimeShell({ }; return ( -
+
{currentLevel.coverImageSrc ? ( (
)} -
))}
diff --git a/src/components/rpg-creation-editor/RpgCreationEntityEditorShared.tsx b/src/components/rpg-creation-editor/RpgCreationEntityEditorShared.tsx
index 47e696ab..ffe8c17e 100644
--- a/src/components/rpg-creation-editor/RpgCreationEntityEditorShared.tsx
+++ b/src/components/rpg-creation-editor/RpgCreationEntityEditorShared.tsx
@@ -1,6 +1,5 @@
 import { X } from 'lucide-react';
-import type { ChangeEvent } from 'react';
-import type { CSSProperties } from 'react';
+import type { ChangeEvent, CSSProperties, PointerEvent } from 'react';
 import {
   Children,
   type ReactNode,
@@ -954,18 +953,123 @@ function loadImageDimensionsFromDataUrl(source: string) {
   });
 }

+const COVER_CROP_RATIO = 16 / 9;
+
+type CoverCropDragHandle =
+  | 'move'
+  | 'north'
+  | 'northEast'
+  | 'east'
+  | 'southEast'
+  | 'south'
+  | 'southWest'
+  | 'west'
+  | 'northWest';
+
+type CoverCropDragSnapshot = {
+  pointerId: number;
+  handle: CoverCropDragHandle;
+  clientX: number;
+  clientY: number;
+  cropRect: CustomWorldCoverCropRect;
+  previewWidth: number;
+  previewHeight: number;
+};
+
+const COVER_CROP_RESIZE_HANDLES: Array<{
+  handle: Exclude<CoverCropDragHandle, 'move'>;
+  label: string;
+  className: string;
+  dotClassName: string;
+}> = [
+  {
+    handle: 'northWest',
+    label: '拖拽左上角裁剪边界',
+    className: 'left-0 top-0 -translate-x-1/2 -translate-y-1/2 cursor-nwse-resize',
+    dotClassName: 'left-1/2 top-1/2',
+  },
+  {
+    handle: 'north',
+    label: '拖拽上边裁剪边界',
+    className: 'left-1/2 top-0 -translate-x-1/2 -translate-y-1/2 cursor-ns-resize',
+    dotClassName: 'left-1/2 top-1/2',
+  },
+  {
+    handle: 'northEast',
+    label: '拖拽右上角裁剪边界',
+    className: 'right-0 top-0 translate-x-1/2 -translate-y-1/2 cursor-nesw-resize',
+    dotClassName: 'left-1/2 top-1/2',
+  },
+  {
+    handle: 'east',
+    label: '拖拽右边裁剪边界',
+    className: 'right-0 top-1/2 -translate-y-1/2 translate-x-1/2 cursor-ew-resize',
+    dotClassName: 'left-1/2 top-1/2',
+  },
+  {
+    handle: 'southEast',
+    label: '拖拽右下角裁剪边界',
+    className: 'bottom-0 right-0 translate-x-1/2 translate-y-1/2
cursor-nwse-resize', + dotClassName: 'left-1/2 top-1/2', + }, + { + handle: 'south', + label: '拖拽下边裁剪边界', + className: 'bottom-0 left-1/2 -translate-x-1/2 translate-y-1/2 cursor-ns-resize', + dotClassName: 'left-1/2 top-1/2', + }, + { + handle: 'southWest', + label: '拖拽左下角裁剪边界', + className: 'bottom-0 left-0 -translate-x-1/2 translate-y-1/2 cursor-nesw-resize', + dotClassName: 'left-1/2 top-1/2', + }, + { + handle: 'west', + label: '拖拽左边裁剪边界', + className: 'left-0 top-1/2 -translate-x-1/2 -translate-y-1/2 cursor-ew-resize', + dotClassName: 'left-1/2 top-1/2', + }, +]; + +function clampNumber(value: number, min: number, max: number) { + return Math.max(min, Math.min(max, value)); +} + +function getCoverCropSizeBounds(imageSize: { width: number; height: number }) { + const maxWidth = Math.max( + 1, + Math.min(imageSize.width, imageSize.height * COVER_CROP_RATIO), + ); + const minWidth = Math.min(maxWidth, Math.max(48, maxWidth * 0.16)); + + return { minWidth, maxWidth }; +} + +function normalizeCoverCropRect( + cropRect: CustomWorldCoverCropRect, + imageSize: { width: number; height: number }, +): CustomWorldCoverCropRect { + const { minWidth, maxWidth } = getCoverCropSizeBounds(imageSize); + const width = clampNumber(cropRect.width, minWidth, maxWidth); + const height = width / COVER_CROP_RATIO; + const x = clampNumber(cropRect.x, 0, Math.max(0, imageSize.width - width)); + const y = clampNumber(cropRect.y, 0, Math.max(0, imageSize.height - height)); + + return { x, y, width, height }; +} + function buildCenteredCoverCropRect( width: number, height: number, ): CustomWorldCoverCropRect { - const targetRatio = 16 / 9; if (width <= 0 || height <= 0) { return { x: 0, y: 0, width: 1, height: 1 }; } - if (width / height >= targetRatio) { + if (width / height >= COVER_CROP_RATIO) { const cropHeight = height; - const cropWidth = cropHeight * targetRatio; + const cropWidth = cropHeight * COVER_CROP_RATIO; return { x: (width - cropWidth) / 2, y: 0, @@ -975,7 +1079,7 @@ 
function buildCenteredCoverCropRect( } const cropWidth = width; - const cropHeight = cropWidth / targetRatio; + const cropHeight = cropWidth / COVER_CROP_RATIO; return { x: 0, y: (height - cropHeight) / 2, @@ -984,16 +1088,111 @@ function buildCenteredCoverCropRect( }; } -function clampCoverCropRect( - cropRect: CustomWorldCoverCropRect, +function resizeCoverCropRectFromHandle( + snapshot: CoverCropDragSnapshot, + deltaX: number, + deltaY: number, imageSize: { width: number; height: number }, -) { - const width = Math.max(1, Math.min(imageSize.width, cropRect.width)); - const height = Math.max(1, Math.min(imageSize.height, cropRect.height)); - const x = Math.max(0, Math.min(imageSize.width - width, cropRect.x)); - const y = Math.max(0, Math.min(imageSize.height - height, cropRect.y)); +): CustomWorldCoverCropRect { + const start = snapshot.cropRect; + const startRight = start.x + start.width; + const startBottom = start.y + start.height; + const startCenterX = start.x + start.width / 2; + const startCenterY = start.y + start.height / 2; + const { minWidth, maxWidth } = getCoverCropSizeBounds(imageSize); + const chooseWidth = (widthFromX: number, widthFromY: number) => { + const xDistance = Math.abs(widthFromX - start.width); + const yDistance = Math.abs(widthFromY - start.width); - return { x, y, width, height }; + return xDistance >= yDistance ? widthFromX : widthFromY; + }; + const clampWidth = (width: number, maxByAnchor = maxWidth) => + clampNumber(width, minWidth, Math.max(minWidth, Math.min(maxWidth, maxByAnchor))); + + if (snapshot.handle === 'move') { + return normalizeCoverCropRect( + { + ...start, + x: start.x + deltaX, + y: start.y + deltaY, + }, + imageSize, + ); + } + + if (snapshot.handle === 'east' || snapshot.handle === 'west') { + const isEast = snapshot.handle === 'east'; + const anchorX = isEast ? start.x : startRight; + const maxByAnchorX = isEast ? 
imageSize.width - anchorX : anchorX; + const maxByCenterY = + 2 * Math.min(startCenterY, imageSize.height - startCenterY) * + COVER_CROP_RATIO; + const width = clampWidth( + start.width + (isEast ? deltaX : -deltaX), + Math.min(maxByAnchorX, maxByCenterY), + ); + const height = width / COVER_CROP_RATIO; + + return normalizeCoverCropRect( + { + x: isEast ? anchorX : anchorX - width, + y: startCenterY - height / 2, + width, + height, + }, + imageSize, + ); + } + + if (snapshot.handle === 'north' || snapshot.handle === 'south') { + const isSouth = snapshot.handle === 'south'; + const anchorY = isSouth ? start.y : startBottom; + const maxByAnchorY = + (isSouth ? imageSize.height - anchorY : anchorY) * COVER_CROP_RATIO; + const maxByCenterX = + 2 * Math.min(startCenterX, imageSize.width - startCenterX); + const width = clampWidth( + (start.height + (isSouth ? deltaY : -deltaY)) * COVER_CROP_RATIO, + Math.min(maxByAnchorY, maxByCenterX), + ); + const height = width / COVER_CROP_RATIO; + + return normalizeCoverCropRect( + { + x: startCenterX - width / 2, + y: isSouth ? anchorY : anchorY - height, + width, + height, + }, + imageSize, + ); + } + + const isEast = snapshot.handle === 'northEast' || snapshot.handle === 'southEast'; + const isSouth = snapshot.handle === 'southEast' || snapshot.handle === 'southWest'; + const anchorX = isEast ? start.x : startRight; + const anchorY = isSouth ? start.y : startBottom; + const maxByAnchorX = isEast ? imageSize.width - anchorX : anchorX; + const maxByAnchorY = + (isSouth ? imageSize.height - anchorY : anchorY) * COVER_CROP_RATIO; + const widthFromX = start.width + (isEast ? deltaX : -deltaX); + const widthFromY = + (start.height + (isSouth ? deltaY : -deltaY)) * COVER_CROP_RATIO; + const width = clampWidth( + chooseWidth(widthFromX, widthFromY), + Math.min(maxByAnchorX, maxByAnchorY), + ); + const height = width / COVER_CROP_RATIO; + + return normalizeCoverCropRect( + { + x: isEast ? anchorX : anchorX - width, + y: isSouth ? 
anchorY : anchorY - height,
+      width,
+      height,
+    },
+    imageSize,
+  );
 }

 function buildCoverCropPreviewStyle(
@@ -3316,51 +3515,116 @@ function buildGeneratedCoverProfile(
 }

 function CoverUploadCropModal({
   imageDataUrl,
   imageSize,
-  worldName,
   isSubmitting,
   onCancel,
   onConfirm,
 }: {
   imageDataUrl: string;
   imageSize: { width: number; height: number };
-  worldName: string;
   isSubmitting: boolean;
   onCancel: () => void;
   onConfirm: (cropRect: CustomWorldCoverCropRect) => void;
 }) {
-  const [zoomPercent, setZoomPercent] = useState(100);
-  const baseCropRect = useMemo(
-    () => buildCenteredCoverCropRect(imageSize.width, imageSize.height),
-    [imageSize],
+  const previewRef = useRef<HTMLDivElement | null>(null);
+  const dragSnapshotRef = useRef<CoverCropDragSnapshot | null>(null);
+  const [activeDragHandle, setActiveDragHandle] =
+    useState<CoverCropDragHandle | null>(null);
+  const [cropRect, setCropRect] = useState(() =>
+    normalizeCoverCropRect(
+      buildCenteredCoverCropRect(imageSize.width, imageSize.height),
+      imageSize,
+    ),
   );
-  const [offsetX, setOffsetX] = useState(0);
-  const [offsetY, setOffsetY] = useState(0);

   useEffect(() => {
-    setZoomPercent(100);
-    setOffsetX(0);
-    setOffsetY(0);
-  }, [imageDataUrl]);
-
-  const cropRect = useMemo(() => {
-    const scale = Math.max(1, zoomPercent / 100);
-    const nextCropRect = {
-      width: baseCropRect.width / scale,
-      height: baseCropRect.height / scale,
-      x: baseCropRect.x + offsetX,
-      y: baseCropRect.y + offsetY,
-    };
-
-    return clampCoverCropRect(nextCropRect, imageSize);
-  }, [baseCropRect, imageSize, offsetX, offsetY, zoomPercent]);
+    setActiveDragHandle(null);
+    dragSnapshotRef.current = null;
+    setCropRect(
+      normalizeCoverCropRect(
+        buildCenteredCoverCropRect(imageSize.width, imageSize.height),
+        imageSize,
+      ),
+    );
+  }, [imageDataUrl, imageSize]);

   const previewStyle = useMemo(
     () => buildCoverCropPreviewStyle(cropRect, imageSize),
     [cropRect, imageSize],
   );
+  const editorPreviewStyle = useMemo(
+    () =>
+      ({
+        aspectRatio: `${imageSize.width} / ${imageSize.height}`,
+        width: `min(100%, calc(min(58vh,
34rem) * ${ + imageSize.width / Math.max(1, imageSize.height) + }))`, + }) satisfies CSSProperties, + [imageSize], + ); + const outputPreviewStyle = useMemo( + () => + ({ + left: `${-(cropRect.x / cropRect.width) * 100}%`, + top: `${-(cropRect.y / cropRect.height) * 100}%`, + width: `${(imageSize.width / cropRect.width) * 100}%`, + height: `${(imageSize.height / cropRect.height) * 100}%`, + }) satisfies CSSProperties, + [cropRect, imageSize], + ); - const maxOffsetX = Math.max(0, imageSize.width - cropRect.width); - const maxOffsetY = Math.max(0, imageSize.height - cropRect.height); + const beginCropDrag = ( + handle: CoverCropDragHandle, + event: PointerEvent, + ) => { + if (isSubmitting) { + return; + } + + const preview = previewRef.current; + if (!preview) { + return; + } + + const rect = preview.getBoundingClientRect(); + dragSnapshotRef.current = { + pointerId: event.pointerId, + handle, + clientX: event.clientX, + clientY: event.clientY, + cropRect, + previewWidth: rect.width, + previewHeight: rect.height, + }; + setActiveDragHandle(handle); + event.preventDefault(); + event.stopPropagation(); + event.currentTarget.setPointerCapture(event.pointerId); + }; + + const updateCropDrag = (event: PointerEvent) => { + const snapshot = dragSnapshotRef.current; + if (!snapshot || snapshot.pointerId !== event.pointerId) { + return; + } + + const deltaX = + ((event.clientX - snapshot.clientX) * imageSize.width) / + Math.max(1, snapshot.previewWidth); + const deltaY = + ((event.clientY - snapshot.clientY) * imageSize.height) / + Math.max(1, snapshot.previewHeight); + setCropRect(resizeCoverCropRectFromHandle(snapshot, deltaX, deltaY, imageSize)); + }; + + const stopCropDrag = (event: PointerEvent) => { + if (dragSnapshotRef.current?.pointerId !== event.pointerId) { + return; + } + + dragSnapshotRef.current = null; + setActiveDragHandle(null); + event.currentTarget.releasePointerCapture(event.pointerId); + }; return (
-
-
- -
-
- - } - /> -
-
- - setZoomPercent(Number(event.target.value))} - disabled={isSubmitting} - className="w-full accent-sky-400" +
+
+
+ - - - - setOffsetX(Number(event.target.value) - baseCropRect.x) - } - disabled={isSubmitting} - className="w-full accent-sky-400" +
beginCropDrag('move', event)} + onPointerMove={updateCropDrag} + onPointerUp={stopCropDrag} + onPointerCancel={stopCropDrag} /> - - - - setOffsetY(Number(event.target.value) - baseCropRect.y) - } - disabled={isSubmitting} - className="w-full accent-sky-400" - /> - +
+
+
+
+
+
+
+ {COVER_CROP_RESIZE_HANDLES.map((handleConfig) => ( + + ))} +
+
-
- 成品会固定保存为 16:9,并由后端统一压缩到 1600 × 900。 -
-
- 当前裁剪区域: -
- {`x ${Math.round(cropRect.x)} / y ${Math.round(cropRect.y)} / w ${Math.round(cropRect.width)} / h ${Math.round(cropRect.height)}`} +
+
+ +
{ if (isUploading) { diff --git a/src/components/rpg-entry/RpgEntryFlowShell.agent.interaction.test.tsx b/src/components/rpg-entry/RpgEntryFlowShell.agent.interaction.test.tsx index 69349627..2ff53263 100644 --- a/src/components/rpg-entry/RpgEntryFlowShell.agent.interaction.test.tsx +++ b/src/components/rpg-entry/RpgEntryFlowShell.agent.interaction.test.tsx @@ -74,6 +74,10 @@ import { listPuzzleGallery, remixPuzzleGalleryWork, } from '../../services/puzzle-gallery'; +import { + generatePuzzleOnboardingWork, + savePuzzleOnboardingWork, +} from '../../services/puzzle-onboarding'; import { advancePuzzleNextLevel, dragPuzzlePieceOrGroup, @@ -161,7 +165,7 @@ async function clickFirstAsyncButtonByName( async function openCreateTemplateHub(user: ReturnType) { await clickFirstButtonByName(user, '创作'); expect( - await screen.findByText('10分钟创作一个精品互动玩法'), + await screen.findByRole('tablist', { name: '选择模板' }), ).toBeTruthy(); expect(screen.getByRole('tab', { name: '拼图' })).toBeTruthy(); expect(screen.getByText('拼图工作区:missing-session')).toBeTruthy(); @@ -390,6 +394,11 @@ vi.mock('../../services/puzzle-runtime/puzzleLocalRuntime', async () => { }; }); +vi.mock('../../services/puzzle-onboarding', () => ({ + generatePuzzleOnboardingWork: vi.fn(), + savePuzzleOnboardingWork: vi.fn(), +})); + vi.mock('../../services/puzzle-agent', () => ({ createPuzzleAgentSession: vi.fn(), executePuzzleAgentAction: vi.fn(), @@ -2080,6 +2089,107 @@ beforeEach(() => { vi.mocked(listPuzzleGallery).mockResolvedValue({ items: [], }); + vi.mocked(generatePuzzleOnboardingWork).mockResolvedValue({ + item: { + workId: 'onboarding-work-1', + profileId: 'onboarding-profile-1', + ownerUserId: 'onboarding-guest', + sourceSessionId: null, + authorDisplayName: '百梦主', + workTitle: '梦境拼图', + workDescription: '我想飞上天', + levelName: '云上飞行', + summary: '我想飞上天', + themeTags: ['新手引导', '拼图'], + coverImageSrc: 'data:image/svg+xml;utf8,onboarding', + coverAssetId: 'onboarding-asset-1', + publicationStatus: 'draft', + 
updatedAt: '2026-05-05T12:00:00.000Z', + publishedAt: null, + playCount: 0, + remixCount: 0, + likeCount: 0, + publishReady: true, + levels: [], + }, + level: { + levelId: 'onboarding-level-1', + levelName: '云上飞行', + pictureDescription: '我想飞上天', + pictureReference: null, + candidates: [ + { + candidateId: 'onboarding-candidate-1', + imageSrc: 'data:image/svg+xml;utf8,onboarding', + assetId: 'onboarding-asset-1', + prompt: '我想飞上天', + actualPrompt: '我想飞上天', + sourceType: 'generated', + selected: true, + }, + ], + selectedCandidateId: 'onboarding-candidate-1', + coverImageSrc: 'data:image/svg+xml;utf8,onboarding', + coverAssetId: 'onboarding-asset-1', + generationStatus: 'ready', + }, + }); + vi.mocked(savePuzzleOnboardingWork).mockResolvedValue({ + item: { + workId: 'onboarding-work-saved', + profileId: 'onboarding-profile-saved', + ownerUserId: mockAuthUser.id, + sourceSessionId: 'puzzle-session-onboarding', + authorDisplayName: mockAuthUser.displayName, + workTitle: '梦境拼图', + workDescription: '我想飞上天', + levelName: '云上飞行', + summary: '我想飞上天', + themeTags: ['新手引导', '拼图'], + coverImageSrc: 'data:image/svg+xml;utf8,onboarding', + coverAssetId: 'onboarding-asset-1', + publicationStatus: 'draft', + updatedAt: '2026-05-05T12:00:00.000Z', + publishedAt: null, + playCount: 0, + remixCount: 0, + likeCount: 0, + publishReady: true, + levels: [], + anchorPack: { + themePromise: { + key: 'theme_promise', + label: '主题承诺', + value: '新手引导', + status: 'confirmed', + }, + visualSubject: { + key: 'visual_subject', + label: '视觉主体', + value: '云上飞行', + status: 'confirmed', + }, + visualMood: { + key: 'visual_mood', + label: '视觉气质', + value: '明亮', + status: 'confirmed', + }, + compositionHooks: { + key: 'composition_hooks', + label: '构图钩子', + value: '天空', + status: 'confirmed', + }, + tagsAndForbidden: { + key: 'tags_and_forbidden', + label: '标签与禁区', + value: '拼图', + status: 'confirmed', + }, + }, + }, + }); vi.mocked(remixPuzzleGalleryWork).mockRejectedValue( new Error('未启用拼图 remix'), 
  );
@@ -3262,7 +3372,7 @@ test('puzzle draft result back button returns to creation hub', async () => {
   await user.click(screen.getByRole('button', { name: '返回' }));
 
   expect(
-    await screen.findByText('10分钟创作一个精品互动玩法'),
+    await screen.findByRole('tablist', { name: '选择模板' }),
   ).toBeTruthy();
   expect(screen.getByText('雨夜里有一只会发光的猫站在遗迹台阶上。')).toBeTruthy();
   expect(screen.queryByText('拼图结果页')).toBeNull();
@@ -3312,6 +3422,82 @@ test('published puzzle work card restores its source session for editing', async
   expect(screen.getByDisplayValue('雨夜猫塔')).toBeTruthy();
 });
 
+test('first launch puzzle onboarding can be skipped from top right', async () => {
+  const user = userEvent.setup();
+  window.localStorage.removeItem(
+    'genarrative.puzzle-onboarding.first-visit.v1',
+  );
+
+  render(
+    {},
+      requireAuth: () => {},
+    })}
+    />,
+  );
+
+  expect(await screen.findByText('待定待定待定')).toBeTruthy();
+  await user.click(screen.getByRole('button', { name: '跳过' }));
+
+  await waitFor(() => {
+    expect(screen.queryByText('待定待定待定')).toBeNull();
+  });
+  expect(
+    window.localStorage.getItem(
+      'genarrative.puzzle-onboarding.first-visit.v1',
+    ),
+  ).toBe('1');
+  expect(generatePuzzleOnboardingWork).not.toHaveBeenCalled();
+});
+
+test('first launch puzzle onboarding falls back to local run when generate route is missing', async () => {
+  const user = userEvent.setup();
+  window.localStorage.removeItem(
+    'genarrative.puzzle-onboarding.first-visit.v1',
+  );
+  vi.mocked(generatePuzzleOnboardingWork).mockRejectedValueOnce(
+    new ApiClientError({
+      message: '资源不存在',
+      status: 404,
+      code: 'NOT_FOUND',
+    }),
+  );
+
+  render(
+    {},
+      requireAuth: () => {},
+    })}
+    />,
+  );
+
+  await user.type(
+    await screen.findByPlaceholderText('把你的梦讲给我听吧'),
+    '我想飞上天',
+  );
+  await user.click(screen.getByRole('button', { name: '生成' }));
+
+  expect(
+    await screen.findByTestId('puzzle-board', undefined, { timeout: 3000 }),
+  ).toBeTruthy();
+  expect(generatePuzzleOnboardingWork).toHaveBeenCalledWith({
+    promptText: '我想飞上天',
+  });
+  expect(screen.queryByText('资源不存在')).toBeNull();
+  expect(startPuzzleRun).not.toHaveBeenCalled();
+  expect(
+    window.localStorage.getItem(
+      'genarrative.puzzle-onboarding.first-visit.v1',
+    ),
+  ).toBe('1');
+});
+
 test('formal puzzle runtime uses frontend move merge logic and backend leaderboard next level', async () => {
   const user = userEvent.setup();
   const clearedFirstLevel = buildClearedPuzzleRun({
@@ -4717,7 +4903,7 @@ test('agent draft result back button returns to creation hub without syncing res
   await user.click(screen.getByRole('button', { name: /返回创作/u }));
 
   await waitFor(() => {
-    expect(screen.getByText('10分钟创作一个精品互动玩法')).toBeTruthy();
+    expect(screen.getByRole('tablist', { name: '选择模板' })).toBeTruthy();
   });
 
   expect(
@@ -5041,16 +5227,16 @@ test('manual tab switch is preserved after platform bootstrap requests finish',
   await clickFirstButtonByName(user, '创作');
 
   expect(
-    await screen.findByText('10分钟创作一个精品互动玩法'),
+    await screen.findByRole('tablist', { name: '选择模板' }),
   ).toBeTruthy();
 
   resolveGalleryRequest([]);
 
   await waitFor(() => {
     expect(
-      within(getPlatformTabPanel('create')).getByText(
-        '10分钟创作一个精品互动玩法',
-      ),
+      within(getPlatformTabPanel('create')).getByRole('tablist', {
+        name: '选择模板',
+      }),
     ).toBeTruthy();
   });
 
diff --git a/src/components/rpg-entry/RpgEntryHomeView.recharge.test.tsx b/src/components/rpg-entry/RpgEntryHomeView.recharge.test.tsx
index 1ef02f8b..517d4f77 100644
--- a/src/components/rpg-entry/RpgEntryHomeView.recharge.test.tsx
+++ b/src/components/rpg-entry/RpgEntryHomeView.recharge.test.tsx
@@ -508,6 +508,11 @@ function renderLoggedOutHomeView(
     | 'latestEntries'
     | 'onOpenGalleryDetail'
     | 'onSearchPublicCode'
+    | 'recommendRuntimeContent'
+    | 'activeRecommendEntryKey'
+    | 'isStartingRecommendEntry'
+    | 'recommendRuntimeError'
+    | 'onSelectRecommendEntry'
   >
 > = {},
 ) {
@@ -553,6 +558,15 @@ function renderLoggedOutHomeView(
       onOpenCreateWorld={vi.fn()}
       onOpenCreateTypePicker={vi.fn()}
onOpenGalleryDetail={overrides.onOpenGalleryDetail ?? vi.fn()} + recommendRuntimeContent={ + overrides.recommendRuntimeContent ?? ( +
运行内容
+ ) + } + activeRecommendEntryKey={overrides.activeRecommendEntryKey} + isStartingRecommendEntry={overrides.isStartingRecommendEntry} + recommendRuntimeError={overrides.recommendRuntimeError} + onSelectRecommendEntry={overrides.onSelectRecommendEntry} onOpenLibraryDetail={vi.fn()} onSearchPublicCode={overrides.onSearchPublicCode ?? vi.fn()} /> @@ -562,7 +576,13 @@ function renderLoggedOutHomeView( function renderStatefulLoggedOutHomeView( overrides: Partial< - Pick + Pick< + RpgEntryHomeViewProps, + | 'featuredEntries' + | 'latestEntries' + | 'onOpenGalleryDetail' + | 'onSearchPublicCode' + > > = {}, ) { function StatefulLoggedOutHomeView() { @@ -610,9 +630,10 @@ function renderStatefulLoggedOutHomeView( onResumeSave={vi.fn()} onOpenCreateWorld={vi.fn()} onOpenCreateTypePicker={vi.fn()} - onOpenGalleryDetail={vi.fn()} + onOpenGalleryDetail={overrides.onOpenGalleryDetail ?? vi.fn()} + recommendRuntimeContent={
} onOpenLibraryDetail={vi.fn()} - onSearchPublicCode={vi.fn()} + onSearchPublicCode={overrides.onSearchPublicCode ?? vi.fn()} /> ); @@ -956,57 +977,12 @@ test('logged out bottom nav keeps creation centered with recommend icon', () => expect(buttons[2]?.querySelector('.lucide-compass')).toBeTruthy(); }); -test('mobile home search submits public work code', async () => { +test('mobile discover search submits public work code', async () => { const user = userEvent.setup(); const onSearchPublicCode = vi.fn(); - render( - undefined), - musicVolume: 0.42, - setMusicVolume: vi.fn(), - platformTheme: 'light', - setPlatformTheme: vi.fn(), - isHydratingSettings: false, - isPersistingSettings: false, - settingsError: null, - }} - > - - , - ); + renderStatefulLoggedOutHomeView({ onSearchPublicCode }); + await user.click(screen.getByRole('button', { name: '发现' })); const searchInput = screen.getByPlaceholderText( '搜索作品号、名称、作者、描述', @@ -1016,7 +992,7 @@ test('mobile home search submits public work code', async () => { expect(onSearchPublicCode).toHaveBeenCalledWith('PZ-PROFILE1'); }); -test('home search fuzzy matches public work id, name, author and description', async () => { +test('discover search fuzzy matches public work id, name, author and description', async () => { const user = userEvent.setup(); const onOpenGalleryDetail = vi.fn(); const onSearchPublicCode = vi.fn(); @@ -1041,46 +1017,52 @@ test('home search fuzzy matches public work id, name, author and description', a }, ] satisfies PlatformPublicGalleryCard[]; - renderLoggedOutHomeView(vi.fn(), { + renderStatefulLoggedOutHomeView({ latestEntries: entries, onOpenGalleryDetail, onSearchPublicCode, }); + await user.click(screen.getByRole('button', { name: '发现' })); + const discoverPanel = document.getElementById('platform-tab-panel-category'); + if (!discoverPanel) { + throw new Error('缺少发现面板'); + } const searchInput = screen.getByPlaceholderText('搜索作品号、名称、作者、描述'); await user.type(searchInput, 'MOON01{enter}'); - 
expect(await screen.findByText('搜索结果')).toBeTruthy();
-  expect(screen.getByText('月井机关')).toBeTruthy();
-  expect(screen.queryByText('火桥谜图')).toBeNull();
+  expect(await within(discoverPanel).findByText('搜索结果')).toBeTruthy();
+  expect(within(discoverPanel).getByText('月井机关')).toBeTruthy();
+  expect(within(discoverPanel).queryByText('火桥谜图')).toBeNull();
   expect(onSearchPublicCode).not.toHaveBeenCalled();
 
   await user.clear(searchInput);
   await user.type(searchInput, '火桥{enter}');
 
-  expect(await screen.findByText('火桥谜图')).toBeTruthy();
-  expect(screen.queryByText('月井机关')).toBeNull();
+  expect(await within(discoverPanel).findByText('火桥谜图')).toBeTruthy();
+  expect(within(discoverPanel).queryByText('月井机关')).toBeNull();
 
   await user.clear(searchInput);
   await user.type(searchInput, '月井守望{enter}');
 
-  expect(await screen.findByText('月井机关')).toBeTruthy();
-  expect(screen.queryByText('火桥谜图')).toBeNull();
+  expect(await within(discoverPanel).findByText('月井机关')).toBeTruthy();
+  expect(within(discoverPanel).queryByText('火桥谜图')).toBeNull();
 
   await user.clear(searchInput);
   await user.type(searchInput, '熔岩断桥{enter}');
 
-  expect(await screen.findByText('火桥谜图')).toBeTruthy();
-  expect(screen.queryByText('月井机关')).toBeNull();
+  expect(await within(discoverPanel).findByText('火桥谜图')).toBeTruthy();
+  expect(within(discoverPanel).queryByText('月井机关')).toBeNull();
 
   await user.click(screen.getByRole('button', { name: /火桥谜图/u }));
 
   expect(onOpenGalleryDetail).toHaveBeenCalledWith(entries[1]);
 });
 
-test('home search keeps public code fallback when local works do not match', async () => {
+test('discover search keeps public code fallback when local works do not match', async () => {
   const user = userEvent.setup();
   const onSearchPublicCode = vi.fn();
 
-  renderLoggedOutHomeView(vi.fn(), {
+  renderStatefulLoggedOutHomeView({
     latestEntries: [puzzlePublicEntry],
     onSearchPublicCode,
   });
+  await user.click(screen.getByRole('button', { name: '发现' }));
 
   const searchInput = screen.getByPlaceholderText('搜索作品号、名称、作者、描述');
await user.type(searchInput, 'CW-REMOTE-ONLY{enter}');
@@ -1093,10 +1075,11 @@ test('public gallery cards hide work code until detail is opened', async () => {
   const user = userEvent.setup();
   const onOpenGalleryDetail = vi.fn();
 
-  renderLoggedOutHomeView(vi.fn(), {
+  renderStatefulLoggedOutHomeView({
     latestEntries: [puzzlePublicEntry],
     onOpenGalleryDetail,
   });
+  await user.click(screen.getByRole('button', { name: '发现' }));
 
   expect(screen.queryByText('PZ-EPUBLIC1')).toBeNull();
   expect(
@@ -1108,47 +1091,54 @@
   expect(onOpenGalleryDetail).toHaveBeenCalledWith(puzzlePublicEntry);
 });
 
-test('mobile public work cards render cover, author, kind and cover stats', () => {
-  const { container } = renderLoggedOutHomeView(vi.fn(), {
+test('mobile recommend page renders runtime viewport and bottom switcher', () => {
+  const onSelectRecommendEntry = vi.fn();
+
+  renderLoggedOutHomeView(vi.fn(), {
     latestEntries: [puzzlePublicEntry],
+    activeRecommendEntryKey: 'puzzle:user-2:puzzle-profile-public-1',
+    onSelectRecommendEntry,
   });
 
-  const card = screen.getByRole('button', {
-    name: /奇幻拼图,拼图,20游玩,5改造,12点赞/u,
-  });
+  expect(screen.getByTestId('recommend-runtime')).toBeTruthy();
+  expect(screen.queryByText('一张用于公开分享的拼图作品。')).toBeNull();
   expect(
-    card.querySelector('.platform-public-work-card__cover.aspect-video'),
-  ).toBeTruthy();
-  expect(
-    card.querySelector('.platform-public-work-card__cover-stats'),
-  ).toBeTruthy();
-  expect(
-    card.querySelectorAll('.platform-public-work-card__cover-stat'),
-  ).toHaveLength(3);
-  expect(
-    card.querySelector('.platform-public-work-card__kind')?.textContent,
-  ).toBe('拼图');
-  expect(
-    card.querySelector('.platform-public-work-card__author-avatar')
-      ?.textContent,
-  ).toBe('拼');
-  expect(screen.getByText('奇幻拼图')).toBeTruthy();
+    document.querySelector('.platform-public-work-card__cover'),
+  ).toBeNull();
   expect(screen.getByText('拼图玩家')).toBeTruthy();
-
expect(screen.getByText('一张用于公开分享的拼图作品。')).toBeTruthy();
-  expect(screen.getByText('奇幻')).toBeTruthy();
-  expect(screen.getByText('20')).toBeTruthy();
-  expect(screen.getByText('5')).toBeTruthy();
-  expect(screen.getByText('12')).toBeTruthy();
-  expect(card.querySelector('.platform-pill--warm')?.textContent).not.toBe(
-    '推荐',
-  );
-  expect(
-    container.querySelector('.platform-mobile-home-channel--active')
-      ?.textContent,
-  ).toBe('推荐');
+  expect(screen.getAllByText('奇幻拼图').length).toBeGreaterThan(0);
+  expect(screen.getAllByText('20').length).toBeGreaterThan(0);
+  expect(screen.getAllByText('12').length).toBeGreaterThan(0);
+
+  const switchButton = screen.getByRole('button', {
+    name: '切换到 奇幻拼图',
+  });
+  expect(switchButton.getAttribute('aria-pressed')).toBe('true');
 });
 
-test('public work cards load real author avatar from public user summary', async () => {
+test('mobile recommend switcher selects a different public work', async () => {
+  const user = userEvent.setup();
+  const onSelectRecommendEntry = vi.fn();
+  const secondEntry = {
+    ...puzzlePublicEntry,
+    workId: 'puzzle-work-second',
+    profileId: 'puzzle-profile-second',
+    publicWorkCode: 'PZ-SECOND',
+    worldName: '第二拼图',
+  } satisfies PlatformPublicGalleryCard;
+
+  renderLoggedOutHomeView(vi.fn(), {
+    latestEntries: [puzzlePublicEntry, secondEntry],
+    activeRecommendEntryKey: 'puzzle:user-2:puzzle-profile-public-1',
+    onSelectRecommendEntry,
+  });
+
+  await user.click(screen.getByRole('button', { name: '切换到 第二拼图' }));
+
+  expect(onSelectRecommendEntry).toHaveBeenCalledWith(secondEntry);
+});
+
+test('mobile recommend meta loads real author avatar from public user summary', async () => {
   mockGetPublicAuthUserById.mockResolvedValueOnce({
     id: 'user-2',
     publicUserCode: 'SY-00000002',
@@ -1159,16 +1149,13 @@ test('public work cards load real author avatar from public user summary', async
   renderLoggedOutHomeView(vi.fn(), {
     featuredEntries: [puzzlePublicEntry],
     latestEntries: [puzzlePublicEntry],
-  });
-
-
const card = screen.getByRole('button', { - name: /奇幻拼图,拼图,20游玩,5改造,12点赞/u, + activeRecommendEntryKey: 'puzzle:user-2:puzzle-profile-public-1', }); await waitFor(() => { expect( - card - .querySelector('.platform-public-work-card__author-avatar-image') + document + .querySelector('.platform-recommend-work-meta__avatar img') ?.getAttribute('src'), ).toBe('data:image/png;base64,AUTHOR'); }); @@ -1177,7 +1164,7 @@ test('public work cards load real author avatar from public user summary', async expect(mockGetPublicAuthUserByCode).not.toHaveBeenCalled(); }); -test('mobile home feed only rotates the card closest to screen center', () => { +test('mobile discover recommend feed only rotates the card closest to screen center', async () => { vi.useFakeTimers(); Object.defineProperty(window, 'requestAnimationFrame', { configurable: true, @@ -1199,9 +1186,12 @@ test('mobile home feed only rotates the card closest to screen center', () => { ); const cardRects = new Map(); - renderLoggedOutHomeView(vi.fn(), { + renderStatefulLoggedOutHomeView({ latestEntries: [firstEntry, secondEntry], }); + act(() => { + screen.getByRole('button', { name: '发现' }).click(); + }); const tabPanel = document.querySelector('.platform-tab-panel--active'); const firstCard = screen.getByRole('button', { name: /中心拼图一/u }); @@ -1340,15 +1330,22 @@ test('mobile today channel only shows newly published works from today', async ( updatedAt: todayPublishedAt, } satisfies PlatformPublicGalleryCard; - renderLoggedOutHomeView(vi.fn(), { + renderStatefulLoggedOutHomeView({ latestEntries: [yesterdayEntry, updatedTodayEntry, todayEntry], }); - await user.click(screen.getByRole('button', { name: '今日游戏' })); + await user.click(screen.getByRole('button', { name: '发现' })); + await user.click(screen.getByRole('button', { name: '今日' })); + const discoverPanel = document.getElementById('platform-tab-panel-category'); + if (!discoverPanel) { + throw new Error('缺少发现面板'); + } - expect(screen.getByRole('button', { name: 
/今日新游/u })).toBeTruthy();
-  expect(screen.queryByText('昨日旧作')).toBeNull();
-  expect(screen.queryByText('今日更新旧作')).toBeNull();
+  expect(
+    within(discoverPanel).getByRole('button', { name: /今日新游/u }),
+  ).toBeTruthy();
+  expect(within(discoverPanel).queryByText('昨日旧作')).toBeNull();
+  expect(within(discoverPanel).queryByText('今日更新旧作')).toBeNull();
 });
 
 test('desktop home syncs mobile home modules without square or latest labels', () => {
@@ -1369,7 +1366,7 @@ test('desktop home syncs mobile home modules without square or latest labels', (
   });
 
   expect(screen.getByText('今日游戏')).toBeTruthy();
-  expect(screen.getByText('推荐')).toBeTruthy();
+  expect(screen.getAllByText('推荐').length).toBeGreaterThan(0);
   expect(screen.getByText('作品分类')).toBeTruthy();
   expect(screen.getAllByText('桌面今日新游').length).toBeGreaterThan(0);
   expect(screen.queryByText('趋势关注')).toBeNull();
@@ -1383,16 +1380,17 @@ test('mobile home moves category shelf into game category channel', async () => {
   const user = userEvent.setup();
 
-  const { container } = renderLoggedOutHomeView(vi.fn(), {
+  const { container } = renderStatefulLoggedOutHomeView({
     latestEntries: [puzzlePublicEntry],
   });
 
   expect(screen.queryByRole('button', { name: 'PC游戏' })).toBeNull();
   expect(screen.queryByRole('button', { name: '即点即玩' })).toBeNull();
 
-  await user.click(screen.getByRole('button', { name: '游戏分类' }));
+  await user.click(screen.getByRole('button', { name: '发现' }));
+  await user.click(screen.getByRole('button', { name: '分类' }));
 
-  expect(screen.getAllByText('游戏分类').length).toBeGreaterThan(0);
+  expect(screen.getAllByText('分类').length).toBeGreaterThan(0);
   expect(screen.getByRole('button', { name: /筛选/u })).toBeTruthy();
   expect(screen.getByRole('button', { name: '奇幻' })).toBeTruthy();
   expect(screen.getByRole('button', { name: /奇幻拼图,试玩/u })).toBeTruthy();
@@ -1407,11 +1405,12 @@
test('mobile game category list orders works by composite public metric', async () => {
   const user = userEvent.setup();
 
-  renderLoggedOutHomeView(vi.fn(), {
+  renderStatefulLoggedOutHomeView({
     latestEntries: [puzzlePublicEntry, hotRankEntry],
   });
 
-  await user.click(screen.getByRole('button', { name: '游戏分类' }));
+  await user.click(screen.getByRole('button', { name: '发现' }));
+  await user.click(screen.getByRole('button', { name: '分类' }));
   await user.click(screen.getByRole('button', { name: '奇幻' }));
 
   const gameItems = Array.from(
@@ -1427,8 +1426,7 @@ test('bottom category tab becomes ranking and switches ranking metrics', async (
     latestEntries: [remixRankEntry, hotRankEntry, newRankEntry],
   });
 
-  expect(screen.queryByRole('button', { name: '分类' })).toBeNull();
-
+  await user.click(screen.getByRole('button', { name: '发现' }));
   await user.click(screen.getByRole('button', { name: '排行' }));
 
   expect(await screen.findByRole('tab', { name: '热门榜' })).toBeTruthy();
@@ -1462,6 +1460,7 @@ test('ranking rows limit displayed work name and show two short tags on the thir
     latestEntries: [longTextRankEntry],
   });
 
+  await user.click(screen.getByRole('button', { name: '发现' }));
   await user.click(screen.getByRole('button', { name: '排行' }));
 
   const rankingPanel = document.getElementById('platform-tab-panel-category');
diff --git a/src/components/rpg-entry/RpgEntryHomeView.tsx b/src/components/rpg-entry/RpgEntryHomeView.tsx
index 3364449d..1749028e 100644
--- a/src/components/rpg-entry/RpgEntryHomeView.tsx
+++ b/src/components/rpg-entry/RpgEntryHomeView.tsx
@@ -118,6 +118,11 @@ export interface RpgEntryHomeViewProps {
   onOpenCreateWorld: () => void;
   onOpenCreateTypePicker: () => void;
   onOpenGalleryDetail: (entry: PlatformPublicGalleryCard) => void;
+  recommendRuntimeContent?: ReactNode;
+  activeRecommendEntryKey?: string | null;
+  isStartingRecommendEntry?: boolean;
+  recommendRuntimeError?: string | null;
+  onSelectRecommendEntry?: (entry: PlatformPublicGalleryCard) => void;
onOpenLibraryDetail: ( entry: CustomWorldLibraryEntry, ) => void; @@ -656,6 +661,131 @@ function CreationLibraryCard({ ); } +function RecommendRuntimeMeta({ + entry, + authorAvatarUrl, + onOpenDetail, +}: { + entry: PlatformPublicGalleryCard; + authorAvatarUrl?: string | null; + onOpenDetail: () => void; +}) { + const playCount = getPlatformWorldPlayCount(entry); + const remixCount = getPlatformWorldRemixCount(entry); + const likeCount = getPlatformWorldLikeCount(entry); + const authorName = entry.authorDisplayName.trim() || '玩家'; + const authorAvatarLabel = getPublicAuthorAvatarLabel(authorName); + const normalizedAuthorAvatarUrl = authorAvatarUrl?.trim() ?? ''; + const displayName = formatPlatformWorkDisplayName(entry.worldName); + const statItems = [ + { label: '游玩', value: playCount, icon: Gamepad2 }, + { label: '点赞', value: likeCount, icon: Heart }, + { label: '改造', value: remixCount, icon: MessageCircle }, + ]; + + return ( +
+
+ {statItems.map(({ label, value, icon: Icon }) => ( + + + ))} +
+ +
+ + + +
+
+ ); +} + +function RecommendWorkSwitchItem({ + entry, + active, + onSelect, +}: { + entry: PlatformPublicGalleryCard; + active: boolean; + onSelect: () => void; +}) { + const displayName = formatPlatformWorkDisplayName(entry.worldName); + const typeLabel = describePublicGalleryCardKind(entry); + const playCount = getPlatformWorldPlayCount(entry); + const likeCount = getPlatformWorldLikeCount(entry); + + return ( + + ); +} + function SaveArchiveCard({ entry, onClick, @@ -2727,6 +2857,11 @@ export function RpgEntryHomeView({ onResumeSave, onOpenCreateTypePicker, onOpenGalleryDetail, + recommendRuntimeContent, + activeRecommendEntryKey = null, + isStartingRecommendEntry = false, + recommendRuntimeError = null, + onSelectRecommendEntry, onOpenLibraryDetail, onDeleteLibraryEntry, deletingLibraryEntryId = null, @@ -2796,7 +2931,6 @@ export function RpgEntryHomeView({ ); const [discoverChannel, setDiscoverChannel] = useState('recommend'); - const mobileRecommendFeedRef = useRef(null); const mobileDiscoverFeedRef = useRef(null); const [mobileCenteredCardKey, setMobileCenteredCardKey] = useState< string | null @@ -3494,19 +3628,15 @@ export function RpgEntryHomeView({ }, [discoverChannel, latestEntries, recommendedFeedEntries]); const mobileFeedCarouselEnabled = !isDesktopLayout && - ((activeTab === 'home' && recommendedFeedEntries.length > 0) || - (activeTab === 'category' && - (discoverChannel === 'recommend' || discoverChannel === 'today'))); + activeTab === 'category' && + (discoverChannel === 'recommend' || discoverChannel === 'today'); useEffect(() => { if (!mobileFeedCarouselEnabled) { setMobileCenteredCardKey(null); return undefined; } - const feedElement = - activeTab === 'home' - ? 
mobileRecommendFeedRef.current - : mobileDiscoverFeedRef.current; + const feedElement = mobileDiscoverFeedRef.current; const scrollElement = feedElement?.closest('.platform-tab-panel'); if (!feedElement || !scrollElement) { setMobileCenteredCardKey(null); @@ -3577,13 +3707,7 @@ export function RpgEntryHomeView({ scrollElement.removeEventListener('scroll', scheduleUpdate); window.removeEventListener('resize', scheduleUpdate); }; - }, [ - discoverChannel, - discoverFeedEntries, - activeTab, - mobileFeedCarouselEnabled, - recommendedFeedEntries, - ]); + }, [discoverChannel, discoverFeedEntries, activeTab, mobileFeedCarouselEnabled]); const activeRankingConfig = PLATFORM_RANKING_TABS.find( (tab) => tab.id === activeRankingTab, ) as (typeof PLATFORM_RANKING_TABS)[number]; @@ -3592,6 +3716,12 @@ export function RpgEntryHomeView({ buildPlatformRankingEntries(publicEntries, activeRankingTab).slice(0, 30), [activeRankingTab, publicEntries], ); + const activeRecommendEntry = + recommendedFeedEntries.find( + (entry) => buildPublicGalleryCardKey(entry) === activeRecommendEntryKey, + ) ?? + recommendedFeedEntries[0] ?? + null; const leadPublicEntry = featuredShelf[0] ?? latestEntries[0] ?? null; const openLeadPublicEntry = () => { if (leadPublicEntry) { @@ -3663,36 +3793,75 @@ export function RpgEntryHomeView({
) : null} -
+
{isLoadingPlatform ? ( - - ) : recommendedFeedEntries.length > 0 ? ( -
- {recommendedFeedEntries.map((entry) => { - const cardKey = buildPublicGalleryCardKey(entry); - - return ( - onOpenGalleryDetail(entry)} - className="w-full" - authorAvatarUrl={getPublicEntryAuthorAvatarUrl(entry)} - feedCardKey={cardKey} - enableCoverCarousel={mobileFeedCarouselEnabled} - isCoverCarouselActive={mobileCenteredCardKey === cardKey} - variant="immersive" - /> - ); - })} +
+ 正在读取公开作品...
+ ) : recommendRuntimeError ? ( + + ) : isStartingRecommendEntry || !recommendRuntimeContent ? ( +
加载中...
) : ( - +
+ {recommendRuntimeContent} +
)}
+ + {activeRecommendEntry ? ( + onOpenGalleryDetail(activeRecommendEntry)} + /> + ) : null} + + {recommendedFeedEntries.length > 0 ? ( +
+ {recommendedFeedEntries.map((entry) => { + const cardKey = buildPublicGalleryCardKey(entry); + const active = + activeRecommendEntryKey === cardKey || + Boolean( + !activeRecommendEntryKey && + activeRecommendEntry && + buildPublicGalleryCardKey(activeRecommendEntry) === cardKey, + ); + + return ( + { + if (onSelectRecommendEntry) { + onSelectRecommendEntry(entry); + return; + } + + onOpenGalleryDetail(entry); + }} + /> + ); + })} +
+ ) : !isLoadingPlatform ? ( + + ) : null}
 );
diff --git a/src/components/square-hole-runtime/SquareHoleRuntimeShell.tsx b/src/components/square-hole-runtime/SquareHoleRuntimeShell.tsx
index f5d550d7..53043fe7 100644
--- a/src/components/square-hole-runtime/SquareHoleRuntimeShell.tsx
+++ b/src/components/square-hole-runtime/SquareHoleRuntimeShell.tsx
@@ -29,6 +29,7 @@ type SquareHoleRuntimeShellProps = {
   run: SquareHoleRunSnapshot | null;
   isBusy?: boolean;
   error?: string | null;
+  embedded?: boolean;
   onBack: () => void;
   onRestart: () => void;
   onDropShape: (
@@ -148,6 +149,7 @@ export function SquareHoleRuntimeShell({
   run,
   isBusy = false,
   error = null,
+  embedded = false,
   onBack,
   onRestart,
   onDropShape,
@@ -327,7 +329,9 @@ export function SquareHoleRuntimeShell({
   if (!run) {
     return (
-
+
{isBusy ? '载入中' : (error ?? '暂无运行态')}
); @@ -336,7 +340,9 @@ export function SquareHoleRuntimeShell({ const feedback = run.lastFeedback; return ( -
+
{run.backgroundImageSrc ? ( ) : null}
({ + createVisualNovelBackgroundMusicTask: vi.fn(), + createVisualNovelSoundEffectTask: vi.fn(), listVisualNovelHistoryAssets: vi.fn().mockResolvedValue([]), + publishVisualNovelBackgroundMusicAsset: vi.fn(), + publishVisualNovelSoundEffectAsset: vi.fn(), uploadVisualNovelAsset: vi.fn(), })); @@ -91,8 +95,10 @@ test('visual novel result uploads scene and character assets into platform refer uploadMock.mockResolvedValue({ assetObjectId: 'asset-scene-1', assetKind: 'scene_image', - objectKey: 'generated-custom-world-scenes/vn-profile/scene-1/background.png', - imageSrc: '/generated-custom-world-scenes/vn-profile/scene-1/background.png', + objectKey: + 'generated-custom-world-scenes/vn-profile/scene-1/background.png', + imageSrc: + '/generated-custom-world-scenes/vn-profile/scene-1/background.png', }); render( @@ -112,9 +118,9 @@ test('visual novel result uploads scene and character assets into platform refer }); await user.click(backgroundButtons[0]!); - const fileInput = within(screen.getByRole('dialog', { name: '背景图' })).getByLabelText( - '上传背景图文件', - ) as HTMLInputElement; + const fileInput = within( + screen.getByRole('dialog', { name: '背景图' }), + ).getByLabelText('上传背景图文件') as HTMLInputElement; await user.upload( fileInput, new File(['image-bytes'], 'scene.png', { type: 'image/png' }), @@ -124,7 +130,7 @@ test('visual novel result uploads scene and character assets into platform refer await user.click(screen.getAllByRole('button', { name: '保存草稿' })[1]!); expect(onSaveDraft).toHaveBeenCalled(); - expect(onSaveDraft.mock.calls[0]?.[0].scenes[0]?.backgroundImageSrc).toContain( - '/generated-custom-world-scenes/', - ); + expect( + onSaveDraft.mock.calls[0]?.[0].scenes[0]?.backgroundImageSrc, + ).toContain('/generated-custom-world-scenes/'); }); diff --git a/src/components/visual-novel-result/VisualNovelResultView.tsx b/src/components/visual-novel-result/VisualNovelResultView.tsx index 585f3b46..893188d2 100644 --- 
a/src/components/visual-novel-result/VisualNovelResultView.tsx +++ b/src/components/visual-novel-result/VisualNovelResultView.tsx @@ -9,7 +9,9 @@ import { PenLine, Play, Settings, + Sparkles, Upload, + Waves, X, type LucideIcon, } from 'lucide-react'; @@ -26,7 +28,11 @@ import type { VisualNovelValidationIssue, } from '../../../packages/shared/src/contracts/visualNovel'; import { + createVisualNovelBackgroundMusicTask, + createVisualNovelSoundEffectTask, listVisualNovelHistoryAssets, + publishVisualNovelBackgroundMusicAsset, + publishVisualNovelSoundEffectAsset, uploadVisualNovelAsset, type VisualNovelAssetReference, type VisualNovelHistoryAssetKind, @@ -98,6 +104,17 @@ type VisualNovelAssetPickerConfig = { previewTone: 'image' | 'audio'; }; +type VisualNovelAudioGeneratorKind = 'background_music' | 'sound_effect'; + +type VisualNovelAudioGeneratorConfig = { + kind: VisualNovelAudioGeneratorKind; + scene: VisualNovelSceneDraft; + profileId?: string | null; +}; + +const AUDIO_POLL_INTERVAL_MS = 3600; +const AUDIO_POLL_MAX_ATTEMPTS = 36; + const RESULT_TABS: Array<{ id: VisualNovelResultTab; label: string }> = [ { id: 'profile', label: '作品' }, { id: 'world', label: '世界' }, @@ -537,7 +554,9 @@ function VisualNovelAssetPickerDialog({
) : null} - {!isLoadingHistory && config.historyKind && historyAssets.length <= 0 ? ( + {!isLoadingHistory && + config.historyKind && + historyAssets.length <= 0 ? (
暂无历史素材
@@ -704,6 +723,299 @@
   );
 }
+async function waitForVisualNovelGeneratedAudioAsset(
+  config: VisualNovelAudioGeneratorConfig,
+  taskId: string,
+) {
+  for (let attempt = 0; attempt < AUDIO_POLL_MAX_ATTEMPTS; attempt += 1) {
+    if (attempt > 0) {
+      await new Promise((resolve) => {
+        window.setTimeout(resolve, AUDIO_POLL_INTERVAL_MS);
+      });
+    }
+
+    const payload = {
+      sceneId: config.scene.sceneId,
+      profileId: config.profileId ?? null,
+    };
+    const asset =
+      config.kind === 'background_music'
+        ? await publishVisualNovelBackgroundMusicAsset(taskId, payload)
+        : await publishVisualNovelSoundEffectAsset(taskId, payload);
+
+    if (asset.audioSrc?.trim()) {
+      return asset;
+    }
+  }
+
+  throw new Error('音频生成仍在处理中,请稍后重试。');
+}
+
+function buildDefaultAudioPrompt(
+  kind: VisualNovelAudioGeneratorKind,
+  scene: VisualNovelSceneDraft,
+) {
+  const name = scene.name.trim() || '当前场景';
+  const description = scene.description.trim();
+  if (kind === 'background_music') {
+    return [name, description, '适合作为视觉小说循环播放的无歌词背景音乐']
+      .filter(Boolean)
+      .join(',');
+  }
+  return [name, description, '短促、清晰、适合场景切换时播放的环境音效']
+    .filter(Boolean)
+    .join(',');
+}
+
+function VisualNovelAudioGeneratorDialog({
+  config,
+  disabled,
+  onClose,
+  onGenerated,
+}: {
+  config: VisualNovelAudioGeneratorConfig;
+  disabled: boolean;
+  onClose: () => void;
+  onGenerated: (asset: VisualNovelAssetReference) => void;
+}) {
+  const authUi = useAuthUi();
+  const platformTheme = authUi?.platformTheme ?? 'light';
+  const isBackgroundMusic = config.kind === 'background_music';
+  const [prompt, setPrompt] = useState(() =>
+    buildDefaultAudioPrompt(config.kind, config.scene),
+  );
+  const [title, setTitle] = useState(() =>
+    (config.scene.name.trim() || '视觉小说场景音乐').slice(0, 40),
+  );
+  const [tags, setTags] = useState('cinematic, ambient, emotional');
+  const [duration, setDuration] = useState(5);
+  const [isGenerating, setIsGenerating] = useState(false);
+  const [error, setError] = useState(null);
+
+  useEffect(() => {
+    setPrompt(buildDefaultAudioPrompt(config.kind, config.scene));
+    setTitle((config.scene.name.trim() || '视觉小说场景音乐').slice(0, 40));
+    setTags('cinematic, ambient, emotional');
+    setDuration(5);
+    setError(null);
+  }, [config]);
+
+  const handleGenerate = async () => {
+    if (!prompt.trim()) {
+      setError('提示词不能为空。');
+      return;
+    }
+    if (isBackgroundMusic && !title.trim()) {
+      setError('标题不能为空。');
+      return;
+    }
+
+    setIsGenerating(true);
+    setError(null);
+    try {
+      const task = isBackgroundMusic
+        ? await createVisualNovelBackgroundMusicTask({
+            prompt,
+            title,
+            tags: tags.trim() || null,
+            model: 'chirp-v4',
+          })
+        : await createVisualNovelSoundEffectTask({
+            prompt,
+            duration,
+          });
+      const asset = await waitForVisualNovelGeneratedAudioAsset(
+        config,
+        task.taskId,
+      );
+      onGenerated({
+        assetObjectId: asset.assetObjectId ?? task.taskId,
+        assetKind:
+          asset.assetKind ??
+          (isBackgroundMusic
+            ? 'visual_novel_music'
+            : 'visual_novel_ambient_sound'),
+        objectKey: '',
+        imageSrc: asset.audioSrc ?? '',
+        profileId: config.profileId ?? null,
+        entityId: config.scene.sceneId,
+      });
+      onClose();
+    } catch (generateError) {
+      setError(
+        generateError instanceof Error
+          ? generateError.message
+          : '音频生成失败。',
+      );
+    } finally {
+      setIsGenerating(false);
+    }
+  };
+
+  if (typeof document === 'undefined') {
+    return null;
+  }
+
+  return createPortal(
{ + if (event.target === event.currentTarget && !isGenerating) { + onClose(); + } + }} + > +
event.stopPropagation()} + > +
+

+ {isBackgroundMusic ? '生成音乐' : '生成音效'} +

+ +
+
+ {isBackgroundMusic ? ( + <> + + + + ) : ( + + )} +