
Retrieval Flow

getRelevantDocuments() Step by Step

1. Query Embedding

The function first converts the user query into an embedding vector through the configured embedding provider (Qwen or Gemini).

// worker.ts, line 49
const [qv] = await embedder.embed([message], Number(env.EMBED_DIM));

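Log output (the query text "什么是 Kubernetes?" means "What is Kubernetes?"; the last two lines report the message used for vector retrieval and the query vector length):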

=== QWEN EMBEDDING REQUEST DEBUG ===
URL: https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings
Model: text-embedding-v4
Texts count: 1
Target dimension: 1024
Input texts preview: ["什么是 Kubernetes?"]
API Key prefix: sk-abcd1234...
Request body: {
  "model": "text-embedding-v4",
  "input": ["什么是 Kubernetes?"]
}
Response status: 200
=== END QWEN DEBUG ===
使用当前消息进行向量检索:什么是 Kubernetes?
查询向量长度:1024
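
The embedding step can be sketched as follows. This is a minimal illustration, not the project's actual embedder: the helper names `buildEmbeddingRequest`, `extractVectors`, and `embedTexts`, and the injected `PostJson` transport, are hypothetical. Only the endpoint URL, the request body shape, and the response item keys (`object`, `embedding`, `index`) come from the log above.

```typescript
// Fields of one response item, as listed under firstItemKeys in the debug log.
interface EmbeddingItem {
  object: string;
  embedding: number[];
  index: number;
}

// Hypothetical transport: posts a JSON body and resolves with the parsed response.
// Injecting it keeps the sketch independent of any particular fetch implementation.
type PostJson = (
  url: string,
  apiKey: string,
  body: string
) => Promise<{ data: EmbeddingItem[] }>;

const QWEN_EMBEDDINGS_URL =
  "https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings";

// Builds the request body shown in the log: a model name plus the input texts.
function buildEmbeddingRequest(model: string, texts: string[]): string {
  return JSON.stringify({ model, input: texts });
}

// Restores input order via the index field and checks every vector's length.
function extractVectors(
  data: EmbeddingItem[],
  expectedCount: number,
  dim: number
): number[][] {
  if (data.length !== expectedCount) {
    throw new Error(`Expected ${expectedCount} embeddings, got ${data.length}`);
  }
  const vectors = data
    .slice()
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
  for (const vector of vectors) {
    if (vector.length !== dim) {
      throw new Error(`Expected ${dim} dimensions, got ${vector.length}`);
    }
  }
  return vectors;
}

// Assumed composition of the two helpers; mirrors embedder.embed(texts, dim).
async function embedTexts(
  post: PostJson,
  apiKey: string,
  model: string,
  texts: string[],
  dim: number
): Promise<number[][]> {
  const body = buildEmbeddingRequest(model, texts);
  const response = await post(QWEN_EMBEDDINGS_URL, apiKey, body);
  return extractVectors(response.data, texts.length, dim);
}
```

Checking the vector length before querying surfaces a mismatch between `EMBED_DIM` and the model's output early, instead of as a failed index query later.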

2. Direct Vector Query (Simplified)

The system queries the entire index directly, with no language filter and no fallback mechanism. This approach suits small databases.

// Direct query, no language filter
const queryRes = await env.VECTORIZE.query(qvec, {
  topK: k,
  returnValues: false,
  returnMetadata: 'all'
});

Log output:

=== GET RELEVANT DOCUMENTS ===
Querying with topK: 8
Vector query successful: 8 matches found
Found 8 matches with metadata
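
The source does not show how `validMatches` is derived from the query result. The sketch below assumes it keeps only matches that carry non-empty metadata, which would agree with the "Found 8 matches with metadata" log line. `VectorIndexLike`, `filterValidMatches`, and `queryValidMatches` are hypothetical names; the interface is a structural subset of what `env.VECTORIZE.query` is called with above, so a stub can stand in for the real binding.

```typescript
// Structural subset of a vector match; only the fields this sketch needs.
interface VectorMatch {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

// Structural subset of the index binding, matching the call shown above.
interface VectorIndexLike {
  query(
    vector: number[],
    options: { topK: number; returnValues: boolean; returnMetadata: "all" }
  ): Promise<{ matches: VectorMatch[] }>;
}

// Assumption: a match is "valid" when it has at least one metadata field.
function filterValidMatches(matches: VectorMatch[]): VectorMatch[] {
  return matches.filter(
    (m) => m.metadata !== undefined && Object.keys(m.metadata).length > 0
  );
}

// Runs the direct query (no language filter) and keeps matches with metadata.
async function queryValidMatches(
  index: VectorIndexLike,
  qvec: number[],
  k: number
): Promise<VectorMatch[]> {
  const queryRes = await index.query(qvec, {
    topK: k,
    returnValues: false,
    returnMetadata: "all",
  });
  return filterValidMatches(queryRes.matches);
}
```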

3. Language Preference Handling and URL Conversion

The system no longer filters by language, but it still uses the language preference to convert URLs and translate titles:

// Step 3: handle the language preference (no filtering)
const metadataResults = validMatches.map(m => ({
  id: m.id,
  metadata: m.metadata
}));

// The language preference is used mainly for URL conversion and title translation
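
The implementations of `convertUrlForLanguage` and `translateTitleToEnglish` are not shown in this document. The sketch below is one possible shape, under two assumptions: English pages live under an `/en/` path prefix (the pattern visible in the URL conversion log line later on this page), and titles are translated through a lookup table. The `TITLE_TRANSLATIONS` table and its single entry are illustrative only.

```typescript
// Hypothetical lookup table; the real title mapping is not shown in the source.
const TITLE_TRANSLATIONS: Record<string, string> = {
  "Kubernetes 容器编排": "Kubernetes Container Orchestration",
};

// Returns the English title when one is known, otherwise the original title.
// Callers can detect "no English version" by comparing input and output.
function translateTitleToEnglish(title: string): string {
  return TITLE_TRANSLATIONS[title] ?? title;
}

// Inserts an /en prefix after the host for English; other languages pass through.
function convertUrlForLanguage(url: string, lang: string): string {
  if (lang !== "en") return url;
  const match = /^(https?:\/\/[^/]+)(\/.*)?$/.exec(url);
  if (!match) return url; // not an absolute http(s) URL, e.g. the "#" placeholder
  const origin = match[1];
  const path = match[2] ?? "/";
  // Leave URLs that already point at the English tree untouched.
  if (path === "/en" || path.startsWith("/en/")) return url;
  return `${origin}/en${path}`;
}
```

Returning the input unchanged for unknown titles and non-absolute URLs keeps both helpers total, so the caller needs no error handling around them.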

4. Assembling contexts and sources

The function builds the context string, then converts the source URLs and deduplicates the sources:

// Step 4: build the context
const contexts = metadataResults
  .map((v: any) => v.metadata.text)
  .filter((text: any) => text && text.length > 0)
  .join('\n---\n');

// Step 5: build the sources and apply language conversion
const sources = metadataResults
  .filter((v: any) => v.metadata && (v.metadata.url || v.metadata.title))
  .map((v: any) => {
    const originalUrl = v.metadata.url || '#';
    const originalTitle = v.metadata.title || v.metadata.source || v.id;

    if (currentLang === 'en' && !originalUrl.includes('/en/')) {
      // Convert Chinese content to its English version
      const translatedTitle = translateTitleToEnglish(originalTitle);
      if (translatedTitle !== originalTitle) {
        const finalUrl = convertUrlForLanguage(originalUrl, currentLang);
        return {
          id: v.id,
          url: finalUrl,
          title: translatedTitle,
          source: v.metadata.source || v.id,
          hasEnglishVersion: true
        };
      }
    }
    // ... other handling
  });

Log output:

English version found: "Kubernetes 容器编排" -> "Kubernetes Container Orchestration"
URL converted: https://jimmysong.io/blog/kubernetes-intro/ -> https://jimmysong.io/en/blog/kubernetes-intro/
Final contexts length: 3247
Final sources: [
  {
    "id": "a1b2c3d4e5f6-0",
    "url": "https://jimmysong.io/blog/kubernetes-intro/",
    "title": "Kubernetes 入门指南",
    "source": "zh/blog/kubernetes-intro/index.md"
  },
  {
    "id": "a1b2c3d4e5f6-1",
    "url": "https://jimmysong.io/blog/microservices/",
    "title": "微服务架构指南",
    "source": "zh/blog/microservices/index.md"
  }
]
Used fallback: false
=== END GET RELEVANT DOCUMENTS ===
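
The snippet above omits the deduplication step. A minimal sketch follows, assuming sources are deduplicated by URL so that several chunks of the same page (for example ids ending in `-0` and `-1`) collapse into one entry. The `Source` interface and `dedupeSourcesByUrl` are hypothetical names, and the actual deduplication key may differ.

```typescript
// Shape of one source entry, following the fields in the log output above.
interface Source {
  id: string;
  url: string;
  title: string;
  source: string;
}

// Keeps the first source seen for each URL and preserves the original order,
// so the highest-ranked chunk of a page represents that page.
function dedupeSourcesByUrl(sources: Source[]): Source[] {
  const seen = new Set<string>();
  const unique: Source[] = [];
  for (const s of sources) {
    if (seen.has(s.url)) continue;
    seen.add(s.url);
    unique.push(s);
  }
  return unique;
}
```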

Complete Log Output Example

The following is the complete log of one successful query:

=== QWEN EMBEDDING REQUEST DEBUG ===
URL: https://dashscope.aliyuncs.com/compatible-mode/v1/embeddings
Model: text-embedding-v4
Texts count: 1
Target dimension: 1024
Input texts preview: ["什么是 Kubernetes?"]
API Key prefix: sk-abcd1234...
Request body: {
  "model": "text-embedding-v4",
  "input": ["什么是 Kubernetes?"]
}
Response status: 200
Response data structure: {
  hasData: true,
  dataLength: 1,
  firstItemKeys: ["object", "embedding", "index"],
  firstEmbeddingLength: 1024
}
=== END QWEN DEBUG ===

Chat request: 什么是 Kubernetes?
History length: 0
Language: zh
使用当前消息进行向量检索:什么是 Kubernetes?
查询向量长度:1024

=== GET RELEVANT DOCUMENTS ===
Querying with topK: 8
Vector query successful: 8 matches found
Found 8 matches with metadata
Final contexts length: 3247
Final sources: [
  {
    "id": "a1b2c3d4e5f6-0",
    "url": "https://jimmysong.io/blog/kubernetes-intro/",
    "title": "Kubernetes 入门指南"
  },
  {
    "id": "b2c3d4e5f6a7-0",
    "url": "https://jimmysong.io/blog/container-orchestration/",
    "title": "容器编排技术对比"
  },
  {
    "id": "c3d4e5f6a7b8-0",
    "url": "https://jimmysong.io/blog/cloud-native/",
    "title": "云原生架构实践"
  }
]
No fallback used: Direct query
=== END GET RELEVANT DOCUMENTS ===

Final contexts: 3247 chars
Final sources: [object Array length: 3]
Prompt length: 3891

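After simplification, this retrieval flow provides efficient vector search for small databases with little English content. The system queries the entire index directly; removing the fallback mode improves response time and reduces code complexity.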