Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
41 changes: 41 additions & 0 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# Contributing to HugeGraph-AI

Thank you for your interest in contributing! Before submitting a pull request, please run the end-to-end integration tests locally to make sure nothing is broken.

## Prerequisites

1. **HugeGraph Server** running on `localhost:8080` (see [README.md](./README.md) for setup)
2. **Python 3.10+** with dependencies installed via `uv sync --extra llm`
3. **Proxy users**: If you have `http_proxy`/`https_proxy` set, make sure to exclude localhost:
```bash
export no_proxy=localhost,127.0.0.1
export NO_PROXY=localhost,127.0.0.1
```

## Running Integration Tests

```bash
# Activate the virtual environment
source .venv/bin/activate

# Run the end-to-end integration tests
cd hugegraph-llm
python -m pytest src/tests/integration/test_flows_integration.py -v
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Important:项目用 uv 管理依赖,CONTRIBUTING.md 主命令应统一为 uv run pytest

hugegraph-llm/AGENTS.md.github/workflows/hugegraph-llm.yml:82 都使用 uv run pytest。本文件 line 23 的 python -m pytest 在 contributor 没正确激活 .venv、或 .venv--extra llm 时会跑到系统 Python,得到莫名其妙的 import error。

CONTRIBUTING.md 是 contributor 的入口,命令必须与 CI 保持一致。

# Run the end-to-end integration tests
cd hugegraph-llm
uv run pytest src/tests/integration/test_flows_integration.py -v

另外,由于 conftest.py 默认 SKIP_EXTERNAL_SERVICES=true,contributor 直接跑命令时这套 test 可能被全部 skip。需要在文档里告知:

# 显式开启外部服务测试
SKIP_EXTERNAL_SERVICES=false uv run pytest src/tests/integration/test_flows_integration.py -v

```

All 6 tests must pass before you submit your code:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Minor:CONTRIBUTING.md 选址 / 范围 / 强制性

  1. 选址错位:标题 "Contributing to HugeGraph-AI" 但内容只覆盖 hugegraph-llm 子模块。HugeGraph-AI 是 monorepo(含 hugegraph-mlhugegraph-python-clientvermeer-python-client)。建议:

    • 方案 A:把当前内容移动到 hugegraph-llm/TESTING.md(与 hugegraph-llm/AGENTS.md 同级),把根 CONTRIBUTING.md 作为通用入口;
    • 方案 B:保留根级 CONTRIBUTING.md,但把当前内容降级为 "### Module: hugegraph-llm" 章节,并补充 ASF 通用部分(DCO/sign-off、commit message 规范、ICLA 链接、Issue/PR 模板)。
  2. 强制性过度:line 26 "All 6 tests must pass before you submit your code" 强制要求 contributor 本地启动 HugeGraph + 配置 LLM API key 才能贡献。对纯文档 / UI / 非 flow PR 是过度要求。建议改为:

    如果你的修改涉及 flow / 索引 / RAG 相关代码,请在本地通过下列 6 个测试。

  3. 额外缺失:当前文档没列出 LLM API key 与 embedding provider 等前置条件,但作者表态"测试需要 api-key"——这条信息应当出现在 Prerequisites。

  4. 数字硬编码:"All 6 tests must pass" / "6 passed"(line 39)会随测试增删而过时,建议改成"全部通过"。


| Test | What it verifies |
|------|-----------------|
| `test_build_knowledge_graph` | Vector index building, graph extraction, data import, and VID embedding update |
| `test_schema_generator` | Schema generation from text |
| `test_graph_extract_prompt` | Graph extraction prompt generation |
| `test_rag` | All RAG modes (raw, vector-only, graph-only, graph+vector) |
| `test_build_example_index` | Example vector index building for Text2Gremlin |
| `test_text_2_gremlin` | Natural language to Gremlin query translation |

## Submission Checklist

- [ ] Integration tests pass locally (`6 passed`)
- [ ] Code is formatted with `ruff format .`
- [ ] Linting passes with `ruff check .`
3 changes: 2 additions & 1 deletion hugegraph-llm/src/hugegraph_llm/utils/vector_index_utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
import gradio as gr

from hugegraph_llm.config import huge_settings, index_settings
from hugegraph_llm.flows import FlowName
from hugegraph_llm.flows.scheduler import SchedulerSingleton
from hugegraph_llm.indices.vector_index.base import VectorStoreBase
from hugegraph_llm.indices.vector_index.faiss_vector_store import FaissVectorIndex
Expand Down Expand Up @@ -83,7 +84,7 @@ def build_vector_index(input_file, input_text):
raise gr.Error("Please only choose one between file and text.")
texts = read_documents(input_file, input_text)
scheduler = SchedulerSingleton.get_instance()
return scheduler.schedule_flow("build_vector_index", texts)
return scheduler.schedule_flow(FlowName.BUILD_VECTOR_INDEX, texts)


def get_vector_index_class(vector_index_str: str) -> Type[VectorStoreBase]:
Expand Down
329 changes: 329 additions & 0 deletions hugegraph-llm/src/tests/integration/test_flows_integration.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,329 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

import pytest

from hugegraph_llm.config.prompt_config import PromptConfig
from hugegraph_llm.demo.rag_demo.rag_block import update_ui_configs
from hugegraph_llm.demo.rag_demo.text2gremlin_block import build_example_vector_index
from hugegraph_llm.demo.rag_demo.vector_graph_block import load_query_examples
from hugegraph_llm.flows import FlowName
from hugegraph_llm.flows.scheduler import SchedulerSingleton
from hugegraph_llm.utils.log import log


class TestFlowsIntegration:
"""Flow集成测试 - 验证各个Flow能正常执行不抛异常"""

@pytest.fixture(autouse=True)
def setup(self):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‼️ Critical:缺 should_skip_external() 守卫,本地默认命令必爆

hugegraph-llm/src/tests/conftest.py:45 强制设置 SKIP_EXTERNAL_SERVICES=true,所有现存 integration test(如 test_kg_construction.py:126test_rag_pipeline.py:113)都会调用 should_skip_external() 在此前提下自动 skip。

新文件没有这个守卫 → contributor 跑 pytest src/tests/pytest src/tests/integration/ 时,其它 integration test 全部跳过,只有这套新 test 不跳过且必爆(因为它会真实调 LLM/HugeGraph 但默认 SKIP=true 时基础设施不一定齐全),对 contributor 来说是最糟糕的体验。

@coderzc 已在 P2 提出过一次,作者回复"故意不进 CI"——但进不进 CI 与本地默认 pytest 行为是两回事。至少二选一:

方案 A:在 setup 中加守卫,与现有 integration test 一致:

from src.tests.test_utils import should_skip_external

@pytest.fixture(autouse=True)
def setup(self):
    if should_skip_external():
        pytest.skip("Skipping tests that require external services")
    self.index_text = "..."
    self.scheduler = SchedulerSingleton.get_instance()

方案 B:用自定义 mark + CONTRIBUTING.md 引导:

# pytest.ini 或 pyproject.toml
[tool.pytest.ini_options]
markers = ["local_e2e: tests requiring HugeGraph + LLM API key (skipped by default)"]

# 测试文件
@pytest.mark.local_e2e
class TestFlowsIntegration: ...

并在 CONTRIBUTING.md 写明 pytest -m local_e2e 才会跑这套测试。

self.index_text = """
梁漱溟年轻时,一日,他与父亲梁济讨论当时一战欧洲的时局,梁济突然问道:“这个世界会好吗?”梁漱溟答:“我相信世界是一天一天往好里去的。”梁济叹道:“能好就好啊!”然后离家,三日后,梁济投湖自尽。晚年梁漱溟回忆自己的一生和跌宕起伏的近代社会,总结了一本书,书名就叫《这个世界会好吗?》。梁漱溟的回答与年轻时一致。但很多人特别是遗老遗少们总在回忆往日的时光,仿佛那是人类的黄金时代。如同鲁迅笔下的九斤老太,整日里念叨着“一代不如一代”。或者极端如梁济,对世界未来充满悲观,一死了之。在今天的时代,很多人认为“世界正变得越来越糟”,这其中不乏知名的知识分子。平克将这种情况称之为「进步恐惧症」,并总结为「认知偏差」。因为每天的新闻报道里总是充斥着战争、恐怖主义、犯罪、污染等坏消息,不是因为这些事情是主流,而是因为它们是热点,导致给人们的印象是世界越来越糟。所谓“好事不出门,坏事传千里”,而在互联网时代,发达的信息传播让坏事传播的更快更广。要纠正这种「可得性偏差」的方法是用数据说话。数字是最能反应趋势,看战争的比例、犯罪死亡人数在总人数的占比,就能看出犯罪是增加了,还是减少了。实际上,从各种数字显示,人类暴力事件在历史呈明显的下降趋势,这在平克之前发表的另一大部头著作《人性中的善良天使:暴力为什么会减少》中详细阐述过。世界变得更好了,说到底就是进步。
"""
self.scheduler = SchedulerSingleton.get_instance()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ ImportantSchedulerSingleton 是进程级单例,跨测试共享 pipeline_pool

SchedulerSingleton.get_instance()(见 flows/scheduler.py:181-191)是进程级单例,其 pipeline_pool 内每个 GPipelineManager 都会缓存 pipeline。这意味着:

  1. 顺序耦合test_build_knowledge_graph 写入的图数据 / 向量索引会被后续 test_rag 看到。如果用户用 pytest -k test_rag 单独跑,test_rag 是否通过取决于此前是否跑过 test_build_knowledge_graph,行为不可复现。
  2. 脏 pipeline:某测试中途异常未走到 manager.release/add 时,下一个测试可能拿到错误状态的 pipeline。

对 contributor 在反复修复-重试的场景下尤其糟糕:上一次失败留下的脏状态会让下一次的失败原因看起来与代码无关。

修法:teardown 中重置;或为每个测试用唯一的 graph_name(避免共享数据)。例如:

@pytest.fixture(autouse=True)
def setup(self):
    if should_skip_external():
        pytest.skip("...")
    self.index_text = "..."
    huge_settings.graph_name = f"test_{uuid.uuid4().hex[:8]}"
    self.scheduler = SchedulerSingleton.get_instance()
    yield
    # teardown:清空该 graph,或释放 pipeline

至少在文件 docstring 写明"测试间存在隐式数据依赖",让维护者心里有数。


def test_build_knowledge_graph(self):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Important:单个测试串行 4 个 flow,配合错误信息 bug 让定位极困难

test_build_knowledge_graph 串行执行 BUILD_VECTOR_INDEX → GRAPH_EXTRACT → IMPORT_GRAPH_DATA → UPDATE_VID_EMBEDDINGS 四个 flow(line 41-129)。若 IMPORT_GRAPH_DATA 挂掉,配合上面那条复制粘贴 bug,contributor 看到的错误是 "BUILD_VECTOR_INDEX flow failed",会被严重误导。

另外 CONTRIBUTING.md 表格把它标为 1 项测试,与实际行为不符。

修法:拆成 4 个独立测试,用 fixture 共享上一步的产物;或至少改用 pytest-subtests

def test_build_kg__vector_index(self):
    res = self.scheduler.schedule_flow(FlowName.BUILD_VECTOR_INDEX, [self.index_text])
    assert "chunks" in res and len(res["chunks"]) > 0

def test_build_kg__graph_extract(self):
    data = self.scheduler.schedule_flow(FlowName.GRAPH_EXTRACT, ...)
    assert data["vertices"] and data["edges"]

# 以此类推

好处:失败定位精确到 flow;CONTRIBUTING.md 表格与代码对齐;contributor 跑挂时不需要从头再来。

try:
res = self.scheduler.schedule_flow(FlowName.BUILD_VECTOR_INDEX, [self.index_text])
assert "chunks" in res, "The result of BUILD_VECTOR_INDEX flow should contain 'chunks' field"
log.info("✓ BUILD_VECTOR_INDEX flow executed successfully")

schema = """
{
"vertexlabels": [
{
"id": 1,
"name": "Person",
"id_strategy": "PRIMARY_KEY",
"primary_keys": [
"name"
],
"properties": [
"name",
"age",
"occupation"
]
},
{
"id": 2,
"name": "Book",
"id_strategy": "PRIMARY_KEY",
"primary_keys": [
"title"
],
"properties": [
"title",
"author",
"year"
]
},
{
"id": 3,
"name": "Concept",
"id_strategy": "PRIMARY_KEY",
"primary_keys": [
"name"
],
"properties": [
"name",
"description"
]
}
],
"edgelabels": [
{
"id": 1,
"name": "Wrote",
"source_label": "Person",
"target_label": "Book",
"properties": []
},
{
"id": 2,
"name": "Discussed",
"source_label": "Person",
"target_label": "Concept",
"properties": []
},
{
"id": 3,
"name": "Believes",
"source_label": "Person",
"target_label": "Concept",
"properties": []
}
]
}
"""

data = self.scheduler.schedule_flow(
FlowName.GRAPH_EXTRACT,
schema,
[self.index_text],
PromptConfig.extract_graph_prompt_EN,
"property_graph",
)
assert "vertices" in data, "The result of GRAPH_EXTRACT flow should contain 'vertices' field"
assert "edges" in data, "The result of GRAPH_EXTRACT flow should contain 'edges' field"
log.info("✓ GRAPH_EXTRACT flow executed successfully")

res = self.scheduler.schedule_flow(FlowName.IMPORT_GRAPH_DATA, data, schema)
assert res is not None, "The result of IMPORT_GRAPH_DATA flow should not be None"
log.info("✓ IMPORT_GRAPH_DATA flow executed successfully")

self.scheduler.schedule_flow(FlowName.UPDATE_VID_EMBEDDINGS)
log.info("✓ UPDATE_VID_EMBEDDING flow executed successfully")
except Exception as e:
pytest.fail(f"BUILD_VECTOR_INDEX flow failed: {e}")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‼️ Critical:复制粘贴错误信息 + try/except + pytest.fail 反模式

本行(line 131)以及 line 189、line 199 三处 pytest.fail(...) 全部写成 "BUILD_VECTOR_INDEX flow failed: {e}",但实际抛出的可能是 BUILD_SCHEMA / PROMPT_GENERATE / IMPORT_GRAPH_DATA 等任意一个 flow。对于一个面向 contributor 本地自查的 smoke test,错误信息直接误导新人去排查错误的 flow

更深层的问题是 try/except + pytest.fail(str(e)) 本身就是反模式:

  1. pytest 默认遇到异常就会 fail 并展示完整 traceback;这层包装反而吞掉异常类型与 cause chain;
  2. 多余的 try 块降低了可读性。

最简修法:直接删除 try/except 让异常自然冒泡。例如本测试(test_build_knowledge_graph):

def test_build_knowledge_graph(self):
    res = self.scheduler.schedule_flow(FlowName.BUILD_VECTOR_INDEX, [self.index_text])
    assert "chunks" in res
    log.info("BUILD_VECTOR_INDEX flow executed successfully")
    # ...后续步骤同样直接调用,不要包 try/except

如果非要保留上下文,至少把 flow 名称变量化:pytest.fail(f"{current_flow} failed: {e}")


def test_schema_generator(self):
try:
query_examples = load_query_examples()

few_shot = """
{
"vertexlabels": [
{
"id": 1,
"name": "person",
"id_strategy": "PRIMARY_KEY",
"primary_keys": [
"name"
],
"properties": [
"name",
"age",
"occupation"
]
},
{
"id": 2,
"name": "webpage",
"id_strategy": "PRIMARY_KEY",
"primary_keys": [
"name"
],
"properties": [
"name",
"url"
]
}
],
"edgelabels": [
{
"id": 1,
"name": "roommate",
"source_label": "person",
"target_label": "person",
"properties": [
"date"
]
},
{
"id": 2,
"name": "link",
"source_label": "webpage",
"target_label": "person",
"properties": []
}
]
}
"""

self.scheduler.schedule_flow(FlowName.BUILD_SCHEMA, [self.index_text], query_examples, few_shot)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Important:断言过弱 / test_schema_generator 无任何断言

本行 self.scheduler.schedule_flow(FlowName.BUILD_SCHEMA, ...) 没有任何断言,等价于"只要不抛异常就算通过"。schema 生成器返回空 list、错误结构、或 LLM 回了一句 "抱歉,我无法生成" 都会被判通过。

同样的弱断言遍布全文件:

  • line 197 / 242 / 266 / 290 / 314 / 329:仅 assert res is not None
  • line 125:assert res is not None(IMPORT_GRAPH_DATA)

对于面向 contributor 的本地自查,"通过"应该真正校验返回结构。

最低限度修法:

res = self.scheduler.schedule_flow(FlowName.BUILD_SCHEMA, ...)
assert isinstance(res, dict)
assert res.get("schema", {}).get("vertexlabels"), "BUILD_SCHEMA returned empty vertexlabels"
assert res["schema"].get("edgelabels"), "BUILD_SCHEMA returned empty edgelabels"

理想做法:把期望产物存到 src/tests/data/flows/expected_schema.json(仓库已有 src/tests/data/ 目录),用结构 diff 校验。

except Exception as e:
pytest.fail(f"BUILD_VECTOR_INDEX flow failed: {e}")

def test_graph_extract_prompt(self):
try:
scenario = "social relationships"
example_name = "Official Person-Relationship Extraction"

res = self.scheduler.schedule_flow(FlowName.PROMPT_GENERATE, self.index_text, scenario, example_name)
assert res is not None, "The result of PROMPT_GENERATE flow should not be None"
except Exception as e:
pytest.fail(f"BUILD_VECTOR_INDEX flow failed: {e}")

def test_rag(self):
query = "梁漱溟和梁济的关系是什么?"

raw_answer = True
vector_only_answer = False
graph_only_answer = False
graph_vector_answer = False
graph_ratio = 0.6
rerank_method = "bleu"
near_neighbor_first = False
custom_related_information = ""

graph_search, gremlin_prompt, vector_search = update_ui_configs(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

‼️ Criticaltest_rag 实际只测了 raw 模式 + 测试会写回 config_prompt.yaml

问题 1(死代码 → 名义 4 模式实际只测 1 种)
update_ui_configs(...)(line 213-222)只调用一次,传入的 vector_only_answer / graph_only_answer / graph_vector_answer 都是 False,因此返回 vector_search=False, graph_search=False
后续四段 schedule_flow 始终复用这同一组标记(见 line 227-228 / 251-252 / 275-276 / 299-300),而对局部 bool 的反复赋值(line 244-247 / 268-271 / 292-295)根本没传给 flow,是死代码。
=> 四次调用实际全部以"无检索"模式运行;RAG_VECTOR_ONLY / RAG_GRAPH_ONLY / RAG_GRAPH_VECTOR 这三条核心路径没被实际验证。CONTRIBUTING.md 中的 "All RAG modes" 描述不成立。

问题 2(副作用:测试会改 contributor 的 yaml)
update_ui_configs 在 prompt 内容变化时会触发 prompt.update_yaml_file()(见 rag_block.py:141-147)。Contributor 跑完该测试后,工作区里的 config_prompt.yaml 会被悄悄改写为测试中的 EN prompt 与 default_question="梁漱溟和梁济的关系是什么?",下次 git status 会冒出无关 diff,极易误提交。

修法:参数化覆盖 4 种模式 + 直接构造 vector_search / graph_search,绕开 demo helper 副作用:

@pytest.mark.parametrize(
    ("flow_name", "raw", "vec_only", "graph_only", "graph_vec"),
    [
        (FlowName.RAG_RAW, True, False, False, False),
        (FlowName.RAG_VECTOR_ONLY, False, True, False, False),
        (FlowName.RAG_GRAPH_ONLY, False, False, True, False),
        (FlowName.RAG_GRAPH_VECTOR, False, False, False, True),
    ],
)
def test_rag(self, flow_name, raw, vec_only, graph_only, graph_vec):
    query = "梁漱溟和梁济的关系是什么?"
    graph_search = graph_only or graph_vec
    vector_search = vec_only or graph_vec
    res = self.scheduler.schedule_flow(
        flow_name,
        query=query,
        vector_search=vector_search,
        graph_search=graph_search,
        raw_answer=raw,
        vector_only_answer=vec_only,
        graph_only_answer=graph_only,
        graph_vector_answer=graph_vec,
        graph_ratio=0.6,
        rerank_method="bleu",
        near_neighbor_first=False,
        custom_related_information="",
        answer_prompt=PromptConfig.answer_prompt_EN,
        keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN,
        gremlin_tmpl_num=-1,
        gremlin_prompt=PromptConfig.gremlin_generate_prompt_EN,
    )
    assert isinstance(res, dict) and res.get("answer"), f"{flow_name} returned empty answer"

PromptConfig.answer_prompt_EN,
custom_related_information,
graph_only_answer,
graph_vector_answer,
None,
PromptConfig.keywords_extract_prompt_EN,
query,
vector_only_answer,
)

res = self.scheduler.schedule_flow(
FlowName.RAG_RAW,
query=query,
vector_search=vector_search,
graph_search=graph_search,
raw_answer=raw_answer,
vector_only_answer=vector_only_answer,
graph_only_answer=graph_only_answer,
graph_vector_answer=graph_vector_answer,
graph_ratio=graph_ratio,
rerank_method=rerank_method,
near_neighbor_first=near_neighbor_first,
custom_related_information=custom_related_information,
answer_prompt=PromptConfig.answer_prompt_EN,
keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN,
gremlin_tmpl_num=-1,
gremlin_prompt=gremlin_prompt,
)
assert res is not None, "The result of RAG flow should not be None"

raw_answer = False
vector_only_answer = True
graph_only_answer = False
graph_vector_answer = False
res = self.scheduler.schedule_flow(
FlowName.RAG_VECTOR_ONLY,
query=query,
vector_search=vector_search,
graph_search=graph_search,
raw_answer=raw_answer,
vector_only_answer=vector_only_answer,
graph_only_answer=graph_only_answer,
graph_vector_answer=graph_vector_answer,
graph_ratio=graph_ratio,
rerank_method=rerank_method,
near_neighbor_first=near_neighbor_first,
custom_related_information=custom_related_information,
answer_prompt=PromptConfig.answer_prompt_EN,
keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN,
gremlin_tmpl_num=-1,
gremlin_prompt=gremlin_prompt,
)
assert res is not None, "The result of RAG flow should not be None"

raw_answer = False
vector_only_answer = False
graph_only_answer = True
graph_vector_answer = False
res = self.scheduler.schedule_flow(
FlowName.RAG_GRAPH_ONLY,
query=query,
vector_search=vector_search,
graph_search=graph_search,
raw_answer=raw_answer,
vector_only_answer=vector_only_answer,
graph_only_answer=graph_only_answer,
graph_vector_answer=graph_vector_answer,
graph_ratio=graph_ratio,
rerank_method=rerank_method,
near_neighbor_first=near_neighbor_first,
custom_related_information=custom_related_information,
answer_prompt=PromptConfig.answer_prompt_EN,
keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN,
gremlin_tmpl_num=-1,
gremlin_prompt=gremlin_prompt,
)
assert res is not None, "The result of RAG flow should not be None"

raw_answer = False
vector_only_answer = False
graph_only_answer = False
graph_vector_answer = True
res = self.scheduler.schedule_flow(
FlowName.RAG_GRAPH_VECTOR,
query=query,
vector_search=vector_search,
graph_search=graph_search,
raw_answer=raw_answer,
vector_only_answer=vector_only_answer,
graph_only_answer=graph_only_answer,
graph_vector_answer=graph_vector_answer,
graph_ratio=graph_ratio,
rerank_method=rerank_method,
near_neighbor_first=near_neighbor_first,
custom_related_information=custom_related_information,
answer_prompt=PromptConfig.answer_prompt_EN,
keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN,
gremlin_tmpl_num=-1,
gremlin_prompt=gremlin_prompt,
)
assert res is not None, "The result of RAG flow should not be None"

def test_build_example_index(self):
res = build_example_vector_index(None)
assert "embed_dim" in res, "The result of build_example_vector_index should contain embed_dim"

def test_text_2_gremlin(self):
query = "梁漱溟和梁济的关系是什么?"
schema = "hugegraph"
example_num = 2

res = self.scheduler.schedule_flow(
FlowName.TEXT2GREMLIN, query, example_num, schema, PromptConfig.gremlin_generate_prompt_EN, None
)

assert res is not None, "The result of TEXT2GREMLIN flow should not be None"
Loading