feat: add integration test case for flow execution #315
Changes from all commits
caf0123
89e83f3
5bdb06a
680059b
10e8d59
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| # Contributing to HugeGraph-AI | ||
|
|
||
| Thank you for your interest in contributing! Before submitting a pull request, please run the end-to-end integration tests locally to make sure nothing is broken. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| 1. **HugeGraph Server** running on `localhost:8080` (see [README.md](./README.md) for setup) | ||
| 2. **Python 3.10+** with dependencies installed via `uv sync --extra llm` | ||
| 3. **Proxy users**: If you have `http_proxy`/`https_proxy` set, make sure to exclude localhost: | ||
| ```bash | ||
| export no_proxy=localhost,127.0.0.1 | ||
| export NO_PROXY=localhost,127.0.0.1 | ||
| ``` | ||
|
|
||
| ## Running Integration Tests | ||
|
|
||
| ```bash | ||
| # Activate the virtual environment | ||
| source .venv/bin/activate | ||
|
|
||
| # Run the end-to-end integration tests | ||
| cd hugegraph-llm | ||
| python -m pytest src/tests/integration/test_flows_integration.py -v | ||
| ``` | ||
|
|
||
| All 6 tests must pass before you submit your code: | ||
|
Member
🧹 Minor: CONTRIBUTING.md placement / scope / enforcement
|
||
|
|
||
| | Test | What it verifies | | ||
| |------|-----------------| | ||
| | `test_build_knowledge_graph` | Vector index building, graph extraction, data import, and VID embedding update | | ||
| | `test_schema_generator` | Schema generation from text | | ||
| | `test_graph_extract_prompt` | Graph extraction prompt generation | | ||
| | `test_rag` | All RAG modes (raw, vector-only, graph-only, graph+vector) | | ||
| | `test_build_example_index` | Example vector index building for Text2Gremlin | | ||
| | `test_text_2_gremlin` | Natural language to Gremlin query translation | | ||
|
|
||
| ## Submission Checklist | ||
|
|
||
| - [ ] Integration tests pass locally (`6 passed`) | ||
| - [ ] Code is formatted with `ruff format .` | ||
| - [ ] Linting passes with `ruff check .` | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,329 @@ | ||
| # Licensed to the Apache Software Foundation (ASF) under one | ||
| # or more contributor license agreements. See the NOTICE file | ||
| # distributed with this work for additional information | ||
| # regarding copyright ownership. The ASF licenses this file | ||
| # to you under the Apache License, Version 2.0 (the | ||
| # "License"); you may not use this file except in compliance | ||
| # with the License. You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, | ||
| # software distributed under the License is distributed on an | ||
| # "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| # KIND, either express or implied. See the License for the | ||
| # specific language governing permissions and limitations | ||
| # under the License. | ||
|
|
||
| import pytest | ||
|
|
||
| from hugegraph_llm.config.prompt_config import PromptConfig | ||
| from hugegraph_llm.demo.rag_demo.rag_block import update_ui_configs | ||
| from hugegraph_llm.demo.rag_demo.text2gremlin_block import build_example_vector_index | ||
| from hugegraph_llm.demo.rag_demo.vector_graph_block import load_query_examples | ||
| from hugegraph_llm.flows import FlowName | ||
| from hugegraph_llm.flows.scheduler import SchedulerSingleton | ||
| from hugegraph_llm.utils.log import log | ||
|
|
||
|
|
||
| class TestFlowsIntegration: | ||
| """Flow integration tests: verify that each flow runs without raising an exception.""" | ||
|
|
||
| @pytest.fixture(autouse=True) | ||
| def setup(self): | ||
|
Member
The new file has no such guard → when a contributor runs … @coderzc already raised this once in P2, and the author replied "deliberately kept out of CI", but whether the tests run in CI and how a default local pytest run behaves are two separate questions. Pick at least one of the following.
Option A: add a guard in setup, consistent with the existing integration tests:
    from src.tests.test_utils import should_skip_external

    @pytest.fixture(autouse=True)
    def setup(self):
        if should_skip_external():
            pytest.skip("Skipping tests that require external services")
        self.index_text = "..."
        self.scheduler = SchedulerSingleton.get_instance()
Option B: use a custom mark, with guidance in CONTRIBUTING.md:
    # pytest.ini or pyproject.toml
    [tool.pytest.ini_options]
    markers = ["local_e2e: tests requiring HugeGraph + LLM API key (skipped by default)"]

    # test file
    @pytest.mark.local_e2e
    class TestFlowsIntegration: ...
and state in CONTRIBUTING.md …
|
||
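For readers unfamiliar with the guard in Option A of the review comment above, here is a minimal sketch of how an environment-variable switch of this kind could work. The body of `should_skip_external` below is a hypothetical stand-in, not the project's actual helper in `src.tests.test_utils`; the only detail taken from this review is that `SKIP_EXTERNAL_SERVICES` defaults to true. The set of accepted truthy spellings is an assumption.

```python
import os

# Hypothetical stand-in; the real src.tests.test_utils.should_skip_external
# may parse the variable differently.
_TRUTHY = {"1", "true", "yes", "on"}


def should_skip_external(env=None):
    """Return True when tests that need external services should be skipped.

    Assumption: skipping is the default, and only an explicit non-truthy
    value of SKIP_EXTERNAL_SERVICES enables the tests.
    """
    env = os.environ if env is None else env
    raw = env.get("SKIP_EXTERNAL_SERVICES", "true")
    return raw.strip().lower() in _TRUTHY
```

Passing the environment as a parameter keeps the helper easy to check without mutating `os.environ`.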
| self.index_text = """ | ||
| 梁漱溟年轻时,一日,他与父亲梁济讨论当时一战欧洲的时局,梁济突然问道:“这个世界会好吗?”梁漱溟答:“我相信世界是一天一天往好里去的。”梁济叹道:“能好就好啊!”然后离家,三日后,梁济投湖自尽。晚年梁漱溟回忆自己的一生和跌宕起伏的近代社会,总结了一本书,书名就叫《这个世界会好吗?》。梁漱溟的回答与年轻时一致。但很多人特别是遗老遗少们总在回忆往日的时光,仿佛那是人类的黄金时代。如同鲁迅笔下的九斤老太,整日里念叨着“一代不如一代”。或者极端如梁济,对世界未来充满悲观,一死了之。在今天的时代,很多人认为“世界正变得越来越糟”,这其中不乏知名的知识分子。平克将这种情况称之为「进步恐惧症」,并总结为「认知偏差」。因为每天的新闻报道里总是充斥着战争、恐怖主义、犯罪、污染等坏消息,不是因为这些事情是主流,而是因为它们是热点,导致给人们的印象是世界越来越糟。所谓“好事不出门,坏事传千里”,而在互联网时代,发达的信息传播让坏事传播的更快更广。要纠正这种「可得性偏差」的方法是用数据说话。数字是最能反应趋势,看战争的比例、犯罪死亡人数在总人数的占比,就能看出犯罪是增加了,还是减少了。实际上,从各种数字显示,人类暴力事件在历史呈明显的下降趋势,这在平克之前发表的另一大部头著作《人性中的善良天使:暴力为什么会减少》中详细阐述过。世界变得更好了,说到底就是进步。 | ||
| """ | ||
| self.scheduler = SchedulerSingleton.get_instance() | ||
|
Member
This is especially bad for a contributor in a repeated fix-and-retry loop: dirty state left behind by the previous failure makes the cause of the next failure look unrelated to the code.
Fix: reset in teardown, or use a unique … for each test:
    @pytest.fixture(autouse=True)
    def setup(self):
        if should_skip_external():
            pytest.skip("...")
        self.index_text = "..."
        huge_settings.graph_name = f"test_{uuid.uuid4().hex[:8]}"
        self.scheduler = SchedulerSingleton.get_instance()
        yield
        # teardown: clear this graph, or release the pipeline
At the very least, state in the file docstring that "the tests have implicit data dependencies on one another", so that maintainers know what to expect.
|
||
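The isolation idea in the review comment above (a unique graph name per test, restored on teardown) can be sketched without pytest. A pytest yield fixture has the same setup / `yield` / teardown shape. `FakeSettings` and the `cleanup` callback are hypothetical stand-ins for the project's settings object and whatever call actually clears a graph.

```python
import uuid
from contextlib import contextmanager


class FakeSettings:
    """Hypothetical stand-in for the project's graph settings object."""

    graph_name = "hugegraph"


@contextmanager
def isolated_graph(settings, cleanup):
    """Point settings at a unique graph for one test, then restore and clean up."""
    original = settings.graph_name
    name = f"test_{uuid.uuid4().hex[:8]}"
    settings.graph_name = name
    try:
        yield name
    finally:
        # Teardown runs even when the test body raises, so a failed run
        # does not leave state behind for the next attempt.
        settings.graph_name = original
        cleanup(name)
```

Because the teardown sits in a `finally` block, a failing test still restores the original graph name and triggers cleanup.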
|
|
||
| def test_build_knowledge_graph(self): | ||
|
Member
In addition, the CONTRIBUTING.md table lists it as a single test, which does not match its actual behavior.
Fix: split it into 4 independent tests that share the previous step's output through a fixture; or at least switch to …
    def test_build_kg__vector_index(self):
        res = self.scheduler.schedule_flow(FlowName.BUILD_VECTOR_INDEX, [self.index_text])
        assert "chunks" in res and len(res["chunks"]) > 0

    def test_build_kg__graph_extract(self):
        data = self.scheduler.schedule_flow(FlowName.GRAPH_EXTRACT, ...)
        assert data["vertices"] and data["edges"]

    # and so on
Benefits: a failure is pinpointed to a single flow; the CONTRIBUTING.md table matches the code; a contributor whose run fails does not have to start over from the beginning.
|
||
| try: | ||
| res = self.scheduler.schedule_flow(FlowName.BUILD_VECTOR_INDEX, [self.index_text]) | ||
| assert "chunks" in res, "The result of BUILD_VECTOR_INDEX flow should contain 'chunks' field" | ||
| log.info("✓ BUILD_VECTOR_INDEX flow executed successfully") | ||
|
|
||
| schema = """ | ||
| { | ||
| "vertexlabels": [ | ||
| { | ||
| "id": 1, | ||
| "name": "Person", | ||
| "id_strategy": "PRIMARY_KEY", | ||
| "primary_keys": [ | ||
| "name" | ||
| ], | ||
| "properties": [ | ||
| "name", | ||
| "age", | ||
| "occupation" | ||
| ] | ||
| }, | ||
| { | ||
| "id": 2, | ||
| "name": "Book", | ||
| "id_strategy": "PRIMARY_KEY", | ||
| "primary_keys": [ | ||
| "title" | ||
| ], | ||
| "properties": [ | ||
| "title", | ||
| "author", | ||
| "year" | ||
| ] | ||
| }, | ||
| { | ||
| "id": 3, | ||
| "name": "Concept", | ||
| "id_strategy": "PRIMARY_KEY", | ||
| "primary_keys": [ | ||
| "name" | ||
| ], | ||
| "properties": [ | ||
| "name", | ||
| "description" | ||
| ] | ||
| } | ||
| ], | ||
| "edgelabels": [ | ||
| { | ||
| "id": 1, | ||
| "name": "Wrote", | ||
| "source_label": "Person", | ||
| "target_label": "Book", | ||
| "properties": [] | ||
| }, | ||
| { | ||
| "id": 2, | ||
| "name": "Discussed", | ||
| "source_label": "Person", | ||
| "target_label": "Concept", | ||
| "properties": [] | ||
| }, | ||
| { | ||
| "id": 3, | ||
| "name": "Believes", | ||
| "source_label": "Person", | ||
| "target_label": "Concept", | ||
| "properties": [] | ||
| } | ||
| ] | ||
| } | ||
| """ | ||
|
|
||
| data = self.scheduler.schedule_flow( | ||
| FlowName.GRAPH_EXTRACT, | ||
| schema, | ||
| [self.index_text], | ||
| PromptConfig.extract_graph_prompt_EN, | ||
| "property_graph", | ||
| ) | ||
| assert "vertices" in data, "The result of GRAPH_EXTRACT flow should contain 'vertices' field" | ||
| assert "edges" in data, "The result of GRAPH_EXTRACT flow should contain 'edges' field" | ||
| log.info("✓ GRAPH_EXTRACT flow executed successfully") | ||
|
|
||
| res = self.scheduler.schedule_flow(FlowName.IMPORT_GRAPH_DATA, data, schema) | ||
| assert res is not None, "The result of IMPORT_GRAPH_DATA flow should not be None" | ||
| log.info("✓ IMPORT_GRAPH_DATA flow executed successfully") | ||
|
|
||
| self.scheduler.schedule_flow(FlowName.UPDATE_VID_EMBEDDINGS) | ||
| log.info("✓ UPDATE_VID_EMBEDDING flow executed successfully") | ||
| except Exception as e: | ||
| pytest.fail(f"BUILD_VECTOR_INDEX flow failed: {e}") | ||
|
Member
This line (line 131), together with line 189 and line 199, three places in all … The deeper problem is …
Simplest fix: simply delete …
    def test_build_knowledge_graph(self):
        res = self.scheduler.schedule_flow(FlowName.BUILD_VECTOR_INDEX, [self.index_text])
        assert "chunks" in res
        log.info("BUILD_VECTOR_INDEX flow executed successfully")
        # ...call the subsequent steps directly as well; do not wrap them in try/except
If the context really must be kept, at least make the flow name a variable: …
|
||
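One possible shape for "make the flow name a variable" is sketched below. This is an illustration only, not the reviewer's snippet: `run_flow` and `FlowError` are hypothetical names, and `schedule` stands in for a callable such as `scheduler.schedule_flow`. The failure message is built from the flow that was actually called, and `raise ... from` keeps the original exception attached.

```python
class FlowError(AssertionError):
    """Raised when a flow fails; the message names the flow that failed."""


def run_flow(schedule, flow_name, *args, **kwargs):
    """Call schedule(flow_name, ...) and label any failure with flow_name."""
    try:
        return schedule(flow_name, *args, **kwargs)
    except Exception as exc:
        # 'from exc' keeps the original exception available as __cause__,
        # so the underlying traceback is not lost.
        raise FlowError(f"{flow_name} flow failed: {exc}") from exc
```

With a wrapper like this, a copy-pasted message can no longer name the wrong flow, because the name comes from the call itself.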
|
|
||
| def test_schema_generator(self): | ||
| try: | ||
| query_examples = load_query_examples() | ||
|
|
||
| few_shot = """ | ||
| { | ||
| "vertexlabels": [ | ||
| { | ||
| "id": 1, | ||
| "name": "person", | ||
| "id_strategy": "PRIMARY_KEY", | ||
| "primary_keys": [ | ||
| "name" | ||
| ], | ||
| "properties": [ | ||
| "name", | ||
| "age", | ||
| "occupation" | ||
| ] | ||
| }, | ||
| { | ||
| "id": 2, | ||
| "name": "webpage", | ||
| "id_strategy": "PRIMARY_KEY", | ||
| "primary_keys": [ | ||
| "name" | ||
| ], | ||
| "properties": [ | ||
| "name", | ||
| "url" | ||
| ] | ||
| } | ||
| ], | ||
| "edgelabels": [ | ||
| { | ||
| "id": 1, | ||
| "name": "roommate", | ||
| "source_label": "person", | ||
| "target_label": "person", | ||
| "properties": [ | ||
| "date" | ||
| ] | ||
| }, | ||
| { | ||
| "id": 2, | ||
| "name": "link", | ||
| "source_label": "webpage", | ||
| "target_label": "person", | ||
| "properties": [] | ||
| } | ||
| ] | ||
| } | ||
| """ | ||
|
|
||
| self.scheduler.schedule_flow(FlowName.BUILD_SCHEMA, [self.index_text], query_examples, few_shot) | ||
|
Member
This line … The same weak assertions are spread throughout the file: …
For a local self-check aimed at contributors, a "pass" should actually validate the returned structure.
Minimal fix:
    res = self.scheduler.schedule_flow(FlowName.BUILD_SCHEMA, ...)
    assert isinstance(res, dict)
    assert res.get("schema", {}).get("vertexlabels"), "BUILD_SCHEMA returned empty vertexlabels"
    assert res["schema"].get("edgelabels"), "BUILD_SCHEMA returned empty edgelabels"
Ideal approach: store the expected output in …
|
||
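The structural check suggested in the review comment above can be factored into a small helper that reports every problem at once instead of stopping at the first failed assert. This is a sketch under one assumption: the result has the shape `{"schema": {"vertexlabels": [...], "edgelabels": [...]}}`, which is taken from the reviewer's snippet and not verified against the flow's real output. `check_schema_result` is a hypothetical name.

```python
def check_schema_result(res):
    """Return a list of problems found in a BUILD_SCHEMA-style result."""
    if not isinstance(res, dict):
        return [f"expected dict, got {type(res).__name__}"]
    schema = res.get("schema")
    if not isinstance(schema, dict):
        return ["missing or non-dict 'schema' field"]

    problems = []
    for key in ("vertexlabels", "edgelabels"):
        labels = schema.get(key)
        if not isinstance(labels, list) or not labels:
            problems.append(f"'{key}' is missing or empty")
            continue
        # Every label entry should at least carry a non-empty name.
        for i, label in enumerate(labels):
            if not isinstance(label, dict) or not label.get("name"):
                problems.append(f"{key}[{i}] has no 'name'")
    return problems
```

A test could then assert `check_schema_result(res) == []`, so that a failure shows the full list of structural problems.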
| except Exception as e: | ||
| pytest.fail(f"BUILD_VECTOR_INDEX flow failed: {e}") | ||
|
|
||
| def test_graph_extract_prompt(self): | ||
| try: | ||
| scenario = "social relationships" | ||
| example_name = "Official Person-Relationship Extraction" | ||
|
|
||
| res = self.scheduler.schedule_flow(FlowName.PROMPT_GENERATE, self.index_text, scenario, example_name) | ||
| assert res is not None, "The result of PROMPT_GENERATE flow should not be None" | ||
| except Exception as e: | ||
| pytest.fail(f"BUILD_VECTOR_INDEX flow failed: {e}") | ||
|
|
||
| def test_rag(self): | ||
| query = "梁漱溟和梁济的关系是什么?" | ||
|
|
||
| raw_answer = True | ||
| vector_only_answer = False | ||
| graph_only_answer = False | ||
| graph_vector_answer = False | ||
| graph_ratio = 0.6 | ||
| rerank_method = "bleu" | ||
| near_neighbor_first = False | ||
| custom_related_information = "" | ||
|
|
||
| graph_search, gremlin_prompt, vector_search = update_ui_configs( | ||
|
Member
Problem 1 (dead code → nominally 4 modes, but only 1 is actually tested) …
Problem 2 (side effect: the test modifies the contributor's yaml) …
Fix: parametrize to cover all 4 modes, and construct … directly:
    @pytest.mark.parametrize(
        ("flow_name", "raw", "vec_only", "graph_only", "graph_vec"),
        [
            (FlowName.RAG_RAW, True, False, False, False),
            (FlowName.RAG_VECTOR_ONLY, False, True, False, False),
            (FlowName.RAG_GRAPH_ONLY, False, False, True, False),
            (FlowName.RAG_GRAPH_VECTOR, False, False, False, True),
        ],
    )
    def test_rag(self, flow_name, raw, vec_only, graph_only, graph_vec):
        query = "梁漱溟和梁济的关系是什么?"
        graph_search = graph_only or graph_vec
        vector_search = vec_only or graph_vec
        res = self.scheduler.schedule_flow(
            flow_name,
            query=query,
            vector_search=vector_search,
            graph_search=graph_search,
            raw_answer=raw,
            vector_only_answer=vec_only,
            graph_only_answer=graph_only,
            graph_vector_answer=graph_vec,
            graph_ratio=0.6,
            rerank_method="bleu",
            near_neighbor_first=False,
            custom_related_information="",
            answer_prompt=PromptConfig.answer_prompt_EN,
            keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN,
            gremlin_tmpl_num=-1,
            gremlin_prompt=PromptConfig.gremlin_generate_prompt_EN,
        )
        assert isinstance(res, dict) and res.get("answer"), f"{flow_name} returned empty answer"
|
||
| PromptConfig.answer_prompt_EN, | ||
| custom_related_information, | ||
| graph_only_answer, | ||
| graph_vector_answer, | ||
| None, | ||
| PromptConfig.keywords_extract_prompt_EN, | ||
| query, | ||
| vector_only_answer, | ||
| ) | ||
|
|
||
| res = self.scheduler.schedule_flow( | ||
| FlowName.RAG_RAW, | ||
| query=query, | ||
| vector_search=vector_search, | ||
| graph_search=graph_search, | ||
| raw_answer=raw_answer, | ||
| vector_only_answer=vector_only_answer, | ||
| graph_only_answer=graph_only_answer, | ||
| graph_vector_answer=graph_vector_answer, | ||
| graph_ratio=graph_ratio, | ||
| rerank_method=rerank_method, | ||
| near_neighbor_first=near_neighbor_first, | ||
| custom_related_information=custom_related_information, | ||
| answer_prompt=PromptConfig.answer_prompt_EN, | ||
| keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN, | ||
| gremlin_tmpl_num=-1, | ||
| gremlin_prompt=gremlin_prompt, | ||
| ) | ||
| assert res is not None, "The result of RAG flow should not be None" | ||
|
|
||
| raw_answer = False | ||
| vector_only_answer = True | ||
| graph_only_answer = False | ||
| graph_vector_answer = False | ||
| res = self.scheduler.schedule_flow( | ||
| FlowName.RAG_VECTOR_ONLY, | ||
| query=query, | ||
| vector_search=vector_search, | ||
| graph_search=graph_search, | ||
| raw_answer=raw_answer, | ||
| vector_only_answer=vector_only_answer, | ||
| graph_only_answer=graph_only_answer, | ||
| graph_vector_answer=graph_vector_answer, | ||
| graph_ratio=graph_ratio, | ||
| rerank_method=rerank_method, | ||
| near_neighbor_first=near_neighbor_first, | ||
| custom_related_information=custom_related_information, | ||
| answer_prompt=PromptConfig.answer_prompt_EN, | ||
| keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN, | ||
| gremlin_tmpl_num=-1, | ||
| gremlin_prompt=gremlin_prompt, | ||
| ) | ||
| assert res is not None, "The result of RAG flow should not be None" | ||
|
|
||
| raw_answer = False | ||
| vector_only_answer = False | ||
| graph_only_answer = True | ||
| graph_vector_answer = False | ||
| res = self.scheduler.schedule_flow( | ||
| FlowName.RAG_GRAPH_ONLY, | ||
| query=query, | ||
| vector_search=vector_search, | ||
| graph_search=graph_search, | ||
| raw_answer=raw_answer, | ||
| vector_only_answer=vector_only_answer, | ||
| graph_only_answer=graph_only_answer, | ||
| graph_vector_answer=graph_vector_answer, | ||
| graph_ratio=graph_ratio, | ||
| rerank_method=rerank_method, | ||
| near_neighbor_first=near_neighbor_first, | ||
| custom_related_information=custom_related_information, | ||
| answer_prompt=PromptConfig.answer_prompt_EN, | ||
| keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN, | ||
| gremlin_tmpl_num=-1, | ||
| gremlin_prompt=gremlin_prompt, | ||
| ) | ||
| assert res is not None, "The result of RAG flow should not be None" | ||
|
|
||
| raw_answer = False | ||
| vector_only_answer = False | ||
| graph_only_answer = False | ||
| graph_vector_answer = True | ||
| res = self.scheduler.schedule_flow( | ||
| FlowName.RAG_GRAPH_VECTOR, | ||
| query=query, | ||
| vector_search=vector_search, | ||
| graph_search=graph_search, | ||
| raw_answer=raw_answer, | ||
| vector_only_answer=vector_only_answer, | ||
| graph_only_answer=graph_only_answer, | ||
| graph_vector_answer=graph_vector_answer, | ||
| graph_ratio=graph_ratio, | ||
| rerank_method=rerank_method, | ||
| near_neighbor_first=near_neighbor_first, | ||
| custom_related_information=custom_related_information, | ||
| answer_prompt=PromptConfig.answer_prompt_EN, | ||
| keywords_extract_prompt=PromptConfig.keywords_extract_prompt_EN, | ||
| gremlin_tmpl_num=-1, | ||
| gremlin_prompt=gremlin_prompt, | ||
| ) | ||
| assert res is not None, "The result of RAG flow should not be None" | ||
|
|
||
| def test_build_example_index(self): | ||
| res = build_example_vector_index(None) | ||
| assert "embed_dim" in res, "The result of build_example_vector_index should contain embed_dim" | ||
|
|
||
| def test_text_2_gremlin(self): | ||
| query = "梁漱溟和梁济的关系是什么?" | ||
| schema = "hugegraph" | ||
| example_num = 2 | ||
|
|
||
| res = self.scheduler.schedule_flow( | ||
| FlowName.TEXT2GREMLIN, query, example_num, schema, PromptConfig.gremlin_generate_prompt_EN, None | ||
| ) | ||
|
|
||
| assert res is not None, "The result of TEXT2GREMLIN flow should not be None" | ||
Dependencies are managed with uv, so the primary command in CONTRIBUTING.md should be standardized as `uv run pytest`. Both `hugegraph-llm/AGENTS.md` and `.github/workflows/hugegraph-llm.yml:82` use `uv run pytest`. The `python -m pytest` on line 23 of this file falls back to the system Python when a contributor has not activated `.venv` correctly, or when `.venv` was created without `--extra llm`, and the result is a baffling import error. CONTRIBUTING.md is the contributor's entry point, so its commands must match CI.
In addition, because conftest.py defaults to `SKIP_EXTERNAL_SERVICES=true`, this whole test suite may be skipped when a contributor runs the command as written. The documentation needs to say so:
    # Explicitly enable the external-service tests
    SKIP_EXTERNAL_SERVICES=false uv run pytest src/tests/integration/test_flows_integration.py -v