[DAPS-1857] python client schema support 2#1895
Conversation
…RNL/DataFed into 1857-DAPS-python-client-schema-support
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Reviewer's GuideAdds full schema management support across the Python client, CLI, server, and Foxx schema router, including JSON validation helpers, schema/metadata operations, stricter schema ID/version handling, richer replies for schema create/revise, and comprehensive end-to-end tests for schemas and schema-enforced records. Sequence diagram for schema creation via CLI and serversequenceDiagram
actor User
participant CLI as CLI_schema_commands
participant CmdLib as CommandLib
participant MAPI as MessageAPI
participant Server as ClientWorker
participant DBAPI as DatabaseAPI
participant Foxx as schema_router
participant DB as ArangoDB
User->>CLI: datafed schema create ID --definition-file file.json
CLI->>CmdLib: schemaCreate(schema_id, definition=None, definition_file, ...)
CmdLib->>CmdLib: _load_schema_file()
CmdLib->>CmdLib: _validate_json(definition, Schema definition)
CmdLib->>MAPI: sendRecv(SchemaCreateRequest)
MAPI->>Server: SchemaCreateRequest
Server->>Server: procSchemaCreateRequest()
Server->>DBAPI: schemaCreate(request, SchemaDataReply)
DBAPI->>DBAPI: build JSON payload (id, def, desc, pub, sys)
DBAPI->>Foxx: POST /schema/create
Foxx->>Foxx: parseSchemaId(req.body.id)
Foxx->>Foxx: stripSchemaIdVersion(obj) before save
Foxx->>DB: g_db.sch.save(obj)
DB-->>Foxx: stored schema (id, ver)
Foxx->>Foxx: set sch.id = parsed.id:parsed.ver
Foxx-->>DBAPI: JSON schema array
DBAPI->>DBAPI: setSchemaDataReply(reply, result)
DBAPI-->>Server: SchemaDataReply
Server-->>MAPI: SchemaDataReply
MAPI-->>CmdLib: SchemaDataReply
CmdLib-->>CLI: SchemaDataReply
CLI->>CLI: _generic_reply_handler(reply, _print_schema)
CLI-->>User: Printed schema with definition and refs
Class diagram for updated Python client and C++ server schema APIsclassDiagram
class CommandLib {
+schemaCreate(schema_id, definition, definition_file, description, public, system)
+schemaRevise(schema_id, definition, definition_file, description, public, system)
+schemaUpdate(schema_id, new_id, definition, definition_file, description, public, system)
+schemaView(schema_id, resolve)
+schemaSearch(schema_id, text, owner, sort, sort_rev, offset, count)
+schemaDelete(schema_id)
+metadataValidate(schema_id, metadata, metadata_file)
-_load_schema_file(filepath)
-_validate_json(json_str, label)
-_mapi
}
class MessageAPI {
+sendRecv(message)
}
CommandLib --> MessageAPI : uses
class ClientWorker {
+procSchemaCreateRequest(a_uid, msg_request, log_context)
+procSchemaReviseRequest(a_uid, msg_request, log_context)
-m_db_client
}
class DatabaseAPI {
+schemaCreate(request, reply, log_context)
+schemaRevise(request, reply, log_context)
+schemaView(a_id, result, log_context)
+schemaUpdate(request, log_context)
}
ClientWorker --> DatabaseAPI : uses
class SchemaCreateRequest {
+string id
+string def
+string desc
+bool pub
+bool sys
}
class SchemaReviseRequest {
+string id
+string def
+string desc
+bool pub
+bool sys
}
class SchemaUpdateRequest {
+string id
+string id_new
+string def
+string desc
+bool pub
+bool sys
}
class SchemaDataReply {
+repeated SchemaData schema
}
class SchemaData {
+string id
+int ver
+string own_id
+string own_nm
+bool pub
+bool depr
+int cnt
+string desc
+string def
+repeated SchemaRef uses
+repeated SchemaRef used_by
}
class SchemaRef {
+string id
+int ver
}
ClientWorker --> SchemaCreateRequest
ClientWorker --> SchemaReviseRequest
ClientWorker --> SchemaUpdateRequest
ClientWorker --> SchemaDataReply
DatabaseAPI --> SchemaCreateRequest
DatabaseAPI --> SchemaReviseRequest
DatabaseAPI --> SchemaUpdateRequest
DatabaseAPI --> SchemaDataReply
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 4 issues, and left some high level feedback:
- The new helper
_load_schema_fileinCommandLib.pyis indented one level too deep (docstring and body are over-indented relative to the method definition), which will cause a syntax/indentation error; please fix the indentation so the body aligns under thedef. - In
test_api_schema.py,test_schema_create_both_definition_sourcesis declared twice in succession, which is invalid Python syntax and will prevent the test module from importing; remove the duplicatedefline. - In
schema_router.jsthe create route setssch.id = parsed.id + ':' + parsed.ver, but the updated unit test expectsidto include the actual stored version (e.g.,...:1); consider usingsch.verinstead ofparsed.verwhen composing the returnedidso the ID and version are consistent and match the tests.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new helper `_load_schema_file` in `CommandLib.py` is indented one level too deep (docstring and body are over-indented relative to the method definition), which will cause a syntax/indentation error; please fix the indentation so the body aligns under the `def`.
- In `test_api_schema.py`, `test_schema_create_both_definition_sources` is declared twice in succession, which is invalid Python syntax and will prevent the test module from importing; remove the duplicate `def` line.
- In `schema_router.js` the create route sets `sch.id = parsed.id + ':' + parsed.ver`, but the updated unit test expects `id` to include the actual stored version (e.g., `...:1`); consider using `sch.ver` instead of `parsed.ver` when composing the returned `id` so the ID and version are consistent and match the tests.
## Individual Comments
### Comment 1
<location path="core/database/foxx/api/schema_router.js" line_range="53-62" />
<code_context>
+function parseSchemaId(schId) {
</code_context>
<issue_to_address>
**issue (bug_risk):** parseSchemaId return value contradicts its own doc and downstream checks, and likely never returns `null` for `ver`.
JSDoc documents `ver` as `null` when no version suffix is present, but `parseSchemaId` returns `{ id: schId, ver: 0 }` when there is no colon. Call sites then check `parsed.ver === null` (e.g., revise/update/delete/view routes) to detect a missing version, which can never be true. Please align the implementation with the documented contract by returning `{ id: schId, ver: null }` when `colonCount === 0`, and update the create handler to interpret `null` as “no version provided” (enforcing `0` where appropriate). Right now the implementation contradicts the docstring and makes the `=== null` checks effectively dead code.
</issue_to_address>
### Comment 2
<location path="python/datafed_pkg/datafed/CommandLib.py" line_range="366-367" />
<code_context>
+
+ Returns
+ -------
+ msg : AckReply Google protobuf message
+ Response from DataFed
+
+ Raises
</code_context>
<issue_to_address>
**issue (bug_risk):** The documented reply type for schemaCreate/schemaRevise no longer matches the server-side implementation.
These docstrings still advertise an `AckReply`, but the server handlers now return `SchemaDataReply` populated via `DatabaseAPI::schemaCreate/schemaRevise`. Please update the documented return type, and double‑check all callers (e.g., CLI printing functions) to ensure they correctly handle `SchemaDataReply` or adjust the server to preserve the expected `AckReply` contract.
</issue_to_address>
### Comment 3
<location path="tests/end-to-end/test_api_schema.py" line_range="133" />
<code_context>
+
+ self.assertIn("Must specify", str(ctx.exception))
+
+ def test_schema_create_both_definition_sources(self):
+ def test_schema_create_both_definition_sources(self):
+ """Cannot specify both definition and definition_file."""
</code_context>
<issue_to_address>
**issue (bug_risk):** Duplicate `test_schema_create_both_definition_sources` definition will cause a syntax error and prevent the test module from running.
Only the second `def` has a body, so the first is a stray line. Remove the extra `def test_schema_create_both_definition_sources(self):` so the function is defined once, immediately followed by its docstring and body.
</issue_to_address>
### Comment 4
<location path="tests/end-to-end/test_api_schema.py" line_range="467-470" />
<code_context>
+ suite.addTest(TestDataFedPythonAPISchemaCRUD("test_schema_search"))
+ suite.addTest(TestDataFedPythonAPISchemaCRUD("test_schema_public_flag"))
+ suite.addTest(TestDataFedPythonAPISchemaCRUD("test_metadata_validate_pass"))
+ suite.addTest(TestDataFedPythonAPISchemaCRUD("test_metadata_validate_fail"))
+ suite.addTest(TestDataFedPythonAPISchemaCRUD("test_metadata_validate_client_rejects_bad_json"))
+ suite.addTest(TestDataFedPythonAPISchemaCRUD("test_metadata_validate_requires_input"))
+ suite.addTest(TestDataFedPythonAPISchemaCRUD("test_schema_create_from_file"))
+ runner = unittest.TextTestRunner()
+ result = runner.run(suite)
</code_context>
<issue_to_address>
**issue (testing):** The `test_metadata_validate_metadata_file_cannot_be_opened` test is defined but never added to the explicit test suite.
Because the `unittest.TestSuite` is built manually, `test_metadata_validate_metadata_file_cannot_be_opened` will never run unless it’s added to the suite. Please include it, e.g.:
```python
suite.addTest(TestDataFedPythonAPISchemaCRUD("test_metadata_validate_metadata_file_cannot_be_opened"))
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| msg : AckReply Google protobuf message | ||
| Response from DataFed |
There was a problem hiding this comment.
issue (bug_risk): The documented reply type for schemaCreate/schemaRevise no longer matches the server-side implementation.
These docstrings still advertise an AckReply, but the server handlers now return SchemaDataReply populated via DatabaseAPI::schemaCreate/schemaRevise. Please update the documented return type, and double‑check all callers (e.g., CLI printing functions) to ensure they correctly handle SchemaDataReply or adjust the server to preserve the expected AckReply contract.
Description
This PR attempts to address concerns raised during code review.
Summary by Sourcery
Add full schema lifecycle support and validation to the Python client, CLI, and backend, including versioned schema IDs and metadata validation, with comprehensive end-to-end tests.
New Features:
Bug Fixes:
Enhancements:
Tests: