Skip to content

Conversation

@harshhsahu
Copy link
Collaborator

No description provided.

husainhackerrank-afk and others added 30 commits December 29, 2025 17:04
…or_dev

fix: include message_id in error responses for chat function
… save conversations to Hippocampus using `agent_id`.
…r_chabot_dev

feat: integrate Hippocampus API for saving chatbot conversations
refactor: remove environment variable for Hippocampus API URL and use…
feat: Add Firecrawl web crawling built-in tool with configurable API …
…search

fix: Correctly retrieve 'chatbot_auto_answers' from bridge_data inste…
…e2 service (#1143)

Co-authored-by: Husain Baghwala <husainhackerrank@gmail.com>
* feat: Enhance logging for saving conversations to Hippocampus in Queue2 service

* fix: Update logging message for saving conversations to Hippocampus in Queue2 service

* fix: Correct logging message for saving conversations to Hippocampus in Queue2 service

---------

Co-authored-by: Husain Baghwala <husainhackerrank@gmail.com>
…aving conversations to Hippocampus (#1145)

Co-authored-by: Husain Baghwala <husainhackerrank@gmail.com>
…_request_data_and_publish_sub_queue (#1148)

Co-authored-by: Husain Baghwala <husainhackerrank@gmail.com>
…delRouter and common services (#1152)

Co-authored-by: Husain Baghwala <husainhackerrank@gmail.com>
Co-authored-by: adityajunwal <adityakjunwal@gmail.com>
… 'images' key over 'user_urls' (#1154)

Co-authored-by: Husain Baghwala <husainhackerrank@gmail.com>
…h_tools_and_apikeys function (#1155)

Co-authored-by: Husain Baghwala <husainhackerrank@gmail.com>
#1141)

* feat: Integrate Hippocampus API for document retrieval and update parameters in RAG tool

* refactor: Update document handling in ConfigurationServices and improve descriptions in utils

* feat: Add owner_id parameter to RAG queries for multi-tenant document isolation

---------

Co-authored-by: Husain Baghwala <husainhackerrank@gmail.com>
Co-authored-by: Natwar589 <natwarrathor961@gmail.com>
ViaSocket-Git and others added 10 commits January 9, 2026 17:20
Co-authored-by: adityajunwal <adityakjunwal@gmail.com>
Co-authored-by: Husain Baghwala <89205542+Husainbw786@users.noreply.github.com>
…variable in GCP upload service

- Remove conditional checks for params and latency object creation
- Fix error response to use 'error' key instead of 'modelResponse'
- Add structured error handling for chained exceptions with combined error messages
- Include message_id in error objects for better tracking
- Add playground response format support for error responses
- Initialize filename variable in uploadDoc to prevent UnboundLocalError
…iables

- Added validation for batch_variables to ensure they are an array and match the length of the batch.
- Implemented variable replacement in prompts for each batch message, handling missing variables appropriately.
- Updated services to utilize processed prompts instead of the original system prompt.
- Enhanced error reporting for missing variables across batch items.
@walkover101
Copy link

Rejected.

Why (short):

  • Violates the rule: "All new code must be TypeScript" — this patch adds dozens of new Python modules. Unacceptable.
  • Inconsistent naming & casing (get_text_from_vectorsQuery, ownerId, inbuild_tools, Gtwy_Web_Search). Many identifiers are not self-explanatory or follow project conventions.
  • Duplicate batch implementations (OpenAI/Groq/Mistral/Gemini/Anthropic/Mistral/etc.) — copy/paste everywhere. Should be a single reusable Batch interface + provider adapters.
  • Missing/unclear dependencies and imports (Config, index, OpenAIEmbeddings, logger, client, uploadDoc signatures, fetch implementation). This will break at runtime.
  • Error handling/logging leaks and is inconsistent: raw tracebacks/print statements, printing exceptions with potential secrets, inconsistent HTTPException usage.
  • Security/concurrency issues: Redis lock is naive (set/delete race). Should use unique token + safe release (LUA) or Redlock library.
  • API contract inconsistencies: mixing bytes, URLs, base64 for uploads without clear, typed helpers. uploadDoc usage is inconsistent.
  • Naming fixes are partial (Anthropic rename ok), but many new files and exports lack clear purpose statements and types.
  • No tests, no type checking, no docs for new endpoints or behaviors (hippocampus integration, Firecrawl, built-in tools, owner_id logic).

What to do next (actionable):

  1. Revert Python additions. Implement new features in TypeScript as required by project policy.
  2. Create a single Batch abstraction (TypeScript) and provider adapters for each LLM/batch service — share upload/download/monitor logic.
  3. Standardize naming: snake_case for functions, camelCase for variables if project style requires, PascalCase for classes. Ensure every exported name is self-explanatory or has a one-line purpose.
  4. Replace naive Redis lock with proven pattern (unique token + compare-and-delete or Redlock). Add types.
  5. Centralize file upload/download helpers with typed interfaces; ensure consistent handling of url vs bytes vs base64.
  6. Remove prints; use structured logger and redact secrets.
  7. Add unit + integration tests for new features and CI type checks.
  8. Provide missing imports/types and validate runtime globals (Config, index, OpenAIEmbeddings, client, logger).

Fix these and resubmit. Fast, focused, no fluff.

@walkover101
Copy link

Reject.

Why — short and brutal:

  • You added a lot of new Python code. Developer rule: "All new code must be written in TypeScript." This alone is a hard blocker.
  • You removed the orchestrator implementation (big chunk) without a clear replacement — likely breaks orchestration features across the app.
  • get_text_from_vectorsQuery:
    • Hardcoded hippocampus URL; should come from Config/env.
    • Uses Config.HIPPOCAMPUS_API_KEY and fetch — ensure Config is imported and fetch signature matches (you didn't show imports).
    • Changed tool schema and removed MongoDB lookups; but tool clients (LLM functions / tool definitions) may still expect the old Document_id flow. Inconsistent contract.
  • Tool schema change for get_knowledge_base_data: required params changed to resource_id and collection_id — must ensure LLM function mapping and the actual functions that implement that tool are updated to match.
  • Image pipeline:
    • New gemini_image_utils is Python new code (violates TS rule).
    • handle_imagen_generation / handle_gemini_generation rely on google.genai types — verify API compatibility with bumped google-genai version.
  • Unconditional latency creation in image handler: create_latency_object(timer, params) is now called without checking params exist — possible NameError.
  • Error handling changes sometimes send result.get("error") where previous behavior used modelResponse — could return unexpected payloads.
  • Mongo/MongoId conversion logic in aggregation updated — check all input shapes because you now branch on "$$doc_id" being an object expecting resource_id field. Risk of pipeline failures if earlier shape persists.
  • Minor: file preview blacklist is aggressive; consider driving from accepted list instead of long blacklist.

Actionable fixes (do these, then resubmit):

  1. Convert all newly added modules/files to TypeScript per policy OR mark exception with clear justification and get approval.
  2. Ensure Config and fetch are imported where used; move hippocampus URL to Config.
  3. Reconcile tool schema change: update the tool implementation get_knowledge_base_data to accept resource_id & collection_id, and update any LLM tool definitions to match.
  4. Restore or replace orchestrator flow intentionally — don’t partially delete it. If removing, remove all references and tests that expect it.
  5. Guard create_latency_object calls with existence checks for params/local variable.
  6. Run an integration test for RAG tool calls and image generation (Gemini/Imagen) with the new google-genai version.
  7. Tighten file preview logic: prefer allow-list, or at least log/return consistent errors.

Fix the TS-rule violation first. Resubmit when the above are addressed.

…on mapping and updating query handling

- Updated `get_text_from_vectorsQuery` to accept an optional `resource_to_collection_mapping` parameter.
- Modified the logic to retrieve `collection_id` based on `resource_id` from the mapping.
- Adjusted the payload construction to include `collectionId` only if available.
- Cleaned up the `add_rag_tool` function to create a mapping of `resource_id` to `collection_id` and removed the requirement for `collection_id` in the tool's parameters.
- Updated the `Helper` class to remove references to `collection_id` in the prompt generation.
- Updated the variable name from `collection_description` to `resource_description` for better clarity in the prompt generation logic.
- Added import for the `resource` module to maintain consistency in the code structure.
@walkover101
Copy link

Rejected.

Fixes required before approval:

  1. Major policy break — new code added in Python. The project rule says "All new code must be written in TypeScript." You added new Python module(s) and Python changes. Convert new modules/logic to TypeScript or get explicit exception.

  2. Naming / consistency issues

    • get_text_from_vectorsQuery: camelCase in Python. Rename to snake_case (get_text_from_vectors_query) for consistency and clarity. Ensure all callers updated.
    • ownerId vs owner_id: inconsistent variable naming inside get_text_from_vectorsQuery. Use owner_id everywhere.
    • add_rag_tool description text is unclear/grammatically wrong. Make it explicit: "Use this function to fetch resource content by resource_id from the knowledge base."
    • rag_data vs doc_ids mismatch: you changed rag_data source to bridge.get('doc_ids') but other code expects rag_data. Either keep naming consistent or propagate name change everywhere.
  3. Unused / extraneous imports

    • Added import resource (crypto file) but it is unused. Remove it.
  4. Robustness / error handling

    • get_text_from_vectorsQuery assumes fetch returns (api_response, headers). Make return contract explicit and add type checks and retries/fail paths.
    • file_processing: rejecting a broad list of extensions is OK but message hard-coded; consider centralizing allowed-preview list and tests.
  5. Backwards compatibility

    • You removed orchestrator_chat implementations and references in some places — ensure no other modules still call it (search/CI tests). If removal intentional, update docs and remove dead imports.

Make these changes, convert new code to TypeScript (or get approved exception), and re-submit. Short. Fix it.

@harshhsahu harshhsahu merged commit f986884 into master Jan 15, 2026
4 checks passed
@walkover101
Copy link

Rejected.

Why (short and brutal):

  • You added a bunch of new modules/logic in Python — but the rule in this repo for all new code is TypeScript. That alone is a hard stop.
  • Naming inconsistencies that will cause runtime/contract bugs: owner_id vs ownerId, resource_id vs resourceId, payload keys vs function parameters — unify to one style and document the contract.
  • You changed the RAG tool contract (Document_id → resource_id) but didn’t update all call sites/consumers — this will break callers. Tool description text and prompt generation are inconsistent with how the data is looked up.
  • Large orchestration chunk removed (orchestrator_chat/orchestrator_agent_chat). If intentional, must be justified and reflected across the codebase and tests. If accidental, revert.
  • Error handling now returns combined raw exception strings to clients (may leak internals). Use structured errors and logging instead.
  • New utility file lacks TypeScript typing, exports, and tests. Also camelCase vs snake_case mismatch across modules.
  • Minor but important: ensure fetch/hippocampus contract (headers, payload fields, returned shape) is explicitly typed and validated.

What I want to see to approve (actionable checklist):

  1. Convert all newly added files to TypeScript with proper types/exports. No new Python files.
  2. Standardize naming (recommend snake_case for internal Python-style if you keep any Python — but repo rule is TypeScript). Pick one: owner_id OR ownerId — apply everywhere.
  3. Update and document the RAG tool contract (field names, required params), and update every call site.
  4. Restore or explicitly remove orchestration flows with tests and a clear migration note.
  5. Replace raw exception strings sent to users with structured error objects; log internals separately.
  6. Add unit/integration tests for:
    • get_text_from_vectorsQuery against mocked hippocampus response
    • gemini/imagen image flow (or TS equivalent)
    • RAG tool interface changes
  7. Run lint/typecheck and include output in PR.

Fix those, resubmit. Fast, precise, and no excuses.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants