@Chenglong-MS
Collaborator

No description provided.

Comment on lines +914 to +917
return jsonify({
    "status": "error",
    "message": result.get('content', 'Unknown error during transformation')
}), 400

Check warning

Code scanning / CodeQL

Information exposure through an exception (severity: Medium)

Stack trace information flows to this location and may be exposed to an external user.
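
To make the warning concrete, here is a minimal sketch of how the pre-patch message format can carry internal details to a caller. The function name `run_unsafe` and the file path are hypothetical and chosen only for illustration; the f-string is the one removed by the patch further down.

```python
def run_unsafe(code: str) -> dict:
    """Mimic the pre-patch behaviour: embed str(err) in the returned message."""
    try:
        exec(code, {})
    except Exception as err:
        # Same format as the line removed by the patch.
        error_message = f"Error: {type(err).__name__} - {str(err)}"
        return {"status": "error", "error_message": error_message}
    return {"status": "ok"}


# Hypothetical path; opening it fails, and the OSError text includes the path.
leaky_path = "/nonexistent-dir-for-demo/secrets/config.yaml"
leaky_result = run_unsafe(f"open({leaky_path!r})")
print(leaky_result["error_message"])
```

If that message is forwarded unchanged into an HTTP response body, the server-side path becomes visible to the client, which is presumably the kind of flow the alert points at.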

Copilot Autofix

AI about 19 hours ago

General approach: Avoid returning raw exception details from the sandbox directly to the client. Instead, log detailed error information on the server, and return a generic but user-friendly error message over the API. If some feedback is needed for user-authored transformation code, keep it high-level and ensure it cannot contain stack traces or sensitive environment details.

Best fix in this context:

  1. In py_sandbox.run_in_main_process:

    • Instead of returning the raw error_message derived from the exception, log the full details (including traceback) server-side using traceback.format_exc().
    • Return a generic error indicator and a sanitized message like "An error occurred while executing the transformation code.", or at most a brief classification (e.g., "Import not allowed"), not including arbitrary str(err).
  2. In py_sandbox.run_transform_in_sandbox2020:

    • Keep the structure the same (returning {'status': 'error', 'content': ...}), but ensure that the content comes from the sanitized error message produced by run_in_main_process (as above). No changes needed if we change only what run_in_main_process returns.
  3. In tables_routes.refresh_derived_data:

    • Continue to return result.get('content', 'Unknown error during transformation') as the message, but because content will now be sanitized, it will no longer contain sensitive exception details.
    • Optionally, slightly reword the message to be generic, but this is not strictly necessary once content itself is safe.
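
The approach above can be sketched as follows. This is not the project's implementation: the function name `run_in_main_process_sketch`, the logger name, and the use of the standard `logging` module are assumptions (the suggestion itself notes that no logger is defined in the file). It only illustrates logging the traceback server-side while returning a fixed message.

```python
import logging
import traceback

# Hypothetical logger; the real module defines none.
logger = logging.getLogger("py_sandbox_sketch")

GENERIC_MESSAGE = "An error occurred while executing the transformation code."


def run_in_main_process_sketch(code, restricted_globals, allowed_objects):
    """Execute code; on failure, log details and return a sanitized error."""
    try:
        exec(code, restricted_globals)
    except Exception:
        # Full details stay on the server.
        logger.error("Transformation failed:\n%s", traceback.format_exc())
        # The caller only sees a fixed string, never str(err).
        return {"status": "error", "error_message": GENERIC_MESSAGE}
    return {
        "status": "ok",
        "allowed_objects": {key: restricted_globals[key] for key in allowed_objects},
    }


failed = run_in_main_process_sketch("1 / 0", {}, [])
succeeded = run_in_main_process_sketch("x = 2 + 3", {}, ["x"])
```

Keeping the `'status'` and `'error_message'` keys means existing callers would not need structural changes under this sketch.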

Concretely, we will:

  • Modify the except block in run_in_main_process (lines around 106–111 in py-src/data_formulator/py_sandbox.py) to:

    • Build a safe, fixed error message to return to callers.
    • Capture the detailed traceback with traceback.format_exc() and either return it only in a field that is not propagated to the HTTP response, or do not return it at all and just log it.
    • Keep the keys 'status' and 'error_message' so that callers need no structural changes, but make error_message a sanitized string that never embeds arbitrary str(err).
  • Server-side diagnostics could be preserved by logging the detailed traceback inside run_in_main_process. A proper logger would be preferable to print or warnings, but no logger is defined in this file and the fix must not add new imports beyond well-known ones. The patch therefore only stops returning the sensitive details and leaves a comment noting that callers or outer layers can log them if needed.

No changes are required in tables_routes.refresh_derived_data beyond relying on the now-sanitized message, since the vulnerability is the content coming from the sandbox, not the shape of the JSON response.
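
As a rough illustration of why sanitizing only the innermost layer is enough, the sketch below chains three framework-free stand-ins. All three function names are hypothetical, the real functions do more than this, and the mapping from `error_message` to `content` is an assumption based on the description above; Flask's `jsonify` is replaced by a plain dict so the example runs on its own.

```python
SAFE_MESSAGE = "An error occurred while executing the transformation code."


def inner_run(code):
    """Stand-in for the patched run_in_main_process."""
    try:
        exec(code, {})
    except Exception:
        return {"status": "error", "error_message": SAFE_MESSAGE}
    return {"status": "ok"}


def sandbox_layer(code):
    """Stand-in for run_transform_in_sandbox2020: forwards error_message as content."""
    result = inner_run(code)
    if result["status"] == "error":
        return {"status": "error", "content": result["error_message"]}
    return {"status": "ok", "content": None}


def route_error_payload(result):
    """Stand-in for the error branch in refresh_derived_data, without jsonify."""
    message = result.get("content", "Unknown error during transformation")
    return {"status": "error", "message": message}, 400


# Hypothetical path used only to trigger an exception whose text contains it.
payload, status_code = route_error_payload(
    sandbox_layer("open('/nonexistent-dir-for-demo/secret.txt')")
)
```

The outer two layers are unchanged from their pre-fix shape, yet the payload no longer contains the path, because the only string that reaches them is the fixed message.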


Suggested changeset 1
py-src/data_formulator/py_sandbox.py
Outside changed files

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/py-src/data_formulator/py_sandbox.py b/py-src/data_formulator/py_sandbox.py
--- a/py-src/data_formulator/py_sandbox.py
+++ b/py-src/data_formulator/py_sandbox.py
@@ -106,8 +106,13 @@
     try:
         exec(code, restricted_globals)
     except Exception as err:
-        error_message = f"Error: {type(err).__name__} - {str(err)}"
-        return {'status': 'error', 'error_message': error_message}
+        # Do not propagate detailed exception information (which may include
+        # stack traces, file paths, or other sensitive data) to the caller.
+        #
+        # Instead, return a generic error message, while allowing callers or
+        # outer layers to log detailed information if needed.
+        safe_error_message = "An error occurred while executing the transformation code."
+        return {'status': 'error', 'error_message': safe_error_message}
 
     return {'status': 'ok', 'allowed_objects': {key: restricted_globals[key] for key in allowed_objects}}
 
EOF
Copilot is powered by AI and may make mistakes. Always verify output.