diff --git a/RPM-CVE-CHECKER.md b/RPM-CVE-CHECKER.md index 0b245c963..4907885de 100644 --- a/RPM-CVE-CHECKER.md +++ b/RPM-CVE-CHECKER.md @@ -119,19 +119,19 @@ Controls which build system is used for fetching SRPMs and build logs: ```yaml cve_source_acquisition: _type: cve_source_acquisition - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} cve_package_code_agent: _type: cve_package_code_agent - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} ``` | Profile | Hub | Build Logs | Use Case | |---------|-----|------------|----------| +| `external` | Fedora Koji (`koji.fedoraproject.org`) | Not fetched | Fedora packages (public access, default) | | `internal` | Red Hat Brew (`brewhub.engineering.redhat.com`) | Auto-fetched | RHEL packages (requires VPN) | -| `external` | Fedora Koji (`koji.fedoraproject.org`) | Not fetched | Fedora packages (public access) | -**Environment variable override:** Set `RPM_USER_TYPE=external` to override the config file value. +**Environment variable override:** Set `RPM_USER_TYPE=internal` to switch to Red Hat VPN mode. ### Cache Directories diff --git a/kustomize/README.md b/kustomize/README.md index 373017ef7..db39717d8 100644 --- a/kustomize/README.md +++ b/kustomize/README.md @@ -182,6 +182,11 @@ export CALLBACK_URL="https://exploit-iq-client.$(oc project -q).svc:8443" find . -type f -name 'exploit-iq-config.yml' -exec sed -i "s|CALLBACK_URL_PLACEHOLDER|$CALLBACK_URL|g" {} + ``` +### Step 8. Create a random 32-byte symmetric encryption key + +```shell +echo -n "credential-encryption-key=$(openssl rand -hex 32)" > base/credential-encryption-key.env +``` --- ## Selecting a Deployment Variant @@ -195,7 +200,54 @@ Exploit Intelligence supports the following deployment variants.
Run only one de | Remote NIM | `remote-nim-all` | NVIDIA-hosted NIM | You use NVIDIA-hosted inference | --- +### Configure a Custom Git Server CA TEMP Solution + +Complete this step if your Git server uses a certificate signed by a custom Certificate Authority (CA). + +> [!IMPORTANT] +> Complete this step if you access Red Hat internal Git repositories such as `gitlab.cee.redhat.com`. + +**1.** Create the certificate directory: +```shell +mkdir -p base/ca-certs +``` + +**2.** Obtain your CA certificates. + +For Red Hat internal Git repositories: + +```shell +curl -o base/ca-certs/internal-root-ca.pem \ + https://certs.corp.redhat.com/certs/2022-IT-Root-CA.pem + +curl -o base/ca-certs/rhcs-intermediate-ca.crt \ + https://certs.corp.redhat.com/chains/rhcs-ca-chain-2022-self-signed.crt +``` + +For other custom CAs: + +```shell +cp /path/to/your-custom-ca.pem base/ca-certs/ +``` + +**3.** Create the CA bundle: + +The bundle must include both system CA certificates (for public endpoints) and custom CA certificates (for internal services). This ensures the application can connect to both public services (such as package registries) and internal services (such as Red Hat Git repositories and OSIDB). + +```shell +cat /etc/ssl/certs/ca-certificates.crt base/ca-certs/*.{pem,crt} > base/ca-certs/ca-bundle.crt +``` + +**4.** Verify the bundle: + +```shell +openssl crl2pkcs7 -nocrl -certfile base/ca-certs/ca-bundle.crt | \ + openssl pkcs7 -print_certs -noout +``` + + +--- ## Deploying Exploit Intelligence ### Deploy with a Self-Hosted LLM @@ -344,55 +396,37 @@ oc set env deployment -l component=exploit-iq-client \ QUARKUS_SMALLRYE_OPENAPI_SERVERS=https://$(oc get route exploit-iq-client -o=jsonpath='{..spec.host}') ``` -### Configure a Custom Git Server CA - -Complete this step if your Git server uses a certificate signed by a custom Certificate Authority (CA). 
- -> [!IMPORTANT] -> Complete this step if you access Red Hat internal Git repositories such as `gitlab.cee.redhat.com`. - -**1.** Create the certificate directory: - -```shell -mkdir -p base/ca-certs -``` +### Configure for Red Hat VPN Environment (Internal Profile) -**2.** Obtain your CA certificates. +By default, the deployment uses the `external` profile, which connects to the public Fedora Koji servers without requiring a VPN. For Red Hat internal environments with VPN access, switch to the `internal` profile to enable: -For Red Hat internal Git repositories: +- Red Hat Brew server access (`brewhub.engineering.redhat.com`) +- OSIDB intelligence integration +- Build log auto-fetch -```shell -curl -o base/ca-certs/internal-root-ca.pem \ - https://certs.corp.redhat.com/certs/2022-IT-Root-CA.pem +**Requirements for internal profile:** -curl -o base/ca-certs/rhcs-intermediate-ca.crt \ - https://certs.corp.redhat.com/chains/rhcs-ca-chain-2022-self-signed.crt -``` +- Active Red Hat VPN connection +- SSL certificates configured (see [Configure a Custom Git Server CA](#configure-a-custom-git-server-ca-temp-solution) above) -For other custom CAs: +To switch to the internal profile after deployment: ```shell -cp /path/to/your-custom-ca.pem base/ca-certs/ +oc set env deployment/exploit-iq-service RPM_USER_TYPE=internal -n $YOUR_NAMESPACE_NAME ``` -**3.** Create the CA bundle: +To verify the current profile setting: ```shell -cat base/ca-certs/*.{pem,crt} > base/ca-certs/ca-bundle.crt +oc get deployment/exploit-iq-service -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="RPM_USER_TYPE")].value}' -n $YOUR_NAMESPACE_NAME ``` -**4.** Verify the bundle: +To switch back to the external profile (default): ```shell -openssl crl2pkcs7 -nocrl -certfile base/ca-certs/ca-bundle.crt | \ - openssl pkcs7 -print_certs -noout +oc set env deployment/exploit-iq-service RPM_USER_TYPE=external -n $YOUR_NAMESPACE_NAME ``` -Then redeploy: - -```shell -oc kustomize overlays/ | oc apply -f - -n
$YOUR_NAMESPACE_NAME -``` ### Configure the OAuth Endpoint CA diff --git a/kustomize/base/ca-certs/README.md b/kustomize/base/ca-certs/README.md index 6b6ea70b6..d39795131 100644 --- a/kustomize/base/ca-certs/README.md +++ b/kustomize/base/ca-certs/README.md @@ -1,9 +1,14 @@ # Git SSL CA Certificates Directory -This directory contains Certificate Authority (CA) certificates for Git SSL verification. +This directory contains Certificate Authority (CA) certificates for SSL verification. ## Purpose +This directory stores CA certificates used by the Exploit Intelligence application for SSL/TLS verification. The `ca-bundle.crt` file must contain: + +1. **System CA certificates** - Required for connecting to public endpoints (package registries, external APIs) +2. **Custom CA certificates** - Required for internal services (Red Hat Git repositories, OSIDB, Brew) + If your Git server uses certificates signed by a custom CA, place the CA certificate files in this directory. The deployment process creates a ConfigMap from these certificates and mounts it to the ExploitIQ pod. ## Instructions @@ -17,6 +22,16 @@ For Red Hat internal Git repositories (gitlab.cee.redhat.com), download: - Root CA: - Intermediate CA: +## Creating the CA Bundle + +After downloading custom CA certificates, create the merged bundle: + +```shell +cat /etc/ssl/certs/ca-certificates.crt *.{pem,crt} > ca-bundle.crt +``` + +This includes both system CAs (for public endpoints) and custom CAs (for internal services). + ## Security **Do not commit certificate files to version control.** These files are gitignored to prevent accidental exposure of internal CA certificates in public repositories. 
diff --git a/kustomize/base/exploit-iq-config.yml b/kustomize/base/exploit-iq-config.yml index e5a9363bd..e3c9658ad 100644 --- a/kustomize/base/exploit-iq-config.yml +++ b/kustomize/base/exploit-iq-config.yml @@ -46,7 +46,7 @@ functions: ignore_code_embedding: true cve_fetch_intel: _type: cve_fetch_intel - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} retry_on_client_errors: false intel_plugin_config: plugin_name: vuln_analysis.data_models.plugins.intel_plugin.SimpleHttpIntelPlugin @@ -184,7 +184,7 @@ functions: base_pickle_dir: ${EXPLOIT_IQ_DATA_DIR:-/exploit-iq-data/}pickle base_rpm_dir: ${EXPLOIT_IQ_DATA_DIR:-/exploit-iq-data/}rpms base_checker_dir: ${EXPLOIT_IQ_DATA_DIR:-/exploit-iq-data/}checker - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} cve_checker_segmentation: _type: cve_checker_segmentation base_checker_dir: ${EXPLOIT_IQ_DATA_DIR:-/exploit-iq-data/}checker @@ -194,7 +194,7 @@ functions: llm_name: cve_agent_executor_llm base_checker_dir: ${EXPLOIT_IQ_DATA_DIR:-/exploit-iq-data/}checker base_code_index_dir: ${EXPLOIT_IQ_DATA_DIR:-/exploit-iq-data/}code_index - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} tool_names: - Source Grep - Code Keyword Search diff --git a/kustomize/base/exploit_iq_service.yaml b/kustomize/base/exploit_iq_service.yaml index 2f99c7411..e4e3de7ac 100644 --- a/kustomize/base/exploit_iq_service.yaml +++ b/kustomize/base/exploit_iq_service.yaml @@ -137,7 +137,7 @@ spec: - name: EXPLOIT_IQ_DATA_DIR value: /exploit-iq-data/ - name: RPM_USER_TYPE - value: "internal" + value: "external" - name: NAMESPACE valueFrom: fieldRef: @@ -155,6 +155,10 @@ spec: value: https://exploit-iq-client.$(NAMESPACE).svc.cluster.local:8443 - name: JAVA_MAVEN_DEFAULT_SETTINGS_FILE_PATH value: /maven-config/settings.xml + - name: REQUESTS_CA_BUNDLE + value: /app/git-ca-bundle/ca-bundle.crt + - name: SSL_CERT_FILE + value: 
/app/git-ca-bundle/ca-bundle.crt volumeMounts: - name: config mountPath: /configs diff --git a/kustomize/base/nginx/templates/routes/nim.conf.template b/kustomize/base/nginx/templates/routes/nim.conf.template index 13bd6844a..9c6787d3e 100644 --- a/kustomize/base/nginx/templates/routes/nim.conf.template +++ b/kustomize/base/nginx/templates/routes/nim.conf.template @@ -27,6 +27,8 @@ location ~* ^/nim_embed/v1/embeddings$ { proxy_cache llm_cache; proxy_cache_methods GET POST; proxy_cache_key "$request_method|$request_uri|$request_body"; + proxy_cache_valid 200 201 202 14d; + proxy_cache_valid any 0; add_header X-Cache-Status $upstream_cache_status; client_body_buffer_size 4m; } diff --git a/kustomize/config-http-openai-local.yml b/kustomize/config-http-openai-local.yml index 99cd4c809..7e7fe0789 100644 --- a/kustomize/config-http-openai-local.yml +++ b/kustomize/config-http-openai-local.yml @@ -45,7 +45,7 @@ functions: ignore_code_embedding: true cve_fetch_intel: _type: cve_fetch_intel - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} retry_on_client_errors: false intel_plugin_config: plugin_name: vuln_analysis.data_models.plugins.intel_plugin.SimpleHttpIntelPlugin diff --git a/kustomize/overlays/self-hosted-llama3.1-70b-4bit/nginx-patch.yaml b/kustomize/overlays/self-hosted-llama3.1-70b-4bit/nginx-patch.yaml index 8e3fb5e70..68318a632 100644 --- a/kustomize/overlays/self-hosted-llama3.1-70b-4bit/nginx-patch.yaml +++ b/kustomize/overlays/self-hosted-llama3.1-70b-4bit/nginx-patch.yaml @@ -11,6 +11,6 @@ spec: - name: NGINX_UPSTREAM_NIM_LLM value: http://llama3-1-70b-instruct-4bit.exploit-iq-models.svc.cluster.local:8000 - name: NGINX_UPSTREAM_NIM_EMBED - value: http://llama3-1-70b-instruct-4bit.exploit-iq-models.svc.cluster.local:8000 + value: http://nim-embed.exploit-iq-models.svc.cluster.local:8000 - name: NGINX_UPSTREAM_OPENAI value: http://llama3-1-70b-instruct-4bit.exploit-iq-models.svc.cluster.local:8000 diff --git 
a/src/vuln_analysis/configs/config-http-nim.yml b/src/vuln_analysis/configs/config-http-nim.yml index 23a87e85e..21b19db50 100644 --- a/src/vuln_analysis/configs/config-http-nim.yml +++ b/src/vuln_analysis/configs/config-http-nim.yml @@ -39,7 +39,7 @@ functions: ignore_code_embedding: true cve_fetch_intel: _type: cve_fetch_intel - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} intel_plugin_config: plugin_name: vuln_analysis.data_models.plugins.intel_plugin.SimpleHttpIntelPlugin plugin_config: diff --git a/src/vuln_analysis/configs/config-http-openai.yml b/src/vuln_analysis/configs/config-http-openai.yml index 08a5b20e3..6b9853138 100644 --- a/src/vuln_analysis/configs/config-http-openai.yml +++ b/src/vuln_analysis/configs/config-http-openai.yml @@ -46,7 +46,7 @@ functions: ignore_code_embedding: true cve_fetch_intel: _type: cve_fetch_intel - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} intel_plugin_config: plugin_name: vuln_analysis.data_models.plugins.intel_plugin.SimpleHttpIntelPlugin plugin_config: @@ -175,7 +175,7 @@ functions: base_git_dir: .cache/am_cache/git base_pickle_dir: .cache/am_cache/pickle base_rpm_dir: .cache/am_cache/rpms - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} cve_checker_segmentation: _type: cve_checker_segmentation base_checker_dir: .cache/am_cache/checker @@ -185,7 +185,7 @@ functions: llm_name: cve_agent_executor_llm base_checker_dir: .cache/am_cache/checker base_code_index_dir: .cache/am_cache/code_index - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} tool_names: - Source Grep - Code Keyword Search diff --git a/src/vuln_analysis/configs/config-tracing.yml b/src/vuln_analysis/configs/config-tracing.yml index c464451f2..d03cf5b5d 100644 --- a/src/vuln_analysis/configs/config-tracing.yml +++ b/src/vuln_analysis/configs/config-tracing.yml @@ -50,7 +50,7 @@ functions: 
ignore_code_embedding: true cve_fetch_intel: _type: cve_fetch_intel - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} intel_plugin_config: plugin_name: vuln_analysis.data_models.plugins.intel_plugin.SimpleHttpIntelPlugin plugin_config: diff --git a/src/vuln_analysis/configs/config.yml b/src/vuln_analysis/configs/config.yml index f93258a84..c67d346b0 100644 --- a/src/vuln_analysis/configs/config.yml +++ b/src/vuln_analysis/configs/config.yml @@ -38,7 +38,7 @@ functions: base_pickle_dir: .cache/am_cache/pickle cve_fetch_intel: _type: cve_fetch_intel - rpm_user_type: ${RPM_USER_TYPE:-internal} + rpm_user_type: ${RPM_USER_TYPE:-external} cve_process_sbom: _type: cve_process_sbom cve_verify_vuln_package: diff --git a/src/vuln_analysis/functions/base_graph_agent.py b/src/vuln_analysis/functions/base_graph_agent.py index 05ab11387..b21ab2dc1 100644 --- a/src/vuln_analysis/functions/base_graph_agent.py +++ b/src/vuln_analysis/functions/base_graph_agent.py @@ -147,7 +147,7 @@ async def _select_package( image_name = image_input.name image_repo = image_input.source_info[0].git_repo if image_input.source_info else None - matched = _find_image_matching_candidate(candidate_packages, image_name, image_repo) + matched = _find_image_matching_candidate(candidate_packages, image_name, image_repo, ecosystem) if matched: selected_package = matched logger.info("Package filter matched '%s' from image/repo (no LLM call needed, %d candidates)", diff --git a/src/vuln_analysis/functions/cve_agent.py b/src/vuln_analysis/functions/cve_agent.py index ac2a1d00b..e9c559f8c 100644 --- a/src/vuln_analysis/functions/cve_agent.py +++ b/src/vuln_analysis/functions/cve_agent.py @@ -123,7 +123,9 @@ async def _process_steps(agents: dict, routing_llm, steps, semaphore, max_iterat has_reachability = any(r.agent_type == "reachability" for r in routings) if not has_reachability and "reachability" in agents and candidate_packages: - pkg_name = 
candidate_packages[0].get("name", "") + ecosystem_matched = [p for p in candidate_packages if p.get("ecosystem", "").lower() == ecosystem.lower()] + pkg_source = ecosystem_matched if ecosystem_matched else candidate_packages + pkg_name = pkg_source[0].get("name", "") if pkg_name: synthetic_q = ( f"Is the code from the vulnerable package '{pkg_name}' actually " diff --git a/src/vuln_analysis/functions/react_internals.py b/src/vuln_analysis/functions/react_internals.py index 4a7c74913..4e0d0d5bd 100644 --- a/src/vuln_analysis/functions/react_internals.py +++ b/src/vuln_analysis/functions/react_internals.py @@ -756,18 +756,29 @@ def build_classification_prompt(context_block: str, question: str) -> str: ) +_LANGUAGE_ECOSYSTEMS = frozenset({"go", "python", "java", "javascript"}) + + def _find_image_matching_candidate( candidates: list[dict], image_name: str | None, image_repo: str | None, + ecosystem: str | None = None, ) -> str | None: - """Return the candidate name that appears in the image/repo string, or None.""" + """Return the candidate name that appears in the image/repo string, or None. + + For language ecosystems (Go, Python, Java, JavaScript), RHSA candidates are + skipped because they contain RPM product names rather than library paths. 
+ """ context = " ".join( part.lower() for part in (image_name, image_repo) if part ) if not context: return None + skip_rhsa = ecosystem and ecosystem.lower() in _LANGUAGE_ECOSYSTEMS for c in candidates: + if skip_rhsa and c.get("source") == "rhsa": + continue name = c["name"].lower() if len(name) >= 3 and name in context: return c["name"] diff --git a/src/vuln_analysis/tools/tests/test_transitive_code_search.py b/src/vuln_analysis/tools/tests/test_transitive_code_search.py index 80d3dd8f9..d11f9c22a 100644 --- a/src/vuln_analysis/tools/tests/test_transitive_code_search.py +++ b/src/vuln_analysis/tools/tests/test_transitive_code_search.py @@ -489,7 +489,9 @@ async def test_transitive_search_java_4(): @pytest.mark.asyncio async def test_java_script_transitive_search_1(): - """Test that runs with a real repository""" + """Test that runs with a real repository (clones from GitHub)""" + pytest.skip("Slow integration test - temporarily disabled") + transitive_code_search_runner_coroutine = await get_transitive_code_runner_function() logging.basicConfig(level=logging.DEBUG) @@ -513,7 +515,9 @@ async def test_java_script_transitive_search_1(): @pytest.mark.asyncio async def test_java_script_transitive_search_2(): - """Test that runs with a real repository""" + """Test that runs with a real repository (clones from GitHub)""" + pytest.skip("Slow integration test - temporarily disabled") + transitive_code_search_runner_coroutine = await get_transitive_code_runner_function() logging.basicConfig(level=logging.DEBUG) diff --git a/tests/test_base_graph_agent.py b/tests/test_base_graph_agent.py index 834c84cf7..ea1fe6378 100644 --- a/tests/test_base_graph_agent.py +++ b/tests/test_base_graph_agent.py @@ -628,26 +628,6 @@ def _make_workflow_state(self, image_name="registry.redhat.io/openshift4/ose-doc ws.original_input.input.image = image return ws - @pytest.mark.asyncio - async def test_image_match_skips_llm(self): - """When a candidate name matches the image/repo, LLM is not 
called.""" - agent = _make_agent() - candidates = [ - {"name": "builder", "source": "rhsa"}, - {"name": "kernel", "source": "rhsa"}, - {"name": "glibc", "source": "rhsa"}, - ] - ws = self._make_workflow_state() - - with patch("vuln_analysis.utils.intel_utils.filter_context_to_package", - side_effect=lambda ctx, pkg, cands: ctx): - ctx, selected = await agent._select_package( - "go", candidates, ["CVE desc"], ws, - ) - - assert selected == "builder" - agent.package_filter_llm.ainvoke.assert_not_called() - @pytest.mark.asyncio async def test_no_match_calls_llm(self): """When no candidate matches the image, LLM is called.""" @@ -673,23 +653,6 @@ async def test_no_match_calls_llm(self): assert selected == "xstream" agent.package_filter_llm.ainvoke.assert_called_once() - @pytest.mark.asyncio - async def test_image_match_with_many_candidates(self): - """1000+ candidates with image match -> LLM skipped, no overflow.""" - agent = _make_agent() - candidates = [{"name": f"rhsa-product-{i}", "source": "rhsa"} for i in range(1200)] - candidates.append({"name": "builder", "source": "rhsa"}) - ws = self._make_workflow_state() - - with patch("vuln_analysis.utils.intel_utils.filter_context_to_package", - side_effect=lambda ctx, pkg, cands: ctx): - ctx, selected = await agent._select_package( - "go", candidates, ["CVE desc"], ws, - ) - - assert selected == "builder" - agent.package_filter_llm.ainvoke.assert_not_called() - @pytest.mark.asyncio async def test_single_candidate_no_llm(self): """Single candidate is used directly without LLM.""" diff --git a/tests/test_react_internals_rules.py b/tests/test_react_internals_rules.py index 73121eb5c..25602d3be 100644 --- a/tests/test_react_internals_rules.py +++ b/tests/test_react_internals_rules.py @@ -416,3 +416,42 @@ def test_first_match_wins(self): "https://github.com/openshift/builder", ) assert result == "openshift" + + def test_rhsa_skipped_for_language_ecosystem(self): + """RHSA candidates are skipped for language ecosystems (go, 
python, java, javascript).""" + candidates = [ + {"name": "openshift", "source": "rhsa"}, + {"name": "golang.org/x/crypto", "source": "ghsa"}, + ] + result = _find_image_matching_candidate( + candidates, "registry.redhat.io/openshift4/ose-docker-builder", + "https://github.com/openshift/builder", + ecosystem="go", + ) + assert result is None + + def test_rhsa_not_skipped_for_non_language_ecosystem(self): + """RHSA candidates are NOT skipped when ecosystem is not a language ecosystem.""" + candidates = [ + {"name": "openshift", "source": "rhsa"}, + {"name": "golang.org/x/crypto", "source": "ghsa"}, + ] + result = _find_image_matching_candidate( + candidates, "registry.redhat.io/openshift4/ose-docker-builder", + "https://github.com/openshift/builder", + ecosystem="rpm", + ) + assert result == "openshift" + + def test_ghsa_matched_when_rhsa_skipped(self): + """GHSA candidate is matched when RHSA is skipped and GHSA name appears in image.""" + candidates = [ + {"name": "openshift", "source": "rhsa"}, + {"name": "crypto", "source": "ghsa"}, + ] + result = _find_image_matching_candidate( + candidates, "registry.redhat.io/openshift4/crypto-service", + None, + ecosystem="go", + ) + assert result == "crypto"
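
Review note: the candidate-selection change in `react_internals.py` and `cve_agent.py` can be tried in isolation with the minimal sketch below. It is reconstructed from the hunks above, not copied from the repository. The trailing `return None` in the matcher is an assumption, because the hunk ends before the function does. `pick_fallback_package` is a hypothetical helper name for the three inline lines added to `_process_steps`. The `None` guards on `ecosystem` are also an addition of this sketch, not something the diff shows.

```python
# Sketch only: names without a leading underscore are illustrative stand-ins
# for the private helpers touched by the diff.
LANGUAGE_ECOSYSTEMS = frozenset({"go", "python", "java", "javascript"})


def find_image_matching_candidate(candidates, image_name, image_repo, ecosystem=None):
    """Return the first candidate name found in the image/repo string, or None."""
    context = " ".join(part.lower() for part in (image_name, image_repo) if part)
    if not context:
        return None
    # RHSA entries carry RPM product names, so they are skipped for language ecosystems.
    skip_rhsa = bool(ecosystem) and ecosystem.lower() in LANGUAGE_ECOSYSTEMS
    for c in candidates:
        if skip_rhsa and c.get("source") == "rhsa":
            continue
        name = c["name"].lower()
        if len(name) >= 3 and name in context:
            return c["name"]
    return None  # assumed: the hunk is cut off before the end of the function


def pick_fallback_package(candidate_packages, ecosystem):
    """Hypothetical helper: prefer candidates from the requested ecosystem.

    Mirrors the inline logic added to cve_agent.py, with extra guards
    (an assumption of this sketch) for a missing or None ecosystem.
    """
    wanted = (ecosystem or "").lower()
    matched = [p for p in candidate_packages
               if (p.get("ecosystem") or "").lower() == wanted]
    source = matched if matched else candidate_packages
    return source[0].get("name", "") if source else ""


if __name__ == "__main__":
    candidates = [
        {"name": "openshift", "source": "rhsa", "ecosystem": "rpm"},
        {"name": "golang.org/x/crypto", "source": "ghsa", "ecosystem": "Go"},
    ]
    image = "registry.redhat.io/openshift4/ose-docker-builder"
    repo = "https://github.com/openshift/builder"

    print(find_image_matching_candidate(candidates, image, repo, ecosystem="go"))   # None
    print(find_image_matching_candidate(candidates, image, repo, ecosystem="rpm"))  # openshift
    print(pick_fallback_package(candidates, "go"))  # golang.org/x/crypto
```

With `ecosystem="go"` the RHSA entry `openshift` is ignored even though it appears in the image name. This matches the new `test_rhsa_skipped_for_language_ecosystem` case and explains why the two removed `test_image_match_*` tests no longer hold for Go inputs.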