feat: add AWS AMI deployment infrastructure#1170
Conversation
WalkthroughAdds an automated AMI build system: a new GitHub Actions workflow to deploy/trigger an EC2 Image Builder pipeline, monitor executions, capture AMI metadata and update releases; adds a Go CDK app and stack, CDK config/test files, docker-compose template, test harness script, docs, and module updates. Changes
Sequence Diagram(s)sequenceDiagram
participant Dev as Developer
participant GH as GitHub Actions
participant ReleaseWF as release.yaml (trigger-builds)
participant AMI_WF as ami-build.yml
participant CDK as ami-cdk.go (CDK App)
participant IB as EC2 Image Builder
participant Repo as GitHub API
Dev->>GH: push tag / create release
GH->>ReleaseWF: run build-release (tag)
ReleaseWF->>AMI_WF: workflow_dispatch (ref=tag)
ReleaseWF->>Repo: workflow_dispatch publish-node-image.yaml
AMI_WF->>CDK: check for pipeline stack / deploy if missing
CDK-->>AMI_WF: return Pipeline ARN & exports
AMI_WF->>IB: start image pipeline execution
AMI_WF-->>AMI_WF: poll execution status (<=90 min)
alt success
AMI_WF->>Repo: update release body with AMI details
AMI_WF->>GH: notify-success job
else failure/timeout
AMI_WF->>GH: notify-failure job
end
sequenceDiagram
participant CDK as ami-cdk.go
participant Stack as AmiPipelineStack
participant S3 as S3 Artifacts Bucket
participant IAM as IAM Role & InstanceProfile
participant IB as EC2 Image Builder
CDK->>Stack: instantiate AmiPipelineStack
Stack->>S3: create artifacts bucket & outputs
Stack->>IAM: create Image Builder role & instance profile
Stack->>IB: create Components, Recipe, InfraConfig, Distribution, Image Pipeline
Stack-->>CDK: export PipelineArn, Component/Recipe/Infra/Distribution ARNs, S3BucketName, InstanceProfileArn
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60–90 minutes Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing touches
🧪 Generate unit tests
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).Please share your feedback with us on this Discord post. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Time Submission Status
|
Implement automated AMI building pipeline to reduce developer onboarding from 3+ hours to under 10 minutes. - **AWS CDK Infrastructure**: EC2 Image Builder pipeline with isolated deployment - **Docker-in-AMI**: Pre-configured containers for node, database, and MCP server - **Always-Latest Strategy**: Containers pull latest images on startup - **Simple Configuration**: 3-command setup with `tn-node-configure` script - **Multi-Region Distribution**: AMI available across AWS regions - `deployments/infra/ami-cdk.go` - Isolated AMI deployment application - `deployments/infra/stacks/ami_pipeline_stack.go` - Main CDK stack ✅ Successfully deployed in us-east-2 ✅ Docker containers and MCP server operational Additional issue will be handled separately to limit scope. Removes major adoption barrier by providing one-click deployment alternative to complex manual setup. resolves: #1131
There was a problem hiding this comment.
Actionable comments posted: 10
🧹 Nitpick comments (17)
deployments/infra/ami-cdk.go (1)
31-31: Remove unused variable.The
amiExportsvariable is assigned but never used. Either utilize the exports or remove the assignment.Apply this diff to remove the unused assignment:
- _, amiExports := stacks.AmiPipelineStack( + _, _ = stacks.AmiPipelineStack( app, config.WithStackSuffix(app, "AMI-Pipeline"), &stacks.AmiPipelineStackProps{ StackProps: awscdk.StackProps{Env: testEnv}, }, ) - _ = amiExports // Use exports if needed by other stacks.github/workflows/publish-node-image.yaml (1)
1-1: Consider YAML document start marker necessity.The
---document start marker is optional in YAML and may not be necessary here unless there are specific parsing requirements.deployments/infra/cdk.test.json (1)
40-40: Fix typo in context key.Line 40 contains a typo:
"@aws-cdk/aws-route53-patters:useCertificate"should be"@aws-cdk/aws-route53-patterns:useCertificate".Apply this diff to fix the typo:
- "@aws-cdk/aws-route53-patters:useCertificate": true, + "@aws-cdk/aws-route53-patterns:useCertificate": true,.github/workflows/ami-build.yml (2)
49-68: Shell quoting issues in workflow steps.The static analysis identified several shell quoting issues that could lead to word splitting or variable expansion problems.
Apply these shell safety improvements:
# Check if AMI pipeline stack is deployed - STACK_NAME="TrufNetwork-AMI-Pipeline-dev" + STACK_NAME="TrufNetwork-AMI-Pipeline-dev" if aws cloudformation describe-stacks \ - --stack-name "$STACK_NAME" \ + --stack-name "${STACK_NAME}" \ - --region ${{ env.AWS_REGION }} > /dev/null 2>&1; then + --region "${{ env.AWS_REGION }}" > /dev/null 2>&1; then echo "pipeline_exists=true" >> $GITHUB_OUTPUT # Get pipeline ARN from stack outputs PIPELINE_ARN=$(aws cloudformation describe-stacks \ - --stack-name "$STACK_NAME" \ + --stack-name "${STACK_NAME}" \ - --region ${{ env.AWS_REGION }} \ + --region "${{ env.AWS_REGION }}" \ --query \ 'Stacks[0].Outputs[?OutputKey==`AmiPipelineArn`].OutputValue' \ --output text) - echo "pipeline_arn=$PIPELINE_ARN" >> $GITHUB_OUTPUT + echo "pipeline_arn=${PIPELINE_ARN}" >> $GITHUB_OUTPUT
197-197: Fix command name inconsistency.Line 197 uses
truflation-configurebut should be consistent withtn-node-configureused elsewhere.Apply this diff to fix the command name:
- sudo truflation-configure --network mainnet \\ + sudo tn-node-configure --network mainnet \\scripts/test-ami.sh (4)
111-111: Chain ID inconsistency.The test uses
truflation-testnetbut the AMI components usetn-v2.1as mentioned in the stack definition.Apply this diff to align with the actual chain ID:
- - SETUP_CHAIN_ID=${CHAIN_ID:-truflation-testnet} + - SETUP_CHAIN_ID=${CHAIN_ID:-tn-v2.1}
242-244: Chain ID mapping needs correction.The chain ID mapping logic doesn't align with the actual values used in the AMI configuration.
Apply this diff to correct the chain ID mapping:
if [ "$NETWORK" = "mainnet" ]; then - CHAIN_ID="truflation" + CHAIN_ID="tn-v2.1" else - CHAIN_ID="truflation-testnet" + CHAIN_ID="tn-v2.1" fi
260-261: Update test expectations for chain ID.The grep pattern needs to match the corrected chain ID.
Apply this diff to match the corrected chain ID:
-grep -q "CHAIN_ID=truflation" /tmp/test.env && \ +grep -q "CHAIN_ID=tn-v2.1" /tmp/test.env && \
10-11: Remove unused color variables.The static analysis correctly identified that YELLOW and BLUE variables are defined but never used in the script.
Apply this diff to remove unused variables:
GREEN='\033[0;32m' RED='\033[0;31m' -YELLOW='\033[1;33m' -BLUE='\033[0;34m' NC='\033[0m'deployments/infra/stacks/ami_pipeline_stack.go (8)
97-170: Docker/Compose install is redundant; prefer one Compose path.You install both the plugin (docker compose) and the v1 binary (docker-compose). This increases drift and maintenance. Choose one (recommend the plugin) and align scripts/systemd accordingly.
Would you like a follow-up diff to drop the v1 binary and switch ExecStart/CLI calls to “docker compose”?
146-151: Apt packages duplicated.curl is installed earlier; keep once to speed builds and reduce surface.
270-287: Systemd unit: prefer docker compose and resilient path.If you standardize on the plugin, use “docker compose” and env lookup to avoid path issues.
- ExecStart=/usr/bin/docker-compose up -d - ExecStop=/usr/bin/docker-compose down + ExecStart=/usr/bin/env docker compose up -d + ExecStop=/usr/bin/env docker compose down
344-353: Single write for COMPOSE_PROFILES.You create COMPOSE_PROFILES then append another line when MCP is enabled. Compose uses the last value, but write once to avoid confusion.
- cat > .env << 'ENVEOF' - CHAIN_ID=tn-v2.1 - DB_OWNER=postgres://kwild:kwild@kwil-postgres:5432/kwild - COMPOSE_PROFILES=default - ENVEOF - - if [ "$ENABLE_MCP" = true ]; then - echo "COMPOSE_PROFILES=default,mcp" >> .env - fi + PROFILES="default" + if [ "$ENABLE_MCP" = true ]; then PROFILES="default,mcp"; fi + cat > .env << ENVEOF + CHAIN_ID=tn-v2.1 + DB_OWNER=postgres://kwild:kwild@kwil-postgres:5432/kwild + COMPOSE_PROFILES=$PROFILES + ENVEOF
371-372: Avoid external IP echo via ifconfig.me; use IMDSv2 (privacy and reliability).This leaks IP to a third-party and may fail in restricted networks. Use EC2 metadata with token and a fallback.
- echo "MCP server will be available at: http://$(curl -s ifconfig.me):8000/sse" + TOKEN=$(curl -s --connect-timeout 2 -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600" || true) + IP=$(curl -s --connect-timeout 2 -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/public-ipv4 || hostname -I | awk '{print $1}') + echo "MCP server will be available at: http://$IP:8000/sse"
399-405: Don’t enable services during image build; let the configure script do it.Enabling at build time is redundant and can introduce side effects on the builder. Rely on tn-node-configure.
- - sudo systemctl daemon-reload - - sudo systemctl enable tn-node
467-478: CFN ExportName uniqueness.Export names must be unique per account/region. Consider prefixing with the stack name to avoid collisions across stages/apps.
- ExportName: jsii.String(nameWithPrefix("AmiPipelineArn-" + string(stage))), + ExportName: jsii.String(nameWithPrefix(*stack.StackName() + "-AmiPipelineArn-" + string(stage))),Apply similarly to AmiArtifactsBucketOutput.
480-488: Exports: include the config component or rename for clarity.You export only the Docker component ARN under ComponentArn. Either export both (DockerComponentArn, ConfigComponentArn) or rename to DockerComponentArn to avoid confusion.
- ComponentArn: *dockerComponent.AttrArn(), + ComponentArn: *dockerComponent.AttrArn(), // consider adding ConfigComponentArn too
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
deployments/infra/go.sumis excluded by!**/*.sum
📒 Files selected for processing (9)
.github/workflows/ami-build.yml(1 hunks).github/workflows/publish-node-image.yaml(1 hunks).github/workflows/release.yaml(2 hunks)deployments/infra/README.md(1 hunks)deployments/infra/ami-cdk.go(1 hunks)deployments/infra/cdk.test.json(1 hunks)deployments/infra/go.mod(2 hunks)deployments/infra/stacks/ami_pipeline_stack.go(1 hunks)scripts/test-ami.sh(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
deployments/infra/ami-cdk.go (2)
deployments/infra/stacks/ami_pipeline_stack.go (2)
AmiPipelineStack(27-491)AmiPipelineStackProps(13-15)deployments/infra/config/config.go (1)
WithStackSuffix(30-34)
deployments/infra/stacks/ami_pipeline_stack.go (2)
deployments/infra/config/is_stack_in_synthesis.go (1)
IsStackInSynthesis(8-17)deployments/infra/config/config.go (2)
GetStage(83-103)GetDevPrefix(108-129)
🪛 actionlint (1.7.7)
.github/workflows/ami-build.yml
49-49: shellcheck reported issue in this script: SC2086:info:8:34: Double quote to prevent globbing and word splitting
(shellcheck)
49-49: shellcheck reported issue in this script: SC2016:info:14:7: Expressions don't expand in single quotes, use double quotes for that
(shellcheck)
49-49: shellcheck reported issue in this script: SC2086:info:16:40: Double quote to prevent globbing and word splitting
(shellcheck)
49-49: shellcheck reported issue in this script: SC2086:info:18:35: Double quote to prevent globbing and word splitting
(shellcheck)
72-72: shellcheck reported issue in this script: SC2016:info:10:5: Expressions don't expand in single quotes, use double quotes for that
(shellcheck)
72-72: shellcheck reported issue in this script: SC2086:info:12:38: Double quote to prevent globbing and word splitting
(shellcheck)
88-88: shellcheck reported issue in this script: SC2086:info:2:6: Double quote to prevent globbing and word splitting
(shellcheck)
93-93: shellcheck reported issue in this script: SC2086:info:11:38: Double quote to prevent globbing and word splitting
(shellcheck)
108-108: shellcheck reported issue in this script: SC2086:info:26:32: Double quote to prevent globbing and word splitting
(shellcheck)
🪛 Shellcheck (0.11.0)
scripts/test-ami.sh
[warning] 10-10: YELLOW appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 11-11: BLUE appears unused. Verify use (or export if used externally).
(SC2034)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: acceptance-test
🔇 Additional comments (14)
deployments/infra/go.mod (3)
15-15: LGTM! Valid golden file testing library.Goldie is a golden file test utility for Go projects that is typically used for testing responses with larger data bodies, making it well-suited for testing CDK synthesized templates or configuration files in the AMI pipeline infrastructure.
19-19: LGTM!The gopkg.in/yaml.v3 dependency addition is appropriate for YAML processing needs in the new AMI infrastructure components.
43-44: Version upgrades are up-to-date.The dependency upgrades maintain compatibility with the newly added direct dependencies while ensuring up-to-date versions.
deployments/infra/ami-cdk.go (2)
10-12: Production logging configuration is appropriate.Initializes production-level structured logging for operational visibility during CDK deployment operations.
19-22: Auto-detection approach improves deployment flexibility.Using nil values for account and region allows CDK to automatically detect from AWS credentials and CLI configuration, making deployments more portable across environments.
.github/workflows/publish-node-image.yaml (1)
4-4: YAML syntax correction is appropriate.Correcting the
on:key from an array-like format to proper quoted key format ensures consistent YAML syntax.deployments/infra/cdk.test.json (2)
17-18: Test environment configuration is appropriate.The stage and devPrefix settings are well-suited for testing the AMI pipeline infrastructure without impacting production resources.
2-2: cdk.test.json app points to an existing ami-cdk.go — confirm intent and provide required context
- deployments/infra/ami-cdk.go exists; the cdk.test.json "app" entry is valid.
- deployments/infra/cdk.json uses cdk_main.go — confirm whether the test config intentionally differs or should be aligned.
- Running go run ami-cdk.go failed with: "Mandatory context variable 'stage' is missing" — add "stage" to cdk.test.json context or update the app command to pass -c stage=<dev|prod> so the app is runnable.
.github/workflows/release.yaml (2)
3-3: YAML syntax correction is appropriate.Correcting the
on:key ensures consistent YAML syntax across workflow files.
73-100: Automated build orchestration is well-designed.The trigger-builds job effectively coordinates AMI and Docker image builds upon release publication, with proper dependency management and conditional execution.
.github/workflows/ami-build.yml (1)
53-53: Stack name mismatch with CDK application.Verification failed: CDK CLI missing in the verification environment (error: "cdk: command not found"). Cannot confirm whether the hardcoded STACK_NAME "TrufNetwork-AMI-Pipeline-dev" in .github/workflows/ami-build.yml (line 53) matches the name produced by config.WithStackSuffix() in deployments/infra/ami-cdk.go. Run in deployments/infra:
cdk --app 'go run ami-cdk.go' list --context stage=dev --context devPrefix=test
and compare the output to the hardcoded name, or update the workflow to derive the stack name from the same config.
deployments/infra/stacks/ami_pipeline_stack.go (3)
10-11: Import path likely incorrect (verify module path).Code under deployments/infra/config is imported as github.com/trufnetwork/node/infra/config. If the repo path matches the file tree, this will not resolve.
Apply this diff if needed:
- "github.com/trufnetwork/node/infra/config" + "github.com/trufnetwork/node/deployments/infra/config"
199-261: Compose file is fine; confirm “Always-Latest” policy is intentional.Images use latest tags; good if that’s the rollout strategy. Be aware this trades reproducibility for convenience.
60-76: IAM/profile posture looks correct; minor note.Passing RoleName to CfnInstanceProfile.Roles is correct. No action needed.
If the pipeline needs private GHCR pulls later, you’ll need to add creds (ECR/GHCR) and grant egress.
847c606 to
922415c
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (13)
.github/workflows/ami-build.yml (7)
47-69: Harden shell quoting and guard empty outputs (fix actionlint SC2086/SC2016 hints).Quote special files and vars; ignore empty/None pipeline ARNs.
Apply:
- STACK_NAME="AMI-Pipeline-default-Stack" + STACK_NAME="AMI-Pipeline-default-Stack" if aws cloudformation describe-stacks \ - --stack-name "$STACK_NAME" \ - --region ${{ env.AWS_REGION }} > /dev/null 2>&1; then - echo "pipeline_exists=true" >> $GITHUB_OUTPUT + --stack-name "$STACK_NAME" \ + --region "${{ env.AWS_REGION }}" > /dev/null 2>&1; then + echo "pipeline_exists=true" >> "$GITHUB_OUTPUT" # Get pipeline ARN from stack outputs - PIPELINE_ARN=$(aws cloudformation describe-stacks \ + PIPELINE_ARN="$(aws cloudformation describe-stacks \ --stack-name "$STACK_NAME" \ - --region ${{ env.AWS_REGION }} \ + --region "${{ env.AWS_REGION }}" \ --query \ 'Stacks[0].Outputs[?OutputKey==`AmiPipelineArn`].OutputValue' \ - --output text) - echo "pipeline_arn=$PIPELINE_ARN" >> $GITHUB_OUTPUT + --output text 2>/dev/null || true)" + if [ -n "$PIPELINE_ARN" ] && [ "$PIPELINE_ARN" != "None" ] && [ "$PIPELINE_ARN" != "null" ]; then + echo "pipeline_arn=$PIPELINE_ARN" >> "$GITHUB_OUTPUT" + fi else - echo "pipeline_exists=false" >> $GITHUB_OUTPUT + echo "pipeline_exists=false" >> "$GITHUB_OUTPUT" fi- cdk deploy AMI-Pipeline-default-Stack --require-approval never + cdk deploy AMI-Pipeline-default-Stack --require-approval never ... - PIPELINE_ARN=$(aws cloudformation describe-stacks \ + PIPELINE_ARN="$(aws cloudformation describe-stacks \ --stack-name "AMI-Pipeline-default-Stack" \ - --region ${{ env.AWS_REGION }} \ + --region "${{ env.AWS_REGION }}" \ --query \ 'Stacks[0].Outputs[?OutputKey==`AmiPipelineArn`].OutputValue' \ - --output text) - echo "PIPELINE_ARN=$PIPELINE_ARN" >> $GITHUB_ENV + --output text 2>/dev/null || true)" + echo "PIPELINE_ARN=$PIPELINE_ARN" >> "$GITHUB_ENV"- echo "PIPELINE_ARN=${{ steps.check-pipeline.outputs.pipeline_arn }}" \ - >> $GITHUB_ENV + echo "PIPELINE_ARN=${{ steps.check-pipeline.outputs.pipeline_arn }}" \ + >> "$GITHUB_ENV"Also applies to: 70-85, 86-91
4-13: Avoid hardcoded stack name; parameterize suffix.Expose a stack_suffix input (default “default”) and derive STACK_NAME accordingly to stay in sync with CDK context.
workflow_dispatch: inputs: force_build: description: "Force AMI build even if not a release" type: boolean default: false + stack_suffix: + description: "CDK stackSuffix (matches WithStackSuffix)" + type: string + default: "default"- STACK_NAME="AMI-Pipeline-default-Stack" + STACK_NAME="AMI-Pipeline-${{ inputs.stack_suffix || 'default' }}-Stack"- cdk deploy AMI-Pipeline-default-Stack --require-approval never + cdk deploy "AMI-Pipeline-${{ inputs.stack_suffix || 'default' }}-Stack" --require-approval never- --stack-name "AMI-Pipeline-default-Stack" \ + --stack-name "AMI-Pipeline-${{ inputs.stack_suffix || 'default' }}-Stack" \Also applies to: 52-56, 75-80
21-25: Gate job on release or manual force; otherwise input is unused.Currently force_build is unused. Gate the job to respect it.
jobs: build-ami: + if: github.event_name == 'release' || inputs.force_build == true name: Build TrufNetwork AMI
42-46: Pin CDK to a major version to avoid surprise breakages.Use a stable major (v2) instead of latest.
- npm install -g aws-cdk@latest + npm install -g aws-cdk@2
92-105: Make polling resilient to eventual consistency/read errors.Early get-image calls can 404; treat as “in progress” instead of failing the step.
- EXECUTION_ID=$(aws imagebuilder start-image-pipeline-execution \ + EXECUTION_ID="$(aws imagebuilder start-image-pipeline-execution \ --image-pipeline-arn "$PIPELINE_ARN" \ - --region ${{ env.AWS_REGION }} \ + --region "${{ env.AWS_REGION }}" \ --query 'imageBuildVersionArn' \ - --output text) + --output text)" ... - STATUS=$(aws imagebuilder get-image \ + STATUS="$(aws imagebuilder get-image \ --image-build-version-arn "$EXECUTION_ID" \ - --region ${{ env.AWS_REGION }} \ + --region "${{ env.AWS_REGION }}" \ --query 'image.state.status' \ - --output text) + --output text 2>/dev/null || true)"Also applies to: 112-151
126-135: Surface all distributed AMIs, not just the first.Image Builder can distribute across regions; include them in logs/outputs.
- # Get AMI ID - AMI_ID=$(aws imagebuilder get-image \ + # Get AMI ID (primary region) + AMI_ID="$(aws imagebuilder get-image \ --image-build-version-arn "$EXECUTION_ID" \ - --region ${{ env.AWS_REGION }} \ + --region "${{ env.AWS_REGION }}" \ --query 'image.outputResources.amis[0].image' \ - --output text) + --output text)" + # Get all distributed AMIs as JSON + AMIS_JSON="$(aws imagebuilder get-image \ + --image-build-version-arn "$EXECUTION_ID" \ + --region "${{ env.AWS_REGION }}" \ + --query 'image.outputResources.amis[].{Region:region,Image:image}' \ + --output json)" echo "AMI ID: $AMI_ID" echo "AMI_ID=$AMI_ID" >> $GITHUB_ENV + echo "AMIS_JSON=$AMIS_JSON" >> $GITHUB_ENV- echo "Pipeline ARN: $PIPELINE_ARN" + echo "Pipeline ARN: $PIPELINE_ARN" echo "Execution ID: $EXECUTION_ID" + echo "Distributed AMIs: $AMIS_JSON"Also applies to: 152-169
30-35: Set a role session name for traceability (optional).Adds clarity in CloudTrail.
with: role-to-assume: ${{ secrets.AWS_ROLE_ARN }} aws-region: ${{ env.AWS_REGION }} + role-session-name: ami-build-${{ github.run_id }}scripts/test-ami.sh (6)
1-2: Use stricter bash flags.Improve safety against unset vars and pipe failures.
-#!/bin/bash -set -e +#!/bin/bash +set -Eeuo pipefail
18-27: Support both “docker compose” and “docker-compose” and use one variable.Prevents failures on environments without the legacy plugin.
# Test counter TESTS_PASSED=0 TESTS_FAILED=0 + +# Compose shim +if command -v docker-compose >/dev/null 2>&1; then + DOCKER_COMPOSE="docker-compose" +elif docker compose version >/dev/null 2>&1; then + DOCKER_COMPOSE="docker compose" +else + echo "docker-compose / docker compose not found" + exit 1 +fi-if docker-compose -f /tmp/test-docker-compose.yml config > /dev/null 2>&1; then +if $DOCKER_COMPOSE -f /tmp/test-docker-compose.yml config > /dev/null 2>&1; then-if docker-compose -f /tmp/tn-test-compose.yml up -d kwil-postgres; then +if $DOCKER_COMPOSE -f /tmp/tn-test-compose.yml up -d kwil-postgres; then- if docker-compose -f /tmp/tn-test-compose.yml exec -T kwil-postgres pg_isready -U kwild > /dev/null 2>&1; then + if $DOCKER_COMPOSE -f /tmp/tn-test-compose.yml exec -T kwil-postgres pg_isready -U kwild > /dev/null 2>&1; then- if docker-compose -f /tmp/tn-test-compose.yml exec -T kwil-postgres psql -U kwild -d kwild -c "SELECT version();" > /dev/null 2>&1; then + if $DOCKER_COMPOSE -f /tmp/tn-test-compose.yml exec -T kwil-postgres psql -U kwild -d kwild -c "SELECT version();" > /dev/null 2>&1; then-docker-compose -f /tmp/tn-test-compose.yml down -v +$DOCKER_COMPOSE -f /tmp/tn-test-compose.yml down -vAlso applies to: 157-165, 337-369, 341-346, 345-346
83-101: Avoid binding Postgres to host port 5432 to prevent conflicts.The exec-based checks don’t need host port mapping.
- ports: - - "5432:5432" + # No host port mapping needed for internal checksAnd similarly in Test 9’s compose snippet:
- ports: - - "5432:5432" + # No host port mapping needed for internal checksAlso applies to: 317-319
176-186: Don’t run ‘go mod tidy’ in a test script.It mutates files; just build.
-echo "Running: go mod tidy && go build in deployments/infra" +echo "Running: go build in deployments/infra" cd deployments/infra -if go mod tidy && go build ./lib/... ./stacks/... > /dev/null 2>&1; then +if go build ./lib/... ./stacks/... > /dev/null 2>&1; then
18-27: Fail fast if CDK is missing.Add a pre-check for clearer error.
echo "1. Testing CDK synthesis..." cd deployments/infra -if cdk --app 'go run ami-cdk.go' synth --context stage=dev --context devPrefix=test > /dev/null 2>&1; then +if ! command -v cdk >/dev/null 2>&1; then + echo -e "${RED}❌ AWS CDK not installed${NC}" + TESTS_FAILED=$((TESTS_FAILED + 1)) +elif cdk --app 'go run ami-cdk.go' synth --context stage=dev --context devPrefix=test > /dev/null 2>&1; then echo -e "${GREEN}✅ CDK synthesis successful${NC}" TESTS_PASSED=$((TESTS_PASSED + 1)) else
409-433: Only print “All tests passed” when they actually did.Avoid confusing success message on failure.
-echo "" -echo "🎉 All tests passed!" -echo "" +echo "" +if [ "$TESTS_FAILED" -eq 0 ]; then + echo "🎉 All tests passed!" +else + echo "⚠️ Some tests failed." +fi +echo ""
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
.github/workflows/ami-build.yml(1 hunks)deployments/infra/README.md(1 hunks)deployments/infra/ami-cdk.go(1 hunks)deployments/infra/stacks/ami_pipeline_stack.go(1 hunks)scripts/test-ami.sh(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- deployments/infra/stacks/ami_pipeline_stack.go
- deployments/infra/ami-cdk.go
- deployments/infra/README.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-22T18:35:49.455Z
Learnt from: outerlook
PR: trufnetwork/node#1170
File: .github/workflows/ami-build.yml:75-75
Timestamp: 2025-09-22T18:35:49.455Z
Learning: In the trufnetwork/node repository, the CDK stack naming convention uses `WithStackSuffix()` which generates names in the format `<stackName>-<stackSuffix>-Stack`. The AMI pipeline stack uses `AMI-Pipeline-default-Stack` because no `stackSuffix` is set in the CDK context, so it defaults to "default".
Applied to files:
.github/workflows/ami-build.yml
🪛 Shellcheck (0.11.0)
scripts/test-ami.sh
[warning] 427-427: Did you forget to close this double quoted string?
(SC1078)
[error] 431-431: Couldn't parse this double quoted string. Fix to allow more checks.
(SC1073)
[error] 433-433: Expected end of double quoted string. Fix any mentioned problems and try again.
(SC1072)
🪛 actionlint (1.7.7)
.github/workflows/ami-build.yml
49-49: shellcheck reported issue in this script: SC2086:info:8:34: Double quote to prevent globbing and word splitting
(shellcheck)
49-49: shellcheck reported issue in this script: SC2016:info:14:7: Expressions don't expand in single quotes, use double quotes for that
(shellcheck)
49-49: shellcheck reported issue in this script: SC2086:info:16:40: Double quote to prevent globbing and word splitting
(shellcheck)
49-49: shellcheck reported issue in this script: SC2086:info:18:35: Double quote to prevent globbing and word splitting
(shellcheck)
72-72: shellcheck reported issue in this script: SC2016:info:10:5: Expressions don't expand in single quotes, use double quotes for that
(shellcheck)
72-72: shellcheck reported issue in this script: SC2086:info:12:38: Double quote to prevent globbing and word splitting
(shellcheck)
88-88: shellcheck reported issue in this script: SC2086:info:2:6: Double quote to prevent globbing and word splitting
(shellcheck)
93-93: shellcheck reported issue in this script: SC2086:info:11:38: Double quote to prevent globbing and word splitting
(shellcheck)
108-108: shellcheck reported issue in this script: SC2086:info:26:32: Double quote to prevent globbing and word splitting
(shellcheck)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: acceptance-test
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (13)
deployments/infra/stacks/ami_pipeline_stack.go (6)
79-96: Enforce IMDSv2 for Image Builder instances.Harden instance metadata access.
infraConfig := awsimagebuilder.NewCfnInfrastructureConfiguration(stack, jsii.String("AmiInfrastructureConfiguration"), &awsimagebuilder.CfnInfrastructureConfigurationProps{ Name: jsii.String(nameWithPrefix("tn-ami-infra-config-" + string(stage))), InstanceProfileName: instanceProfile.InstanceProfileName(), + InstanceMetadataOptions: &awsimagebuilder.CfnInfrastructureConfiguration_InstanceMetadataOptionsProperty{ + HttpTokens: jsii.String("required"), + HttpPutResponseHopLimit: jsii.Number(2), + },
98-168: Standardize on Compose v2; drop legacy docker‑compose binary.You install both the plugin and the v1 binary. Use only
docker composeto reduce supply‑chain risk and confusion.- name: InstallDocker action: ExecuteBash inputs: commands: - sudo apt-get install -y ca-certificates curl gnupg lsb-release @@ - sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin @@ - - name: InstallDockerCompose - action: ExecuteBash - inputs: - commands: - - sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose - - sudo chmod +x /usr/local/bin/docker-compose - - sudo ln -sf /usr/local/bin/docker-compose /usr/bin/docker-compose - @@ - - name: VerifyInstallation + - name: VerifyInstallation action: ExecuteBash inputs: commands: - docker --version - - docker-compose --version + - docker compose version - systemctl is-active docker
267-289: Use Compose v2 in systemd unit.Aligns with the plugin; avoids relying on the removed v1 binary.
[Service] Type=oneshot RemainAfterExit=true WorkingDirectory=/opt/tn - ExecStart=/usr/bin/docker-compose up -d - ExecStop=/usr/bin/docker-compose down + ExecStart=/usr/bin/docker compose up -d + ExecStop=/usr/bin/docker compose down User=tn Group=tn
378-400: Update script: use Compose v2.- sudo -u tn docker-compose pull + sudo -u tn docker compose pull @@ - sudo -u tn docker-compose up -d + sudo -u tn docker compose up -d
469-480: Output key/name mismatch with workflow query.Workflow looks for
AmiPipelineArnbut the key here isAmiPipelineArnOutput. Either update the workflow (preferred) or add a duplicate output key for compatibility.awscdk.NewCfnOutput(stack, jsii.String("AmiPipelineArnOutput"), &awscdk.CfnOutputProps{ Value: imagePipeline.AttrArn(), Description: jsii.String("ARN of the AMI building pipeline"), ExportName: jsii.String(nameWithPrefix("AmiPipelineArn-" + string(stage))), }) + // Optional compatibility output (if you cannot change the workflow right now): + awscdk.NewCfnOutput(stack, jsii.String("AmiPipelineArn"), &awscdk.CfnOutputProps{ + Value: imagePipeline.AttrArn(), + Description: jsii.String("ARN of the AMI building pipeline"), + })
482-490: Clarify exports for both components or rename.
ComponentArncurrently exposes only the Docker component while two components exist. Either export both (e.g.,DockerComponentArn,ConfigComponentArn) or rename toDockerComponentArn..github/workflows/ami-build.yml (3)
106-108: Increase wait timeout; pipeline tests allow up to 90 minutes.Step timeout (60m) can expire before Image Builder finishes.
- - name: Wait for AMI build completion - timeout-minutes: 60 + - name: Wait for AMI build completion + timeout-minutes: 120
42-46: Avoid “latest” for CDK; pin major or a tested version.Prevents surprise CI breaks.
- npm install -g aws-cdk@latest + npm install -g aws-cdk@2Optionally pin to a specific 2.x you’ve validated in CI.
47-69: Minor shell hygiene: safer bash options and quoting.Reduce footguns in the multi‑line script steps.
- run: | + run: | + set -euo pipefail cd deployments/infra @@ - STACK_NAME="AMI-Pipeline-default-Stack" + STACK_NAME="AMI-Pipeline-default-Stack" if aws cloudformation describe-stacks \ - --stack-name "$STACK_NAME" \ - --region ${{ env.AWS_REGION }} > /dev/null 2>&1; then + --stack-name "${STACK_NAME}" \ + --region "${{ env.AWS_REGION }}" > /dev/null 2>&1; then echo "pipeline_exists=true" >> $GITHUB_OUTPUTReplicate
set -euo pipefailand strict quoting in the deploy block as well.scripts/test-ami.sh (4)
409-433: “All tests passed!” prints unconditionally.Move it behind the pass condition to avoid false success messaging.
-echo "" -echo "🎉 All tests passed!" -echo "" -echo "📋 Summary of what was tested:" +echo "" +if [ $TESTS_FAILED -eq 0 ]; then + echo "🎉 All tests passed!" +else + echo "⚠️ Some tests failed." +fi +echo "" +echo "📋 Summary of what was tested:" @@ -if [ $TESTS_FAILED -gt 0 ]; then +if [ $TESTS_FAILED -gt 0 ]; then echo -e "${RED}❌ Some tests failed. Please fix the issues before deployment.${NC}" exit 1 fi
8-12: Remove unused color vars.
YELLOWandBLUEaren’t used.-GREEN='\033[0;32m' -RED='\033[0;31m' -YELLOW='\033[1;33m' -BLUE='\033[0;34m' -NC='\033[0m' +GREEN='\033[0;32m' +RED='\033[0;31m' +NC='\033[0m'
157-165: Preferdocker composeoverdocker-compose.Matches the AMI change and default on modern runners.
-if docker-compose -f /tmp/test-docker-compose.yml config > /dev/null 2>&1; then +if docker compose -f /tmp/test-docker-compose.yml config > /dev/null 2>&1; then @@ -if docker-compose -f /tmp/tn-test-compose.yml up -d kwil-postgres; then +if docker compose -f /tmp/tn-test-compose.yml up -d kwil-postgres; then @@ - if docker-compose -f /tmp/tn-test-compose.yml exec -T kwil-postgres pg_isready -U kwild > /dev/null 2>&1; then + if docker compose -f /tmp/tn-test-compose.yml exec -T kwil-postgres pg_isready -U kwild > /dev/null 2>&1; then @@ - if docker-compose -f /tmp/tn-test-compose.yml exec -T kwil-postgres psql -U kwild -d kwild -c "SELECT version();" > /dev/null 2>&1; then + if docker compose -f /tmp/tn-test-compose.yml exec -T kwil-postgres psql -U kwild -d kwild -c "SELECT version();" > /dev/null 2>&1; then @@ -docker-compose -f /tmp/tn-test-compose.yml down -v +docker compose -f /tmp/tn-test-compose.yml down -vAlso applies to: 336-369, 341-347
180-186: Build all packages; don’t assume lib/.
./lib/...may not exist../...is simpler and robust.-if go mod tidy && go build ./lib/... ./stacks/... > /dev/null 2>&1; then +if go mod tidy && go build ./... > /dev/null 2>&1; then
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.github/workflows/ami-build.yml(1 hunks)deployments/infra/README.md(1 hunks)deployments/infra/stacks/ami_pipeline_stack.go(1 hunks)scripts/test-ami.sh(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- deployments/infra/README.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-22T18:35:49.455Z
Learnt from: outerlook
PR: trufnetwork/node#1170
File: .github/workflows/ami-build.yml:75-75
Timestamp: 2025-09-22T18:35:49.455Z
Learning: In the trufnetwork/node repository, the CDK stack naming convention uses `WithStackSuffix()` which generates names in the format `<stackName>-<stackSuffix>-Stack`. The AMI pipeline stack uses `AMI-Pipeline-default-Stack` because no `stackSuffix` is set in the CDK context, so it defaults to "default".
Applied to files:
deployments/infra/stacks/ami_pipeline_stack.go.github/workflows/ami-build.yml
🧬 Code graph analysis (1)
deployments/infra/stacks/ami_pipeline_stack.go (2)
deployments/infra/config/is_stack_in_synthesis.go (1)
IsStackInSynthesis(8-17)deployments/infra/config/config.go (2)
GetStage(83-103)GetDevPrefix(108-129)
🪛 actionlint (1.7.7)
.github/workflows/ami-build.yml
49-49: shellcheck reported issue in this script: SC2086:info:8:34: Double quote to prevent globbing and word splitting
(shellcheck)
49-49: shellcheck reported issue in this script: SC2016:info:14:7: Expressions don't expand in single quotes, use double quotes for that
(shellcheck)
49-49: shellcheck reported issue in this script: SC2086:info:16:40: Double quote to prevent globbing and word splitting
(shellcheck)
49-49: shellcheck reported issue in this script: SC2086:info:18:35: Double quote to prevent globbing and word splitting
(shellcheck)
72-72: shellcheck reported issue in this script: SC2016:info:10:5: Expressions don't expand in single quotes, use double quotes for that
(shellcheck)
72-72: shellcheck reported issue in this script: SC2086:info:12:38: Double quote to prevent globbing and word splitting
(shellcheck)
88-88: shellcheck reported issue in this script: SC2086:info:2:6: Double quote to prevent globbing and word splitting
(shellcheck)
93-93: shellcheck reported issue in this script: SC2086:info:11:38: Double quote to prevent globbing and word splitting
(shellcheck)
108-108: shellcheck reported issue in this script: SC2086:info:26:32: Double quote to prevent globbing and word splitting
(shellcheck)
🪛 Shellcheck (0.11.0)
scripts/test-ami.sh
[warning] 10-10: YELLOW appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 11-11: BLUE appears unused. Verify use (or export if used externally).
(SC2034)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: acceptance-test
🔇 Additional comments (3)
deployments/infra/stacks/ami_pipeline_stack.go (2)
34-36: Remove unreliable synth guard; it can emit an empty stack.
config.IsStackInSynthesisrelies onBundlingRequired()and can be false during regularcdk synth, skipping resource creation.Apply:
- if !config.IsStackInSynthesis(stack) { - return stack, AmiPipelineStackExports{} - } + // Always define resources during synth; avoid gating on BundlingRequired().
439-453: Single‑region distribution; parametrize for multi‑region via context.Acceptance requires multi‑region. Read
-c amiRegions='["us-east-2","us-west-2"]'and build the list.- distributionConfig := awsimagebuilder.NewCfnDistributionConfiguration(stack, jsii.String("AmiDistributionConfiguration"), &awsimagebuilder.CfnDistributionConfigurationProps{ - Name: jsii.String(nameWithPrefix("tn-ami-distribution-" + string(stage))), - Description: jsii.String("Distribution configuration for TRUF.NETWORK AMI"), - Distributions: &[]*awsimagebuilder.CfnDistributionConfiguration_DistributionProperty{ - { - Region: awscdk.Aws_REGION(), // Use current region token - AmiDistributionConfiguration: &awsimagebuilder.CfnDistributionConfiguration_AmiDistributionConfigurationProperty{ - AmiTags: &map[string]*string{ - "Name": jsii.String("TRUFNETWORK-Node-{{imagebuilder:buildDate}}"), - }, - }, - }, - }, - }) + // Resolve regions from context; fallback to current region + var regions []*string + if ctx := stack.Node().TryGetContext(jsii.String("amiRegions")); ctx != nil { + if arr, ok := ctx.([]interface{}); ok { + for _, v := range arr { + if s, ok := v.(string); ok { + regions = append(regions, jsii.String(s)) + } + } + } + } + if len(regions) == 0 { + regions = []*string{awscdk.Aws_REGION()} + } + + var dists []*awsimagebuilder.CfnDistributionConfiguration_DistributionProperty + for _, r := range regions { + dists = append(dists, &awsimagebuilder.CfnDistributionConfiguration_DistributionProperty{ + Region: r, + AmiDistributionConfiguration: &awsimagebuilder.CfnDistributionConfiguration_AmiDistributionConfigurationProperty{ + Name: jsii.String(nameWithPrefix("trufnetwork-node-{{imagebuilder:buildDate}}")), + AmiTags: &map[string]*string{ + "Name": jsii.String("TRUFNETWORK-Node-{{imagebuilder:buildDate}}"), + }, + }, + }) + } + distributionConfig := awsimagebuilder.NewCfnDistributionConfiguration(stack, jsii.String("AmiDistributionConfiguration"), &awsimagebuilder.CfnDistributionConfigurationProps{ + Name: jsii.String(nameWithPrefix("tn-ami-distribution-" + string(stage))), + Description: jsii.String("Distribution configuration for TRUF.NETWORK AMI"), + Distributions: &dists, + }).github/workflows/ami-build.yml (1)
170-218: Fix release update: undefined variable; append correct content.
amiDetailsis undefined; use the constructedcomment.- // Update release body with AMI information + // Update release body with AMI information await github.rest.repos.updateRelease({ owner: context.repo.owner, repo: context.repo.repo, release_id: releaseId, - body: (currentRelease.data.body || '') + amiDetails + body: (currentRelease.data.body || '') + '\n\n' + comment });
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (5)
.github/workflows/ami-build.yml (5)
42-46: Pin CDK major version to v2 to avoid surprise breakage.
@latestcan introduce breaking changes unexpectedly on CI.Apply:
- npm install -g aws-cdk@latest + npm install -g aws-cdk@2
106-108: Increase AMI build wait timeout.Image Builder often exceeds 60 minutes (build + test + distribution).
Apply:
- timeout-minutes: 60 + timeout-minutes: 120
126-135: Publish all regional AMI IDs, not just the first.You currently extract
amis[0]. With multi‑region distribution, capture and publish the full region→AMI mapping and include it in the release notes.Apply:
- # Get AMI ID - AMI_ID=$(aws imagebuilder get-image \ - --image-build-version-arn "$EXECUTION_ID" \ - --region ${{ env.AWS_REGION }} \ - --query 'image.outputResources.amis[0].image' \ - --output text) + # Get primary AMI ID in current region and full mapping across regions + AMI_ID=$(aws imagebuilder get-image \ + --image-build-version-arn "$EXECUTION_ID" \ + --region ${{ env.AWS_REGION }} \ + --query 'image.outputResources.amis[?region==`'${{ env.AWS_REGION }}`'].image | [0]' \ + --output text) + AMI_MAP_TABLE=$(aws imagebuilder get-image \ + --image-build-version-arn "$EXECUTION_ID" \ + --region ${{ env.AWS_REGION }} \ + --query 'image.outputResources.amis[*].{Region:region,AMI:image}' \ + --output table) echo "AMI ID: $AMI_ID" echo "AMI_ID=$AMI_ID" >> $GITHUB_ENV + { + echo 'AMI_MAP_TABLE<<EOF' + echo "$AMI_MAP_TABLE" + echo 'EOF' + } >> $GITHUB_ENVif [ -n "$AMI_ID" ]; then echo "📋 AMI Build Summary:" echo "AMI ID: $AMI_ID" echo "Region: ${{ env.AWS_REGION }}" echo "Pipeline ARN: $PIPELINE_ARN" echo "Execution ID: $EXECUTION_ID" # Get AMI details aws ec2 describe-images \ --image-ids "$AMI_ID" \ --region ${{ env.AWS_REGION }} \ --query \ 'Images[0].{Name:Name,Description:Description,CreationDate:CreationDate,VirtualizationType:VirtualizationType,Architecture:Architecture,RootDeviceType:RootDeviceType}' \ --output table + echo + echo "All distributed AMIs:" + echo "$AMI_MAP_TABLE" fiwith: script: | const amiId = process.env.AMI_ID; const region = process.env.AWS_REGION; const releaseId = process.env.RELEASE_ID; + const amiMap = process.env.AMI_MAP_TABLE || ''; - const comment = `## 🚀 AMI Build Completed + const comment = `## 🚀 AMI Build Completed **AMI ID:** \`${amiId}\` **Region:** \`${region}\` **Launch URL:** \\ https://console.aws.amazon.com/ec2/home?region=${region}#LaunchInstances:ami=${amiId} + ### All Regions + \`\`\` + ${amiMap} + \`\`\` + ### Quick Start Commands: \`\`\`bashAlso applies to: 152-168, 170-218
18-20: DRY the stack name.Declare
STACK_NAMEonce in env and reference it everywhere.Apply:
env: AWS_REGION: us-east-2 + STACK_NAME: AMI-Pipeline-default-Stack- STACK_NAME="AMI-Pipeline-default-Stack" if aws cloudformation describe-stacks \ --stack-name "$STACK_NAME" \- cdk deploy AMI-Pipeline-default-Stack --require-approval never + cdk deploy "$STACK_NAME" --require-approval never- PIPELINE_ARN=$(aws cloudformation describe-stacks \ - --stack-name "AMI-Pipeline-default-Stack" \ + PIPELINE_ARN=$(aws cloudformation describe-stacks \ + --stack-name "$STACK_NAME" \- echo "Starting AMI build with pipeline: $PIPELINE_ARN" + echo "Starting AMI build with pipeline: $PIPELINE_ARN"Also applies to: 53-66, 71-85, 92-95
4-13: Wire up force_build input to gating.Currently the
force_buildinput is unused. Gate the job accordingly.Apply:
jobs: build-ami: + if: > + ${{ + github.event_name == 'release' || + (github.event_name == 'workflow_dispatch' && inputs.force_build == true) + }}Also applies to: 21-25
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
.github/workflows/ami-build.yml(1 hunks)deployments/infra/stacks/ami_pipeline_stack.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- deployments/infra/stacks/ami_pipeline_stack.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-22T18:35:49.455Z
Learnt from: outerlook
PR: trufnetwork/node#1170
File: .github/workflows/ami-build.yml:75-75
Timestamp: 2025-09-22T18:35:49.455Z
Learning: In the trufnetwork/node repository, the CDK stack naming convention uses `WithStackSuffix()` which generates names in the format `<stackName>-<stackSuffix>-Stack`. The AMI pipeline stack uses `AMI-Pipeline-default-Stack` because no `stackSuffix` is set in the CDK context, so it defaults to "default".
Applied to files:
.github/workflows/ami-build.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: acceptance-test
🔇 Additional comments (2)
.github/workflows/ami-build.yml (2)
19-20: Region alignment looks good.
AWS_REGION: us-east-2now matches the documented target region.If multi-region distribution changes the “primary” region later, consider promoting it to a workflow input.
62-65: Correct CFN Output key usage.Using
AmiPipelineArnOutputmatches the stack’s output. Good catch.Also applies to: 82-84
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
.github/workflows/publish-node-image.yaml (1)
32-34: Enable QEMU for multi‑arch builds.Without QEMU, arm64 builds may fail on x86 runners.
- name: Set up Docker Buildx uses: docker/setup-buildx-action@v3 + - name: Set up QEMU + uses: docker/setup-qemu-action@v3deployments/infra/go.mod (1)
3-3: Invalid go directive (patch version not allowed).Use major.minor and optionally add a toolchain line.
-go 1.24.1 +go 1.24 +toolchain go1.24.1.github/workflows/release.yaml (1)
8-14: Missing permission to dispatch workflows.Grant actions: write so createWorkflowDispatch succeeds.
permissions: contents: write packages: read # This is required for creating and modifying releases id-token: write + actions: write
🧹 Nitpick comments (25)
deployments/infra/README.md (3)
247-248: “Always‑Latest” pull is risky (drift/supply‑chain).Pin images by tag (or digest) and use update-node to upgrade. At minimum, document the trade‑offs and how to pin.
295-301: Make AMI ID retrieval deterministic.Add a CLI snippet to fetch the AMI ID from the pipeline outputs or tags instead of a placeholder:
STACK=AMI-Pipeline-default-Stack PIPELINE_ARN=$(aws cloudformation describe-stacks --stack-name "$STACK" --region us-east-2 \ --query 'Stacks[0].Outputs[?OutputKey==`AmiPipelineArnOutput`].OutputValue' --output text) aws imagebuilder list-images --region us-east-2 \ --filters name=imagePipelineArn,values="$PIPELINE_ARN" \ --query 'reverse(sort_by(imageVersionList,&dateCreated))[0].arn' --output text | xargs -I{} aws imagebuilder get-image --image-build-version-arn {} --region us-east-2 \ --query 'image.outputResources.amis[0].image' --output text
315-318: Use the real MCP endpoint in the example.curling “/” returns 404; show the SSE path:
curl -I http://localhost:8000/sse.github/workflows/publish-node-image.yaml (2)
22-26: Add concurrency to prevent overlapping publishes.Prevents racing “latest” tags on rapid triggers.
name: Publish Node Image +'concurrency': + group: publish-node-image-${{ github.ref_name }} + cancel-in-progress: true
53-61: Speed up builds and add provenance.Use GHA cache and attestations.
- name: Build and push image (multi-arch) uses: docker/build-push-action@v5 with: context: . file: deployments/Dockerfile push: true platforms: linux/amd64,linux/arm64 tags: ${{ steps.meta.outputs.tags }} labels: ${{ steps.meta.outputs.labels }} + cache-from: type=gha + cache-to: type=gha,mode=max + provenance: truedeployments/infra/cdk.test.json (1)
40-41: Typo in CDK feature flag key (ignored at runtime).Fix or remove to avoid confusion.
- "@aws-cdk/aws-route53-patters:useCertificate": true, + // Removed invalid key: "@aws-cdk/aws-route53-patterns:useCertificate".github/workflows/ami-build.yml (5)
47-66: Quote $GITHUB_OUTPUT and guard shell expansions.Avoid word-splitting/globbing (shellcheck SC2086/SC2016).
- echo "pipeline_exists=true" >> $GITHUB_OUTPUT + echo "pipeline_exists=true" >> "$GITHUB_OUTPUT" ... - echo "pipeline_arn=$PIPELINE_ARN" >> $GITHUB_OUTPUT + echo "pipeline_arn=$PIPELINE_ARN" >> "$GITHUB_OUTPUT"
79-92: Quote $GITHUB_ENV and handle empty ARN consistently.- echo "PIPELINE_ARN=$PIPELINE_ARN" >> $GITHUB_ENV + echo "PIPELINE_ARN=$PIPELINE_ARN" >> "$GITHUB_ENV" ... - echo "PIPELINE_ARN=${{ steps.check-pipeline.outputs.pipeline_arn }}" \ - >> $GITHUB_ENV + echo "PIPELINE_ARN=${{ steps.check-pipeline.outputs.pipeline_arn }}" \ + >> "$GITHUB_ENV"
21-24: Gate the job on release or forced runs.Use the input you defined to avoid accidental builds.
jobs: - build-ami: + build-ami: + if: github.event_name == 'release' || (github.event_name == 'workflow_dispatch' && inputs.force_build == true)
174-176: Safer condition for AMI_ID presence.GitHub ifs don’t treat non‑empty strings as booleans reliably; compare explicitly.
- - name: Update GitHub release with AMI details - if: github.event_name == 'release' && env.AMI_ID + - name: Update GitHub release with AMI details + if: github.event_name == 'release' && env.AMI_ID != ''
1-3: Add concurrency to avoid overlapping AMI builds.name: Build AMI +'concurrency': + group: ami-build-${{ github.ref_name }} + cancel-in-progress: truedeployments/infra/ami-cdk.go (1)
23-31: Avoid hardcoding stack name; derive from context suffix.Prevents drift with workflows and config.
-import ( +import ( + "fmt" "github.com/aws/aws-cdk-go/awscdk/v2" "github.com/trufnetwork/node/infra/stacks" "go.uber.org/zap" ) @@ - // Use explicit stack name to match GitHub workflow expectations - stackName := "AMI-Pipeline-default-Stack" - _, amiExports := stacks.AmiPipelineStack( + // Build name: AMI-Pipeline-<stackSuffix>-Stack (default suffix "default") + sfx := "default" + if v := app.Node().TryGetContext(jsii.String("stackSuffix")); v != nil { + if s, ok := v.(string); ok && s != "" { + sfx = s + } + } + stackName := fmt.Sprintf("AMI-Pipeline-%s-Stack", sfx) + stacks.AmiPipelineStack( app, stackName, &stacks.AmiPipelineStackProps{ StackProps: awscdk.StackProps{Env: testEnv}, }, ) - _ = amiExports // Use exports if needed by other stacksdeployments/infra/stacks/ami_pipeline_stack.go (6)
137-144: Prefer Docker Compose v2 (plugin) over legacy v1 binary.Drop the v1 download and use “docker compose …” everywhere.
- - name: InstallDockerCompose - action: ExecuteBash - inputs: - commands: - - sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose - - sudo chmod +x /usr/local/bin/docker-compose - - sudo ln -sf /usr/local/bin/docker-compose /usr/bin/docker-compose + # Compose v2 is provided via the Docker plugin ('docker compose')- ExecStart=/usr/bin/docker-compose up -d - ExecStop=/usr/bin/docker-compose down + ExecStart=/usr/bin/docker compose up -d + ExecStop=/usr/bin/docker compose down- sudo -u tn docker-compose pull + sudo -u tn docker compose pull @@ - sudo -u tn docker-compose up -d + sudo -u tn docker compose up -dAlso applies to: 286-289, 397-402
160-167: Update verification command for Compose v2.- - docker-compose --version + - docker compose version
220-243: Avoid “latest” for tn-node image.Pin by tag/digest; let update-node perform upgrades. Example:
- tn-node: - image: ghcr.io/trufnetwork/node:latest + tn-node: + image: ghcr.io/trufnetwork/node:${TN_NODE_TAG:-vX.Y.Z} # or @sha256:...
436-441: Guarantee EBS encryption on root volume.Relying on account defaults can be unsafe.
Ebs: &awsimagebuilder.CfnImageRecipe_EbsInstanceBlockDeviceSpecificationProperty{ VolumeSize: jsii.Number(20), // 20GB root volume VolumeType: jsii.String("gp3"), + Encrypted: jsii.Bool(true), DeleteOnTermination: jsii.Bool(true), },
446-460: Implement real multi‑region distribution (context‑driven).Read a JSON array from context (e.g., -c amiRegions='["us-east-2","us-west-2"]') and map over it; fall back to current region if absent.
377-381: Use EC2 metadata for public IP (faster, no external deps).- PUBLIC_IP=$(curl -s --connect-timeout 5 ifconfig.co 2>/dev/null || curl -s --connect-timeout 5 ifconfig.me 2>/dev/null || hostname -I | awk '{print $1}' || echo "localhost") + PUBLIC_IP=$(curl -s --max-time 2 http://169.254.169.254/latest/meta-data/public-ipv4 || hostname -I | awk '{print $1}' || echo "localhost")scripts/test-ami.sh (7)
247-255: Write COMPOSE_PROFILES only once; avoid duplicate keys in .envCurrently writes COMPOSE_PROFILES twice, which is ambiguous for readers/parsers.
Apply this diff:
-cat > /tmp/test.env << EOL -CHAIN_ID=$CHAIN_ID -DB_OWNER=postgres://kwild:kwild@kwil-postgres:5432/kwild -COMPOSE_PROFILES=node -EOL - -if [ "$ENABLE_MCP" = true ]; then - echo "COMPOSE_PROFILES=node,mcp" >> /tmp/test.env -fi +if [ "$ENABLE_MCP" = true ]; then + COMPOSE_PROFILES="node,mcp" +else + COMPOSE_PROFILES="node" +fi + +cat > /tmp/test.env << EOL +CHAIN_ID=$CHAIN_ID +DB_OWNER=postgres://kwild:kwild@kwil-postgres:5432/kwild +COMPOSE_PROFILES=$COMPOSE_PROFILES +EOL
59-81: Python fallback depends on PyYAML; pre-check and skip with warning if missingWithout PyYAML, this always fails. Emit a clear warning or install guidance; don’t produce a false failure.
Apply this diff:
-else - echo "Using Python YAML parser (yamllint not available)" - if python3 -c " -import yaml -import sys -try: - with open('.github/workflows/ami-build.yml') as f: - yaml.safe_load(f) - with open('.github/workflows/release.yaml') as f: - yaml.safe_load(f) - with open('.github/workflows/publish-node-image.yaml') as f: - yaml.safe_load(f) - sys.exit(0) -except Exception as e: - print(f'YAML Error: {e}') - sys.exit(1) -" 2>/dev/null; then - echo -e "${GREEN}✅ GitHub Actions workflow syntax valid${NC}" - TESTS_PASSED=$((TESTS_PASSED + 1)) - else - echo -e "${RED}❌ GitHub Actions workflow syntax invalid${NC}" - TESTS_FAILED=$((TESTS_FAILED + 1)) - fi +else + echo "Using Python YAML parser (yamllint not available)" + if ! python3 -c "import yaml" >/dev/null 2>&1; then + echo -e "${YELLOW}⚠️ PyYAML not installed; skipping YAML validation (install with: pip3 install pyyaml).${NC}" + elif python3 - <<'PY' 2>/dev/null; then +import yaml, sys +for p in ('.github/workflows/ami-build.yml', + '.github/workflows/release.yaml', + '.github/workflows/publish-node-image.yaml'): + with open(p, 'r') as f: + yaml.safe_load(f) +sys.exit(0) +PY + echo -e "${GREEN}✅ GitHub Actions workflow syntax valid${NC}" + TESTS_PASSED=$((TESTS_PASSED + 1)) + else + echo -e "${RED}❌ GitHub Actions workflow syntax invalid${NC}" + TESTS_FAILED=$((TESTS_FAILED + 1)) + fi fi
21-27: Pre-check CDK CLI and emit actionable errorFail early with a clear instruction if cdk isn’t installed.
Apply this diff:
-if cdk --app 'go run ami-cdk.go' synth --context stage=dev --context devPrefix=test > /dev/null 2>&1; then +if ! command -v cdk >/dev/null 2>&1; then + echo -e "${YELLOW}⚠️ AWS CDK not installed; install with: npm i -g aws-cdk${NC}" + TESTS_FAILED=$((TESTS_FAILED + 1)) +elif cdk --app 'go run ami-cdk.go' synth --context stage=dev --context devPrefix=test > /dev/null 2>&1; then echo -e "${GREEN}✅ CDK synthesis successful${NC}" TESTS_PASSED=$((TESTS_PASSED + 1)) else echo -e "${RED}❌ CDK synthesis failed${NC}" TESTS_FAILED=$((TESTS_FAILED + 1)) fi
2-2: Harden shell optionsPrefer failing fast on unset vars and pipeline errors.
Apply this diff:
-set -e +set -Eeuo pipefail
168-168: Parse the running script, not a path that assumes repo rootMakes Test 4 robust regardless of current working directory.
Apply this diff:
-if bash -n scripts/test-ami.sh; then +if bash -n "$0"; then
8-12: Remove unused color var (SC2034) or use itBLUE is never used; drop it to silence ShellCheck.
Apply this diff:
GREEN='\033[0;32m' RED='\033[0;31m' YELLOW='\033[1;33m' -BLUE='\033[0;34m' NC='\033[0m'
303-366: Optional: add a best-effort cleanup trapPrevents leaked containers/files if the script aborts mid-run.
Apply this diff (adjust as needed):
echo "9. Testing PostgreSQL service startup..." echo "Starting PostgreSQL container to test database connectivity..." +cleanup() { + $COMPOSE -f /tmp/tn-test-compose.yml down -v >/dev/null 2>&1 || true + rm -f /tmp/tn-test-compose.yml /tmp/test-docker-compose.yml /tmp/test.env /tmp/test-env-gen.sh /tmp/test-config.sh /tmp/tn-update-test.sh >/dev/null 2>&1 || true +} +trap cleanup EXIT + cat > /tmp/tn-test-compose.yml << 'EOF'
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
deployments/infra/go.sumis excluded by!**/*.sum
📒 Files selected for processing (9)
.github/workflows/ami-build.yml(1 hunks).github/workflows/publish-node-image.yaml(1 hunks).github/workflows/release.yaml(2 hunks)deployments/infra/README.md(1 hunks)deployments/infra/ami-cdk.go(1 hunks)deployments/infra/cdk.test.json(1 hunks)deployments/infra/go.mod(2 hunks)deployments/infra/stacks/ami_pipeline_stack.go(1 hunks)scripts/test-ami.sh(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-22T18:35:49.455Z
Learnt from: outerlook
PR: trufnetwork/node#1170
File: .github/workflows/ami-build.yml:75-75
Timestamp: 2025-09-22T18:35:49.455Z
Learning: In the trufnetwork/node repository, the CDK stack naming convention uses `WithStackSuffix()` which generates names in the format `<stackName>-<stackSuffix>-Stack`. The AMI pipeline stack uses `AMI-Pipeline-default-Stack` because no `stackSuffix` is set in the CDK context, so it defaults to "default".
Applied to files:
.github/workflows/ami-build.ymldeployments/infra/stacks/ami_pipeline_stack.godeployments/infra/ami-cdk.go
📚 Learning: 2025-09-23T02:13:19.965Z
Learnt from: MicBun
PR: trufnetwork/node#1170
File: deployments/infra/stacks/ami_pipeline_stack.go:297-384
Timestamp: 2025-09-23T02:13:19.965Z
Learning: In the trufnetwork/node repository, the chain ID should always be "tn-v2.1" regardless of network configuration (mainnet/testnet). There are no other chains currently supported.
Applied to files:
deployments/infra/stacks/ami_pipeline_stack.go
🪛 actionlint (1.7.7)
.github/workflows/ami-build.yml
49-49: shellcheck reported issue in this script: SC2086:info:8:34: Double quote to prevent globbing and word splitting
(shellcheck)
49-49: shellcheck reported issue in this script: SC2016:info:14:7: Expressions don't expand in single quotes, use double quotes for that
(shellcheck)
49-49: shellcheck reported issue in this script: SC2086:info:16:40: Double quote to prevent globbing and word splitting
(shellcheck)
49-49: shellcheck reported issue in this script: SC2086:info:18:35: Double quote to prevent globbing and word splitting
(shellcheck)
72-72: shellcheck reported issue in this script: SC2016:info:11:5: Expressions don't expand in single quotes, use double quotes for that
(shellcheck)
72-72: shellcheck reported issue in this script: SC2086:info:13:38: Double quote to prevent globbing and word splitting
(shellcheck)
89-89: shellcheck reported issue in this script: SC2086:info:2:6: Double quote to prevent globbing and word splitting
(shellcheck)
94-94: shellcheck reported issue in this script: SC2086:info:15:38: Double quote to prevent globbing and word splitting
(shellcheck)
112-112: shellcheck reported issue in this script: SC2086:info:26:32: Double quote to prevent globbing and word splitting
(shellcheck)
🪛 Shellcheck (0.11.0)
scripts/test-ami.sh
[warning] 10-10: YELLOW appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 11-11: BLUE appears unused. Verify use (or export if used externally).
(SC2034)
🔇 Additional comments (2)
deployments/infra/stacks/ami_pipeline_stack.go (1)
34-36: Guard prevents resource synthesis under legit runs.Remove IsStackInSynthesis gate; it can cause empty stacks.
- if !config.IsStackInSynthesis(stack) { - return stack, AmiPipelineStackExports{} - } + // Proceed unconditionally; avoid gating on bundling heuristics.scripts/test-ami.sh (1)
83-156: Compose file: env defaults look fine; config check is a good guardNice use of docker-compose config to validate YAML. No action.
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (8)
deployments/infra/stacks/ami_pipeline_stack.go (5)
49-58: Make bucket retention stage-aware; avoid DESTROY/auto-delete in prod.Artifacts/logs should be retained in prod. Recommend prod: RETAIN + no AutoDeleteObjects; dev: DESTROY + AutoDeleteObjects.
Apply this diff (minimal inline approach):
- artifactsBucket := awss3.NewBucket(stack, jsii.String("AmiArtifactsBucket"), &awss3.BucketProps{ - RemovalPolicy: awscdk.RemovalPolicy_DESTROY, - AutoDeleteObjects: jsii.Bool(true), + autoDelete := jsii.Bool(string(stage) != "prod") + removal := awscdk.RemovalPolicy_DESTROY + if string(stage) == "prod" { removal = awscdk.RemovalPolicy_RETAIN } + artifactsBucket := awss3.NewBucket(stack, jsii.String("AmiArtifactsBucket"), &awss3.BucketProps{ + RemovalPolicy: removal, + AutoDeleteObjects: autoDelete, Versioned: jsii.Bool(false), PublicReadAccess: jsii.Bool(false), BlockPublicAccess: awss3.BlockPublicAccess_BLOCK_ALL(), Encryption: awss3.BucketEncryption_S3_MANAGED, EnforceSSL: jsii.Bool(true), })
78-95: Enforce IMDSv2 on builder instances.Require HttpTokens=required for instance metadata; defense-in-depth.
Apply this diff:
infraConfig := awsimagebuilder.NewCfnInfrastructureConfiguration(stack, jsii.String("AmiInfrastructureConfiguration"), &awsimagebuilder.CfnInfrastructureConfigurationProps{ Name: jsii.String(nameWithPrefix("tn-ami-infra-config-" + string(stage))), InstanceProfileName: instanceProfile.InstanceProfileName(), InstanceTypes: &[]*string{ jsii.String("t3.medium"), // Cost-effective for AMI building }, + InstanceMetadataOptions: &awsimagebuilder.CfnInfrastructureConfiguration_InstanceMetadataOptionsProperty{ + HttpTokens: jsii.String("required"), + HttpPutResponseHopLimit: jsii.Number(2), + },
137-144: Drop legacy docker-compose and use the v2 plugin everywhere.You install both the plugin and the legacy binary; keep one to reduce drift. Prefer v2 (“docker compose”); update systemd accordingly.
Apply these diffs:
- - name: InstallDockerCompose - action: ExecuteBash - inputs: - commands: - - sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose - - sudo chmod +x /usr/local/bin/docker-compose - - sudo ln -sf /usr/local/bin/docker-compose /usr/bin/docker-compose- ExecStart=/usr/bin/docker-compose up -d - ExecStop=/usr/bin/docker-compose down + ExecStart=/usr/bin/docker compose up -d + ExecStop=/usr/bin/docker compose downAlso applies to: 288-289
434-443: Encrypt root EBS volume explicitly.Be explicit even if the account default is “encrypt by default”.
Apply this diff:
Ebs: &awsimagebuilder.CfnImageRecipe_EbsInstanceBlockDeviceSpecificationProperty{ VolumeSize: jsii.Number(20), // 20GB root volume VolumeType: jsii.String("gp3"), + Encrypted: jsii.Bool(true), DeleteOnTermination: jsii.Bool(true), },
446-460: Parametrize distribution regions.You now target only the current region. Consider a context key amiRegions='["us-east-2","us-west-2"]' and map to Distributions to meet the multi‑region objective.
If helpful, I can provide a small helper to parse the JSON context and build the slice.
scripts/test-ami.sh (2)
1-2: Harden shell options.Use strict mode to catch subtle failures earlier.
Apply this diff:
-#!/bin/bash -set -e +#!/bin/bash +set -Eeuo pipefail +IFS=$'\n\t'
316-383: Ensure cleanup on error for DB test.Add a trap to always tear down the test stack to avoid orphaned containers.
+# Add near the top after COMPOSE detection: +cleanup() { + if [ -f /tmp/tn-test-compose.yml ]; then + $COMPOSE -f /tmp/tn-test-compose.yml down -v >/dev/null 2>&1 || true + fi +} +trap cleanup EXIT.github/workflows/release.yaml (1)
75-102: Add concurrency to avoid duplicate downstream dispatches.Prevents overlapping runs for the same ref if the workflow is re‑triggered.
permissions: contents: write packages: read # This is required for creating and modifying releases id-token: write # Required for workflow dispatch actions: write + +concurrency: + group: release-${{ github.ref }} + cancel-in-progress: falseIf you prefer cancellation on re-runs for the same tag, set cancel-in-progress: true.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
.github/workflows/ami-build.yml(1 hunks).github/workflows/release.yaml(3 hunks)deployments/infra/README.md(1 hunks)deployments/infra/ami-cdk.go(1 hunks)deployments/infra/cdk.test.json(1 hunks)deployments/infra/go.mod(3 hunks)deployments/infra/stacks/ami_pipeline_stack.go(1 hunks)scripts/test-ami.sh(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- deployments/infra/cdk.test.json
- deployments/infra/ami-cdk.go
- deployments/infra/go.mod
- .github/workflows/ami-build.yml
- deployments/infra/README.md
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-22T18:35:49.455Z
Learnt from: outerlook
PR: trufnetwork/node#1170
File: .github/workflows/ami-build.yml:75-75
Timestamp: 2025-09-22T18:35:49.455Z
Learning: In the trufnetwork/node repository, the CDK stack naming convention uses `WithStackSuffix()` which generates names in the format `<stackName>-<stackSuffix>-Stack`. The AMI pipeline stack uses `AMI-Pipeline-default-Stack` because no `stackSuffix` is set in the CDK context, so it defaults to "default".
Applied to files:
deployments/infra/stacks/ami_pipeline_stack.go
📚 Learning: 2025-09-23T02:13:19.965Z
Learnt from: MicBun
PR: trufnetwork/node#1170
File: deployments/infra/stacks/ami_pipeline_stack.go:297-384
Timestamp: 2025-09-23T02:13:19.965Z
Learning: In the trufnetwork/node repository, the chain ID should always be "tn-v2.1" regardless of network configuration (mainnet/testnet). There are no other chains currently supported.
Applied to files:
deployments/infra/stacks/ami_pipeline_stack.go
🧬 Code graph analysis (1)
deployments/infra/stacks/ami_pipeline_stack.go (2)
deployments/infra/config/is_stack_in_synthesis.go (1)
IsStackInSynthesis(8-17)deployments/infra/config/config.go (2)
GetStage(83-103)GetDevPrefix(108-129)
🪛 Shellcheck (0.11.0)
scripts/test-ami.sh
[warning] 10-10: YELLOW appears unused. Verify use (or export if used externally).
(SC2034)
[warning] 11-11: BLUE appears unused. Verify use (or export if used externally).
(SC2034)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: acceptance-test
🔇 Additional comments (4)
scripts/test-ami.sh (1)
430-459: Gated summary looks good.Now prints success only when zero failures, with a concise summary. Nice.
.github/workflows/release.yaml (1)
3-7: Quoting ‘on’ is correct.Prevents YAML keyword pitfalls; trigger set is clear.
deployments/infra/stacks/ami_pipeline_stack.go (2)
34-36: Remove unreliable IsStackInSynthesis guard (can emit an empty stack).This predicate is not a reliable synth indicator; it can skip resource creation and produce an empty template. Proceed unconditionally or gate behind an explicit context flag.
Apply this diff:
- if !config.IsStackInSynthesis(stack) { - return stack, AmiPipelineStackExports{} - } + // Proceed with resource definition; avoid gating on BundlingRequired().
377-381: Avoid external IP egress; use EC2 IMDS for public IP.Leaking instance IP via ifconfig.* is unnecessary and flaky; prefer the metadata service with a local fallback.
Apply this diff:
- if [ "$ENABLE_MCP" = true ]; then - # Get public IP with fallback to local IP - PUBLIC_IP=$(curl -s --connect-timeout 5 ifconfig.co 2>/dev/null || curl -s --connect-timeout 5 ifconfig.me 2>/dev/null || hostname -I | awk '{print $1}' || echo "localhost") - echo "MCP server will be available at: http://$PUBLIC_IP:8000/sse" - fi + if [ "$ENABLE_MCP" = true ]; then + IPV4=$(curl -s --max-time 2 http://169.254.169.254/latest/meta-data/public-ipv4 || hostname -I | awk '{print $1}' || echo "localhost") + echo "MCP server will be available at: http://${IPV4}:8000/sse" + fi
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (9)
scripts/test-ami.sh (5)
287-305: Avoid network-heavy pulls; check manifests instead.Use
docker manifest inspect(ordocker image inspectif locally cached) to validate image availability without downloading layers.Apply:
-echo "Checking kwildb/postgres:16.8-1..." -if docker pull kwildb/postgres:16.8-1; then +echo "Checking kwildb/postgres:16.8-1..." +if docker manifest inspect kwildb/postgres:16.8-1 >/dev/null 2>&1; then @@ -echo "Checking crystaldba/postgres-mcp:latest..." -if docker pull crystaldba/postgres-mcp:latest; then +echo "Checking crystaldba/postgres-mcp:latest..." +if docker manifest inspect crystaldba/postgres-mcp:latest >/dev/null 2>&1; then
344-377: Ensure cleanup on failure; add trap.If an earlier step exits (set -e), containers/files may persist. Add a trap to always teardown.
Example:
+cleanup() { + $COMPOSE -f /tmp/tn-test-compose.yml down -v >/dev/null 2>&1 || true + rm -f /tmp/tn-test-compose.yml /tmp/test-docker-compose.yml /tmp/test.env /tmp/test-config.sh /tmp/test-env-gen.sh /tmp/tn-update-test.sh 2>/dev/null || true +} +trap cleanup EXITPlace near the top after COMPOSE is resolved.
1-3: Harden shell options.Add
-uand-o pipefailto catch unset vars and pipeline errors early.-set -e +set -euo pipefail
12-21: Prefer command detection for legacy binary.
docker-compose versioninvokes the tool; checking existence is cheaper and avoids noisy errors.-elif docker-compose version >/dev/null 2>&1; then +elif command -v docker-compose >/dev/null 2>&1; then
425-454: Quote numeric tests.Avoid word-splitting/empty var edge cases.
-if [ $TESTS_FAILED -eq 0 ]; then +if [ "${TESTS_FAILED}" -eq 0 ]; then @@ -else +else @@ - echo "📊 Test Results: $TESTS_PASSED/$TOTAL_TESTS tests passed, $TESTS_FAILED failed" + echo "📊 Test Results: ${TESTS_PASSED}/${TOTAL_TESTS} tests passed, ${TESTS_FAILED} failed".github/workflows/ami-build.yml (4)
70-76: Bootstrap CDK before first deploy.Prevents failures in fresh accounts/regions.
cd deployments/infra echo "Deploying AMI pipeline infrastructure..." + cdk bootstrap --require-approval never cdk deploy AMI-Pipeline-default-Stack --require-approval never
92-105: Guard against empty PIPELINE_ARN.Fail fast with a clear error.
- name: Trigger AMI build run: | - echo "Starting AMI build with pipeline: $PIPELINE_ARN" + if [ -z "${PIPELINE_ARN:-}" ]; then + echo "PIPELINE_ARN is empty; aborting." >&2 + exit 1 + fi + echo "Starting AMI build with pipeline: $PIPELINE_ARN"
126-135: Expose all distributed AMIs, not just the first.Image Builder can output multiple region AMIs; capture and print them.
- AMI_ID=$(aws imagebuilder get-image \ + AMI_ID=$(aws imagebuilder get-image \ --image-build-version-arn "$EXECUTION_ID" \ --region "${{ env.AWS_REGION }}" \ --query 'image.outputResources.amis[0].image' \ --output text) + AMI_TABLE=$(aws imagebuilder get-image \ + --image-build-version-arn "$EXECUTION_ID" \ + --region "${{ env.AWS_REGION }}" \ + --query 'image.outputResources.amis[].{Region:region,AMI:image}' \ + --output table) echo "AMI ID: $AMI_ID" echo "AMI_ID=$AMI_ID" >> "$GITHUB_ENV" + echo "AMI_TABLE<<EOF" >> "$GITHUB_ENV" + echo "$AMI_TABLE" >> "$GITHUB_ENV" + echo "EOF" >> "$GITHUB_ENV"Optionally append
${AMI_TABLE}to the release note for full visibility.
42-45: Pin CDK major to reduce surprise breakages.Use a stable major (v2) instead of
latest.- npm install -g aws-cdk@latest + npm install -g aws-cdk@2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
.github/workflows/ami-build.yml(1 hunks).github/workflows/release.yaml(3 hunks)deployments/infra/README.md(1 hunks)deployments/infra/ami-cdk.go(1 hunks)deployments/infra/cdk.test.json(1 hunks)deployments/infra/go.mod(3 hunks)deployments/infra/stacks/ami_pipeline_stack.go(1 hunks)scripts/test-ami.sh(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- deployments/infra/cdk.test.json
- deployments/infra/ami-cdk.go
- deployments/infra/stacks/ami_pipeline_stack.go
- deployments/infra/go.mod
- deployments/infra/README.md
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-22T18:35:49.455Z
Learnt from: outerlook
PR: trufnetwork/node#1170
File: .github/workflows/ami-build.yml:75-75
Timestamp: 2025-09-22T18:35:49.455Z
Learning: In the trufnetwork/node repository, the CDK stack naming convention uses `WithStackSuffix()` which generates names in the format `<stackName>-<stackSuffix>-Stack`. The AMI pipeline stack uses `AMI-Pipeline-default-Stack` because no `stackSuffix` is set in the CDK context, so it defaults to "default".
Applied to files:
.github/workflows/ami-build.yml
📚 Learning: 2025-09-23T02:13:19.965Z
Learnt from: MicBun
PR: trufnetwork/node#1170
File: deployments/infra/stacks/ami_pipeline_stack.go:297-384
Timestamp: 2025-09-23T02:13:19.965Z
Learning: In the trufnetwork/node repository, the chain ID should always be "tn-v2.1" regardless of network configuration (mainnet/testnet). There are no other chains currently supported.
Applied to files:
scripts/test-ami.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: acceptance-test
- GitHub Check: lint
🔇 Additional comments (3)
scripts/test-ami.sh (1)
119-157: Profiles drift risk vs AMI defaults.Repo guidance sets COMPOSE_PROFILES only when MCP is enabled (no “node” default). Ensure this compose template stays aligned with the AMI compose to prevent test/config drift.
.github/workflows/release.yaml (1)
75-102: Verified — target workflows expose workflow_dispatch.
Both .github/workflows/ami-build.yml and .github/workflows/publish-node-image.yaml include workflow_dispatch, so the createWorkflowDispatch calls will not 404..github/workflows/ami-build.yml (1)
212-218: Bug: undefined variable ‘amiDetails’ breaks release update.Replace with the constructed
comment.- body: (currentRelease.data.body || '') + amiDetails + body: (currentRelease.data.body || '') + comment
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
scripts/test-ami.sh (4)
13-24: Avoid eval for Compose; use a function or array to prevent quoting pitfallseval with a space-containing command string is brittle. Prefer a wrapper that preserves argv boundaries.
Apply this diff to introduce a safe wrapper and use it for subsequent calls (replace other eval "$COMPOSE ..." similarly):
if docker compose version >/dev/null 2>&1; then COMPOSE="docker compose" elif command -v docker-compose >/dev/null 2>&1; then COMPOSE="docker-compose" @@ -echo "Using Docker Compose: $COMPOSE" +echo "Using Docker Compose: $COMPOSE" + +# Wrapper to call Compose safely without eval +run_compose() { + # shellcheck disable=SC2086 # intentional word splitting for $COMPOSE + $COMPOSE "$@" +}Example replacements:
-if eval "$COMPOSE -f /tmp/test-docker-compose.yml config" > /dev/null 2>&1; then +if run_compose -f /tmp/test-docker-compose.yml config > /dev/null 2>&1; then @@ -if eval "$COMPOSE -f /tmp/tn-test-compose.yml up -d kwil-postgres"; then +if run_compose -f /tmp/tn-test-compose.yml up -d kwil-postgres; then @@ - if eval "$COMPOSE -f /tmp/tn-test-compose.yml exec -T kwil-postgres pg_isready -U kwild" > /dev/null 2>&1; then + if run_compose -f /tmp/tn-test-compose.yml exec -T kwil-postgres pg_isready -U kwild > /dev/null 2>&1; then @@ - if eval "$COMPOSE -f /tmp/tn-test-compose.yml exec -T kwil-postgres psql -U kwild -d kwild -c \"SELECT version();\"" > /dev/null 2>&1; then + if run_compose -f /tmp/tn-test-compose.yml exec -T kwil-postgres psql -U kwild -d kwild -c "SELECT version();" > /dev/null 2>&1; then @@ -eval "$COMPOSE -f /tmp/tn-test-compose.yml down -v" +run_compose -f /tmp/tn-test-compose.yml down -v
79-101: Don’t fail YAML test when yamllint is absent and PyYAML isn’t installedCurrent fallback will mark workflows invalid if PyYAML isn’t present. Treat missing PyYAML as a skipped check to reduce flakiness.
Apply this diff to gracefully skip (and count as pass) when PyYAML is missing:
else echo "Using Python YAML parser (yamllint not available)" - if python3 -c " -import yaml -import sys -try: - with open('.github/workflows/ami-build.yml') as f: - yaml.safe_load(f) - with open('.github/workflows/release.yaml') as f: - yaml.safe_load(f) - with open('.github/workflows/publish-node-image.yaml') as f: - yaml.safe_load(f) - sys.exit(0) -except Exception as e: - print(f'YAML Error: {e}') - sys.exit(1) -" 2>/dev/null; then + if python3 -c "import yaml" >/dev/null 2>&1; then + if python3 - <<'PY' 2>/dev/null; then +import yaml, sys +for p in ['.github/workflows/ami-build.yml', + '.github/workflows/release.yaml', + '.github/workflows/publish-node-image.yaml']: + with open(p) as f: + yaml.safe_load(f) +PY + echo -e "${GREEN}✅ GitHub Actions workflow syntax valid${NC}" + TESTS_PASSED=$((TESTS_PASSED + 1)) + else + echo -e "${RED}❌ GitHub Actions workflow syntax invalid${NC}" + TESTS_FAILED=$((TESTS_FAILED + 1)) + fi else - echo -e "${GREEN}✅ GitHub Actions workflow syntax valid${NC}" - TESTS_PASSED=$((TESTS_PASSED + 1)) - else - echo -e "${RED}❌ GitHub Actions workflow syntax invalid${NC}" - TESTS_FAILED=$((TESTS_FAILED + 1)) + echo "PyYAML not installed; skipping YAML validation" + TESTS_PASSED=$((TESTS_PASSED + 1)) fi fi
202-208: Avoid go mod tidy in a test; build all packages insteadgo mod tidy mutates go.mod/go.sum and can cause noisy diffs. Build ./... to cover all packages without modifying modules.
Apply this diff:
-if go mod tidy && go build ./lib/... ./stacks/... > /dev/null 2>&1; then +if go build ./... > /dev/null 2>&1; then
326-355: Remove hardcoded container_name to avoid collisions with existing containersHardcoded names can fail if a container with the same name exists. Let Compose auto-name containers.
Apply this diff in the Postgres-only test Compose:
kwil-postgres: image: kwildb/postgres:16.8-1 - container_name: tn-postgresAlso consider removing container_name from the earlier template (lines 111, 131, 154) to prevent conflicts even when only validating config.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
.github/workflows/ami-build.yml(1 hunks).github/workflows/release.yaml(3 hunks)deployments/infra/cdk.context.json(1 hunks)deployments/infra/stacks/ami_pipeline_stack.go(1 hunks)scripts/test-ami.sh(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- deployments/infra/cdk.context.json
🚧 Files skipped from review as they are similar to previous changes (2)
- .github/workflows/ami-build.yml
- deployments/infra/stacks/ami_pipeline_stack.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-23T02:13:19.965Z
Learnt from: MicBun
PR: trufnetwork/node#1170
File: deployments/infra/stacks/ami_pipeline_stack.go:297-384
Timestamp: 2025-09-23T02:13:19.965Z
Learning: In the trufnetwork/node repository, the chain ID should always be "tn-v2.1" regardless of network configuration (mainnet/testnet). There are no other chains currently supported.
Applied to files:
scripts/test-ami.sh
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint
- GitHub Check: acceptance-test
🔇 Additional comments (4)
scripts/test-ami.sh (1)
129-137: CHAIN_ID default aligned with repo standard (tn-v2.1)Good alignment with infra and team guidance.
.github/workflows/release.yaml (3)
17-19: Concurrency group per-ref looks goodPrevents overlapping runs per tag while allowing different tags to proceed independently.
68-78: Keep token input — softprops/action-gh-release@v2 acceptstoken(defaults to${{ github.token }})
The action supports awith.tokeninput; the explicittoken: ${{ secrets.GITHUB_TOKEN }}is valid but optional.Likely an incorrect or invalid review comment.
79-106: No action required — target workflows already declare workflow_dispatchBoth .github/workflows/ami-build.yml and .github/workflows/publish-node-image.yaml include workflow_dispatch, so the createWorkflowDispatch calls are valid.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
deployments/infra/stacks/docker-compose.template.yml (3)
11-12: Restrict Postgres port to loopbackPrevent exposing the DB externally by binding to localhost.
Apply:
- - "5432:5432" + - "127.0.0.1:5432:5432"
22-45: Add pull policy and healthcheck for tn-nodeAlign with “always-latest” strategy and enable dependable startup ordering.
Apply:
tn-node: image: ghcr.io/trufnetwork/node:latest + pull_policy: always container_name: tn-node @@ restart: unless-stopped + healthcheck: + test: ["CMD-SHELL", "curl -fsS http://localhost:8080/health || exit 1"] + interval: 10s + timeout: 5s + retries: 12Note: ensure curl is available in the image or switch to a binary your image provides.
46-62: Add pull policy and make depends_on health-aware for MCPEnsure latest image is pulled and wait on healthy deps (requires tn-node healthcheck above).
Apply:
postgres-mcp: image: crystaldba/postgres-mcp:latest + pull_policy: always container_name: tn-mcp @@ - depends_on: - - kwil-postgres - - tn-node + depends_on: + kwil-postgres: + condition: service_healthy + tn-node: + condition: service_healthyscripts/test-ami.sh (2)
13-22: Avoid eval; use an array for Compose commandUsing eval is unnecessary and risk‑prone. Store the Compose CLI as an array and invoke it safely.
Apply:
-# Detect Docker Compose command -if docker compose version >/dev/null 2>&1; then - COMPOSE="docker compose" -elif command -v docker-compose >/dev/null 2>&1; then - COMPOSE="docker-compose" +# Detect Docker Compose command +if docker compose version >/dev/null 2>&1; then + COMPOSE=(docker compose) +elif command -v docker-compose >/dev/null 2>&1; then + COMPOSE=(docker-compose) @@ -echo "Using Docker Compose: $COMPOSE" +echo "Using Docker Compose: ${COMPOSE[*]}" @@ - eval "$COMPOSE -f /tmp/tn-test-compose.yml down -v" >/dev/null 2>&1 || true + "${COMPOSE[@]}" -f /tmp/tn-test-compose.yml down -v >/dev/null 2>&1 || true @@ -if eval "$COMPOSE -f /tmp/test-docker-compose.yml config" > /dev/null 2>&1; then +if "${COMPOSE[@]}" -f /tmp/test-docker-compose.yml config > /dev/null 2>&1; then @@ -if eval "$COMPOSE -f /tmp/tn-test-compose.yml up -d kwil-postgres"; then +if "${COMPOSE[@]}" -f /tmp/tn-test-compose.yml up -d kwil-postgres; then @@ - if eval "$COMPOSE -f /tmp/tn-test-compose.yml exec -T kwil-postgres pg_isready -U kwild" > /dev/null 2>&1; then + if "${COMPOSE[@]}" -f /tmp/tn-test-compose.yml exec -T kwil-postgres pg_isready -U kwild > /dev/null 2>&1; then @@ - if eval "$COMPOSE -f /tmp/tn-test-compose.yml exec -T kwil-postgres psql -U kwild -d kwild -c \"SELECT version();\"" > /dev/null 2>&1; then + if "${COMPOSE[@]}" -f /tmp/tn-test-compose.yml exec -T kwil-postgres psql -U kwild -d kwild -c "SELECT version();" > /dev/null 2>&1; then @@ -eval "$COMPOSE -f /tmp/tn-test-compose.yml down -v" +"${COMPOSE[@]}" -f /tmp/tn-test-compose.yml down -vAlso applies to: 24-25, 29-30, 110-116, 261-269, 269-269, 292-292
79-101: YAML fallback requires PyYAML; handle missing module explicitlyIf PyYAML isn’t installed, this test fails regardless of YAML validity. Detect and error with a clear message, or mark as skipped.
Apply:
- echo "Using Python YAML parser (yamllint not available)" - if python3 -c " + echo "Using Python YAML parser (yamllint not available)" + if python3 - <<'PY' +try: + import yaml # type: ignore +except ModuleNotFoundError: + print('YAML Error: PyYAML not installed; install yamllint or PyYAML to enable this test.') + raise +import sys try: - with open('.github/workflows/ami-build.yml') as f: - yaml.safe_load(f) - with open('.github/workflows/release.yaml') as f: - yaml.safe_load(f) - with open('.github/workflows/publish-node-image.yaml') as f: - yaml.safe_load(f) - sys.exit(0) + for p in ('.github/workflows/ami-build.yml', + '.github/workflows/release.yaml', + '.github/workflows/publish-node-image.yaml'): + with open(p, 'r', encoding='utf-8') as f: + yaml.safe_load(f) + sys.exit(0) except Exception as e: - print(f'YAML Error: {e}') + print(f'YAML Error: {e}') sys.exit(1) -" 2>/dev/null; then +PY + then
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
deployments/infra/stacks/ami_pipeline_stack.go(1 hunks)deployments/infra/stacks/docker-compose.template.yml(1 hunks)scripts/test-ami.sh(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- deployments/infra/stacks/ami_pipeline_stack.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-23T02:13:19.979Z
Learnt from: MicBun
PR: trufnetwork/node#1170
File: deployments/infra/stacks/ami_pipeline_stack.go:297-384
Timestamp: 2025-09-23T02:13:19.979Z
Learning: In the trufnetwork/node repository, the chain ID should always be "tn-v2.1" regardless of network configuration (mainnet/testnet). There are no other chains currently supported.
Applied to files:
scripts/test-ami.sh
🪛 Checkov (3.2.334)
deployments/infra/stacks/docker-compose.template.yml
[medium] 27-28: Basic Auth Credentials
(CKV_SECRET_4)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: acceptance-test
🔇 Additional comments (2)
deployments/infra/stacks/docker-compose.template.yml (1)
1-70: Overall compose topology looks solidProfiles, volumes, networks, and Postgres healthcheck are sensible defaults.
Please confirm the tn-node exposes 8080/health (or provide the correct path) so the healthcheck can be wired accurately.
scripts/test-ami.sh (1)
147-172: Good coverage and flow; counters, gating, and cleanups look solidTest steps are cohesive, with proper pass/fail accounting and teardown.
Consider switching hard-coded /tmp paths to mktemp-created temp dirs to avoid collisions in parallel runs. Want a patch?
Also applies to: 189-212, 227-251, 253-294, 299-336, 338-370
Implement automated AMI building pipeline to reduce developer onboarding from 3+ hours to under 10 minutes.
AWS CDK Infrastructure: EC2 Image Builder pipeline with isolated deployment
Docker-in-AMI: Pre-configured containers for node, database, and MCP server
Always-Latest Strategy: Containers pull latest images on startup
Simple Configuration: 3-command setup with
tn-node-configurescriptMulti-Region Distribution: AMI available across AWS regions
deployments/infra/ami-cdk.go- Isolated AMI deployment applicationdeployments/infra/stacks/ami_pipeline_stack.go- Main CDK stack✅ Successfully deployed in us-east-2
Additional issue will be handled separately to limit scope.
Removes major adoption barrier by providing one-click deployment alternative to complex manual setup.
resolves: #1131
resolves: #1134
Summary by CodeRabbit
New Features
Documentation
Tests
Chores