Skip to content

chore: simplify AMI deployment and fix overwriting config#1173

Merged
MicBun merged 2 commits into
mainfrom
simplifyAmi
Sep 25, 2025
Merged

chore: simplify AMI deployment and fix overwriting config#1173
MicBun merged 2 commits into
mainfrom
simplifyAmi

Conversation

@MicBun
Copy link
Copy Markdown
Contributor

@MicBun MicBun commented Sep 24, 2025

resolves: #1132

Summary by CodeRabbit

  • New Features

    • Node now preserves configuration and identity across restarts; startup messaging updated to reflect network branding.
  • Refactor

    • Database service renamed for consistency across the stack.
    • Standardized runtime paths and host references used by services.
  • Bug Fixes

    • Identity generation is now conditional to reduce unnecessary reinitialization.
    • Improved database connectivity by updating service references.
  • Chores

    • Updated environment templates (removed legacy DB_OWNER entry), service dependencies, and orchestration health checks.

@MicBun MicBun requested a review from outerlook September 24, 2025 07:15
@MicBun MicBun self-assigned this Sep 24, 2025
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Sep 24, 2025

Walkthrough

Removes the DB_OWNER line from the AMI .env template and renames the Postgres service from kwil-postgres to tn-postgres in Docker Compose. Updates TN node startup to prefer persistent /root/.kwild config, generate identity only if missing, and adjust DB and config paths accordingly.

Changes

Cohort / File(s) Summary
AMI env configuration
deployments/infra/stacks/ami_pipeline_stack.go
Removed the DB_OWNER line from the generated .env content used for TN node setup (no other control-flow or error-handling changes).
Docker Compose and node startup
deployments/infra/stacks/docker-compose.template.yml
Renamed service/container kwil-postgrestn-postgres, updated depends_on health conditions and DATABASE_URI, changed tn-node volume to persist at ./tn-data:/root/.kwild, and reworked startup script to check /root/.kwild/config.toml, conditionally generate/restore node identity, detect external IP, copy configs/genesis, convert provided private key to nodekey.json, and start the TN node with updated messages and paths.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Compose as docker-compose
  participant PG as tn-postgres
  participant Node as tn-node
  participant MCP as tn-mcp

  User->>Compose: docker-compose up
  Compose->>PG: start tn-postgres
  PG-->>Compose: healthy
  Compose->>Node: start tn-node (depends_on: tn-postgres healthy)
  Node->>Node: check /root/.kwild/config.toml
  alt config exists
    Node->>Node: load existing identity/config
  else no config
    Node->>Node: ensure curl/wget, fetch public IP, generate identity
    Node->>Node: copy genesis/configs, convert private key to nodekey.json
    Node->>PG: connect/init using host=tn-postgres
  end
  Node->>Node: start TRUF.NETWORK node
  Compose->>MCP: start tn-mcp (depends_on: tn-postgres healthy)
  MCP->>PG: connect via DATABASE_URI (host=tn-postgres)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • outerlook

Poem

I thump my paws—one line went shy,
The postgres name hops to tn on high.
My burrow keeps configs safe and sound,
If none exist, I craft them from the ground—
TRUF nodes hum as I bounce by. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Linked Issues Check ⚠️ Warning The pull request changes do not implement any of the acceptance criteria for issue #1132, as there is no configuration wrapper script, no pre-configuration of system services, no designed AMI file structure, nor any reduction of user setup commands provided in the diffs. Update the PR to include the wrapper script, pre-configured systemd and PostgreSQL services, a structured AMI layout with dependencies pre-installed, and a streamlined user setup flow to meet the linked issue objectives.
Out of Scope Changes Check ⚠️ Warning The pull request introduces extensive Docker Compose template and startup script modifications that are unrelated to the scope of simplifying the AWS AMI deployment pipeline defined by issue #1132, which primarily focuses on AMI configuration rather than container orchestration scripts. Remove or isolate the Docker Compose and startup script changes into a separate PR and focus this one solely on the AMI pipeline improvements.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title concisely highlights the two main modifications to the codebase by indicating the simplification of the AMI deployment process and the fix to prevent configuration overwrites, which aligns with the actual changes made to the AMI pipeline code and the Docker Compose startup script.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch simplifyAmi

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7019605 and 10741b9.

📒 Files selected for processing (2)
  • deployments/infra/stacks/ami_pipeline_stack.go (0 hunks)
  • deployments/infra/stacks/docker-compose.template.yml (3 hunks)
💤 Files with no reviewable changes (1)
  • deployments/infra/stacks/ami_pipeline_stack.go
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: outerlook
PR: trufnetwork/node#1168
File: tests/extensions/tn_digest/test_tn_digest.sh:15-15
Timestamp: 2025-09-19T18:59:51.942Z
Learning: In Docker Compose files, service names (like "tn-db:") can remain unchanged even when the underlying image is updated (e.g., from tn-db:local to ghcr.io/trufnetwork/node:local). Service names are internal network identifiers and don't need to match the actual image names.
📚 Learning: 2025-09-19T18:59:51.942Z
Learnt from: outerlook
PR: trufnetwork/node#1168
File: tests/extensions/tn_digest/test_tn_digest.sh:15-15
Timestamp: 2025-09-19T18:59:51.942Z
Learning: In Docker Compose files, service names (like "tn-db:") can remain unchanged even when the underlying image is updated (e.g., from tn-db:local to ghcr.io/trufnetwork/node:local). Service names are internal network identifiers and don't need to match the actual image names.

Applied to files:

  • deployments/infra/stacks/docker-compose.template.yml
📚 Learning: 2025-09-23T18:50:13.274Z
Learnt from: MicBun
PR: trufnetwork/node#1172
File: deployments/infra/stacks/docker-compose.template.yml:3-7
Timestamp: 2025-09-23T18:50:13.274Z
Learning: The Kwil system expects POSTGRES_HOST_AUTH_METHOD=trust for PostgreSQL database configuration in Docker deployments. This is an intentional configuration requirement for Kwil nodes, not a security oversight.

Applied to files:

  • deployments/infra/stacks/docker-compose.template.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: lint
  • GitHub Check: acceptance-test
🔇 Additional comments (1)
deployments/infra/stacks/docker-compose.template.yml (1)

58-60: Fix copy to preserve config directories and dotfiles.

cp /root/.kwild-new/* /root/.kwild/ skips dotfiles and fails on directories (e.g., config/), so the generated node config is left incomplete and the service starts without required files. Switch to an archive copy with the trailing dot so every file/dir (including hidden ones) is moved before deleting the temp directory.

-          cp /root/.kwild-new/* /root/.kwild/
+          cp -a /root/.kwild-new/. /root/.kwild/

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@holdex
Copy link
Copy Markdown

holdex Bot commented Sep 24, 2025

Time Submission Status

Member Status Time Action Last Update
MicBun ✅ Submitted 4h Update time Sep 25, 2025, 8:10 PM
@outerlook ❌ Missing - ⚠️ Submit time -

@MicBun MicBun marked this pull request as draft September 24, 2025 07:36
@MicBun
Copy link
Copy Markdown
Contributor Author

MicBun commented Sep 24, 2025

after the snapshot download, the block height stuck. I am debugging this

@MicBun MicBun marked this pull request as ready for review September 24, 2025 09:25
@MicBun MicBun changed the title chore: simplify AMI deployment and overwriting compose chore: simplify AMI deployment and fix overwriting config Sep 24, 2025
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
deployments/infra/stacks/docker-compose.template.yml (1)

64-74: Harden nodekey.json permissions.

Write key with restricted perms to avoid world/group-readable secrets.

Apply this diff:

-          echo '{\"key\":\"'$$CLEAN_KEY'\",\"type\":\"secp256k1\"}' > /root/.kwild/nodekey.json
-          echo 'Nodekey created successfully'
+          echo '{\"key\":\"'$$CLEAN_KEY'\",\"type\":\"secp256k1\"}' > /root/.kwild/nodekey.json
+          chmod 600 /root/.kwild/nodekey.json || true
+          echo 'Nodekey created successfully'
🧹 Nitpick comments (3)
deployments/infra/stacks/docker-compose.template.yml (3)

76-78: Use exec for proper signal handling and PID 1 reaping.

Replace the shell with the process so Compose stops cleanly.

Apply this diff:

-        /app/kwild start
+        exec /app/kwild start

3-3: Pin Postgres image to a version for reproducibility.

Avoid floating latest to prevent surprise upgrades.

Example:

image: kwildb/postgres:16

22-22: Verify tn-node tag strategy.

sha-eb8d9f0 looks like a moving tag; consider immutable digests or semver tags for AMI stability.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0779299 and 7019605.

📒 Files selected for processing (2)
  • deployments/infra/stacks/ami_pipeline_stack.go (1 hunks)
  • deployments/infra/stacks/docker-compose.template.yml (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • deployments/infra/stacks/ami_pipeline_stack.go
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: outerlook
PR: trufnetwork/node#1168
File: .github/workflows/publish-node-image.yaml:19-19
Timestamp: 2025-09-19T18:14:01.428Z
Learning: The ghcr.io/trufnetwork/tn-db image was never released, so no backwards compatibility or transitional measures are needed when renaming to ghcr.io/trufnetwork/node.
📚 Learning: 2025-09-19T18:59:51.942Z
Learnt from: outerlook
PR: trufnetwork/node#1168
File: tests/extensions/tn_digest/test_tn_digest.sh:15-15
Timestamp: 2025-09-19T18:59:51.942Z
Learning: In Docker Compose files, service names (like "tn-db:") can remain unchanged even when the underlying image is updated (e.g., from tn-db:local to ghcr.io/trufnetwork/node:local). Service names are internal network identifiers and don't need to match the actual image names.

Applied to files:

  • deployments/infra/stacks/docker-compose.template.yml
📚 Learning: 2025-09-19T18:14:01.428Z
Learnt from: outerlook
PR: trufnetwork/node#1168
File: .github/workflows/publish-node-image.yaml:19-19
Timestamp: 2025-09-19T18:14:01.428Z
Learning: The ghcr.io/trufnetwork/tn-db image was never released, so no backwards compatibility or transitional measures are needed when renaming to ghcr.io/trufnetwork/node.

Applied to files:

  • deployments/infra/stacks/docker-compose.template.yml
📚 Learning: 2025-09-23T18:50:13.233Z
Learnt from: MicBun
PR: trufnetwork/node#1172
File: deployments/infra/stacks/docker-compose.template.yml:3-7
Timestamp: 2025-09-23T18:50:13.233Z
Learning: The Kwil system expects POSTGRES_HOST_AUTH_METHOD=trust for PostgreSQL database configuration in Docker deployments. This is an intentional configuration requirement for Kwil nodes, not a security oversight.

Applied to files:

  • deployments/infra/stacks/docker-compose.template.yml
🔇 Additional comments (4)
deployments/infra/stacks/docker-compose.template.yml (4)

2-4: Service rename to tn-postgres is consistent and aligns references.

Matches updates in depends_on and DB URIs; internal DNS will resolve correctly.


34-35: Good use of health-gated dependency for Postgres.

Ensures tn-node waits for a ready DB.


84-84: DB URI host change looks correct.

Matches tn-postgres and trust auth model per Kwil requirement.


90-93: Confirm Compose supports depends_on.condition: service_started in your environment.

Some older Compose versions don’t honor conditions unless using the v2+ spec. If unsupported, tn-mcp may start too early.

If needed, switch to “service_healthy” with a tn-node healthcheck. Example (add under tn-node; adjust to what the image provides):

healthcheck:
  test: ["CMD", "/app/kwild", "status"]
  interval: 15s
  timeout: 5s
  retries: 10

Then:

depends_on:
  tn-node:
    condition: service_healthy

Comment thread deployments/infra/stacks/docker-compose.template.yml Outdated
Comment thread deployments/infra/stacks/docker-compose.template.yml
Comment thread deployments/infra/stacks/ami_pipeline_stack.go Outdated
@MicBun MicBun marked this pull request as draft September 24, 2025 19:29
@MicBun MicBun requested a review from outerlook September 25, 2025 20:00
@MicBun MicBun marked this pull request as ready for review September 25, 2025 20:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Problem(AWS AMI): Complex node setup prevents developer adoption

2 participants