docs: Z2LS Windows deploy spec for Chef cookbook (v0.32.2) #73

Open
priceflex wants to merge 3 commits into main from docs/z2ls-deploy-spec

Owner

@priceflex priceflex commented May 27, 2026

What

Adds docs/Z2LS-DEPLOY-SPEC.md — full deploy specification for Z2LS Windows gateways as of v0.32.2.

Why

The existing Z2LS-E2E-RUNBOOK.md is a debug runbook pinned to v0.30 and still says "DO NOT swap the relay to v0.30.13" — that note is obsolete now that v0.32.2 ships both relay and gateway with compatible wire format.

Need an authoritative input doc for writing the Chef cookbook that provisions Z2LS Windows boxes. Captures filesystem layout, NSSM service config, firewall rule, upgrade flow with version-pin verification, and v0.30→v0.32.2 architecture deltas (so we don't carry forward dead workarounds).

Details

Twelve sections covering:

  1. Architecture — full topology diagram with v0.32.2 multi-candidate flow
  2. v0.30 → v0.32.2 delta table — what changed and what didn't
  3. Filesystem layout — C:\TRS_Tools\ztlp\ (code, operator-managed) vs C:\ProgramData\ZTLP\ (runtime state, system-managed)
  4. config.toml + policy.toml — exact contents + Chef-attribute mapping
  5. NSSM service definition — every AppParameters flag with rationale
  6. Windows Firewall rule — and why all-profiles, not just private
  7. Dependency ordering — OpenSSH first, then NSSM, then identity-gen, etc.
  8. Upgrade flow — staging slot + atomic rename + version verification + auto-rollback on mismatch
  9. Healthcheck procedure — six verification steps for a post-converge test
  10. Future work — RBAC in policy, NS-name resolution, TPM-sealed identity, auto-enrollment via Bootstrap API
  11. Attribute summary — paste-ready attributes/default.rb block
  12. Reference snapshot — golden-file state for Kitchen tests
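
The staging slot, atomic rename, version verification, and auto-rollback named in item 8 can be sketched in a few lines of Ruby. This is a minimal illustration under stated assumptions, not the spec's procedure: the helper name `upgrade_binary`, the `.bak` backup suffix, and the injected `version_of` callable are all hypothetical, and a real converge would likely also need to stop the service before touching the binary.

```ruby
require 'fileutils'

# Hypothetical helper: swap a staged binary into place, check the version it
# reports, and restore the previous binary if the version does not match.
# `version_of` is any callable that returns the version string for a path;
# a cookbook would presumably shell out to `ztlp --version` here.
def upgrade_binary(live:, staged:, expected_version:, version_of:)
  backup = "#{live}.bak"      # assumed backup naming, not taken from the spec
  FileUtils.cp(live, backup)  # keep the old binary so rollback is possible
  File.rename(staged, live)   # single rename from the staging slot onto the live path

  return :upgraded if version_of.call(live) == expected_version

  File.rename(backup, live)   # version mismatch: put the old binary back
  :rolled_back
end
```

Passing the version check in as a callable keeps the swap logic testable without a real `ztlp.exe` on the box.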

Validation

Every concrete value in the doc was captured live from DESKTOP-LRC8DKH (10.170.3.111) on 2026-05-27 via:

  • nssm get ztlp_listener <key> for service config
  • sc.exe query ztlp_listener for service state
  • Get-Item C:\TRS_Tools\ztlp\ztlp.exe for binary size
  • Get-ChildItem C:\ProgramData\ZTLP for file inventory
  • type config.toml / policy.toml for content
  • netsh advfirewall firewall show rule for firewall rule

PowerShell snippets respect the documented traps:

  • Use sc.exe not sc (which is PowerShell-aliased to Set-Content)
  • Use cmd /c for chained & operations (PowerShell reserves &)
  • Note the NSSM UTF-16 round-trip pitfall
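
For cookbook code that reads `nssm get` output, the UTF-16 pitfall can be handled with a small decode step. The sketch below assumes the pitfall refers to NSSM emitting UTF-16LE bytes, which show up as interleaved NUL bytes when captured raw. The helper name `decode_nssm_output` is hypothetical.

```ruby
# Decode captured `nssm get` output, assuming it arrives as UTF-16LE bytes
# (optionally with a byte-order mark) rather than UTF-8.
def decode_nssm_output(raw_bytes)
  text = raw_bytes.dup
                  .force_encoding(Encoding::UTF_16LE)
                  .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
  # Drop a leading BOM if present, then trim CR/LF and padding.
  text.delete_prefix("\uFEFF").strip
end
```

Comparing the decoded string, rather than the raw bytes, avoids false "changed" results when a resource checks the current value against the desired one.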

Follow-up

Cookbook implementation is a separate PR. This doc is the input spec.

When the cookbook lands, this doc gets a 'Cookbook reference: ' line in the header.

🤖 Generated with Hermes Agent

Summary by CodeRabbit

  • Documentation
    • Added comprehensive deployment guide for Z2LS Windows gateway nodes, detailing system architecture, connection flows, configuration file setup and management, step-by-step deployment procedures, version upgrade and rollback workflows with backup retention policies, required toolchain dependencies and their precise ordering, and comprehensive post-deployment health checks and operational verification procedures.


## What

Adds docs/Z2LS-DEPLOY-SPEC.md — full deploy specification for the Z2LS
Windows gateway as of v0.32.2. Captures every artifact a Chef cookbook
author needs to provision a fresh Z2LS machine from zero, plus the
upgrade flow from earlier versions.

## Why

The existing Z2LS-E2E-RUNBOOK.md is a debug runbook pinned at v0.30
and still says 'DO NOT swap the relay to v0.30.13'. That note is
outdated now that v0.32.2 ships relay and gateway with a compatible
wire format.

We're about to write the Chef cookbook for Z2LS provisioning. Need
one authoritative document that:

  - Captures the live filesystem layout (verified on DESKTOP-LRC8DKH)
  - Documents every NSSM service parameter and what it does
  - Explains config.toml + policy.toml schemas
  - Lists the firewall rule
  - Specifies the upgrade flow with version-pin verification
  - Calls out pitfalls (NSSM UTF-16, version-string drift,
    identity-file ACLs, policy.toml deny-by-default contract,
    service-name derivation)
  - Documents the v0.30 → v0.32.2 architecture changes so cookbook
    v1 doesn't carry forward dead workarounds (e.g. the relay-pin
    note in the v0.30 runbook is now obsolete)

## Details

Twelve sections:
  1. Architecture diagram + topology
  2. v0.30 → v0.32.2 delta table (so authors know what NOT to copy
     from the old runbook)
  3. Filesystem layout — C:\TRS_Tools\ztlp\ vs C:\ProgramData\ZTLP\
  4. config.toml + policy.toml — exact contents + attribute mapping
  5. NSSM service definition — every key, every flag, with rationale
  6. Windows Firewall rule
  7. Dependency order (OpenSSH first, NSSM, identity gen, etc.)
  8. Upgrade flow — staging slot + atomic rename + version verification
  9. Post-converge healthcheck procedure
 10. Future work (RBAC, NS-name resolution, TPM, auto-enrollment)
 11. Ready-to-paste attributes/default.rb block
 12. Reference snapshot of live state for golden-file Kitchen tests

Cross-references DEPLOYMENT.md, Z2LS-E2E-RUNBOOK.md (with a note that
this doc supersedes it for v0.32.2+), WINDOWS-RELAY-SSH.md, and the
ztlp-bootstrap-enrollment skill for the future auto-enrollment work.

## Tests

N/A — pure documentation. Verified that every concrete value in the
doc matches what's live on DESKTOP-LRC8DKH (10.170.3.111) as of
2026-05-27 via SSH + nssm get + sc.exe query + Get-Item.

## Validation

Manually walked every code/command block. PowerShell snippets use
sc.exe (not PowerShell-aliased sc), respect the UTF-16 NSSM trap,
and use cmd /c chains where & has special meaning in PowerShell.

## Follow-up

Cookbook implementation: separate PR. This doc is the input spec.

coderabbitai Bot commented May 27, 2026

Warning

Review limit reached

@priceflex, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 13 minutes and 13 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 43d537ad-d269-4224-9992-627a99c914cd

📥 Commits

Reviewing files that changed from the base of the PR and between a439e75 and 4fbf157.

📒 Files selected for processing (1)
  • docs/Z2LS-DEPLOY-SPEC.md
📝 Walkthrough

This PR adds docs/Z2LS-DEPLOY-SPEC.md, an authoritative deployment specification for Z2LS Windows gateway nodes on ZTLP v0.32.2+. The document covers system architecture, filesystem layout, configuration contracts, NSSM service setup, Windows Firewall rules, deployment procedures, upgrade flows, and health verification steps.

Changes

Z2LS Windows Gateway Deployment Specification

| Layer / File(s) | Summary |
|---|---|
| Overview and Architecture Context (docs/Z2LS-DEPLOY-SPEC.md) | Document scope and audience, Z2LS system architecture with relay/NS components and multi-candidate discovery workflow, and version differences between v0.30 and v0.32.2 affecting configuration. |
| Deployment Layout and Configuration Contracts (docs/Z2LS-DEPLOY-SPEC.md) | Filesystem layout specification for operator-managed tools/logs and system-managed runtime state, identity generation rules and permissions, and Chef-rendered config.toml and policy.toml contracts with attribute mapping. |
| Service and Firewall Configuration (docs/Z2LS-DEPLOY-SPEC.md) | NSSM ztlp_listener service parameters, detailed flag-to-attribute mapping including --service-name computation and operational pitfalls, and Windows Firewall inbound UDP rule specification. |
| Deployment and Operational Procedures (docs/Z2LS-DEPLOY-SPEC.md) | Cookbook dependency ordering (OpenSSH, NSSM, identity generation, templates, firewall, service start), upgrade/rollback flow with SHA verification and backup strategy, and post-convergence verification checklist (service state, identity ACLs, UDP binding, NS candidate resolution, end-to-end SSH). |
| Reference Materials and Live State (docs/Z2LS-DEPLOY-SPEC.md) | Future work roadmap (RBAC, NS-name support, multi-relay failover, TPM-backed identity), attribute defaults snippet, verified live host state snapshot, and document history. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • priceflex/ztlp#61: Documents Z2LS Windows gateway deployment details including NSSM ztlp listen --service-name setup and Windows Firewall UDP rule configuration, which are now comprehensively specified in the authoritative deploy spec.

Poem

🐰 A gateway's path, now crystal clear,
From Windows service to firewall near,
With checksums, rules, and state to keep,
The spec goes deep—no more to leap! 📋✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding a Z2LS Windows deployment specification document for version v0.32.2 tied to Chef cookbook implementation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/Z2LS-DEPLOY-SPEC.md`:
- Around line 21-71: The unlabeled fenced code blocks in Z2LS-DEPLOY-SPEC.md
trigger markdownlint MD040; update each triple-backtick block that contains
ASCII diagrams or command output (the large network diagram block and the other
fenced blocks around lines indicated in the review: examples at the big diagram
and the blocks at 106-115, 125-131, 225-236, 257-259, 312-318, 367-380) by
adding an explicit language identifier (e.g., ```text or ```powershell for
command snippets) immediately after the opening backticks so all fenced blocks
are labeled consistently for linting and rendering.
- Around line 539-541: The snapshot in Section 12 currently claims the live
binary outputs `ztlp 0.31.0` which contradicts the spec’s authoritative version
`ztlp 0.32.2`; either update the snapshot so the `C:\TRS_Tools\ztlp\ztlp.exe`
entry and the `ztlp --version` line show `ztlp 0.32.2` and the expected byte
size, or mark Section 12 as a pre-upgrade capture by adding an explicit
timestamp and a clear “non-authoritative / pre-upgrade snapshot” note next to
the `ztlp 0.31.0` text so readers know it does not reflect the authoritative
v0.32.2 state.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7ca88fec-9e9e-4b48-86ff-5fbdb50d1771

📥 Commits

Reviewing files that changed from the base of the PR and between d3128fc and a439e75.

📒 Files selected for processing (1)
  • docs/Z2LS-DEPLOY-SPEC.md

Comment thread docs/Z2LS-DEPLOY-SPEC.md
Comment on lines +539 to +541
- `C:\TRS_Tools\ztlp\ztlp.exe` — size 10,505,216 bytes, `ztlp --version` reports
`ztlp 0.31.0` (older binary still on box — Steve's pending v0.32.2 deploy here).
**When the cookbook ships, that should become 10,xxx,xxx bytes reporting `ztlp 0.32.2`.**

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Snapshot section conflicts with the spec’s stated authoritative version

Line 540 says the live binary reports ztlp 0.31.0, which contradicts the document’s “authoritative v0.32.2” framing and can invalidate cookbook golden assertions. Either update this snapshot to true v0.32.2 state or explicitly mark Section 12 as a pre-upgrade capture with an exact timestamp and non-authoritative status.

Suggested doc correction
-- `C:\TRS_Tools\ztlp\ztlp.exe` — size 10,505,216 bytes, `ztlp --version` reports
-  `ztlp 0.31.0` (older binary still on box — Steve's pending v0.32.2 deploy here).
-  **When the cookbook ships, that should become 10,xxx,xxx bytes reporting `ztlp 0.32.2`.**
+- `C:\TRS_Tools\ztlp\ztlp.exe` — `ztlp --version` reports `ztlp 0.32.2` (captured 2026-05-27).
+  Size may vary by build metadata; assert version/SHA256 rather than byte count in Kitchen checks.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- `C:\TRS_Tools\ztlp\ztlp.exe` — size 10,505,216 bytes, `ztlp --version` reports
`ztlp 0.31.0` (older binary still on box — Steve's pending v0.32.2 deploy here).
**When the cookbook ships, that should become 10,xxx,xxx bytes reporting `ztlp 0.32.2`.**
- `C:\TRS_Tools\ztlp\ztlp.exe` — `ztlp --version` reports `ztlp 0.32.2` (captured 2026-05-27).
Size may vary by build metadata; assert version/SHA256 rather than byte count in Kitchen checks.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/Z2LS-DEPLOY-SPEC.md` around lines 539 - 541, The snapshot in Section 12
currently claims the live binary outputs `ztlp 0.31.0` which contradicts the
spec’s authoritative version `ztlp 0.32.2`; either update the snapshot so the
`C:\TRS_Tools\ztlp\ztlp.exe` entry and the `ztlp --version` line show `ztlp
0.32.2` and the expected byte size, or mark Section 12 as a pre-upgrade capture
by adding an explicit timestamp and a clear “non-authoritative / pre-upgrade
snapshot” note next to the `ztlp 0.31.0` text so readers know it does not
reflect the authoritative v0.32.2 state.
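
The review's advice to assert version and SHA256 rather than byte count could look roughly like the Ruby check below. It is a sketch only: the helper name `binary_matches_pin?` and its parameters are hypothetical, and the pinned digest would have to come from wherever the release artifacts are published, which this thread does not specify.

```ruby
require 'digest'

# Hypothetical check for a Kitchen-style test: compare the binary's SHA-256
# and its reported version string against pinned values, ignoring byte size.
def binary_matches_pin?(path:, expected_sha256:, reported_version:, expected_version:)
  actual_sha = Digest::SHA256.file(path).hexdigest # lowercase hex digest
  actual_sha == expected_sha256.downcase &&
    reported_version.strip == expected_version     # strip trailing CR/LF from CLI output
end
```

Here `reported_version` is the captured output of `ztlp --version`, for example `ztlp 0.32.2`, passed in by the caller so the check itself stays free of shell calls.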

priceflex added 2 commits May 27, 2026 23:19
Two different flags on opposite ends of a ZTLP connection — cookbook
authors only care about --service-name (listener-side). --service is
client-side and not a cookbook concern.

Adds a comparison table right after the NSSM parameters section, and
tightens the rationale on the --service-name row to mention the relay-
slug collision problem (default "ztlp-gateway" is unsuitable for any
multi-tenant relay).
## What

Adds a 'What --service values can operators use?' subsection to
docs/Z2LS-DEPLOY-SPEC.md, right after the --service-name vs --service
distinction.

## Why

The previous text only said --service was 'not a cookbook concern' but
didn't tell operators what they could actually type. Two consequences:

1. Operators reading the doc had to guess what's exposed beyond ssh.
2. Cookbook authors didn't see the connection between
   node['ztlp']['forwards'] (the source of truth) and what operators
   can use against the resulting box.

## Details

New subsection has five parts:

- **The rule** — --service NAME must match a --forward NAME:HOST:PORT
  on the gateway. Default Z2LS deploy only has ssh.

- **Adding more services** — table of common Windows services and the
  matching --forward entries (rdp/winrm/winrm-tls/smb/vnc/http/https/ipp).
  Two worked operator commands (RDP + WinRM).

- **Name validation rules** — from proto/src/tunnel.rs#parse_forward_arg:
  ASCII alphanumeric + dash + underscore only, max 253 bytes parse limit
  but 63-byte wire cap (CLIENT_ROUTE_MAX_SVC_LEN). winrm-tls is fine;
  winrm.tls and 'Remote Desktop' (space) are not.

- **Three quirks** —
  (1) the magic _default service when --forward has no NAME: prefix;
  (2) the gateway picks the forward by name, client controls which;
  (3) policy.toml gates --service in addition to --forward — both need
      to be kept in sync per cookbook attribute.

- **Security note** — adding RDP/SMB means every enrolled identity gets
  access until per-service RBAC ships (NS group_index in v0.32.2 isn't
  wired through policy yet). Concrete recommendation: ssh + maybe rdp
  for Z2LS today.
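
To make the rule concrete, here is a rough Ruby sketch of how a cookbook might turn `node['ztlp']['forwards']` into `--forward NAME:HOST:PORT` flags and answer which `--service` values operators can use. Only the attribute key and the flag format come from this commit message. The hash layout, the loopback host, and the helper names are assumptions made for illustration.

```ruby
# Assumed attribute shape: service name => { 'host' => ..., 'port' => ... }.
# The spec may define node['ztlp']['forwards'] differently.
EXAMPLE_FORWARDS = {
  'ssh' => { 'host' => '127.0.0.1', 'port' => 22 },
  'rdp' => { 'host' => '127.0.0.1', 'port' => 3389 }
}.freeze

# Build the listener-side flags, one --forward NAME:HOST:PORT pair per entry.
def forward_args(forwards)
  forwards.flat_map do |name, target|
    ['--forward', "#{name}:#{target['host']}:#{target['port']}"]
  end
end

# A client-side --service NAME is usable only if a forward with that name exists.
def service_exposed?(forwards, name)
  forwards.key?(name)
end
```

Deriving both the flags and the allowed service list from the same hash keeps the single source of truth in one attribute. The matching policy.toml entries would need to be rendered from that hash as well.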

## Validation

All facts cross-referenced from the actual source:
- MAX_SERVICE_NAME_LEN = 253 → proto/src/tunnel.rs:92
- DEFAULT_SERVICE = '_default' → proto/src/tunnel.rs:95
- CLIENT_ROUTE_MAX_SVC_LEN = 63 → proto/src/bin/ztlp-cli.rs:3788
- parse_forward_arg character class → proto/src/tunnel.rs:3187
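
The limits cited above can be combined into a small pre-flight validator for cookbook attributes. This Ruby sketch mirrors the stated rules (ASCII alphanumeric, dash, and underscore; 253-byte parse limit; 63-byte wire cap). It is not a port of `parse_forward_arg`, and rejecting the empty string is an assumption of the sketch.

```ruby
MAX_SERVICE_NAME_LEN = 253     # parse limit cited from proto/src/tunnel.rs
CLIENT_ROUTE_MAX_SVC_LEN = 63  # wire cap cited from proto/src/bin/ztlp-cli.rs
DEFAULT_SERVICE = '_default'   # name used when --forward has no NAME: prefix

# Returns nil when the name passes every check, otherwise a short reason.
# Hypothetical helper; the empty string fails the character check (assumption).
def service_name_problem(name)
  unless name.match?(/\A[A-Za-z0-9_-]+\z/)
    return 'only ASCII letters, digits, dash and underscore are allowed'
  end
  return "exceeds #{MAX_SERVICE_NAME_LEN}-byte parse limit" if name.bytesize > MAX_SERVICE_NAME_LEN
  return "exceeds #{CLIENT_ROUTE_MAX_SVC_LEN}-byte wire cap" if name.bytesize > CLIENT_ROUTE_MAX_SVC_LEN

  nil
end
```

Checking the wire cap at converge time catches names that would parse on the gateway but could not be requested by a client.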

No code changes — pure documentation.