108 changes: 106 additions & 2 deletions roles/bm_sno/README.md
@@ -63,6 +63,10 @@ provision IP via `/etc/hosts` entries managed by the role.
| `cifmw_bm_agent_live_debug` | bool | `false` | Patch the agent ISO with password, autologin, and systemd debug shell on `tty6` for discovery-phase console access (requires `cifmw_bm_agent_core_password`) |
| `cifmw_bm_agent_disabled_ifaces` | list | `[]` | Extra NIC names to disable IPv4/IPv6 on during agent-based install. Prevents overlapping-subnet validation failures when multiple NICs share a native VLAN (e.g. `[eno2]`). The interfaces stay link-up but get no IP address; post-install NNCP configures them. |
| `cifmw_bm_agent_lvms_partition` | dict | `{}` | When set, creates an Ignition partition at install time to cap CoreOS rootfs growth and leave unallocated space for the LVMS StorageClass. Keys: `device` (required, e.g. `/dev/nvme0n1`), `rootfs_mib` (default `150000`), `size_mib` (default `0` = rest of disk), `label` (default `lvmstorage`). See [LVMS partition](#lvms-partition). |
| `cifmw_bm_agent_reuse_vmedia` | bool | `false` | Skip ISO generation, HTTP server start/stop, and VirtualMedia eject/insert when the agent ISO is already mounted in the iDRAC (e.g. via the iDRAC web UI using a local file). When `true` the role goes straight to setting the one-time boot override and waiting for install. The `openshift-install` binary and working directory from the previous run must still be present on disk. |
| `cifmw_bm_agent_iso_server_ip` | str | `""` | IP address the iDRAC uses to fetch the agent ISO. When empty, the role auto-detects the controller's IP from nodepool metadata or `ansible_default_ipv4.address`. Set this when the auto-detected IP is not reachable by the iDRAC — for example, when running over VPN where the VPN interface IP must be used instead of the default-route IP. |
| `cifmw_bm_agent_node_vlan` | int | `0` | 802.1Q VLAN ID for the machine network. When non-zero, the generated `agent-config.yaml` creates a VLAN sub-interface (`<iface>.<vlan>`) on top of `cifmw_bm_agent_node_iface` and assigns the node IP there instead of the bare physical NIC. Set to `0` (default) when the machine-network VLAN arrives untagged (native) at the NIC. |
| `cifmw_bm_agent_additional_ntp_sources` | list | `[]` | NTP server hostnames or IPs added to `additionalNTPSources` in `agent-config.yaml`. These are baked into the agent ISO so `chronyd` can synchronize on first boot even in restricted networks. Without this, the Assisted Installer validation may reject the host with *"Host couldn't synchronize with any NTP server"* (see [KCS 7020898](https://access.redhat.com/solutions/7020898)). Example: `["clock.redhat.com"]`. |
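
As an illustration of the VPN case described for `cifmw_bm_agent_iso_server_ip`, a minimal sketch follows. The address is a placeholder from the documentation range (an assumption, not a value used by this role); substitute the IP of the controller interface that the iDRAC can actually reach.

```yaml
# vars.yaml
# Placeholder address (assumption): the controller's VPN-interface IP,
# used by the iDRAC to fetch the agent ISO instead of the default-route IP.
cifmw_bm_agent_iso_server_ip: "192.0.2.10"
```

Leaving the variable empty keeps the auto-detection behaviour described in the table.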

## Secrets management

@@ -100,6 +104,81 @@ The agent-based deployment is composed of reusable task files under
| `bm_patch_agent_iso.yml` | Patches the agent ISO ignition with core password, autologin, and debug shell on tty6 (used when `cifmw_bm_agent_live_debug` is true) |
| `bm_core_password_machineconfig.yml` | Generates a MachineConfig manifest to set the core user password hash post-install |

## Pre-mounted ISO (reuse VirtualMedia mode)

Use this when the agent ISO cannot be served over HTTP from the Ansible
controller to the iDRAC (for example: the iDRAC is on a network segment
unreachable from the controller, or VirtualMedia HTTP insertion fails
persistently). In this case mount the ISO manually in the iDRAC web UI via
*Virtual Media → Connect Virtual Media → Local File*, then set
`cifmw_bm_agent_reuse_vmedia: true` in your `vars.yaml` (or pass it as an
extra-var) and re-run the playbook.

### Two-playbook workflow

**Run 1 — generate the agent ISO** (`cifmw_bm_agent_reuse_vmedia: false`,
the default). Let the playbook run until the ISO is written to disk — you
do not need the VirtualMedia insert to succeed. Abort after the ISO
generation step if needed:

```yaml
# vars.yaml
cifmw_bm_agent_reuse_vmedia: false # default — explicit for clarity
```

After Run 1, the following artifacts exist in
`<cifmw_reproducer_basedir>/artifacts/agent-install/`:

- `openshift-install` — binary used for `wait-for` in Run 2
- `agent.x86_64.iso` — copy this to your local machine and upload via
the iDRAC web UI (`Virtual Media → Connect Virtual Media → Local File`)
- `agent_ssh_key` — cluster SSH key used by the installer

Confirm the iDRAC shows the drive as *Connected* before proceeding.

**Run 2 — boot from the pre-mounted ISO**:

```yaml
# vars.yaml (or -e on the ansible-playbook command line)
cifmw_bm_agent_reuse_vmedia: true
```

```bash
ansible-playbook -i inventory.yaml playbook.yaml \
  -e cifmw_bm_agent_reuse_vmedia=true
```

This run skips ISO generation, the podman HTTP server, and all VirtualMedia
eject/insert steps. It powers the host off, sets the UEFI one-time boot
override to the Virtual Optical Drive, powers the host back on, and waits
for `openshift-install agent wait-for install-complete`.

### What is skipped with `cifmw_bm_agent_reuse_vmedia: true`

- Removing stale agent state from the previous run
- ISO generation (`openshift-install agent create image`)
- ISO patching for live debug
- HTTP server start and stop (podman)
- VirtualMedia eject before insert
- VirtualMedia ISO insert
- VirtualMedia eject after install

### What still runs

- USB boot BIOS check / enable
- Power-off (so the host boots cleanly from the mounted ISO)
- SSH key generation (idempotent, reuses existing key)
- `openshift-install` binary acquisition (skipped when binary already present)
- Config template generation (idempotent)
- LVMS MachineConfig generation (idempotent)
- UEFI VirtualMedia target discovery and one-time boot override
- Power-on and install wait
- kubeconfig copy

**Prerequisite**: the `openshift-install` binary and the working directory
(`<cifmw_reproducer_basedir>/artifacts/agent-install/`) from Run 1 must
still be present on disk.
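
One way to confirm this prerequisite before starting Run 2 is a small pre-flight play. The sketch below is not part of the role: the play itself and the `_agent_dir` and `_agent_artifacts` names are hypothetical, and it assumes `cifmw_reproducer_basedir` is defined in the vars you pass in (for example with `-e @vars.yaml`).

```yaml
# Hypothetical pre-flight check (not part of the role): verify that the
# Run 1 artifacts still exist before starting Run 2.
- name: Check Run 1 artifacts
  hosts: localhost
  gather_facts: false
  vars:
    # Assumption: cifmw_reproducer_basedir is supplied by the caller.
    _agent_dir: "{{ cifmw_reproducer_basedir }}/artifacts/agent-install"
  tasks:
    - name: Stat the openshift-install binary and the SSH key
      ansible.builtin.stat:
        path: "{{ _agent_dir }}/{{ item }}"
      loop:
        - openshift-install
        - agent_ssh_key
      register: _agent_artifacts

    - name: Fail early when an artifact is missing
      ansible.builtin.assert:
        that:
          - item.stat.exists
        fail_msg: "{{ item.item }} not found in {{ _agent_dir }}; repeat Run 1 first."
      loop: "{{ _agent_artifacts.results }}"
      loop_control:
        label: "{{ item.item }}"
```

The check only looks at the two files named in the Run 1 artifact list; extend the `loop` if your workflow depends on others.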

## openshift-install acquisition

The `openshift-install` binary is obtained automatically via one of two
@@ -165,7 +244,7 @@ Test coverage:

Minimal vars.yaml for a bare metal SNO deployment:

```YAML
```yaml
cifmw_bm_sno: true
cifmw_bm_agent_cluster_name: ocp
cifmw_bm_agent_base_domain: example.com
@@ -181,6 +260,26 @@ cifmw_bm_nodes:
    root_device: /dev/sda
```

With a tagged machine-network VLAN and NTP sources (restricted network):

```yaml
cifmw_bm_sno: true
cifmw_bm_agent_cluster_name: sno
cifmw_bm_agent_base_domain: lab.example.local
cifmw_bm_agent_machine_network: "x.x.x.0/24"
cifmw_bm_agent_node_ip: "x.x.x.101"
cifmw_bm_agent_node_iface: eno17395np0 # physical NIC; VLAN sub-iface created automatically
cifmw_bm_agent_node_vlan: 1073 # machine network arrives 802.1Q-tagged
cifmw_bm_agent_additional_ntp_sources:
- clock.redhat.com
cifmw_bm_agent_bmc_host: x.x.x.151
cifmw_bm_agent_openshift_version: "4.18.3"

cifmw_bm_nodes:
  - mac: "D4:04:E6:F8:41:50"
    root_device: /dev/nvme1n1
```
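
For orientation, with `cifmw_bm_agent_node_vlan` and `cifmw_bm_agent_additional_ntp_sources` set as above, the relevant parts of the generated `agent-config.yaml` might look roughly like the sketch below. This is an assumption based on the upstream agent-config and NMState schemas, not a copy of the role's template; unrelated fields are omitted and the real output may differ.

```yaml
# Sketch only (assumed layout): values taken from the example vars above.
additionalNTPSources:
  - clock.redhat.com
hosts:
  - networkConfig:
      interfaces:
        # VLAN sub-interface named <iface>.<vlan> on top of the physical NIC
        - name: eno17395np0.1073
          type: vlan
          state: up
          vlan:
            base-iface: eno17395np0
            id: 1073
          ipv4:
            enabled: true
            dhcp: false
            address:
              - ip: x.x.x.101
                prefix-length: 24
```

With `cifmw_bm_agent_node_vlan: 0` the address would instead sit directly on the physical interface, as described in the parameter table.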

## Local debugging on an autoheld Zuul node

When a Zuul job is held (`autohold`), you can SSH into the Zuul controller
@@ -255,10 +354,15 @@ oc get nodes

For ssh access into SNO host:
```bash
ssh -i ~/ci-framework-data/artifacts/agent-install/agent_ssh_key \
ssh -o IdentitiesOnly=yes \
-i ~/ci-framework-data/artifacts/agent-install/agent_ssh_key \
core@<cluster>.<cifmw_bm_agent_base_domain>
```

`-o IdentitiesOnly=yes` is required when the local ssh-agent holds many keys —
the server's `MaxAuthTries` limit (default 6) is hit before the explicit key is
tried, resulting in *"Too many authentication failures"*.

Replace `<cluster>` with the value of `cifmw_bm_agent_cluster_name` (e.g.
`sno`).

9 changes: 9 additions & 0 deletions roles/bm_sno/defaults/main.yml
@@ -1,5 +1,6 @@
---
cifmw_bm_agent_iso_http_port: 80
cifmw_bm_agent_iso_server_ip: ""
cifmw_bm_agent_installer_timeout: 7200
cifmw_bm_agent_openshift_version: "4.18.3"
cifmw_bm_agent_core_password: "redhat"
@@ -18,3 +19,11 @@ cifmw_bm_agent_disabled_ifaces: []
# size_mib: 0 # 0 = rest of disk
# label: lvmstorage
cifmw_bm_agent_lvms_partition: {}

# Skip ISO generation, HTTP server, and VirtualMedia eject/insert when the
# agent ISO is already mounted in the iDRAC (e.g. via the iDRAC web UI using
# a local file). The playbook will go straight to setting the one-time boot
# override and waiting for the install to complete.
# The openshift-install binary and work directory from the previous run must
# still be present (they are not regenerated in this mode).
cifmw_bm_agent_reuse_vmedia: false
112 changes: 112 additions & 0 deletions roles/bm_sno/tasks/bm_discover_vmedia_member.yml
@@ -0,0 +1,112 @@
---
# Discover the correct VirtualMedia member URI for CD/DVD on this iDRAC.
# The member name varies across firmware versions (CD, RemovableDisk, 1, 2, …).
# On iDRAC 10+ the Managers VirtualMedia path requires NumericDynamicSegmentsEnable;
# the Systems path works without it and is tried as a fallback.
# You can manually set it with racadm set Redfish.1.NumericDynamicSegmentsEnable Enabled.
# Sets _vmedia_member_uri, _vmedia_insert_action, _vmedia_eject_action.
# Requires: _bmc_host, _bmc_creds, _redfish_headers

- name: Fetch VirtualMedia collection (Managers path)
  no_log: false
  ansible.builtin.uri:
    url: "https://{{ _bmc_host }}/redfish/v1/Managers/iDRAC.Embedded.1/VirtualMedia"
    method: GET
    headers: "{{ _redfish_headers }}"
    user: "{{ _bmc_creds.username }}"
    password: "{{ _bmc_creds.password }}"
    validate_certs: false
    force_basic_auth: true
    return_content: true
    status_code: [200, 404]
  register: _vmedia_collection_mgr

- name: Fetch VirtualMedia collection (Systems path fallback)
  when: _vmedia_collection_mgr.status == 404
  no_log: false
  ansible.builtin.uri:
    url: "https://{{ _bmc_host }}/redfish/v1/Systems/System.Embedded.1/VirtualMedia"
    method: GET
    headers: "{{ _redfish_headers }}"
    user: "{{ _bmc_creds.username }}"
    password: "{{ _bmc_creds.password }}"
    validate_certs: false
    force_basic_auth: true
    return_content: true
    status_code: [200]
  register: _vmedia_collection_sys

- name: Set active VirtualMedia collection result
  ansible.builtin.set_fact:
    _vmedia_collection: >-
      {{ _vmedia_collection_sys
      if (_vmedia_collection_mgr.status == 404)
      else _vmedia_collection_mgr }}

- name: Show VirtualMedia collection source
  ansible.builtin.debug:
    msg: >-
      VirtualMedia collection from
      {{ '/redfish/v1/Systems/System.Embedded.1/VirtualMedia'
      if (_vmedia_collection_mgr.status == 404)
      else '/redfish/v1/Managers/iDRAC.Embedded.1/VirtualMedia' }}
      ({{ _vmedia_collection.json.Members | length }} members)

- name: Fetch each VirtualMedia member detail
  no_log: false
  ansible.builtin.uri:
    url: "https://{{ _bmc_host }}{{ item['@odata.id'] }}"
    method: GET
    headers: "{{ _redfish_headers }}"
    user: "{{ _bmc_creds.username }}"
    password: "{{ _bmc_creds.password }}"
    validate_certs: false
    force_basic_auth: true
    return_content: true
    status_code: [200]
  register: _vmedia_members
  loop: "{{ _vmedia_collection.json.Members }}"
  loop_control:
    label: "{{ item['@odata.id'] | basename }}"

- name: Pick the first member that supports CD or DVD media types
  ansible.builtin.set_fact:
    _vmedia_member_uri: "{{ item.json['@odata.id'] }}"
    _vmedia_insert_action: >-
      {{ item.json.Actions['#VirtualMedia.InsertMedia'].target }}
    _vmedia_eject_action: >-
      {{ item.json.Actions['#VirtualMedia.EjectMedia'].target }}
  when:
    - _vmedia_member_uri is not defined
    - item.json.MediaTypes is defined
    - item.json.MediaTypes | intersect(['CD', 'DVD']) | length > 0
  loop: "{{ _vmedia_members.results }}"
  loop_control:
    label: "{{ item.json['@odata.id'] | basename }}"

- name: Fall back to first member with an InsertMedia action
  ansible.builtin.set_fact:
    _vmedia_member_uri: "{{ item.json['@odata.id'] }}"
    _vmedia_insert_action: >-
      {{ item.json.Actions['#VirtualMedia.InsertMedia'].target }}
    _vmedia_eject_action: >-
      {{ item.json.Actions['#VirtualMedia.EjectMedia'].target }}
  when:
    - _vmedia_member_uri is not defined
    - item.json.Actions is defined
    - "'#VirtualMedia.InsertMedia' in item.json.Actions"
  loop: "{{ _vmedia_members.results }}"
  loop_control:
    label: "{{ item.json['@odata.id'] | basename }}"

- name: Fail if no usable VirtualMedia member found
  when: _vmedia_member_uri is not defined
  ansible.builtin.fail:
    msg: >-
      No VirtualMedia member with InsertMedia support found.
      Members: {{ _vmedia_members.results |
      map(attribute='json') | map(attribute='@odata.id') | list | join(', ') }}

- name: Show discovered VirtualMedia member
  ansible.builtin.debug:
    msg: "VirtualMedia member: {{ _vmedia_member_uri }} — insert: {{ _vmedia_insert_action }}"
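
# ---------------------------------------------------------------------------
# Illustrative sketch only, NOT part of this change.
# Shows how a caller might consume the facts set above to insert an ISO.
# Assumptions: _iso_url is a hypothetical variable holding the HTTP URL of
# the agent ISO; the body keys follow the standard Redfish
# VirtualMedia.InsertMedia action and may need adjusting per firmware.
# ---------------------------------------------------------------------------
- name: Insert the agent ISO via the discovered action target (sketch)
  no_log: true
  ansible.builtin.uri:
    url: "https://{{ _bmc_host }}{{ _vmedia_insert_action }}"
    method: POST
    headers: "{{ _redfish_headers }}"
    user: "{{ _bmc_creds.username }}"
    password: "{{ _bmc_creds.password }}"
    validate_certs: false
    force_basic_auth: true
    body_format: json
    body:
      Image: "{{ _iso_url }}"
      Inserted: true
      WriteProtected: true
    status_code: [200, 204]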
53 changes: 37 additions & 16 deletions roles/bm_sno/tasks/bm_discover_vmedia_target.yml
@@ -2,6 +2,11 @@
# Discover or validate the UEFI device path for the iDRAC Virtual Optical Drive,
# clear any pending iDRAC config jobs, and set a one-time boot override.
# Requires: _bmc_host, _bmc_creds, _redfish_headers
#
# Boot override strategy (selected by cifmw_bm_agent_bios_onetimeboot_fqdd):
# unset / empty — standard Redfish PATCH /Systems/System.Embedded.1 (iDRAC ≤ 9)
# set to FQDD — BIOS pending-settings PATCH /Bios/Settings (iDRAC 10+)
# see bm_discover_vmedia_target_idrac10.yml
- name: Fetch UEFI boot option IDs
no_log: true
ansible.builtin.uri:
@@ -45,8 +50,12 @@
map(attribute='DisplayName', default='?') | zip(_known_uefi_paths) |
map('join', ' -> ') | list }}

# Skip UefiDevicePath validation when using the BIOS pending-settings approach
# (cifmw_bm_agent_bios_onetimeboot_fqdd set) — FQDD is validated by iDRAC itself.
- name: Validate user-provided VirtualMedia UEFI path
when: cifmw_bm_agent_vmedia_uefi_path | length > 0
when:
- cifmw_bm_agent_vmedia_uefi_path | length > 0
- cifmw_bm_agent_bios_onetimeboot_fqdd | default('') | length == 0
ansible.builtin.assert:
that:
- cifmw_bm_agent_vmedia_uefi_path in _known_uefi_paths
@@ -123,7 +132,9 @@
ansible.builtin.pause:
seconds: 10

- name: Set one-time boot from Virtual Optical Drive
# ── Standard one-time boot (iDRAC ≤ 9): PATCH Systems Boot property ──────────
- name: Set one-time boot — standard Redfish PATCH (iDRAC ≤ 9)
when: cifmw_bm_agent_bios_onetimeboot_fqdd | default('') | length == 0
no_log: true
ansible.builtin.uri:
url: "https://{{ _bmc_host }}/redfish/v1/Systems/System.Embedded.1"
@@ -141,7 +152,9 @@
force_basic_auth: true
status_code: [200, 204]

- name: Verify boot override was applied
- name: Verify boot override — standard (iDRAC ≤ 9)
when: cifmw_bm_agent_bios_onetimeboot_fqdd | default('') | length == 0
no_log: true
ansible.builtin.uri:
url: "https://{{ _bmc_host }}/redfish/v1/Systems/System.Embedded.1"
method: GET
@@ -152,30 +165,37 @@
force_basic_auth: true
return_content: true
status_code: [200]
register: _boot_verify
register: _boot_verify_standard

- name: Assert boot override is set
- name: Assert boot override is set — standard (iDRAC ≤ 9)
when: cifmw_bm_agent_bios_onetimeboot_fqdd | default('') | length == 0
ansible.builtin.assert:
that:
- _boot_verify.json.Boot.BootSourceOverrideTarget == 'UefiTarget'
- _boot_verify.json.Boot.BootSourceOverrideEnabled == 'Once'
- _boot_verify.json.Boot.UefiTargetBootSourceOverride | default('') | length > 0
- _boot_verify_standard.json.Boot.BootSourceOverrideTarget == 'UefiTarget'
- _boot_verify_standard.json.Boot.BootSourceOverrideEnabled == 'Once'
- _boot_verify_standard.json.Boot.UefiTargetBootSourceOverride | default('') | length > 0
fail_msg: >-
Boot override not applied.
Target: {{ _boot_verify.json.Boot.BootSourceOverrideTarget }},
Enabled: {{ _boot_verify.json.Boot.BootSourceOverrideEnabled }},
UefiPath: {{ _boot_verify.json.Boot.UefiTargetBootSourceOverride | default('empty') }}
Target: {{ _boot_verify_standard.json.Boot.BootSourceOverrideTarget }},
Enabled: {{ _boot_verify_standard.json.Boot.BootSourceOverrideEnabled }},
UefiPath: {{ _boot_verify_standard.json.Boot.UefiTargetBootSourceOverride | default('empty') }}

- name: Show resolved boot path
- name: Show resolved boot path — standard (iDRAC ≤ 9)
when: cifmw_bm_agent_bios_onetimeboot_fqdd | default('') | length == 0
ansible.builtin.debug:
msg: >-
Resolved boot path:
{{ _boot_verify.json.Boot.UefiTargetBootSourceOverride }}
msg: "Resolved boot path: {{ _boot_verify_standard.json.Boot.UefiTargetBootSourceOverride }}"

# ── iDRAC 10+ one-time boot: BIOS pending-settings ───────────────────────────
- name: Set one-time boot via BIOS pending settings (iDRAC 10+)
when: cifmw_bm_agent_bios_onetimeboot_fqdd | default('') | length > 0
ansible.builtin.include_tasks:
file: bm_discover_vmedia_target_idrac10.yml

- name: Verify VirtualMedia is still inserted
when: not cifmw_bm_agent_reuse_vmedia | bool
no_log: true
ansible.builtin.uri:
url: "https://{{ _bmc_host }}/redfish/v1/Managers/iDRAC.Embedded.1/VirtualMedia/CD"
url: "https://{{ _bmc_host }}{{ _vmedia_member_uri }}"
method: GET
headers: "{{ _redfish_headers }}"
user: "{{ _bmc_creds.username }}"
@@ -187,6 +207,7 @@
register: _vmedia_check

- name: Assert VirtualMedia ISO is mounted
when: not cifmw_bm_agent_reuse_vmedia | bool
ansible.builtin.assert:
that:
- _vmedia_check.json.Inserted | bool