Artemis guest request not cancelled on SIGTERM, cleanup crashes with MetadataError

## Summary

When tmt receives SIGTERM during Artemis provisioning (while polling `get_new_state`), the Artemis guest request is never cancelled via the API, resulting in an orphaned cloud resource. Additionally, the cleanup step crashes with a `MetadataError` instead of handling the situation gracefully.

Observed in: https://artifacts.osci.redhat.com/testing-farm/9821a7f6-f206-462a-a512-ab0efea3f12c/work-whqlfskl_d68/log.txt
tmt version: 1.71.0

## What happened

1. Artemis guest `c7bbbc9f-0b8a-4d3b-ba11-234d7bcda709` was requested at 12:32:52
2. `get_new_state` polling started (86400s timeout, 3s tick) — guest never reached `ready` state
3. SIGTERM received at 14:45:10 (~2h12m into polling)
4. tmt caught the interrupt, suspended steps, ran report, then attempted cleanup
5. Cleanup crashed with: `No guests queued for phase "default-0". A typo in "where" key?`
6. Artemis guest request was **never cancelled** — resource leak

## Root cause

There are two issues:

### 1. `GuestArtemis._create()` does not cancel the request on interrupt

In [`tmt/steps/provision/artemis.py`](https://github.com/teemtee/tmt/blob/main/tmt/steps/provision/artemis.py), the `_create()` method calls `self.remove()` only when `WaitingTimedOutError` is raised (lines 636-639). When `Interrupted` is raised (due to SIGTERM), the exception propagates without cancelling the Artemis request:

```python
try:
    guest_info = Waiting(
        Deadline.from_seconds(self.provision_timeout), tick=self.provision_tick
    ).wait(get_new_state, self._logger)

except tmt.utils.wait.WaitingTimedOutError as error:
    self.remove()  # ← only on timeout
    raise ArtemisProvisionError(...) from error

# No handler for Interrupted → guest request leaks
```
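
A possible shape for the fix, as a self-contained sketch rather than a patch against tmt. All classes below (`Interrupted`, `WaitingTimedOutError`, `ProvisionError`, `FakeGuest`) are hypothetical stand-ins that only mimic the control flow shown above; the real exception types and their import paths may differ. The idea is that the request is removed on interrupt as well as on timeout, and the interrupt is then re-raised unchanged:

```python
class WaitingTimedOutError(Exception):
    """Stand-in for tmt.utils.wait.WaitingTimedOutError."""


class Interrupted(Exception):
    """Stand-in for the exception raised on SIGTERM (real location is an assumption)."""


class ProvisionError(Exception):
    """Stand-in for ArtemisProvisionError."""


class FakeGuest:
    """Hypothetical guest that records whether its request was cancelled."""

    def __init__(self):
        self.removed = False

    def remove(self):
        # In tmt this is where DELETE /guests/{guestname} would be issued.
        self.removed = True

    def create(self, wait_for_ready):
        try:
            return wait_for_ready()
        except WaitingTimedOutError as error:
            self.remove()
            raise ProvisionError("guest did not become ready in time") from error
        except Interrupted:
            # Cancel the pending request, then let the interrupt propagate.
            self.remove()
            raise


def interrupted_wait():
    # Simulates SIGTERM arriving while polling for the guest state.
    raise Interrupted("SIGTERM received while polling")


guest = FakeGuest()
try:
    guest.create(interrupted_wait)
except Interrupted:
    pass
print(guest.removed)  # True
```

A `try`/`finally` keyed on "guest not ready" would also cover other unexpected exceptions, at the cost of being less explicit about which cases trigger cancellation.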

### 2. Cleanup step crashes when guests exist but none are ready

In [`tmt/steps/cleanup/__init__.py`](https://github.com/teemtee/tmt/blob/main/tmt/steps/cleanup/__init__.py), `Cleanup.go()` has an inconsistency:

- Line 135: checks `self.plan.provision.guests` (includes all guests, even not-ready ones) — this is **non-empty** (the `GuestArtemis` object exists because `self._guest` is set in `ProvisionArtemis.go()` before `start()` is called)
- Line 150: uses `self._steppified_guests` which filters through `ready_guests` → `guest.is_ready` → checks `self.primary_address is not None` — this is **empty** because provisioning never completed
- Line 147: `enqueue_plugin(guests=[])` raises `MetadataError`

This means `CleanupInternal.go()` — which calls `guest.stop()` and `guest.remove()` (the `DELETE /guests/{guestname}` API call) — is **never executed**.
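
The mismatch between the two guest lists can be illustrated with a small standalone sketch. `FakeGuest` and `cleanup_all` are hypothetical names, not tmt code; only the `is_ready` check mirrors the behaviour described above. The sketch shows one of the possible remedies, namely selecting every known guest for removal instead of only the ready ones:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeGuest:
    """Hypothetical guest; only the readiness check mirrors the report."""

    name: str
    primary_address: Optional[str] = None
    removed: bool = False

    @property
    def is_ready(self) -> bool:
        return self.primary_address is not None

    def remove(self) -> None:
        self.removed = True


def cleanup_all(guests: List[FakeGuest]) -> List[str]:
    """Remove every known guest, ready or not, and return the removed names."""
    removed = []
    for guest in guests:
        guest.remove()
        removed.append(guest.name)
    return removed


# A guest whose provisioning was interrupted: it exists but has no address.
guests = [FakeGuest(name="default-0")]

ready = [guest for guest in guests if guest.is_ready]
print(len(guests), len(ready))  # 1 0
print(cleanup_all(guests))      # ['default-0']
```

Filtering on `is_ready` yields an empty list here, which is the input that triggers the `MetadataError`; iterating over all guests still reaches `remove()`.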

## Expected behavior

1. When interrupted during `_create()` polling, tmt should cancel the Artemis guest request via `DELETE /guests/{guestname}` before propagating the exception
2. The cleanup step should handle the case where guests exist but are not ready, either by skipping the enqueue or by including not-ready guests for cleanup purposes (they still need `remove()` called)

## Impact

When tmt is interrupted during provisioning, the orphaned Artemis guest request keeps its cloud resources (e.g., AWS bare-metal instances) allocated, because nothing ever cancels it.

---
Assisted-by: Claude Code

