fix: improve error handling for user-/meta-/vendor-data handlers#93

Merged: alexlovelltroy merged 5 commits into main from synackd/fix/nonexistent-node-impersonation on Oct 2, 2025
@synackd synackd commented Oct 2, 2025

Closes #67

Changes

Refactor the error handling in the data handlers. Specifically:

SMDClient:

  • move error types to separate errors.go
  • add ErrEmptyID error type to distinguish errors that occur because the passed ID is empty
  • add ErrSMDResponse error type (returned by getSMD()) to capture SMD HTTP return codes
    • allows requester to act based on response from SMD

Handlers:

  • clarify error messages and add some debug messages
  • use new ErrSMDResponse error to determine cloud-init's response to different SMD response codes

Testing

Node Booting

When booting a node with debugging enabled, you should see messages akin to:

DBG no id specified in request, attempting to identify based on requesting IP
DBG requesting IP is: 172.16.0.2
DBG xname x3000c0s0b1n0 with ip 172.16.0.2 found
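
The fallback shown in these messages (no ID in the request, so the node is identified by the requesting IP) could look roughly like the sketch below. This is an illustration only, not the code from this PR: `lookupByIP` and the in-memory `ipToXname` table are hypothetical stand-ins for the real SMD query.

```go
package main

import (
	"fmt"
	"net"
)

// lookupByIP is a hypothetical helper: it maps the address a request came
// from to a node xname. The map stands in for a real query against SMD.
func lookupByIP(ipToXname map[string]string, remoteAddr string) (string, error) {
	// RemoteAddr is normally "host:port"; fall back to the raw value when
	// there is no port to split off.
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr
	}
	if net.ParseIP(host) == nil {
		return "", fmt.Errorf("invalid requesting IP %q", host)
	}
	xname, ok := ipToXname[host]
	if !ok {
		return "", fmt.Errorf("no node found with ip %s", host)
	}
	return xname, nil
}

func main() {
	table := map[string]string{"172.16.0.2": "x3000c0s0b1n0"}
	xname, err := lookupByIP(table, "172.16.0.2:51234")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Printf("xname %s with ip 172.16.0.2 found\n", xname)
}
```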

Invalid xnames

vendor-data:

Try to get vendor-data for invalid xname:

$ ochami -l debug -L basic cloud-init node get vendor-data f
...
DEBUG | client.go:245 > GET: https://demo.openchami.cluster:8443/cloud-init/admin/impersonation/f/vendor-data
DEBUG | client.go:266 > Request headers:
DEBUG | client.go:268 >   User-Agent: [ochami/0.5.4]
DEBUG | client.go:268 >   Authorization: [Bearer ey...]
DEBUG | client.go:277 > No body in request
DEBUG | client.go:288 > Response status: 200 OK
DEBUG | client.go:290 > Response headers:
DEBUG | client.go:292 >   Date: [Thu, 02 Oct 2025 17:37:05 GMT]
DEBUG | client.go:292 >   Content-Length: [9]
DEBUG | client.go:292 >   Content-Type: [text/plain; charset=utf-8]
DEBUG | client.go:305 > Response body:
DEBUG | client.go:306 > #include

INFO  | http.go:124 > Response status: HTTP/2.0 200 OK
#include

cloud-init logs (200 returned, notice 400 returned by SMD):

5:37PM ERR SMD GET request went through, but returned unsuccessful HTTP response (HTTP 400)
5:37PM WRN node f is an invalid xname in SMD, include list will be empty
5:37PM INF Request bytes_in=0 bytes_out=9 duration=1.947641 method=GET remote_addr=172.16.0.254 request_id=cloud-init/8uzzUL64Jy-000014 request_uri=/admin/impersonation/f/vendor-data status=OK status_code=200 user_agent=ochami/0.5.4
2025/10/02 17:37:05 [cloud-init/8uzzUL64Jy-000014] "GET http://demo.openchami.cluster:8443/admin/impersonation/f/vendor-data HTTP/1.1" from 172.16.0.254 - 200 9B in 1.994452ms

SMD logs (400 returned by SMD):

2025/10/02 16:45:58 [smd/5BDDCpVN3U-000008] "GET http://smd:27779/hsm/v2/memberships/f HTTP/1.1" from 10.89.3.156:51768 - 400 83B in 191.543µs
10.89.3.156 - - [02/Oct/2025:16:46:02 +0000] "GET /hsm/v2/Inventory/EthernetInterfaces/ HTTP/1.1" 200 2741 "" "Go-http-client/1.1"
4:46PM INF Request bytes_in=0 bytes_out=2741 duration=0.551389 method=GET remote_addr=10.89.3.156:51776 request_id=smd/5BDDCpVN3U-000009 request_uri=/hsm/v2/Inventory/EthernetInterfaces/ status=OK status_code=200 user_agent=Go-http-client/1.1

meta-data:

Try to get meta-data for invalid xname:

$ ochami -l debug -L basic cloud-init node get meta-data f
...
DEBUG | client.go:245 > GET: https://demo.openchami.cluster:8443/cloud-init/admin/impersonation/f/meta-data
DEBUG | client.go:266 > Request headers:
DEBUG | client.go:268 >   User-Agent: [ochami/0.5.4]
DEBUG | client.go:268 >   Authorization: [Bearer ey...]
DEBUG | client.go:277 > No body in request
DEBUG | client.go:288 > Response status: 404 Not Found
DEBUG | client.go:290 > Response headers:
DEBUG | client.go:292 >   Content-Type: [text/plain; charset=utf-8]
DEBUG | client.go:292 >   X-Content-Type-Options: [nosniff]
DEBUG | client.go:292 >   Date: [Thu, 02 Oct 2025 17:49:22 GMT]
DEBUG | client.go:292 >   Content-Length: [24]
DEBUG | client.go:305 > Response body:
DEBUG | client.go:306 > node f not found in SMD

DEBUG | ci.go:202 > failed to get node data error="unsuccessful HTTP status: HTTP/2.0 404 Not Found: node f not found in SMD\n"
ERROR | cloud_init-node-get.go:153 > cloud-init node meta-data request yielded unsuccessful HTTP response error="GetNodeData(): failed to GET node data from cloud-init: unsuccessful HTTP status: HTTP/2.0 404 Not Found: node f not found in SMD\n"

cloud-init logs (404 returned, notice 404 returned by SMD):

5:49PM DBG Getting metadata for id: f
5:49PM ERR SMD GET request went through, but returned unsuccessful HTTP response (HTTP 404)
2025/10/02 17:49:22 [cloud-init/8uzzUL64Jy-000015] "GET http://demo.openchami.cluster:8443/admin/impersonation/f/meta-data HTTP/1.1" from 172.16.0.254 - 404 24B in 1.939801ms
5:49PM INF Request bytes_in=0 bytes_out=24 duration=1.89482 method=GET remote_addr=172.16.0.254 request_id=cloud-init/8uzzUL64Jy-000015 request_uri=/admin/impersonation/f/meta-data status="Not Found" status_code=404 user_agent=ochami/0.5.4

SMD's State/Components endpoint does not check the xname format; it simply searches for the passed ID and returns 404 when it can't find it:

10.89.3.165 - - [02/Oct/2025:17:49:22 +0000] "GET /hsm/v2/State/Components/f HTTP/1.1" 404 82 "" "Go-http-client/1.1"
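
For comparison, a format check of the kind SMD skips on this endpoint might look like the sketch below. The regular expression is a simplified assumption covering only node xnames of the form x<cabinet>c<chassis>s<slot>b<bmc>n<node>. It is not taken from SMD or from this PR, and `looksLikeNodeXname` is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
)

// nodeXnameRe is a simplified, assumed pattern for node xnames such as
// x1000c0s0b0n0. Real xname validation may accept more forms than this.
var nodeXnameRe = regexp.MustCompile(`^x\d+c\d+s\d+b\d+n\d+$`)

// looksLikeNodeXname reports whether id matches the simplified pattern.
func looksLikeNodeXname(id string) bool {
	return nodeXnameRe.MatchString(id)
}

func main() {
	for _, id := range []string{"f", "x1000c0s0b0n0", "x3000c0s0b1n0"} {
		fmt.Printf("%-15s node xname? %v\n", id, looksLikeNodeXname(id))
	}
}
```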

user-data:

Try to get user-data for invalid xname:

$ ochami -l debug -L basic cloud-init node get user-data f
...
DEBUG | client.go:245 > GET: https://demo.openchami.cluster:8443/cloud-init/admin/impersonation/f/user-data
DEBUG | client.go:266 > Request headers:
DEBUG | client.go:268 >   User-Agent: [ochami/0.5.4]
DEBUG | client.go:268 >   Authorization: [Bearer ey...]
DEBUG | client.go:277 > No body in request
DEBUG | client.go:288 > Response status: 200 OK
DEBUG | client.go:290 > Response headers:
DEBUG | client.go:292 >   Date: [Thu, 02 Oct 2025 17:55:45 GMT]
DEBUG | client.go:292 >   Content-Length: [13]
DEBUG | client.go:292 >   Content-Type: [text/plain; charset=utf-8]
DEBUG | client.go:305 > Response body:
DEBUG | client.go:306 > #cloud-config
INFO  | http.go:124 > Response status: HTTP/2.0 200 OK
#cloud-config

cloud-init unconditionally returns #cloud-config (200) for user-data since it is not used:

2025/10/02 17:55:45 [cloud-init/8uzzUL64Jy-000016] "GET http://demo.openchami.cluster:8443/admin/impersonation/f/user-data HTTP/1.1" from 172.16.0.254 - 200 13B in 145.513µs
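
A handler with this behavior can be sketched as follows. This is a minimal illustration assuming a plain `net/http` handler; `userDataHandler` and `fetchUserData` are hypothetical names, not the functions in this repository. The 13-byte body matches the Content-Length seen in the response above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// userDataHandler is a hypothetical handler that ignores the node ID and
// always answers 200 with a bare cloud-config header.
func userDataHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	w.WriteHeader(http.StatusOK)
	fmt.Fprint(w, "#cloud-config")
}

// fetchUserData runs the handler in-process for the given ID and returns
// the status code and body it produced.
func fetchUserData(id string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/admin/impersonation/"+id+"/user-data", nil)
	rec := httptest.NewRecorder()
	userDataHandler(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	// An invalid xname and an unknown xname get the same answer.
	for _, id := range []string{"f", "x1000c0s0b0n0"} {
		code, body := fetchUserData(id)
		fmt.Printf("%s: %d %q\n", id, code, body)
	}
}
```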

Unknown (but valid) xname

vendor-data:

$ ochami -l debug -L basic cloud-init node get vendor-data x1000c0s0b0n0
...
DEBUG | client.go:245 > GET: https://demo.openchami.cluster:8443/cloud-init/admin/impersonation/x1000c0s0b0n0/vendor-data
DEBUG | client.go:266 > Request headers:
DEBUG | client.go:268 >   User-Agent: [ochami/0.5.4]
DEBUG | client.go:268 >   Authorization: [Bearer ey...]
DEBUG | client.go:277 > No body in request
DEBUG | client.go:288 > Response status: 200 OK
DEBUG | client.go:290 > Response headers:
DEBUG | client.go:292 >   Content-Length: [9]
DEBUG | client.go:292 >   Content-Type: [text/plain; charset=utf-8]
DEBUG | client.go:292 >   Date: [Thu, 02 Oct 2025 18:06:20 GMT]
DEBUG | client.go:305 > Response body:
DEBUG | client.go:306 > #include

INFO  | http.go:124 > Response status: HTTP/2.0 200 OK
#include

cloud-init should return 200 (include list is empty) and SMD should return 404.

cloud-init logs:

6:06PM ERR SMD GET request went through, but returned unsuccessful HTTP response (HTTP 404)
6:06PM WRN node x1000c0s0b0n0 not found in SMD, include list will be empty
2025/10/02 18:06:20 [cloud-init/8uzzUL64Jy-000017] "GET http://demo.openchami.cluster:8443/admin/impersonation/x1000c0s0b0n0/vendor-data HTTP/1.1" from 172.16.0.254 - 200 9B in 3.362563ms
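
The vendor-data behavior observed so far (SMD answers 400 or 404, cloud-init still answers 200 with an empty include list) can be summarized in a small sketch. Everything here is illustrative: `buildInclude` and `vendorDataFor` are hypothetical names, the include URL layout is an assumption, and the 500 returned for other SMD failures is a guess rather than something this PR states.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// buildInclude renders a "#include" document with one URL per group.
// The URL layout used here is an assumption made for this sketch.
func buildInclude(baseURL string, groups []string) string {
	var b strings.Builder
	b.WriteString("#include\n")
	for _, g := range groups {
		fmt.Fprintf(&b, "%s/%s.yaml\n", baseURL, g)
	}
	return b.String()
}

// vendorDataFor maps the status SMD returned for the membership lookup to
// the status and body a vendor-data handler might send.
func vendorDataFor(baseURL string, smdStatus int, groups []string) (int, string) {
	switch {
	case smdStatus == http.StatusBadRequest, smdStatus == http.StatusNotFound:
		// Invalid or unknown xname: warn and serve an empty include list.
		return http.StatusOK, buildInclude(baseURL, nil)
	case smdStatus >= 400:
		// Assumption: any other SMD failure is surfaced as a server error.
		return http.StatusInternalServerError, ""
	}
	return http.StatusOK, buildInclude(baseURL, groups)
}

func main() {
	for _, smdStatus := range []int{200, 400, 404} {
		code, body := vendorDataFor("http://cloud-init.example", smdStatus, []string{"compute"})
		fmt.Printf("SMD %d -> cloud-init %d, %d bytes\n", smdStatus, code, len(body))
	}
}
```

With no groups, the body is just the 9-byte `#include` line, which matches the Content-Length in the responses above.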

meta-data:

$ ochami -l debug -L basic cloud-init node get meta-data x1000c0s0b0n0
...
DEBUG | client.go:245 > GET: https://demo.openchami.cluster:8443/cloud-init/admin/impersonation/x1000c0s0b0n0/meta-data
DEBUG | client.go:266 > Request headers:
DEBUG | client.go:268 >   User-Agent: [ochami/0.5.4]
DEBUG | client.go:268 >   Authorization: [Bearer ey...]
DEBUG | client.go:277 > No body in request
DEBUG | client.go:288 > Response status: 404 Not Found
DEBUG | client.go:290 > Response headers:
DEBUG | client.go:292 >   Content-Type: [text/plain; charset=utf-8]
DEBUG | client.go:292 >   X-Content-Type-Options: [nosniff]
DEBUG | client.go:292 >   Date: [Thu, 02 Oct 2025 18:09:08 GMT]
DEBUG | client.go:292 >   Content-Length: [36]
DEBUG | client.go:305 > Response body:
DEBUG | client.go:306 > node x1000c0s0b0n0 not found in SMD

DEBUG | ci.go:202 > failed to get node data error="unsuccessful HTTP status: HTTP/2.0 404 Not Found: node x1000c0s0b0n0 not found in SMD\n"
ERROR | cloud_init-node-get.go:153 > cloud-init node meta-data request yielded unsuccessful HTTP response error="GetNodeData(): failed to GET node data from cloud-init: unsuccessful HTTP status: HTTP/2.0 404 Not Found: node x1000c0s0b0n0 not found in SMD\n"

Both SMD and cloud-init return 404.

cloud-init logs:

6:09PM DBG Getting metadata for id: x1000c0s0b0n0
6:09PM ERR SMD GET request went through, but returned unsuccessful HTTP response (HTTP 404)
2025/10/02 18:09:08 [cloud-init/8uzzUL64Jy-000018] "GET http://demo.openchami.cluster:8443/admin/impersonation/x1000c0s0b0n0/meta-data HTTP/1.1" from 172.16.0.254 - 404 36B in 1.531195ms

user-data:

$ ochami -l debug -L basic cloud-init node get user-data x1000c0s0b0n0
...
DEBUG | client.go:245 > GET: https://demo.openchami.cluster:8443/cloud-init/admin/impersonation/x1000c0s0b0n0/user-data
DEBUG | client.go:266 > Request headers:
DEBUG | client.go:268 >   Authorization: [Bearer ey...]
DEBUG | client.go:268 >   User-Agent: [ochami/0.5.4]
DEBUG | client.go:277 > No body in request
DEBUG | client.go:288 > Response status: 200 OK
DEBUG | client.go:290 > Response headers:
DEBUG | client.go:292 >   Date: [Thu, 02 Oct 2025 18:11:44 GMT]
DEBUG | client.go:292 >   Content-Length: [13]
DEBUG | client.go:292 >   Content-Type: [text/plain; charset=utf-8]
DEBUG | client.go:305 > Response body:
DEBUG | client.go:306 > #cloud-config
INFO  | http.go:124 > Response status: HTTP/2.0 200 OK
#cloud-config

user-data always returns 200.

Groups

Try to get group data for any of the following:

  • existing node but nonexistent group
  • nonexistent node but existing group
  • nonexistent node and nonexistent group
  • existing node that is not a member of the group

The message should be the same in every case:

$ ochami -l debug -L basic cloud-init node get group x1000c0s0b0n0 nonexistent
...
DEBUG | client.go:245 > GET: https://demo.openchami.cluster:8443/cloud-init/admin/impersonation/x1000c0s0b0n0/nonexistent.yaml
DEBUG | client.go:266 > Request headers:
DEBUG | client.go:268 >   User-Agent: [ochami/0.5.4]
DEBUG | client.go:268 >   Authorization: [Bearer ey...]
DEBUG | client.go:277 > No body in request
DEBUG | client.go:288 > Response status: 404 Not Found
DEBUG | client.go:290 > Response headers:
DEBUG | client.go:292 >   Content-Type: [text/plain; charset=utf-8]
DEBUG | client.go:292 >   X-Content-Type-Options: [nosniff]
DEBUG | client.go:292 >   Date: [Thu, 02 Oct 2025 18:18:05 GMT]
DEBUG | client.go:292 >   Content-Length: [97]
DEBUG | client.go:305 > Response body:
DEBUG | client.go:306 > node x1000c0s0b0n0 is not a member of group nonexistent (node and/or group may not exist in SMD)

DEBUG | ci.go:248 > failed to get node group data error="unsuccessful HTTP status: HTTP/2.0 404 Not Found: node x1000c0s0b0n0 is not a member of group nonexistent (node and/or group may not exist in SMD)\n"
ERROR | cloud_init-node-get.go:74 > cloud-init node group request yielded unsuccessful HTTP response error="GetNodeGroupData(): failed to GET node group data from cloud-init: unsuccessful HTTP status: HTTP/2.0 404 Not Found: node x1000c0s0b0n0 is not a member of group nonexistent (node and/or group may not exist in SMD)\n"
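
One way to get an identical message for all four cases is to treat "node unknown", "group unknown", and "not a member" as the same lookup miss. The sketch below assumes a simple node-to-groups map in place of SMD. `checkMembership` and `notMemberMessage` are hypothetical names; only the message text comes from the output above.

```go
package main

import "fmt"

// notMemberMessage builds the single message used for every failure case.
func notMemberMessage(node, group string) string {
	return fmt.Sprintf("node %s is not a member of group %s (node and/or group may not exist in SMD)", node, group)
}

// checkMembership reports whether node belongs to group. The map is a
// stand-in for SMD; a missing node and a missing group both look like a
// plain miss, so the caller cannot tell the cases apart.
func checkMembership(memberships map[string][]string, node, group string) (string, bool) {
	for _, g := range memberships[node] {
		if g == group {
			return "", true
		}
	}
	return notMemberMessage(node, group), false
}

func main() {
	memberships := map[string][]string{"x3000c0s0b1n0": {"compute"}}
	cases := [][2]string{
		{"x3000c0s0b1n0", "nonexistent"}, // existing node, nonexistent group
		{"x1000c0s0b0n0", "compute"},     // nonexistent node, existing group
		{"x1000c0s0b0n0", "nonexistent"}, // neither exists
		{"x3000c0s0b1n0", "compute"},     // valid membership
	}
	for _, c := range cases {
		msg, ok := checkMembership(memberships, c[0], c[1])
		fmt.Printf("member=%v %s\n", ok, msg)
	}
}
```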

Signed-off-by: Devon Bautista <17506592+synackd@users.noreply.github.com>
ErrEmptyID is returned if an ID (xname) passed is empty. Functions can
check for this error to warn the caller not to pass an empty ID.

ErrSMDResponse is an error that wraps an *http.Response, specifically
one that is returned from calls to SMD (e.g. via getSMD()). This is so
callers of getSMD() can distinguish between control flow errors and HTTP
errors (response >= 400) from SMD.

This commit also separates error definitions for the smdclient package
into their own errors.go file.

Signed-off-by: Devon Bautista <17506592+synackd@users.noreply.github.com>
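
As a rough illustration of the two error kinds described in this commit message, the sketch below defines a sentinel for empty IDs and an error that wraps an `*http.Response`. The names `ErrEmptyID` and `ErrSMDResponse` come from the PR, but the definitions, the `HTTPResponse` field, and the helpers `checkSMD`, `smdStatus`, and `fakeResponse` are assumptions made for this example and may differ from the real errors.go.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrEmptyID signals that the caller passed an empty ID (assumed here to
// be a sentinel value; the real definition may be a distinct type).
var ErrEmptyID = errors.New("empty ID")

// ErrSMDResponse wraps an unsuccessful HTTP response from SMD. The field
// name is an assumption made for this sketch.
type ErrSMDResponse struct {
	HTTPResponse *http.Response
}

func (e ErrSMDResponse) Error() string {
	return fmt.Sprintf("SMD returned unsuccessful HTTP response (HTTP %d)", e.HTTPResponse.StatusCode)
}

// checkSMD is a hypothetical helper that classifies the result of an SMD call.
func checkSMD(id string, resp *http.Response) error {
	if id == "" {
		return ErrEmptyID
	}
	if resp.StatusCode >= 400 {
		return ErrSMDResponse{HTTPResponse: resp}
	}
	return nil
}

// smdStatus reports the SMD status code carried by err, if there is one.
func smdStatus(err error) (int, bool) {
	var smdErr ErrSMDResponse
	if errors.As(err, &smdErr) {
		return smdErr.HTTPResponse.StatusCode, true
	}
	return 0, false
}

// fakeResponse builds a bare response with the given status for the demo.
func fakeResponse(code int) *http.Response {
	return &http.Response{StatusCode: code, Status: fmt.Sprintf("%d %s", code, http.StatusText(code))}
}

func main() {
	for _, err := range []error{
		checkSMD("", fakeResponse(200)),
		checkSMD("f", fakeResponse(400)),
		checkSMD("x3000c0s0b1n0", fakeResponse(200)),
	} {
		code, isHTTP := smdStatus(err)
		fmt.Printf("err=%v emptyID=%v smdHTTP=%v code=%d\n", err, errors.Is(err, ErrEmptyID), isHTTP, code)
	}
}
```

Callers can then use `errors.Is` for the empty-ID case and `errors.As` to read the SMD status code, which keeps control flow errors separate from HTTP errors.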

synackd commented Oct 2, 2025

Ugh, forgot to lint before pushing... Will push changes shortly.

In MetaDataHandler(), choose the HTTP response based on whether the
errors returned by smd.ComponentInformation() and smd.GroupMembership()
are control flow errors or unsuccessful HTTP return codes (>= 400) from
SMD. With the addition of ErrSMDResponse, the handler can check which
kind of error occurred and act accordingly.

Signed-off-by: Devon Bautista <17506592+synackd@users.noreply.github.com>
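
The decision described in this commit message might be sketched as below. This is not the actual MetaDataHandler() code: `smdHTTPError` is a simplified stand-in for ErrSMDResponse, `metaDataStatus`, `smdFailure`, and `controlFlowFailure` are hypothetical names, and mapping everything other than an SMD 404 to a 500 is an assumption. Only the 404 case and its message are taken from the test output in this PR.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// smdHTTPError is a simplified stand-in for ErrSMDResponse that records
// only the status code SMD returned.
type smdHTTPError struct{ StatusCode int }

func (e smdHTTPError) Error() string {
	return fmt.Sprintf("SMD returned HTTP %d", e.StatusCode)
}

// metaDataStatus picks the status and message a meta-data handler might
// send for the error returned by an SMD lookup.
func metaDataStatus(id string, err error) (int, string) {
	var httpErr smdHTTPError
	switch {
	case err == nil:
		return http.StatusOK, ""
	case errors.As(err, &httpErr) && httpErr.StatusCode == http.StatusNotFound:
		// SMD answered, but does not know the node.
		return http.StatusNotFound, fmt.Sprintf("node %s not found in SMD", id)
	case errors.As(err, &httpErr):
		// Assumption: other unsuccessful SMD responses become a 500.
		return http.StatusInternalServerError, fmt.Sprintf("SMD error for node %s", id)
	default:
		// Control flow error: the request to SMD itself failed.
		return http.StatusInternalServerError, "internal error"
	}
}

// smdFailure and controlFlowFailure build example errors for the demo.
func smdFailure(code int) error {
	return fmt.Errorf("component lookup: %w", smdHTTPError{StatusCode: code})
}

func controlFlowFailure() error {
	return errors.New("component lookup: connection refused")
}

func main() {
	for _, err := range []error{nil, smdFailure(404), smdFailure(503), controlFlowFailure()} {
		code, msg := metaDataStatus("f", err)
		fmt.Printf("%d %q\n", code, msg)
	}
}
```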
When asking for the user data of any of:

- a non-existent group
- a non-existent node
- a node that is not in a group

the previous error message simply said "Group not found", which can be
misleading. This commit clarifies that the error could be any of the
above possibilities.

Signed-off-by: Devon Bautista <17506592+synackd@users.noreply.github.com>
Signed-off-by: Devon Bautista <17506592+synackd@users.noreply.github.com>
@synackd synackd force-pushed the synackd/fix/nonexistent-node-impersonation branch from f765488 to 1ca0633 Compare October 2, 2025 18:39
@alexlovelltroy alexlovelltroy merged commit d25ac41 into main Oct 2, 2025
5 checks passed
@synackd synackd deleted the synackd/fix/nonexistent-node-impersonation branch October 2, 2025 18:53
Development

Successfully merging this pull request may close these issues.

[BUG] Impersonating non-existent node does not err

2 participants