Partial TRS haircut#70
Open
rainest wants to merge 1 commit into
Open
Conversation
Replace the github.com/Cray-HPE/hms-trs-app-api/ package with a local copy that lacks the Kafka implementation. Removes serialization code that's not necessary for locally-issued HTTP requests. Removes workarounds for bugs in older Go versions. Removes unused logger cruft. Signed-off-by: Travis Raines <571832+rainest@users.noreply.github.com>
Contributor
Author
|
IDK what the heck golangci-lint's doing--it's complaining about but that's specifically a line I removed because of, yknow, the thing it's complaining about: https://github.com/OpenCHAMI/power-control/pull/70/changes#diff-dc71194ec358f605d87e6c182853969eacb9af7dc0a044247868f27c011d1bf9L174
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
tl;dr you probably don't need Kafka just to send a bunch of HTTP requests. If we hit scaling problems in the future, that should better inform further refactoring.
This moves TRS into an internal package and removes some stuff that looks like cruft, but doesn't try for exhaustive simplification just yet.
This will allow for further simplifying #69 into "just set
CGO_ENABLED=0, but I think having #69 around as an example if we encounter something else that happens to need CGO.This "adds" code, but it's actually removing a bunch of code that used to be external in an HPE repo: https://gist.github.com/rainest/a04de0575926844227fd3096b5800ad5
The task runner service tries to answer the question of whether you need Kafka to handle the issuance and tracking of HTTP requests, and its README posits a possible answer and some ballpark scale figures:
Bringing Kafka in requires CGO, because the original Confluent client needs it. Should you need Kafka, there are alternatives that don't need CGO (https://github.com/twmb/franz-go and the new official IBM/sarama#1179).
#69 does show how to handle CGO on ARM, but AFAIK nothing else in our service set needs CGO, and ideally nothing will.
That said, you probably don't need Kafka! As stated, the local variant's usually fine, since you probably simply don't have that many nodes, or could use alternative strategies for dealing with HTTP requests that don't involve an entire distributed event store! Refactoring to some combination of standard Go concurrency primitives, the existing usage of a simple retry library, multiple replicas, or any of the other myriad ways to handle distribution of HTTP requests in a way particular to the needs of an individual service is likely fine. OpenCHAMI doesn't use the one other service that (maybe) needed TRS, so the need to keep the library around is tenuous.
I don't have a good way to disentangle the architecture history that drove the design choices in TRS. Some (serializing HTTP requests into JSON that could fit into Kafka events) is pretty clearly unnecessary if you don't need to farm your HTTP requests out to Kafka, some appears to be trying to deal with connection management behavior in ancient versions of Golang (problem solved: we do not use ancient versions of Golang), and some (in test design) seems to be trying to chaos engineering writ small when BMCs or the cluster network misbehaves. There's maybe some value in the latter, but I'm iffy because I lack an actual fleet of IRL BMCs that I can casually try and make misbehave by chaotically sending mass power commands or status queries to a misbehaving cluster or a general purpose mock environment with problem scenarios informed by real-world experience.
Ultimately, there's further cruft here that could be simplified further, but my appetite for yak shaving is not infinite.