One issue I'm running into is that SQUAC has no heartbeat metric, i.e. a simple measure of SQUAC uptime or script success. For example, calculation of the SQUAC daily aggregates was broken Sep 3-5 and Sep 15-17, 2023 (I'm not sure about the exact datetime ranges), and the absence of heartbeat measurements for those dates could have pointed to the problem.
A hacky way to do this is to create a dummy channel (or a network of channels?) to which every script writes a single measurement with a value of 1. This is very end-to-end and maybe not the smartest way to do it. One wrinkle to think about: what happens when a bulk upload barfs after the dummy channel has already been written? Should there be one dummy channel for the start of each script and another for the end?
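To make the start/end idea concrete, here is a minimal sketch in Python. It is not SQUAC's actual API: `write_measurement` is a hypothetical callable standing in for whatever each script uses to write a measurement, and the `<script>.start` / `<script>.end` metric names are made up for illustration. The start heartbeat is written on entry and the end heartbeat only if the script body finishes without raising, so a failed bulk upload leaves a start with no matching end. `missing_days` shows how gaps in the heartbeat could then be detected.

```python
from contextlib import contextmanager
from datetime import datetime, timedelta, timezone


@contextmanager
def heartbeat(write_measurement, script_name):
    """Write a start heartbeat on entry and an end heartbeat only on clean exit.

    write_measurement is a hypothetical callable (metric_name, value, timestamp);
    it is an assumption, not part of any real SQUAC client.
    """
    write_measurement(f"{script_name}.start", 1, datetime.now(timezone.utc))
    yield
    # Reached only if the wrapped block raised no exception.
    write_measurement(f"{script_name}.end", 1, datetime.now(timezone.utc))


def missing_days(heartbeat_days, first, last):
    """Return every date from first to last (inclusive) that has no heartbeat."""
    seen = set(heartbeat_days)
    n_days = (last - first).days + 1
    candidates = (first + timedelta(days=i) for i in range(n_days))
    return [day for day in candidates if day not in seen]
```

A script would wrap its work in `with heartbeat(write_measurement, "daily_aggregates"): ...`, and a monitoring job could call `missing_days` on the dates that have an end heartbeat to flag days where the script never completed.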