We are using `prom-client` with NestJS (together with `@willsoto/nestjs-prometheus`). We have several custom metrics, in particular histograms. As a result, the overall metrics output grows to as much as 8 MB, and the Node.js event loop lag reaches as much as 1 second!
What we see:
- Reducing the amount of data collected, e.g. by removing some metrics, improves the situation (however we would like to keep the metrics we have)
- Explicitly invoking `register.resetMetrics()` causes the lag to go down almost to 0. It then starts increasing again
- CPU usage is constant at around 15%
- The CPU profile shows a long time spent in Node.js internal network functions, such as `getPeerCertificate` and `destroySSL`
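
For anyone who wants to reproduce the numbers, here is a minimal sketch of how the payload size and event loop lag around a single metrics collection could be measured. It uses only Node's built-in `perf_hooks`. The helper name `measureScrape` and its `collect` parameter are hypothetical; in our setup `collect` would be something like `() => register.metrics()`, assuming a prom-client version where that call returns a promise.

```typescript
import { monitorEventLoopDelay } from "node:perf_hooks";

// Hypothetical helper (not part of prom-client): measures one metrics collection.
// `collect` is assumed to resolve to the exposition text, e.g. () => register.metrics().
async function measureScrape(
  collect: () => Promise<string>,
): Promise<{ bytes: number; collectMs: number; maxLagMs: number }> {
  // Samples event loop delay roughly every 10 ms while enabled.
  const delay = monitorEventLoopDelay({ resolution: 10 });
  delay.enable();

  const start = process.hrtime.bigint();
  const body = await collect();
  const collectMs = Number(process.hrtime.bigint() - start) / 1e6;

  // Let the loop run briefly so delay caused by the collection gets sampled.
  await new Promise<void>((resolve) => setTimeout(resolve, 50));
  delay.disable();

  return {
    bytes: Buffer.byteLength(body, "utf8"), // size of the metrics payload
    collectMs,                              // wall time spent in collect()
    maxLagMs: delay.max / 1e6,              // histogram values are in nanoseconds
  };
}
```

Calling this periodically, and once right after `register.resetMetrics()`, should show whether the lag tracks the payload size or the time spent serializing it.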
Has anyone encountered this sort of behavior? Any suggestions?