Add metrics for RPC backend #100

Open
Chengxuan wants to merge 4 commits into
hyperledger-firefly:mainfrom
kaleido-io:rpc-metrics

@Chengxuan

Adding basic metrics for RPC backend.

Example dashboard:

image (3)

Signed-off-by: Chengxuan Xing <chengxuan.xing@kaleido.io>
Signed-off-by: Chengxuan Xing <chengxuan.xing@kaleido.io>
@Chengxuan Chengxuan requested a review from a team June 19, 2026 10:55

@EnriqueL8 EnriqueL8 left a comment

Looks good. One important question about batch vs. single-op-per-request metrics.

Comment thread pkg/rpcbackend/backend.go Outdated
var batchRes []*RPCResponse
log.L(ctx).Debugf("RPC batch[%d] -->", len(ops))
rpcStartTime := time.Now()
httpStart := time.Now()

Shouldn't this be rpcStartTime rather than httpStart?

That would also keep the naming consistent.

Comment thread pkg/rpcbackend/backend.go Outdated
SetBody(batchReqs).
SetResult(&batchRes).
Post("")
httpDuration := time.Since(httpStart)

ditto

Suggested change
- httpDuration := time.Since(httpStart)
+ requestDuration := time.Since(httpStart)

Comment thread pkg/rpcbackend/backend.go Outdated
errMsg := i18n.NewError(ctx, signermsgs.MsgRPCRequestFailed, err).Error()
log.L(ctx).Errorf("RPC batch[%d] <-- ERROR: %s", len(ops), errMsg)
for _, op := range ops {
recordRPCRequest(ctx, op.Method, statusTransportError, httpDuration)

Doesn't this change what the metric means? The duration recorded for an op inside a batch is not comparable to the duration of a single-op request, so mixing them will produce odd numbers, right?

@EnriqueL8 Yes, and it's deliberate that batch timing and single-request timing are tracked under the same metric. The metrics are grouped by JSON-RPC method and status.

If users mix batch and non-batch usage for the same query, the timing of that JSON-RPC method will be hard to track.

However, I don't think leaving batch requests out of the proposed metrics would be correct either. So I added a batch label in case there is a need to separate them, and also added an extra metric for batch size.

Signed-off-by: Chengxuan Xing <chengxuan.xing@kaleido.io>
…-metrics

Signed-off-by: Chengxuan Xing <chengxuan.xing@kaleido.io>
@Chengxuan Chengxuan requested a review from EnriqueL8 June 25, 2026 12:47