Ignore invalid utf8 characters while decoding by devbugging · Pull Request #141 · onflow/flow-archive

devbugging · 2023-08-09T13:32:53Z

The GCP streamer failed to decode block data that contained invalid UTF-8 characters. The explanation of why that data was included is below (copied from Slack):

I’ve looked into this, and the issue seems to be caused by the transaction with ID f6c8e65646a3b140902aa7559ae2e740bbe92fbef65f414a441c141340a5756f, more specifically by the last argument of the transaction. Looked at closely, the whitespace before the address is not actual whitespace but an invalid UTF-8 character; on further investigation I found it is a BOM character: https://en.wikipedia.org/wiki/Byte_order_mark
The problem is then that CBOR encoding/decoding assumes UTF-8 validity, which breaks in this case. Because this is the first time (to my limited knowledge) that tx args are CBOR encoded/decoded, and since tx args are provided by user input, a user can make the RN node fail with the current setting. Setting the CBOR decoding flag that allows invalid UTF-8 characters would fix this, and, understanding the issue, I feel it would be an OK fix. So it is not a bug in the uploader producing malformed data; it is invalid input from the user.

This change allows such characters during decoding and thus avoids the failure.
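
To illustrate the failure mode described above, here is a minimal sketch using only the Go standard library. It is not the diff in this PR and does not use the project's CBOR library: `decodeShortText` and its `allowInvalidUTF8` parameter are hypothetical names, and the stray `0xff` byte is an assumed stand-in for the offending bytes in the transaction argument. It decodes a short CBOR text string (major type 3) and shows how a strict decoder rejects the payload while a lenient one passes it through.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// decodeShortText is a hypothetical helper, for illustration only.
// It decodes a CBOR text string (major type 3) whose length (0..23)
// is stored directly in the initial byte. When allowInvalidUTF8 is
// false it rejects payloads that are not valid UTF-8, which mirrors
// the strict behaviour that made the streamer fail.
func decodeShortText(data []byte, allowInvalidUTF8 bool) (string, error) {
	if len(data) == 0 {
		return "", errors.New("empty input")
	}
	major, n := data[0]>>5, int(data[0]&0x1f)
	if major != 3 || n > 23 {
		return "", errors.New("not a short CBOR text string")
	}
	if len(data)-1 < n {
		return "", errors.New("truncated payload")
	}
	payload := data[1 : 1+n]
	if !allowInvalidUTF8 && !utf8.Valid(payload) {
		return "", errors.New("invalid UTF-8 in text string")
	}
	// In lenient mode the bytes are returned unchanged, valid or not.
	return string(payload), nil
}

func main() {
	// 0x63 = major type 3, length 3. The payload starts with a stray
	// 0xff byte (an assumption, not the real transaction data).
	input := []byte{0x63, 0xff, '0', 'x'}

	_, err := decodeShortText(input, false)
	fmt.Println("strict:", err)

	s, err := decodeShortText(input, true)
	fmt.Printf("lenient: %q, err: %v\n", s, err)
}
```

The trade-off of the lenient mode is that downstream consumers may receive strings that are not valid UTF-8, so they should not assume validity when re-encoding or displaying transaction arguments.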

Misc

PR title will be clear as part of the changelog
PR is against the correct branch
PR is labelled appropriately
PR is linked to an issue

Ignore invalid utf8 characters while decoding

90de112

devbugging added the Improvement label Aug 9, 2023

devbugging self-assigned this Aug 9, 2023

koko1123 approved these changes Aug 9, 2023

devbugging wants to merge 1 commit into master from gregor/utf8-decode