Be robust to invalid utf-8 characters in task db #115
bergercookie wants to merge 1 commit into GothenburgBitFactory:develop
Sometimes the taskwarrior `pending.data` file contains non-printable characters that Python cannot yet parse properly, for example because an emoji was added to a task description. In these cases, `tasklib` should not crash; it should ignore the offending characters and keep parsing the results of the command.
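A minimal sketch of the behavior described above, assuming the task data arrives as raw bytes from a subprocess. The helper name `decode_task_output` and the sample bytes are hypothetical and are not taken from tasklib's code; the only real API used is Python's `bytes.decode` with its `errors` parameter.

```python
def decode_task_output(raw: bytes) -> str:
    """Decode command output as UTF-8, dropping bytes that are not valid UTF-8.

    Hypothetical helper for illustration; not tasklib's actual implementation.
    """
    return raw.decode("utf-8", errors="ignore")


# Sample output containing a stray 0xff byte, which is never valid in UTF-8.
raw_output = b'{"description": "buy milk \xff"}'

# Strict decoding (the default) raises on the invalid byte.
try:
    raw_output.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Lenient decoding skips the invalid byte and keeps everything else.
decoded = decode_task_output(raw_output)
```

With `errors="ignore"` the rest of the record is still parsed, at the cost of silently discarding whatever could not be decoded.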
I'm not entirely sure that ignoring the error is the way to go, because it means the records obtained from taskwarrior will be inconsistent with the actual stored data. In fact, it probably shouldn't be getting text that cannot be decoded at all, since arbitrary binary data isn't supported. Here is a small reproducer which tries to store the character '🦀':

    # add the rustlang crab emoji as task annotation from shell prompt:
    task 1 annotate -- $'\U0001f980'

After this, tasklib will crash when reading the task database. I am not sure whether this is rather a bug in taskwarrior, as the character doesn't seem to get encoded correctly; after the above annotation,

@bergercookie did you get the character into the database by specifically using
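The inconsistency concern raised here can be illustrated with a short Python sketch. The truncated byte sequence below is an assumed form of damage chosen for illustration; it is not known to be what taskwarrior actually wrote to the database.

```python
crab = "\U0001f980"  # the crab emoji from the reproducer

# A correctly encoded emoji round-trips through UTF-8 without any error.
encoded = crab.encode("utf-8")
round_trip = encoded.decode("utf-8")

# Assumed damage: the four-byte sequence loses its last byte.
damaged = b"note: " + encoded[:3]

# errors="ignore" drops the broken sequence without leaving a trace.
ignored = damaged.decode("utf-8", errors="ignore")

# errors="replace" substitutes U+FFFD, so the loss stays visible to the caller.
replaced = damaged.decode("utf-8", errors="replace")
```

Either handler avoids the crash, but only `errors="replace"` leaves a marker showing that the decoded record differs from the stored bytes.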
Note that the latest development version of taskwarrior seems to store this fine, see GothenburgBitFactory/taskwarrior#3286
This PR is from about 2 years back, so I honestly have no idea how I reproduced this 😅
I think the issue should be closed, because the problem doesn't exist anymore.