Conversation
Summary of ChangesHello @kevoreilly, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request initiates a significant migration of the logging data serialization mechanism from BSON to Nanopb (Protocol Buffers). The change aims to leverage Nanopb's efficiency for embedded systems, potentially leading to smaller log sizes and improved performance. It involves integrating the Nanopb library, defining a new Protobuf schema for logging events, and refactoring the core logging logic in Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request aims to replace "BSON" with "nanopb" (Protocol Buffers) for logging serialization. However, it introduces critical buffer over-read vulnerabilities in "log.c" within "log_string" and "log_wstring", where "strlen" can read past buffer boundaries due to ignored length parameters, potentially causing crashes or information disclosure. Beyond this, the implementation is incomplete, with significant logging functionality commented out or replaced, thread-safety mechanisms removed, and risks of uninitialized memory usage. The "protobuf_wrapper" also presents concerns regarding its fixed-size buffer and non-scalable field mapping.
| protobuf_context_t ctx; | ||
| //protobuf_init_message(&ctx, "debug"); | ||
| protobuf_append_string(&ctx, "msg", msg); | ||
| protobuf_finish(&ctx); | ||
| log_raw_direct(protobuf_data(&ctx), protobuf_size(&ctx)); | ||
| protobuf_destroy(&ctx); |
There was a problem hiding this comment.
The protobuf_context_t ctx variable is used without being initialized. This will lead to undefined behavior when protobuf_append_string and other functions attempt to write to it. You should initialize it using protobuf_init(&ctx); at the beginning of the function.
void debug_message(const char *msg) {
protobuf_context_t ctx;
protobuf_init(&ctx);
//protobuf_init_message(&ctx, "debug");
protobuf_append_string(&ctx, "msg", msg);
protobuf_finish(&ctx);
log_raw_direct(protobuf_data(&ctx), protobuf_size(&ctx));
protobuf_destroy(&ctx);
log_flush();
}| // if (!TryEnterCriticalSection(&g_mutex)) | ||
| // goto exit; |
There was a problem hiding this comment.
Commenting out the TryEnterCriticalSection call (and the corresponding LeaveCriticalSection at the end of the function) removes thread safety from the logging mechanism. This is a critical issue that can lead to race conditions and memory corruption if loq is called from multiple threads concurrently. The mutex protection must be restored.
| //bson_init( g_bson ); | ||
| //bson_append_int( g_bson, "I", index ); | ||
| hookinfo = hook_info(); | ||
| bson_append_ptr(g_bson, "C", hookinfo->return_address); | ||
| //bson_append_ptr(g_bson, "C", hookinfo->return_address); | ||
| // return location of malware callsite | ||
| bson_append_ptr(g_bson, "R", hookinfo->main_caller_retaddr); | ||
| //bson_append_ptr(g_bson, "R", hookinfo->main_caller_retaddr); | ||
| // return parent location of malware callsite | ||
| bson_append_ptr(g_bson, "P", hookinfo->parent_caller_retaddr); | ||
| bson_append_int(g_bson, "T", GetCurrentThreadId()); | ||
| bson_append_int(g_bson, "t", raw_gettickcount() - g_starttick ); | ||
| //bson_append_ptr(g_bson, "P", hookinfo->parent_caller_retaddr); | ||
| //bson_append_int(g_bson, "T", GetCurrentThreadId()); | ||
| //bson_append_int(g_bson, "t", raw_gettickcount() - g_starttick ); | ||
| // number of times this log was repeated -- we'll modify this | ||
| bson_append_int(g_bson, "r", 0); | ||
| //bson_append_int(g_bson, "r", 0); | ||
|
|
||
| compare_offset = (unsigned int)(g_bson->cur - bson_data(g_bson)); | ||
| //compare_offset = (unsigned int)(g_bson->cur - bson_data(g_bson)); | ||
| // the repeated value is encoded immediately before the stream we want to compare | ||
| repeat_offset = compare_offset - 4; | ||
|
|
||
| bson_append_start_array(g_bson, "args"); | ||
| bson_append_int( g_bson, "0", is_success ); | ||
| bson_append_ptr( g_bson, "1", return_value ); | ||
| //bson_append_start_array(g_bson, "args"); | ||
| //bson_append_int( g_bson, "0", is_success ); | ||
| //bson_append_ptr( g_bson, "1", return_value ); |
| static void log_string(const char *str, int length) | ||
| { | ||
| int ret; | ||
| char *utf8s; | ||
| int utf8len; | ||
|
|
||
| if (str == NULL) { | ||
| bson_append_string_n( g_bson, g_istr, "", 0 ); | ||
| protobuf_append_string(g_protobuf, g_istr, ""); | ||
| return; | ||
| } | ||
| utf8s = utf8_string(str, length); | ||
| utf8len = * (int *) utf8s; | ||
| ret = bson_append_binary( g_bson, g_istr, BSON_BIN_BINARY, utf8s+4, utf8len ); | ||
| if (ret == BSON_ERROR) { | ||
| bson_append_string_n(g_bson, g_istr, "", 0); | ||
| } | ||
| free(utf8s); | ||
| protobuf_append_string(g_protobuf, g_istr, str); | ||
| } |
There was a problem hiding this comment.
This refactoring introduces a critical buffer over-read vulnerability in "log_string". The "length" parameter is ignored, and "protobuf_append_string" relies on "strlen", which will read past buffer boundaries if a non-null-terminated string is passed (e.g., via "loq"'s 'S' format specifier). This can lead to crashes or information disclosure. The same issue applies to "log_wstring". The previous "BSON" implementation correctly handled string lengths.
| static void log_wstring(const wchar_t *str, int length) | ||
| { | ||
| int ret; | ||
| char *utf8s; | ||
| int utf8len; | ||
|
|
||
| if (str == NULL) { | ||
| bson_append_string_n( g_bson, g_istr, "", 0 ); | ||
| protobuf_append_string(g_protobuf, g_istr, ""); | ||
| return; | ||
| } | ||
| utf8s = utf8_wstring(str, length); | ||
| utf8len = * (int *) utf8s; | ||
| ret = bson_append_binary( g_bson, g_istr, BSON_BIN_BINARY, utf8s+4, utf8len ); | ||
| if (ret == BSON_ERROR) { | ||
| bson_append_string_n(g_bson, g_istr, "", 0); | ||
| // Convert wide string to UTF-8 or handle appropriately | ||
| char *utf8s = utf8_wstring(str, length); | ||
| if (utf8s) { | ||
| int utf8len = *(int*)utf8s; | ||
| // This needs adaptation for binary data in protobuf | ||
| protobuf_append_string(g_protobuf, g_istr, utf8s + 4); | ||
| free(utf8s); | ||
| } | ||
| free(utf8s); | ||
| } |
There was a problem hiding this comment.
Similar to the vulnerability in log_string, this function is also vulnerable to a buffer over-read. The result of utf8_wstring is not guaranteed to be null-terminated. Passing this buffer (utf8s + 4) to protobuf_append_string will cause strlen to be called on a non-null-terminated string, leading to a buffer over-read. The old implementation correctly used the explicit length of the converted string.
| static void log_wargv(int argc, const wchar_t ** argv) { | ||
| int i; | ||
|
|
||
| bson_append_start_array( g_bson, g_istr ); | ||
|
|
||
| for (i = 0; i < argc; i++) { | ||
| num_to_string(g_istr, 4, i); | ||
| log_wstring(argv[i], -1); | ||
| } | ||
|
|
||
| bson_append_finish_array( g_bson ); | ||
| } |
There was a problem hiding this comment.
This implementation of log_wargv is incorrect. Since bson_append_start_array and bson_append_finish_array have been removed, the loop will repeatedly call log_wstring, which in turn calls protobuf_append_string. This will overwrite the same field in the protobuf message on each iteration. To correctly log an array of strings, the schema needs to be updated to use a repeated field, and the logging logic must be adapted to add multiple entries to it.
| // You'll need to add protobuf_append_binary to your wrapper | ||
| // For now, we'll skip binary data or log as string | ||
| if (trunclength > 0) { | ||
| // This is a placeholder - you'll need proper binary handling | ||
| protobuf_append_string(g_protobuf, g_istr, "[binary data]"); | ||
| } |
There was a problem hiding this comment.
Logging a placeholder string "[binary data]" instead of the actual buffer content is a critical loss of functionality. The comment indicates that a binary append function is missing from the wrapper. This needs to be implemented in protobuf_wrapper.c and used here to restore the ability to log binary data.
| int protobuf_append_string(protobuf_context_t* ctx, const char* name, const char* value) { | ||
| if (!value) return 0; | ||
|
|
||
| if (ctx->event.which_message_type == HookEvent_str_tag) { | ||
| StrMessage* str_msg = &ctx->event.message_type.str; | ||
|
|
||
| if (strcmp(name, "name") == 0) { | ||
| str_msg->name.funcs.encode = encode_string_callback; | ||
| str_msg->name.arg = (void*)value; | ||
| } else if (strcmp(name, "type") == 0) { | ||
| str_msg->type.funcs.encode = encode_string_callback; | ||
| str_msg->type.arg = (void*)value; | ||
| } else if (strcmp(name, "category") == 0) { | ||
| str_msg->category.funcs.encode = encode_string_callback; | ||
| str_msg->category.arg = (void*)value; | ||
| } else if (strcmp(name, "api_name") == 0) { | ||
| str_msg->api_name.funcs.encode = encode_string_callback; | ||
| str_msg->api_name.arg = (void*)value; | ||
| } | ||
| } | ||
|
|
||
| return 1; | ||
| } | ||
|
|
||
| int protobuf_append_int(protobuf_context_t* ctx, const char* name, int32_t value) { | ||
| if (ctx->event.which_message_type == HookEvent_regular_call_tag) { | ||
| RegularCall* call = &ctx->event.message_type.regular_call; | ||
|
|
||
| if (strcmp(name, "I") == 0 || strcmp(name, "i") == 0) { | ||
| call->i = value; | ||
| } else if (strcmp(name, "T") == 0 || strcmp(name, "t") == 0) { | ||
| call->t = value; | ||
| } | ||
| } else if (ctx->event.which_message_type == HookEvent_str_tag) { | ||
| StrMessage* str_msg = &ctx->event.message_type.str; | ||
|
|
||
| if (strcmp(name, "I") == 0 || strcmp(name, "i") == 0) { | ||
| str_msg->i = value; | ||
| } | ||
| } | ||
|
|
||
| return 1; | ||
| } |
There was a problem hiding this comment.
Using strcmp to map string names to fields is inefficient and brittle. It makes the code hard to maintain, as any change in field names in the .proto file requires manual updates here. A more robust approach would be to pass field descriptors or use a more direct mapping mechanism to set the values, avoiding string comparisons at runtime.
| typedef struct { | ||
| HookEvent event; | ||
| uint8_t buffer[4096]; | ||
| size_t encoded_size; | ||
| } protobuf_context_t; |
There was a problem hiding this comment.
Using a fixed-size buffer of 4096 bytes for encoding protobuf messages introduces a risk of buffer overflows if a message exceeds this size. The previous BSON implementation appeared to handle dynamic resizing. Consider either increasing the buffer size significantly or, preferably, implementing dynamic resizing or at least adding checks in protobuf_finish to prevent writing past the end of the buffer.
| void protobuf_init(protobuf_context_t* ctx) { | ||
| // Zero the entire structure first | ||
| memset(ctx, 0, sizeof(protobuf_context_t)); | ||
|
|
||
| // Initialize nanopb structs manually for MSVC compatibility | ||
| ctx->event.which_message_type = HookEvent_regular_call_tag; | ||
|
|
||
| // Manually initialize the regular_call message | ||
| ctx->event.message_type.regular_call.i = 0; | ||
| ctx->event.message_type.regular_call.t = 0; | ||
| ctx->event.message_type.regular_call.r = 0; | ||
| ctx->event.message_type.regular_call.p = 0; | ||
|
|
||
| // Initialize string callbacks to NULL | ||
| ctx->event.message_type.regular_call.c.funcs.encode = NULL; | ||
| ctx->event.message_type.regular_call.c.arg = NULL; | ||
| ctx->event.message_type.regular_call.args.funcs.encode = NULL; | ||
| ctx->event.message_type.regular_call.args.arg = NULL; | ||
| ctx->event.message_type.regular_call.index.funcs.encode = NULL; | ||
| ctx->event.message_type.regular_call.index.arg = NULL; | ||
| ctx->event.message_type.regular_call.aux.funcs.encode = NULL; | ||
| ctx->event.message_type.regular_call.aux.arg = NULL; | ||
| ctx->event.message_type.regular_call.data.funcs.encode = NULL; | ||
| ctx->event.message_type.regular_call.data.arg = NULL; | ||
|
|
||
| ctx->encoded_size = 0; | ||
| } |
There was a problem hiding this comment.
Instead of manually zeroing out fields, it's safer and more maintainable to use the initializer macros generated by nanopb. This ensures all fields, including any future additions, are correctly initialized to their default state.
For example, you can use HookEvent_init_zero.
void protobuf_init(protobuf_context_t* ctx) {
// Use the nanopb-generated default initializer.
ctx->event = HookEvent_init_zero;
ctx->encoded_size = 0;
}
No description provided.