nanopb integration by kevoreilly · Pull Request #118 · kevoreilly/capemon

kevoreilly · 2025-12-23T10:16:30Z

No description provided.

gemini-code-assist · 2025-12-23T10:17:41Z

Summary of Changes

Hello @kevoreilly, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request initiates a significant migration of the logging data serialization mechanism from BSON to Nanopb (Protocol Buffers). The change aims to leverage Nanopb's efficiency for embedded systems, potentially leading to smaller log sizes and improved performance. It involves integrating the Nanopb library, defining a new Protobuf schema for logging events, and refactoring the core logging logic in log.c to utilize a new protobuf_wrapper layer. While the core serialization switch is implemented, some aspects like complex array handling and binary data logging are noted as requiring further development.

Highlights

Serialization Migration: Replaced the existing BSON data serialization library with Nanopb, a Protocol Buffers implementation optimized for constrained systems.
New Protobuf Schema: Introduced schema.proto to define the new data structures for logging events, including StrMessage and RegularCall types within a HookEvent oneof.
Protobuf Wrapper: Added protobuf_wrapper.c and protobuf_wrapper.h to provide a simplified API for interacting with Nanopb, mirroring some functionalities of the previous BSON library.
Logging Mechanism Refactor: Extensively updated log.c to switch from BSON-based data appending to the new Protobuf wrapper functions for various data types like integers, strings, and pointers.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request aims to replace "BSON" with "nanopb" (Protocol Buffers) for logging serialization. However, it introduces critical buffer over-read vulnerabilities in "log.c" within "log_string" and "log_wstring", where "strlen" can read past buffer boundaries due to ignored length parameters, potentially causing crashes or information disclosure. Beyond this, the implementation is incomplete, with significant logging functionality commented out or replaced, thread-safety mechanisms removed, and risks of uninitialized memory usage. The "protobuf_wrapper" also presents concerns regarding its fixed-size buffer and non-scalable field mapping.

gemini-code-assist · 2025-12-23T10:21:43Z

log.c

+	protobuf_context_t ctx;
+	//protobuf_init_message(&ctx, "debug");
+	protobuf_append_string(&ctx, "msg", msg);
+	protobuf_finish(&ctx);
+	log_raw_direct(protobuf_data(&ctx), protobuf_size(&ctx));
+	protobuf_destroy(&ctx);


The protobuf_context_t ctx variable is used without being initialized. This will lead to undefined behavior when protobuf_append_string and other functions attempt to write to it. You should initialize it using protobuf_init(&ctx); at the beginning of the function.

void debug_message(const char *msg) { protobuf_context_t ctx; protobuf_init(&ctx); //protobuf_init_message(&ctx, "debug"); protobuf_append_string(&ctx, "msg", msg); protobuf_finish(&ctx); log_raw_direct(protobuf_data(&ctx), protobuf_size(&ctx)); protobuf_destroy(&ctx); log_flush(); }

gemini-code-assist · 2025-12-23T10:21:43Z

log.c

+//	if (!TryEnterCriticalSection(&g_mutex))
+//		goto exit;


Commenting out the TryEnterCriticalSection call (and the corresponding LeaveCriticalSection at the end of the function) removes thread safety from the logging mechanism. This is a critical issue that can lead to race conditions and memory corruption if loq is called from multiple threads concurrently. The mutex protection must be restored.

gemini-code-assist · 2025-12-23T10:21:43Z

log.c

+	//bson_init( g_bson );
+	//bson_append_int( g_bson, "I", index );
 	hookinfo = hook_info();
-	bson_append_ptr(g_bson, "C", hookinfo->return_address);
+	//bson_append_ptr(g_bson, "C", hookinfo->return_address);
 	// return location of malware callsite
-	bson_append_ptr(g_bson, "R", hookinfo->main_caller_retaddr);
+	//bson_append_ptr(g_bson, "R", hookinfo->main_caller_retaddr);
 	// return parent location of malware callsite
-	bson_append_ptr(g_bson, "P", hookinfo->parent_caller_retaddr);
-	bson_append_int(g_bson, "T", GetCurrentThreadId());
-	bson_append_int(g_bson, "t", raw_gettickcount() - g_starttick );
+	//bson_append_ptr(g_bson, "P", hookinfo->parent_caller_retaddr);
+	//bson_append_int(g_bson, "T", GetCurrentThreadId());
+	//bson_append_int(g_bson, "t", raw_gettickcount() - g_starttick );
 	// number of times this log was repeated -- we'll modify this
-	bson_append_int(g_bson, "r", 0);
+	//bson_append_int(g_bson, "r", 0);

-	compare_offset = (unsigned int)(g_bson->cur - bson_data(g_bson));
+	//compare_offset = (unsigned int)(g_bson->cur - bson_data(g_bson));
 	// the repeated value is encoded immediately before the stream we want to compare
 	repeat_offset = compare_offset - 4;

-	bson_append_start_array(g_bson, "args");
-	bson_append_int( g_bson, "0", is_success );
-	bson_append_ptr( g_bson, "1", return_value );
+	//bson_append_start_array(g_bson, "args");
+	//bson_append_int( g_bson, "0", is_success );
+	//bson_append_ptr( g_bson, "1", return_value );


The core logic for logging API call details has been commented out. This effectively disables the primary functionality of the loq function. This appears to be a work-in-progress, but in its current state, no API call information will be recorded.

gemini-code-assist · 2025-12-23T10:21:43Z

log.c

 static void log_string(const char *str, int length)
 {
-	int ret;
-	char *utf8s;
-	int utf8len;
-
 	if (str == NULL) {
-		bson_append_string_n( g_bson, g_istr, "", 0 );
+		protobuf_append_string(g_protobuf, g_istr, "");
 		return;
 	}
-	utf8s = utf8_string(str, length);
-	utf8len = * (int *) utf8s;
-	ret = bson_append_binary( g_bson, g_istr, BSON_BIN_BINARY, utf8s+4, utf8len );
-	if (ret == BSON_ERROR) {
-		bson_append_string_n(g_bson, g_istr, "", 0);
-	}
-	free(utf8s);
+	protobuf_append_string(g_protobuf, g_istr, str);
 }


This refactoring introduces a critical buffer over-read vulnerability in "log_string". The "length" parameter is ignored, and "protobuf_append_string" relies on "strlen", which will read past buffer boundaries if a non-null-terminated string is passed (e.g., via "loq"'s 'S' format specifier). This can lead to crashes or information disclosure. The same issue applies to "log_wstring". The previous "BSON" implementation correctly handled string lengths.

gemini-code-assist · 2025-12-23T10:21:43Z

log.c

 static void log_wstring(const wchar_t *str, int length)
 {
-	int ret;
-	char *utf8s;
-	int utf8len;
-
 	if (str == NULL) {
-		bson_append_string_n( g_bson, g_istr, "", 0 );
+		protobuf_append_string(g_protobuf, g_istr, "");
 		return;
 	}
-	utf8s = utf8_wstring(str, length);
-	utf8len = * (int *) utf8s;
-	ret = bson_append_binary( g_bson, g_istr, BSON_BIN_BINARY, utf8s+4, utf8len );
-	if (ret == BSON_ERROR) {
-		bson_append_string_n(g_bson, g_istr, "", 0);
+	// Convert wide string to UTF-8 or handle appropriately
+	char *utf8s = utf8_wstring(str, length);
+	if (utf8s) {
+		int utf8len = *(int*)utf8s;
+		// This needs adaptation for binary data in protobuf
+		protobuf_append_string(g_protobuf, g_istr, utf8s + 4);
+		free(utf8s);
 	}
-	free(utf8s);
 }


Similar to the vulnerability in log_string, this function is also vulnerable to a buffer over-read. The result of utf8_wstring is not guaranteed to be null-terminated. Passing this buffer (utf8s + 4) to protobuf_append_string will cause strlen to be called on a non-null-terminated string, leading to a buffer over-read. The old implementation correctly used the explicit length of the converted string.

gemini-code-assist · 2025-12-23T10:21:43Z

log.c

 static void log_wargv(int argc, const wchar_t ** argv) {
 	int i;

-	bson_append_start_array( g_bson, g_istr );
-
 	for (i = 0; i < argc; i++) {
 		num_to_string(g_istr, 4, i);
 		log_wstring(argv[i], -1);
 	}
-
-	bson_append_finish_array( g_bson );
 }


This implementation of log_wargv is incorrect. Since bson_append_start_array and bson_append_finish_array have been removed, the loop will repeatedly call log_wstring, which in turn calls protobuf_append_string. This will overwrite the same field in the protobuf message on each iteration. To correctly log an array of strings, the schema needs to be updated to use a repeated field, and the logging logic must be adapted to add multiple entries to it.

gemini-code-assist · 2025-12-23T10:21:43Z

log.c

+	// You'll need to add protobuf_append_binary to your wrapper
+	// For now, we'll skip binary data or log as string
+	if (trunclength > 0) {
+		// This is a placeholder - you'll need proper binary handling
+		protobuf_append_string(g_protobuf, g_istr, "[binary data]");
+	}


Logging a placeholder string "[binary data]" instead of the actual buffer content is a critical loss of functionality. The comment indicates that a binary append function is missing from the wrapper. This needs to be implemented in protobuf_wrapper.c and used here to restore the ability to log binary data.

gemini-code-assist · 2025-12-23T10:21:43Z

protobuf_wrapper.c

+int protobuf_append_string(protobuf_context_t* ctx, const char* name, const char* value) {
+    if (!value) return 0;
+
+    if (ctx->event.which_message_type == HookEvent_str_tag) {
+        StrMessage* str_msg = &ctx->event.message_type.str;
+
+        if (strcmp(name, "name") == 0) {
+            str_msg->name.funcs.encode = encode_string_callback;
+            str_msg->name.arg = (void*)value;
+        } else if (strcmp(name, "type") == 0) {
+            str_msg->type.funcs.encode = encode_string_callback;
+            str_msg->type.arg = (void*)value;
+        } else if (strcmp(name, "category") == 0) {
+            str_msg->category.funcs.encode = encode_string_callback;
+            str_msg->category.arg = (void*)value;
+        } else if (strcmp(name, "api_name") == 0) {
+            str_msg->api_name.funcs.encode = encode_string_callback;
+            str_msg->api_name.arg = (void*)value;
+        }
+    }
+
+    return 1;
+}
+
+int protobuf_append_int(protobuf_context_t* ctx, const char* name, int32_t value) {
+    if (ctx->event.which_message_type == HookEvent_regular_call_tag) {
+        RegularCall* call = &ctx->event.message_type.regular_call;
+
+        if (strcmp(name, "I") == 0 || strcmp(name, "i") == 0) {
+            call->i = value;
+        } else if (strcmp(name, "T") == 0 || strcmp(name, "t") == 0) {
+            call->t = value;
+        }
+    } else if (ctx->event.which_message_type == HookEvent_str_tag) {
+        StrMessage* str_msg = &ctx->event.message_type.str;
+
+        if (strcmp(name, "I") == 0 || strcmp(name, "i") == 0) {
+            str_msg->i = value;
+        }
+    }
+
+    return 1;
+}


Using strcmp to map string names to fields is inefficient and brittle. It makes the code hard to maintain, as any change in field names in the .proto file requires manual updates here. A more robust approach would be to pass field descriptors or use a more direct mapping mechanism to set the values, avoiding string comparisons at runtime.

gemini-code-assist · 2025-12-23T10:21:43Z

protobuf_wrapper.h

+typedef struct {
+    HookEvent event;
+    uint8_t buffer[4096];
+    size_t encoded_size;
+} protobuf_context_t;


Using a fixed-size buffer of 4096 bytes for encoding protobuf messages introduces a risk of buffer overflows if a message exceeds this size. The previous BSON implementation appeared to handle dynamic resizing. Consider either increasing the buffer size significantly or, preferably, implementing dynamic resizing or at least adding checks in protobuf_finish to prevent writing past the end of the buffer.

gemini-code-assist · 2025-12-23T10:21:43Z

protobuf_wrapper.c

+void protobuf_init(protobuf_context_t* ctx) {
+    // Zero the entire structure first
+    memset(ctx, 0, sizeof(protobuf_context_t));
+
+    // Initialize nanopb structs manually for MSVC compatibility
+    ctx->event.which_message_type = HookEvent_regular_call_tag;
+
+    // Manually initialize the regular_call message
+    ctx->event.message_type.regular_call.i = 0;
+    ctx->event.message_type.regular_call.t = 0;
+    ctx->event.message_type.regular_call.r = 0;
+    ctx->event.message_type.regular_call.p = 0;
+
+    // Initialize string callbacks to NULL
+    ctx->event.message_type.regular_call.c.funcs.encode = NULL;
+    ctx->event.message_type.regular_call.c.arg = NULL;
+    ctx->event.message_type.regular_call.args.funcs.encode = NULL;
+    ctx->event.message_type.regular_call.args.arg = NULL;
+    ctx->event.message_type.regular_call.index.funcs.encode = NULL;
+    ctx->event.message_type.regular_call.index.arg = NULL;
+    ctx->event.message_type.regular_call.aux.funcs.encode = NULL;
+    ctx->event.message_type.regular_call.aux.arg = NULL;
+    ctx->event.message_type.regular_call.data.funcs.encode = NULL;
+    ctx->event.message_type.regular_call.data.arg = NULL;
+
+    ctx->encoded_size = 0;
+}


Instead of manually zeroing out fields, it's safer and more maintainable to use the initializer macros generated by nanopb. This ensures all fields, including any future additions, are correctly initialized to their default state.

For example, you can use HookEvent_init_zero.

void protobuf_init(protobuf_context_t* ctx) { // Use the nanopb-generated default initializer. ctx->event = HookEvent_init_zero; ctx->encoded_size = 0; }

nanopb integration: initial commit

417cb5a

nanopb integration: remove bson

858b329

gemini-code-assist bot reviewed Dec 23, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nanopb integration#118

nanopb integration#118
kevoreilly wants to merge 2 commits intocapemonfrom
nanopb

kevoreilly commented Dec 23, 2025

Uh oh!

gemini-code-assist bot commented Dec 23, 2025

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Dec 23, 2025

Uh oh!

gemini-code-assist bot Dec 23, 2025

Uh oh!

gemini-code-assist bot Dec 23, 2025

Uh oh!

gemini-code-assist bot Dec 23, 2025

Uh oh!

gemini-code-assist bot Dec 23, 2025

Uh oh!

gemini-code-assist bot Dec 23, 2025

Uh oh!

gemini-code-assist bot Dec 23, 2025

Uh oh!

gemini-code-assist bot Dec 23, 2025

Uh oh!

gemini-code-assist bot Dec 23, 2025

Uh oh!

gemini-code-assist bot Dec 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

kevoreilly commented Dec 23, 2025

Uh oh!

gemini-code-assist bot commented Dec 23, 2025

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant