Skip to content

Database Performance Optimizations#7

Open
everettbu wants to merge 1 commit into
db-cleanup-baselinefrom
db-cleanup-optimized
Open

Database Performance Optimizations#7
everettbu wants to merge 1 commit into
db-cleanup-baselinefrom
db-cleanup-optimized

Conversation

@everettbu

@everettbu everettbu commented Jul 26, 2025

Copy link
Copy Markdown

Test 7

Summary by CodeRabbit

  • Improvements
    • Enhanced annotation cleanup to run more efficiently and frequently, now triggering every minute instead of every ten minutes.
    • Improved reliability and performance of annotation and tag deletion, especially for large batches and on MySQL databases.
    • Strengthened integration test coverage for annotation cleanup, ensuring robust handling of large data sets and realistic scenarios.

… deadlocks on MySQL (#80329)

* Split subquery when cleaning annotations

* update comment

* Raise batch size, now that we pay attention to it

* Iterate in batches

* Separate cancellable batch implementation to allow for multi-statement callbacks, add overload for single-statement use

* Use split-out utility in outer batching loop so it respects context cancellation

* guard against empty queries

* Use SQL parameters

* Use same approach for tags

* drop unused function

* Work around parameter limit on sqlite for large batches

* Bulk insert test data in DB

* Refactor test to customise test data creation

* Add test for catching SQLITE_MAX_VARIABLE_NUMBER limit

* Turn annotation cleanup test to integration tests

* lint

---------

Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
@coderabbitai

coderabbitai Bot commented Jul 26, 2025

Copy link
Copy Markdown

Walkthrough

The annotation cleanup logic was refactored to avoid MySQL deadlocks by switching from single-statement batched deletes to a two-step process: fetching IDs in batches and deleting them iteratively. Integration tests for cleanup were expanded for efficiency and coverage, including large batch scenarios. The cleanup service now runs every minute instead of every ten minutes.

Changes

File(s) Change Summary
pkg/services/annotations/annotationsimpl/cleanup_test.go Integration test suite renamed and improved: added skip for short mode, expanded batch size tests, bulk/batched data setup, and more robust cleanup assertions. Helper refactored for batch inserts.
pkg/services/annotations/annotationsimpl/xorm_store.go Cleanup logic refactored to fetch and delete IDs in batches, avoiding MySQL deadlocks. Added helper methods for ID fetch/delete, replaced repeated raw SQL execution, improved logging, and removed old batch execution method.
pkg/services/cleanup/cleanup.go Changed cleanup service interval from 10 minutes to 1 minute. No other logic was modified.

Sequence Diagram(s)

sequenceDiagram
    participant Scheduler as CleanupService
    participant Repo as xormRepositoryImpl
    participant DB as Database

    Scheduler->>Repo: Run cleanup (every 1 min)
    loop Until no more records or cancelled
        Repo->>DB: Fetch batch of IDs to delete
        DB-->>Repo: Return IDs
        Repo->>DB: Delete records by IDs
        DB-->>Repo: Confirm deletion
    end
    Repo-->>Scheduler: Report total deleted
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇
In fields of code where cleanup grows,
Batches march where once one goes.
IDs fetched and swiftly swept,
Deadlocks dodged, the service kept
To run each minute, neat and fleet—
A tidy burrow, code complete!

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.2.2)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/product/migration-guide for migration instructions

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch db-cleanup-optimized

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@everettbu

Copy link
Copy Markdown
Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jul 27, 2025

Copy link
Copy Markdown
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
pkg/services/annotations/annotationsimpl/xorm_store.go (3)

13-13: Import organization: Consider grouping imports

The new migrator import is correctly added but consider grouping it with other grafana imports for better organization.


584-595: Good safety check, consider adding documentation

The empty condition check prevents accidentally fetching all records. Since this method builds SQL dynamically, consider adding a comment documenting that the condition parameter must be sanitized and should never contain user input directly.

+// fetchIDs retrieves IDs from the specified table matching the given condition.
+// The condition parameter must be pre-sanitized and should never contain direct user input.
+// An empty condition will return an error to prevent fetching the entire table.
 func (r *xormRepositoryImpl) fetchIDs(ctx context.Context, table, condition string) ([]int64, error) {

608-613: Consider SQL query size limits for large batches

While the SQLite parameter limit workaround is necessary, constructing SQL strings with thousands of IDs could hit query size limits or cause memory issues. Consider chunking large batches even for SQLite.

 	if r.db.GetDBType() == migrator.SQLite && r.cfg.AnnotationCleanupJobBatchSize > sqliteParameterLimit {
+		// Further chunk if batch size is extremely large to avoid query size limits
+		const maxIDsPerQuery = 10000
+		if len(ids) > maxIDsPerQuery {
+			var totalAffected int64
+			for i := 0; i < len(ids); i += maxIDsPerQuery {
+				end := i + maxIDsPerQuery
+				if end > len(ids) {
+					end = len(ids)
+				}
+				affected, err := r.deleteByIDs(ctx, table, ids[i:end])
+				totalAffected += affected
+				if err != nil {
+					return totalAffected, err
+				}
+			}
+			return totalAffected, nil
+		}
 		values := fmt.Sprint(ids[0])
pkg/services/annotations/annotationsimpl/cleanup_test.go (1)

273-291: Efficient bulk insert implementation

The batched insert approach with a batch size of 500 is much more efficient than individual inserts. Good use of min function to handle the last batch.

Consider extracting the batch size as a constant to avoid duplication:

+	const insertBatchSize = 500
 	err := store.WithDbSession(context.Background(), func(sess *db.Session) error {
-		batchsize := 500
-		for i := 0; i < len(newAnnotations); i += batchsize {
-			_, err := sess.InsertMulti(newAnnotations[i:min(i+batchsize, len(newAnnotations))])
+		for i := 0; i < len(newAnnotations); i += insertBatchSize {
+			_, err := sess.InsertMulti(newAnnotations[i:min(i+insertBatchSize, len(newAnnotations))])
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f84c8f6 and 81c45bf.

📒 Files selected for processing (3)
  • pkg/services/annotations/annotationsimpl/cleanup_test.go (9 hunks)
  • pkg/services/annotations/annotationsimpl/xorm_store.go (2 hunks)
  • pkg/services/cleanup/cleanup.go (1 hunks)
🔇 Additional comments (5)
pkg/services/annotations/annotationsimpl/xorm_store.go (1)

640-661: Well-designed batch processing helper

The untilDoneOrCancelled function provides a clean abstraction for the repeated batch processing pattern. Good error handling and context cancellation support.

pkg/services/cleanup/cleanup.go (1)

77-77: Significant increase in cleanup frequency - verify performance impact

Changing the cleanup interval from 10 minutes to 1 minute is a 10x increase in frequency. While the refactored cleanup logic is more efficient, this could still impact database performance, especially on busy systems.

Consider:

  1. Has this been load tested with realistic data volumes?
  2. Should this be configurable rather than hard-coded?
  3. Are there metrics in place to monitor the cleanup job duration and database impact?
pkg/services/annotations/annotationsimpl/cleanup_test.go (3)

18-22: Good practice: Integration test naming and skip condition

Renaming to include "Integration" prefix and adding skip for short test runs follows Go testing best practices.


97-111: Excellent test coverage for SQLite edge case

This test case effectively verifies that the implementation can handle batch sizes exceeding SQLite's parameter limit (999). The choice of 32767 for batch size and 40003 total annotations ensures thorough testing of the batching logic.


121-127: Good error handling with errors.Join

Using errors.Join to combine both deletion errors ensures no error is lost during cleanup. This is a best practice for handling multiple potential errors.

Comment on lines +534 to 535
r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use appropriate log levels for debugging information

The code uses Error log level for what appears to be debugging information. This will pollute error logs in production. Consider using Debug or Info level instead.

-			r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+			r.log.Debug("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond)

-			r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
+			if y != nil {
+				r.log.Error("Failed to clean annotations by time", "count", len(ids), "affected", x, "err", y)
+			} else {
+				r.log.Debug("Cleaned annotations by time", "count", len(ids), "affected", x)
+			}

Also applies to: 537-538

🤖 Prompt for AI Agents
In pkg/services/annotations/annotationsimpl/xorm_store.go around lines 534 to
535 and also lines 537 to 538, the log statements currently use the Error log
level for what seems to be debugging or informational messages. Change these log
calls from Error to Debug or Info level to prevent cluttering error logs in
production environments. Ensure the log level accurately reflects the message
severity.

Comment on lines +554 to +555
r.log.Error("Annotations to clean by count", "count", len(ids), "ids", ids, "cond", cond, "err", err)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use appropriate log levels for debugging information (MaxCount section)

Same issue with log levels in the MaxCount cleanup section.

-			r.log.Error("Annotations to clean by count", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+			r.log.Debug("Annotations to clean by count", "count", len(ids), "ids", ids, "cond", cond)

-			r.log.Error("cleaned annotations by count", "count", len(ids), "affected", x, "err", y)
+			if y != nil {
+				r.log.Error("Failed to clean annotations by count", "count", len(ids), "affected", x, "err", y)
+			} else {
+				r.log.Debug("Cleaned annotations by count", "count", len(ids), "affected", x)
+			}

Also applies to: 557-558

🤖 Prompt for AI Agents
In pkg/services/annotations/annotationsimpl/xorm_store.go around lines 554 to
558, the log statements in the MaxCount cleanup section are using the Error log
level for what appears to be debugging or informational messages. Change these
log statements to use a more appropriate log level such as Debug or Info instead
of Error to better reflect the nature of the messages and avoid misleading error
logs.

if err != nil {
return 0, err
}
r.log.Error("Tags to clean", "count", len(ids), "ids", ids, "cond", cond, "err", err)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Use appropriate log levels in CleanOrphanedAnnotationTags

Same logging level issue in the orphaned tags cleanup.

-		r.log.Error("Tags to clean", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+		r.log.Debug("Tags to clean", "count", len(ids), "ids", ids, "cond", cond)

-		r.log.Error("cleaned tags", "count", len(ids), "affected", x, "err", y)
+		if y != nil {
+			r.log.Error("Failed to clean tags", "count", len(ids), "affected", x, "err", y)
+		} else {
+			r.log.Debug("Cleaned tags", "count", len(ids), "affected", x)
+		}

Also applies to: 579-579

🤖 Prompt for AI Agents
In pkg/services/annotations/annotationsimpl/xorm_store.go at lines 576 and 579,
the log statements for cleaning orphaned annotation tags use the Error log level
inappropriately. Change these log calls to use a more suitable log level such as
Info or Debug, depending on the importance of the message, to better reflect the
severity and avoid misleading error reporting.

@github-actions

Copy link
Copy Markdown
Contributor

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 2 weeks if no further activity occurs. Please feel free to give a status update or ping for review. Thank you for your contributions!

@github-actions github-actions Bot added the stale label Aug 27, 2025
@github-actions

Copy link
Copy Markdown
Contributor

This pull request has been automatically closed because it has not had any further activity in the last 2 weeks. Thank you for your contributions!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants