@Jackcuii
Collaborator

This is a Draft PR

Description

This PR adds CMU 15-445 Lab 0 (Count-min Sketch) to the Benchmark Suite. The task is to implement a thread-safe count-min sketch, a probabilistic data structure for estimating item frequencies in streaming data. The lab exercises C++ programming, concurrency, algorithms, and database systems concepts.
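
For readers unfamiliar with the structure, here is a minimal sketch of the idea, not the lab's reference solution. The class name `SimpleCountMinSketch`, its methods, and the hash-mixing constants are all hypothetical and do not reflect the interface declared in `count_min_sketch.h`. It assumes string keys, a nonzero width and depth, and atomic counters as one possible way to make inserts thread-safe.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// Hypothetical illustration; not the interface from count_min_sketch.h.
class SimpleCountMinSketch {
 public:
  // Assumes width > 0 and depth > 0. Counters start at zero.
  SimpleCountMinSketch(std::size_t width, std::size_t depth)
      : width_(width), depth_(depth), counters_(width * depth) {}

  // Increment one counter per row. Atomic increments mean concurrent
  // inserts never lose an update, without taking a global lock.
  void Insert(const std::string &item) {
    for (std::size_t row = 0; row < depth_; ++row) {
      counters_[Slot(item, row)].fetch_add(1, std::memory_order_relaxed);
    }
  }

  // Hash collisions can only inflate a counter, so the minimum across
  // rows is the tightest available upper bound on the true count.
  std::uint32_t Count(const std::string &item) const {
    std::uint32_t estimate = std::numeric_limits<std::uint32_t>::max();
    for (std::size_t row = 0; row < depth_; ++row) {
      estimate = std::min(estimate,
                          counters_[Slot(item, row)].load(std::memory_order_relaxed));
    }
    return estimate;
  }

 private:
  // Assumption: derive a per-row hash by mixing the row number into
  // std::hash. A real implementation may use independent hash functions.
  std::size_t Slot(const std::string &item, std::size_t row) const {
    std::uint64_t h = std::hash<std::string>{}(item);
    h ^= (static_cast<std::uint64_t>(row) + 1) * 0x9e3779b97f4a7c15ULL;
    h ^= h >> 32;
    h *= 0xd6e8feb86659fd93ULL;
    h ^= h >> 32;
    return row * width_ + static_cast<std::size_t>(h % width_);
  }

  std::size_t width_;
  std::size_t depth_;
  std::vector<std::atomic<std::uint32_t>> counters_;
};
```

With this sketch, `Count` never underestimates: after inserting `"a"` twice, `Count("a")` is at least 2, and it exceeds 2 only if other items collide with `"a"` in every row.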

Changes

  • Added new task directory data/cmu_15-445/task_cpp/ with complete lab setup

Testing

Tested end to end with Claude Haiku.

TODOs

  • P1
  • P2
  • P3
  • P4

Collaborator

@tareknaser tareknaser left a comment

Thanks for the great work. This looks almost ready to merge. I made a few small updates including adding a course entry and a reference solution (based on Claude’s trajectory) and rebasing on top of main. I’ll add a couple more minor updates in separate comments for you to review.

If everything looks good, we can go ahead and merge.

Do you think we can simplify this file to the following?

#!/bin/bash
set -e

echo "=== Setting up CMU 15-445 CountMinSketch Lab ==="

cd /workspace

echo "Installing git"
apt-get update > /dev/null 2>&1
apt-get install -y git > /dev/null 2>&1

echo "Cloning bustub repository"
git clone https://github.com/cmu-db/bustub.git /tmp/bustub > /dev/null 2>&1
git -C /tmp/bustub checkout bd3912741c45370d5f9c7bef638452b10b140138 > /dev/null 2>&1

echo "Moving source to workspace"
mv /tmp/bustub/* ./
mv /tmp/bustub/.clang-format ./ 2>/dev/null || true
mv /tmp/bustub/.clang-tidy ./ 2>/dev/null || true
rm -rf /tmp/bustub .git

echo "Installing build dependencies"
build_support/packages.sh -y > /dev/null 2>&1

echo "Creating checksums for protected files"
mkdir -p /tmp/checksums
sha256sum test/primer/count_min_sketch_test.cpp > /tmp/checksums/test.sha256

echo "Building project"
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug .. > /dev/null 2>&1
make -j$(nproc) > /dev/null 2>&1

echo "Setup complete"
echo "Agent should implement:"
echo "  - src/include/primer/count_min_sketch.h"
echo "  - src/primer/count_min_sketch.cpp"

And the evaluation script could be simplified to:

#!/bin/bash
set -e

cd /workspace

# Verify test file wasn't modified
echo "Verifying protected files were not modified"
if ! sha256sum -c /tmp/checksums/test.sha256 > /dev/null 2>&1; then
    echo "FAIL: test/primer/count_min_sketch_test.cpp was modified"
    exit 1
fi
echo "Protected files unchanged"

# Build
echo ""
echo "=== Building ==="
rm -rf build
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug .. > /dev/null 2>&1
if ! make -j$(nproc); then
    echo "FAIL: Build failed"
    exit 1
fi

# Run tests
echo ""
echo "=== Running Tests ==="
make -j$(nproc) count_min_sketch_test > /dev/null 2>&1
if ! ./test/count_min_sketch_test; then
    echo "FAIL: Tests failed"
    exit 1
fi

# Format check
echo ""
echo "=== Format Check ==="
make format > /dev/null 2>&1
if ! make check-clang-tidy-p0; then
    echo "FAIL: clang-tidy check failed"
    exit 1
fi

echo ""
echo "PASS: All checks passed"
exit 0

There is no need for a scoring scheme, since we only report pass/fail. What do you think?

@Jackcuii
Collaborator Author

> Thanks for the great work. This looks almost ready to merge. I made a few small updates including adding a course entry and a reference solution (based on Claude’s trajectory) and rebasing on top of main. I’ll add a couple more minor updates in separate comments for you to review.
>
> If everything looks good, we can go ahead and merge.

Thank you Tarek! I will add more tests to this PR to scale it up~
