feat: Query Builder API implemented with Count #584

Open
sriramk03 wants to merge 2 commits into OpenMined:main from sriramk03:feat_query_builder

Conversation

@sriramk03
Contributor

Description

The design of this new API is based on the principles of the PlumeKotlin public API, which uses a guided, multi-builder pattern to ensure queries are constructed in a valid and logical sequence. This enforcement of the FROM -> GROUP BY -> AGGREGATE order at the API level makes the query construction process more robust and intuitive.

This initial pull request establishes the foundation of the new API, implementing the core builder structure with support for count aggregations.

Implementation Details
The new Advanced Query Builder is implemented through a sequence of specialized builder classes:

  1. QueryBuilder: The public entry point for starting a query with the from_() method. It returns a GroupByBuilder.
  2. GroupByBuilder: An internal class responsible for the group_by() operation. It returns an AggregationBuilder.
  3. AggregationBuilder: An internal class for defining one or more aggregations.
    • For this initial implementation, it supports the count() method.
    • It contains the build() method, which constructs the final Query object.
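
The builder chain described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class and method names (QueryBuilder, from_(), GroupByBuilder, group_by(), AggregationBuilder, count(), build(), Query) come from the description, while the constructor arguments, the Query fields, and the check in build() are assumptions made for the example.

```python
import dataclasses
from typing import Any, List, Sequence, Tuple


@dataclasses.dataclass
class Query:
    """Hypothetical result of build(); the real Query may hold other fields."""
    data: Any
    group_keys: Sequence[str]
    aggregations: List[Tuple[str, str]]


class AggregationBuilder:
    """Last stage: collects aggregations and builds the Query."""

    def __init__(self, data: Any, group_keys: Sequence[str]):
        self._data = data
        self._group_keys = list(group_keys)
        self._aggregations: List[Tuple[str, str]] = []

    def count(self, output_column: str) -> "AggregationBuilder":
        # Returns self so that several aggregations can be chained.
        self._aggregations.append(("count", output_column))
        return self

    def build(self) -> Query:
        # Assumed validation: a query needs at least one aggregation.
        if not self._aggregations:
            raise ValueError("Add at least one aggregation before build().")
        return Query(self._data, self._group_keys, list(self._aggregations))


class GroupByBuilder:
    """Middle stage: only group_by() is available here."""

    def __init__(self, data: Any):
        self._data = data

    def group_by(self, *group_keys: str) -> AggregationBuilder:
        return AggregationBuilder(self._data, group_keys)


class QueryBuilder:
    """Public entry point: only from_() is available here."""

    def from_(self, data: Any) -> GroupByBuilder:
        return GroupByBuilder(data)


rows = [("alice", "Mon"), ("bob", "Tue")]
query = QueryBuilder().from_(rows).group_by("day").count("visits").build()
```

Because each stage exposes only the next legal step, calling count() before group_by() fails with an AttributeError instead of producing an invalid query.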

How has this been tested?

  • A new test suite has been added in tests/advanced_query_builder_test.py.
  • The tests validate the new multi-builder pattern and verify the functionality of the count aggregation.

Checklist

Collaborator

@dvadym left a comment

Thanks! Some comments below

@@ -0,0 +1,521 @@
# Copyright 2023 OpenMined.
Collaborator

The copyright year should be 2025.

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Advanced query builder API."""
Collaborator

Let's drop "advanced" from the name, just query_builder.py

from pipeline_dp import pipeline_backend


class GroupsBalance(enum.Enum):
Collaborator

Let's drop GroupsBalance (it isn't supported in Python)

public_groups: Optional[Sequence[Any]]
groups_balance: Optional[GroupsBalance]

class Builder:
Collaborator

No need for a builder on a dataclass. Builders are more of a Java style (because Java has no named arguments and no default values for arguments); in Python, constructors are used.
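
To illustrate the point, here is a minimal sketch of a dataclass constructed directly with keyword arguments and default values. The class name GroupBySpec and the group_keys field are assumptions for the example; only public_groups appears in the diff.

```python
import dataclasses
from typing import Any, Optional, Sequence


@dataclasses.dataclass
class GroupBySpec:
    """Hypothetical spec; only public_groups is taken from the diff."""
    group_keys: Sequence[str]
    # A default value makes the field optional at the call site,
    # which is what a Java builder would otherwise provide.
    public_groups: Optional[Sequence[Any]] = None


# Keyword arguments give the readability of builder setters.
spec = GroupBySpec(group_keys=["day"], public_groups=["Mon", "Tue"])
default_spec = GroupBySpec(group_keys=["day"])
```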



@dataclasses.dataclass
class GaussianCountSpec(CountSpec):
Collaborator

We don't need separate Gaussian/Laplace classes, just CountSpec. See the Kotlin file for an example.
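
One possible shape for a single CountSpec is sketched below. It assumes the noise type becomes a field rather than a subclass; the NoiseKind enum, the field names, and the Laplace default are all hypothetical here, and the Kotlin file referenced above may structure this differently.

```python
import dataclasses
import enum


class NoiseKind(enum.Enum):
    """Hypothetical local enum, defined here to keep the sketch self-contained."""
    LAPLACE = "laplace"
    GAUSSIAN = "gaussian"


@dataclasses.dataclass
class CountSpec:
    """Single spec for count; the noise type is data, not a subclass."""
    output_column_name: str
    noise_kind: NoiseKind = NoiseKind.LAPLACE  # assumed default


laplace_count = CountSpec(output_column_name="visits")
gaussian_count = CountSpec(output_column_name="visits",
                           noise_kind=NoiseKind.GAUSSIAN)
```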

# max_contributions_per_partition is implicitly 1


class Budget(abc.ABC):
Collaborator

Let's have only 1 dataclass (which contains epsilon and delta, with delta = 0 as the default value)
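
A minimal sketch of such a dataclass, assuming the name Budget is kept. The frozen flag and the range checks in __post_init__ are additions for the example, not something requested in the review.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Budget:
    """Privacy budget; delta defaults to 0, i.e. pure epsilon-DP."""
    epsilon: float
    delta: float = 0.0

    def __post_init__(self):
        # Assumed validation, not part of the review request.
        if self.epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if not 0 <= self.delta < 1:
            raise ValueError("delta must be in [0, 1)")


pure_dp = Budget(epsilon=1.0)
approx_dp = Budget(epsilon=1.0, delta=1e-6)
```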


def run(self, test_mode: bool = False):
    """Runs the DP query."""
    backend = pipeline_backend.LocalBackend()
Collaborator

Replace the implementation of run with raise NotImplementedError(). The purpose of this PR is the skeleton, not the implementation. "1 purpose for 1 PR" ))
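
A skeleton version of run could look like the sketch below. Note that the exception to raise is NotImplementedError; NotImplemented is a constant meant for binary operator methods and cannot be raised or called. The enclosing class is reduced to a stub and the message text is an assumption.

```python
class Query:
    """Stub of the query class; fields are omitted in this sketch."""

    def run(self, test_mode: bool = False):
        """Runs the DP query."""
        # Skeleton only: the execution logic belongs in a follow-up PR.
        raise NotImplementedError("Query.run() is not implemented yet.")
```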

contribution_bounding_level: Optional[ContributionBoundingLevel]
budget: EpsDeltaBudget

class Builder:
Collaborator

Drop the builder here as well.



@dataclasses.dataclass
class OptimalGroupSelectionGroupBySpec(GroupBySpec):
Collaborator

We don't need this class, just GroupBySpec.
