MetricComparisons into InformationImbalance + NeighborhoodOverlap; move k*+ID functions into KStar class #181
Merged
Summary
Refactor the MetricComparisons class into InformationImbalance and NeighborhoodOverlap, and relocate two k* ID-estimation methods to the KStar class. Fully backward compatible.
Why
The MetricComparisons class is more than 1000 lines long. It contains two conceptually different sets of functions: those that compute the information imbalance and those that compute the neighborhood overlap, each with its own helper methods.
What changed
New classes
InformationImbalance (in dadapy/information_imbalance.py) collects all 17 imbalance and causality methods (return_inf_imb_*, greedy_feature_selection_*, the causality block).
NeighborhoodOverlap (in dadapy/neighborhood_overlap.py) contains the neighborhood overlap functions: return_label_overlap, return_data_overlap, and _label_imbalance_helper.
Base. Shared helper _get_nn_indices lifted to dadapy/_utils/metric_comparisons.py as a module-level function.
Data now inherits from InformationImbalance and NeighborhoodOverlap directly (not through MetricComparisons).
Backward-compatibility
MetricComparisons shrunk from ~1050 lines to ~35 lines and is kept only for backward compatibility. Existing code (from dadapy import MetricComparisons, instantiation, method calls) works unchanged.
Symmetric constructor API for the comparison classes
InformationImbalance(X1, X2).return_information_imbalance()
NeighborhoodOverlap(X1, X2).return_data_overlap(k=30)
NeighborhoodOverlap(X, labels=y).return_label_overlap(k=5)
Relocation of two k-star ID-estimation methods to the KStar class
return_ids_kstar_gride and return_ids_kstar_binomial moved from Data to KStar, where they conceptually belong; I personally don't understand why those two functions were in Data.
Data shrank from 243 to 76 lines and is now a pure container.
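For reviewers who want a concrete picture of the backward-compatibility pattern described above (a thin class that only inherits from the two new classes, plus a neighbor-index helper lifted to module level), here is a rough sketch. It is a toy illustration and not dadapy's code: the names ImbalanceSketch, OverlapSketch, LegacyComparisons and get_nn_indices, along with their signatures, are hypothetical stand-ins, and the overlap is a plain brute-force shared-neighbor fraction.

```python
import math


def get_nn_indices(points, k):
    """Hypothetical module-level helper shared by both sketch classes.

    Returns, for every point, the indices of its k nearest neighbors
    (brute force, Euclidean distance, the point itself excluded).
    """
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(others[:k])
    return neighbors


class ImbalanceSketch:
    """Stand-in for the class holding the imbalance-related methods."""

    def __init__(self, coordinates=None):
        self.X = coordinates

    def nearest_neighbor(self, i):
        # Index of the single nearest neighbor of point i
        return get_nn_indices(self.X, 1)[i][0]


class OverlapSketch:
    """Stand-in for the class holding the overlap-related methods."""

    def __init__(self, coordinates=None):
        self.X = coordinates

    def data_overlap(self, other, k):
        # Fraction of k-nearest neighbors shared between self.X and other
        nn_a = get_nn_indices(self.X, k)
        nn_b = get_nn_indices(other, k)
        shared = sum(len(set(a) & set(b)) for a, b in zip(nn_a, nn_b))
        return shared / (k * len(self.X))


class LegacyComparisons(ImbalanceSketch, OverlapSketch):
    """Thin shim: defines nothing itself, so old call sites keep working."""


points = [(0, 0), (1, 0), (5, 0), (6, 0)]
legacy = LegacyComparisons(points)
print(legacy.data_overlap(points, k=1))
print(legacy.nearest_neighbor(2))
```

Because the shim adds no methods of its own, any call that used to go through the old combined class resolves to one of the two parent classes via Python's normal method resolution order.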