added code for model merging and sparse training methods #167
Open
SaminYeasar wants to merge 21 commits into microsoft:main
SaminYeasar
commented
Apr 28, 2025
- Added merging methods:
  - SLERP
  - LERP
  - TiesMerge: made correction
  - TaskArithmatic
  - ModelBreadcrumbs
  - UniformMerge
  - UniformSparseMerge
- Added different scoring functions
- grow and drop
- layer drop + sparse
- model-wise sparse
- gradient-magnitude based sparse
- weight-magnitude based sparse
- added backward hook: masks gradients during backprop
- iterative and one-shot sparse training
- efficient sparse expert saving
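For readers unfamiliar with the first two merging methods in the list, here is a minimal sketch of LERP and SLERP on flat weight vectors in plain Python. It is not the code from this PR: the function names `lerp` and `slerp`, the `eps` threshold, and the fallback to LERP for near-parallel or zero-norm vectors are assumptions made for illustration.

```python
import math


def lerp(a, b, t):
    """Linear interpolation between two equally sized weight vectors."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    Falls back to LERP when either vector is (near) zero or when the
    vectors are (near) parallel, where the SLERP weights are ill-defined.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a < eps or norm_b < eps:
        return lerp(a, b, t)

    # Angle between the two vectors, with the cosine clamped for safety.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        return lerp(a, b, t)

    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


# Hypothetical toy "experts": two orthogonal unit vectors.
expert_a = [1.0, 0.0]
expert_b = [0.0, 1.0]
lerp_mid = lerp(expert_a, expert_b, 0.5)
slerp_mid = slerp(expert_a, expert_b, 0.5)
```

For orthogonal unit vectors the SLERP midpoint stays on the unit circle, while the LERP midpoint has a smaller norm. That difference is the usual motivation for offering both methods.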
added merging methods and scoring functions
moved merging model to separate folder, updated essential function and arguments for sparse-adapter training
updated model merging code
merging the latest update
fixed right sparse merge method
fverac
reviewed
Jun 18, 2025
    def __init__(self, config: SLERPMergeConfig = None):
        super().__init__(config or SLERPMergeConfig())

    def load_mask(self, expert):
Contributor
Is it possible to transition all logic for saving and loading of the mask to be done via the state_dict, as is currently implemented in the main branch?
In other words, can we remove the need for ever loading a .npz file?
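One way to read this request, sketched under assumptions: if the sparsity mask is registered as a module buffer, PyTorch includes it in `state_dict()` automatically, so no separate `.npz` file is needed. The class name `MaskedLinear`, the buffer name `weight_mask`, and the magnitude-based mask construction below are hypothetical and do not reflect how the main branch or this PR implements it.

```python
import io

import torch
import torch.nn as nn


class MaskedLinear(nn.Module):
    """Hypothetical linear layer that keeps its sparsity mask as a buffer."""

    def __init__(self, in_features, out_features, sparsity=0.5):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

        # Weight-magnitude mask: drop the k smallest-magnitude weights.
        magnitudes = self.linear.weight.detach().abs().flatten()
        k = int(sparsity * magnitudes.numel())
        mask = torch.ones_like(magnitudes, dtype=torch.bool)
        if k > 0:
            drop = torch.topk(magnitudes, k, largest=False).indices
            mask[drop] = False

        # Buffers are saved and restored together with the parameters.
        self.register_buffer("weight_mask", mask.view_as(self.linear.weight))

    def forward(self, x):
        masked_weight = self.linear.weight * self.weight_mask
        return nn.functional.linear(x, masked_weight, self.linear.bias)


# Round trip through an in-memory checkpoint instead of a file on disk.
layer = MaskedLinear(4, 3, sparsity=0.5)
checkpoint = io.BytesIO()
torch.save(layer.state_dict(), checkpoint)
checkpoint.seek(0)

restored = MaskedLinear(4, 3, sparsity=0.5)
restored.load_state_dict(torch.load(checkpoint))
```

After `load_state_dict`, `restored.weight_mask` matches the saved mask even though `restored` was initialized with different random weights, because the buffer is overwritten from the checkpoint.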