Conversation

@daniel-monroe (Member)

No description provided.

Arcturai and others added 30 commits February 21, 2022 15:22
dense function combines dydense/dyrelu/linearscaling/gating into one function
Add logit gating, dense_layer, stop file, make dyrelu slopes/intercepts trainable
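A minimal NumPy sketch of the activation patterns named above (all names and shapes here are illustrative, not the repo's actual API): a DyReLU-style activation takes the max over learned linear pieces whose slopes/intercepts are trainable, and logit gating multiplies the input by a sigmoid of trainable gate logits.

```python
import numpy as np

def dyrelu(x, slopes, intercepts):
    """DyReLU-style activation: max over K learned linear pieces,
    max_k(slopes[k] * x + intercepts[k]).

    In the real layer, slopes/intercepts are trainable parameters;
    here they are plain arrays. Shapes: x (N, C), slopes/intercepts (K, C).
    """
    pieces = slopes[:, None, :] * x[None, :, :] + intercepts[:, None, :]  # (K, N, C)
    return pieces.max(axis=0)

def logit_gate(x, gate_logits):
    """Logit gating: elementwise multiply by sigmoid(gate_logits),
    where gate_logits would be trainable."""
    return x / (1.0 + np.exp(-gate_logits))
```

With slopes (1, 0) and zero intercepts, `dyrelu` reduces to a plain ReLU, which is why making the slopes/intercepts trainable strictly generalizes the baseline activation.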
Weight gen (simple_gen) generates attention weights from each square by compressing then doing a batched dense to 64; buckets divides the training data based on material left.
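A sketch of the simple_gen idea in NumPy (shapes and weight names are assumptions for illustration): each of the 64 square tokens is compressed to a small hidden vector, then a dense layer maps each compressed token to a weight over all 64 squares, yielding a 64x64 additive term for the attention logits.

```python
import numpy as np

rng = np.random.default_rng(0)

def simple_gen(tokens, w_compress, w_expand):
    """Generate an additive attention-weight term from each square's token.

    tokens:     (64, C) per-square embeddings for one position.
    w_compress: (C, H)  compresses each token to a small hidden vector.
    w_expand:   (H, 64) maps each compressed token to weights over all squares.
    Returns a (64, 64) matrix of attention logits.
    """
    hidden = tokens @ w_compress  # (64, H)
    return hidden @ w_expand      # (64, 64)

C, H = 8, 4
tokens = rng.normal(size=(64, C))
logits = simple_gen(tokens, rng.normal(size=(C, H)), rng.normal(size=(H, 64)))
```

The compression step keeps the parameter count small: the generator costs O(C*H + H*64) weights rather than a full O(C*64) per square.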
Dytalking heads at this stage dynamically generates the projection matrices for the attention weights (same for all square pairs). Fixed set_visible_devices error by initializing tensorflow first in TFProcess and making DyDense temperature an instance attribute.
DyDense layers had an issue saving sublayers, so the design approach of the squeeze-excite layers is used, i.e., sublayers are moved outside into a function.
Fullgen compresses the tokens and combines them to extract global information into attention weights.
Dytalking heads adds a residual to the matrix specifying the linear transformation
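A NumPy sketch of the dynamic talking-heads mixing described above (illustrative only): the per-head attention logits are mixed across the head dimension by a generated projection matrix, the same matrix for all square pairs, with the identity added as a residual so the layer starts near a pass-through.

```python
import numpy as np

rng = np.random.default_rng(1)

def dy_talking_heads(attn_logits, mix):
    """Mix attention logits across heads with a generated projection.

    attn_logits: (H, 64, 64) pre-softmax attention weights per head.
    mix:         (H, H) projection; in the dynamic version this would be
                 generated from the input, here it is just a given array.
                 The identity is added as a residual.
    """
    mix = mix + np.eye(mix.shape[0])  # residual on the projection matrix
    return np.einsum('hk,hij->kij', mix, attn_logits)

H = 4
logits = rng.normal(size=(H, 64, 64))
mixed = dy_talking_heads(logits, np.zeros((H, H)))
```

With a zero generated matrix the residual makes the layer an exact identity, so training can smoothly deviate from standard multi-head attention.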
Removed old modules which were not useful, including yaml references, and removed legacy resnet code. Added arc's encoding with an option in yaml, and added example.yaml
Update README to describe talking heads, fullgen, and dynamic kernel methods. Removed leelalogs and configs, except for example.yaml.
Fixed typos in tfprocess and README, made some config options optional, fixed arc encoding, removed fullgen bias
Added search_loss, which is one over the prediction for the best move, and confident_accuracy, which is the accuracy for positions where there is a clear best move. Removed simple gating. Also updated README to include info on auxiliary losses.
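The two auxiliary metrics can be sketched directly from their definitions in the commit message; the 0.9 "clear best move" threshold below is an assumption, not necessarily what the repo uses.

```python
import numpy as np

def search_loss(policy_pred, best_move):
    """One over the predicted probability of the best move, per position.

    policy_pred: (N, M) predicted move probabilities.
    best_move:   (N,) index of the best move in each position.
    """
    return 1.0 / policy_pred[np.arange(len(best_move)), best_move]

def confident_accuracy(policy_pred, target, threshold=0.9):
    """Accuracy restricted to positions with a clear best move, i.e. where
    the target policy puts at least `threshold` mass on a single move.
    (The exact threshold is an assumption here.)"""
    clear = target.max(axis=1) >= threshold
    correct = policy_pred.argmax(axis=1) == target.argmax(axis=1)
    return correct[clear].mean() if clear.any() else np.nan
```

Note that `search_loss` blows up as the predicted probability of the best move approaches zero, so it penalizes missing the best move much harder than cross-entropy does.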
Smolgen is a more efficient version of fullgen; also added squared ReLU, which adds 0.5% policy accuracy
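A NumPy sketch of the smolgen/fullgen idea (all shapes and names are illustrative): compress each token, pool the compressed tokens into one global vector, then expand that vector into per-head 64x64 attention logits. Squared ReLU is simply `relu(x)**2`.

```python
import numpy as np

rng = np.random.default_rng(2)

def squared_relu(x):
    """Squared ReLU: max(x, 0)**2."""
    return np.maximum(x, 0.0) ** 2

def smolgen(tokens, w_token, w_global, w_heads):
    """Smolgen-style generator: extract global information from all tokens
    and emit an additive attention-logit term for every head.

    tokens:   (64, C)        per-square embeddings.
    w_token:  (C, T)         per-token compression.
    w_global: (64*T, G)      mixes compressed tokens into one global vector.
    w_heads:  (G, H*64*64)   expands the global vector to per-head logits.
    """
    t = squared_relu(tokens @ w_token)          # (64, T)
    g = squared_relu(t.reshape(-1) @ w_global)  # (G,)
    n_heads = w_heads.shape[1] // (64 * 64)
    return (g @ w_heads).reshape(n_heads, 64, 64)

C, T, G, H = 8, 4, 16, 2
tokens = rng.normal(size=(64, C))
logits = smolgen(tokens,
                 rng.normal(size=(C, T)),
                 rng.normal(size=(64 * T, G)),
                 rng.normal(size=(G, H * 64 * 64)))
```

The efficiency comes from the bottleneck: the global vector has G entries, so the expensive final expansion is O(G * H * 64 * 64) regardless of the embedding width C.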