
Is the AlphaZero implementation reliable? #1309

Description

@Egiob

Hey @sotetsuk, I'd like to run a series of experiments with AlphaZero, using the implementation provided here as a starting point.

However, I'm a little bit worried about the note in the README:

> [!NOTE]
> This implementation of AlphaZero demonstrates sufficient learning performance in environments including 9x9 Go, but it has some slight differences in learning details compared to the original AlphaZero and Gumbel AlphaZero. An implementation that addresses these differences and focuses on enhanced efficiency is currently under development and is expected to be released shortly.

Is this still true, or has the "corrected" version already been released? Which discrepancies do you have in mind? I would like to make the baseline as reliable and strong as possible. Do you have any recommendations for that?

Thanks a lot 😺
