Skip to content

Porting certainty propagation to new codebase. Notes and Outline. #13

@Videodr0me

Description

@Videodr0me

Certainty propagation

Certainty propagation is known in literature as MCTS-Solver (Winands et. al). If we reach terminal (which are "certain") nodes, we can propagate the "certain" results from these nodes efficiently up the search-tree. For example if one side can play a move thats leads to a terminal win node, we can make the parent node a "certain" node scored as a loss from the opponents view. Further, nodes whose children are all certain, become certain themselves with the -max q among the certain childs. With this even MCTS can solve some shallow mates, and steer exploring the tree more efficiently and avoid branches that are "proven" to no longer need exploration. See also https://github.com/Videodr0me/leela-chess-experimental/wiki/MCTS-Solver---Certainty-Propagation-and-Autoextending

Features of full Implementation (as in my experimental build):

  • One ply look-ahead. Checking for terminals is relatively cheap compared to NN evals. Checking when creating the skeleton-childs (pre edge-node separation) detects terminals (and make the skeleton-child a full terminal node) one ply earlier, making leela see some mating combos earlier. Though not impossible with the new codebase, this can be skipped initially.
  • Two-Fold draw scoring. I folded this into certainty propagation, but this could easily be a separate option. I reused the certain state of a node to indicate a two-fold, and as soon as the node becomes root (tree-reuse), I uncertain it and recalculate its stats (more about this later). Not strictly needed, can be skipped initially.
  • Root-selection rule. In the lastest version of this, I went as far as to not search certain childs of root at all (so if a root-child became certain it would stop getting visits). This requires some changes, to (1) PUCT to choose non-certain childs at root, to (2) GetBestChild, so that the most visited is chosen unless there is a certain move with a higher q and to (3) the best move update along the same lines as GetBestChild. This also ins not necessary as a first step, even though I would propose to at least (independent of visits) choose a winning certain child of root and avoid certain loosing childs at root in GetBestChild and best move update. PUCT can remain unchanged (and if changed should NOT avoid loosing moves except at root). For choosing between a terminal threefold and a two-fold, I used NN eval of parent. But again, this is maybe not for an inital version.
  • Mate Display. Purely Cosmetic can be omitted. I do it in a hacky way, in that if best move is a certain win (or certain loss), I just take the length of x = PV/2 as the length of mate (this is approximate, as we only know its winning, but do not know what is the shortest path). Uci Output in this case is modified from cp score to mate x (or -x). Not really needed, and also if the certain state of a node is overloaded with TB results, it might not be a mate but just the path to a TB win.
  • Autoextending single legal moves. Plays well with certainty propagation (especially for forced mates). But for such a little change, was awkward to implement. I opted for further extending if picknodeextend returned a move with only one single move, and when backuping that score copying it from child to parent (also for chains of single legal moves, even though extremely rare in chess). Even though it helps with forced mate combos, it can be skipped initially.

Note on the basic implementation (what are the essentials):

-IsCertain(). Nodes get a new attribute "certain". All terminal nodes are certain, but not all certain nodes are terminal. MakeCertain(v) makes a node certain. MakeTerminal(result) makes a node terminal and certain. Most checks in the main loop are changed from IsTerminal() to IsCertain(), This changes nothing in regard to terminals (all terminals are also certain). The terminal state can be "overloaded" with this, but i opted for a separate bool for clarity and so i can turn the feature of more easily via uci options (for self-play testing). Later this became useful as for example two-fold draw-scoring needs to differentiate between terminal draws and two-fold draws. I could imagine something similar with TB and a "swindle" mode when playing against non TB opponents.

-The main propagation is done in the (new codebase) DoBackupUpdate(). I store the certain scores in both Q and V, and opted to fetch them via V. But fetching them via Q would work as well. I use hacky float == x.x compares, which works because i set them exactly as such, but there are nicer ways (for example use game:result or something). For efficiency I only do the certainty check if the update originated from a certain node (which can also be a terminal) -> v_is_certain = true. Root is never made certain (actually it is made uncertain before search, if it was certain due to tree-reuse). I "overloaded" the code to at root also aid in the best move determination (thats what the n==root_node_ condition is for) and for tie breaking (two-folds, certain wins vs. terminal wins.). This is a simplified version without the overloading, its actually really simple:

  // MCTS Solver or Proof-Number-Search
  // kCertaintyProp > 0
     if (kCertaintyProp && v_is_certain && !n->IsCertain())) {
	bool children_all_certain = true;
        float best_certain_v = -3.0f;
        Node *best_certain_node = nullptr;
	for (Node* iter : n->Children()) {
	  if (!iter->IsCertain()) { children_all_certain = false; }
	    else if ((iter->GetV()) >= best_certain_v) {
	      best_certain_node = iter;
	      best_certain_v = iter->GetV();
	      }
	 }
	 if (((children_all_certain) || (best_certain_v == 1.0f)) && (n != root_node_)) {
	     v = -best_certain_v;
	     n->MakeCertain(v);
	    ++madecertain_;
         }

This is basically it, if all childs are certain we make the node certain with the -max of the certain childs. Or if we find at least one certain winning child, then the node is a certain loss for the opponent. This gets backpropagated in the tree.

The full version of this with the following best move determination and the tie break rules are here https://github.com/Videodr0me/leela-chess-experimental/blob/master/src/mcts/search.cc#L339 I currently do best move determination in two stages (mainly because of the not searching certain childs of root). First the normal update (which is preliminary) and then at root (one ply higher) I resuse the info about the best certain move from the linked full version. This again can be done simpler initially.

The changes to GetBestMove are here https://github.com/Videodr0me/leela-chess-experimental/blob/d2088837fff01779dbf8fc606af6e92c98c1a449/src/mcts/search.cc#L505 These compute normal best move and best certain move, and then decide between them. This can greatly be simplified if we initially continue searching certain childs at root (basically prioritizing everything over certain losses and always take certain wins.). Also the tiebreak rules can safely be ignored (not really sure they yield any meaningful elo anyhow for example in prefering a certain loss over a terminal loss, in hope the opponent did not see the mate).

The changes only affect node.h, node.cc, search.h and search.cc. Only conditionals with if (kCertaintyProb) are relevant. And of those only those that are general "kCertaintyProb" or "kCertaintyProp==1". The rest is irrelevant. I hope this helps for a start. The changes in ExtendNode (old codebase) are only relevant for the one-ply lookahead and two-fold draw scoring (and the tree balancing).

Metadata

Metadata

Assignees

No one assigned

    Labels

    documentationImprovements or additions to documentation

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions