OneZoom · lentinj · Jun 8, 2026 · Jun 5, 2026 · Jun 2, 2026 · Jun 2, 2026
diff --git a/README.markdown b/README.markdown
@@ -85,4 +85,44 @@ Note that download_and_filter_wikidata and download_and_filter_pageviews take se
 
 4. Commit `dvc.lock` to git.
 
+
+## Uploading tree to server
+
+1. If you are running the tree building scripts on a different computer to the one running the web server, you will need to push the `completetree_XXXXXX.js`, `completetree_XXXXXX.js.gz`, `cut_position_map_XXXXXX.js`, `cut_position_map_XXXXXX.js.gz`, `dates_XXXXXX.js`, `dates_XXXXXX.js.gz` files onto your server, e.g. by pushing to your GitHub repo and then pulling the latest GitHub changes onto the server.
+
+2. (15 mins) Load the CSV tables into the DB. Use the script generated in `data/output_files/import_XXXXXX.sql` to truncate and repopulate the `ordered_leaves` and `ordered_nodes` tables.
+
+   ```
+   echo "SET GLOBAL local_infile=ON;" | mysql -p OneZoom_dev
+   mysql --local-infile --host localhost --user onezoom --password --database OneZoom_dev < data/output_files/import_XXXXXX.sql
+   ```
+
+3. Check for duplicate OTT ids in `ordered_leaves`, and for sponsored (reserved) leaves that are no longer on the tree, using something like the following SQL commands:
+
+   ```
+   select * from reservations left outer join ordered_leaves on reservations.OTT_ID = ordered_leaves.ott where ordered_leaves.ott is null and reservations.verified_name IS NOT NULL;
+   select group_concat(id), group_concat(parent), group_concat(name), count(ott) from ordered_leaves group by ott having(count(ott) > 1)
+   ```
+
+### Fill in additional server fields
+
+1. (15 mins) Create example pictures for each node by percolating up. This requires the most recent `images_by_ott` table, so either do this on the main server, or (if you are doing it locally) update your `images_by_ott` to the most recent server version.
+
+   ```
+   ${OZ_DIR}/OZprivate/ServerScripts/Utilities/picProcess.py -v
+   ```
+
+1. (5 mins) Percolate the IUCN data up using:
+
+   ```
+   ${OZ_DIR}/OZprivate/ServerScripts/Utilities/IUCNquery.py -v
+   ```
+
+   (note that this both updates the IUCN data in the DB and percolates up interior node info)
+
+1. (10 mins) If this is a site with sponsorship (only the main OZ site), set the pricing structure using SET_PRICES.html (accessible from the management pages).
+1. (5 mins) Make sure the indexes are reset (this does seem to be necessary for `ordered_nodes` and `ordered_leaves`). See `OZprivate/ServerScripts/SQL/create_db_indexes.sql` for the SQL to do this; you may need to log in to the SQL server (e.g. via Sequel Pro on Mac) and paste in all the drop index and create index commands.
+
+
+
 For detailed step-by-step documentation, see [oz_tree_build/README.markdown](oz_tree_build/README.markdown).
diff --git a/data/OZTreeBuild/AllLife/BespokeTree/include_noAutoOTT/Deepfin2.phy b/data/OZTreeBuild/AllLife/BespokeTree/include_noAutoOTT/Deepfin2.phy
diff --git a/dvc.lock b/dvc.lock
@@ -94,8 +94,8 @@ stages:
     deps:
     - path: data/OZTreeBuild/AllLife/BespokeTree/include_noAutoOTT/
       hash: md5
-      md5: 8cb57266b725e9893505618bf366af54.dir
-      size: 1231351
+      md5: c3c1ebf2453c636e3ffdfcef58722d9c.dir
+      size: 1231291
       nfiles: 56
     params:
       params.yaml:
@@ -104,8 +104,8 @@ stages:
     outs:
     - path: data/OZTreeBuild/AllLife/BespokeTree/include_OT_v16.1/
       hash: md5
-      md5: cfe57e6fbd3572028ac2d83203a96fe4.dir
-      size: 1534894
+      md5: c12514e92740949250ecdb4375d6c360.dir
+      size: 1534814
       nfiles: 55
   get_open_trees_from_one_zoom:
     cmd:
@@ -115,8 +115,8 @@ stages:
     deps:
     - path: data/OZTreeBuild/AllLife/BespokeTree/include_OT_v16.1/
       hash: md5
-      md5: cfe57e6fbd3572028ac2d83203a96fe4.dir
-      size: 1534894
+      md5: c12514e92740949250ecdb4375d6c360.dir
+      size: 1534814
       nfiles: 55
     - path: data/OZTreeBuild/AllLife/OpenTreeParts/OT_required/
       hash: md5
@@ -180,8 +180,8 @@ stages:
     deps:
     - path: data/OZTreeBuild/AllLife/BespokeTree/include_OT_v16.1/
       hash: md5
-      md5: cfe57e6fbd3572028ac2d83203a96fe4.dir
-      size: 1534894
+      md5: c12514e92740949250ecdb4375d6c360.dir
+      size: 1534814
       nfiles: 55
     - path: data/OZTreeBuild/AllLife/OpenTreeParts/OpenTree_all/
       hash: md5
@@ -254,8 +254,9 @@ stages:
       nfiles: 7
   make_js_treefiles:
     cmd:
-    - mkdir -p data/js_output
-    - make_js_treefiles --outdir data/js_output
+    - rm -r data/js_output ; mkdir -p data/js_output
+    - make_js_treefiles --outdir data/js_output 
+      data/output_files/ordered_tree_*.poly
     deps:
     - path: data/output_files/
       hash: md5

diff --git a/dvc.yaml b/dvc.yaml
@@ -193,8 +193,11 @@ stages:
 
   make_js_treefiles:
     cmd:
-      - mkdir -p data/js_output
-      - make_js_treefiles --outdir data/js_output
+      - rm -r data/js_output ; mkdir -p data/js_output
+      - >-
+        make_js_treefiles
+        --outdir data/js_output
+        data/output_files/ordered_tree_*.poly
     deps:
       - data/output_files/
     always_changed: true
diff --git a/oz_tree_build/README.markdown b/oz_tree_build/README.markdown
@@ -45,41 +45,6 @@ Then see the section titled "Upload data to the server and check it" below.
 
 Edit `params.yaml` to change the OpenTree version, taxonomy version, build version, etc. DVC will detect the parameter changes and re-run only the affected stages.
 
-### Upload data to the server and check it
-
-8. If you are running the tree building scripts on a different computer to the one running the web server, you will need to push the `completetree_XXXXXX.js`, `completetree_XXXXXX.js.gz`, `cut_position_map_XXXXXX.js`, `cut_position_map_XXXXXX.js.gz`, `dates_XXXXXX.js`, `dates_XXXXXX.js.gz` files onto your server, e.g. by pushing to your local Github repo then pulling the latest github changes to the server.
-1. (15 mins) load the CSV tables into the DB, using the SQL commands printed in step 6 (at the end of the `data/output_files/ordered_output.log` file: the lines that start something like `TRUNCATE TABLE ordered_leaves; LOAD DATA LOCAL INFILE ...;` `TRUNCATE TABLE ordered_nodes; LOAD DATA LOCAL INFILE ...;`). Either do so via a GUI utility, or copy the `.csv.mySQL` files to a local directory on the machine running your SQL server (e.g. using `scp -C` for compression) and run your `LOAD DATA LOCAL INFILE` commands on the mysql command line (this may require you to start the command line utility using `mysql --local-infile`, e.g.:
-
-   ```
-   mysql --local-infile --host db.MYSERVER.net --user onezoom --password --database onezoom_dev
-   ```
-
-1. Check for dups, and if any sponsors are no longer on the tree, using something like the following SQL command:
-
-   ```
-   select * from reservations left outer join ordered_leaves on reservations.OTT_ID = ordered_leaves.ott where ordered_leaves.ott is null and reservations.verified_name IS NOT NULL;
-   select group_concat(id), group_concat(parent), group_concat(name), count(ott) from ordered_leaves group by ott having(count(ott) > 1)
-   ```
-
-### Fill in additional server fields
-
-11. (15 mins) create example pictures for each node by percolating up. This requires the most recent `images_by_ott` table, so either do this on the main server, or (if you are doing it locally) update your `images_by_ott` to the most recent server version.
-
-    ```
-    ${OZ_DIR}/OZprivate/ServerScripts/Utilities/picProcess.py -v
-    ```
-
-1. (5 mins) percolate the IUCN data up using
-
-   ```
-   ${OZ_DIR}/OZprivate/ServerScripts/Utilities/IUCNquery.py -v
-   ```
-
-   (note that this both updates the IUCN data in the DB and percolates up interior node info)
-
-1. (10 mins) If this is a site with sponsorship (only the main OZ site), set the pricing structure using SET_PRICES.html (accessible from the management pages).
-1. (5 mins - this does seem to be necessary for ordered nodes & ordered leaves). Make sure indexes are reset. Look at `OZprivate/ServerScripts/SQL/create_db_indexes.sql` for the SQL to do this - this may involve logging in to the SQL server (e.g. via Sequel Pro on Mac) and pasting all the drop index and create index commands.
-
 ### At last
 
 15. Have a well deserved cup of tea
diff --git a/oz_tree_build/taxon_mapping_and_popularity/CSV_base_table_creator.py b/oz_tree_build/taxon_mapping_and_popularity/CSV_base_table_creator.py
@@ -86,6 +86,7 @@
 from dendropy import Node, Tree
 
 from ..images_and_vernaculars.get_wiki_images import get_qid_from_taxa_data
+from ..utilities.debug_util import parse_args_and_add_logging_switch
 from ..utilities.file_utils import open_file_based_on_extension
 from . import OTT_popularity_mapping
 
@@ -597,7 +598,6 @@ def output_simplified_tree(tree, taxonomy_file, outdir, version, seed, save_sql=
         set_node_ages,
         set_real_parent_nodes,
         write_brief_newick,
-        write_preorder_ages,
         write_preorder_to_csv,
     )
 
@@ -606,7 +606,6 @@ def output_simplified_tree(tree, taxonomy_file, outdir, version, seed, save_sql=
     Tree.prune_non_species = prune_non_species
     Tree.set_node_ages = set_node_ages
     Tree.set_real_parent_nodes = set_real_parent_nodes
-    Tree.write_preorder_ages = write_preorder_ages
     Tree.remove_unifurcations_keeping_higher_taxa = remove_unifurcations_keeping_higher_taxa
     Tree.write_preorder_to_csv = write_preorder_to_csv
     Tree.group_genera_in_polytomies = group_genera_in_polytomies
@@ -664,8 +663,6 @@ def output_simplified_tree(tree, taxonomy_file, outdir, version, seed, save_sql=
         tree.seed_node.write_brief_newick(condensed_newick)
     with open(os.path.join(outdir, f"ordered_tree_{version}.poly"), "w+") as condensed_poly:
         tree.seed_node.write_brief_newick(condensed_poly, "{}")
-    with open(os.path.join(outdir, f"ordered_dates_{version}.js"), "w+") as json_dates:
-        tree.write_preorder_ages(json_dates, format="json")
 
     # these are the extra columns output to the leaf csv file
     leaf_extras = OrderedDict()
@@ -720,19 +717,22 @@ def output_simplified_tree(tree, taxonomy_file, outdir, version, seed, save_sql=
         from shutil import copyfile
         from subprocess import call
 
-        # make CSV files that can be imported into mySQL (subs \\N for null values)
-        logging.info(" > saving extra file copies in mySQL format: import them using:")
-        for tab in ["_leaves", "_nodes"]:
-            fn = os.path.join(outdir, "ordered" + tab + f"_{version}" + ".csv")
-            sqlfile = fn + ".mySQL"
-            copyfile(fn, sqlfile)
-            call(["perl", "-pi", "-e", r"s/,(?=(,|\n))/,\\N/g", sqlfile])
-            logging.info(
-                f"sql> TRUNCATE TABLE ordered{tab}; "
-                f"LOAD DATA LOCAL INFILE '{sqlfile}' REPLACE INTO TABLE `ordered{tab}` "
-                f"FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
-                f"IGNORE 1 LINES ({open(fn).readline().rstrip()}) SET id = NULL;"
-            )
+        with open(os.path.join(outdir, f"import_{version}.sql"), "w", encoding="utf-8") as sql_f:
+            # make CSV files that can be imported into mySQL (subs \\N for null values)
+            logging.info(f" > saving extra file copies in mySQL format: import them using {sql_f.name}")
+            for tab in ["_leaves", "_nodes"]:
+                fn = os.path.join(outdir, "ordered" + tab + f"_{version}" + ".csv")
+                sqlfile = fn + ".mySQL"
+                copyfile(fn, sqlfile)
+                call(["perl", "-pi", "-e", r"s/,(?=(,|\n))/,\\N/g", sqlfile])
+                sql_f.writelines(
+                    [
+                        f"TRUNCATE TABLE ordered{tab};\n"
+                        f"LOAD DATA LOCAL INFILE '{sqlfile}' REPLACE INTO TABLE `ordered{tab}` \n"
+                        f"    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' \n"
+                        f"    IGNORE 1 LINES ({open(fn).readline().rstrip()}) SET id = NULL;\n"
+                    ]
+                )
 
 
 def display_WD_ott_stats(OTT_ptrs):
@@ -940,12 +940,6 @@ def switch_otts_to_qids(taxa_data_file, tree):
 def process_all(args):
     random_seed_addition = 42
     start = time.time()
-    if args.verbosity == 0:
-        logging.basicConfig(stream=sys.stderr, level=logging.WARNING)
-    elif args.verbosity == 1:
-        logging.basicConfig(stream=sys.stderr, level=logging.INFO, format="%(message)s")
-    elif args.verbosity >= 2:
-        logging.basicConfig(stream=sys.stderr, level=logging.DEBUG)
     logging.info(f"OneZoom data generation started on {time.asctime(time.localtime(time.time()))}")
     skip_popularity = (
         args.popularity_file is None
@@ -1138,15 +1132,8 @@ def main():
         type=str,
         help="JSON file with persisted data about taxa, typically used for the extinct tree",
     )
-    parser.add_argument(
-        "--verbosity",
-        "-v",
-        action="count",
-        default=0,
-        help="verbosity: output extra non-essential info",
-    )
 
-    args = parser.parse_args()
+    args = parse_args_and_add_logging_switch(parser)
     process_all(args)
 
 

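Reviewer sketch (not part of the patch): the change above stops logging the import commands and writes them to `import_{version}.sql` instead. The standalone Python below illustrates the two pieces involved, under stated assumptions: `null_empty_fields` is a hypothetical pure-Python stand-in for the `perl -pi -e` substitution, and `import_statements` is a hypothetical helper that builds the same text the patch writes for one table. Neither name exists in the repository.

```python
import re


def null_empty_fields(line: str) -> str:
    """Hypothetical equivalent of the perl one-liner s/,(?=(,|\\n))/,\\\\N/g.

    Any comma that is directly followed by another comma or by a newline
    gets a MySQL NULL marker (backslash + N) appended after it.
    """
    # A function replacement avoids any escaping surprises in the template.
    return re.sub(r",(?=(,|\n))", lambda _m: ",\\N", line)


def import_statements(table: str, sqlfile: str, header: str) -> str:
    """Hypothetical helper: build the TRUNCATE + LOAD DATA text for one table.

    `header` is the first line of the CSV (the column list), as in the patch.
    """
    return (
        f"TRUNCATE TABLE {table};\n"
        f"LOAD DATA LOCAL INFILE '{sqlfile}' REPLACE INTO TABLE `{table}` \n"
        f"    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' \n"
        f"    IGNORE 1 LINES ({header}) SET id = NULL;\n"
    )


if __name__ == "__main__":
    print(repr(null_empty_fields("a,,b,\n")))
    print(import_statements("ordered_leaves", "ordered_leaves_X.csv.mySQL", "id,ott"))
```

Note that, like the perl expression it mirrors, the substitution only fills fields that follow a comma, so an empty first column would be left untouched.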
diff --git a/oz_tree_build/taxon_mapping_and_popularity/dendropy_extras.py b/oz_tree_build/taxon_mapping_and_popularity/dendropy_extras.py
@@ -240,70 +240,6 @@ def remove_unifurcations_keeping_higher_taxa(self):
     return n_deleted
 
 
-def write_preorder_ages(self, node_dates_fh, leaf_dates_fh=None, format="tsv"):  # noqa A002
-    """
-    Write the dates to one or two files. If no second file is given, only write leaves if
-    the format is 'json'. The main file is for nodes: any absent dates should be treated
-    as unknown. The leaves file should be tiny: most leaves should not have a date, and
-    be treated as extant (0 Ma), unless they have an extinction_date set.
-
-    Format can equal 'json', 'csv', or 'tsv'
-    """
-    if format == "json":
-        start = "{"
-        end = ["}"]
-        sep = '":'
-        join = ['"', '"']
-
-    if format == "tsv":
-        sep = "\t"
-        end = [""]
-        start = ""
-        join = ["", ""]
-
-    if format == "csv":
-        sep = ","
-        end = [""]
-        start = ""
-        join = ["", ""]
-
-    leaf_num = 0
-    node_num = 0
-    if leaf_dates_fh or format == "json":
-        if leaf_dates_fh is None:
-            leaf_dates_fh = node_dates_fh
-            leaf_dates_fh.write('var tree_date = {"leaves":')
-            join = ['"', '"']
-            end = ['},"nodes":', "}}"]
-
-        leaf_dates_fh.write(start)
-        for leaf in self.leaf_node_iter():
-            # for compactness, we should probably write this in binary, as a series of
-            # (4-byte int, float); for the moment write it as text format, to be gzipped
-            leaf_num += 1
-            if (getattr(leaf, "age", None) is not None) and (leaf.age > 0):
-                leaf_dates_fh.write(join[0] + str(leaf_num) + sep + str(leaf.age))
-                if format == "json":
-                    # after first value, start putting initial commas (avoids trailing comma)
-                    join[0] = ',"'
-                else:
-                    join[0] = "\n"
-        leaf_dates_fh.write(end[0])
-        leaf_dates_fh.flush()
-
-    node_dates_fh.write(start)
-    for node in self.preorder_internal_node_iter():
-        node_num += 1
-        if getattr(node, "age", None) is not None:
-            node_dates_fh.write(join[-1] + str(node_num) + sep + str(node.age))
-            if format == "json":
-                join[-1] = ',"'
-            else:
-                join[-1] = "\n"
-    node_dates_fh.write(end[-1])
-    node_dates_fh.flush()
-
-
 def write_preorder_to_csv(
     self,
     leaf_file,

diff --git a/oz_tree_build/tree_build/ott_mapping/add_ott_numbers_to_trees.py b/oz_tree_build/tree_build/ott_mapping/add_ott_numbers_to_trees.py
@@ -22,7 +22,9 @@
 """  # noqa E501
 
 import argparse
+import collections
 import json
+import logging
 import os
 import re
 import sys
@@ -31,6 +33,8 @@
 
 from dendropy import Tree
 
+logger = logging.getLogger(__name__)
+
 unambiguous = 0
 synonyms = 0
 unidentified = 0
@@ -318,6 +322,9 @@ def lookup_OTT(name_node_dict, context):
 
             if len(remainder):
                 names = [(n.label).replace("_", " ") for n in remainder]
+                duplicates = [item for item, count in collections.Counter(names).items() if count > 1]
+                if len(duplicates) > 0:
+                    logger.error(f"File {f} has multiple nodes labelled: {duplicates}")
                 lookup_OTT(dict(zip(names, remainder)), context_name)
 
             if args.savein:
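Reviewer sketch (not part of the patch): the new duplicate check matters because `dict(zip(names, remainder))` keeps only one node per label, so a repeated label would otherwise be dropped silently before `lookup_OTT` runs. The snippet below reproduces that behaviour in isolation; `duplicate_labels` and the placeholder node strings are hypothetical names used only for illustration.

```python
import collections


def duplicate_labels(labels):
    """Return every label that occurs more than once, in first-seen order."""
    return [item for item, count in collections.Counter(labels).items() if count > 1]


names = ["Homo sapiens", "Pan troglodytes", "Homo sapiens"]
nodes = ["node_a", "node_b", "node_c"]  # stand-ins for dendropy Node objects

# Building the lookup dict collapses repeated labels: the last node wins.
lookup = dict(zip(names, nodes))

if __name__ == "__main__":
    print(duplicate_labels(names))
    print(lookup)
```

With three input nodes, the resulting dict has only two entries, which is why reporting the duplicates first is useful.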