diff --git a/.gitignore b/.gitignore index 503e3e8135fb..5b42c14c1ba2 100644 --- a/.gitignore +++ b/.gitignore @@ -52,7 +52,7 @@ ARM/ ARM64/ Debug/ Generated[!!-~]Files/ -Release/ +Release/** __pycache__/ _site/ arm/ @@ -63,7 +63,7 @@ doc/ lib/ !tools/cldr/lib out/ -release/ +release/** target/ !docs/processes/release/ tmp/ @@ -81,6 +81,7 @@ pkgdataMakefile rules.mk .DS_Store .flattened-pom.xml +dependency-reduced-pom.xml !icu4c/source/samples/csdet/Makefile diff --git a/.mvn/maven.config b/.mvn/maven.config index 76581d2d64d7..0c7730aea05e 100644 --- a/.mvn/maven.config +++ b/.mvn/maven.config @@ -8,3 +8,4 @@ # Do not display transfer progress when downloading or uploading --no-transfer-progress +-Dfile.encoding=UTF-8 diff --git a/docs/processes/cldr-icu.md b/docs/processes/cldr-icu.md index b3f3b58d9089..56a152ca5abc 100644 --- a/docs/processes/cldr-icu.md +++ b/docs/processes/cldr-icu.md @@ -47,12 +47,14 @@ for a given version is downloading the zipped sources for the common (`core.zip` and tools (`tools.zip`) directory subtrees from the Data column in [CLDR Releases/Downloads](https://cldr.unicode.org/index/downloads) -Besides a standard JDK 11+, the process also requires [ant](https://ant.apache.org) and -[maven](https://maven.apache.org) plus the xml-apis.jar from the -[Apache xalan package](https://xalan.apache.org/xalan-j/downloads.html) _(Is this -latter requirement still true?)_. +Besides a standard JDK 11+, the process also requires [Ant](https://ant.apache.org), +[Maven](https://maven.apache.org), and [Python](https://www.python.org). -If you do CLDR development you can configure maven as documented at +WARNING: the Ant scripts will soon be REMOVED. +PLEASE execute all the steps using the new Python workflow. +REPORT any problems you encounter, and switch back to Ant only if you have no other choice. 
+ +If you do CLDR development you can configure Maven as documented at [CLDR Maven setup](http://cldr.unicode.org/development/maven) (non-Eclipse version). But for the CLDR to ICU data conversion, or for regular ICU development this is not needed. @@ -106,12 +108,12 @@ ticket and a separate PR: There are several environment variables that need to be defined. -1. Java-, ant-, and maven-related variables +1. Java-, Ant- (TO REMOVE), Maven-, and Python-related variables * `JAVA_HOME`: Path to JDK (a directory, containing e.g. `bin/java`, `bin/javac`, etc.); on many systems this can be set using the output of `/usr/libexec/java_home`. - * `ANT_OPTS`: You may want to set `-Xmx8192m` to give Java more memory; otherwise + * `ANT_OPTS`: (TO REMOVE) You may want to set `-Xmx8192m` to give Java more memory; otherwise it may run out of heap. * `MAVEN_ARGS`: You may want to set `--no-transfer-progress` to reduce the noise @@ -145,9 +147,10 @@ There are several environment variables that need to be defined. ## 1 Environment variables -1a. Java, ant, and maven variables, adjust for your system +1a. Java, Ant (TO REMOVE), Maven, and Python variables, adjust for your system ```sh export JAVA_HOME=/usr/libexec/java_home +# TO REMOVE export ANT_OPTS="-Xmx8192m" export MAVEN_ARGS="--no-transfer-progress" ``` @@ -172,13 +175,19 @@ export ICU4J_ROOT=$ICU_DIR/icu4j export TOOLS_ROOT=$ICU_DIR/tools ``` -1d. Directory for logs/notes (create if does not exist) +1d. Python variables +```sh +export PYTHONPATH=$ICU_DIR/tools/py +export PYTHONDONTWRITEBYTECODE=1 +``` + +1e. Directory for logs/notes (create if does not exist) ```sh export NOTES=...(some directory)... mkdir -p $NOTES ``` -1e. The name of the icu data directory for Java (for example `icudt74b`) +1f. The name of the icu data directory for Java (for example `icudt74b`) ```sh export ICU_DATA_VER=icudt(version)b ``` @@ -248,6 +257,22 @@ mvn clean install -pl :cldr-all,:cldr-code -DskipTests -DskipITs 5a. 
Generate the CLDR production data. +**// NEW PROCESS, Python. Please use this!** + +This process uses Python with ICU4C's `data/build.py` + +* Running `python build.py --cleanprod` is necessary to clean out the production data directory + (usually `$CLDR_TMP_DIR/production`), required if any CLDR data has changed. + +```sh +cd $ICU4C_DIR/source/data +python build.py --proddata +``` + +**// NEW PROCESS - END** + +**// TO REMOVE - Don't execute if the above step works.** + This process uses ant with ICU4C's `data/build.xml` * Running `ant cleanprod` is necessary to clean out the production data directory @@ -261,6 +286,7 @@ ant cleanprod ant setup ant proddata 2>&1 | tee $NOTES/cldr-newData-proddataLog.txt ``` +**// TO REMOVE - END** > Note, for CLDR development, at this point tests are sometimes run on the production data, see @@ -299,8 +325,13 @@ java -jar target/cldr-to-icu-1.0-SNAPSHOT-jar-with-dependencies.jar --cldrDataDi 5c. Update the CLDR testData files needed by ICU4C/J tests, ensuring they are representative of the newest CLDR data. + ```sh cd $ICU_DIR/tools/cldr +# NEW PROCESS, Python. Please use this! +python build.py --copy-cldr-testdata + +# TO REMOVE. Don't execute if the above step works. ant copy-cldr-testdata ``` @@ -453,7 +484,7 @@ cd $ICU4J_ROOT ## 13 Rebuild ICU4J with new data, run tests -13a. Run the tests using the maven build +13a. Run the tests using the Maven build ```sh cd $ICU4J_ROOT mvn clean @@ -488,7 +519,7 @@ Running a specific test is the same as above: mvn install --pl :core -DICU.exhaustive=10 -Dtest=ExhaustiveNumberTest ``` -## 14 Investigate and fix maven check test failures +## 14 Investigate and fix Maven check test failures Fix test cases and repeat from step 13, or fix CLDR data and repeat from step 4, as appropriate, until there are no more failures in ICU4C or ICU4J. 
diff --git a/icu4c/source/data/build.py b/icu4c/source/data/build.py new file mode 100755 index 000000000000..3303119924d3 --- /dev/null +++ b/icu4c/source/data/build.py @@ -0,0 +1,155 @@ +#!/usr/bin/env python3 -B +# +# Copyright (C) 2026 and later: Unicode, Inc. and others. +# License & terms of use: http://www.unicode.org/copyright.html +"""Generates data in cldr-staging/production from the cldr main repo""" + +import argparse +import os +import sys +import datetime +import subprocess + +try: + from libs import icudirs + from libs import icufs + from libs import iculog + from libs import icuproc +except (ModuleNotFoundError, ImportError) as e: + print("Make sure you define PYTHONPATH pointing to the ICU modules:") + print(" export PYTHONPATH=/tools/py") + print("On Windows:") + print(" set PYTHONPATH=\\tools\\py") + sys.exit(1) + + +basedir = "." +cldr_tmp_dir = None +cldr_prod_dir = None +cldrtools_jar = None +cldr_tmp_dir = None +notes_dir: str = "./notes" + + +def _init(): + """Initialization. Check folders existence, cldr-code.jar exists, etc.""" + iculog.subtitle("init()") + iculog.info(str(datetime.datetime.now())) + + cldr_dir = icudirs.cldr_dir() + + cldrtools_dir = os.path.join(cldr_dir, "tools") + iculog.info(f"cldr_dir:{cldr_dir}") + iculog.info(f"cldrtools_dir:{cldrtools_dir}") + if not os.path.isdir(cldrtools_dir): + iculog.failure( + "Please make sure that the CLDR tools directory" + " is checked out into CLDR_DIR" + ) + + dir_to_check = f"{cldrtools_dir}/cldr-code/target/classes" + if not os.path.isdir(dir_to_check): + iculog.failure(f"Can't find {dir_to_check}. Please build cldr-code.jar.") + + global cldrtools_jar + cldrtools_jar = f"{cldrtools_dir}/cldr-code/target/cldr-code.jar" + if not os.path.isfile(cldrtools_jar): + iculog.failure( + f"CLDR classes not found in {cldrtools_dir}/cldr-code/target/classes." + " Please build cldr-code.jar." 
+ ) + + global cldr_tmp_dir + cldr_tmp_dir = icudirs.cldr_prod_dir() + global cldr_prod_dir + cldr_prod_dir = f"{cldr_tmp_dir}/production/" + + global notes_dir + notes_dir = os.environ.get("NOTES", "./notes") + + subprocess.run("mvn -version", encoding="utf-8", shell=True, check=True) + iculog.info(f"cldr tools dir: {cldrtools_dir}") + iculog.info(f"cldr tools jar: {cldrtools_jar}") + iculog.info(f"CLDR_TMP_DIR: {cldr_tmp_dir} ") + iculog.info(f"cldr.prod_dir (production data): {cldr_prod_dir}") + iculog.info(f"notes_dir: {notes_dir}") + + +def cleanprod(): + """Remove the data in cldr-staging/production""" + iculog.title("cleanprod()") + icufs.rmdir(f"{cldr_prod_dir}/common") + icufs.rmdir(f"{cldr_prod_dir}/keyboards") + + +def restoreprod(): + """Restore the git version of data in cldr-staging/production""" + iculog.title("restoreprod()") + if not cldr_prod_dir: + iculog.failure("cldr_prod_dir not configured") + return + old_dir = icufs.pushd(cldr_prod_dir) + icufs.rmdir("common") + icuproc.run_with_logging( + "git checkout -- common", + logfile=os.path.join(notes_dir, "cldr-newData-restorecommonLog.txt"), + ) + icufs.rmdir("keyboards") + icuproc.run_with_logging( + "git checkout -- keyboards", + logfile=os.path.join(notes_dir, "cldr-newData-restorekeyboardsLog.txt"), + ) + icufs.popd(old_dir) + + +def proddata(): + """Generates data in cldr-staging/production""" + cleanprod() + iculog.title("proddata()") + iculog.info(f"Rebuilding {cldr_prod_dir} - takes a while!") + # setup prod data + icuproc.run_with_logging( + "java" + f" -cp {cldrtools_jar}" + " org.unicode.cldr.tool.GenerateProductionData" + " -v", + logfile=os.path.join(notes_dir, "cldr-newData-proddataLog.txt"), + ) + + +def main(): + parser = argparse.ArgumentParser() + parser.add_argument( + "-c", "--cleanprod", help="remove all build targets", action="store_true" + ) + parser.add_argument( + "-p", + "--proddata", + help="Rebuilds files in cldr-staging/production", + action="store_true", + ) + 
parser.add_argument( + "-r", + "--restore", + help="Restore (from git) the files removed by cleanprod", + action="store_true", + ) + cmd = parser.parse_args() + + if cmd.cleanprod: + _init() + cleanprod() + elif cmd.proddata: + _init() + proddata() + elif cmd.restore: + _init() + restoreprod() + else: + parser.print_help() + + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/tools/cldr/README.md b/tools/cldr/README.md index bc579c7656a4..047844b1a85d 100644 --- a/tools/cldr/README.md +++ b/tools/cldr/README.md @@ -5,10 +5,21 @@ ## CLDR test data +The Python [build.py](build.py) file takes care of copying some CLDR +test data directories to both the ICU4C and ICU4J source trees. To add +more directories, extend the `cldr_test_data` list in that file. + +ANT-TO-REMOVE-START + +WARNING: Ant support WILL BE REMOVED. +Only use this (and report) if the step above fails. + The ant [build.xml](build.xml) file takes care of copying some CLDR test data directories to both the ICU4C and ICU4J source trees. To add more directories to the list, modify the `cldrTestData` fileset. + +ANT-TO-REMOVE-END + ## cldr-to-icu The cldr-to-icu directory contains tools to convert from CLDR's XML diff --git a/tools/cldr/build.py b/tools/cldr/build.py new file mode 100755 index 000000000000..abef8efba2b3 --- /dev/null +++ b/tools/cldr/build.py @@ -0,0 +1,158 @@ +#!/usr/bin/env python3 -B +# +# Copyright (C) 2026 and later: Unicode, Inc. and others. +# License & terms of use: http://www.unicode.org/copyright.html + +"""This build file is intended to become the single mechanism for working with + CLDR code and data when building ICU data. + +Eventually it will encompass: +* Building ICU data from CLDR data via cldr-to-icu. +* Building the CLDR libraries needed to support ICU data conversion. +* Copying CLDR test data for ICU regression tests. 
+ +It's not complete yet, so for now follow the instructions in: + /docs/processes/cldr-icu.md +which is best viewed as + https://unicode-org.github.io/icu/processes/cldr-icu.html +""" + +import argparse +import os +import sys + +try: + from libs import icufs + from libs import iculog + from libs import icuproc +except (ModuleNotFoundError, ImportError) as e: + print("Make sure you define PYTHONPATH pointing to the ICU modules:") + print(" export PYTHONPATH=/tools/py") + print("On Windows:") + print(" set PYTHONPATH=\\tools\\py") + sys.exit(1) + +cldr_dir = os.getenv("CLDR_DIR", "") +icu_dir = os.getenv("ICU_DIR", "") +test_data_dir_4c = "" +test_data_dir_4j = "" + + +def _init_args(): + """Initialize any properties not already set on the command line.""" + # Inherit properties from environment variable unless specified. As usual + # with Ant, this is messier than it should be. All we are saying here is: + # "Use the property if explicitly set, otherwise use the environment variable" + # We cannot just set the property to the environment variable, since expansion + # fails for non existent properties, and you are left with a literal value of + # "${env.CLDR_DIR}". + global test_data_dir_4c + global test_data_dir_4j + if not icu_dir: + iculog.failure( + "Set the ICU_DIR environment variable to the top level" + " ICU source directory (containing 'icu4c' and 'icu4j')." + ) + if not cldr_dir: + iculog.failure( + "Set the CLDR_DIR environment variable to the top level" + " CLDR source directory (containing 'common')." 
+ ) + test_data_dir_4c = os.path.join(icu_dir, "icu4c/source/test/testdata/cldr") + test_data_dir_4j = os.path.join( + icu_dir, "icu4j/main/core/src/test/resources/com/ibm/icu/dev/data/cldr" + ) + + +def _create_catalog(test_data_dir: str, contents: list[str]): + catalog_file_name = os.path.join(test_data_dir, "personNameTest/catalog.txt") + icufs.copyfile( + os.path.join(test_data_dir, "personNameTest/_header.txt"), + catalog_file_name, + ) + with open(catalog_file_name, "a", encoding="utf-8") as f: + for line in contents: + f.write(line) + f.write("\n") + + +def copy_cldr_testdata(): + """Copies CLDR test data directories, after deleting previous + contents to prevent inconsistent state.""" + _init_args() + clean_cldr_testdata() + src_dir_base = os.path.join(cldr_dir, "common/testData") + # CLDR test data directories to be copied into ICU. + # Add directories here to control which test data is installed. + cldr_test_data = [ + "localeIdentifiers", + "personNameTest", # Used in ExhaustivePersonNameTest + "units", # Used in UnitsTest tests + ] + for test_dir in cldr_test_data: + src_dir = os.path.join(src_dir_base, test_dir) + iculog.subtitle(f"Copying CLDR test data from {src_dir}") + icufs.copycleandir(src_dir, os.path.join(test_data_dir_4c, test_dir)) + icufs.copycleandir(src_dir, os.path.join(test_data_dir_4j, test_dir)) + + iculog.subtitle("Creating catalog.txt file") + # collect the file names in the cldr/personNameTest directory + contents = os.listdir(os.path.join(src_dir_base, "personNameTest")) + contents.sort() + contents = list(filter(lambda x: not x.startswith("_"), contents)) + _create_catalog(test_data_dir_4c, contents) + _create_catalog(test_data_dir_4j, contents) + + +def clean_cldr_testdata(): + """Deletes CLDR test data""" + _init_args() + iculog.title("Removing test dirs") + icufs.rmdir(test_data_dir_4c) + icufs.rmdir(test_data_dir_4j) + + +def reset_cldr_testdata(): + """Restores CLDR test data""" + _init_args() + iculog.title("Git-restore test 
dirs") + icuproc.run_with_logging(f"git checkout -- {test_data_dir_4c}") + icuproc.run_with_logging(f"git checkout -- {test_data_dir_4j}") + + +def main() -> int: + parser = argparse.ArgumentParser() + parser.add_argument( + "-cp", + "--copy-cldr-testdata", + help="Copies CLDR test data directories, after deleting" + " previous contents to prevent inconsistent state.", + action="store_true", + ) + parser.add_argument( + "-rm", + "--remove-cldr-testdata", + help="Deletes CLDR test data", + action="store_true", + ) + parser.add_argument( + "-reset", + "--reset-cldr-testdata", + help="Restores the CLDR test data from git", + action="store_true", + ) + cmd = parser.parse_args() + + if cmd.copy_cldr_testdata: + copy_cldr_testdata() + elif cmd.remove_cldr_testdata: + clean_cldr_testdata() + elif cmd.reset_cldr_testdata: + reset_cldr_testdata() + else: + parser.print_help() + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/tools/currency/build.py b/tools/currency/build.py new file mode 100755 index 000000000000..03d939d26dc8 --- /dev/null +++ b/tools/currency/build.py @@ -0,0 +1,203 @@ +#!/usr/bin/env python3 -B +# +# Copyright (C) 2026 and later: Unicode, Inc. and others. +# License & terms of use: http://www.unicode.org/copyright.html +"""A tool used for maintaining ICU's ISO 4217 currency code mapping data. + +ICU uses a resource generated by this tool for mapping +ISO 4217 currency alpha codes to numeric codes. +""" +import argparse +import os +import sys +import urllib +import urllib.request + +try: + from libs import icufs + from libs import iculog + from libs import icuproc +except (ModuleNotFoundError, ImportError) as e: + print("Make sure you define PYTHONPATH pointing to the ICU modules:") + print(" export PYTHONPATH=/tools/py") + print("On Windows:") + print(" set PYTHONPATH=\\tools\\py") + sys.exit(1) + + +basedir = "." 
+out_dir = f"{basedir}/out" +src_dir = f"{basedir}/src" +classes_dir = f"{out_dir}/bin" +res_dir = f"{out_dir}/res" +xml_dir = f"{out_dir}/xml" + +base_url = "https://www.six-group.com/dam/download/financial-information/data-center/iso-currrency/lists" +current_xml = "list-one.xml" +historic_xml = "list-three.xml" + + +def build(): + """Verify ICU's local data and generate ISO 4217 alpha-numeric code + mapping data resource""" + iculog.subtitle("build()") + check() + resource() + + +def classes(): + """Build the Java tool""" + iculog.subtitle("classes()") + icufs.makecleandir(classes_dir) + icuproc.run_with_logging( + "javac" + f" -d {classes_dir}" + " --release 11" + " -encoding UTF-8" + f" {src_dir}/com/ibm/icu/dev/tool/currency/*.java", + logfile="-", + ) + + +def _check_local_xml() -> bool: + iculog.info("_check_local_xml()") + return os.path.exists(f"{basedir}/{current_xml}") and os.path.exists( + f"{basedir}/{historic_xml}" + ) + + +def _local_xml() -> bool: + iculog.info("_local_xml()") + if _check_local_xml(): + iculog.subtitle("Using local ISO 4217 XML data files") + icufs.copyfile(current_xml, xml_dir) + icufs.copyfile(historic_xml, xml_dir) + return True + return False + + +def _download_xml() -> None: + """Downloads the xml files""" + iculog.info("_download_xml()") + if _check_local_xml(): + return + iculog.info("Downloading ISO 4217 XML data files") + icufs.mkdir(xml_dir) + + iculog.info( + "urllib.request.urlretrieve(" + f' "{base_url}/{current_xml}",' + f' "{xml_dir}/{current_xml}")' + ) + + opener = urllib.request.build_opener() + opener.addheaders = [("Accept", "application/xml")] + urllib.request.install_opener(opener) + urllib.request.urlretrieve( + f"{base_url}/{current_xml}", f"{xml_dir}/{current_xml}" + ) + urllib.request.urlretrieve( + f"{base_url}/{historic_xml}", f"{xml_dir}/{historic_xml}" + ) + + +def xml_data(): + """Prepare necessary ISO 4217 XML data files""" + iculog.subtitle("xml_data()") + if not _local_xml(): + _download_xml() + + 
+def check(): + """Verify if ICU's local mapping data is synchronized with the XML data""" + iculog.subtitle("check()") + classes() + xml_data() + icuproc.run_with_logging( + "java" + f" -cp {classes_dir}" + " com.ibm.icu.dev.tool.currency.Main" + " check" + f" {xml_dir}/{current_xml}" + f" {xml_dir}/{historic_xml}", + logfile="-", + ) + + +def resource(): + """Build ISO 4217 alpha-numeric code mapping data resource""" + iculog.subtitle("resource()") + classes() + icufs.mkdir(res_dir) + icuproc.run_with_logging( + "java" + f" -cp {classes_dir}" + " com.ibm.icu.dev.tool.currency.Main" + f" build {res_dir}", + logfile="-", + ) + iculog.info( + "ISO 4217 numeric code mapping data was successfully created" + f" in {res_dir}" + ) + + +def clean(): + """Delete build outputs""" + iculog.subtitle("clean()") + icufs.rmdir(out_dir) + icufs.rmdir("target") + + +def main(): + parser = argparse.ArgumentParser() + parser.add_argument( + "--build", + help="Verify ICU's local data and generate ISO 4217 alpha-numeric code" + " mapping data resource", + action="store_true", + ) + parser.add_argument( + "--check", + help="Verify if ICU's local mapping data is synchronized with" + " the XML data", + action="store_true", + ) + parser.add_argument( + "--classes", help="Build the Java tool", action="store_true" + ) + parser.add_argument( + "--clean", help="Delete build outputs", action="store_true" + ) + parser.add_argument( + "--resource", + help="Build ISO 4217 alpha-numeric code mapping data resource", + action="store_true", + ) + parser.add_argument( + "--xmlData", + help="Prepare necessary ISO 4217 XML data files", + action="store_true", + ) + cmd = parser.parse_args() + + if cmd.build: + build() + elif cmd.check: + check() + elif cmd.classes: + classes() + elif cmd.clean: + clean() + elif cmd.resource: + resource() + elif cmd.xmlData: + xml_data() + else: + parser.print_help() + + return 0 + + +if __name__ == "__main__": + sys.exit(main()) diff --git a/tools/currency/readme.txt 
b/tools/currency/readme.txt index cebdb64ae0e1..8b245cf86cfa 100644 --- a/tools/currency/readme.txt +++ b/tools/currency/readme.txt @@ -58,44 +58,42 @@ in ICU resource bundle source format - currencyNumericCodes.txt. For each ICU release, we should check if the mapping data is up to date. -Prerequisites: Java 6+, ant +Prerequisites: Java 11+, Ant (TO REMOVE), Python -First, run the ant target "check". This ant target download XML files from +First, run the "check" target. This target will download XML files from the SIX Interbank Clearing site and invoke the tool command "check". +```sh +./build.py --check +``` + When the target successfully finished, you should see the log like below: ---------------------------------------------------------------------------- -C:\devtools\trunk\currency>ant -Buildfile: C:\devtools\trunk\currency\build.xml - -classes: - -_checkLocalXml: - -_localXml: - -_downloadXml: - [echo] Downloading ISO 4217 XML data files - [get] Getting: http://www.currency-iso.org/dam/downloads/lists/list_one.xm -l - [get] To: C:\devtools\trunk\currency\out\xml\list_one.xml - [get] Getting: http://www.currency-iso.org/dam/downloads/lists/list_three. 
-xml - [get] To: C:\devtools\trunk\currency\out\xml\list_three.xml -xmlData: +``` +[INFO] ──────────────────────────────[ check() ]─────────────────────────────── +[INFO] ─────────────────────────────[ classes() ]────────────────────────────── +[INFO] [rmdir] ./out/bin +[INFO] [mkdir] ./out/bin +[INFO] [execute] javac -d ./out/bin --release 11 -encoding UTF-8 ./src/com/ibm/icu/dev/tool/currency/*.java + logfile: - +[INFO] [mkdir] ./target/pylogs +[INFO] [rmfile] ./target/pylogs/- +[INFO] ─────────────────────────────[ xml_data() ]───────────────────────────── +[INFO] _local_xml() +[INFO] _check_local_xml() +[INFO] _download_xml() +[INFO] _check_local_xml() +[INFO] Downloading ISO 4217 XML data files +[INFO] [mkdir] ./out/xml +[INFO] urllib.request.urlretrieve( "https://www.six-group.com/dam/download/financial-information/data-center/iso-currrency/lists/list-one.xml", "./out/xml/list-one.xml") +[INFO] [execute] java -cp ./out/bin com.ibm.icu.dev.tool.currency.Main check ./out/xml/list-one.xml ./out/xml/list-three.xml + logfile: - +[OK] ICU data is synchronized with the reference data +[INFO] [mkdir] ./target/pylogs +[INFO] [rmfile] ./target/pylogs/- +``` -check: - [java] [OK] ICU data is synchronized with the reference data - -resource: - [echo] ISO 4217 numeric code mapping data was successfully created in C:\de -vtools\trunk\currency/out/res - -build: - -BUILD SUCCESSFUL -Total time: 1 second ---------------------------------------------------------------------------- In this case, our data is synchronized with the latest XML data and you're done. @@ -103,8 +101,8 @@ In this case, our data is synchronized with the latest XML data and you're done. 
If the data is out of sync, you should see message like below: ---------------------------------------------------------------------------- check: - [java] Missing alpha code in ICU map [ZWR] - [java] Codes not found in the reference data: ZZZ + Missing alpha code in ICU map [ZWR] + Codes not found in the reference data: ZZZ BUILD FAILED C:\devtools\trunk\currency\build.xml:54: Java returned: 1 @@ -113,11 +111,18 @@ In this case, you have to update the hardcoded data in NumericCodeData. You can either edit the table in NumericCodeData manually, or run the tool command "print" and copy the output and paste it to the table. -Once you make sure "ant check" returns no errors, run "ant resource". This +Once you make sure "check" returns no errors, run the "resource" target. This target generate out/res/currencyNumericCodes.txt. The file should go to /source/data/misc directory. -Note: The default ant target does both operation. Although it creates the +```sh +./build.py --resource +``` + +Note: The default "build" target does both operations. Although it creates the ICU resource file, you do not need to replace the one in ICU4C package with the newly generated one if "check" successfully finished. +```sh +./build.py --build +``` diff --git a/tools/py/libs/__init__.py b/tools/py/libs/__init__.py index 35b97bb231ad..a6090f7d1430 100644 --- a/tools/py/libs/__init__.py +++ b/tools/py/libs/__init__.py @@ -6,7 +6,7 @@ from . import iculog HOME_DIR: str = os.path.expanduser('~') -ICU_DIR: str = os.getenv('ICU_ROOT') +ICU_DIR: str = os.getenv('ICU_ROOT', '.') # Initialize logging iculog.init_logging() diff --git a/tools/py/libs/icudirs.py b/tools/py/libs/icudirs.py new file mode 100644 index 000000000000..3d89a3ac8abc --- /dev/null +++ b/tools/py/libs/icudirs.py @@ -0,0 +1,142 @@ +# Copyright (C) 2026 and later: Unicode, Inc. and others. 
+# License & terms of use: http://www.unicode.org/copyright.html + +"""Get the paths for various directories, checking for some files / folders.""" + +import os +import sys + +from libs import iculog + + +def _get_save_dir( + key: str, + defalt_value: str | None = None, + dirs_to_check: tuple[str, ...] = (), + files_to_check: tuple[str, ...] = (), +) -> str | None: + """Get the path for a directory, checking for some files / folders.""" + result: str | None = os.environ.get(key) + # Variable not set, or set to empty string. + if not result: + if not defalt_value: + iculog.error(f'Environment variable {key} is not set.') + return None + iculog.info( + f'Environment variable {key} is not set.\n' + + f' Using the default value: "{defalt_value}".' + ) + result = defalt_value + # Options: either complain if it is not an absolute path, + # or make it absolute and set the environment. + # result = os.path.abspath(result) + # os.environ[key] = result + result = os.path.normpath(result) + # Does not point to a directory. + if not os.path.isdir(result): + iculog.error( + f'Environment variable {key} is set to {result}.\n' + + ' But it does not point to a directory.' + ) + return None + # Does not contain the expected subdirectories. + for chk_dir in dirs_to_check: + if not os.path.isdir(os.path.join(result, chk_dir)): + iculog.error( + f'Environment variable {key} is set to {result}.\n' + + ' But it does not seem to point to a valid directory.\n' + + f' Missing subdirectory: {chk_dir}/' + ) + return None + # Does not contain the expected files. 
+ for chk_file in files_to_check: + if not os.path.isfile(os.path.join(result, chk_file)): + iculog.error( + f'Environment variable {key} is set to {result}.\n' + + ' But it does not seem to point to a valid directory.\n' + + f' Missing file: {chk_file}' + ) + return None + return result + + +def icu_dir(defalt_value: str | None = None) -> str: + """Get the path for the 'icu' repository directory.""" + if not defalt_value: + # If a default value is not provided, determine it based on the location of + # this file, assuming this file is located at /tools/py/libs/ + defalt_value = os.path.abspath(os.path.dirname(__file__)).rsplit(os.sep, 3)[ + 0 + ] + iculog.info( + 'Determined defalt_value based on the location of the python module.\n' + + f' defaultvalue = "{defalt_value}".' + ) + result = _get_save_dir( + 'ICU_DIR', + defalt_value, + ('icu4c', 'icu4j', 'tools'), + ('pom.xml', 'README.md', 'CONTRIBUTING.md'), + ) + if not result: + iculog.failure( + 'Please set the ICU_DIR environment variable to the top level ' + "'icu' source dir (containing 'icu4c' and 'icu4j')." + ) + sys.exit(1) + return result + + +def cldr_dir(defalt_value: str | None = None) -> str: + """Get the path for the 'cldr' repository directory.""" + result = _get_save_dir( + 'CLDR_DIR', + defalt_value, + ('common', 'keyboards', 'specs', 'tools'), + ('pom.xml', 'README.md', 'CONTRIBUTING.md'), + ) + if not result: + iculog.failure( + 'Please set the CLDR_DIR environment variable to the top level' + " 'cldr' source dir (containing 'common')." 
+ ) + sys.exit(1) + return result + + +def cldr_prod_dir(defalt_value: str | None = None) -> str: + """Get the path for the 'cldr-staging' repository directory.""" + result = _get_save_dir( + 'CLDR_TMP_DIR', + defalt_value, + ('births', 'docs/charts', 'production'), + ('pom.xml', 'README-common.md', 'README-keyboards.md'), + ) + if not result: + iculog.failure( + 'Please set the CLDR_TMP_DIR environment variable to the top level' + " 'cldr-staging' source dir (containing 'production')." + ) + sys.exit(1) + return result + + +def report_dirs(): + """List the values of the known directories with the current environment.""" + iculog.debug(f'__file__(): {__file__}') + try: + iculog.info(f'icu_dir(): {icu_dir()}') + except SystemExit as _: + pass + try: + iculog.info(f'cldr_dir(): {cldr_dir()}') + except SystemExit as _: + pass + try: + iculog.info(f'cldr_prod_dir(): {cldr_prod_dir()}') + except SystemExit as _: + pass + + +if __name__ == '__main__': + report_dirs() diff --git a/tools/py/libs/icuproc.py b/tools/py/libs/icuproc.py index 4eddaa964ced..ac8915e4d193 100644 --- a/tools/py/libs/icuproc.py +++ b/tools/py/libs/icuproc.py @@ -13,7 +13,7 @@ def run_with_logging( command: str, logfile: str = 'last_run.log', - root_dir: str = os.getenv('ICU_ROOT'), + root_dir: str = os.getenv('ICU_ROOT', '.'), ok_result: list[int]|None = None, ) -> subprocess.CompletedProcess[str]: """The method that executes the step proper. @@ -27,8 +27,6 @@ def run_with_logging( Returns: the result of the subprocess execution. """ - if root_dir is None: - root_dir = '.' 
if ok_result is None: ok_result = [0] iculog.info(f'[execute] {command}\n logfile: {logfile}') @@ -38,12 +36,19 @@ def run_with_logging( should_check = True else: should_check = False + + if logfile == '-': + stdout_redirect = None # regular output + stderr_redirect = None # regular output + else: + stdout_redirect = subprocess.PIPE # Capture stdout + stderr_redirect = subprocess.STDOUT # Merge stderr into stdout try: result: subprocess.CompletedProcess[str] = subprocess.run( command, encoding='utf-8', - stdout=subprocess.PIPE, # Capture stdout - stderr=subprocess.STDOUT, # Merge stderr into stdout + stdout=stdout_redirect, + stderr=stderr_redirect, shell=True, check=should_check ) @@ -53,9 +58,12 @@ def run_with_logging( exit(ex.returncode) if logfile and root_dir: - abs_logdir = os.path.join(root_dir, 'target', 'pylogs') - icufs.mkdir(abs_logdir) - abs_logfile = os.path.join(abs_logdir, logfile) + if os.path.isabs(logfile): + abs_logfile = logfile + else: + abs_logdir = os.path.join(root_dir, 'target', 'pylogs') + icufs.mkdir(abs_logdir) + abs_logfile = os.path.join(abs_logdir, logfile) icufs.rmfile(abs_logfile) with open(abs_logfile, 'w', encoding='utf-8') as f: f.write('==================\n') diff --git a/tools/py/libs/py.typed b/tools/py/libs/py.typed new file mode 100644 index 000000000000..22486ac477b4 --- /dev/null +++ b/tools/py/libs/py.typed @@ -0,0 +1,2 @@ +# Copyright (C) 2026 and later: Unicode, Inc. and others. +# License & terms of use: http://www.unicode.org/copyright.html diff --git a/tools/py/libs/test_icudirs.py b/tools/py/libs/test_icudirs.py new file mode 100644 index 000000000000..9dcb2da3078a --- /dev/null +++ b/tools/py/libs/test_icudirs.py @@ -0,0 +1,116 @@ +# Copyright (C) 2026 and later: Unicode, Inc. and others. 
+# License & terms of use: http://www.unicode.org/copyright.html + +"""Tests for the directories system module.""" + +import os +import tempfile +import unittest + +from libs import icudirs + + +class TestIcuDirs(unittest.TestCase): + """Test class for the directories system module.""" + + root_dir: tempfile.TemporaryDirectory[str] + + def test_icu_dir_default(self): + """Test icu directory, basic""" + icu_dir = icudirs.icu_dir() + self.assertIsNotNone(icu_dir) + self.assertNotEqual(icu_dir, '') + self.assertTrue(os.path.isdir(os.path.join(icu_dir, 'icu4j'))) + + expected = os.path.join(self.root_dir.name, 'icu') + os.environ['ICU_DIR'] = expected + icu_dir = icudirs.icu_dir() + self.assertEqual(icu_dir, expected) + + def test_cldr_dir(self): + """Test cldr directory, including many expected failures""" + # Env variable undefined + os.environ['CLDR_DIR'] = '' + with self.assertRaises(SystemExit) as _: + icudirs.cldr_dir() + + # Point to something that does not exist + os.environ['CLDR_DIR'] = 'something_that_does_not_exist' + with self.assertRaises(SystemExit) as _: + icudirs.cldr_dir() + + # Point CLDR_DIR to a file, should fail + fake_dir = os.path.join(self.root_dir.name, 'cldr/pom.xml') + os.environ['CLDR_DIR'] = fake_dir + with self.assertRaises(SystemExit) as _: + icudirs.cldr_dir() + + # Point CLDR_DIR to the icu_dir, should fail as environment variable + # is defined, but the folder structure does not match the expected one. 
+ fake_dir = os.path.join(self.root_dir.name, 'icu') + os.environ['CLDR_DIR'] = fake_dir + with self.assertRaises(SystemExit) as _: + icudirs.cldr_dir() + + # Point to a correct path, should succeed + fake_dir = os.path.join(self.root_dir.name, 'cldr') + os.environ['CLDR_DIR'] = fake_dir + cldr_dir = icudirs.cldr_dir() + self.assertEqual(cldr_dir, fake_dir) + + def test_cldr_prod_dir(self): + """Test cldr-staging directory, for completeness""" + fake_dir = os.path.join(self.root_dir.name, 'cldr-staging') + os.environ['CLDR_TMP_DIR'] = fake_dir + cldr_prod_dir = icudirs.cldr_prod_dir() + self.assertEqual(cldr_prod_dir, fake_dir) + + @classmethod + def _create_fake_folder_struct( + cls, + folder_name: str, + dirs_to_check: list[str] | None = None, + files_to_check: list[str] | None = None, + ): + """Prepare a mock directory structure for icu, cldr, and cldr-staging""" + fake_dir = os.path.join(cls.root_dir.name, folder_name) + if dirs_to_check: + for fake_dir_name in dirs_to_check: + os.makedirs(os.path.join(fake_dir, fake_dir_name), exist_ok=True) + if files_to_check: + for fake_file_name in files_to_check: + fake_file = os.path.join(fake_dir, fake_file_name) + with open(fake_file, 'w', encoding='utf-8') as f: + f.write('Fake, for testing') # Expected file + + @classmethod + def setUpClass(cls): + super().setUpClass() + cls.root_dir = tempfile.TemporaryDirectory(prefix='icudir_test_') + # Create a fake ICU folder + cls._create_fake_folder_struct( + 'icu', + ['icu4c', 'icu4j', 'tools'], + ['pom.xml', 'README.md', 'CONTRIBUTING.md'], + ) + # Create a fake CLDR folder + cls._create_fake_folder_struct( + 'cldr', + ['common', 'keyboards', 'specs', 'tools'], + ['pom.xml', 'README.md', 'CONTRIBUTING.md'], + ) + # Create a fake CLDR-staging folder + cls._create_fake_folder_struct( + 'cldr-staging', + ['births', 'docs/charts', 'production/common/main'], + ['pom.xml', 'README-common.md', 'README-keyboards.md'], + ) + + @classmethod + def tearDownClass(cls): +
super().tearDownClass() + cls.root_dir.cleanup() + + +if __name__ == '__main__': + unittest.main() diff --git a/tools/release/java/Makefile b/tools/release/java/Makefile index 1ad3f8f9a519..575b66e33541 100644 --- a/tools/release/java/Makefile +++ b/tools/release/java/Makefile @@ -13,8 +13,8 @@ # you can put the OLD_ICU=xx and NEW_ICU=yy in separate lines in Makefile.local # -ANT=ant -ANT_TARGET=apireport +GEN_REPORT=./build.py +GEN_REPORT_TARGET=apireport DOXYGEN=doxygen -include Makefile.local @@ -84,17 +84,17 @@ clean-docs: | check-vars $(TARGET): check-vars $(OLD_ICU_BUILD)/$(XML) $(NEW_ICU_BUILD)/$(XML) echo "Remember to run the non-ascii file detector if you get errors." - $(ANT) -Dolddir="$(OLD_ICU_BUILD)/$(XML)" -Dnewdir="$(NEW_ICU_BUILD)/$(XML)" $(ANT_TARGET) + $(GEN_REPORT) --olddir="$(OLD_ICU_BUILD)/$(XML)" --newdir="$(NEW_ICU_BUILD)/$(XML)" --$(GEN_REPORT_TARGET) echo "If you get no-changes, see the readme- may need to add xalan/xerces jars." # check-vars $(OLD_ICU_BUILD)/$(XML) $(NEW_ICU_BUILD)/$(XML) APIChangeReport.xml: $(OLD_ICU_BUILD)/$(XML) $(NEW_ICU_BUILD)/$(XML) echo "Remember to run the non-ascii file detector if you get errors." - $(ANT) -Dolddir="$(OLD_ICU_BUILD)/$(XML)" -Dnewdir="$(NEW_ICU_BUILD)/$(XML)" $(ANT_TARGET)_xml + $(GEN_REPORT) --olddir="$(OLD_ICU_BUILD)/$(XML)" --newdir="$(NEW_ICU_BUILD)/$(XML)" --$(GEN_REPORT_TARGET)_xml APIChangeReport.md: $(OLD_ICU_BUILD)/$(XML) $(NEW_ICU_BUILD)/$(XML) echo "Remember to run the non-ascii file detector if you get errors." - $(ANT) -Dolddir="$(OLD_ICU_BUILD)/$(XML)" -Dnewdir="$(NEW_ICU_BUILD)/$(XML)" $(ANT_TARGET)_md + $(GEN_REPORT) --olddir="$(OLD_ICU_BUILD)/$(XML)" --newdir="$(NEW_ICU_BUILD)/$(XML)" --$(GEN_REPORT_TARGET)_md %/doc/xml: %/Doxyfile # don't care what GENERATE_XML is set to previously - set it to yes. 
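Review note on the Makefile hunk above: each recipe now expands `--$(GEN_REPORT_TARGET)` plus an optional `_xml` or `_md` suffix into a long option for `build.py`. The following is a minimal sketch of that expansion and of how `argparse` would receive it. The flag names are taken from this diff; `recipe_args` and `build_parser` are hypothetical helpers written only for illustration and are not part of the ICU tools.

```python
import argparse

GEN_REPORT_TARGET = "apireport"  # same value as in the Makefile


def recipe_args(suffix: str, olddir: str, newdir: str) -> list[str]:
    """Mimic the Makefile recipe's argument expansion (hypothetical helper)."""
    return [
        f"--olddir={olddir}",
        f"--newdir={newdir}",
        f"--{GEN_REPORT_TARGET}{suffix}",
    ]


def build_parser() -> argparse.ArgumentParser:
    """Declare only the flags the recipes pass (illustrative subset)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--olddir")
    parser.add_argument("--newdir")
    parser.add_argument("--apireport", action="store_true")
    parser.add_argument("--apireport_md", action="store_true")
    parser.add_argument("--apireport_xml", action="store_true")
    return parser


# The APIChangeReport.md recipe passes the "_md" suffix.
args = build_parser().parse_args(recipe_args("_md", "/old/doc/xml", "/new/doc/xml"))
print(args.apireport_md, args.apireport, args.olddir)
# prints: True False /old/doc/xml
```

Because `--apireport` is also a prefix of the other two flags, the sketch relies on `argparse` preferring an exact option match over prefix matching, so `--apireport` alone selects only the HTML report.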
diff --git a/tools/release/java/build.py b/tools/release/java/build.py new file mode 100755 index 000000000000..2560533f88b6 --- /dev/null +++ b/tools/release/java/build.py @@ -0,0 +1,157 @@ +#!/usr/bin/env python3 -B +# +# Copyright (C) 2026 and later: Unicode, Inc. and others. +# License & terms of use: http://www.unicode.org/copyright.html + +"""This is the build file for ICU tools.""" + +import argparse +import os +import sys +import datetime +import subprocess + +try: + from libs import iculog + from libs import icuproc +except (ModuleNotFoundError, ImportError) as e: + print("Make sure you define PYTHONPATH pointing to the ICU modules:") + print(" export PYTHONPATH=/tools/py") + print("On Windows:") + print(" set PYTHONPATH=\\tools\\py") + sys.exit(1) + + +basedir = "." +rsrc_dir = os.path.join(basedir, "src/main/resources") +apireport_jar = "target/icu4c-apireport.jar" +newdir = "" +olddir = "" + + +def _init() -> bool: + """Checks that .jar to run exists""" + iculog.info(str(datetime.datetime.now())) + subprocess.run("mvn -version", encoding="utf-8", shell=True, check=True) + iculog.info(f"tools jar={apireport_jar}") + iculog.info(f"basedir={basedir}") + return os.path.isfile(apireport_jar) + + +def tools(): + """compile release tools""" + if not _init(): + iculog.failure(f"The {apireport_jar} was not built.") + icuproc.run_with_logging("mvn package", logfile="-") + + +def clean(): + """remove all build targets""" + _init() + icuproc.run_with_logging("mvn clean", logfile="-") + + +def apireport(): + tools() + xslt_dir = f"{rsrc_dir}/com/ibm/icu/dev/tools/docs" + icuproc.run_with_logging( + "java" + f" -jar {apireport_jar}" + f" --olddir {olddir}" + f" --newdir {newdir}" + f" --cppxslt {xslt_dir}/dumpAllCppFunc.xslt" + f" --cxslt {xslt_dir}/dumpAllCFunc.xslt" + f" --reportxslt {xslt_dir}/genReport.xslt" + f" --resultfile {basedir}/APIChangeReport.html", + logfile="-", + ) + + +def apireport_md(): + tools() + xslt_dir = 
f"{rsrc_dir}/com/ibm/icu/dev/tools/docs" + icuproc.run_with_logging( + "java" + f" -jar {apireport_jar}" + f" --olddir {olddir}" + f" --newdir {newdir}" + f" --cppxslt {xslt_dir}/dumpAllCppFunc.xslt" + f" --cxslt {xslt_dir}/dumpAllCFunc.xslt" + f" --reportxslt {xslt_dir}/genReport_md.xslt" + f" --resultfile {basedir}/APIChangeReport.md", + logfile="-", + ) + + +def apireport_xml(): + tools() + xslt_dir = f"{rsrc_dir}/com/ibm/icu/dev/tools/docs" + icuproc.run_with_logging( + "java" + f" -jar {apireport_jar}" + f" --olddir {olddir}" + f" --newdir {newdir}" + f" --cppxslt {xslt_dir}/dumpAllCppFunc_xml.xslt" + f" --cxslt {xslt_dir}/dumpAllCFunc_xml.xslt" + f" --reportxslt {xslt_dir}/genreport_xml.xslt" + f" --resultfile {basedir}/APIChangeReport.xml", + logfile="-", + ) + + +def main(): + parser = argparse.ArgumentParser() + parser.add_argument( + "--clean", help="Remove all build targets", action="store_true" + ) + parser.add_argument( + "--tools", help="Compile the release tools", action="store_true" + ) + parser.add_argument( + "--olddir", help="Directory that contains xml docs of old version" + ) + parser.add_argument( + "--newdir", help="Directory that contains xml docs of new version" + ) + parser.add_argument( + "--apireport", + help="Generate the apireport in HTML format", + action="store_true", + ) + parser.add_argument( + "--apireport_md", + help="Generate the apireport in Markdown format", + action="store_true", + ) + parser.add_argument( + "--apireport_xml", + help="Generate the apireport in XML format", + action="store_true", + ) + cmd = parser.parse_args() + + if cmd.olddir: + global olddir + olddir = cmd.olddir + if cmd.newdir: + global newdir + newdir = cmd.newdir + + if cmd.clean: + clean() + elif cmd.tools: + tools() + elif cmd.apireport: + apireport() + elif cmd.apireport_md: + apireport_md() + elif cmd.apireport_xml: + apireport_xml() + else: + parser.print_help() + + return 0 + + +if __name__ == "__main__": +
sys.exit(main()) diff --git a/tools/release/java/readme.txt b/tools/release/java/readme.txt index ad91d4b7d459..e733720c998e 100644 --- a/tools/release/java/readme.txt +++ b/tools/release/java/readme.txt @@ -19,14 +19,15 @@ Requirements: To use the utility: 1. Put both old and new ICU source trees on your system -2. Run "configure" in both old and new (you can use any mixture of in-source and out-of-source builds). Doxygen must be found during the configure phase, but you do not need to build the standard API docs. +2. Run "configure" in both old and new (you can use any mixture of in-source and out-of-source builds). + Doxygen must be found during the configure phase, but you do not need to build the standard API docs. ** Then in each directory, run `make doc` to create the doc/ directory. 3. Create a Makefile.local in this readme's directory (tools/release/java/) with just these two lines, for example: - OLD_ICU=/xsrl/E/icu-6.7/icu4c/sources - NEW_ICU=/xsrl/E/icu-6.8/icu4c/sources + OLD_ICU=/xsrl/E/icu-6.7/icu4c/source + NEW_ICU=/xsrl/E/icu-6.8/icu4c/source WARNING: the paths must be absolute paths, and should not use ~ or environment variables. @@ -37,8 +38,8 @@ To use the utility: If your ICU is an out-of-source-build, add these two lines indicating the build location: - OLD_ICU_BUILD=/xsrl/E/icu-build-m48/icu4c/sources - NEW_ICU_BUILD=/xsrl/E/icu-build/icu4c/sources + OLD_ICU_BUILD=/xsrl/E/icu-build-m48/icu4c/source + NEW_ICU_BUILD=/xsrl/E/icu-build/icu4c/source 4. From this directory, (tools/release/java/) run Make to build docs: (the tool will be built automatically) make APIChangeReport.html
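Review note on the readme hunk above: the WARNING in step 3 (absolute paths only, no `~`, no environment variables) can be checked before running make. The following is a minimal sketch, assuming POSIX-style paths like the ones in the readme's examples. `is_literal_absolute_path` is a hypothetical helper written for illustration and is not part of the ICU tools.

```python
import posixpath


def is_literal_absolute_path(value: str) -> bool:
    """Return True for an absolute POSIX path that contains no ~ or $."""
    # Conservative check: any ~ or $ is rejected, even in the middle of a name,
    # because make would pass it through without shell-style expansion.
    if "~" in value or "$" in value:
        return False
    return posixpath.isabs(value)


for candidate in (
    "/xsrl/E/icu-6.8/icu4c/source",  # literal absolute path: accepted
    "~/icu/icu4c/source",            # uses ~: rejected
    "$HOME/icu/icu4c/source",        # uses an environment variable: rejected
    "icu/icu4c/source",              # relative path: rejected
):
    print(candidate, is_literal_absolute_path(candidate))
```

Rejecting every `~` and `$` is stricter than the readme requires, but it keeps the check simple and errs on the side of flagging a questionable value.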