Thank you for your interest in contributing to UNIC! There are many ways to contribute, and we appreciate all of them.
If you have questions, please ask on the Gitter room or join the IRC channel #unic.
To request for new components or new features for the existing components, or to report a bug, please file an issue on GitHub.
You find these directories in the project root:
-
/data/: source data; used to generate data tables for consumption in the code; or, used at runtime by integration tests (like conformance tests) and benches. -
/docs/: project high-level documentations, such as tutorials and guidelines. -
/etc/: scripts for project maintenance and deployment. -
/tools/: tooling for generating data tables from source data. -
/unic/: source code for theunicsuper-crate package. Also, under this directory goes all major UNIC components, each under their own directory. The components are separate by multiple aspects: source of data/algorithm (like Unicode, CLDR, IETF), abstraction level (like character-level functionalities vs. string-level functionalities), and practicality for users of the library.
Please consider running through these steps before submitting a PR for UNIC:
-
If you're adding a new component, add it to the
COMPONENTSlist in/etc/common.sh, somewhere after all its dependencies. (This list is used for packaging and publishing the components, ascargo packagedoes not support workspaces, yet. -
Tests that depend on large data, whether directly reading from
/data/directory, or using data tables generated from source data, should be added as integration tests, intests/directory under the relevant component. If it's a benchmarking test, it should go underbenches/. -
If you're adding new integration tests or benches that read directly from the
/data/directory, it should be excluded from packaging, as it won't be able to access the data when unpacked. Add each such test file to theexcludelist in the Cargo manifest file (Cargo.toml). -
For each component, referencing to paths outside of the component directory should be limited to these specific cases:
-
In
[dependencies]section of the Cargo manifest, settingpathfor other UNIC components. -
In integration tests and benches, reading from source data.
-
-
For each component, we keep the source code organized as this:
-
The library file (
src/lib.rs) contains version information of the package (likeCARGO_PKG_VERSION) and the data (likeUNICODE_VERSION), and API ispub used from other crates and local modules. -
New types go into their own modules, and kept separate based on abstraction levels in the specifications and implementation. For example, the
BidiClasstype, representing the Unicode Bidi_Class character property, goes into its ownbidi_classmodule, which also contains other types directly related to the definition, likeBidiClassCategory. -
Traits for non-UNIC types, like the Rust core
charandstrtypes to into separate modules, usually namedtrait.rs.
-
-
Format the code (Rust and Python) using the
/etc/format.sh. This allows everyone to use auto-formatting and not worry about manual code formatting. Please make use you are using the latest version ofrustfmt-nightlybefore running the script.If you have suggestions to change the Rust formatting style, please submitting a separate PR, updating
/.rustfmt.tomlin one commit, and applying the changes in another diff. -
For pre-
1.0.0development, we try to have a new release after each new Rust release. Therefore, there's no need to make new releases after each major or minor component, and as a result, no reason to bump version in PRs.
Use the FIXME comment tag to mark something that's broken. The only data after
a FIXME tag would be the detail of what's broken.
Use the TODO comment tag to describe what needs to be done in the future. You
can also mark different kinds of TODO items, as shown below.
If no metadata, leave the tag empty.
// TODO: Add more tests for this and that case.Mark GitHub issues/pulls with the GH- prefix, which is more portable/readable
that the previous #nnn format.
// TODO(GH-205): Write more on FIXME/TODO tags.For work depending or waiting on updates to Rust, mark them with:
MIN_RUST_VERSION, if the feature is already stable;FUTURE_RUST, if the feature is yet unstable (nightly-only).
// TODO(MIN_RUST_VERSION): Drop this after Rust 1.23.0.
...
// TODO(FUTURE_RUST): Replace with char::MAX_UTF*_LEN when available.
...If no other context/metadata, but a specific person is expected to follow-up, add their username.
// TODO(behnam): Write more on FIXME/TODO tags.