Skip to content

Latest commit

 

History

History
214 lines (159 loc) · 8.42 KB

File metadata and controls

214 lines (159 loc) · 8.42 KB

Contributing to Tesseract Core

Tesseract Core is an open-source project and, as such, we welcome contributions from developers, engineers, scientists, and end-users in general. Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Code of Conduct

Ensure your contributions adhere to the Code of Conduct.

Feedback

Constructive feedback is very welcome. We are interested in hearing from you!

In the case things aren't working as expected, or the documentation is lacking, please file a bug report.

In the case you want to suggest a new feature, please file a new feature request. In particular, we recommend you open an issue before contributing code in a pull request. This allows all parties to talk things over before jumping into action, and increase the likelihood of pull requests getting merged.

In case you have general questions or feedback, need support from the community, or have a cool demo to share, start a thread in our Discourse Forum. We use GitHub Issues for bug reports and feature requests only.

Documentation

Tesseract documentation is kept under the docs/ directory of the repository, written in Markdown and using Sphinx to generate the final HTMLs. Fixes and enhancements to the documentation should be submitted as pull requests, we treat the same as code contributions.

To build the documentation locally, install the documentation dependencies in addition to the project itself, then run make html:

$ . venv/bin/activate
$ pip install -e .[dev]
$ cd docs
$ make html

The resulting HTMLs are in docs/build/html/.

Contributions in the form of tutorials, examples, demos, blog posts (including those posted elsewhere already) are best highlighted and celebrated in the Discourse Forum.

Code

Tesseract is developed under the Apache 2.0 license. By contributing to the Tesseract project you agree that your code contributions are governed by this license. We require you to sign our Contributor License Agreement to state so.

Local development setup

Make sure you have Docker installed on your machine and you can run docker commands via your user. After that, clone the repository, install the dependencies, and setup pre-commit hooks:

$ git clone git@github.com:pasteurlabs/tesseract-core.git
$ cd tesseract-core
$ python -m venv venv
$ . venv/bin/activate
$ pip install -e .[dev]
$ pre-commit install

Tests

This project uses the pytest framework for all tests. New code should be covered by new or existing tests.

To run the tests simply run pytest in the root of the project. This will run the entire test suite, including the end-to-end tests that take quite a while to finish. Instead, you can run the tests separately:

$ pytest --skip-endtoend
$ pytest --always-run-endtoend tests/endtoend_tests

Testing philosophy

We follow these principles when writing tests:

  • Prefer end-to-end tests over unit tests — Tests that exercise real Tesseract builds and invocations catch more bugs than isolated unit tests. When in doubt, write an end-to-end test.

  • Avoid mocks where feasible — Mocks can hide real integration issues. If a test requires complex mocking to work, consider whether an end-to-end test would be more valuable.

  • Don't test implementation details — Tests should verify behavior, not internal structure. If refactoring breaks your test but not the actual functionality, the test was too tightly coupled.

  • Be mindful of slow tests — End-to-end tests that build Tesseracts are slow. Before adding a new one, check if an existing test can be extended, or if a faster unit test would suffice for your specific case.

Linting and code quality

We use pre-commit to run linters and formatters automatically before each commit. The hooks are configured in .pre-commit-config.yaml and include:

  • Ruff — Fast Python linter and formatter (replaces flake8, isort, black)
  • Various file checks — Trailing whitespace, YAML validation, etc.

To run all pre-commit hooks manually on all files:

$ pre-commit run --all-files

To run a specific hook:

$ pre-commit run ruff --all-files

Pre-commit also handles automatic dependency updates via Dependabot-style version bumps. These are configured to run periodically in CI.

GitHub workflow

This project uses Git for version control and follows a GitHub workflow. To contribute follow these steps:

  1. Fork the project via the GitHub UI.
  2. Clone your fork to your machine.
  3. Add an upstream remote: git remote add upstream git@github.com:pasteurlabs/tesseract-core.git.
  4. Create a new branch for your code contribution: git switch --create my_branch.
  5. Implement your changes.
  6. Commit and push to your fork: git push --set-upstream origin my_branch.
  7. Open a Pull Request with your changes.

It is a good practice to rebase often on top of main to keep your code up to date with latest development and minimize merge conflicts:

$ git fetch upstream
$ git switch main
$ git merge upstream/main
$ git switch my_branch
$ git rebase main
$ git push --force

Commit and pull request messages guidelines

We follow the Conventional Commits specification for all commits that reach the main branch. Each commit is crafted from a pull request that is squash-merged. The commit title and message comes from the pull request title and message, respectively. As such, they should be structured following the specfication.

The title consists of a type, and optional scope, and a short description: type[(scope)]: description. The types we use are:

  • chore: for changes that affect the build system, external dependencies, or general housekeeping.
  • ci: for changes in the CI.
  • doc: for documentation only changes.
  • feat: for a new feature.
  • fix: for fixing a bug.
  • perf: for a code change that improves performance.
  • refactor: for a code change that neither adds a feature nor fixes a bug.
  • security: for a change that fixes a security issue.
  • test: for adding new tests or fixing existing ones.

The scopes we use are:

  • cli: for changes that affect tesseract CLI.
  • engine: for changes that affect the CLI engine.
  • sdk: for changes that affect the Tesseract Python API.
  • example: for changes in the examples.
  • runtime: for changes in the Tesseract Runtime.
  • deps: for changes in the dependencies.

In case there are breaking changes in your code, this should be indicated in the message either by appending an exclamation mark (!) after the type/scope or by adding a BREAKING CHANGE: trailer to the message.

Versioning

The Tesseract project follows semantic versioning.

Release process

(code owners only)

Releases are done via GitHub Actions, which automatically build the release artifacts and publish them to the GitHub Releases page. To create a new release, follow these steps:

  1. Make sure the code is in a good state, all tests pass, and the documentation is up to date.
  2. Trigger a new release action through the GitHub UI. This opens a new pull request with the release notes and the version number.
  3. Add any additional release notes to the pull request message. They will automatically be included at the top of the release notes.
  4. In the meantime, you can add more commits to main (and update the release branch) which will trigger re-generation of the changelog and release notes.
  5. Once the pull request is ready, merge it into main.
  6. GitHub Actions will then automatically release the new version. Verify that the release artifacts are correctly built and published.
  7. Make an announcement in the Discourse Forum and on social media, if applicable.