feat: Add multi-stage multi-arch Containerfile #46
Open
FreddyFunk wants to merge 1 commit into NVIDIA:main from
Add a Containerfile with three build stages (python-export, cpp-build, runtime) supporting x86_64 and aarch64, along with a .containerignore and container usage documentation.

Relates to NVIDIA#32

Signed-off-by: Frederic Laing <dev@fredfunk.tech>
Collaborator
Thanks a lot for your MR! The TensorRT Edge-LLM team is committed to providing some Docker recommendations, but not a fixed Dockerfile for now, since Dockerfiles need careful maintenance and incur infra costs. The team will re-evaluate the need for this and decide whether this MR should be merged.
Collaborator
Thanks for your PR! Our team will evaluate later whether this MR's approach is good, or whether to create Dockerfiles for the environment instead.
What does this PR do?
Adds a multi-stage, multi-arch Containerfile that produces two image targets: a python-export image for model quantization and ONNX export, and a leaner runtime image with only the C++ binaries for engine building and inference. No GPU is required at build time, enabling CI and headless server builds. Also adds a documentation page covering build instructions, usage examples and available export tools.

Relates to #32. The file is named Containerfile rather than Dockerfile to reflect that it builds an OCI-compliant container image, compatible with both Podman and Docker.

Type of change:
new feature, documentation
Overview:
Usage
Testing
All three image builds were verified on an x86_64 host with an RTX 3090 and on an x86_64 host without an NVIDIA GPU:
podman build -t tensorrt-edgellm . --> builds successfully (10.4 GB)
podman build --target python-export -t tensorrt-edgellm-export . --> builds successfully (19 GB)
podman build --platform linux/arm64 -t tensorrt-edgellm-arm64 . --> cross-builds successfully via qemu-user-static (10.4 GB, arm64)

Before your PR is "Ready for review"
Additional Information
I verified the arm64 container builds via cross-compilation, but I do not have any arm64 hardware capable of connecting to an NVIDIA GPU available. I expect everything to work, but I suggest that someone with access to arm64 hardware test this before merging this pull request.
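
For anyone who wants to reproduce the builds listed under Testing, the following is a minimal sketch, not part of this PR. It assumes the Containerfile sits at the repository root and that qemu-user-static is already set up for the cross-build. The helper names (pick_engine, platform_arch, build_image, check_arch) and the CONTAINER_ENGINE and DRY_RUN variables are hypothetical. -f Containerfile is passed explicitly because Docker looks for a file named Dockerfile by default, while Podman finds Containerfile on its own.

```shell
#!/usr/bin/env bash
# Sketch: rebuild the three images from the Testing section and check the
# architecture recorded in the cross-built image. All helper names and the
# CONTAINER_ENGINE / DRY_RUN variables are illustrative assumptions.
set -euo pipefail

# Pick a container engine: honor CONTAINER_ENGINE if set, else prefer podman.
pick_engine() {
    if [ -n "${CONTAINER_ENGINE:-}" ]; then
        echo "$CONTAINER_ENGINE"
    elif command -v podman >/dev/null 2>&1; then
        echo podman
    elif command -v docker >/dev/null 2>&1; then
        echo docker
    else
        echo "no container engine found" >&2
        return 1
    fi
}

# Extract the architecture from a platform string, e.g. linux/arm64 -> arm64.
platform_arch() {
    echo "$1" | cut -d/ -f2
}

# Build one image. Arguments: engine, tag, optional stage, optional platform.
# With DRY_RUN=1 the command is printed instead of executed.
build_image() {
    local engine=$1 tag=$2 target=${3:-} platform=${4:-}
    local -a cmd=("$engine" build -f Containerfile)
    if [ -n "$target" ]; then cmd+=(--target "$target"); fi
    if [ -n "$platform" ]; then cmd+=(--platform "$platform"); fi
    cmd+=(-t "$tag" .)
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# Compare the architecture stored in the image metadata with the requested one.
check_arch() {
    local engine=$1 tag=$2 platform=$3 want got
    want=$(platform_arch "$platform")
    got=$("$engine" image inspect --format '{{.Architecture}}' "$tag")
    if [ "$got" != "$want" ]; then
        echo "$tag: expected $want, got $got" >&2
        return 1
    fi
    echo "$tag: architecture $got"
}

main() {
    local engine
    engine=$(pick_engine)
    build_image "$engine" tensorrt-edgellm
    build_image "$engine" tensorrt-edgellm-export python-export
    build_image "$engine" tensorrt-edgellm-arm64 "" linux/arm64
    # Nothing is built in a dry run, so there is no image to inspect.
    if [ "${DRY_RUN:-0}" != 1 ]; then
        check_arch "$engine" tensorrt-edgellm-arm64 linux/arm64
    fi
}

# Only build when asked to, so the helpers can be sourced on their own.
if [ "${1:-}" = "--run" ]; then
    main
fi
```

Running the script with --run performs the builds; DRY_RUN=1 with --run only prints the commands. The architecture check reads image metadata only, so it does not replace the test on real arm64 hardware with an NVIDIA GPU that is requested above.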