When creating the CMake build, make sure to add the -DCMAKE_EXPORT_COMPILE_COMMANDS=1 flag. See its documentation here.
A template project for using aligator with CMake and C++ can be found in the aligator-cmake-example-project repository.
When aligator is installed, the CMake configuration file (aligatorConfig.cmake) provides a CMake function to help users easily create a Python extension module.
Users can write an extension module in C++ for performance reasons when providing e.g. custom constraints, cost functions, dynamics, and so on.
The CMake function is called as follows:
aligator_create_python_extension(<name> [WITH_SOABI] <sources...>)
This will create a CMake MODULE target named <name> on which the user can set properties and add an install directive.
An usage example can be found in this repo.
This project builds some C++ examples and tests. Debugging them is fairly straightforward using GDB:
gdb path/to/executablewith the appropriate command line arguments. Examples will appear in the binaries of build/examples. Make sure to look at GDB's documentation.
If you want to catch std::exception instances thrown, enter the following command once in GDB:
(gdb) catch throw std::exceptionIn order for debug symbols to be loaded and important variables not being optimized out, you will want to compile in DEBUG mode.
Then, you can run the module under gdb using
gdb --args python example/file.pyIf you want to look at Eigen types such as vectors and matrices, you should look into the eigengdb plugin for GDB.
[TODO]
The SolverProxDDP solver is able to leverage multicore CPU architectures.
Before calling the solver make sure to enable parallelization as follows:
In Python:
solver.rollout_type = aligator.ROLLOUT_LINEAR
solver.linear_solver_choice = aligator.LQ_SOLVER_PARALLEL
solver.setNumThreads(num_threads)And in C++:
std::size_t num_threads = 4ul; // for example
solver.rollout_type = aligator::RolloutType::LINEAR;
solver.linear_solver_choice = aligator::LQSolverChoice::PARALLEL;
solver.setNumThreads(num_threads);Aligator uses OpenMP for parallelization which is setup using environment variables in your shell. The settings are local to your shell.
Printing OpenMP parameters at launch:
export OMP_DISPLAY_ENV=VERBOSEPrint when a thread is launched and with which affinity (CPU thread(s) on where it will try to run):
export OMP_DISPLAY_AFFINITY=TRUEOpenMP operates with places which define a CPU thread or core reserved for a thread. Places can be a CPU thread or an entire CPU core (which can have one thread, or multiple with hyperthreading).
export OMP_PLACES ="threads(n)" # Threads will run on the first nth CPU threads, with one thread per CPU thread.or
export OMP_PLACES="{0},{1},{2}" # Threads will run on CPU threads 0, 1 ,2Threads will run on the first nth CPU cores, with one thread per core, even if the core has multiple threads
export OMP_PLACES="cores(n)"For more info on places see here.
Some modern CPUs have a mix of performance (P) and efficiency (E) cores. The E-cores are often slower, hence we should have OpenMP schedule threads on P-cores only.
Get your CPU model with
lscpu | grep -i "Model Name"Get CPU core info with:
lscpu -e
# with an i7-13800H
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE MAXMHZ MINMHZ MHZ
0 0 0 0 0:0:0:0 yes 5000.0000 400.0000 400.000
1 0 0 0 0:0:0:0 yes 5000.0000 400.0000 400.000
2 0 0 1 4:4:1:0 yes 5000.0000 400.0000 400.000
3 0 0 1 4:4:1:0 yes 5000.0000 400.0000 400.000
4 0 0 2 8:8:2:0 yes 5200.0000 400.0000 400.000
5 0 0 2 8:8:2:0 yes 5200.0000 400.0000 5176.303
6 0 0 3 12:12:3:0 yes 5200.0000 400.0000 1482.743
7 0 0 3 12:12:3:0 yes 5200.0000 400.0000 400.000
8 0 0 4 16:16:4:0 yes 5000.0000 400.0000 3485.561
9 0 0 4 16:16:4:0 yes 5000.0000 400.0000 721.684
10 0 0 5 20:20:5:0 yes 5000.0000 400.0000 1641.311
11 0 0 5 20:20:5:0 yes 5000.0000 400.0000 400.000
12 0 0 6 24:24:6:0 yes 4000.0000 400.0000 400.000
13 0 0 7 25:25:6:0 yes 4000.0000 400.0000 2949.734
14 0 0 8 26:26:6:0 yes 4000.0000 400.0000 2554.695
15 0 0 9 27:27:6:0 yes 4000.0000 400.0000 3588.623
16 0 0 10 28:28:7:0 yes 4000.0000 400.0000 400.000
17 0 0 11 29:29:7:0 yes 4000.0000 400.0000 400.000
18 0 0 12 30:30:7:0 yes 4000.0000 400.0000 400.000
19 0 0 13 31:31:7:0 yes 4000.0000 400.0000 3610.068A little digging on the internet tells us that this CPU has 6 performance cores and 8 efficiency cores for a total of 20 threads. We see higher frequencies in core 0 to 5: these are the performance cores. To use only performance cores on this CPU you would set:
export OMP_PLACES="cores(6)"
# or
export OMP_PLACES="threads(12)"Important
Put your PC in performance mode (usually found in the power settings).
We use google benchmark to define C++ benchmarks which are able to aggregate data from runs, and Flame Graphs to produce a breakdown of the various function calls and their importance as a proportion of the call stack.
If you have the Rust toolchain and cargo installed, we suggest you install cargo-flamegraph. Then, you can create a flame graph with the following command:
flamegraph -o my_flamegraph.svg -- ./build/examples/example-croc-talos-armHere's Crocoddyl's flame graph:
Here's for
aligator::SolverFDDP: