Skip to content

Add section on preventing too much parallism/high memory use to the docs #437

@bouweandela

Description

@bouweandela

The default Dask settings may result in too much parallism and/or high memory use when running multiple executions in parallel on the same machine.

ESMValTool

ESMValCore works best with the Dask Distributed scheduler. I would also recommend setting max_parallel_tasks (an ESMValCore setting), to a low number, e.g. 1, 2 or 3, because only one ESMValCore preprocessing task will submit jobs to the Distributed scheduler at a time to avoid overloading the workers. Therefore, I would recommend running with the following settings for ESMValCore:
~/.config/esmvaltool/config.yaml:

max_parallel_tasks: 2
dask:
  use: local_distributed
  profiles:
    local_distributed:
      cluster:
        type: distributed.LocalCluster
        n_workers: 2
        threads_per_worker: 2
        memory_limit: 4GiB

where the total memory use will be n_workers * memory_limit (= 8GB in the example above), so you can reduce n_workers if you need it to use less memory. I would recommend using at least 4 GB or RAM per Dask Distributed worker. Some jobs might be able to run with 2GB, but probably not all of them. You can tune the total memory / CPU use by specifying the number of workers. The number of CPU cores used will be n_workers * theads_per_worker.

For more information on configuring ESMValCore, see https://docs.esmvaltool.org/projects/ESMValCore/en/latest/quickstart/configure.html

PMP and ilamb

Both PMP and ilamb also use Dask through Xarray, but we've seen crashes because of that, e.g. #394 and #385 (comment). For PMP and ilamb I would recommend using the following global Dask configuration, to avoid the Dask crashes:
~/.config/dask/config.yaml

scheduler: synchronous

or, if you want faster runs, at least control the number of workers that gets started (by default the is the number of CPU cores, which is way too much if you are running other executions on the same machine because each job will use this many):
~/.config/dask/config.yaml

scheduler: threads
num_workers: 4

Note that the ilamb diagnostics are currently restricted to the synchronous scheduler by

# Run ILAMB in a single-threaded mode to avoid issues with multithreading (#394)
with dask.config.set(scheduler="synchronous"):
, so it will not respect the global Dask settings.

For more information on configuring Dask, see:

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions