
(simple) I/O optimisation for parallel file systems (most notably Lustre) #665

@kostrzewa


One of the issues we currently have with tmLQCD is that the online measurements are written as many small text files, especially when many samples per configuration are used (one file per sample). This is poison for a parallel file system like Lustre.

At the same time, especially when running on a Lustre file system, it makes sense from a performance point of view to set something like

lfs setstripe -C 8 -S 32M .

for the working directory in order to make the reading and writing of the gauge configurations via lemon fast.

The problem is that our small files (online measurements) and large files (gauge configurations) end up in the same directory with the same striping settings.

It makes no sense to have 8 stripes of 32 MB each for files which are a few KB at most.

It would therefore be beneficial to separate the output for online measurements and gauge configurations into subdirectories:

  • workdir
    • confs
    • omeas

or something similar. This would allow the working directory to use the Lustre defaults, the confs subdirectory to use a striping suited to MPI-I/O, and the omeas subdirectory to use either the defaults or a striping setting suited to small files.
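
A minimal sketch of how such a layout could be prepared by hand before a run, assuming the directory names proposed above. The script itself and the single-stripe setting for omeas (`-c 1`) are illustrative assumptions, not something tmLQCD does or prescribes; the confs setting simply reuses the command quoted earlier. If `lfs` is not available, or the directory is not on Lustre, only the directories are created.

```shell
#!/bin/sh
# Hypothetical helper: create the proposed layout and apply per-directory striping.
# The first argument is the working directory (defaults to ./workdir).
workdir="${1:-workdir}"

mkdir -p "$workdir/confs" "$workdir/omeas"

if command -v lfs >/dev/null 2>&1; then
    # Large gauge configurations: same striping as the command quoted in this issue.
    lfs setstripe -C 8 -S 32M "$workdir/confs" \
        || echo "could not set striping on $workdir/confs (not on Lustre?)" >&2
    # Small online-measurement files: one stripe per file (assumed setting).
    lfs setstripe -c 1 "$workdir/omeas" \
        || echo "could not set striping on $workdir/omeas (not on Lustre?)" >&2
else
    echo "lfs not found: directories created without striping settings" >&2
fi
```

Files created in a directory inherit that directory's default striping, so the settings only need to be applied once, before the first configuration or measurement is written.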

Further down the line we would ideally have only a single file per gauge configuration for the measurements even if using multiple samples and/or measuring many operators. HDF5 would be a natural format for this.
