Performance Considerations

For optimal performance when writing .zarrmd trajectories, we recommend using Zarrtraj with the following settings:

Set n_frames to the number of frames being written

By default, Zarrtraj allocates space for the output file in chunks of roughly 12 MB and releases the unused portion by resizing the underlying Zarr dataset when the writer writes the last frame. Providing the n_frames kwarg lets Zarrtraj allocate only the memory that is needed. This speeds up writing and reduces memory overhead when writing small trajectories.

Set precision=3

Under the hood, this kwarg creates a numcodecs.quantize.Quantize filter that reduces the precision of floating-point data in the .zarrmd file to the specified number of decimal digits. The filter is lossy, so the discarded digits cannot be recovered on reading. Three decimal places is the default precision for XTC and should be sufficient in the majority of cases.

Use compressor=numcodecs.Blosc(cname="zstd", clevel=9)

In early prototyping, this compressor gave the best compression ratio for .zarrmd trajectory data. Further benchmarking and experimentation are needed, but combined with precision=3 this setting comes closest to XTC's compression of anything tried so far.

Example

import numcodecs
import zarrtraj  # importing zarrtraj makes the .zarrmd format available to MDAnalysis
import MDAnalysis as mda
from MDAnalysisTests.datafiles import PSF, DCD

u = mda.Universe(PSF, DCD)

with mda.Writer(
    "test.zarrmd",
    n_atoms=u.trajectory.n_atoms,
    n_frames=u.trajectory.n_frames,
    precision=3,
    compressor=numcodecs.Blosc(cname="zstd", clevel=9),
) as W:
    for ts in u.trajectory:
        W.write(u)