Releases: hpc/mpifileutils
v0.11.1
New features:
- Release tarball to package mpiFileUtils with appropriate versions of LWGRP, DTCMP, and libcircle to simplify builds. See https://mpifileutils.readthedocs.io/en/latest/build.html#cmake
- dcp, dsync: added --xattrs={none, all, non-lustre, libattr} option for better control when copying extended attributes. Lustre extended attributes are no longer copied by default. Copying of extended attributes is now independent of the --preserve option to copy owner, group, permissions, and timestamps. #503
New DAOS features:
- New daos-serialize and daos-deserialize tools for moving containers to/from HDF5 format.
- Support for paths formatted as daos://<pool>/<cont>/[path]
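As an illustration of the daos://<pool>/<cont>/[path] format above, the following Python sketch splits such a string into its three parts. It is not the parser mpiFileUtils uses; the function name parse_daos_path is hypothetical, and treating a missing path as the container root "/" is an assumption.

```python
def parse_daos_path(uri):
    """Split 'daos://<pool>/<cont>/[path]' into (pool, cont, path).

    Hypothetical helper for illustration only. A missing or empty
    path component is reported as "/" (an assumption).
    """
    prefix = "daos://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a daos:// path: {uri!r}")

    # Split into at most three pieces: pool, container, remaining path.
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"pool and container are required: {uri!r}")

    pool, cont = parts[0], parts[1]
    rest = parts[2] if len(parts) == 3 else ""
    return pool, cont, "/" + rest
```

For example, parse_daos_path("daos://pool1/cont1/a/b") returns ("pool1", "cont1", "/a/b").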
Bug Fixes:
- libmfu: incorrect item count in mfu_flist_copy progress message #497
- libmfu: segfault in strmap #501
Recommended:
- liblwgrp v1.0.4
- libdtcmp v1.1.4
- libcircle v0.3
- libarchive v3.5.1
v0.11
New features:
- libmfu: mfu_flist_archive function to create and extract tar (tape archive) files
- libmfu: updated default I/O buffer and chunk sizes from 1MB to 4MB
- libmfu: define major, minor, patch versioning on libmfu.so
- dcmp, dcp, dsync: renamed --synchronous to --direct as the option to enable O_DIRECT
- dcmp, dsync: added --chunksize and --blocksize options to size work units when slicing files
- dcp: log more errors and return 1 from main on error
- dcp, dsync, dwalk: added --dereference option for dereferencing symbolic links
- dtar: promoted from experimental to production tool
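To make "size work units when slicing files" concrete, here is a minimal Python sketch of dividing a file into fixed-size chunks. It only illustrates the general idea; the function name slice_file is hypothetical, and how libmfu actually assigns chunks to MPI ranks is not described in these notes.

```python
def slice_file(file_size, chunk_size):
    """Return (offset, length) work units that cover [0, file_size).

    Illustrative only: every unit is chunk_size bytes long except
    possibly the last one, which holds the remainder.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]
```

With the 4MB chunk size mentioned above, a 10 MiB file would yield three work units: two of 4 MiB and one of 2 MiB.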
New DAOS features:
- dcmp, dcp, dsync: added DAOS support for DAOS POSIX containers
- dcp: added DAOS support for generic DAOS containers
- see DAOS-Support.md for documentation on DAOS usage
Bug Fixes:
- libmfu: improved O_DIRECT buffer and file offset alignments when copying files
- libmfu: compile bug to define SYS_getdents on aarch64 systems
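For context on the O_DIRECT alignment fix: direct I/O generally expects file offsets and transfer lengths to be multiples of some alignment value, which varies by file system and kernel. The Python sketch below shows the arithmetic for widening a byte range to aligned boundaries. It is not libmfu's code; the function name align_range and the alignment values used in the example are assumptions.

```python
def align_range(offset, length, alignment):
    """Widen [offset, offset + length) to multiples of `alignment`.

    Returns (aligned_offset, aligned_length). Illustrative only:
    the alignment actually required depends on the file system.
    """
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    # Round the start down to the previous aligned boundary.
    start = (offset // alignment) * alignment
    # Round the end up to the next aligned boundary (ceiling division).
    end = -(-(offset + length) // alignment) * alignment
    return start, end - start
```

For example, a 100-byte request at offset 1000 with a 512-byte alignment becomes a 1024-byte transfer starting at offset 512.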
Recommended:
- liblwgrp v1.0.3
- libdtcmp v1.1.1
- libarchive v3.5.1
v0.10.1
This is primarily a bug fix release of v0.10. Among other improvements, it includes fixes to support newer Lustre, Open MPI, and GCC versions.
New features:
- dchmod: check for CAP_FOWNER and CAP_CHOWN to reduce need for --force option
Bug fixes:
- libmfu: drop dead Lustre-specific code that prevented compilation with Lustre v2.13 and newer
- libmfu: convert deprecated MPI functions to use newer MPI_Comm_create_keyval, MPI_Comm_set_attr, and MPI_Comm_get_attr
- libmfu: support empty lists in mfu_flist_sort
- libmfu: switch from "external32" to "native" MPI I/O data representation to better support Open MPI v4.0.3, and add error checking around MPI I/O calls
- libmfu: pad short path names with null when writing binary cache files
- libmfu: avoid multiple definitions of global variables to allow easier builds with the GCC v10.1 compiler
- libmfu: update file offset alignment, buffer alignment, and transfer size for improved O_DIRECT support in mfu_flist_copy
- dfilemaker: correct usage message
- dfind: support file names containing spaces in --exec {} substitutions and avoid seg fault when missing terminating ;
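The fix for padding short path names with null can be pictured with the Python sketch below, which stores a path in a fixed-width field filled with NUL bytes. The fixed-width layout, the function names, and the width used in the example are assumptions; the real cache file format is not described in these notes.

```python
def pad_path(path, width):
    """Encode `path` into exactly `width` bytes, padded with NULs.

    Hypothetical record layout for illustration. At least one
    trailing NUL is required so the field can be decoded again.
    """
    raw = path.encode("utf-8")
    if len(raw) >= width:
        raise ValueError("path does not fit in the field")
    return raw.ljust(width, b"\0")


def unpad_path(field):
    """Recover the path by cutting the field at the first NUL byte."""
    return field.split(b"\0", 1)[0].decode("utf-8")
```

Padding with a known byte keeps every record the same size and keeps leftover buffer contents out of the file.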
v0.10
New features:
- libmfu: reduced verbosity of debug messages when copying GPFS ACLs
- libmfu: tweaked lite walk progress to count directories after reading them, rather than discovering them
- libmfu: include item rate in walk progress messages
- libmfu: added mfu_progress_start/update/complete functions to periodically print progress information
- progress messages enabled in dchmod, dcmp, dcp, dreln, drm, dstripe, dsync, dwalk, see new --progress option for more details
- dchmod: enable --user and --group options to accept numeric user id and group id values in addition to names
- dchmod: added algorithm to avoid stat on walk when stat info is not needed
- dchmod: skip chown/chmod calls on items that do not need to be changed
- dchmod: added --force option to always call chown/chmod
- dchmod: added --silent option to suppress EPERM error messages
- dcp, dcp1: added --chunksize and --blocksize options to control slicing of large files during copy
- drm: added --stat, --text, and --output options to record list of files drm attempts to remove
- dsync: added --link-dest option to create hardlinks to conserve storage space and inodes during incremental backups
- dsync: added --sparse option to write sparse files
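As background for the --sparse option, the Python sketch below shows one common way to write a sparse copy: blocks that contain only zero bytes are skipped with a seek instead of being written. This is a serial illustration, not dsync's implementation; the function name copy_sparse and the 4096-byte block size are assumptions.

```python
import os


def copy_sparse(src, dst, block_size=4096):
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    Illustrative only. Whether the skipped regions become holes on
    disk depends on the file system.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(block_size)
            if not block:
                break
            if not any(block):
                # Block is all zeros: advance the position without writing.
                fout.seek(len(block), os.SEEK_CUR)
            else:
                fout.write(block)
        # Set the final size so that a trailing run of zeros is preserved.
        fout.truncate(fout.tell())
```

Reading the copy back yields the same bytes as the source, since unwritten regions read as zeros.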
Bug Fixes:
- libmfu: fix call to segmented scan leading to false positives and false negatives when detecting file content differences in dcmp and dsync --contents
- dfind: mask file mode with S_IFMT instead of file type to avoid collisions between different file types, previously -type f would also return symlinks
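The dfind fix above comes down to how file type bits are tested. The Python sketch below reconstructs the kind of check the note describes, using the standard stat module; it is not the actual dfind source, and both function names are hypothetical. The type codes for regular files and symlinks share a bit, so masking the mode with the type constant itself matches both.

```python
import stat


def is_regular_buggy(mode):
    """Mask with the file type itself: also matches symlinks."""
    return (mode & stat.S_IFREG) != 0


def is_regular(mode):
    """Extract the type field with S_IFMT, then compare for equality."""
    return stat.S_IFMT(mode) == stat.S_IFREG


symlink_mode = stat.S_IFLNK | 0o777
regular_mode = stat.S_IFREG | 0o644
```

With these modes, is_regular_buggy(symlink_mode) is true while is_regular(symlink_mode) is false, and both functions accept regular_mode.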
Requires:
- libcircle v0.3 or higher
v0.9.1
This is primarily a bug fix release for v0.9, but we also promote dbz2 to be a released tool.
New features:
- libmfu: added functions for parallel compression / decompression of dbz2 files
- dbz2: compress and decompress a large file in parallel. Thanks to Ahana Roy Choudhury for contributing the original dbz2 implementation using libcircle for parallel compression and decompression and for proposing the original dbz2 file format that facilitates parallel decompression.
- dwalk: --file_histogram option prints a default size histogram
- dwalk: --text option is now documented. Use this option with --output to create a text file.
- per common user request, tools now verbose by default, use new --quiet option to silence
- simplified informational messages to be more concise
- many tools have had their usage and output text updated
Bug fixes:
- cmake: rpaths now supported in both build and install directories
- cmake: tests added to enable Lustre APIs
- cmake: add path to find FindGPFS.cmake when using -DENABLE_GPFS=ON
- libmfu: fixed bug in cache file format that was introduced in v0.9
- dbcast: fixed to no longer create target directory if it already exists
- dfind: output file was mistakenly hardcoded to write to text format
- dwalk: --distribution option led to a hang when using incorrect syntax
v0.9
We've officially converted to CMake!
Instructions for building are here: https://mpifileutils.readthedocs.io/en/v0.9/build.html
New Features:
- dcmp: include nanoseconds when comparing timestamps
- dcmp: new --lite option to compare files based on file type, file size, and modification time rather than file content
- drm: new --aggressive option to delete files while walking
- dsync: default behavior no longer deletes files at the destination, deleting now requires new --delete option
- dsync: optionally copies in batches with --batch-files option as form of self-checkpointing long running dsync jobs
- drm: fix segfault when deleting a large number of files
- libmfu: avoid problematic MPI I/O external32 for more consistent file format
- libmfu: support for GPFS ACLs
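The metadata-only comparison described for dcmp --lite, together with nanosecond timestamps, can be sketched in Python as follows. This is a single-process illustration under stated assumptions, not dcmp's parallel implementation; the function name lite_equal is hypothetical.

```python
import os
import stat


def lite_equal(path_a, path_b):
    """Compare two items by file type, size, and modification time.

    Illustrative only: file contents are never read, and the mtime
    comparison uses nanosecond resolution (st_mtime_ns).
    """
    a = os.lstat(path_a)
    b = os.lstat(path_b)
    return (
        stat.S_IFMT(a.st_mode) == stat.S_IFMT(b.st_mode)
        and a.st_size == b.st_size
        and a.st_mtime_ns == b.st_mtime_ns
    )
```

A check like this avoids reading file data, at the cost of missing content changes that leave size and modification time untouched.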
New Tools:
- dfind: filters file list based on different criteria
- dreln: update symlinks whose targets use absolute paths, useful after dsync
v0.8.1
v0.8
New features:
- dchmod: added --owner option to change user on files
- dchmod: fixed bug when using 'a' option in symbolic notation
- dcmp: added expressions to compare permissions and ACL on files
- dcp: fix bug that prevented some subdirectories from being created
- dcp1: original dcp from LANL, may be faster than dcp for directory trees with lots of small files, a similar algorithm to be merged into dcp in future release
- dfilemaker: updated to create multiple directory levels, files of different sizes with random content, and symlinks
- dfilemaker1: dfilemaker from v0.7, to be merged into dfilemaker in future release
- dstripe: added support for Lustre 2.5
- dsync: new tool to synchronize one directory tree with another (good for backups or completing partial copies)
- libmfu: added API calls to define new list elements and set their properties
- libmfu: added function to write file list to text file
New experimental tools (work in progress):
- dbz2: compress a single file with bz2
- dfind: filter file lists with find-like tests
- dgrep: parallel grep
- dtar: parallel tar (incomplete)
Known bugs:
- dtar uses libarchive that assumes a single process is writing the tar archive. It likely will generate corrupt tar archives if run with more than one process. This will be fixed in a future release.
- The binary format for reading and writing filelists to files uses MPI I/O external32 data representation. This has proved to be buggy across MPI implementations. In a future release, mpiFileUtils will be changed to read/write these files with POSIX I/O. If a workaround is needed, one may change "external32" to "native" in src/common/mfu_flist_io.c.
- dsync --contents has a bug when computing its offset during lseek for overwriting an existing file (fixed in v0.8.1)
Updated dependency:
- Requires update of DTCMP from v1.0.3 to v1.1.0 for new DTCMP_Segmented_scanv/exscanv calls.
md5sum: 1082600e7ac4e6b2c13d91bbec40cffb
v0.7
- dbcast: now creates destination directory before broadcasting file
- dbcast: added --size option to set file segment size
- dchmod: process umask if user provides no ugoa letter in symbolic notation
- dcmp: added --sync option to synchronize source and target directory trees
- dsh: added support for wildcard filters, limit ls output to 100 items by default
- dsh: added --output and --file options to save modified flist on exit
- dstripe: now recursively processes files under a directory
- dstripe: added --report to print file striping info
- dstripe: added --count and --size options to set stripe parameters
- dstripe: added --minsize option to only process files above a certain size
- libmfu: moved copy logic from dcp to new mfu_flist_copy routine
- libmfu: write flist cache files using multiple stripes, one per MPI rank
- configure: fixed --enable-experimental builds
Known bugs:
- using an input list with dcp is broken in v0.7, but this is patched on the main branch
md5sum: c081f7f72c4521dddccdcf9e087c5a2b
v0.6
This is an intermediate release along the roadmap for v1.0.
- dchmod: added dchmod to set access permissions and group
- added node2 test suite and integrated with travis.ci
- dcmp: parallelized compare for large data files (now in addition to large directories)
- dcp: promote dcp2 to dcp, renamed dcp to dcp1
- renamed common library from libbayer to libmfu (mpiFileUtils)
- moved significant code to libmfu
- numerous bug fixes and other enhancements