Design
Adam Moody edited this page Oct 14, 2016 · 9 revisions
- long-running jobs: report progress to the user, resume where the job left off after an interruption (checkpoint/restart), and provide a common method to halt a job
- invoke standard Linux tools where possible, e.g., grep
- parallel techniques: master/worker, distributed queue, distributed task graph
- define common file formats for input / output between tools
- POSIX I/O wrappers to retry on non-fatal errors (e.g., EINTR)
- component to manipulate paths (e.g., basename, dirname, transform /a/b/../c// into /a/c)
- abstraction for file metadata (stat data) to access fields and transfer them between processes
- API to read / write file metadata structures to files
- API to filter and sort file metadata structures
- parallel directory walk
- parallel pipe from one tool to another
- list
- find
- copy
- rsync
- remove
- tar/zip
- grep
- compare
- Lustre
- Panasas
- GPFS
- NFS
- PLFS
- SCR
- ADIOS
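
One of the requirements listed above is a set of POSIX I/O wrappers that retry on non-fatal errors such as EINTR. The following is a minimal sketch of what such wrappers could look like, not the project's actual implementation. The names `retry_read` and `retry_write_all` are hypothetical, and the sketch treats only EINTR as retryable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to count bytes from fd, retrying if the
 * call is interrupted by a signal before any data is transferred (EINTR).
 * Returns the number of bytes read, 0 at end of file, or -1 with errno set. */
ssize_t retry_read(int fd, void *buf, size_t count)
{
    assert(buf != NULL);
    for (;;) {
        ssize_t n = read(fd, buf, count);
        if (n >= 0 || errno != EINTR) {
            return n;
        }
        /* interrupted before reading anything: try again */
    }
}

/* Hypothetical helper: write exactly count bytes to fd, retrying on EINTR
 * and continuing after short writes.
 * Returns count on success, or -1 with errno set on a fatal error. */
ssize_t retry_write_all(int fd, const void *buf, size_t count)
{
    assert(buf != NULL);
    const char *p = buf;
    size_t left = count;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR) {
                continue;   /* non-fatal: retry the same write */
            }
            return -1;      /* fatal: let the caller inspect errno */
        }
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)count;
}
```

A real wrapper would likely also cap the number of retries and decide which other errors (for example EAGAIN on non-blocking descriptors) count as non-fatal; those policies are left out here.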
While reading through the tar code today to see how it handles xattrs, I came across an answer to the sub-second timestamp question. tar uses functions like get_stat_atime(), defined in stat-time.h, to fetch the timestamp from a stat structure: http://www.gnu.org/software/gnulib/coverage/gllib/stat-time.h.gcov.frameset.html It then uses utimensat() to set the timestamps.
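
As a rough illustration of the approach described above, the sketch below applies the access and modification times from a stat structure to a file with utimensat(). It is not tar's code: the function name `copy_timestamps` is hypothetical, and the sketch reads the `st_atim` and `st_mtim` fields directly, which assumes a platform that exposes them (for example Linux with glibc). Portable accessors like those in gnulib's stat-time.h exist because not every platform does.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* request utimensat() and st_atim/st_mtim */
#endif
#include <assert.h>
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* struct stat, utimensat() */
#include <time.h>       /* struct timespec */

/* Hypothetical helper: set the access and modification times of path to
 * the nanosecond-resolution values stored in src.
 * Returns 0 on success, or -1 with errno set. */
int copy_timestamps(const char *path, const struct stat *src)
{
    assert(path != NULL && src != NULL);

    struct timespec times[2];
    times[0] = src->st_atim;   /* access time, assumed to be available */
    times[1] = src->st_mtim;   /* modification time */

    /* AT_FDCWD resolves a relative path against the current directory;
     * flags = 0 follows symbolic links. */
    return utimensat(AT_FDCWD, path, times, 0);
}
```

A copy tool would typically call this after writing the destination file's data, since writing updates the modification time. Whether the sub-second part survives depends on the timestamp granularity of the destination file system.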