bigfile-diff-compare

A simple text file comparison utility for extremely large text files that fail under diff

I was moving large numbers of files from drive to drive and due to boring problems not worth repeating here, a trivial rsync was not cutting it, so at some point I had to bite the bullet and get a file listing of the from and to directories, and compare the listings with 'diff'. Turned out that linux 'diff' was failing silently due to memory issues, and none of the alternatives that I read about at stackexchange etc were working that well either. So... I wrote my own file comparison program. The input text files are memory mapped, and the internal data is a pointer per line which is taken off the heap in two large calloc calls. It works and it's fast enough. The comparison logic isn't super clever but it's good enough for all the real-world problems I've used it for so far. I expect this to be used when there are relatively small differences in extremely large files.

Perhaps its biggest infelicity is that it will synchronise on a blank line. If that just isn't acceptable to you I think the fix would be around the call to "Match". I added a new program, 'smallcompare', later, which uses a different algorithm to produce better differences, but it is too slow for using on really large files. I just include it here in case it's useful but honestly you might as well use 'diff' for anything that smallcompare would be suitable for.

The code is a bit crude and it's sort of based on my recollection of Hamish Dewar's far more elegant "compare" program from Edinburgh University back in the 60's.

Graham Toal 202308290145

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
LICENSE		LICENSE
README.md		README.md
bigcompare.c		bigcompare.c
smallcompare.c		smallcompare.c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

bigfile-diff-compare

About

Releases

Packages

Languages

License

gtoal/bigfile-diff-compare

Folders and files

Latest commit

History

Repository files navigation

bigfile-diff-compare

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages