Home
better-files is a dependency-free, pragmatic, thin Scala wrapper around Java NIO.
Consult the changelog if you are upgrading your library.
Imagine you have to write the following method:
- List all .csv files in a directory by increasing order of file size
- Drop the first line of each file and concat the rest into a single output file
- Split the above output file into n smaller files without breaking up the lines in the input files
- gzip each of the smaller output files
Note: Your program should work when files are much bigger than the memory available to your JVM, and it must close all open resources correctly.
The above task is not that easy to write in Java, shell, or Python without a certain amount of Googling. Using better-files, it can be solved in a fairly straightforward way:
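    import java.util.concurrent.atomic.AtomicInteger

    import better.files._

    def run(inputDir: File, outputDir: File, n: Int) = {
      val count = new AtomicInteger()
      // One gzipped output file per part: part-0.csv.gz ... part-(n-1).csv.gz
      val outputs = Vector.tabulate(n)(i => outputDir / s"part-$i.csv.gz")
      for {
        // autoClosed closes every writer once the loop finishes
        writers <- outputs.map(_.newGzipOutputStream().printWriter()).autoClosed
        // Smallest .csv files first
        inputFile <- inputDir.list(_.extension == Some(".csv")).toSeq.sorted(File.Order.bySize)
        // Read lines lazily, skipping the first line of each file
        line <- inputFile.lineIterator.drop(1)
      } writers(count.incrementAndGet() % n).println(line)
    }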
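To make the distribution step concrete, here is a minimal sketch of the round-robin index computed by `count.incrementAndGet() % n`. It uses only the standard library and touches no files; `partIndices` is a hypothetical helper introduced for illustration and is not part of better-files.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper (not part of better-files): for each of `lineCount`
// lines, compute the index of the output part that line would be written to,
// using the same counter expression as the loop above.
def partIndices(lineCount: Int, n: Int): Vector[Int] = {
  val count = new AtomicInteger()
  // Vector.fill evaluates its argument once per element, in order
  Vector.fill(lineCount)(count.incrementAndGet() % n)
}

// Seven lines spread over three parts
val indices = partIndices(7, 3)
// indices == Vector(1, 2, 0, 1, 2, 0, 1)
```

Because `incrementAndGet` returns 1 on its first call, the first line lands in part 1 rather than part 0. Each part still receives the same number of lines, give or take one, and no line is ever split across parts.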
For questions, ask in our Gitter channel or file an issue with the question tag.
YourKit supports better-files with its full-featured Java Profiler. YourKit, LLC is the creator of YourKit Java Profiler and YourKit .NET Profiler, innovative and intelligent tools for profiling Java and .NET applications.