
8. Class Reference


Work in progress

Processor Class:

The Processor class is the core component responsible for identifying and grouping duplicate files using a multi-step processing workflow. It applies various algorithms and grouping strategies to efficiently process and classify files into sets of similar files.


Constructors:


Processor(Grouper grouper, Collection<Algorithm<?>> algorithms)

  • Parameters:
    • grouper - A Grouper instance to perform the initial division of files based on a distinction predicate (e.g., CRC32 checksum).
    • algorithms - A collection of Algorithm objects applied to the files during the "Algorithm Application" step. The order of the algorithms matters for processing.
  • Throws:
    • NullPointerException - If grouper or algorithms is null, or if the algorithm collection is empty or contains null elements.
  • Purpose: Initializes the Processor with the provided grouping strategy and set of algorithms for processing the files.

Methods:


Map<File, Set<File>> process(@NotNull Collection<@NotNull File> files) throws IOException

  • Parameters:

    • files - A collection of File objects to be processed. Typically, these files are of the same type (e.g., images) and are grouped based on similarity.
  • Returns:

    • A Map where the key is a file considered the "original" in a group of similar files, and the value is a set of files considered duplicates or similar files.
  • Throws:

    • NullPointerException - If the input collection is null or contains null elements.
    • IOException - If any I/O error occurs during processing.
  • Purpose: This method processes the input collection of files through the following steps:

    1. Initial Division: Files are divided into subsets based on a distinction predicate.
    2. Algorithm Application: A series of algorithms is applied to the subsets to refine the grouping further.
    3. Original File Identification: The first file in each group is identified as the "original", and the groups are reorganized accordingly.
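
A minimal sketch of how these three steps might compose, using the grouper and the helper methods documented in this section (illustrative only; the actual implementation may differ):

// Hypothetical composition of the documented workflow; not the actual source.
public Map<File, Set<File>> process(Collection<File> files) throws IOException {
    Objects.requireNonNull(files, "files must not be null");
    Set<Set<File>> divided = grouper.divide(files);            // 1. Initial Division
    Set<Set<File>> refined = algorithmsApplication(divided);   // 2. Algorithm Application
    return originalDistinction(refined);                       // 3. Original File Identification
}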

private Set<Set<File>> algorithmsApplication(@NotNull Set<Set<File>> groupedFiles) throws IOException

  • Parameters:

    • groupedFiles - A set of sets of files, where each set represents a group of similar files.
  • Returns:

    • A new set of sets of files after applying all algorithms and consolidating the groups.
  • Throws:

    • IOException - If any error occurs during the algorithm application.
  • Purpose: This method applies each algorithm in the algorithms collection to the grouped files and consolidates the results by merging groups with identical keys and removing groups with only one file.


private <T> Map<T, Set<File>> applyAlgorithm(@NotNull Algorithm<T> algorithm, @NotNull Set<Set<File>> groupedFiles)

  • Parameters:

    • algorithm - The Algorithm to apply to the grouped files.
    • groupedFiles - A set of sets of files to process with the algorithm.
  • Returns:

    • A Map where the key is the characteristic (e.g., perceptual hash or CRC32 checksum) and the value is a set of files sharing that characteristic.
  • Purpose: This method applies a single algorithm to the grouped files and returns a map of results.


private Set<Set<File>> postAlgorithmConsolidation(@NotNull Map<?, Set<File>> algorithmOutput)

  • Parameters:

    • algorithmOutput - A map containing the results of the algorithm application, where the key is a shared characteristic and the value is a set of files that share that characteristic.
  • Returns:

    • A set of sets of files after consolidating the results by removing groups with only one file and merging groups with identical keys.
  • Purpose: This method consolidates the results of an algorithm by eliminating groups that contain only one file and merging groups with identical keys.
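
A hedged sketch of the singleton-removal part of this consolidation; the handling of identical keys is omitted here, and the actual implementation may differ:

private Set<Set<File>> postAlgorithmConsolidation(Map<?, Set<File>> algorithmOutput) {
    return algorithmOutput.values().stream()
            .filter(group -> group.size() > 1)   // drop groups containing only one file
            .collect(Collectors.toSet());
}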


private Map<File, Set<File>> originalDistinction(@NotNull Set<Set<File>> groupedFiles)

  • Parameters:

    • groupedFiles - A set of sets of files representing groups of similar files.
  • Returns:

    • A new Map where:
      • The key is the "original" file (the first file in each group).
      • The value is a Set of files considered duplicates or similar files.
  • Throws:

    • NullPointerException - If groupedFiles contains null.
  • Purpose: This method identifies the "original" file in each group and reorganizes the groups into a map, where each key is the original file and each value is a set of similar files (including the original file itself).
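
A minimal sketch of this reorganization, picking the first file returned by each group's iterator as the "original" (illustrative; the actual selection logic may differ):

private Map<File, Set<File>> originalDistinction(Set<Set<File>> groupedFiles) {
    Map<File, Set<File>> result = new HashMap<>();
    for (Set<File> group : groupedFiles) {
        File original = group.iterator().next();     // the "first" file in the group
        result.put(original, new HashSet<>(group));  // the value set includes the original itself
    }
    return result;
}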


private Set<File> consolidate(@NotNull Set<File> s1, @NotNull Set<File> s2)

  • Parameters:

    • s1 - The first set to merge.
    • s2 - The second set to merge.
  • Returns:

    • A new set containing all elements from both s1 and s2.
  • Purpose: This method merges two sets into one, ensuring that all elements from both sets are included.


Logger:


  • The Processor class uses a Logger instance (logger) from the SLF4J API to log messages during the various stages of file processing. For example, it logs the start of processing, division of files, application of algorithms, and the identification of original files.

Usage Example:


Grouper grouper = new CRC32Grouper();
List<Algorithm<?>> algorithms = List.of(new PerceptualHash(), new PixelByPixel());
Processor processor = new Processor(grouper, algorithms);

Collection<File> files = List.of(new File("image1.jpg"), new File("image2.jpg"));
Map<File, Set<File>> result = processor.process(files);
result.forEach((original, duplicates) -> {
    System.out.println("Original: " + original);
    duplicates.forEach(duplicate -> System.out.println("  Duplicate: " + duplicate));
});

Algorithms Package:

Algorithm Interface:

The Algorithm interface represents a functional abstraction for an algorithm that operates on a set of files, dividing them into smaller subsets based on some shared characteristic. The result is a map where each key corresponds to a group of files that share that characteristic.


Methods:


Map<K, Set<File>> apply(Set<File> group)

  • Parameters:

    • group - A Set of File objects to be processed by the algorithm. These files are typically of the same type (e.g., images), and the goal is to group them based on some shared characteristic.
  • Returns:

    • A Map where each key (K) corresponds to a set of files that share the same characteristic (e.g., checksum, hash, metadata). The key is computed from the shared property of the files.
  • Purpose:

    • This method applies the algorithm to the given set of files, partitioning them into smaller groups. Each group corresponds to a characteristic shared by all the files in the group. For example, the characteristic could be a checksum, perceptual hash, or file size.
    • The key used to map each set of files should be deterministic. This means that the same group of files will always produce the same output map when the algorithm is applied, ensuring consistency in grouping.

Functional Interface:


  • The Algorithm interface is marked with the @FunctionalInterface annotation, indicating that it is designed to be used with lambda expressions or method references.
  • Purpose of Functional Interface:
    • It can be easily implemented using a lambda expression or a method reference, allowing flexibility in defining various algorithms for grouping files. This allows for the easy application of different strategies, such as comparing files based on checksum, perceptual hash, or other metrics.

Usage Example:


Algorithm<String> checksumAlgorithm = (group) -> {
    // Example algorithm logic to group files by checksum (dummy implementation)
    Map<String, Set<File>> result = new HashMap<>();
    for (File file : group) {
        String checksum = getChecksum(file); // Example method to calculate checksum
        result.computeIfAbsent(checksum, k -> new HashSet<>()).add(file);
    }
    return result;
};

Set<File> files = new HashSet<>(List.of(new File("file1.txt"), new File("file2.txt")));
Map<String, Set<File>> groupedFiles = checksumAlgorithm.apply(files);

PerceptualHash Class

The PerceptualHash class implements the Algorithm interface and is used to compute perceptual hashes for images. This class groups similar images by generating and comparing perceptual hashes, which are compact fingerprints derived from an image's content. Generating these hashes allows images to be compared by visual similarity rather than exact content, making the class useful for image de-duplication and similarity detection.


Methods:


Map<String, Set<File>> apply(@NotNull Set<File> group)

  • Parameters:

    • group - A Set of File objects representing the images to be processed. Each image will be grouped based on its perceptual hash.
  • Returns:

    • A Map where each key is a perceptual hash (a String), and each value is a Set of files that share the same hash, representing images that are visually similar.
  • Purpose:

    • The method processes each image in the input set by resizing it, extracting its pixel values, applying the Discrete Cosine Transform (DCT), and generating a perceptual hash. The images are then grouped based on these hashes, and the result is returned as a map. Images with the same perceptual hash are considered similar and grouped together.

@NotNull private BufferedImage resize(@NotNull File file)

  • Parameters:

    • file - The image File to be resized.
  • Returns:

    • A BufferedImage that is resized to 8x8 pixels and converted to grayscale.
  • Purpose:

    • Resizes the image to a fixed 8x8 pixel size to standardize it for hash generation. The image is also converted to grayscale to simplify the process and reduce the detail that could interfere with the hash calculation.
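
A minimal sketch of one common way to produce such an 8x8 grayscale image with java.awt; the exception handling is simplified here, and the actual implementation may differ:

private BufferedImage resize(File file) throws IOException {
    BufferedImage source = ImageIO.read(file);
    BufferedImage small = new BufferedImage(8, 8, BufferedImage.TYPE_BYTE_GRAY);
    Graphics2D graphics = small.createGraphics();
    graphics.drawImage(source, 0, 0, 8, 8, null);  // scale to 8x8 while converting to grayscale
    graphics.dispose();
    return small;
}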

private double[][] extractSample(BufferedImage image)

  • Parameters:

    • image - A BufferedImage that has already been resized.
  • Returns:

    • A 2D double array representing the pixel values of the image.
  • Purpose:

    • Extracts the pixel values from the resized image and stores them in a matrix (2D array), which will be used in further steps for hash generation.

private String buildHash(double[][] matrix)

  • Parameters:

    • matrix - A 2D double array representing the pixel values of the image.
  • Returns:

    • A String representing the perceptual hash of the image, generated by comparing each pixel with the average value of the matrix.
  • Purpose:

    • Constructs a binary string (the perceptual hash) by comparing each pixel's value with the average value of all pixels in the matrix. If a pixel's value is greater than the average, it is marked as '1'; otherwise, it is marked as '0'.
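
A minimal sketch of the thresholding described above (illustrative only; not the actual source):

private String buildHash(double[][] matrix) {
    double avg = getAvg(matrix);               // average of all pixel values
    StringBuilder hash = new StringBuilder();
    for (double[] row : matrix) {
        for (double value : row) {
            hash.append(value > avg ? '1' : '0');  // '1' if above average, '0' otherwise
        }
    }
    return hash.toString();
}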

private double getAvg(double[][] matrix)

  • Parameters:

    • matrix - A 2D double array representing the pixel values of the image.
  • Returns:

    • The average pixel value of the matrix, computed across all pixels.
  • Purpose:

    • Calculates the average value of the pixel values in the matrix, which is used in the buildHash method to compare each pixel against the average for hash generation.

Description:


The PerceptualHash class generates perceptual hashes for images, allowing for the grouping of similar images. The algorithm works by performing several steps:

  1. Resize: Each image is resized to 8x8 pixels.
  2. Extract Sample: The pixel values of the resized image are extracted into a matrix.
  3. Discrete Cosine Transform (DCT): The DCT is applied (via the DCT::apply method) to reduce high-frequency components and focus on the low-frequency ones.
  4. Generate Hash: A hash is created by comparing the pixel values with the average value of the matrix, where pixels greater than the average are marked as 1, and those below the average are marked as 0.
  5. Group by Hash: Images are then grouped by their perceptual hashes. Images with the same hash are considered visually similar.

The final result is a map of perceptual hashes, where the key is the hash and the value is a set of images that share that hash.


Usage Example:


Set<File> images = new HashSet<>(List.of(new File("image1.jpg"), new File("image2.jpg")));
PerceptualHash perceptualHashAlgorithm = new PerceptualHash();
Map<String, Set<File>> groupedImages = perceptualHashAlgorithm.apply(images);

groupedImages.forEach((hash, files) -> {
    System.out.println("Hash: " + hash);
    files.forEach(file -> System.out.println("  " + file.getName()));
});

This example shows how to apply the PerceptualHash algorithm to a set of image files. The result is a map where images with the same perceptual hash are grouped together, indicating that they are visually similar.


PixelByPixel Class

The PixelByPixel class implements the Algorithm interface and is used for image matching based on pixel-by-pixel comparison. The goal of this algorithm is to group identical images from a set by comparing them at the pixel level. It efficiently handles large datasets using parallel processing and caching to optimize performance.


Methods:


Map<File, Set<File>> apply(Set<File> group)

  • Parameters:

    • group - A Set of File objects representing the images to be processed.
  • Returns:

    • A Map where each key is a file and the corresponding value is a set of files that are identical to the key file. Images that are pixel-identical are grouped together, and each value set also contains the key file itself.
  • Purpose:

    • This method processes a group of image files, comparing them pixel-by-pixel to identify identical images. The images are grouped by their pixel-level equivalence and stored in a map for the result. It utilizes a queue to manage image files and processes them in parallel.

private void process(@NotNull Map<File, Set<File>> result, @NotNull Queue<File> groupQueue)

  • Parameters:

    • result - A mutable Map that stores the groups of identical images.
    • groupQueue - A mutable Queue containing the files to be processed.
  • Purpose:

    • This method iterates through the queue, selecting a "key" image and comparing it to the other images in the queue. Identical images are removed from the queue and grouped together. The grouping is done in parallel to speed up processing.

private BufferedImage getCachedImage(@NotNull File file)

  • Parameters:

    • file - The image File to retrieve from the cache.
  • Returns:

    • A BufferedImage corresponding to the given file.
  • Purpose:

    • This method retrieves an image from the AdaptiveCache. If the image is not found in the cache, it will be loaded from the disk and added to the cache for future use. This avoids repeatedly reading the same image from disk, optimizing performance.

private boolean compareImages(@NotNull BufferedImage img1, @NotNull BufferedImage img2)

  • Parameters:

    • img1 - The first image to compare.
    • img2 - The second image to compare.
  • Returns:

    • true if the images are identical pixel-by-pixel, otherwise false.
  • Purpose:

    • This method compares two images by first checking if their dimensions match. If the dimensions are the same, it then compares the raw pixel data by examining the byte data of the image's raster. If the byte data matches, the images are considered identical.
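
A hedged sketch of a dimension check followed by a raster byte comparison; it assumes both images are backed by a byte data buffer, and the actual implementation may differ:

private boolean compareImages(BufferedImage img1, BufferedImage img2) {
    if (img1.getWidth() != img2.getWidth() || img1.getHeight() != img2.getHeight()) {
        return false;                          // differing dimensions cannot be identical
    }
    byte[] pixels1 = ((DataBufferByte) img1.getRaster().getDataBuffer()).getData();
    byte[] pixels2 = ((DataBufferByte) img2.getRaster().getDataBuffer()).getData();
    return Arrays.equals(pixels1, pixels2);    // identical only if every byte matches
}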

Description:


The PixelByPixel class is an image comparison algorithm that uses an exact, pixel-by-pixel method to identify identical images. It operates as follows:

  1. Load Images: It loads images from disk using a cache (via the AdaptiveCache class). If an image is not found in the cache, it is loaded from the disk and added to the cache.
  2. Pixel Comparison: The images are compared pixel by pixel, checking if they are exactly identical.
  3. Group Identical Images: Images that are identical (based on pixel comparison) are grouped together in a Map. The key of the map is the original image, and the value is a set of identical images.
  4. Parallel Processing: The comparison is parallelized to speed up processing, especially when handling large datasets of images.
  5. Cache Usage: The AdaptiveCache is used to optimize the image loading process, reducing the need to reload images repeatedly.

The algorithm assumes that all images in the group have the same resolution and format.


Usage Example:


Set<File> imageFiles = new HashSet<>(Arrays.asList(file1, file2, file3));
PixelByPixel algorithm = new PixelByPixel();
Map<File, Set<File>> result = algorithm.apply(imageFiles);

result.forEach((key, identicalImages) -> {
    System.out.println("Original Image: " + key.getName());
    identicalImages.forEach(file -> System.out.println("  Identical Image: " + file.getName()));
});

In this example, the PixelByPixel algorithm is applied to a set of image files. The result is a map where each key is an image file, and the value is a set of images that are identical to the key image. The images are grouped based on pixel-by-pixel comparison.


Math Package:

DCT Class

The DCT class is responsible for applying the Discrete Cosine Transform (DCT) and quantization to a given matrix of image coefficients. It serves as a pipeline that combines both transformations sequentially to prepare image data for perceptual hashing, compression, or other applications.


Dependencies:


  • pl.magzik.algorithms.math.dct.Transformer - Handles the Discrete Cosine Transform operation.
  • pl.magzik.algorithms.math.dct.Quantifier - Handles quantization of the DCT coefficients.

Constructor:


private DCT(Quantifier quantifier, Transformer transformer)

  • Parameters:
    • quantifier - The Quantifier instance that handles quantization of the DCT coefficients.
    • transformer - The Transformer instance that handles the Discrete Cosine Transform operation.

Methods:


static double[][] apply(double[][] matrix)

  • Parameters:

    • matrix - A 2D array of doubles representing the input matrix (e.g., grayscale pixel values from an image).
  • Returns:

    • A 2D array of doubles representing the quantized DCT coefficients.
  • Purpose:

    • This method performs both DCT and quantization in sequence.
    • It creates new instances of Quantifier and Transformer, initializes the DCT pipeline, and processes the input matrix.
  • How It Works:

    • Step 1: The matrix is passed to the transform method of Transformer, which applies the Discrete Cosine Transform.
    • Step 2: The resulting DCT coefficients are passed to the quantize method of Quantifier, which reduces their precision.
    • Step 3: The final quantized matrix is returned as output.

private double[][] applyInternal(double[][] matrix)

  • Parameters:

    • matrix - A 2D array of doubles representing the input matrix.
  • Returns:

    • A 2D array of doubles representing the quantized DCT coefficients.
  • Purpose:

    • This method is used internally to apply the transformation pipeline.
    • It first calls the transform method of the Transformer instance to compute the DCT.
    • Then, it calls the quantize method of the Quantifier instance to apply quantization.
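
Usage Example:


A minimal sketch of calling the static pipeline described above; the 8x8 input values are placeholders.

double[][] pixels = new double[8][8];
for (int i = 0; i < 8; i++) {
    for (int j = 0; j < 8; j++) {
        pixels[i][j] = (i + j) * 16.0;  // placeholder grayscale values
    }
}

double[][] quantizedCoefficients = DCT.apply(pixels);  // DCT followed by quantization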

DCT package:


Quantifier Class:

The Quantifier class is responsible for performing quantization on a matrix of DCT (Discrete Cosine Transform) coefficients.


Constructors:

Quantifier(int[][] quantizationMatrix)

  • Parameters:
    • quantizationMatrix - A 2D integer array representing the quantization matrix.

Quantifier()

  • Default Matrix: The default matrix follows the JPEG standard for 8x8 blocks:
{ {16, 11, 10, 16, 24, 40, 51, 61},
  {12, 12, 14, 19, 26, 58, 60, 55},
  {14, 13, 16, 24, 40, 57, 69, 56},
  {14, 17, 22, 29, 51, 87, 80, 62},
  {18, 22, 37, 56, 68, 109, 103, 77},
  {24, 35, 55, 64, 81, 104, 113, 92},
  {49, 64, 78, 87, 103, 121, 120, 101},
  {72, 92, 95, 98, 112, 100, 103, 99} };

Methods:

double[][] quantize(double[][] coeffs)

  • Parameters:

    • coeffs - A 2D double array representing the matrix of DCT coefficients.
  • Returns:

    • A 2D double array of quantized coefficients.
  • Throws:

    • IllegalArgumentException - If the dimensions of the input coeffs matrix don't match the quantization matrix dimensions.
  • Purpose:

    • Applies quantization to the given matrix of DCT coefficients using the quantization matrix.
    • Each DCT coefficient is divided by its corresponding quantization value and then rounded.
  • How It Works:

    • Loops through each value in the coefficient matrix.
    • Divides each coefficient by the corresponding value in the quantization matrix.
    • Rounds the result and stores it in a new matrix.
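
Usage Example:


A hedged sketch using the documented no-argument constructor and the default 8x8 JPEG-style matrix; the coefficient values are placeholders.

Quantifier quantifier = new Quantifier();   // uses the default 8x8 JPEG quantization matrix
double[][] coefficients = new double[8][8];
coefficients[0][0] = 1024.0;                // placeholder DC coefficient

double[][] quantized = quantifier.quantize(coefficients);
// Each entry is roughly round(coefficients[i][j] / quantizationMatrix[i][j]),
// so quantized[0][0] here would be 64 (1024 / 16).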

Transformer Class:

The Transformer class provides methods to perform the Discrete Cosine Transform (DCT) on both 1D vectors and 2D matrices. It leverages the efficient JTransforms library to compute DCT operations.


Methods:

double[] transform(double[] vector)

  • Parameters:

    • vector - A 1D array of double values to be transformed.
  • Returns:

    • A new double[] array containing the transformed values.
  • Purpose:

    • Computes the 1D DCT for the given vector using the DoubleDCT_1D class from the JTransforms library.
  • How It Works:

    • Clones the input vector to avoid mutating the original data.
    • Initializes a DoubleDCT_1D object with the vector's length.
    • Calls forward() with scaling = true to normalize the result.

double[][] transform(double[][] matrix)

  • Parameters:

    • matrix - A 2D array of double values representing the input data.
  • Returns:

    • A new 2D double[][] array containing the transformed values.
  • Purpose:

    • Performs a 2D DCT on the input matrix. This is achieved by:
      1. Applying a 1D DCT to each row of the matrix.
      2. Applying a 1D DCT to each column of the intermediate result.
  • How It Works:

    1. Copies the input matrix into a new array for the transformed values.
    2. Transforms each row using the transform(double[] vector) method.
    3. Extracts each column, transforms it, and writes back the result into the matrix.
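
Usage Example:


A hedged sketch that assumes a no-argument Transformer constructor (the constructor is not documented above); the matrix values are placeholders.

Transformer transformer = new Transformer();
double[][] matrix = new double[8][8];
matrix[0][0] = 255.0;  // placeholder pixel value

double[][] dctCoefficients = transformer.transform(matrix);  // 2D DCT: rows first, then columns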

Cache Package:

AdaptiveCache Class:

The AdaptiveCache class provides an adaptive memory-based caching solution using the Caffeine caching library. It dynamically adjusts memory usage based on the available JVM heap memory, ensuring efficient memory management and performance for image caching.


Constants:


private static final double MAXIMUM_MEMORY_PERCENTAGE = 0.6

  • Limits the cache size to 60% of the JVM heap memory.

Constructor


private AdaptiveCache(long maximumWeight)

  • Parameters:
    • maximumWeight - The maximum memory (in bytes) the cache can use.

Methods:


static AdaptiveCache getInstance()

  • Returns:

    • The singleton AdaptiveCache instance.
  • Purpose:

    • Provides access to the singleton instance of the cache.

BufferedImage get(@NotNull File key) throws IOException

  • Parameters:

    • key - A File object representing the image file.
  • Returns:

    • The BufferedImage loaded from cache or disk.
  • Throws:

    • IOException - If the image cannot be loaded.
  • Purpose:

    • Retrieves an image from the cache. If the image is not cached, it loads it from disk and stores it in the cache.

void monitor(long period)

  • Parameters:

    • period - Interval (in seconds) between cache logs.
  • Purpose:

    • Starts a periodic task that logs cache statistics at regular intervals.

private int getImageWeight(File key, @NotNull BufferedImage value)

  • Parameters:

    • key - The image file.
    • value - The BufferedImage whose weight is calculated.
  • Returns:

    • The memory weight of the image in bytes.
  • Purpose:

    • Computes the memory weight of an image in bytes.
    • Assumes each pixel is represented by 4 bytes (RGBA).
  • Formula:

return value.getWidth() * value.getHeight() * 4;

private BufferedImage loadImage(@NotNull File key)

  • Parameters:

    • key - The image file.
  • Returns:

    • The loaded BufferedImage.
  • Throws:

    • UncheckedIOException - If the file cannot be read or the format is unsupported.
  • Purpose:

    • Loads an image from disk using ImageIO.

private static long getMaximumWeight()

  • Returns:

    • Maximum cache weight (in bytes).
  • Purpose:

    • Calculates the maximum cache size based on the available JVM memory (60% of the heap).

Key Design Notes:


  1. Adaptive Memory Management:
  • The maximumWeight is dynamically calculated based on JVM heap size.
  2. Thread-Safe:
  • The cache itself is thread-safe as Caffeine provides synchronized operations internally.
  • The monitor uses AtomicBoolean to ensure it starts only once.
  3. Error Handling:
  • Uses UncheckedIOException to propagate IOException from ImageIO read operations.
  • Logs detailed error information using SLF4J.
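
Usage Example:


A minimal sketch using the documented singleton accessor, get, and monitor methods; the file path is a placeholder.

AdaptiveCache cache = AdaptiveCache.getInstance();
cache.monitor(60);  // log cache statistics every 60 seconds

try {
    BufferedImage image = cache.get(new File("image1.jpg"));
    System.out.println("Loaded: " + image.getWidth() + "x" + image.getHeight());
} catch (IOException e) {
    e.printStackTrace();
}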

Grouping Package:

Grouper Interface:

The Grouper interface represents a functional interface designed to group files into subsets that share a common characteristic, such as an identical checksum or other similarity criteria. It abstracts the process of grouping files for organizational or comparison purposes.


Methods:


Set<Set<File>> divide(Collection<File> col) throws IOException

  • Parameters:

    • col - A Collection of File objects to be divided into subsets. Typically, these files are of the same type (e.g., images).
  • Returns:

    • A set of subsets of files, where each subset (Set<File>) contains files that share a common characteristic.
  • Throws:

    • IOException - If an I/O error occurs while reading or processing the files.
  • Purpose: Divides a collection of files into subsets based on a defined distinction or grouping criterion. Each subset contains files that share a common property, such as identical content, checksum, or other user-defined similarities.


Key Design Notes:


  • Generality: The divide method is intentionally kept generic to accommodate any type of grouping logic.
  • Performance Consideration: Implementations of divide should be optimized for performance, especially when processing large file collections.
  • Immutability of Results: Returning a Set<Set<File>> ensures that no duplicate groups exist and that each subset of files can be easily iterated over.

Usage Example:


public class FileSizeGrouper implements Grouper {

    @Override
    public Set<Set<File>> divide(Collection<File> col) throws IOException {
        // Group files by their size using a map
        Map<Long, Set<File>> sizeGroups = new HashMap<>();

        for (File file : col) {
            if (file.isFile()) {
                long size = Files.size(file.toPath());
                sizeGroups.computeIfAbsent(size, k -> new HashSet<>()).add(file);
            }
        }

        return new HashSet<>(sizeGroups.values());
    }
}


public class Main {
    public static void main(String[] args) throws IOException {
        List<File> files = List.of(
                new File("image1.jpg"),
                new File("image2.jpg"),
                new File("duplicate_image1.jpg")
        );

        Grouper grouper = new FileSizeGrouper();
        Set<Set<File>> groupedFiles = grouper.divide(files);

        groupedFiles.forEach(group -> {
            System.out.println("Group:");
            group.forEach(file -> System.out.println(" - " + file.getName()));
        });
    }
}

CRC32Grouper Class:

The CRC32Grouper class implements the Grouper interface to group files based on their CRC32 checksum. Files that share the same checksum are assumed to be identical and grouped together. This approach is useful for detecting duplicate files in a collection.


Methods:


Set<Set<File>> divide(Collection<File> col)

  • Parameters:

    • col - A Collection of File objects to group based on their checksum.
  • Returns:

    • A set of subsets of files, where each subset (Set<File>) contains files that share the same checksum.
  • Purpose:

    • Divides a collection of files into subsets based on their CRC32 checksum values. Files with the same checksum are grouped together.

private long calculateChecksum(File f) throws IOException

  • Parameters:

    • f - The File for which the checksum is to be calculated.
  • Returns:

    • long - The CRC32 checksum value of the file.
  • Throws:

    • IOException - If an I/O error occurs while reading the file.
  • Purpose: Calculates the CRC32 checksum for a given file. This method reads the file in chunks to optimize memory usage and applies the CRC32 algorithm to generate the checksum.
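
A hedged sketch of chunked CRC32 calculation with java.util.zip.CRC32; the buffer size and stream handling are assumptions, not the actual source:

private long calculateChecksum(File f) throws IOException {
    try (InputStream in = new BufferedInputStream(new FileInputStream(f))) {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];        // assumed chunk size
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            crc.update(buffer, 0, bytesRead);  // feed each chunk into the checksum
        }
        return crc.getValue();
    }
}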


Key Design Notes:


  1. Parallel Stream Processing:
  • Files are processed in parallel for performance. This ensures that large file collections are grouped efficiently.
  2. Grouping Logic:
  • Uses a Map to associate checksum values with sets of files.
  • Files with identical checksums are collected into the same group.
  3. Filter Unique Groups:
  • Only file groups with more than one file are retained as potential duplicates.

Usage Example:


import pl.magzik.grouping.CRC32Grouper;
import pl.magzik.grouping.Grouper;

import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Set;

public class Main {
    public static void main(String[] args) throws IOException {
        List<File> files = List.of(
                new File("file1.txt"),
                new File("file2.txt"),
                new File("duplicate_file1.txt")
        );

        Grouper grouper = new CRC32Grouper();

        Set<Set<File>> groupedFiles = grouper.divide(files);

        groupedFiles.forEach(group -> {
            System.out.println("Group of duplicate files:");
            group.forEach(file -> System.out.println(" - " + file.getName()));
        });
    }
}

IO Package:

FileOperation Interface:

The FileOperation interface defines a standardized contract for performing file management operations such as:

  1. Loading files.
  2. Moving files to a specified directory.
  3. Deleting files.

It supports operations on both collections of files and individual arrays of files. Default methods ensure flexibility by delegating array-based operations to their corresponding collection-based methods.


Methods:


List<File> load(Collection<File> files) throws IOException

  • Parameters:

    • files - A Collection of File objects to be loaded.
  • Returns:

    • List<File> - A list containing the loaded files.
  • Throws:

    • IOException - If an I/O error occurs while loading the files.
  • Purpose: Loads the provided collection of files. The operation may involve verifying file existence, reading metadata, or other preparatory operations.


default List<File> load(File... files) throws IOException

  • Parameters:

    • files - An array of files to be loaded.
  • Returns:

    • List<File> - A list containing the loaded files.
  • Throws:

    • IOException - If an I/O error occurs while loading the files.
  • Purpose: Loads the provided array of files. Delegates to the collection-based load(Collection<File>) method.


void move(File destination, Collection<File> files) throws IOException

  • Parameters:

    • destination - The target directory for the moved files.
    • files - The Collection of File objects to be moved.
  • Throws:

    • IOException - If an I/O error occurs while moving the files.
  • Purpose: Moves the provided collection of files to a specified destination directory.


default void move(File destination, File... files) throws IOException

  • Parameters:

    • destination - The target directory for the moved files.
    • files - An array of files to be moved.
  • Throws:

    • IOException - If an I/O error occurs while moving the files.
  • Purpose: Moves the provided array of files to the specified destination directory. Delegates to the collection-based move(File, Collection<File>) method.


void delete(Collection<File> files) throws IOException

  • Parameters:

    • files - The Collection of File objects to be deleted.
  • Throws:

    • IOException - If an I/O error occurs while deleting the files.
  • Purpose: Deletes the provided collection of files.


default void delete(File... files) throws IOException

  • Parameters:

    • files - An array of files to be deleted.
  • Throws:

    • IOException - If an I/O error occurs while deleting the files.
  • Purpose: Deletes the provided array of files. Delegates to the collection-based delete(Collection<File>) method.


Usage Example:


public class SimpleFileOperation implements FileOperation {

    @Override
    public List<File> load(Collection<File> files) throws IOException {
        for (File file : files) {
            if (!file.exists()) {
                throw new IOException("File not found: " + file.getName());
            }
        }
        return List.copyOf(files);
    }

    @Override
    public void move(File destination, Collection<File> files) throws IOException {
        if (!destination.isDirectory()) {
            throw new IOException("Destination must be a directory.");
        }

        for (File file : files) {
            File target = new File(destination, file.getName());
            if (!file.renameTo(target)) {
                throw new IOException("Failed to move file: " + file.getName());
            }
        }
    }

    @Override
    public void delete(Collection<File> files) throws IOException {
        for (File file : files) {
            if (!file.delete()) {
                throw new IOException("Failed to delete file: " + file.getName());
            }
        }
    }

    public static void main(String[] args) {
        FileOperation fileOperation = new SimpleFileOperation();

        try {
            File file1 = new File("file1.txt");
            File file2 = new File("file2.txt");
            File destination = new File("targetDirectory");

            fileOperation.load(file1, file2);
            fileOperation.move(destination, file1, file2);
            fileOperation.delete(file1, file2);

        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Predicates Package:
