A console utility that selects the N cheapest products from a set of input CSV files, with at most M products sharing the same ID. Files are processed in parallel to improve performance, and each file is read and handled in parts to limit memory use.
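The selection rule can be illustrated with a minimal sketch. This is not the utility's actual code: `CheapestSelector`, `Product`, `offer`, and `result` are hypothetical names chosen for this example. It assumes rows arrive one at a time, keeps a bounded max-heap of the M cheapest rows per product ID, and at the end sorts the survivors by price and returns the first N.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of "N cheapest, at most M per ID"; not the utility's real classes.
final class CheapestSelector {

    // Mirrors the five CSV columns: ID, name, condition, state, price.
    record Product(int id, String name, String condition, String state, double price) {}

    private static final Comparator<Product> BY_PRICE =
            Comparator.comparingDouble(Product::price);

    private final int maxPerId; // M: maximum rows kept for a single product ID
    private final Map<Integer, PriorityQueue<Product>> cheapestById = new HashMap<>();

    CheapestSelector(int maxPerId) {
        this.maxPerId = maxPerId;
    }

    // Offers one row. Each ID keeps a max-heap, so the most expensive
    // row is evicted as soon as the heap holds more than maxPerId rows.
    void offer(Product product) {
        PriorityQueue<Product> heap = cheapestById.computeIfAbsent(
                product.id(), id -> new PriorityQueue<>(BY_PRICE.reversed()));
        heap.add(product);
        if (heap.size() > maxPerId) {
            heap.poll();
        }
    }

    // Returns up to `limit` (N) rows, cheapest first.
    List<Product> result(int limit) {
        List<Product> all = new ArrayList<>();
        cheapestById.values().forEach(all::addAll);
        all.sort(BY_PRICE);
        return new ArrayList<>(all.subList(0, Math.min(limit, all.size())));
    }
}
```

Because only M rows per ID are retained, memory grows with the number of distinct IDs rather than with the total number of rows. A real implementation might additionally prune against the current global N-th price; that refinement is left out here.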
Initial Data:
- Several CSV files. The number of files can be quite large (up to 100,000).
- The number of rows within each file can reach up to several million.
- Each file contains 5 columns:
  - Product ID (int)
  - Name (String)
  - Condition (String)
  - State (String)
  - Price (double)
- The same product ID may occur more than once, both across different CSV files and within a single CSV file.
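As an illustration of reading such files without loading them fully into memory, here is a minimal sketch. The names `CsvRows`, `Row`, `parseLine`, and `forEachRow` are hypothetical and do not come from the utility. It assumes UTF-8 files, no quoted fields containing the delimiter, and that rows which do not match the five columns listed above (for example, a header row) can simply be skipped.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical sketch of line-by-line CSV reading; not the utility's real classes.
final class CsvRows {

    record Row(int id, String name, String condition, String state, double price) {}

    // Parses one line into a Row, or returns empty if the line is malformed.
    // Assumption: fields are not quoted and never contain the delimiter.
    static Optional<Row> parseLine(String line, String delimiter) {
        // Pattern.quote makes delimiters such as "|" or "." literal in the regex.
        String[] cols = line.split(Pattern.quote(delimiter), -1);
        if (cols.length != 5) {
            return Optional.empty();
        }
        try {
            int id = Integer.parseInt(cols[0].trim());
            double price = Double.parseDouble(cols[4].trim());
            return Optional.of(new Row(id, cols[1], cols[2], cols[3], price));
        } catch (NumberFormatException e) {
            return Optional.empty(); // e.g. a header row or a corrupt value
        }
    }

    // Streams a file one line at a time, so only the current line is held in memory.
    static void forEachRow(Path file, String delimiter, Consumer<Row> sink) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                parseLine(line, delimiter).ifPresent(sink);
            }
        }
    }
}
```

Since `forEachRow` handles one file independently of the others, calls for different files could be submitted to a thread pool, which is one way the parallel processing mentioned above might be arranged.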
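Parameters:
- directoryPath: path to the directory containing the CSV files
- delimiter: column delimiter used in the CSV files (defaultValue: ,)
- productResultRowsCount: number of products in the result, N (defaultValue: 1000)
- duplicateProductsMaxCount: maximum number of products with the same ID in the result, M (defaultValue: 20)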
Example:
directoryPath=C:\Users\Tiran\Desktop\files\csv delimiter=, productResultRowsCount=1000 duplicateProductsMaxCount=20
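A minimal sketch of how `key=value` arguments like the ones in the example could be parsed, applying the documented defaults. `CliArgs`, `Settings`, and `parse` are hypothetical names, and treating `directoryPath` as required is an assumption based on it having no listed default.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of argument parsing; not the utility's real classes.
final class CliArgs {

    record Settings(String directoryPath,
                    String delimiter,
                    int productResultRowsCount,
                    int duplicateProductsMaxCount) {}

    static Settings parse(String[] args) {
        Map<String, String> values = new HashMap<>();
        for (String arg : args) {
            // Split on the first '=' only, so a value may itself contain '='.
            int eq = arg.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + arg);
            }
            values.put(arg.substring(0, eq), arg.substring(eq + 1));
        }
        String directoryPath = values.get("directoryPath");
        if (directoryPath == null) {
            // Assumption: the directory has no default and must be supplied.
            throw new IllegalArgumentException("directoryPath is required");
        }
        return new Settings(
                directoryPath,
                values.getOrDefault("delimiter", ","),
                Integer.parseInt(values.getOrDefault("productResultRowsCount", "1000")),
                Integer.parseInt(values.getOrDefault("duplicateProductsMaxCount", "20")));
    }
}
```

Passing only `directoryPath=...` would then yield the defaults `,`, `1000`, and `20` for the remaining settings.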