Large PDF files (2k-4k pages per file). All files have a similar structure, inherited from legacy office tables.
- get the table-like data as a file that can be used with pandas
- prevent "out-of-memory" errors
- save the results to a file in a size-insensitive format
- be able to track progress and the estimated time remaining
- Count the number of pages in each file
- Split the page range into parts for part-by-part conversion (converting one page at a time is slow)
- Convert each part and append it to a common dataframe
- Save the complete dataframe to a pickle file
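
The splitting and progress-tracking steps above can be sketched as follows. This is a minimal illustration, not the actual implementation: the names `split_page_ranges`, `convert_in_parts`, `convert_part` and `report` are hypothetical, and the PDF extraction itself is left to a caller-supplied callback because the notes do not name a library. The estimate assumes that every page takes roughly the same time to convert.

```python
import time
from typing import Any, Callable, Iterator, List, Tuple


def split_page_ranges(total_pages: int, part_size: int) -> Iterator[Tuple[int, int]]:
    """Yield 1-based, inclusive (first, last) page ranges covering the file."""
    if total_pages < 0 or part_size < 1:
        raise ValueError("total_pages must be >= 0 and part_size must be >= 1")
    for first in range(1, total_pages + 1, part_size):
        yield first, min(first + part_size - 1, total_pages)


def convert_in_parts(
    total_pages: int,
    part_size: int,
    convert_part: Callable[[int, int], Any],  # hypothetical: converts pages first..last
    report: Callable[[str], None] = print,
) -> List[Any]:
    """Convert a file part by part, reporting progress and a rough ETA."""
    parts: List[Any] = []
    start = time.monotonic()
    for first, last in split_page_ranges(total_pages, part_size):
        parts.append(convert_part(first, last))
        elapsed = time.monotonic() - start
        # Linear estimate: average time per finished page times pages left.
        eta = elapsed / last * (total_pages - last)
        report(
            f"pages {first}-{last} of {total_pages} done, "
            f"elapsed {elapsed:.1f}s, estimated remaining {eta:.1f}s"
        )
    return parts
```

If `convert_part` returns one pandas dataframe per part, the returned list can be combined with `pd.concat(parts, ignore_index=True)` and written out with `DataFrame.to_pickle`. Collecting the parts in a list and concatenating once at the end avoids copying the growing dataframe on every iteration; the part size is a trade-off between per-call overhead and peak memory use.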