FileNotFoundError in `download_and_unzip` when running multiple easyocr's concurrently #1335

starpit · 2024-11-18T21:39:58Z

When we run two or more easyocr instances concurrently, the model downloader fails with the FileNotFoundError shown in the traceback below. I am guessing that the download logic uses a fixed download file path, so that concurrent downloads collide?

EasyOcrModel(
File ".../lib/python3.10/site-packages/docling/models
  self.reader = easyocr.Reader(config["lang"])
File ".../lib/python3.10/site-packages/easyocr/easyocr.py", line 92, in __init__
  detector_path = self.getDetectorPath(detect_network)
File ".../lib/python3.10/site-packages/easyocr/easyocr.py", line 253, in getDetectorPath
  download_and_unzip(self.detection_models[self.detect_network]['url'], self.detection_models[self.detect_network]['filename'], self.model_storage_directory, self.verbose)
File ".../lib/python3.10/site-packages/easyocr/utils.py", line 631, in download_and_unzip
  os.remove(zip_path)
FileNotFoundError: [Errno 2] No such file or directory: '/home/runner/.EasyOCR//model/temp.zip'
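
To make the suspected failure mode concrete, here is a minimal sketch, assuming that every caller writes to the same `temp.zip` path and removes it when finished. This is not easyocr's actual code: `simulate_shared_temp_race` is a hypothetical helper, and the bad interleaving of two callers is replayed by hand in a single thread instead of relying on real concurrency.

```python
import os
import tempfile


def simulate_shared_temp_race(model_dir):
    """Replay one bad interleaving of two callers that share a fixed temp path.

    Hypothetical illustration only; it does not call into easyocr.
    """
    zip_path = os.path.join(model_dir, "temp.zip")  # same path for every caller

    # Caller A and caller B both "download" to the same file.
    with open(zip_path, "wb") as f:
        f.write(b"download from caller A")
    with open(zip_path, "wb") as f:
        f.write(b"download from caller B")

    # Caller A finishes first and cleans up the shared file.
    os.remove(zip_path)

    # Caller B then tries to remove a file that no longer exists.
    try:
        os.remove(zip_path)
    except FileNotFoundError as exc:
        return exc
    return None


with tempfile.TemporaryDirectory() as model_dir:
    error = simulate_shared_temp_race(model_dir)
    print(type(error).__name__)
```

If this is indeed what happens, the second `os.remove` is the call that fails, which would match the last frame of the traceback.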

starpit · 2024-11-19T00:36:42Z

Update: wrapping the DocumentConverter constructor in an fcntl file lock lets us skirt this error. That seems like fair, albeit not definitive, evidence that this is indeed a race condition on the easyocr side.

https://github.com/IBM/lunchpail/pull/553/files#diff-887bd71eba07d3802a0d252334cc69f2ee9e74ac50e28a220dea8d9584ab6f44L130-R141
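
For readers who want to try the same workaround, here is a rough sketch of an fcntl-based lock wrapper. It is not a copy of the linked change: the helper names `exclusive_file_lock` and `build_serialized` and the lock file location are assumptions made for illustration, and `fcntl` is only available on Unix-like systems.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_file_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until any other holder releases
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def build_serialized(factory, lock_path):
    """Call factory() while holding the lock, so one process constructs at a time."""
    with exclusive_file_lock(lock_path):
        return factory()


# Stand-in factory so the sketch runs without docling or easyocr installed.
# In the scenario above, the factory would be the DocumentConverter constructor.
lock_path = os.path.join(tempfile.gettempdir(), "easyocr-init.lock")  # arbitrary path
result = build_serialized(lambda: "constructed", lock_path)
```

The lock only needs to cover construction, since that is where the model download is triggered according to the traceback. Once the models are on disk, later constructions should not need to download again, although they would still queue briefly on the lock.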

starpit mentioned this issue Nov 18, 2024

feat: update worker watcher to "pack" i.e. leverage as many cores as the pod/machine provides IBM/lunchpail#552

Merged
