Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for config file locking #1719

Merged
merged 31 commits into from
Aug 8, 2023
Merged

Fix for config file locking #1719

merged 31 commits into from
Aug 8, 2023

Conversation

kessler-frost
Copy link
Member

@kessler-frost kessler-frost commented Jul 4, 2023

  • I have added the tests to cover my changes.
  • I have updated the documentation and CHANGELOG accordingly.
  • I have read the CONTRIBUTING document.

@Andrew-S-Rosen
Copy link
Contributor

Closes #1670.

@codecov
Copy link

codecov bot commented Jul 4, 2023

Codecov Report

Merging #1719 (97c7347) into develop (e500030) will decrease coverage by 1.93%.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1719      +/-   ##
===========================================
- Coverage    88.20%   86.28%   -1.93%     
===========================================
  Files          191      191              
  Lines         8218     8217       -1     
  Branches       151      151              
===========================================
- Hits          7249     7090     -159     
- Misses         864     1022     +158     
  Partials       105      105              
Flag Coverage Δ *Carryforward flag
Dispatcher 86.68% <ø> (ø) Carriedforward from 863a7e9
Functional_Tests ?
SDK 91.49% <100.00%> (-0.01%) ⬇️ Carriedforward from 863a7e9
UI_Backend 91.15% <ø> (ø)
UI_Frontend 72.64% <ø> (ø)

*This pull request uses carry forward flags. Click here to find out more.

@Andrew-S-Rosen
Copy link
Contributor

By the way, not that it needs to be done in this PR, but there's one other section in the codebase that relies on the OS-dependent fcntl.

def set_winsize(fd, row, col, xpix=0, ypix=0):
winsize = struct.pack("HHHH", row, col, xpix, ypix)
fcntl.ioctl(fd, termios.TIOCSWINSZ, winsize)

Maybe it'll be possible to replace that call too at some point.

@kessler-frost
Copy link
Member Author

Yep that's on my radar. Thanks @arosen93 !

@kessler-frost
Copy link
Member Author

kessler-frost commented Jul 7, 2023

@arosen93 Do you mind helping me out a bit with this one, can you check whether this solution works on your side on the slurm cluster (where there was the NFS based system) for which covalent was halting earlier? Thanks a lot!

@Andrew-S-Rosen
Copy link
Contributor

Certainly! I'll check in the AM here.

@Andrew-S-Rosen
Copy link
Contributor

Andrew-S-Rosen commented Jul 7, 2023

@kessler-frost: I reran my typical test (#1697). A few notes.

  1. After pip install-ing, it didn't install filelock. So, I had to do that manually.
  2. Covalent no longer freezes on import
  3. Jobs do not run successfully by default (traceback below), but this is expected.
Traceback (most recent call last):
  File "/pscratch/sd/r/rosen/test.py", line 1, in <module>
    import covalent
  File "/global/common/software/matgen/rosen/miniconda/envs/test/lib/python3.10/site-packages/covalent/__init__.py", line 26, in <module>
    from . import executor, leptons  # nopycln: import
  File "/global/common/software/matgen/rosen/miniconda/envs/test/lib/python3.10/site-packages/covalent/executor/__init__.py", line 33, in <module>
    from .._shared_files import logger
  File "/global/common/software/matgen/rosen/miniconda/envs/test/lib/python3.10/site-packages/covalent/_shared_files/logger.py", line 34, in <module>
    log_level = get_config("sdk.log_level").upper()
  File "/global/common/software/matgen/rosen/miniconda/envs/test/lib/python3.10/site-packages/covalent/_shared_files/config.py", line 244, in get_config
    cm = ConfigManager()
  File "/global/common/software/matgen/rosen/miniconda/envs/test/lib/python3.10/site-packages/covalent/_shared_files/config.py", line 56, in __init__
    self.update_config()
  File "/global/common/software/matgen/rosen/miniconda/envs/test/lib/python3.10/site-packages/covalent/_shared_files/config.py", line 113, in update_config
    with filelock.FileLock(f"{self.config_file}.lock", timeout=1):
  File "/global/common/software/matgen/rosen/miniconda/envs/test/lib/python3.10/site-packages/filelock/_api.py", line 255, in __enter__
    self.acquire()
  File "/global/common/software/matgen/rosen/miniconda/envs/test/lib/python3.10/site-packages/filelock/_api.py", line 222, in acquire
    raise Timeout(lock_filename)  # noqa: TRY301
filelock._error.Timeout: The file lock '/global/homes/r/rosen/.config/covalent/covalent.conf.lock' could not be acquired.

The reason the lock can't be acquired is because the default COVALENT_CONFIG_DIR (being in ~/) on Perlmutter doesn't support locking mechanisms in general. I don't see a real way of getting around that aside from not using a locking mechanism at all.

More importantly, because there's no stalling and a proper error is raised, at least it is telling the user to do something about it. In this case, the user has to just update their COVALENT_CONFIG_DIR to a filesystem that supports locking. I've confirmed that everything works as expected once that's done.

So, in summary, I think this solution is better and worth merging because:

  • There is no stalling upon error
  • A proper error is raised
  • It is an OS-agnostic solution

Just make sure filelock makes its way into setup.py.


If you're curious, I also tried pip install-ing Covalent on Windows and ran import covalent, but now that causes a hang related to the update_config. Obviously, Windows isn't supported so that's not a problem per se, but I wanted to pass that info along in case it might be relevant.

@kessler-frost
Copy link
Member Author

Thanks a lot @arosen93 ! I'll clean this one up and get this merged.

@Andrew-S-Rosen
Copy link
Contributor

Great! Thanks for addressing this :)

@kessler-frost
Copy link
Member Author

So, I wanted to try out covalent on Windows on my end as well and after spending quite a lot of time on setting things up, it turns out there's more work involved in making this work for Windows than I expected. So, I guess I'll tackle handling that support in this PR. Will still check whether this is working with multiple processes trying to edit the config file simultaneously.

@kessler-frost kessler-frost marked this pull request as ready for review July 31, 2023 07:59
@kessler-frost kessler-frost requested review from a team as code owners July 31, 2023 07:59
@kessler-frost
Copy link
Member Author

 def set_winsize(fd, row, col, xpix=0, ypix=0): 
     winsize = struct.pack("HHHH", row, col, xpix, ypix) 
     fcntl.ioctl(fd, termios.TIOCSWINSZ, winsize) 

The above fcntl in covalent/covalent_ui/api/main.py can be changed to something else later since it's UI based and requires a different type of testing.

@wjcunningham7 wjcunningham7 enabled auto-merge (squash) August 8, 2023 12:38
@wjcunningham7 wjcunningham7 merged commit cdd3022 into develop Aug 8, 2023
11 checks passed
@wjcunningham7 wjcunningham7 deleted the fix-config-file-lock branch August 8, 2023 13:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make setup.py OS-agnostic and replace fcntl with an OS-agnostic alternative
4 participants