hkdb cannot seek #1092

Open
msilvafe opened this issue Jan 15, 2025 · 1 comment

@msilvafe
Contributor

I was running a loop over many (3492) obs ids, calling hkdb.LoadSpec() and hkdb.load_hk() for the time range of each obs id. On 51 of them I get the following error:

ERROR (G3IndexedReader) 15-Jan-2025:21:24:47 UTC: Cannot seek; stream closed at EOF. (G3IndexedReader.cxx:95 in int G3IndexedReader::Seek(int))
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[28], line 24
     21 cfg_plat = hkdb.HkConfig.from_yaml(cfg_plat)
     22 lspec_plat = hkdb.LoadSpec(start=t0, end=t1, cfg=cfg_plat,
     23                         fields=['hwp_rate1', 'hwp_rate2', 'hwp_direction'], hkdb=hkdb_plat)
---> 24 result_plat = hkdb.load_hk(lspec_plat, show_pb=False)
     26 hkdb_site.engine.dispose()
     27 hkdb_plat.engine.dispose()

File ~/.local/soconda/0.1.4.dev250/lib/python3.11/site-packages/sotodlib/io/hkdb.py:477, in load_hk(load_spec, show_pb)
    475 reader = so3g.G3IndexedReader(path)
    476 for offset in sorted(offsets):
--> 477     reader.Seek(offset)
    478     frame = reader.Process(None)[0]
    479     addr = frame['address']

RuntimeError: Cannot seek; stream closed at EOF.

The following call will reproduce the above error (with all paths on site-computing):

from sotodlib.io import hkdb

# Open the index database and load the matching config from the same YAML file.
hkdb_plat = hkdb.HkDb('/so/home/msilvafe/shared_files/satp1_hkdb_cfg.yaml')
cfg_plat = hkdb.HkConfig.from_yaml('/so/home/msilvafe/shared_files/satp1_hkdb_cfg.yaml')
# Time range of one of the 51 failing obs ids.
lspec_plat = hkdb.LoadSpec(start=1722245201.9659333, end=1722248712.6509333, cfg=cfg_plat,
                           fields=['hwp_rate1', 'hwp_rate2', 'hwp_direction'], hkdb=hkdb_plat)
result_plat = hkdb.load_hk(lspec_plat, show_pb=False)
@mhasself
Member

If you run a debugger and look at the frame "offsets", the list contains:

[175031957, 175031957, 175070761, 175070761, 175250307, 175250307, 178188657, 178188657, 179452571, 179452571, 184821240, 184821240, .......
594170116, 594170116, 594231478, 594231478]

So that's weird: each offset shown appears twice. Seems like database corruption (duplicate entries). Probably "file_id" and "byte_offset" together should be a primary key.
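To illustrate the primary-key idea, here is a minimal sketch using the standard-library sqlite3 module. The table name frame_index and its two-column layout are hypothetical, chosen only for this example; the real hkdb schema may look different. It shows that a composite primary key on (file_id, byte_offset) makes the database reject a second row for the same frame.

```python
import sqlite3

# Hypothetical table layout for illustration only; not the actual hkdb schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE frame_index (
        file_id     INTEGER NOT NULL,
        byte_offset INTEGER NOT NULL,
        PRIMARY KEY (file_id, byte_offset)
    )
    """
)

# Two of the three rows describe the same frame in the same file.
rows = [(1, 175031957), (1, 175031957), (1, 175070761)]

rejected = 0
for row in rows:
    try:
        conn.execute("INSERT INTO frame_index VALUES (?, ?)", row)
    except sqlite3.IntegrityError:
        # The composite primary key refuses the duplicate entry.
        rejected += 1

stored = conn.execute(
    "SELECT file_id, byte_offset FROM frame_index ORDER BY byte_offset"
).fetchall()
print(stored, rejected)
```

With a constraint like this, a repeated indexing pass would fail loudly (or could be turned into an upsert) instead of silently storing the same frame twice.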

You can work around this by replacing "for offset in sorted(offsets):" with "for offset in np.unique(offsets):".
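As a small sketch of what that substitution changes, the snippet below applies np.unique to a short list built from offsets quoted above (the sample list itself is made up for the example). np.unique returns the distinct values in ascending order, so each frame is visited once and still in file order.

```python
import numpy as np

# Sample list with duplicates, using a few of the offsets quoted above.
offsets = [175070761, 175031957, 175031957, 175070761, 594231478, 594231478]

# np.unique both removes duplicates and sorts the result.
unique_offsets = np.unique(offsets)

# Pure-Python equivalent that keeps plain int values.
unique_offsets_py = sorted(set(offsets))

print(unique_offsets.tolist())
```

Note that np.unique yields NumPy integers rather than Python ints. If a consumer of the offsets turns out to need plain ints, sorted(set(offsets)) gives the same ordering without the type change.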
