browser in state of paralysis due to SKA-sized data set #25
Hi Tony. Maybe look into Multi-MS. You can create virtual Measurement Sets on disk that point to sub-regions of your master MS and that should be interpreted as if they were individual Measurement Sets. That way you don't have to worry about splitting and re-concatenation: your SKA-sized MS can remain intact, and hopefully the chunks will be small enough for MeqTrees etc. to digest...
Thanks Ian - I'll try that. In any case, I'm still having trouble with Calico-based selfcal and have been using the CASA gaincal and applycal tasks instead. Splitting an MS should certainly speed up those tasks.
@twillis449 could you please post a copy of the console output? Is it just those two messages?
Here's the entire output (you can generate the MS with the script make_ms_ska_batch.py in my test_fitting directory on jake). After generating the MS, I start up the meqbrowser and load turbo-sim.py. The browser then takes 20 minutes (!) before the selection GUI appears: I start up at 9:08 AM and the GUI finally appears at 9:28 AM.
[NRC-005592LX 9:08am] [iono_sims]> meqbrowser
By the way - a simple Python script that just opens the measurement set and loads the entire CORRECTED_DATA column into memory on my laptop (16 GB of memory) takes only 2 min 44 sec.
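For anyone wanting to repeat that comparison, here is a minimal sketch of such a timing script. It is not the script used above: it assumes the python-casacore package (`casacore.tables`) is installed, and the function names `load_column` and `format_elapsed` are hypothetical, chosen only for illustration.

```python
import time


def format_elapsed(seconds):
    """Format a duration in seconds as 'M min S sec'."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return "%d min %d sec" % (minutes, secs)


def load_column(ms_path, column="CORRECTED_DATA"):
    """Open an MS read-only, read one whole column, and time the read.

    Assumption: python-casacore is installed. It is imported here, inside
    the function, so the rest of the file works without it.
    """
    from casacore.tables import table

    start = time.time()
    ms = table(ms_path, readonly=True, ack=False)
    try:
        nrows = ms.nrows()
        data = ms.getcol(column)  # reads every row of the column into memory
    finally:
        ms.close()
    return nrows, data.shape, time.time() - start


if __name__ == "__main__":
    import sys

    # Usage: python time_column.py dummy.MS
    if len(sys.argv) > 1:
        nrows, shape, elapsed = load_column(sys.argv[1])
        print("%d rows, array shape %s, read in %s"
              % (nrows, shape, format_elapsed(elapsed)))
```

Running it against the same MS gives a baseline for how long a single full read of the column takes, which can then be compared with the browser start-up time.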
OK - so I have an SKA-sized data set, 35 minutes long with 0.1 sec sampling and one channel: about 19000 baselines with 21000 samples -> 400 million rows in the MS. I start up the meqbrowser and load turbo-sim.py. Despite not having a .tdl.conf file, turbo-sim seems to hunt for the first (and only) MS it can find, and then goes out to lunch for the next 16 minutes while it seems to keep reopening and rereading the MS, with the standard sorts of comments:
Using LSM module from Tigger (using svn version) at /usr/local/lib/python2.7/dist-packages/Tigger (in path)
Successful readonly open of default-locked table dummy.MS: 24 columns, 405426000 rows
blah blah ...
What's going on here? All these apparently repeated re-reads are taking forever!
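As a sanity check on the sizes quoted above, the row count can be worked out from the observation length and sampling interval. This is only a back-of-the-envelope sketch: the helper names are hypothetical, and the antenna count it arrives at is an inference from the logged row count, not something stated in this report.

```python
def n_samples(duration_s, interval_s):
    """Number of time samples in an observation."""
    return int(round(duration_s / interval_s))


def n_baselines(n_antennas):
    """Number of cross-correlation baselines (no autocorrelations)."""
    return n_antennas * (n_antennas - 1) // 2


samples = n_samples(35 * 60, 0.1)           # 35 minutes at 0.1 s sampling
rows_estimate = 19000 * samples             # the "about 19000 baselines" figure
rows_reported = 405426000                   # row count from the table-open log line
baselines_implied = rows_reported // samples

print(samples)             # 21000
print(rows_estimate)       # 399000000
print(baselines_implied)   # 19306
print(n_baselines(197))    # 19306
```

So the "about 400 million rows" estimate is consistent with the logged 405426000 rows, and that figure divides exactly into 21000 samples of 19306 baselines, which is what 197 antennas would give if autocorrelations are excluded.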