Tests for DESCQA using the DM products #127
@fjaviersanchez @rmjarvis - this is a great start! I have a few questions and other suggestions:
Also note that we have one tract of HSC XMM PDR1 that is available in the same format as the Run 1.1p coadd catalog via GCR, which means we can run a DESCQA test on both Run 1.1p coadd and HSC XMM and see a side-by-side comparison. This can also be useful for diagnosis/validation.
Nice. If we are comparing quantities that depend on PSF size, then we would have to restrict Run 1.1p to a similar seeing range as the HSC XMM field, but as long as we do that, this could be interesting.
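For concreteness, here is a minimal sketch of loading both catalogs through GCR for a side-by-side comparison. The catalog names and quantity names below are assumptions and should be checked against the GCRCatalogs registry before use.

```python
# Sketch only: catalog and quantity names are assumptions, not the registered ones.
import numpy as np
import matplotlib.pyplot as plt
import GCRCatalogs

catalogs = {
    'Run 1.1p coadd': GCRCatalogs.load_catalog('dc2_coadd_run1.1p'),  # assumed name
    'HSC XMM PDR1': GCRCatalogs.load_catalog('hsc_pdr1_xmm'),         # assumed name
}

for label, cat in catalogs.items():
    data = cat.get_quantities(['mag_i'])                              # assumed quantity name
    mag = data['mag_i'][np.isfinite(data['mag_i'])]
    plt.hist(mag, bins=50, range=(18, 26), histtype='step', density=True, label=label)

plt.xlabel('i-band magnitude')
plt.ylabel('normalized counts')
plt.legend()
plt.savefig('run1.1p_vs_hsc_xmm.png')
```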
There's already a test in a PR that computes the median background level and can include the prediction by OpSim.
Sounds good!
In principle we weren't thinking about splitting the sample but I think that's a good idea. Thanks!
Thanks! @danielsf @yymao, does the 1.2 reference catalog include the unextincted magnitudes or the extincted ones? One option would be to generate two truth catalogs (one for 1.2i and the other for 1.2p). Another solution is to generate just one catalog but include a column with the unextincted magnitudes and another with the correct extinction. The third option is just to use the extincted magnitudes and check that the PhoSim outputs are brighter than the inputs. The latter approach can, however, mask other problems...
@fjaviersanchez you mean the truth catalog, right? The magnitudes in the truth catalog do not include extinction.
Thanks @yymao! I meant the 1.2 reference catalog because I thought the truth catalog for 1.2 was not in place yet, is it? (I can only see the 1.1 truth catalog and the 1.2 reference catalog)
Ah, ok. I am not sure about the reference catalog. I would guess its magnitudes do not have extinction, but @danielsf can confirm. However, I think we should generate a truth catalog for Run 1.2 rather than use the reference catalog for validation.
Sorry, what is the distinction between reference and truth? I was thinking of the reference catalog as equivalent to a truth catalog.
@rmandelb, we had intentionally avoided doing any tests of the galaxy shapes, since the PSF will complicate the interpretation, and I thought weird sub-populations (e.g. an excess at |e|=1) would more likely be a failure of the measurement code than a failure of the image simulations. So I was deferring careful tests of shapes to the WL group. However, you are quite right that we should at least plot some very basic things like p(e) to make sure there isn't something very badly wrong with the shapes. It's just that we probably won't be able to turn any of them into proper null tests (my goal for as many of these as possible).
@rmjarvis The reference catalog contains simulated photometry and astrometry noise that is not present in the truth catalog. Also, the reference catalog only goes down to a certain depth (e.g. Gaia depth). (See https://confluence.slac.stanford.edu/x/oJgHDg)
@rmjarvis - I dithered over the question of p(|e|) or an e1 vs. e2 histogram (to look for weird orientation effects) for the same reason you mentioned, but I do think there are some useful sanity checks there. For example, we know that re-Gaussianization doesn't have a failure mode that should lead to a pileup at |e|=1; it should be a reasonably smooth distribution across that boundary (unphysical values can result from dividing two noisy quantities). Pileups at values like 0 or 1, or just plain crazy shapes, or a strong coherent direction in the e1 vs. e2 histogram, could actually tell us something about the sims.
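As a concrete starting point, here is a minimal sketch of the two sanity plots discussed above. `e1` and `e2` are assumed to be per-object ellipticity components already read from the object catalog (e.g. HSM shapes); no measurement code is run here.

```python
# Sanity plots for shapes: p(|e|) and the e1 vs. e2 distribution.
import numpy as np
import matplotlib.pyplot as plt

def ellipticity_sanity_plots(e1, e2, outfile='shape_sanity.png'):
    e1 = np.asarray(e1)
    e2 = np.asarray(e2)
    good = np.isfinite(e1) & np.isfinite(e2)
    e_mag = np.hypot(e1[good], e2[good])

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

    # p(|e|): look for pileups at 0 or 1, or anything beyond |e| = 1.
    ax1.hist(e_mag, bins=np.linspace(0, 1.2, 61), histtype='step')
    ax1.axvline(1.0, ls='--', color='k')
    ax1.set_xlabel('|e|')
    ax1.set_ylabel('N')

    # e1 vs. e2: look for a strong coherent direction.
    ax2.hist2d(e1[good], e2[good], bins=100, range=[[-1.2, 1.2], [-1.2, 1.2]])
    ax2.set_xlabel('e1')
    ax2.set_ylabel('e2')
    ax2.set_aspect('equal')

    fig.tight_layout()
    fig.savefig(outfile)

    # Crude flag: fraction of unphysical shapes beyond |e| = 1.
    return float(np.mean(e_mag > 1.0))
```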
Regarding get_predicted_bkg, the predicted sky brightness from OpSim is interesting to have, but phoSim has its own sky brightness model, so the agreement won't be perfect. That is, phoSim evaluates the sky brightness (as a function of wavelength) based on other OpSim metadata, like elevation of the observing direction, altitude of the Sun, etc., instead of somehow inferring it from the OpSim sky brightness. The phoSim and OpSim sky brightness certainly should be correlated at whatever wavelength or band the OpSim brightness corresponds to, but again, the agreement won't be perfect.
Thanks @sethdigel. Yes, that's a problem, and I believe the trick will be to have reasonable validation criteria (how different should we expect them to be: 20%? 30%?).
Good question. I'm not sure how to answer, but the scatter seems quite large (and it is probably dependent on band). In April I put together Run 1.2p OpSim metadata with basic information from the log files for phoSim r-band runs, including the numbers of photons that phoSim reported it generated. This is dominated by the sky brightness. Here is a quick plot (sorry it is not Python; I love Python, really, but pandas still seems user hostile to me). These were early runs and phoSim could have changed in some way relevant to sky brightness since then, but I was not finding vSkyBright (or filtSkyBrightness) to be a good predictor of how long a phoSim run would take. A csv file with the run metadata and phoSim photon counts is here:
Thanks for the plot and the data @sethdigel! Yes, the correlation is there, but there are really big outliers (I wonder if those were exposures with only stellar sources?). Since we have a way to compute the median sky level, we can try to convert OpSim's values to counts (or the counts to magnitudes) and set some arbitrary, but restrictive, tolerance that we can fine-tune once we get more experience. At the end of the day, the test can flag the exposures that don't meet the criteria and we can inspect them. However, we don't want to have to inspect all of them. Does this sound reasonable?
Yes, that sounds reasonable; working in terms of the medians of the e-images (if that is what you have in mind) sounds sensible for figuring out whether a given sensor visit is way off what you'd expect from the OpSim metadata. Regarding the extreme outliers in the plot, I don't have an explanation, but the sensor visits clearly did have sky emission (Mie and Rayleigh scattering from the Moon, plus airglow, and also Zodiacal light). The log file for one of these is here: http://srs.slac.stanford.edu/Pipeline-II/exp/SRS/log.jsp?pi=50705671
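A rough sketch of the kind of tolerance check being proposed is below. The conversion from OpSim sky brightness to counts per pixel (the zero point and pixel scale) is a placeholder assumption and would have to be calibrated before setting a real threshold.

```python
# Compare the median e-image level with a background predicted from OpSim metadata.
import numpy as np
from astropy.io import fits

PIXEL_SCALE = 0.2   # arcsec/pixel (LSST)
ZERO_POINT = 32.0   # assumed counts zero point for the visit; placeholder value

def predicted_sky_counts(sky_brightness_mag_per_arcsec2):
    """Convert an OpSim sky brightness (mag/arcsec^2) to counts/pixel (placeholder)."""
    counts_per_arcsec2 = 10 ** (-0.4 * (sky_brightness_mag_per_arcsec2 - ZERO_POINT))
    return counts_per_arcsec2 * PIXEL_SCALE ** 2

def check_background(eimage_path, sky_brightness, rel_tol=0.3):
    """Flag the sensor visit if the measured median background deviates from
    the prediction by more than rel_tol (fractional)."""
    with fits.open(eimage_path) as hdus:
        measured = np.median(hdus[0].data)
    predicted = predicted_sky_counts(sky_brightness)
    deviation = abs(measured - predicted) / predicted
    return deviation <= rel_tol, measured, predicted
```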
I'm not sure if this is still relevant (sorry; I was on vacation last week), but @fjaviersanchez asked if the reference catalog contained dust extinction: Yes, it does. I created the reference catalog before we diagnosed the dust problem in PhoSim. I will generate a truth catalog for Run 1.2 in the next few days.
Great. Thanks, Scott!
I think this is the place to suggest sensor-level tests? There is a wide range of tree ring amplitudes visible on sensors. See the work from @karpov-sv et al. here: https://github.com/LSSTDESC/imSim/wiki/tree_ring_validation and I have seen this in the exposure checker. @karpov-sv, when you say "The simulated data have been generated for all 189 different imSim sensor configurations using analytic formulae for pixel area variations shown above," did you actually run a full focal plane using imSim? Or does this mean you used the formula? I think it would be a nice check of the actual 1.2 imSim output to see that the maximum amplitude and so on are reasonable. I think Serge could help with this.
DC2 validation test brainstorming by @rmjarvis and @fjaviersanchez:
Image level (@rmjarvis):
Check that the images contain some pixels above the 10-sigma level (see the sketch after this list).
Calculate gain and read noise and compare with prediction.
Check masked (saturated) bits of the images.
Check masked (bad/dead) pixels -> PhoSim.
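A minimal sketch of the first image-level check (pixels above 10 sigma), assuming the image can be read as a plain FITS array; a robust sigma estimate keeps bright pixels from inflating the noise estimate.

```python
# Check that an image contains at least some pixels above N sigma over the background.
import numpy as np
from astropy.io import fits
from astropy.stats import sigma_clipped_stats

def has_bright_pixels(image_path, nsigma=10.0, min_pixels=1):
    with fits.open(image_path) as hdus:
        pixels = hdus[0].data.astype(float)
    # Sigma-clipped stats so that sources do not bias the background estimate.
    mean, median, sigma = sigma_clipped_stats(pixels)
    n_above = np.count_nonzero(pixels > median + nsigma * sigma)
    return n_above >= min_pixels
```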
Catalog (visit level):
Use stars and their PSF magnitudes to compute the `CheckAstroPhoto` test (using standalone test check in DC2-production #259). Update 09/09/18: Done in standalone code.
Star size vs. magnitude at different epochs should be flat (use HSM size/sdssShape). Use a scatter plot for every single star (see the sketch after this list). Update 09/09/18: Done in standalone code.
Given a `calexp`, select a clean stellar sample, check the PSF at each location (position of the star), and check the stacked difference (low priority).
Select a set of `calexp`s and check that the input seeing is correlated with the size of the stars appearing in them. Update 09/10/18: Done in standalone code.
DCR test: translate the shape of the star to get the shape in the zenith direction for a bunch of good stars, separate per band, and check this as a function of airmass.
DCR test: repeat the above, splitting the sample into redder and bluer stars.
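A minimal sketch of the size-vs-magnitude flatness check above; `psf_mag` and `hsm_size` are assumed to already be extracted from the source catalog for a clean stellar sample (e.g. extendedness == 0).

```python
# Fit star size vs. PSF magnitude and report the slope; a null test would
# require the slope to be consistent with zero within a few times its error.
import numpy as np

def size_vs_mag_slope(psf_mag, hsm_size, mag_range=(17.0, 22.0)):
    psf_mag = np.asarray(psf_mag)
    hsm_size = np.asarray(hsm_size)
    sel = (psf_mag > mag_range[0]) & (psf_mag < mag_range[1]) & np.isfinite(hsm_size)
    coeffs, cov = np.polyfit(psf_mag[sel], hsm_size[sel], deg=1, cov=True)
    slope, slope_err = coeffs[0], np.sqrt(cov[0, 0])
    return slope, slope_err
```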
Catalog (coadd level):
Separate stars and galaxies and use them in `CheckAstroPhoto`.
In `CheckAstroPhoto`, add the input N(m) and the output N(m), check the ratio, and see where they start to separate from each other (in progress, see here).
Check that galaxy density decreases with MW extinction (First commit of Density test #140).
Check color-color diagrams for input and output for several colors (inspect to validate) -> (Update CheckColors test to be compatible with DM outputs #141).
Red sequence test (red sequence colors (mean, scatter) as a function of redshift #41 and red sequence validation test #101).
Add input-true size as a function of true size.
Count the number of objects around a central galaxy in a given aperture (1 arcmin) and plot that as a function of the cluster richness (something in the input???); see the sketch below.
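A minimal sketch of the last item above: count neighbours within a 1 arcmin aperture around each "central" galaxy. The definition of the central sample and the richness proxy from the input catalog are assumptions here.

```python
# Count objects within a fixed angular aperture around each central galaxy.
import numpy as np
import astropy.units as u
from astropy.coordinates import SkyCoord

def neighbour_counts(central_ra, central_dec, obj_ra, obj_dec, radius=1.0 * u.arcmin):
    centrals = SkyCoord(ra=central_ra * u.deg, dec=central_dec * u.deg)
    objects = SkyCoord(ra=obj_ra * u.deg, dec=obj_dec * u.deg)
    # idx_c indexes the centrals for every (central, object) pair within `radius`.
    idx_c, idx_o, _, _ = objects.search_around_sky(centrals, radius)
    # One count per central; the central itself may need to be excluded if it
    # also appears in the object list.
    return np.bincount(idx_c, minlength=len(centrals))
```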