Merge pull request #336 from bd-j/obs_cal

Obs cal
bd-j · Aug 9, 2024 · 2ddd786 · 2ddd786
2 parents 09f11ca + 9f00d08
commit 2ddd786
Show file tree

Hide file tree

Showing 12 changed files with 507 additions and 318 deletions.
diff --git a/doc/dataformat.rst b/doc/dataformat.rst
@@ -79,7 +79,7 @@ For a single observation, you might do something like:
 .. code-block:: python
 
         def build_obs(N):
-            from prospect.data import Spectrum
+            from prospect.observation import Spectrum
             N = 1000  # number of wavelength points
             spec = Spectrum(wavelength=np.linspace(3000, 5000, N), flux=np.zeros(N), uncertainty=np.ones(N))
             # ensure that this is a valid observation for fitting
@@ -95,7 +95,7 @@ For photometry this might look like:
 .. code-block:: python
 
         def build_obs(N):
-            from prospect.data import Photometry
+            from prospect.observation import Photometry
             # valid sedpy filter names
             fnames = list([f"sdss_{b}0" for b in "ugriz"])
             Nf = len(fnames)
@@ -113,7 +113,7 @@ A tool exists to convert old combined observation dictionaries to a list of
 
 .. code-block:: python
 
-        from prospect.data import from_oldstyle
+        from prospect.observation import from_oldstyle
         # dummy observation dictionary with just a spectrum
         N = 1000
         obs = dict(wavelength=np.linspace(3000, 5000, N), spectrum=np.zeros(N), unc=np.ones(N),
@@ -129,7 +129,7 @@ The :py:meth:`build_obs` function
 
 The :py:meth:`build_obs` function in the parameter file is written by the user.
 It should take a dictionary of command line arguments as keyword arguments. It
-should return a list of :py:class:`prospect.data.Observation` instances,
+should return a list of :py:class:`prospect.observation.Observation` instances,
 described above.
 
 Other than that, the contents can be anything. Within this function you might
@@ -142,7 +142,7 @@ The point of this function is that you don't have to *externally* convert your
 data format to be what |Codename| expects and keep another version of files
 lying around: the conversion happens *within* the code itself. Again, the only
 requirement is that the function can take a ``run_params`` dictionary as keyword
-arguments and that it return :py:class:`prospect.data.Observation` instances, as
+arguments and that it return :py:class:`prospect.observation.Observation` instances, as
  described above.  Each observation instance should correspond to a particular
  dataset (e.g. a broadband photomtric SED, the spectrum from a particular
  instrument, or the spectrum from a particular night) that shares instrumental

diff --git a/doc/faq.rst b/doc/faq.rst
@@ -40,49 +40,6 @@ sampling algorithm being used, and the kind of data that you have.  Hours or
 even days per fit is not uncommon for more complex models.
 
 
-Can I fit my spectrum too?
---------------------------
-There are several extra considerations that come up when fitting spectroscopy
-
-   1) Wavelength range and resolution.
-      Prospector is based on FSPS, which uses the MILES spectral library. These
-      have a resolution of ~2.5A FWHM from 3750AA - 7200AA restframe, and much
-      lower (R~200 or so, but not actually well defined) outside this range.
-      Higher resolution data (after including both velocity dispersion and
-      instrumental resolution) or spectral points outside this range cannot yet
-      be fit.
-
-   2) Relatedly, line spread function.
-      Prospector includes methods for FFT based smoothing of the spectra,
-      assuming a gaussian LSF (in either wavelength or velocity space). There is
-      also the possibility of FFT based smoothing for wavelength dependent
-      Gaussian dispersion (i.e. sigma_lambda = f(lambda) with f possibly a
-      polynomial of lambda). In practice the smoothed spectra will be a
-      combination of the library resolution plus whatever FFT smoothing is
-      applied. Hopefully this can be made to match your actual data resolution,
-      which is a combination of the physical velocity dispersion and the
-      instrumental resolution. The smoothing is controlled by the parameters
-      `sigma_smooth`, `smooth_type`, and `fftsmooth`
-
-   3) Nebular emission.
-      While prospector/FSPS include self-consistent nebular emission, the
-      treatment is probably not flexible enough at the moment to fit high S/N,
-      high resolution data including nebular emission (e.g. due to deviations of
-      line ratios from Cloudy predictions or to complicated gas kinematics that
-      are different than stellar kinematics). Thus fitting nebular lines should
-      take adavantage of the nebular line amplitude optimization/marginalization
-      capabilities. For very low resolution data this is less of an issue.
-
-   4) Spectrophotometric calibration.
-      There are various options for dealing with the spectral continuum shape
-      depending on how well you think your spectra are calibrated and if you
-      also fit photometry to tie down the continuum shape. You can optimize out
-      a polynomial "calibration" vector, or simultaneously fit for a polynomial
-      and marginalize over the polynomial coefficients (this allows you to place
-      priors on the accuracy of the spectrophotometric calibration). Or you can
-      just take the spectrum as perfectly calibrated.
-
-
 How do I fit for redshift as well as other parameters?
 ------------------------------------------------------
 The simplest way is to just let the parameter specification for ``"zred"``

diff --git a/doc/index.rst b/doc/index.rst
@@ -30,6 +30,7 @@ Prospector allows you to:
    installation
    usage
    dataformat
+   spectra
    models
    sfhs
    nebular

diff --git a/doc/quickstart.rst b/doc/quickstart.rst
@@ -45,7 +45,7 @@ include an empty Spectrum data set to force a prediction of the full spectrum.
 .. code:: python
 
     from sedpy.observate import load_filters
-    from prospect.data import Photometry, Spectrum
+    from prospect.observation import Photometry, Spectrum
 
     filters = load_filters([f"sdss_{b}0" for b in bands])
     maggies = np.array([10**(-0.4 * cat[0][f"cModelMag_{b}"]) for b in bands])
@@ -59,6 +59,7 @@ include an empty Spectrum data set to force a prediction of the full spectrum.
     for obs in observations:
         obs.redshift = shdus[2].data[0]["z"]
 
+
 In principle we could also add noise models for the spectral and photometric
 data (e.g. to fit for the photometric noise floor), but we'll make the default
 assumption of iid Gaussian noise for the moment.
@@ -69,18 +70,25 @@ Build a Model
 
 Here we will get a default parameter set for a simple parametric SFH, and add a
 set of parameters describing nebular emission.  We'll also fix the redshift to
-the value given by SDSS. This model has 5 free parameters, each of which has a
-an associated prior distribution.  These default prior distributions can and
-should be replaced or adjusted depending on your particular science question.
+the value given by SDSS. This model has 5 free parameters, each of which has an
+associated prior distribution.  These default prior distributions can and should
+be replaced or adjusted depending on your particular science question. Here
+we'll just change the prior distribution for stellar mass, as an example.
 
 .. code:: python
 
+    # Get a baseline set of model parameters
     from prospect.models.templates import TemplateLibrary
     from prospect.models import SpecModel
     model_params = TemplateLibrary["parametric_sfh"]
     model_params.update(TemplateLibrary["nebular"])
     model_params["zred"]["init"] = obs["redshift"]
 
+    # Adjust the prior for mass, giving a wider range
+    from prospect.models import priors
+    model_params["mass"]["prior"] = priors.LogUniform(mini=1e6, maxi=1e13)
+
+    # Instantiate the model using this parameter dictionary
     model = SpecModel(model_params)
     assert len(model.free_params) == 5
     print(model)
@@ -100,37 +108,52 @@ spectral and isochrone libraries are being used.
     sps = CSPSpecBasis(zcontinuous=1)
     print(sps.ssp.libraries)
 
+For piecewise constant and other flexible SFHs use `FastStepBasis` instead of
+`CSPSpecBasis`.
 
 Make a prediction
 -----------------
 
 We can now predict our data for any set of parameters.  This will take a little
 time because fsps is building and caching the SSPs.  Subsequent calls to predict
-will be faster.  Here we'll just make the predicition for the current value of
+will be faster.  Here we'll just make the prediction for the current value of
 the free parameters.
 
 .. code:: python
 
     current_parameters = ",".join([f"{p}={v}" for p, v in zip(model.free_params, model.theta)])
     print(current_parameters)
     (spec, phot), mfrac = model.predict(model.theta, observations, sps=sps)
-    print(phot / obs["maggies"])
+    print("filter,observed,predicted")
+    for i, f in enumerate(obs["filters"]):
+        print(f"{f.name},{obs['maggies'][i]},{phot[i]}")
 
 
 Run a fit
 ---------
 
-Since we can make predictions and we have data and uncertainties, we should be
-able to construct a likelihood function.  Here we'll use the pre-defined default
+Since we can make predictions and we have (photometric) data and uncertainties,
+we should be able to construct a likelihood function, and then combine with the
+priors to sample the posterior.  Here we'll use the pre-defined default
 posterior probability function.  We also set some sampling related keywords to
-make the fit go a little faster, though it should still take of order tens of
-minutes.
+make the fit go a little faster (but give rougher posterior estimates), though
+it should still take of order tens of minutes.
 
 .. code:: python
 
     from prospect.fitting import lnprobfn, fit_model
+
+    # just the photometry
+    obs = [observations[1]]
+
+    # posterior probability of the initial parameters given the photometry
+    lnp = lnprobfn(model.theta, model, observations=obs, sps=sps)
+
+    # now do the posterior sampling
     fitting_kwargs = dict(nlive_init=400, nested_method="rwalk", nested_target_n_effective=10000, nested_dlogz_init=0.05)
-    output = fit_model(obs, model, sps, optimize=False, dynesty=True, lnprobfn=lnprobfn, **fitting_kwargs)
+    output = fit_model(obs, model, sps, lnprobfn=lnprobfn,
+                       optimize=False, dynesty=True,
+                       **fitting_kwargs)
     result, duration = output["sampling"]
 
 The ``result`` is a dictionary with keys giving the Monte Carlo samples of
@@ -146,7 +169,7 @@ as follows:
 
     from prospect.io import write_results as writer
     hfile = "./quickstart_dynesty_mcmc.h5"
-    writer.write_hdf5(hfile, {}, model, obs,
+    writer.write_hdf5(hfile, {}, model, observations,
                      output["sampling"][0], None,
                      sps=sps,
                      tsample=output["sampling"][1],
@@ -179,7 +202,7 @@ We'll mark the highest probablity posterior sample as well.
     from prospect.plotting import corner
     nsamples, ndim = out["chain"].shape
     cfig, axes = pl.subplots(ndim, ndim, figsize=(10,9))
-    axes = corner.allcorner(out["chain"].T, out["theta_labels"], axes, weights=out["weights"], color="royalblue", show_titles=True)
+    #axes = corner.allcorner(out["chain"].T, out["theta_labels"], axes, weights=out["weights"], color="royalblue", show_titles=True)
 
     from prospect.plotting.utils import best_sample
     pbest = best_sample(out)

diff --git a/doc/spectra.rst b/doc/spectra.rst
@@ -0,0 +1,64 @@
+Fitting Spectra
+================
+
+There are several extra considerations that come up when fitting spectroscopy.
+
+Wavelength range, resolution, and linespread function
+-----------------------------------------------------
+
+Prospector is based on FSPS, which uses stellar spectral libraries with given
+resolution. The empirical MILES library has a resolution of ~2.5A FWHM from
+3750AA - 7200AA restframe, and much lower (R~200 or so, but not actually well
+defined) outside this range. Higher resolution data (after including both
+velocity dispersion and instrumental resolution) or spectral points outside this
+range cannot yet be fit.
+
+Prospector includes methods for FFT based smoothing of the spectra, assuming a
+Gaussian LSF (in either wavelength or velocity space). There is also the
+possibility of FFT based smoothing for wavelength dependent Gaussian dispersion
+(i.e. sigma_lambda = f(lambda) with f possibly a polynomial of lambda). In
+practice the smoothed model spectra will be a combination of the library resolution
+plus whatever FFT smoothing is applied. Hopefully this can be made to match your
+actual data resolution, which is a combination of the physical velocity
+dispersion and the instrumental resolution. The smoothing is controlled by the
+parameters `sigma_smooth`, `smooth_type`, and `fftsmooth`
+
+For undersampled spectra, a special :py:class:`UnderSampledSpectrum` class
+exists that will integrate the model (smoothed by velocity dispersion and
+intrumental resolution) over the supplied pixels.
+
+
+Instrumental Response & Spectrophotometric Calibration
+---------------------
+
+There are various options for dealing with the spectral continuum shape
+depending on how well you think your spectra are calibrated and if you also fit
+photometry to tie down the continuum shape. You can optimize out a polynomial
+"calibration" vector, or simultaneously fit for a polynomial and marginalize
+over the polynomial coefficients (this allows you to place priors on the
+accuracy of the spectrophotometric calibration). Or you can just take the
+spectrum as perfectly calibrated.
+
+Particular treatments can be implemented using different mix-in classes, e.g.
+
+.. code-block:: python
+
+        from prospect.observation import Spectrum, PolyOptCal
+        class PolySpectrum(PolyOptCal, Spectrum):
+            pass
+        spec = PolySpectrum(wavelength=np.linspace(3000, 5000, N),
+                            flux=np.zeros(N),
+                            uncertainty=np.ones(N),
+                            polynomial_order=5)
+
+
+Nebular emission
+----------------
+
+While prospector/FSPS include self-consistent nebular emission, the treatment is
+probably not flexible enough at the moment to fit high S/N, high resolution data
+including nebular emission (e.g. due to deviations of line ratios from Cloudy
+predictions or to complicated gas kinematics that are different than stellar
+kinematics). Thus fitting nebular lines should take adavantage of the nebular
+line amplitude optimization/marginalization capabilities. For very low
+resolution data this is less of an issue.
diff --git a/prospect/io/read_results.py b/prospect/io/read_results.py
@@ -240,9 +240,9 @@ def get_sps(res):
                   "same as the FSPS libraries that you are using now ({})".format(flib, rlib))
         # If fitting and reading in are happening in different python versions,
         # ensure string comparison doesn't throw error:
-        if type(flib[0]) == 'bytes':
+        if isinstance(flib[0], bytes):
             flib = [i.decode() for i in flib]
-        if type(rlib[0]) == 'bytes':
+        if isinstance(rlib[0], bytes):
             rlib = [i.decode() for i in rlib]
         assert (flib[0] == rlib[0]) and (flib[1] == rlib[1]), liberr
 

diff --git a/prospect/io/write_results.py b/prospect/io/write_results.py
@@ -279,7 +279,7 @@ def chain_to_struct(chain, model=None, names=None, **extras):
     struct :
         A structured ndarray of parameter values.
     """
-    indict = type(chain) == dict
+    indict = isinstance(chain, dict)
     if indict:
         return dict_to_struct(chain)
     else:

diff --git a/prospect/models/__init__.py b/prospect/models/__init__.py
@@ -6,14 +6,13 @@
 """
 
 
-from .sedmodel import ProspectorParams, SpecModel
-from .sedmodel import PolySpecModel, SplineSpecModel
-from .sedmodel import AGNSpecModel
+from .parameters import ProspectorParams
+from .sedmodel import SpecModel, HyperSpecModel, AGNSpecModel
 
 
 __all__ = ["ProspectorParams",
            "SpecModel",
-           "PolySpecModel", "SplineSpecModel",
-           "LineSpecModel", "AGNSpecModel"
+           "HyperSpecModel",
+           "AGNSpecModel"
            ]
-Original file line number
+Diff line change
@@ Expand Up / @@ -30,6 +30,7 @@ Prospector allows you to: @@
        installation
        usage
        dataformat
+       spectra
        models
        sfhs
        nebular
@@ Expand Down @@