Skip to content

Commit

Permalink
Merge pull request #336 from bd-j/obs_cal
Browse files Browse the repository at this point in the history
Obs cal
  • Loading branch information
bd-j authored Aug 9, 2024
2 parents 09f11ca + 9f00d08 commit 2ddd786
Show file tree
Hide file tree
Showing 12 changed files with 507 additions and 318 deletions.
10 changes: 5 additions & 5 deletions doc/dataformat.rst
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ For a single observation, you might do something like:
.. code-block:: python
def build_obs(N):
from prospect.data import Spectrum
from prospect.observation import Spectrum
N = 1000 # number of wavelength points
spec = Spectrum(wavelength=np.linspace(3000, 5000, N), flux=np.zeros(N), uncertainty=np.ones(N))
# ensure that this is a valid observation for fitting
Expand All @@ -95,7 +95,7 @@ For photometry this might look like:
.. code-block:: python
def build_obs(N):
from prospect.data import Photometry
from prospect.observation import Photometry
# valid sedpy filter names
fnames = list([f"sdss_{b}0" for b in "ugriz"])
Nf = len(fnames)
Expand All @@ -113,7 +113,7 @@ A tool exists to convert old combined observation dictionaries to a list of

.. code-block:: python
from prospect.data import from_oldstyle
from prospect.observation import from_oldstyle
# dummy observation dictionary with just a spectrum
N = 1000
obs = dict(wavelength=np.linspace(3000, 5000, N), spectrum=np.zeros(N), unc=np.ones(N),
Expand All @@ -129,7 +129,7 @@ The :py:meth:`build_obs` function

The :py:meth:`build_obs` function in the parameter file is written by the user.
It should take a dictionary of command line arguments as keyword arguments. It
should return a list of :py:class:`prospect.data.Observation` instances,
should return a list of :py:class:`prospect.observation.Observation` instances,
described above.

Other than that, the contents can be anything. Within this function you might
Expand All @@ -142,7 +142,7 @@ The point of this function is that you don't have to *externally* convert your
data format to be what |Codename| expects and keep another version of files
lying around: the conversion happens *within* the code itself. Again, the only
requirement is that the function can take a ``run_params`` dictionary as keyword
arguments and that it return :py:class:`prospect.data.Observation` instances, as
arguments and that it return :py:class:`prospect.observation.Observation` instances, as
described above. Each observation instance should correspond to a particular
dataset (e.g. a broadband photomtric SED, the spectrum from a particular
instrument, or the spectrum from a particular night) that shares instrumental
Expand Down
43 changes: 0 additions & 43 deletions doc/faq.rst
Original file line number Diff line number Diff line change
Expand Up @@ -40,49 +40,6 @@ sampling algorithm being used, and the kind of data that you have. Hours or
even days per fit is not uncommon for more complex models.


Can I fit my spectrum too?
--------------------------
There are several extra considerations that come up when fitting spectroscopy

1) Wavelength range and resolution.
Prospector is based on FSPS, which uses the MILES spectral library. These
have a resolution of ~2.5A FWHM from 3750AA - 7200AA restframe, and much
lower (R~200 or so, but not actually well defined) outside this range.
Higher resolution data (after including both velocity dispersion and
instrumental resolution) or spectral points outside this range cannot yet
be fit.

2) Relatedly, line spread function.
Prospector includes methods for FFT based smoothing of the spectra,
assuming a gaussian LSF (in either wavelength or velocity space). There is
also the possibility of FFT based smoothing for wavelength dependent
Gaussian dispersion (i.e. sigma_lambda = f(lambda) with f possibly a
polynomial of lambda). In practice the smoothed spectra will be a
combination of the library resolution plus whatever FFT smoothing is
applied. Hopefully this can be made to match your actual data resolution,
which is a combination of the physical velocity dispersion and the
instrumental resolution. The smoothing is controlled by the parameters
`sigma_smooth`, `smooth_type`, and `fftsmooth`

3) Nebular emission.
While prospector/FSPS include self-consistent nebular emission, the
treatment is probably not flexible enough at the moment to fit high S/N,
high resolution data including nebular emission (e.g. due to deviations of
line ratios from Cloudy predictions or to complicated gas kinematics that
are different than stellar kinematics). Thus fitting nebular lines should
take adavantage of the nebular line amplitude optimization/marginalization
capabilities. For very low resolution data this is less of an issue.

4) Spectrophotometric calibration.
There are various options for dealing with the spectral continuum shape
depending on how well you think your spectra are calibrated and if you
also fit photometry to tie down the continuum shape. You can optimize out
a polynomial "calibration" vector, or simultaneously fit for a polynomial
and marginalize over the polynomial coefficients (this allows you to place
priors on the accuracy of the spectrophotometric calibration). Or you can
just take the spectrum as perfectly calibrated.


How do I fit for redshift as well as other parameters?
------------------------------------------------------
The simplest way is to just let the parameter specification for ``"zred"``
Expand Down
1 change: 1 addition & 0 deletions doc/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ Prospector allows you to:
installation
usage
dataformat
spectra
models
sfhs
nebular
Expand Down
49 changes: 36 additions & 13 deletions doc/quickstart.rst
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ include an empty Spectrum data set to force a prediction of the full spectrum.
.. code:: python
from sedpy.observate import load_filters
from prospect.data import Photometry, Spectrum
from prospect.observation import Photometry, Spectrum
filters = load_filters([f"sdss_{b}0" for b in bands])
maggies = np.array([10**(-0.4 * cat[0][f"cModelMag_{b}"]) for b in bands])
Expand All @@ -59,6 +59,7 @@ include an empty Spectrum data set to force a prediction of the full spectrum.
for obs in observations:
obs.redshift = shdus[2].data[0]["z"]
In principle we could also add noise models for the spectral and photometric
data (e.g. to fit for the photometric noise floor), but we'll make the default
assumption of iid Gaussian noise for the moment.
Expand All @@ -69,18 +70,25 @@ Build a Model

Here we will get a default parameter set for a simple parametric SFH, and add a
set of parameters describing nebular emission. We'll also fix the redshift to
the value given by SDSS. This model has 5 free parameters, each of which has a
an associated prior distribution. These default prior distributions can and
should be replaced or adjusted depending on your particular science question.
the value given by SDSS. This model has 5 free parameters, each of which has an
associated prior distribution. These default prior distributions can and should
be replaced or adjusted depending on your particular science question. Here
we'll just change the prior distribution for stellar mass, as an example.

.. code:: python
# Get a baseline set of model parameters
from prospect.models.templates import TemplateLibrary
from prospect.models import SpecModel
model_params = TemplateLibrary["parametric_sfh"]
model_params.update(TemplateLibrary["nebular"])
model_params["zred"]["init"] = obs["redshift"]
# Adjust the prior for mass, giving a wider range
from prospect.models import priors
model_params["mass"]["prior"] = priors.LogUniform(mini=1e6, maxi=1e13)
# Instantiate the model using this parameter dictionary
model = SpecModel(model_params)
assert len(model.free_params) == 5
print(model)
Expand All @@ -100,37 +108,52 @@ spectral and isochrone libraries are being used.
sps = CSPSpecBasis(zcontinuous=1)
print(sps.ssp.libraries)
For piecewise constant and other flexible SFHs use `FastStepBasis` instead of
`CSPSpecBasis`.

Make a prediction
-----------------

We can now predict our data for any set of parameters. This will take a little
time because fsps is building and caching the SSPs. Subsequent calls to predict
will be faster. Here we'll just make the predicition for the current value of
will be faster. Here we'll just make the prediction for the current value of
the free parameters.

.. code:: python
current_parameters = ",".join([f"{p}={v}" for p, v in zip(model.free_params, model.theta)])
print(current_parameters)
(spec, phot), mfrac = model.predict(model.theta, observations, sps=sps)
print(phot / obs["maggies"])
print("filter,observed,predicted")
for i, f in enumerate(obs["filters"]):
print(f"{f.name},{obs['maggies'][i]},{phot[i]}")
Run a fit
---------

Since we can make predictions and we have data and uncertainties, we should be
able to construct a likelihood function. Here we'll use the pre-defined default
Since we can make predictions and we have (photometric) data and uncertainties,
we should be able to construct a likelihood function, and then combine with the
priors to sample the posterior. Here we'll use the pre-defined default
posterior probability function. We also set some sampling related keywords to
make the fit go a little faster, though it should still take of order tens of
minutes.
make the fit go a little faster (but give rougher posterior estimates), though
it should still take of order tens of minutes.

.. code:: python
from prospect.fitting import lnprobfn, fit_model
# just the photometry
obs = [observations[1]]
# posterior probability of the initial parameters given the photometry
lnp = lnprobfn(model.theta, model, observations=obs, sps=sps)
# now do the posterior sampling
fitting_kwargs = dict(nlive_init=400, nested_method="rwalk", nested_target_n_effective=10000, nested_dlogz_init=0.05)
output = fit_model(obs, model, sps, optimize=False, dynesty=True, lnprobfn=lnprobfn, **fitting_kwargs)
output = fit_model(obs, model, sps, lnprobfn=lnprobfn,
optimize=False, dynesty=True,
**fitting_kwargs)
result, duration = output["sampling"]
The ``result`` is a dictionary with keys giving the Monte Carlo samples of
Expand All @@ -146,7 +169,7 @@ as follows:
from prospect.io import write_results as writer
hfile = "./quickstart_dynesty_mcmc.h5"
writer.write_hdf5(hfile, {}, model, obs,
writer.write_hdf5(hfile, {}, model, observations,
output["sampling"][0], None,
sps=sps,
tsample=output["sampling"][1],
Expand Down Expand Up @@ -179,7 +202,7 @@ We'll mark the highest probablity posterior sample as well.
from prospect.plotting import corner
nsamples, ndim = out["chain"].shape
cfig, axes = pl.subplots(ndim, ndim, figsize=(10,9))
axes = corner.allcorner(out["chain"].T, out["theta_labels"], axes, weights=out["weights"], color="royalblue", show_titles=True)
#axes = corner.allcorner(out["chain"].T, out["theta_labels"], axes, weights=out["weights"], color="royalblue", show_titles=True)
from prospect.plotting.utils import best_sample
pbest = best_sample(out)
Expand Down
64 changes: 64 additions & 0 deletions doc/spectra.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
Fitting Spectra
================

There are several extra considerations that come up when fitting spectroscopy.

Wavelength range, resolution, and linespread function
-----------------------------------------------------

Prospector is based on FSPS, which uses stellar spectral libraries with given
resolution. The empirical MILES library has a resolution of ~2.5A FWHM from
3750AA - 7200AA restframe, and much lower (R~200 or so, but not actually well
defined) outside this range. Higher resolution data (after including both
velocity dispersion and instrumental resolution) or spectral points outside this
range cannot yet be fit.

Prospector includes methods for FFT based smoothing of the spectra, assuming a
Gaussian LSF (in either wavelength or velocity space). There is also the
possibility of FFT based smoothing for wavelength dependent Gaussian dispersion
(i.e. sigma_lambda = f(lambda) with f possibly a polynomial of lambda). In
practice the smoothed model spectra will be a combination of the library resolution
plus whatever FFT smoothing is applied. Hopefully this can be made to match your
actual data resolution, which is a combination of the physical velocity
dispersion and the instrumental resolution. The smoothing is controlled by the
parameters `sigma_smooth`, `smooth_type`, and `fftsmooth`

For undersampled spectra, a special :py:class:`UnderSampledSpectrum` class
exists that will integrate the model (smoothed by velocity dispersion and
intrumental resolution) over the supplied pixels.


Instrumental Response & Spectrophotometric Calibration
---------------------

There are various options for dealing with the spectral continuum shape
depending on how well you think your spectra are calibrated and if you also fit
photometry to tie down the continuum shape. You can optimize out a polynomial
"calibration" vector, or simultaneously fit for a polynomial and marginalize
over the polynomial coefficients (this allows you to place priors on the
accuracy of the spectrophotometric calibration). Or you can just take the
spectrum as perfectly calibrated.

Particular treatments can be implemented using different mix-in classes, e.g.

.. code-block:: python
from prospect.observation import Spectrum, PolyOptCal
class PolySpectrum(PolyOptCal, Spectrum):
pass
spec = PolySpectrum(wavelength=np.linspace(3000, 5000, N),
flux=np.zeros(N),
uncertainty=np.ones(N),
polynomial_order=5)
Nebular emission
----------------

While prospector/FSPS include self-consistent nebular emission, the treatment is
probably not flexible enough at the moment to fit high S/N, high resolution data
including nebular emission (e.g. due to deviations of line ratios from Cloudy
predictions or to complicated gas kinematics that are different than stellar
kinematics). Thus fitting nebular lines should take adavantage of the nebular
line amplitude optimization/marginalization capabilities. For very low
resolution data this is less of an issue.
4 changes: 2 additions & 2 deletions prospect/io/read_results.py
Original file line number Diff line number Diff line change
Expand Up @@ -240,9 +240,9 @@ def get_sps(res):
"same as the FSPS libraries that you are using now ({})".format(flib, rlib))
# If fitting and reading in are happening in different python versions,
# ensure string comparison doesn't throw error:
if type(flib[0]) == 'bytes':
if isinstance(flib[0], bytes):
flib = [i.decode() for i in flib]
if type(rlib[0]) == 'bytes':
if isinstance(rlib[0], bytes):
rlib = [i.decode() for i in rlib]
assert (flib[0] == rlib[0]) and (flib[1] == rlib[1]), liberr

Expand Down
2 changes: 1 addition & 1 deletion prospect/io/write_results.py
Original file line number Diff line number Diff line change
Expand Up @@ -279,7 +279,7 @@ def chain_to_struct(chain, model=None, names=None, **extras):
struct :
A structured ndarray of parameter values.
"""
indict = type(chain) == dict
indict = isinstance(chain, dict)
if indict:
return dict_to_struct(chain)
else:
Expand Down
9 changes: 4 additions & 5 deletions prospect/models/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,14 +6,13 @@
"""


from .sedmodel import ProspectorParams, SpecModel
from .sedmodel import PolySpecModel, SplineSpecModel
from .sedmodel import AGNSpecModel
from .parameters import ProspectorParams
from .sedmodel import SpecModel, HyperSpecModel, AGNSpecModel


__all__ = ["ProspectorParams",
"SpecModel",
"PolySpecModel", "SplineSpecModel",
"LineSpecModel", "AGNSpecModel"
"HyperSpecModel",
"AGNSpecModel"
]

Loading

0 comments on commit 2ddd786

Please sign in to comment.