Statistical assessment of spatial distributions #12835

Open · wants to merge 4 commits into base: main

Conversation

mweatherley
Contributor

Objective

In #12484 the question arose of how we would actually go about testing the point-sampling methods introduced. In this PR, we introduce statistical tools for assessing the quality of spatial distributions in general and, in particular, of the ShapeSample implementations that presently exist.

Background and approach

A uniform probability distribution is one where probability is proportional to area — that is, for any given region, the probability of a sample being drawn from that region is equal to the proportion of the total area that region occupies.

It follows from this that, if one discretizes the sample space by partitioning it into labeled regions and assigning to each sample the label of the region it falls into, the discrete probability distribution sampled from the labels is a multinomial distribution with probabilities given by the proportions of the total area taken by each region of the partition.

Given, then, some probability distribution which is supposed to be uniform on some region, we can attempt to assess its uniformity by discretizing — as described above — and then performing statistical analysis of the resulting discrete distribution using Pearson's chi-squared test. The point is that, if the distribution exhibits some bias, it might be detected in the discrete distribution, which will fail to conform adequately to the associated multinomial density.
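To make the discretization step concrete, here is a minimal, self-contained sketch (not code from this PR — the names are hypothetical, and a fixed linear congruential generator stands in for a real RNG): samples from a nominally uniform source on [0, 1) are assigned to equal-width bins, and the resulting counts should track each bin's share of the total length.

```rust
// Map a sample in [0.0, 1.0) to one of `n_bins` equal-width bins;
// returns None if the sample falls outside the region (the fallible case).
fn bin_index(x: f64, n_bins: usize) -> Option<usize> {
    if (0.0..1.0).contains(&x) {
        Some((x * n_bins as f64) as usize)
    } else {
        None
    }
}

fn main() {
    // Tally 10_000 samples from a crude LCG standing in for a uniform RNG.
    let n_bins = 5;
    let mut counts = vec![0usize; n_bins];
    let mut state: u64 = 0x2545F4914F6CDD1D;
    for _ in 0..10_000 {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let x = (state >> 11) as f64 / (1u64 << 53) as f64; // in [0, 1)
        if let Some(i) = bin_index(x, n_bins) {
            counts[i] += 1;
        }
    }
    // For a uniform source, each bin should receive roughly 10_000 / 5 samples.
    println!("{counts:?}");
}
```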

Solution

This branch contains a small library that supports this process with a few parts:

/// A trait implemented by a type which discretizes the sample space of a [`Distribution`] simultaneously
/// in `N` dimensions. To sample an implementing type as a [`Distribution`], use the [`BinSampler`] wrapper
/// type.
pub trait Binned<const N: usize> {
    /// The type defining the sample space discretized by this type.
    type IntermediateValue;

    /// The inner distribution type whose samples are to be discretized.
    type InnerDistribution: Distribution<Self::IntermediateValue>;

    /// The concrete inner distribution of this distribution, used to sample into an `N`-dimensional histogram.
    fn inner_dist(&self) -> Self::InnerDistribution;

    /// A function that takes output from the inner distribution and maps it to `N` bins. This allows
    /// any implementor of `Binned` to be a [`Distribution`] — the output of the distribution is `Option<[usize; N]>`
    /// because the mapping to bins is generally fallible, producing `None` when a sample cannot be assigned a bin.
    fn bin(&self, value: Self::IntermediateValue) -> Option<[usize; N]>;

    /// Bin-sample the discretized distribution.
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> Option<[usize; N]> {
        let v = self.inner_dist().sample(rng);
        self.bin(v)
    }
}
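As a hypothetical illustration of what a `bin` implementation might look like (this is not the PR's actual implementation, and it omits the rand-based trait machinery): for the interior of a circle, equal-area annuli make natural bins, since a uniform distribution should then place roughly equal counts in each bin.

```rust
/// Equal-area annulus binning for a disk of radius `radius`: the area inside
/// radius r grows like r^2, so annuli of equal area are delimited by
/// r_k = radius * sqrt(k / n_bins), and the bin of a point (x, y) is
/// floor(n_bins * (x^2 + y^2) / radius^2).
fn disk_annulus_bin(x: f64, y: f64, radius: f64, n_bins: usize) -> Option<usize> {
    let r2 = x * x + y * y;
    if r2 >= radius * radius {
        return None; // outside the disk: no bin (the fallible case)
    }
    Some((n_bins as f64 * r2 / (radius * radius)) as usize)
}

fn main() {
    // A point at radius 0.5 in the unit disk has r^2 = 0.25, landing in bin 1 of 4.
    println!("{:?}", disk_annulus_bin(0.5, 0.0, 1.0, 4)); // Some(1)
}
```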

The preceding trait models the discretization process for arbitrary spatial distributions, but provides no metadata about what the associated multinomial densities should be; that is supported by the following additional trait:

/// A discretized ([`Binned`]) probability distribution that also has extrinsic weights associated to its bins;
/// primarily intended for use in chi-squared analysis of spatial distributions.
pub trait WithBinDistributions<const N: usize>: Binned<N> {
    /// Get the bin weights to compare with actual samples.
    fn get_bins(&self) -> [BinDistribution; N];

    /// Get the degrees of freedom of each set of bins.
    fn dfs(&self) -> [usize; N] {
        self.get_bins().map(|b| b.bins.len().saturating_sub(1))
    }
}
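The `BinDistribution` type referenced above is not shown in this excerpt; the following is a hypothetical sketch of how such a weights-based description might work, consistent with the `from_weights` calls and the `bins.len() - 1` degrees-of-freedom rule seen elsewhere in this PR:

```rust
/// A hypothetical stand-in for the PR's `BinDistribution`: expected bin
/// probabilities, built by normalizing a list of nonnegative weights.
struct BinDistribution {
    bins: Vec<f64>, // probability of each bin; sums to 1
}

impl BinDistribution {
    fn from_weights(weights: Vec<f64>) -> Self {
        let total: f64 = weights.iter().sum();
        Self {
            bins: weights.into_iter().map(|w| w / total).collect(),
        }
    }
}

fn main() {
    // Five equal weights normalize to probability 0.2 per bin, and the
    // corresponding chi-squared test has bins.len() - 1 = 4 degrees of freedom.
    let dist = BinDistribution::from_weights(vec![1.0; 5]);
    println!("{:?}, dof = {}", dist.bins, dist.bins.len().saturating_sub(1));
}
```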

Next, an N-dimensional histogram type is used to actually aggregate samples for the purposes of comparison:

/// An `N`-dimensional histogram, holding data simultaneously assessed to lie in
/// `N` different families of bins.
///
/// Constructed via its [`FromIterator`] implementation, hence by calling [`Iterator::collect`]
/// on an iterator whose items are of type `Option<[usize; N]>`. Most notably, the sample iterator
/// of [`BinSampler<T>`](super::traits::BinSampler) where `T` implements [`Binned`](super::traits::Binned)
/// produces values of this type.
pub struct Histogram<const N: usize> {
    /// The actual histogram, with the invalid items diverted to `invalid_count`
    pub(crate) inner: BTreeMap<[usize; N], usize>,

    /// The total number of valid samples in the histogram — i.e., excluding invalid items.
    pub total: usize,

    /// Count of invalid items, separate from the actual histogram.
    pub invalid_count: usize,
}
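A simplified sketch of how such a `FromIterator` implementation might separate valid samples from invalid ones (illustrative only — this is not the PR's code):

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the PR's `Histogram<N>`.
struct Histogram<const N: usize> {
    inner: BTreeMap<[usize; N], usize>,
    total: usize,
    invalid_count: usize,
}

impl<const N: usize> FromIterator<Option<[usize; N]>> for Histogram<N> {
    fn from_iter<I: IntoIterator<Item = Option<[usize; N]>>>(iter: I) -> Self {
        let mut hist = Histogram {
            inner: BTreeMap::new(),
            total: 0,
            invalid_count: 0,
        };
        for sample in iter {
            match sample {
                // A valid sample increments its bin and the running total.
                Some(bins) => {
                    *hist.inner.entry(bins).or_insert(0) += 1;
                    hist.total += 1;
                }
                // An invalid sample is counted separately.
                None => hist.invalid_count += 1,
            }
        }
        hist
    }
}

fn main() {
    let samples: [Option<[usize; 2]>; 4] = [Some([0, 1]), Some([0, 1]), Some([1, 0]), None];
    let hist: Histogram<2> = samples.into_iter().collect();
    println!("total = {}, invalid = {}", hist.total, hist.invalid_count); // total = 3, invalid = 1
}
```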

Finally, chi-squared analysis functions take these histograms (or their projections) as input and produce actual chi-squared values:

/// Compute the chi-squared goodness-of-fit test statistic for the `histogram` relative to the ideal
/// distribution described by `ideal`. Note that this is distinct from the p-value, which must be
/// assessed separately.
pub fn chi_squared_fit(histogram: &Histogram<1>, ideal: &BinDistribution) -> f64 { //... }
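For reference, the statistic itself is elementary; a self-contained sketch (hypothetical names, not the PR's implementation) with a small worked example:

```rust
/// Pearson's chi-squared goodness-of-fit statistic: given observed bin counts
/// and ideal bin probabilities, sum (observed - expected)^2 / expected over
/// the bins, where expected = total_samples * probability.
fn chi_squared(observed: &[u64], probs: &[f64]) -> f64 {
    let total: u64 = observed.iter().sum();
    observed
        .iter()
        .zip(probs)
        .map(|(&o, &p)| {
            let e = total as f64 * p;
            (o as f64 - e).powi(2) / e
        })
        .sum()
}

fn main() {
    // 100 samples over 5 equally likely bins; expected count is 20 per bin.
    // Deviations of (1, -1, 3, -3, 0) give (1 + 1 + 9 + 9 + 0) / 20 = 1.0.
    let stat = chi_squared(&[21, 19, 23, 17, 20], &[0.2; 5]);
    println!("{stat}");
}
```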

Presently, the actual testing implemented by this branch includes Binned implementations for the interiors and boundaries of Circle and Sphere. Two wrapper types, InteriorOf<T> and BoundaryOf<T>, have been introduced for implementors of ShapeSample, with the purpose of allowing the constituent sampling methods to be used directly as Distributions. This adds modularity; the library itself also operates at the level of Distributions.


Changelog

  • Moved shape_sampling.rs into a new sampling submodule of bevy_math that holds all of the rand dependencies.
  • New wrapper structs InteriorOf<T> and BoundaryOf<T> allow conversion of ShapeSample implementors into Distributions.

Discussion

Caveat emptor

The statistical tests in sampling/statistical_tests/impls.rs are marked #[ignore] so that they do not run in CI testing. They must never, ever, ever run in CI testing. The purpose of these statistical tests is that they reliably fail when something is wrong — not that they always succeed when everything is fine.

Presently, the alpha-level of each individual test is 0.001, meaning that each constituent check spuriously fails about once in a thousand runs; with the current volume of tests, this means that about 1% of the time, a failure would occur even if everything were perfect.

On the other hand, chi-squared error has the property that it grows with sample size for mismatched distributions, while remaining constant for matched ones. That is to say: statistical biases in the output should lead to the tests failing quite reliably, meaning they do not need to be run particularly often. We can use very large sample sizes to ensure this if need be.
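This growth is easy to see with idealized counts: if the true distribution deviates from the ideal by a fixed proportion, observed and expected counts both scale linearly with the sample size n, so the statistic scales linearly too (a hypothetical numeric illustration, not code from the PR):

```rust
/// Chi-squared for a two-bin check where the true split is 60/40 but the
/// ideal is 50/50: expected counts are n/2 per bin, so the statistic works
/// out to exactly 0.04 * n — it grows linearly in the sample size.
fn chi_squared_for_biased_split(n: f64) -> f64 {
    let observed = [0.6 * n, 0.4 * n]; // idealized biased counts
    let expected = n / 2.0;
    observed.iter().map(|o| (o - expected).powi(2) / expected).sum()
}

fn main() {
    // Tenfold more samples means a tenfold larger statistic for the same bias,
    // so large sample sizes make a genuinely biased distribution fail reliably.
    println!("{}", chi_squared_for_biased_split(100.0)); // 4
    println!("{}", chi_squared_for_biased_split(1000.0)); // 40
}
```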

Personally, I am not sure what the best way of using these tests would be other than running them manually. Presently, this can be done as follows:

cargo test -p bevy_math -- --ignored

What?

I'm sure this looks like building a death ray to kill an ant. In a sense, it is. Frankly, the reason I made this isn't that I wanted to (not that I didn't enjoy myself), but rather that I couldn't think of any other way to externally assess the quality of our sampling code that was actually meaningful. For example, using a fixed-seed RNG and comparing output to some known values doesn't really demonstrate anything (and, in fact, breaks spuriously when the code is refactored).

@NthTensor NthTensor added C-Testing A change that impacts how we test Bevy or how users test their apps A-Math Fundamental domain-agnostic mathematical operations labels Apr 1, 2024
@NthTensor
Contributor

Chi-squared, Nice! Looks pretty good at a glance, I will have time to review later in the week.

github-merge-queue bot pushed a commit that referenced this pull request May 22, 2024
Stolen from #12835. 

# Objective

Sometimes you want to sample a whole bunch of points from a shape
instead of just one. You can write your own loop to do this, but it's
really more idiomatic to use a `rand`
[`Distribution`](https://docs.rs/rand/latest/rand/distributions/trait.Distribution.html)
with the `sample_iter` method. Distributions also support other useful
things like mapping, and they are suitable as generic items for
consumption by other APIs.

## Solution

`ShapeSample` has been given two new automatic trait methods,
`interior_dist` and `boundary_dist`. They both have similar signatures
(recall that `Output` is the output type for `ShapeSample`):
```rust
fn interior_dist(self) -> impl Distribution<Self::Output>
where Self: Sized { //... }
```

These have default implementations which are powered by wrapper structs
`InteriorOf` and `BoundaryOf` that actually implement `Distribution` —
the implementations effectively just call `ShapeSample::sample_interior`
and `ShapeSample::sample_boundary` on the contained type.

The upshot is that this allows iteration as follows:
```rust
// Get an iterator over boundary points of a rectangle:
let rectangle = Rectangle::new(1.0, 2.0);
let boundary_iter = rectangle.boundary_dist().sample_iter(rng);
// Collect a bunch of boundary points at once:
let boundary_pts: Vec<Vec2> = boundary_iter.take(1000).collect();
```

Alternatively, you can use `InteriorOf`/`BoundaryOf` explicitly to
similar effect:
```rust
let boundary_pts: Vec<Vec2> = BoundaryOf(rectangle).sample_iter(rng).take(1000).collect();
```

---

## Changelog

- Added `InteriorOf` and `BoundaryOf` distribution wrapper structs in
`bevy_math::sampling::shape_sampling`.
- Added `interior_dist` and `boundary_dist` automatic trait methods to
`ShapeSample`.
- Made `shape_sampling` module public with explanatory documentation.

---

## Discussion

### Design choices

The main point of interest here is just the choice of `impl
Distribution` instead of explicitly using `InteriorOf`/`BoundaryOf`
return types for `interior_dist` and `boundary_dist`. The reason for
this choice is that it allows future optimizations for repeated sampling
— for example, instead of just wrapping the base type,
`interior_dist`/`boundary_dist` could construct auxiliary data that is
held over between sampling operations.
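A hypothetical sketch of that kind of optimization (using plain methods rather than the real rand traits, and not taken from the PR): a sampler for a triangle's interior can precompute its edge vectors once at construction and reuse them for every subsequent sample.

```rust
/// A sampler that holds auxiliary data over between sampling operations:
/// the triangle's edge vectors are computed once, in `new`, not per sample.
struct TriangleInteriorSampler {
    a: [f64; 2],
    ab: [f64; 2], // precomputed edge vector a -> b
    ac: [f64; 2], // precomputed edge vector a -> c
}

impl TriangleInteriorSampler {
    fn new(a: [f64; 2], b: [f64; 2], c: [f64; 2]) -> Self {
        Self {
            a,
            ab: [b[0] - a[0], b[1] - a[1]],
            ac: [c[0] - a[0], c[1] - a[1]],
        }
    }

    /// Uniform sample in the triangle from two unit uniforms (u, v):
    /// reflect (u, v) into the lower-left half of the unit square, then map
    /// through the precomputed edge basis.
    fn sample(&self, mut u: f64, mut v: f64) -> [f64; 2] {
        if u + v > 1.0 {
            u = 1.0 - u;
            v = 1.0 - v;
        }
        [
            self.a[0] + u * self.ab[0] + v * self.ac[0],
            self.a[1] + u * self.ab[1] + v * self.ac[1],
        ]
    }
}

fn main() {
    let tri = TriangleInteriorSampler::new([0.0, 0.0], [1.0, 0.0], [0.0, 1.0]);
    // (0.75, 0.75) reflects to (0.25, 0.25), which lies inside the triangle.
    println!("{:?}", tri.sample(0.75, 0.75)); // [0.25, 0.25]
}
```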
@janhohenheim janhohenheim added the S-Needs-Review Needs reviewer attention (from anyone!) to move forward label Jul 17, 2024
@cart
Member

cart commented Aug 23, 2024

I'm thinking this code/module should probably be behind a feature flag. I'm not convinced bevy devs need this compiled into their apps by default.

#[derive(Debug, Error)]
#[error("One or more provided dimensions {dimensions:?} was outside of the range 0..{ambient_dimension}")]
pub struct InvalidDimensionError {
    ambient_dimension: usize,
Member


These fields should be pub.

/// of freedom exactly; as a result, index zero is just a placeholder value.
///
/// Source: [NIST](https://www.itl.nist.gov/div898/handbook/eda/section3/eda3674.htm)
pub const CHI_SQUARED_CRIT_VALUES_EMINUS3: [f64; 101] = [
Member


I'm a bit annoyed at the magic constants rather than computing these as needed, but I'm not sure that it's better.

use super::{chi_squared_fit, chi_squared_independence, BinDistribution, Histogram};
use std::collections::BTreeMap;

const SAMPLES_2D: [Option<[usize; 2]>; 6] = [
Member


Comment here about why the values were chosen please.

let ideal_first = BinDistribution::from_weights(vec![1.; 2]);
let ideal_second = BinDistribution::from_weights(vec![1.; 2]);
let chi_squared = chi_squared_independence(&histogram, &[ideal_first, ideal_second]);
assert!((chi_squared - 10. / 3.).abs() < 1e-7);
Member


This very much needs a comment.

// Uniform distribution on 5 bins 0..5
let ideal = BinDistribution::from_weights(vec![1.; 5]);
let chi_squared = chi_squared_fit(&histogram, &ideal);
assert!((chi_squared - 3.5).abs() < 1e-7);
Member


Comment please.

fn histogram_project_two_duplicate() {
let histogram: Histogram<2> = SAMPLES_2D.into_iter().collect();

// Verify that diagonal projections work how one would expect.
Member


This isn't clear.

// Critical Values //
//-----------------//

/// Critical values of the chi-squared distribution for ascending degrees of freedom at an alpha
Member


It would be nice to have a line in here about what a critical value is, and maybe a doc test about how it would be used.

where
T: Binned<N> + WithBinDistributions<N> + Copy,
{
let rng = StdRng::from_entropy();
Member


I don't think we should be using unseeded RNG in our automated tests unfortunately.


/// A discretized distribution for the boundary of a [`Sphere`].
#[derive(Clone, Copy)]
pub struct SphereBoundaryBins {
Member

@alice-i-cecile alice-i-cecile Aug 23, 2024


We should just be able to make these fields pub right?

Member

@alice-i-cecile alice-i-cecile left a comment


Definitely agree on the feature flag here: I think that InteriorOf / BoundaryOf can stay unflagged, but the statistical tests should be flagged.

As for the content of this PR, the math makes sense to me, and I think it's valuable to have tests for this. I think this could use a bit more care / explanation on how we expose this (these types are pub, but their actual usage is all in the private tests). A module-level doc test would be nice. Alternatively, we could decide that this is all private test code and cfg(test) the whole lot.

Indexing into an array is a bit dicey for gathering critical values / performing tests: perhaps we can write something a bit safer? Non-blocking though: this doesn't need to be super robust.

Unseeded RNG in tests is also a blocker, even if I expect spurious failures to be rare.

@janhohenheim
Member

janhohenheim commented Aug 24, 2024

Regarding spurious failures: at least the false-negative rate could be calculated exactly in this case 😛
Jokes aside, I agree with Alice on all points, I just wanted to mention that I think this PR is really really cool.
Math checks out at a cursory glance, although I didn't double-check any constants used.
Not entirely sure if we need to explain what a critical value is, otherwise this might quickly turn into a statistics 101.
But yeah, when in doubt, add more documentation :)

@janhohenheim janhohenheim added S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Aug 24, 2024