Statistical assessment of spatial distributions #12835

Open · wants to merge 4 commits into base: main

Conversation

mweatherley
Contributor

Objective

In #12484 the question arose of how we would actually go about testing the point-sampling methods introduced. In this PR, we introduce statistical tools for assessing the quality of spatial distributions in general and, in particular, of the ShapeSample implementations that presently exist.

Background and approach

A uniform probability distribution is one where probability is proportional to area — that is, for any given region, the probability of a sample being drawn from that region is equal to the proportion of the total area that region occupies.

It follows from this that, if one discretizes the sample space by partitioning it into labeled regions and assigning to each sample the label of the region it falls into, the discrete probability distribution sampled from the labels is a multinomial distribution with probabilities given by the proportions of the total area taken by each region of the partition.

Given, then, some probability distribution which is supposed to be uniform on some region, we can attempt to assess its uniformity by discretizing — as described above — and then performing statistical analysis of the resulting discrete distribution using Pearson's chi-squared test. The point is that, if the distribution exhibits some bias, it might be detected in the discrete distribution, which will fail to conform adequately to the associated multinomial density.
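To make the discretization step concrete, here is a minimal, self-contained sketch (not code from this PR — the names are hypothetical, and a fixed linear congruential generator stands in for a real RNG): samples from a nominally uniform source on [0, 1) are assigned to equal-width bins, and the resulting counts should track each bin's share of the total length.

```rust
// Map a sample in [0.0, 1.0) to one of `n_bins` equal-width bins;
// returns None if the sample falls outside the region (the fallible case).
fn bin_index(x: f64, n_bins: usize) -> Option<usize> {
    if (0.0..1.0).contains(&x) {
        Some((x * n_bins as f64) as usize)
    } else {
        None
    }
}

fn main() {
    // Tally 10_000 samples from a crude LCG standing in for a uniform RNG.
    let n_bins = 5;
    let mut counts = vec![0usize; n_bins];
    let mut state: u64 = 0x2545F4914F6CDD1D;
    for _ in 0..10_000 {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let x = (state >> 11) as f64 / (1u64 << 53) as f64; // in [0, 1)
        if let Some(i) = bin_index(x, n_bins) {
            counts[i] += 1;
        }
    }
    // For a uniform source, each bin should receive roughly 10_000 / 5 samples.
    println!("{counts:?}");
}
```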

Solution

This branch contains a small library that supports this process with a few parts:

/// A trait implemented by a type which discretizes the sample space of a [`Distribution`] simultaneously
/// in `N` dimensions. To sample an implementing type as a [`Distribution`], use the [`BinSampler`] wrapper
/// type.
pub trait Binned<const N: usize> {
    /// The type defining the sample space discretized by this type.
    type IntermediateValue;

    /// The inner distribution type whose samples are to be discretized.
    type InnerDistribution: Distribution<Self::IntermediateValue>;

    /// The concrete inner distribution of this distribution, used to sample into an `N`-dimensional histogram.
    fn inner_dist(&self) -> Self::InnerDistribution;

    /// A function that takes output from the inner distribution and maps it to `N` bins. This allows
    /// any implementor of `Binned` to be a [`Distribution`] — the output of the distribution is `Option<[usize; N]>`
    /// because the mapping to bins is generally fallible, producing `None` when a sample cannot be assigned a bin.
    fn bin(&self, value: Self::IntermediateValue) -> Option<[usize; N]>;

    /// Bin-sample the discretized distribution.
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> Option<[usize; N]> {
        let v = self.inner_dist().sample(rng);
        self.bin(v)
    }
}
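As a hypothetical illustration of what a `bin` implementation might look like (this is not the PR's actual implementation, and it omits the rand-based trait machinery): for the interior of a circle, equal-area annuli make natural bins, since a uniform distribution should then place roughly equal counts in each bin.

```rust
/// Equal-area annulus binning for a disk of radius `radius`: the area inside
/// radius r grows like r^2, so annuli of equal area are delimited by
/// r_k = radius * sqrt(k / n_bins), and the bin of a point (x, y) is
/// floor(n_bins * (x^2 + y^2) / radius^2).
fn disk_annulus_bin(x: f64, y: f64, radius: f64, n_bins: usize) -> Option<usize> {
    let r2 = x * x + y * y;
    if r2 >= radius * radius {
        return None; // outside the disk: no bin (the fallible case)
    }
    Some((n_bins as f64 * r2 / (radius * radius)) as usize)
}

fn main() {
    // A point at radius 0.5 in the unit disk has r^2 = 0.25, landing in bin 1 of 4.
    println!("{:?}", disk_annulus_bin(0.5, 0.0, 1.0, 4)); // Some(1)
}
```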

The preceding trait models the discretization process for arbitrary spatial distributions, but provides no metadata about what the associated multinomial densities should be; that is supported by the following additional trait:

/// A discretized ([`Binned`]) probability distribution that also has extrinsic weights associated to its bins;
/// primarily intended for use in chi-squared analysis of spatial distributions.
pub trait WithBinDistributions<const N: usize>: Binned<N> {
    /// Get the bin weights to compare with actual samples.
    fn get_bins(&self) -> [BinDistribution; N];

    /// Get the degrees of freedom of each set of bins.
    fn dfs(&self) -> [usize; N] {
        self.get_bins().map(|b| b.bins.len().saturating_sub(1))
    }
}
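The `BinDistribution` type referenced above is not shown in this excerpt; the following is a hypothetical sketch of how such a weights-based description might work, consistent with the `from_weights` calls and the `bins.len() - 1` degrees-of-freedom rule seen elsewhere in this PR:

```rust
/// A hypothetical stand-in for the PR's `BinDistribution`: expected bin
/// probabilities, built by normalizing a list of nonnegative weights.
struct BinDistribution {
    bins: Vec<f64>, // probability of each bin; sums to 1
}

impl BinDistribution {
    fn from_weights(weights: Vec<f64>) -> Self {
        let total: f64 = weights.iter().sum();
        Self {
            bins: weights.into_iter().map(|w| w / total).collect(),
        }
    }
}

fn main() {
    // Five equal weights normalize to probability 0.2 per bin, and the
    // corresponding chi-squared test has bins.len() - 1 = 4 degrees of freedom.
    let dist = BinDistribution::from_weights(vec![1.0; 5]);
    println!("{:?}, dof = {}", dist.bins, dist.bins.len().saturating_sub(1));
}
```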

Next, an N-dimensional histogram type is used to actually aggregate samples for the purposes of comparison:

/// An `N`-dimensional histogram, holding data simultaneously assessed to lie in
/// `N` different families of bins.
///
/// Constructed via its [`FromIterator`] implementation, hence by calling [`Iterator::collect`]
/// on an iterator whose items are of type `Option<[usize; N]>`. Most notably, the sample iterator
/// of [`BinSampler<T>`](super::traits::BinSampler) where `T` implements [`Binned`](super::traits::Binned)
/// produces values of this type.
pub struct Histogram<const N: usize> {
    /// The actual histogram, with the invalid items diverted to `invalid_count`
    pub(crate) inner: BTreeMap<[usize; N], usize>,

    /// The total number of valid samples in the histogram — i.e., excluding invalid items.
    pub total: usize,

    /// Count of invalid items, separate from the actual histogram.
    pub invalid_count: usize,
}
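A simplified sketch of how such a `FromIterator` implementation might separate valid samples from invalid ones (illustrative only — this is not the PR's code):

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the PR's `Histogram<N>`.
struct Histogram<const N: usize> {
    inner: BTreeMap<[usize; N], usize>,
    total: usize,
    invalid_count: usize,
}

impl<const N: usize> FromIterator<Option<[usize; N]>> for Histogram<N> {
    fn from_iter<I: IntoIterator<Item = Option<[usize; N]>>>(iter: I) -> Self {
        let mut hist = Histogram {
            inner: BTreeMap::new(),
            total: 0,
            invalid_count: 0,
        };
        for sample in iter {
            match sample {
                // A valid sample increments its bin and the running total.
                Some(bins) => {
                    *hist.inner.entry(bins).or_insert(0) += 1;
                    hist.total += 1;
                }
                // An invalid sample is counted separately.
                None => hist.invalid_count += 1,
            }
        }
        hist
    }
}

fn main() {
    let samples: [Option<[usize; 2]>; 4] = [Some([0, 1]), Some([0, 1]), Some([1, 0]), None];
    let hist: Histogram<2> = samples.into_iter().collect();
    println!("total = {}, invalid = {}", hist.total, hist.invalid_count); // total = 3, invalid = 1
}
```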

Finally, chi-squared analysis functions take these histograms (or their projections) as input and produce actual chi-squared values:

/// Compute the chi-squared goodness-of-fit test statistic for the `histogram` relative to the ideal
/// distribution described by `ideal`. Note that this is distinct from the p-value, which must be
/// assessed separately.
pub fn chi_squared_fit(histogram: &Histogram<1>, ideal: &BinDistribution) -> f64 { //... }
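For reference, the statistic itself is elementary; a self-contained sketch (hypothetical names, not the PR's implementation) with a small worked example:

```rust
/// Pearson's chi-squared goodness-of-fit statistic: given observed bin counts
/// and ideal bin probabilities, sum (observed - expected)^2 / expected over
/// the bins, where expected = total_samples * probability.
fn chi_squared(observed: &[u64], probs: &[f64]) -> f64 {
    let total: u64 = observed.iter().sum();
    observed
        .iter()
        .zip(probs)
        .map(|(&o, &p)| {
            let e = total as f64 * p;
            (o as f64 - e).powi(2) / e
        })
        .sum()
}

fn main() {
    // 100 samples over 5 equally likely bins; expected count is 20 per bin.
    // Deviations of (1, -1, 3, -3, 0) give (1 + 1 + 9 + 9 + 0) / 20 = 1.0.
    let stat = chi_squared(&[21, 19, 23, 17, 20], &[0.2; 5]);
    println!("{stat}");
}
```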

Presently, the actual testing implemented by this branch includes Binned implementations for the interiors and boundaries of Circle and Sphere. Two wrapper types, InteriorOf<T> and BoundaryOf<T>, have been introduced for implementors of ShapeSample, with the purpose of allowing the constituent sampling methods to be used directly as Distributions. This adds modularity; the library itself also operates at the level of Distributions.


Changelog

  • Moved shape_sampling.rs into a new sampling submodule of bevy_math that holds all of the rand dependencies.
  • New wrapper structs InteriorOf<T> and BoundaryOf<T> allow conversion of ShapeSample implementors into Distributions.

Discussion

Caveat emptor

The statistical tests in sampling/statistical_tests/impls.rs are marked #[ignore] so that they do not run in CI testing. They must never, ever, ever run in CI testing. The purpose of these statistical tests is that they reliably fail when something is wrong — not that they always succeed when everything is fine.

Presently, the alpha-level of each individual test is 0.001, meaning that each constituent check spuriously fails about once in a thousand runs; with the current volume of tests, this means that about 1% of the time, a failure would occur even if everything were perfect.

On the other hand, chi-squared error has the property that it grows with sample size for mismatched distributions, while remaining constant for matched ones. That is to say: statistical biases in the output should lead to the tests failing quite reliably, meaning they do not need to be run particularly often. We can use very large sample sizes to ensure this if need be.
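This growth is easy to see with idealized counts: if the true distribution deviates from the ideal by a fixed proportion, observed and expected counts both scale linearly with the sample size n, so the statistic scales linearly too (a hypothetical numeric illustration, not code from the PR):

```rust
/// Chi-squared for a two-bin check where the true split is 60/40 but the
/// ideal is 50/50: expected counts are n/2 per bin, so the statistic works
/// out to exactly 0.04 * n — it grows linearly in the sample size.
fn chi_squared_for_biased_split(n: f64) -> f64 {
    let observed = [0.6 * n, 0.4 * n]; // idealized biased counts
    let expected = n / 2.0;
    observed.iter().map(|o| (o - expected).powi(2) / expected).sum()
}

fn main() {
    // Tenfold more samples means a tenfold larger statistic for the same bias,
    // so large sample sizes make a genuinely biased distribution fail reliably.
    println!("{}", chi_squared_for_biased_split(100.0)); // 4
    println!("{}", chi_squared_for_biased_split(1000.0)); // 40
}
```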

Personally, I am not sure what the best way of using these tests would be other than running them manually. Presently, this can be done as follows:

cargo test -p bevy_math -- --ignored

What?

I'm sure this looks like building a death ray to kill an ant. In a sense, it is. Frankly, the reason I made this isn't that I wanted to (not that I didn't enjoy myself), but rather that I couldn't think of any other way to externally assess the quality of our sampling code that was actually meaningful. For example, using a fixed-seed RNG and comparing output to some known values doesn't really demonstrate anything (and, in fact, breaks spuriously when the code is refactored).

@NthTensor NthTensor added C-Testing A change that impacts how we test Bevy or how users test their apps A-Math Fundamental domain-agnostic mathematical operations labels Apr 1, 2024
@NthTensor
Contributor

Chi-squared, Nice! Looks pretty good at a glance, I will have time to review later in the week.

github-merge-queue bot pushed a commit that referenced this pull request May 22, 2024
Stolen from #12835. 

# Objective

Sometimes you want to sample a whole bunch of points from a shape
instead of just one. You can write your own loop to do this, but it's
really more idiomatic to use a `rand`
[`Distribution`](https://docs.rs/rand/latest/rand/distributions/trait.Distribution.html)
with the `sample_iter` method. Distributions also support other useful
things like mapping, and they are suitable as generic items for
consumption by other APIs.

## Solution

`ShapeSample` has been given two new automatic trait methods,
`interior_dist` and `boundary_dist`. They both have similar signatures
(recall that `Output` is the output type for `ShapeSample`):
```rust
fn interior_dist(self) -> impl Distribution<Self::Output>
where Self: Sized { //... }
```

These have default implementations which are powered by wrapper structs
`InteriorOf` and `BoundaryOf` that actually implement `Distribution` —
the implementations effectively just call `ShapeSample::sample_interior`
and `ShapeSample::sample_boundary` on the contained type.

The upshot is that this allows iteration as follows:
```rust
// Get an iterator over boundary points of a rectangle:
let rectangle = Rectangle::new(1.0, 2.0);
let boundary_iter = rectangle.boundary_dist().sample_iter(rng);
// Collect a bunch of boundary points at once:
let boundary_pts: Vec<Vec2> = boundary_iter.take(1000).collect();
```

Alternatively, you can use `InteriorOf`/`BoundaryOf` explicitly to
similar effect:
```rust
let boundary_pts: Vec<Vec2> = BoundaryOf(rectangle).sample_iter(rng).take(1000).collect();
```

---

## Changelog

- Added `InteriorOf` and `BoundaryOf` distribution wrapper structs in
`bevy_math::sampling::shape_sampling`.
- Added `interior_dist` and `boundary_dist` automatic trait methods to
`ShapeSample`.
- Made `shape_sampling` module public with explanatory documentation.

---

## Discussion

### Design choices

The main point of interest here is just the choice of `impl
Distribution` instead of explicitly using `InteriorOf`/`BoundaryOf`
return types for `interior_dist` and `boundary_dist`. The reason for
this choice is that it allows future optimizations for repeated sampling
— for example, instead of just wrapping the base type,
`interior_dist`/`boundary_dist` could construct auxiliary data that is
held over between sampling operations.
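A hypothetical sketch of that kind of optimization (using plain methods rather than the real rand traits, and not taken from the PR): a sampler for a triangle's interior can precompute its edge vectors once at construction and reuse them for every subsequent sample.

```rust
/// A sampler that holds auxiliary data over between sampling operations:
/// the triangle's edge vectors are computed once, in `new`, not per sample.
struct TriangleInteriorSampler {
    a: [f64; 2],
    ab: [f64; 2], // precomputed edge vector a -> b
    ac: [f64; 2], // precomputed edge vector a -> c
}

impl TriangleInteriorSampler {
    fn new(a: [f64; 2], b: [f64; 2], c: [f64; 2]) -> Self {
        Self {
            a,
            ab: [b[0] - a[0], b[1] - a[1]],
            ac: [c[0] - a[0], c[1] - a[1]],
        }
    }

    /// Uniform sample in the triangle from two unit uniforms (u, v):
    /// reflect (u, v) into the lower-left half of the unit square, then map
    /// through the precomputed edge basis.
    fn sample(&self, mut u: f64, mut v: f64) -> [f64; 2] {
        if u + v > 1.0 {
            u = 1.0 - u;
            v = 1.0 - v;
        }
        [
            self.a[0] + u * self.ab[0] + v * self.ac[0],
            self.a[1] + u * self.ab[1] + v * self.ac[1],
        ]
    }
}

fn main() {
    let tri = TriangleInteriorSampler::new([0.0, 0.0], [1.0, 0.0], [0.0, 1.0]);
    // (0.75, 0.75) reflects to (0.25, 0.25), which lies inside the triangle.
    println!("{:?}", tri.sample(0.75, 0.75)); // [0.25, 0.25]
}
```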
@janhohenheim janhohenheim added the S-Needs-Review Needs reviewer attention (from anyone!) to move forward label Jul 17, 2024
@cart
Member

cart commented Aug 23, 2024

I'm thinking this code/module should probably be behind a feature flag. I'm not convinced bevy devs need this compiled into their apps by default.

#[derive(Debug, Error)]
#[error("One or more provided dimensions {dimensions:?} was outside of the range 0..{ambient_dimension}")]
pub struct InvalidDimensionError {
    ambient_dimension: usize,
Member


These fields should be pub.

/// of freedom exactly; as a result, index zero is just a placeholder value.
///
/// Source: [NIST](https://www.itl.nist.gov/div898/handbook/eda/section3/eda3674.htm)
pub const CHI_SQUARED_CRIT_VALUES_EMINUS3: [f64; 101] = [
Member


I'm a bit annoyed at the magic constants rather than computing these as needed, but I'm not sure that it's better.

use super::{chi_squared_fit, chi_squared_independence, BinDistribution, Histogram};
use std::collections::BTreeMap;

const SAMPLES_2D: [Option<[usize; 2]>; 6] = [
Member


Comment here about why the values were chosen please.

let ideal_first = BinDistribution::from_weights(vec![1.; 2]);
let ideal_second = BinDistribution::from_weights(vec![1.; 2]);
let chi_squared = chi_squared_independence(&histogram, &[ideal_first, ideal_second]);
assert!((chi_squared - 10. / 3.).abs() < 1e-7);
Member


This very much needs a comment.

// Uniform distribution on 5 bins 0..5
let ideal = BinDistribution::from_weights(vec![1.; 5]);
let chi_squared = chi_squared_fit(&histogram, &ideal);
assert!((chi_squared - 3.5).abs() < 1e-7);
Member


Comment please.

fn histogram_project_two_duplicate() {
let histogram: Histogram<2> = SAMPLES_2D.into_iter().collect();

// Verify that diagonal projections work how one would expect.
Member


This isn't clear.

// Critical Values //
//-----------------//

/// Critical values of the chi-squared distribution for ascending degrees of freedom at an alpha
Member


It would be nice to have a line in here about what a critical value is, and maybe a doc test about how it would be used.

where
T: Binned<N> + WithBinDistributions<N> + Copy,
{
let rng = StdRng::from_entropy();
Member


I don't think we should be using unseeded RNG in our automated tests unfortunately.


/// A discretized distribution for the boundary of a [`Sphere`].
#[derive(Clone, Copy)]
pub struct SphereBoundaryBins {
Member

@alice-i-cecile alice-i-cecile Aug 23, 2024


We should just be able to make these fields pub right?

Member

@alice-i-cecile alice-i-cecile left a comment


Definitely agree on the feature flag here: I think that InteriorOf / BoundaryOf can stay unflagged, but the statistical tests should be flagged.

As for the content of this PR, the math makes sense to me, and I think it's valuable to have tests for this. I think this could use a bit more care / explanation on how we expose this (these types are pub, but their actual usage is all in the private tests). A module-level doc test would be nice. Alternatively, we could decide that this is all private test code and cfg(test) the whole lot.

Indexing into an array is a bit dicey for gathering critical values / performing tests: perhaps we can write something a bit safer? Non-blocking though: this doesn't need to be super robust.

Unseeded RNG in tests is also a blocker, even if I expect spurious failures to be rare.

@janhohenheim
Member

janhohenheim commented Aug 24, 2024

Regarding spurious failures: at least the false-negative rate could be calculated exactly in this case 😛
Jokes aside, I agree with Alice on all points, I just wanted to mention that I think this PR is really really cool.
Math checks out at a cursory glance, although I didn't double-check any constants used.
Not entirely sure if we need to explain what a critical value is, otherwise this might quickly turn into a statistics 101.
But yeah, when in doubt, add more documentation :)

@janhohenheim janhohenheim added S-Waiting-on-Author The author needs to make changes or address concerns before this can be merged and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Aug 24, 2024