V1.2 proposal #61

idontgetoutmuch · 2020-05-13T11:47:46Z

Context

Following @lehins' performance analysis of Haskell pseudo-random number libraries and the ensuing discussion, @lehins, @idontgetoutmuch and @curiousleo with help from @Shimuuar set out to improve random as both an interface for and implementation of a pseudo-random number generator for Haskell.

Our goals were to fix #25 (filed in 2015) and #51 (filed in 2018), see "Quality" and "Performance" below.

In the process of tackling these two issues, we addressed a number of other issues too (see "Other issues addressed" below) and added a monadic interface to the library so monadic pseudo-random number generators can be used interchangeably with random, see "API changes" below.

This PR is the result of that effort. The changes are considerable. To signal this, we propose to release this as version 1.2 (the previous released version is 1.1, from 2014).

However, the API changes are generally backwards-compatible, see "Compatibility" below.

Quality (#25)

We created an environment for running statistical pseudo-random number generator tests, tested random v1.1 and splitmix using dieharder, TestU01, PractRand and other test suites and recorded the results.

The results clearly show that the split operation in random v1.1 produces pseudo-random number generators which are correlated, corroborating #25. The split operation in splitmix showed no weakness in our tests.

As a result, we replaced the pseudo-random number generator implementation in random by the one provided by splitmix.

Performance (#51)

@lehins' performance analysis has the data for random v1.1. It is slow, and using faster pseudo-random number generators via random v1.1 makes them slow.

By switching to splitmix and improving the API, this PR speeds up pseudo-random number generation with random by one to three orders of magnitude, depending on the number type. See Benchmarks for details.

API changes

`MonadRandom`

The major API addition in this PR is the definition of a new class MonadRandom:

-- | 'MonadRandom' is an interface to monadic pseudo-random number generators.
class Monad m => MonadRandom g s m | g m -> s where
  {-# MINIMAL freezeGen,thawGen,(uniformWord32|uniformWord64) #-}

  type Frozen g = (f :: Type) | f -> g
  freezeGen :: g s -> m (Frozen g)
  thawGen :: Frozen g -> m (g s)

  uniformWord32 :: g s -> m Word32 -- default implementation in terms of uniformWord64
  uniformWord64 :: g s -> m Word64 -- default implementation in terms of uniformWord32
  -- plus methods for other word sizes and for byte strings
  -- all have default implementations so the MINIMAL pragma holds

Conceptually, in MonadRandom g s m, g s is the type of the generator, s is the state type, and m the underlying monad. Via the functional dependency g m -> s, the state type is determined by the generator and monad.

Frozen is the type of the generator's state "at rest". It is defined as an injective type family via f -> g, so there is no ambiguity as to which g any Frozen g belongs to.
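To make the roles of g, s, m and Frozen concrete, here is a minimal, self-contained sketch. It is not the class from this PR: ToyMonadRandom, ToyFrozen, ToyGen and ToySeed are hypothetical names, the class is cut down to a single sampling method, and the generator is a plain 64-bit linear congruential step chosen only for brevity.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE TypeFamilyDependencies #-}

import Control.Monad.ST (ST, runST)
import Data.Kind (Type)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)
import Data.Word (Word64)

-- Hypothetical, cut-down stand-in for the proposed class.
-- The fundep says: generator and monad together fix the state token.
class Monad m => ToyMonadRandom g s m | g m -> s where
  -- Injective: every frozen type belongs to exactly one generator type.
  type ToyFrozen g = (f :: Type) | f -> g
  toyFreeze :: g s -> m (ToyFrozen g)
  toyThaw   :: ToyFrozen g -> m (g s)
  toyWord64 :: g s -> m Word64

-- A mutable toy generator living in ST, and its state "at rest".
newtype ToyGen s = ToyGen (STRef s Word64)
newtype ToySeed = ToySeed Word64

instance ToyMonadRandom ToyGen s (ST s) where
  type ToyFrozen ToyGen = ToySeed
  toyFreeze (ToyGen ref) = ToySeed <$> readSTRef ref
  toyThaw (ToySeed w) = ToyGen <$> newSTRef w
  toyWord64 (ToyGen ref) = do
    x <- readSTRef ref
    -- Toy LCG step; NOT a statistically sound generator.
    let x' = x * 6364136223846793005 + 1442695040888963407
    writeSTRef ref x'
    pure x'

-- Monomorphic helper so the example needs no type annotations.
thawToy :: ToySeed -> ST s (ToyGen s)
thawToy = toyThaw

-- Freeze after one draw, draw again, then thaw the snapshot and draw:
-- the last two draws should coincide.
replayDemo :: Word64 -> (Word64, Word64)
replayDemo seed = runST $ do
  g <- thawToy (ToySeed seed)
  _ <- toyWord64 g
  frozen <- toyFreeze g
  b <- toyWord64 g
  g' <- thawToy frozen
  c <- toyWord64 g'
  pure (b, c)

main :: IO ()
main = print (uncurry (==) (replayDemo 42))
```

Thawing a frozen snapshot replays the same stream, which is the property freezeGen and thawGen are meant to capture.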

This definition is generic enough to accommodate, for example, the Gen type from mwc-random, which itself abstracts over the underlying primitive monad and state token. This is the full instance declaration (provided here as an example: this instance is not part of random, as random does not depend on mwc-random):

instance (s ~ PrimState m, PrimMonad m) => MonadRandom MWC.Gen s m where
  type Frozen MWC.Gen = MWC.Seed
  freezeGen = MWC.save
  thawGen = MWC.restore

  uniformWord8 = MWC.uniform
  uniformWord16 = MWC.uniform
  uniformWord32 = MWC.uniform
  uniformWord64 = MWC.uniform
  uniformShortByteString n g = unsafeSTToPrim (genShortByteStringST n (MWC.uniform g))

Four MonadRandom instances ("monadic adapters") are provided for pure generators to enable their use in monadic code. The documentation describes them in detail.
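As a rough illustration of what a monadic adapter does, the sketch below wraps a pure step function of type g -> (a, g) in a mutable reference so it can be used from ST. The names STAdapter, applyST, ToyGen and nextWord64 are hypothetical and are not the adapters shipped in this PR; the generator is again a toy LCG.

```haskell
import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, newSTRef, readSTRef, writeSTRef)
import Data.Word (Word64)

-- A toy pure generator (assumption: illustrative only, poor quality).
newtype ToyGen = ToyGen Word64

nextWord64 :: ToyGen -> (Word64, ToyGen)
nextWord64 (ToyGen x) = (x', ToyGen x')
  where
    x' = x * 6364136223846793005 + 1442695040888963407

-- Hypothetical adapter: a pure generator held in an STRef.
newtype STAdapter s g = STAdapter (STRef s g)

-- Run one pure step against the stored generator and save the successor.
applyST :: (g -> (a, g)) -> STAdapter s g -> ST s a
applyST step (STAdapter ref) = do
  g <- readSTRef ref
  let (a, g') = step g
  writeSTRef ref g'
  pure a

-- Two draws with explicit generator threading.
drawTwoPure :: ToyGen -> (Word64, Word64)
drawTwoPure g0 = (a, b)
  where
    (a, g1) = nextWord64 g0
    (b, _)  = nextWord64 g1

-- The same two draws through the adapter, with no explicit threading.
drawTwoST :: ToyGen -> (Word64, Word64)
drawTwoST g0 = runST $ do
  gen <- STAdapter <$> newSTRef g0
  a <- applyST nextWord64 gen
  b <- applyST nextWord64 gen
  pure (a, b)

main :: IO ()
main = print (drawTwoPure (ToyGen 1) == drawTwoST (ToyGen 1))
```

Both versions produce the same pair; the adapter only changes who is responsible for carrying the generator state forward.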

`Uniform` and `UniformRange`

The Random typeclass has conceptually been split into Uniform and UniformRange. The Random typeclass is still included for backwards compatibility. Uniform is for types where it is possible to sample from the type's entire domain; UniformRange is for types where one can sample from a specified range.
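The split can be pictured with two toy classes. This is a sketch under stated assumptions, not the definitions from this PR: ToyUniform, ToyUniformRange, ToyGen and nextWord64 are hypothetical names, the generator is a toy LCG, and the Integer range instance uses a simple modulo reduction that is slightly biased and only covers ranges narrower than 2^64.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word64, Word8)

-- Toy pure generator (assumption: illustrative only).
newtype ToyGen = ToyGen Word64

nextWord64 :: ToyGen -> (Word64, ToyGen)
nextWord64 (ToyGen x) = (x', ToyGen x')
  where
    x' = x * 6364136223846793005 + 1442695040888963407

-- Types whose whole domain can be sampled.
class ToyUniform a where
  toyUniform :: ToyGen -> (a, ToyGen)

-- Types that can be sampled from a caller-supplied range.
class ToyUniformRange a where
  toyUniformR :: (a, a) -> ToyGen -> (a, ToyGen)

-- Word8 has a finite domain, so sampling all of it makes sense.
instance ToyUniform Word8 where
  toyUniform g = (fromIntegral (w `shiftR` 56), g')
    where
      (w, g') = nextWord64 g

-- Integer is unbounded, so only a range instance is meaningful.
-- Modulo reduction is biased; used here only to keep the sketch short.
instance ToyUniformRange Integer where
  toyUniformR (lo, hi) g
    | lo > hi   = toyUniformR (hi, lo) g
    | otherwise = (lo + fromIntegral w `mod` (hi - lo + 1), g')
    where
      (w, g') = nextWord64 g

main :: IO ()
main = do
  let (b, g1) = toyUniform (ToyGen 7) :: (Word8, ToyGen)
      (n, _)  = toyUniformR (10, 20 :: Integer) g1
  print b
  print n
```

Integer illustrates why the two classes are separate: it can reasonably implement the range class but not the whole-domain one.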

Changes left out

There were changes we considered and decided against including in this PR.

Some pseudo-random number generators are splittable, others are not. A good way of communicating this is to have a separate typeclass, Splittable, say, which only splittable generators implement. After long discussions (see this issue and this PR), we decided against adding Splittable: the interface changes would either have been backwards-incompatible or too complex. For now, split stays part of the RandomGen typeclass. The new documentation suggests that split should call error if the generator is not splittable.

Due to floating point rounding, generating a floating point number in a range can yield surprising results. There are techniques to generate floating point numbers in a range with actual guarantees, but they are more complex and likely slower than the naive methods, so we decided to postpone this particular issue.
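One such surprise, sketched with a hypothetical helper naiveRange (not a function from this PR): scaling a unit sample u that is strictly below 1 can still return the upper bound, because the final addition rounds up.

```haskell
-- Naive range scaling: maps u in [0, 1) to what looks like [lo, hi).
naiveRange :: Double -> Double -> Double -> Double
naiveRange lo hi u = lo + (hi - lo) * u

main :: IO ()
main = do
  -- Largest Double strictly below 1.
  let u = 1 - 2 ^^ (-53 :: Int) :: Double
  print (u < 1)                  -- True
  -- 1 + u is exactly halfway between two Doubles and rounds to 2.0,
  -- so the "exclusive" upper bound is returned.
  print (naiveRange 1 2 u == 2)  -- True
```

So a range that is nominally half-open can behave as closed, which is one reason the bounds question is tied to the choice of generation method.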

Ranges on the real number line can be inclusive or exclusive in the lower and upper bound. We considered API designs that would allow users to communicate precisely what kind of range they wanted to generate. This is particularly relevant for floating point numbers. However, we found that such an API would make more sense in conjunction with an improved method for generating floating point numbers, so we postponed this too.

Compatibility

We strove to make changes backwards compatible where possible and desirable.

The following changes may break existing packages:

import clashes, e.g. with the new functions uniform and uniformR
randomIO and randomRIO were moved out of the Random class into separate functions, which means some packages need to adjust how they import them
StdGen is no longer an instance of Read
requires base >= 4.10 (GHC-8.2)

In addition, genRange and next have been deprecated.

We have built all of Stackage against the code in this PR, and confirmed that no other build breakages occurred.

For more details, see this comment and the "Compatibility" section in the docs.

Other issues addressed

This PR also addresses #26, #44, #53, #55, #58 and #59, see Issues Addressed for details.

lehins · 2020-05-20T15:22:27Z

@cartazio I thought you are going to merge this PR? Why did you close it again?

cartazio · 2020-05-20T15:22:56Z

shit, i didn't see you repointed it, my bad :)

cartazio · 2020-05-20T15:25:54Z

i totally respect your perspectives here (we're both right overall), and you're right about scope. the improvements for the APIs will all land, and i'm thinking for now we table the monadic apis from the actual release.

even if i'm sometimes a tad "grrr", that work is outstanding, and shifting out the default RNG and not letting other things block it is on me.

i will do some other RNG shuffling, but i think you'll be happy overall (though perhaps rightly frustrated with me :) ).

curiousleo · 2020-05-20T15:29:07Z

Hi @cartazio, re squashing: I also think that the commit history of this PR is valuable. A lot of thinking went into the decisions that were made, and the commit history points to the PRs where the relevant discussions took place.

There are definitely commits in the history that by themselves don't add much value, so there is an opportunity for careful squashing.

I volunteer to squash the history of this branch into one commit per PR. But only if the entire history does not get squashed into one big commit in the end. Can you confirm?

curiousleo · 2020-05-20T15:30:05Z

#61 (comment) ah I guess that's a moot point now :)

cartazio · 2020-05-20T15:36:29Z

@curiousleo i'm going to persist this entire diff in a design-artifact/v1.2-proposal, so dont worry :)

i'm also going to probably see about re-applying the diff set on top of a compacted/filtered history with v1.1 as the historical base for the new-new repo (and i'll see about making sure all urls also point to internet archive backups if needed)

i'm actually frustrated with myself for not seeing the applicative+alternative design path YEARS ago, but c'est lie-vie

I hope that in aggregate you all find yourself more happy than annoyed with how i use this work, but i'm in a good headspace to engage on this stuff and focus (despite the plague times), and zooming out, we ALL care about good randomness :)

i'm thinking for near term i'll have a child / satellite lib for workshopping out the fun fancy stuff

cartazio · 2020-05-20T15:38:20Z

design_artifacts/theLeo_and_Lehins_etal_v1.2_proposal is the stable in repo snapshot for now

curiousleo · 2020-05-20T15:40:19Z

@curiousleo i'm going to persist this entire diff in a design-artifact/v1.2-proposal, so dont worry :)

That is appreciated, but beside the point: I think that a lot of the value in a good Git history lies in being able to answer the question "why is this code the way it is?" using git blame. This requires the master branch to have that history, since that is what people generally check out and run git blame on.

lehins · 2020-05-20T16:13:25Z

i'm thinking for now we table the monadic apis from the actual release.

@cartazio monadic part of the API is critical to all of the work in this PR, cutting it out of the release will be a huge mistake

cartazio · 2020-05-20T17:45:07Z

If we go that direction, I’d much rather use Brent yorgies monad random design. I asked him many months ago if I could migrate those into Plus making sure that that the base class has a liftSampling prim operation with the type “g->(g,a)-> m a ” That would work for both pure rng impls and runSt /primmonad compatible implementations . I’ll show y’all one that works with the mutable stuff you care about instance wise. And would support both styles. Have a look at Brent’s olde but good monad random type classes and please tell me what you think! There’s probably some opportunities for polish there but I think it’s a good design

lehins · 2020-05-20T18:13:49Z

@cartazio We have a concrete design that already works. Instead of saying what is possible could you please explain what you don't like about the current design and why do you think it doesn't work?

I am very well aware of all other approaches taken in all other libraries that provide alternative monadic interface. I know for a fact that none of them have the desired property of being suitable for both pure and stateful generators at the same time, including Brent's MonadRandom package.

We've iterated on the design quite a bit and weighed all pros and cons. @Shimuuar, who is the maintainer of the most popular stateful RNG library, mwc-random, has also participated in the design process and is on board with this monadic interface.

Plus making sure that that the base class has a liftSampling prim operation
with the type “g->(g,a)-> m a ”

Why do you need this extra function, this is just the state :: (s -> (a, s)) -> m a function from MonadState, so what's the point of reinventing stuff here. Current design already works with any monad that implements MonadState.

Current design is the only one that provides such great performance and usability across all available non-crypto RNGs in Haskell, it works with all transformers without any need to create crap load of instances, there is literally no more work needs to be done, so could you please shed the light for me why are you so reluctant about it?

cartazio · 2020-05-20T19:01:00Z

To be concrete I mean https://hackage.haskell.org/package/MonadRandom-0.5.1.2/docs/Control-Monad-Random-Class.html

lehins · 2020-05-20T19:03:17Z

I know, and I am saying that I know about it, I studied it amongst other approaches and it does not work with mwc-random and alike

idontgetoutmuch · 2020-05-20T19:25:48Z

@cartazio you will no doubt be aware that many years ago it was proposed to make MWC compatible with MonadRandom.

byorgey/MonadRandom#26

It's not clear from the discussion if this is feasible without changing MonadRandom in a breaking way.

I think you are looking at this the wrong way up. With this proposal, there is no need for MonadRandom.

cartazio · 2020-05-20T19:27:46Z

... umm, @lehins ... mwc doesnt have a monadic interface for threading the generator.

cartazio · 2020-05-20T19:28:43Z

(i guess that lives in random-fu)

lehins · 2020-05-20T19:30:12Z

Current PR has MonadRandom class that can be used with mwc-random, see the haddock in this PR it has examples on how it works

idontgetoutmuch · 2020-05-20T19:33:23Z

I was just about to say the same thing :)

@cartazio see here:

https://htmlpreview.github.io/?https://raw.githubusercontent.com/idontgetoutmuch/random/haddock-preview/doc/System-Random-Monad.html#g:12

Shimuuar · 2020-05-20T19:38:46Z

... umm, @lehins ... mwc doesnt have a monadic interface for threading the generator.

What is PrimMonad m that appears everywhere then :)? Instead of passing state explicitly it passes state token around and modifies generator in place.

Although it's possible to have some vaguely MonadRandomish interfaces for mwc-random for monads like ReaderT (Gen (PrimState m)) m it does require breaking changes.

cartazio · 2020-05-20T19:45:42Z

Oooo. I see what you mean. Brent's stuff as is doesn’t work for you. And also isn't quite what I meant. I’m gonna go play around and show you what I meant. Might be a Friday or next Monday (weekend fam trip).

cartazio · 2020-05-20T19:46:34Z

Absolutely agree mwc needs to be sane to have a monad wrapper for.

curiousleo · 2020-05-21T06:59:26Z

Absolutely agree mwc needs to be sane to have a monad wrapper for.

With this PR, it does. Literal copy from the docs section "How to implement MonadRandom" (see https://htmlpreview.github.io/?https://raw.githubusercontent.com/idontgetoutmuch/random/haddock-preview/doc/System-Random-Monad.html#g:12), as @idontgetoutmuch pointed out:

Here is an example instance for the monadic pseudo-random number generator from the mwc-random package:

instance (s ~ PrimState m, PrimMonad m) => MonadRandom MWC.Gen s m where
  type Frozen MWC.Gen = MWC.Seed
  thawGen = MWC.restore
  freezeGen = MWC.save
  uniformWord8 = MWC.uniform
  uniformWord16 = MWC.uniform
  uniformWord32 = MWC.uniform
  uniformWord64 = MWC.uniform
  uniformShortByteString n g = unsafeSTToPrim (genShortByteStringST n (MWC.uniform g))

cartazio · 2020-05-21T16:17:25Z

It doesn’t need the state token in the monad random though.

lehins · 2020-05-21T16:26:06Z

It doesn’t need the state token in the monad random though.

@cartazio please elaborate, best with example.

If I am inferring correctly what you are saying, then it does not need s for PrimMonad because it works for all s ~ PrimState m. Losing s though will prevent it from working with things like a gen stored in an IORef and the monad being StateT. In short, it DOES need the state token in the monad random!

idontgetoutmuch · 2020-05-23T07:47:13Z

Another data point: random-fu is about x3 faster using the proposed new version:

lehins and others added 30 commits February 22, 2020 20:59

Allow RNGs to provide efficient implementations for variety of prim t…

1bc9fb1

…ypes Introduce PrimMonad interface. Add range generation used in splitmix Export all class contents Initial stab at MonadRandom Working MonadRandom Rename next* to gen*

Add some examples and an instance for mwc-random

a0b17bd

Couple extra constraints

6790ade

A few inline pragmas. Check performance

bb10f39

Rename GenState to PureGen

d3f61e8

Remove support for non-ghc compilers

4bb37cf

Removed mkGen, saveGen and GenSeed

7804b7e

Add functionality for generating a ByteArray

04f271c

Random for word ranges

cbcca00

Attempt to use Word64 only

1a51071

A working solution for genrating a ByteArray, so it is efficient and …

046f1e5

…platform independent

Helper functions for generating ByteString

44b33ea

Merge pull request #10 from idontgetoutmuch/bytearray

200143c

Add functionality for generating a ByteArray

Cleanup cabal file a bit. Bump up the version to 1.2

0b45a29

General cleanup, remove dead code. Remove usagae of tabs and trailing…

83262cc

… spaces

Implement Uniform and UniformRange

b01074f

Merge pull request #14 from idontgetoutmuch/interface-to-performance-…

656ef17

…uniform-range Implement Uniform and UniformRange

StdGen = SMGen (#22)

028de7a

* StdGen = SMGen * Remove dependency on "time"

Export Uniform and UniformRange

cd94322

Merge pull request #25 from idontgetoutmuch/export-uniform

542c55d

Export Uniform and UniformRange

Implement PrimGen

9564db4

Add ability to split PrimGen

b918908

Merge pull request #16 from idontgetoutmuch/primgen

4024650

Implement PrimGen

Implement MutGen. Remove default implmenetation for randomM to avoi…

5849512

…d breakage

Remove unnecessary imports

2395472

Merge pull request #33 from idontgetoutmuch/mutgen-leo-rebased

e8588fb

Implement MutGen (rebased)

doctests to explain the new features in random and forward direction

834aa54

Deprecate next

a680200

Fix overflow

18239f9

Revert "Fix overflow"

c0a0204

This reverts commit 67952a4.

cartazio closed this May 20, 2020

cartazio reopened this May 20, 2020

cartazio merged commit 1b8d98f into haskell:wip/carter/1.1-base-forv1.2-collab May 20, 2020

This was referenced May 26, 2020

Very low throughput #51

Closed

V1.2 proposal #62

Merged
