Synthia is a flexible generator of data structures. With Synthia, it is possible to generate numbers, strings, lists, sentences from a predefined grammar, random walks in a finite-state machine, or any other user-defined object. More importantly, all these basic "generators" can be combined to create complex structures and behaviors.
Synthia comes with a library of dozens of code examples illustrating what can be done with it. Among other things can be used to generate test inputs for fuzzing, create a test stub that provides more flexible responses than simple "canned answers", or create a semi-realistic simulation of a system's output, such as a log.
Among the features in Synthia:
- All sources of "choice" in the various generators are passed as arguments
in the form of
Picker
objects. Pickers can be random, but they can also do something else, such as always returning the same value (special typeContstant
), or cycling through a user-defined list of values (special typePlayback
). Users can define their own pickers, and they can return any type of object, not just numbers. - The outputs of pickers can be wrapped into a
Record
object, which remembers the successive values they produce. These values can be replayed through aPlayback
picker. Since generators make choices only through pickers, this makes it possible to regenerate any random object. This feature is useful e.g. for debugging, by running a program on a specific input causing a failure.
Here are simple examples of the basic building blocks to synthesize things with Synthia. The Examples folder has more complex examples, such as simulating multiple users visiting a web site.
The ca.uqac.lif.synthia.random
package provides basic Picker
objects that
can generate primitive values selected with a uniform probability distribution.
RandomInteger ri = new RandomInteger(2, 10); // uniform random int between 2 and 10
RandomFloat rf = new RandomFloat(); // uniform random float between 0 and 1
RandomBoolean rb = new RandomBoolean(0.7); // coin toss returning true 7 out of 10 times
All random pickers implement methods setSeed()
(to specify the starting seed
of their internal random source) and reset()
(to reset their internal random
source to the initial seed value), making it possible to reproduce the random
sequences they generate.
Random strings can also be generated (using RandomStringUtils
from
Apache Commons Lang
in the background):
// Generate strings of length 10
RandomString rs1 = new RandomString(new Constant(10));
// Generates strings of length randomly selected in the interval 2-10
RandomString rs2 = new RandomString(new RandomInteger(2, 10));
This last example shows that the input to a Picker may itself be another Picker.
For rs2
, the length of the string is determined by the result of a random
pick in the interval [2,10].
Similarly, the Tick
picker generates an increasing sequence of numbers:
Tick t1 = new Tick(10, 1);
Tick t2 = new Tick(new RandomInteger(5, 15), new RandomFloat(0.5, 1.5));
Here, t1
will generate a sequence starting at 10 and incrementing by 1 on every
call to pick()
--that is, this object is completely deterministic. In contrast,
t2
wil generate a sequence starting at a randomly selected value between 5 and
15, and incrementing each time by a randomly selected value between 0.5 and 1.5.
The StringPattern
picker can produce character strings where some parts are
determined by the output of other pickers.
StringPattern pat = new StringPattern(
"{$0} {$1} - Source: {$2}, Destination: {$3}, Duration: {$4}",
new Tick(),
new Freeze<String>(new RandomString(new RandomInteger(3, 6))),
new Freeze<String>(new IpAddressProvider(new RandomInteger(0, 256))),
new IpAddressProvider(new RandomInteger(0, 256)),
new RandomInteger(10, 1000)));
for (int i = 0; i < 10; i++)
System.out.println(gp.pick());
Produces an output like:
0.73096776 3dF - Source: 187.212.61.155, Destination: 187.212.61.155, Duration: 340
1.5624087 3dF - Source: 187.212.61.155, Destination: 163.79.140.29, Duration: 368
1.8029451 3dF - Source: 187.212.61.155, Destination: 152.200.85.64, Duration: 689
Notice the use of the Freeze
picker: it asks for the input of another picker once,
and then returns that value forever.
A Markov chain
is a stochastic model describing a sequence of possible events.
In the following example, we define a Markov chain by creating
states (numbers 0-4), associate each state with a Picker
object
(here all pickers generate a different string), and set the probabilities
of transitioning from a state to another.
Tick tick = new Tick(0, new Enumerate<Integer>(1, 2, 1));
MarkovChain<String> mc = new MarkovChain<String>(new RandomFloat());
mc.add(0, new Constant<String>(""));
mc.add(1, new Constant<String>("START"));
mc.add(2, new StringPattern("{$0},A", tick));
mc.add(3, new StringPattern("{$0},B", tick));
mc.add(4, new StringPattern("{$0},C", tick));
mc.add(0, 1, 1).add(1, 2, 0.9).add(1, 4, 0.1)
.add(2, 3, 1).add(3, 2, 0.7).add(3, 4, 0.3)
.add(4, 4, 1);
for (int i = 0; i < 8; i++)
System.out.println(mc.pick());
Note how the generators for states 2-3-4 share the same tick
source,
ensuring that each string contains an incrementing "timestamp".
The Markov chain created by this block of code looks like this:
The program produces an output like this one:
START
1.0,A
3.0,B
4.0,A
5.0,B
7.0,C
8.0,C
Synthia can produce randomly generated expressions from a formal grammar contained in an instance of a Bullwinkle BNF parser.
BnfParser parser = ... // A Bullwinkle parser for an arbitrary grammar
GrammarSentence p = new GrammarSentence(parser, new RandomIndex());
for (int i = 0; i < 10; i++)
System.out.println(gp.pick());
With the grammar defined here, you get an output that looks like:
The funny old grey fox plays with the ugly big brown siamese .
The young white cats play with the ugly small old grey siamese .
The grey birds watch the ugly old brown bird .
...
The library is structured using the AntRun build scripts. Please see the AntRun Readme file for instructions on compiling.
The documentation for this repository is generated by
Doxygen. The configuration file Doxyfile
contains the
appropriate settings for reading the source files and generating HTML in the
docs
folder. On the command line, you should be able to run:
$ doxygen Doxyfile
However, these files must be post-processed a little. Run the PHP script
post-process.php
afterwards. When re-generating the documentation, it may
also be wise to first wipe the contents of the docs
folder.
It is not recommended to use javadoc
to generate the HTML documentation.
Several features (such as cross-referenced source code) will be missing from
the output.
Synthia is a play on "synthesizing" data structures.
Synthia is being developed by all the folks of Laboratoire d'informatique formelle at Université du Québec à Chicoutimi, Canada. The project lead is Sylvain Hallé, professor at Université du Québec à Chicoutimi, Canada.