Biparsing is a bidirectional programming technique that specializes in constructing parsing and printing programs simultaneously.
Until this package there was not a usable, only my opinion, package that allowed for the construction of useful biparsers.
- Less Bugs, keeps the programs in sync so that when the biparsing code is modified both the parser and printer are updated.
- Less Costs, reduces maintenance and upgrade costs since changes only need to be made in a single location.
- Less Code, reduces project size in "half" (perhaps a third) since two parts are written simultaneously.
Start with the package biparsing-mixes which has an easier interface to start with. It currently allows using the monads IO
and Either
and the types String
, ByteString
(Strict and Lazy), Text
(Strict and Lazy), and [a]
. These monads and types can be easily expanded to others; the tests for biparsing-core
use Maybe
, Vector
, Seq
.
Most of the combinators you are in, or should be in (please help), Biparse.General
, Biparse
Biparse.Biparser`,
See directory examples
A quick and simple example that both defines the parser and printer
product :: IsoEasy IndexContext Text (Text, Int)
product = do
take '('
s <- takeWhile (/= ',') `upon` fst
take ','
i <- intBaseTen `upon` snd
take ')'
return (s, i)
print $ decodeEasy product () "(Some text,123)"
print $ encodeEasy product ("Some text", 123)
-
Terminology; I need some help with proper naming of types and functions. Please make suggestions.
-
Biparse.Text.Numeric
does not support many biparsers for numeric types and formats and could be greatly improved. -
Documentation
-
Add more examples in the examples directory and in this README
-
More complete exports in biparsing-mixes/src/Biparse/Mixes/Exports.hs that make it easy to use without import lots of modules since there are many name clashes with base Prelude
- Polymorphic Streams and Token types. This should not just be for parsing Text, String, or ByteString streams but any kind of tokenized streams.
- Parser composition. If the parser consumes stream 'a' and produces stream 'b' it should be composable with another parser that consumes stream 'b'.
- Currently possible but not added to biparsing-mixes to make it easy. See
Biparse.Biparser.PolyKinds.focus
.
- Currently possible but not added to biparsing-mixes to make it easy. See
- Ability to have guaranteed serialization. Parsing needs to be able to fail.
- Design a grammer for biparsers.
- Composing bidirectional programs monadically most utilized and improved upon
- List Relevant References
- Invertible Syntax Descriptions: Unifying Parsing and Pretty Printing
- Generalized Convolution and Efficient Language Recognition by Conal Elliott
"IMO it's a mistake (although not an infrequent one) to try and use an optic as a parser or a printer or whatever. It's like trying to light a fire by striking your lighter against a rock. The whole point of polymorphic profunctor optics is that they can adjust arbitrary profunctors, including natural representations of parsers and printers" masaeedu
"I suspect a "clean" way to do this sort of thing will eventually reveal within a profunctor encoding, but I haven't put much time into it yet :)" Chris Penner
"Are there any bidirectional parser libraries ready to use?" TheMatten
"None that are good. Well "good" is not the right word here Basically the best current approach to bidirectional parsing in Haskell does not have the ideal interface" chessai
There are other implementions for biparsing but they did meet my requirements.
Currently, a viable solution for constructing biparsers. The drawback is that it is hard to modify since it requires that the parser constructor function must be defined separately from the parsing logic. Also, for serialization an extractor function is required to remove what is to be serialized from the parse type.
data Person = Person String String
-- constructor function
person :: String -> String -> Person
person f l = Person f l
-- extractor functions
firstName :: Person -> String
firstName (Person x _) = x
lastName :: Person -> String
lastName (Person _ x) = x
personBiparser :: Biparser String Person
personBiparser =
person <$> -- sets constructor function
many firstName alpha <*> -- Parsing: `many` passes the consumed alpha characters as the first argument of the `person` constructor function. Serializing: `many` uses the extractor function `firstName` to get the first name charcaters from a `Person` instance and passes each single caracter to the alpha serializer
(spaces *> many lastName alpha) -- Parsing: `spaces` skips the spaces, `many` passes the consumed alpha characters as the second argument of the `person` constructor function. Serializing: `spaces` writes a " " space. `many lastName alpha` does what the above line does.
spaces :: Biparser a String
spaces = many (const [' ']) space
many :: (a -> [b]) -> Biparser b c -> Biparser a [c]
many extractor biparser = ...
Another, example from the linked codec package.
data RecordB = RecordB
{ recordBString :: String
, recordBDouble :: Double
} deriving (Eq, Ord, Show)
recordBObjCodec :: JSONCodec RecordB
recordBObjCodec = asObject "RecordB" $
RecordB
<$> recordBString =. field "string"
<*> recordBDouble =. field "double"
Until this package, monadic composition did not have a valid implementation yet but did have an example implementation unparse-attoparsec, details below, that is outlined in Composing bidirectional programs monadically.f
int :: Biparser Int Int
int = do
let printedInt n = show n ++ " "
ds <- digits 'upon' printedInt
return (read ds)
digits :: Biparser String String
digits = do
d <- char 'upon' head
if isDigit d then do
igits <- digits 'upon' tail
return (d : igits)
else if d == ' ' then return " "
else error "Expected digit or space"
Optics composition can be used to create bidirectional programs by utilizing Iso (isomorphisms) and Prisms. These are ready to go and can be used to create composable bidirectional programs. Unfortunately at this point, optics do not support state very well so it's difficult to report errors with locations.
codec Simple bidirectional serialization
Codec makes it simple to write composable bidirectional serializers with a consistent interface.
Pros
- JSON and binary supported natively
- Allows any kind of token not just Char and Word8
Cons
- Does not give any helper functions for common parsing needs like parsec-combinators
- More could be handled in the type level
- Dependant on the constructor argument order. The constructor arguments need to be reordered if the parser changes. Type errors are not clear as to which field there is a problem parsing for.
- Does not easily support ADTs
Features we would like to also have
- ...
lens Lenses, Folds and Traversals
Pros
- ...
Cons
- ...
Features we would like to also have
- ...
boomerang Library for invertible parsing and printing
Specify a single unified grammar which can be used for parsing and pretty-printing
Pros
- Supports ADTs
- Allows any token type
- Natively supports String and Text
Cons
- Uses lots of Template Haskell
- Mostly for use with Text
Features we would like to also have
- ...
Based On
- Zwaluw by Sjoerd Visscher and Martijn van Steenbergen
- invertible-syntax
Combinators for bidirectional URL routing.
Pros
- ...
Cons
- ...
Features we would like to also have
- ...
Grammar combinator framework to build invertible parsing libraries for concrete syntax.
Contains invertible-syntax
Pros
- ...
Cons
- ...
Features we would like to also have
- ...
Based On
- Invertible Syntax Descriptions: Unifying Parsing and Pretty Printing by Tillmann Rendel and Klaus Ostermann
roundtrip Bidirectional (de-)serialization
Roundtrip allows the definition of bidirectional (de-)serialization specifications. The specification language is based on the ideas described in the paper Invertible Syntax Descriptions: Unifying Parsing and Pretty Printing by Tillmann Rendel and Klaus Ostermann, Haskell Symposium 2010. This package does not provide concrete instances of the specification classes, see the packages roundtrip-string and roundtrip-xml instead. The package contains slightly modified code from Tillmann Rendel's partial-isomorphisms and invertible-syntax packages (Copyright (c) 2010-11 University of Marburg).
Pros
- ...
Cons
- ...
Features we would like to also have
- ...
Based On
- Invertible Syntax Descriptions: Unifying Parsing and Pretty Printing by Tillmann Rendel and Klaus Ostermann
- partial-isomorphisms by Tillmann Rendel
- invertible-syntax
This library provides a wrapper attoparsec to build programs that can be interpreted both as parsers and as printers.
Pros
- ...
Cons
- Only supports ByteString
Features we would like to also have
- ...
archived. Boomerang is a programming language for writing lenses—well-behaved bidirectional transformations—that operate on ad-hoc, textual data formats.
Pros
- ...
Cons
- ...
Features we would like to also have
- ...
Partial isomorphisms is the basis of this implementation data Iso a b = Iso (a -> Maybe b) (b -> Maybe a)
and redefining <$>
, <*>
, <|>
to suite partial Iso
. The reader may be wondering why it is partial in both directions since it is strange that serialization could fail. This is due to the fact that <|>
Alternative does not force the serialization of all possibilities at the type level so failure must be passed into runtime.
data Iso a b = Iso (a -> Maybe b) (b -> Maybe a)
class IsoFunctor f where
(<$>) :: Iso a b -> f a -> f b
class ProductFunctor f where
(<*>) :: f a -> f b -> f (a,b)
class Alternative f where -- lacks Applicative superclass
(<|>) :: f a -> f a -> f a
empty :: f a -- fails parsing and printing
I would have preferred that the definition of data Iso a b = Iso (a -> Maybe b) (b -> a)
have been used instead to force successful serialization. This would however put more complexity in the types to enforce this.
Also, due to using applicative the order of the parsing must coincide with the constructor arguments. With monadic parsers the parsed value can be assigned to a variable which disconnects the constructor argument order and the parser order.
Type lists may be able to enforce: guaranteed serialization with Alternative, and unordered parsing.
- Parser - a function that converts Data to a Data-Structure
- Serializer - a function that converts the Data-Structure produced by a Parser back to the Data that the parser consumes
- Data - typically a stream of Tokens, but could be any structure that supports the Head and Tail functions. Examples: string, list of strings, tree of integers
- Data-Structure - a native programatic construct that can be more easily inspected and modified by a program than Data
- Token - typically a finite piece of data. Examples: character, string, integer, algebraic data type
- Head function - returns the first element of a monomorphic container.
MonoFoldable mono => NonNull mono -> Element mono
https://hackage.haskell.org/package/mono-traversable-1.0.15.1/docs/Data-NonNull.html#v:head - Tail function - return the container less the element returned by head
IsSequence seq => NonNull seq -> seq
https://hackage.haskell.org/package/mono-traversable-1.0.15.1/docs/Data-NonNull.html#v:tail