API Reference

Table of Contents

Core API
  • CPARSEC2 core APIs
  • Exception handling API for C.
Container classes
  • Type of Containers
  • Supporting concrete List(T) types
  • Iteration API for List(T)
  • Supporting concrete Buff(T) types
  • List builder API for Buff(T)
Parser classes
  • Type of Parsers
  • Supporting concrete PARSER(T) types
Built-in Parsers, Parser generators, and Parser combinators
  • Built-in Parsers
  • Built-in Parser-generators
  • Built-in GENERIC Parser-combinators
Building block of Parser-class
  • Declares/Defines new PARSER(T) class
  • Construct an instance of PARSER(T)
  • Apply an instance of PARSER(T) to a text
Building block of Parser-combinators
  • Declares/Defines new Parser-combinators
  • PARSER_CAST(expr)
  • GENERIC_METHOD(expr, C, F, …)
  • GENERIC_P(expr, F, …)
  • FOREACH(F, …)
  • TYPESET
Extends CPARSEC2 library for user defined types
  • CPARSEC2_USER_TYPESET
  • CPARSEC2_DEFINE_USER_TYPESET()

Core API

CPARSEC2 core APIs

cparsec2_init()
Initializes the cparsec2 library.
This must be called once, before any other cparsec2 API.
cparsec2_end()
Cleans up the cparsec2 library (deallocates all memory it allocated).
After calling this API and until cparsec2_init() is called again, calling any other cparsec2 API, or accessing any value previously returned by one, is undefined behavior.
Source_new(input)
Constructs a Source object from the given input.
input
a const char*, FILE*, or Source.
return
a new Source object, or input if input was a Source.
parse(p, src, ctx)
Apply parser p to text provided from the Source object src, and return result (such as char, const char *). If an error occurred, it is thrown as exception through ctx. (see also Exception handling)
p
a parser of PARSER(T) type such as PARSER(Char), etc.
src
a Source object.
ctx
a pointer to a Ctx object.
return
an object of RETURN_TYPE(PARSER(T)) type. For example,
  • a char if p was a PARSER(Char),
  • a const char* if p was a PARSER(String),
  • an int if p was a PARSER(Int),
  • NONE if p was a PARSER(None).
parseTest(p, text)
Apply parser p to text and print result to standard output.
Return true if passed, false if failed.
p
a parser of PARSER(T) type such as PARSER(Char), etc.
text
a const char*. (null-terminated char sequence; i.e. string in C)
return
true or false.
PARSE_TEST(p, text)
Same as parseTest, but also prints p and text. (for debugging purposes)
Return true if passed, false if failed.
p
a parser of PARSER(T) type such as PARSER(Char), etc.
text
a const char*. (null-terminated char sequence; i.e. string in C)
return
true or false.
runParser(p, input)
Apply parser p to input. input is a const char*, FILE*, or Source.
Returns a ParseResult(T).
p
a parser of PARSER(T) type such as PARSER(Char), etc.
input
a const char*, FILE*, or Source.
return
an object of ParseResult(T) type such as ParseResult(Char), etc.

Exception handling

Ctx
Type of context for exception handling.
TRY(ctx) {…} else {…}
Exception handling macro. (C++ or Java like try {...} catch {...} clause)
cthrow(ctx, msg)
Throw a string msg as an exception.

Example:

Ctx ctx;
TRY(&ctx) {                        /* try */
  // ...
  cthrow(&ctx, "something wrong!"); /* throw "something wrong!" */
  // ...
}
else {                             /* catch */
  printf("error:%s\n", ctx.msg);   /* -> "error:something wrong!" */
}
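
To make the control flow of TRY and cthrow concrete, here is a minimal standalone sketch of how such a construct can be built on setjmp/longjmp. It is not the CPARSEC2 implementation; MyCtx, MY_TRY, my_throw, and check_positive are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical context type: a jump target plus the thrown message.
 * `msg` is volatile because it is written between setjmp and longjmp. */
typedef struct {
  jmp_buf env;
  const char* volatile msg;
} MyCtx;

/* `MY_TRY(ctx) {...} else {...}`: setjmp returns 0 on the direct call
 * and non-zero when my_throw() jumps back to it. */
#define MY_TRY(ctx) if (setjmp((ctx)->env) == 0)

/* Records the message and unwinds to the matching MY_TRY. */
static void my_throw(MyCtx* ctx, const char* msg) {
  assert(msg != NULL);
  ctx->msg = msg;
  longjmp(ctx->env, 1);
}

/* Returns NULL when x is positive, otherwise the thrown message. */
static const char* check_positive(int x) {
  MyCtx ctx;
  ctx.msg = NULL;
  MY_TRY(&ctx) {
    if (x <= 0) {
      my_throw(&ctx, "not positive");
    }
    return NULL;
  }
  else {
    return ctx.msg;
  }
}
```

In this sketch the "catch" branch is simply the else of an if statement, which is why the macro can be followed by `{…} else {…}`.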

Container Classes

Type of Containers

List(T)
Generic type of a list ; container of object sequence.
NOTE : To construct a List(T) object, use Buff(T).
ELEMENT_TYPE(List(T))
Element type of List(T).
Buff(T)
Generic type of a list builder ; variable-length buffer of object sequence for building a list.
ELEMENT_TYPE(Buff(T))
Element type of Buff(T).

Supporting concrete List(T) types

List(Char)
A type of a list container whose element type is const char.
(i.e. ELEMENT_TYPE(List(Char)) is const char.)
List(String)
A type of a list container whose element type is const char*.
(i.e. ELEMENT_TYPE(List(String)) is const char*.)
List(Int)
A type of a list container whose element type is int.
(i.e. ELEMENT_TYPE(List(Int)) is int.)
List(None)
A type of a list container whose element type is None.
(i.e. ELEMENT_TYPE(List(None)) is None.)
List(Ptr)
A type of a list container whose element type is void*.
(i.e. ELEMENT_TYPE(List(Ptr)) is void*.)

NOTE : List(Char) is same as const char* (i.e. string in C)

The following is experimental:

List(Node)
A type of a list container whose element type is Node.
(i.e. ELEMENT_TYPE(List(Node)) is Node.)

Iteration API for List(T)

To iterate elements contained in a List(T) object, use the following APIs.

ELEMENT_TYPE(List(T))* list_begin(List(T) xs)
Returns an iterator, which points to the 1st element of the list. (inclusive)
ELEMENT_TYPE(List(T))* list_end(List(T) xs)
Returns an iterator, which points to the next of the last element. (out of range)
int list_length(List(T) xs)
Returns the number of elements.

NOTE : list_begin(xs) + list_length(xs) == list_end(xs)

For example:

/* a null-terminated char sequence is also a List(Char) */
List(Char) xs = "abcdefg";

const char* itr = list_begin(xs);
const char* end = list_end(xs);
while (itr != end) {
  printf("%c\n", *itr);
  itr++;
}
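
The half-open range [begin, end) can also be modelled outside the library. The sketch below is an assumption-laden illustration, not CPARSEC2 code: IntList and the intlist_* functions are hypothetical stand-ins that only mirror the relation list_begin(xs) + list_length(xs) == list_end(xs).

```c
#include <assert.h>

/* Hypothetical list model: element count plus a pointer to the elements. */
typedef struct {
  int len;
  const int* data;
} IntList;

static const int* intlist_begin(IntList xs) { return xs.data; }
static const int* intlist_end(IntList xs) { return xs.data + xs.len; }
static int intlist_length(IntList xs) { return xs.len; }

/* Sums all elements by walking the half-open range [begin, end). */
static long intlist_sum(IntList xs) {
  assert(intlist_begin(xs) + intlist_length(xs) == intlist_end(xs));
  long total = 0;
  for (const int* itr = intlist_begin(xs); itr != intlist_end(xs); ++itr) {
    total += *itr;
  }
  return total;
}
```

Because the end iterator is never dereferenced, an empty list needs no special case: the loop body simply does not run.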

Supporting concrete Buff(T) types

Buff(Char)
A type of a list-builder whose element type is char.
(i.e. ELEMENT_TYPE(Buff(Char)) is char.)
Buff(String)
A type of a list-builder whose element type is const char*.
(i.e. ELEMENT_TYPE(Buff(String)) is const char*.)
Buff(Int)
A type of a list-builder whose element type is int.
(i.e. ELEMENT_TYPE(Buff(Int)) is int.)
Buff(None)
A type of a list-builder whose element type is None.
(i.e. ELEMENT_TYPE(Buff(None)) is None.)
Buff(Ptr)
A type of a list-builder whose element type is void*.
(i.e. ELEMENT_TYPE(Buff(Ptr)) is void*.)

The following is experimental:

Buff(Node)
A type of a list-builder whose element type is Node.
(i.e. ELEMENT_TYPE(Buff(Node)) is Node.)

List builder API for Buff(T)

To build a List(T) object, use the following APIs:

void buff_push(Buff(T)* buf, ELEMENT_TYPE(Buff(T)) x)
Appends the element x to the end of buf.
void buff_append(Buff(T)* buf, List(T) xs)
Appends the elements of xs to the end of buf.
List(T) buff_finish(Buff(T)* buf)
Creates a List(T) object and clears the contents of buf.

For example:

/* A Buff(T) object must be initialized with {0} at first. */
Buff(Int) buf = {0};

for (int i = 0; i < 10; ++i) {
  buff_push(&buf, i);
}
List(Int) xs = buff_finish(&buf);

int* itr = list_begin(xs);
int* end = list_end(xs);
while (itr != end) {
  printf("%d", *itr++);         /* -> "0123456789" */
}
printf("\n");
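
The push/finish pattern above, including the requirement to start from {0}, can be illustrated with a small growable array. This is only a sketch under assumptions: IntBuf, intbuf_push, and intbuf_finish are hypothetical, and the real Buff(T) may manage memory quite differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical builder: zero-initialised ({0}) means "empty, nothing allocated". */
typedef struct {
  int* data;
  size_t len;
  size_t cap;
} IntBuf;

/* Appends x, doubling the capacity when the buffer is full. */
static void intbuf_push(IntBuf* buf, int x) {
  if (buf->len == buf->cap) {
    size_t cap = buf->cap ? buf->cap * 2 : 8;
    int* data = realloc(buf->data, cap * sizeof(int));
    assert(data != NULL); /* sketch only: real code should report the error */
    buf->data = data;
    buf->cap = cap;
  }
  buf->data[buf->len++] = x;
}

/* Hands the elements to the caller and resets buf to the empty state. */
static int* intbuf_finish(IntBuf* buf, size_t* out_len) {
  int* data = buf->data;
  *out_len = buf->len;
  *buf = (IntBuf){0};
  return data;
}
```

In this sketch the caller owns the returned array and releases it with free(); resetting the builder to {0} is what lets it be reused for the next list.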

Additionally, Buff(Char) has the following APIs:

void buff_printf(Buff(Char)* buf, const char* format, …)
Appends a printf()-like formatted string to the end of buf.
void buff_vprintf(Buff(Char)* buf, const char* format, va_list ap)
Equivalent to buff_printf() except that it is called with a va_list instead of a variable number of arguments. See stdarg(3).

For example:

const char* name = "Bob";
int year = 2000;
int month = 4;
int day = 10;

Buff(Char) buf = {0};
buff_printf(&buf, "name: %s, ", name);
buff_printf(&buf, "birthday: %4d-%02d-%02d", year, month, day);

const char* s = buff_finish(&buf);
// -> s = "name: Bob, birthday: 2000-04-10"
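
The relationship between a variadic function and its va_list counterpart can be sketched with the standard vsnprintf. The code below is a hypothetical model, not the library's buff_printf: StrBuf, strbuf_printf, and strbuf_vprintf are invented names, and error handling is reduced to assertions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical string builder; {0} is the empty state. */
typedef struct {
  char* data; /* null-terminated once non-NULL */
  size_t len; /* length excluding the terminator */
} StrBuf;

/* va_list flavour: measure with a copy of ap, grow, then format in place. */
static void strbuf_vprintf(StrBuf* buf, const char* format, va_list ap) {
  va_list ap2;
  va_copy(ap2, ap);
  int n = vsnprintf(NULL, 0, format, ap2); /* length needed, without '\0' */
  va_end(ap2);
  assert(n >= 0);
  char* data = realloc(buf->data, buf->len + (size_t)n + 1);
  assert(data != NULL); /* sketch only: real code should report the error */
  vsnprintf(data + buf->len, (size_t)n + 1, format, ap);
  buf->data = data;
  buf->len += (size_t)n;
}

/* Variadic flavour: forwards to the va_list one. */
static void strbuf_printf(StrBuf* buf, const char* format, ...) {
  va_list ap;
  va_start(ap, format);
  strbuf_vprintf(buf, format, ap);
  va_end(ap);
}
```

The va_copy is needed because the argument list is traversed twice: once to measure the output and once to write it.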

Parser Classes

Type of Parsers

PARSER(T)
Generic type of parser.
When a parser is applied to a text (char sequence), it reads the given text and returns a corresponding value as the parsed result.
RETURN_TYPE(PARSER(T))
Type of a value to be returned by a parser of PARSER(T) type.

Supporting concrete PARSER(T) types

PARSER(Char)
A parser of PARSER(Char) type returns a char value when it is applied.
(i.e. RETURN_TYPE(PARSER(Char)) is char.)
PARSER(String)
A parser of PARSER(String) type returns a const char* value when it is applied.
(i.e. RETURN_TYPE(PARSER(String)) is const char*.)
PARSER(Int)
A parser of PARSER(Int) type returns an int value when it is applied.
(i.e. RETURN_TYPE(PARSER(Int)) is int.)
PARSER(None)
A parser of PARSER(None) type returns NONE when it is applied.
(i.e. RETURN_TYPE(PARSER(None)) is None.)
PARSER(List(Char))
A parser of PARSER(List(Char)) type returns a List(Char) value when it is applied.
(i.e. RETURN_TYPE(PARSER(List(Char))) is List(Char).)
  • NOTE :
    • PARSER(List(Char)) is same as PARSER(String), and
    • List(Char) is same as const char*.
PARSER(List(String))
A parser of PARSER(List(String)) type returns a List(String) value when it is applied.
(i.e. RETURN_TYPE(PARSER(List(String))) is List(String).)
PARSER(List(Int))
A parser of PARSER(List(Int)) type returns a List(Int) value when it is applied.
(i.e. RETURN_TYPE(PARSER(List(Int))) is List(Int).)
PARSER(List(None))
A parser of PARSER(List(None)) type returns a List(None) value when it is applied.
(i.e. RETURN_TYPE(PARSER(List(None))) is List(None).)

The following is experimental:

PARSER(Node)
A parser of PARSER(Node) type returns a Node value when it is applied.
(i.e. RETURN_TYPE(PARSER(Node)) is Node.)
PARSER(List(Node))
A parser of PARSER(List(Node)) type returns a List(Node) value when it is applied.
(i.e. RETURN_TYPE(PARSER(List(Node))) is List(Node).)

Built-in Parsers, Parser generators, and Parser combinators

parser
A functional object for parsing an input stream.
  • When a parser is applied to an input stream:
    • it takes zero or more tokens (e.g. a sequence of chars) from the input,
    • then performs some pattern matching, and
    • returns a corresponding value if it succeeded,
    • otherwise it causes an error by throwing an exception.
  • To apply a parser, use one of the following APIs (see also Core API):
    • parse(parser, src, ex)
    • parseTest(parser, text)
    • PARSE_TEST(parser, text)
    • runParser(parser, input)
parser generator
A factory method (constructor function) for creating a parser.
  • A parser generator takes one or more arguments for creating a parameterized parser.
  • Typically the given arguments are used as parameters for pattern match.
parser combinator
A factory method (constructor function) for creating a composite parser.
  • A parser combinator takes one or more parsers for creating a composite parser.
  • It is used to create a complex parser by combining one or more simple parsers.

The below table shows characteristics of built-in parsers, parser generators, and parser combinators:

  • parser column shows built-in parsers, parser generators, or parser combinators
  • other columns show resulting status of parse(parser, src, ex).
    It causes one of the following result:
    eok (empty ok)
    • parser succeeded without consuming any input.
    • Returns a corresponding value explained in that column.
    eerr (empty error)
    • parser failed without consuming any input.
    • Throws an exception via ex.
      (annotated as error in that column)
    cok (consumed ok)
    • parser succeeded after consuming some input from src.
    • Returns a corresponding value explained in that column.
    cerr (consumed error)
    • parser failed after consuming some input from src.
    • Throws an exception via ex.
      (annotated as error in that column)
  • NOTE : n/a (not applicable) means that such resulting status does not occur.
  • NOTE : NONE is a value of type None.
    • A parser of PARSER(None) type returns NONE value, when it succeeded.
    • A parser of PARSER(None) type has no meaningful value to return, so it returns NONE instead.
| parser | eok | eerr | cok | cerr |
|--------+-----+------+-----+------|
| anyChar | n/a | error | a char | n/a |
| digit | n/a | error | a decimal digit | n/a |
| hexDigit | n/a | error | a hexadecimal digit | n/a |
| octDigit | n/a | error | an octal digit | n/a |
| lower | n/a | error | a lower-case alphabet | n/a |
| upper | n/a | error | an upper-case alphabet | n/a |
| alpha | n/a | error | an alphabet | n/a |
| alnum | n/a | error | an alphabet or a decimal digit | n/a |
| letter | n/a | error | _ or an alphabet | n/a |
| newline | n/a | error | linefeed (LF) | n/a |
| crlf | n/a | error | linefeed (LF) | n/a |
| endOfLine | n/a | error | linefeed (LF) | n/a |
| endOfFile | NONE | error | n/a | n/a |
| tab | n/a | error | horizontal tab (TAB) | n/a |
| space | n/a | error | space (SPC) | n/a |
| spaces | NONE | n/a | NONE | n/a |
| number | n/a | error | an int | n/a |
| anyUtf8 | n/a | error | a UTF-8 character as string | n/a |
| char1(c) | n/a | error | char c | error |
| string1(s) | n/a | error | string s | error |
| utf8(s) | n/a | error | UTF-8 string s | error |
| oneOf(s) | n/a | error | a char included in s | error |
| noneOf(s) | n/a | error | a char not included in s | error |
| satisfy(pred) | n/a | error | c satisfying pred(c) == true | error |
| range(min, max) | n/a | error | c satisfying min <= c && c <= max | error |
| many(p) | empty list | n/a | N-elements list (N > 0) | error |
| many1(p) | n/a | error | N-elements list (N > 0) | error |
| seq(p1, …, pn) | N-elements list (N = n) | error | N-elements list (N = n) | error |
| cons(p, ps) | N-elements list (N > 0) | error | N-elements list (N > 0) | error |
| skip(p) | NONE | error | NONE | error |
| skip1st(p1, p2) | return value of p2 | error | return value of p2 | error |
| skip2nd(p1, p2) | return value of p1 | error | return value of p1 | error |
| token(p) | return value of p | error | return value of p | error |
| either(p1, p2) | return value of p1 or p2 | error | return value of p1 or p2 | error |
| tryp(p) | return value of p | error | return value of p | n/a |

Built-in Parsers

anyChar
A PARSER(Char) which parses any one char.
digit
A PARSER(Char) which parses a digit (i.e. 0 .. 9)
hexDigit
A PARSER(Char) which parses a hexadecimal digit (i.e. 0 .. 9, a .. f, and A .. F)
octDigit
A PARSER(Char) which parses an octal digit (i.e. 0 .. 7)
lower
A PARSER(Char) which parses a lower-case char (i.e. a .. z)
upper
A PARSER(Char) which parses an upper-case char (i.e. A .. Z)
alpha
A PARSER(Char) which parses an alphabet char (i.e. a .. z, A .. Z)
alnum
A PARSER(Char) which parses a digit or an alphabet char (i.e. 0 .. 9, a .. z, A .. Z)
letter
A PARSER(Char) which parses an underscore or an alphabet char (i.e. _, a .. z, A .. Z)
newline
A PARSER(Char) which parses a newline character (i.e. LF)
crlf
A PARSER(Char) which parses a pair of CR and LF, and returns LF (i.e. CR LF → LF)
endOfLine
A PARSER(Char) which parses a LF or a CR-LF pair, and returns LF.
endOfFile
A PARSER(None) which succeeds if and only if the input is at its end, and returns NONE.
tab
A PARSER(Char) which parses a TAB character.
space
A PARSER(Char) which parses a white-space (i.e. space, TAB, LF, CR)
spaces
A PARSER(None) which parses zero or more white-spaces (i.e. space, TAB, LF, CR), and returns NONE.
number
A PARSER(Int) which parses one or more digits and skips trailing white-spaces, then returns the digits as an int value.
anyUtf8
A PARSER(String) which parses any one UTF-8 character and returns it as a string.

Built-in Parser-generators

char1(c)
Create a PARSER(Char) which parses the char c.
string1(s)
Create a PARSER(String) which parses the string s.
  • NOTE : parse(string1(s), src, ex) succeeds:
    • if and only if the input from src starts with s.
    • otherwise fails without consuming any input.
utf8(s)
Create a PARSER(String) which parses the UTF-8 string s.
oneOf(cs)
Create a PARSER(Char) which parses a char c that is contained in the string cs.
noneOf(cs)
Create a PARSER(Char) which parses a char c that is not contained in the string cs.
satisfy(pred)
Create a PARSER(Char) which parses a char c satisfying pred(c) == true.
range(min, max)
Create a PARSER(Char) which parses a char c satisfying min <= c && c <= max.

Built-in GENERIC Parser-combinators

many(p)

PARSER(List(Char)) many(char c)
Same as many(char1(c)).
PARSER(List(String)) many(const char* s)
Same as many(string1(s)).
PARSER(List(T)) many(PARSER(T) p)
Create a parser of PARSER(List(T)) type, which
  • applies p zero or more times to a text.
  • returns a list of List(T) type, which consists of each result of p.
  • NOTE : parse(many(p), src, ex)
    • succeeds if p matched exactly N times (N ≥ 0) against the input from src and the (N+1)-th application of p failed without consuming any input.
    • fails if p succeeded without consuming any input.
    • otherwise fails after consuming some input.
  • NOTE : T must be a member of TYPESET(0)
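
The three outcomes listed in the NOTE above can be traced with a small standalone model of the repetition loop. This is a sketch under assumptions, not the library's many(p): Input, MiniParser, mini_digit, mini_empty, and mini_many are hypothetical, and a parser is reduced to a function that returns true on success and may advance a position.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical input cursor: text plus the current read position. */
typedef struct {
  const char* text;
  size_t pos;
} Input;

/* A parser in this model returns true on success and may advance pos. */
typedef bool (*MiniParser)(Input* in);

/* Consumes one decimal digit, or fails without consuming. */
static bool mini_digit(Input* in) {
  if (isdigit((unsigned char)in->text[in->pos])) {
    in->pos++;
    return true;
  }
  return false;
}

/* Succeeds without consuming anything (a parser the loop must reject). */
static bool mini_empty(Input* in) {
  (void)in;
  return true;
}

/* Applies p repeatedly and stores the match count in *count.
 * Returns false if p fails after consuming input, or if p succeeds
 * without consuming input (which would otherwise loop forever). */
static bool mini_many(MiniParser p, Input* in, int* count) {
  assert(p != NULL);
  *count = 0;
  for (;;) {
    size_t start = in->pos;
    bool ok = p(in);
    bool consumed = in->pos != start;
    if (ok && !consumed) return false; /* would never terminate */
    if (!ok && consumed) return false; /* consumed error propagates */
    if (!ok) return true;              /* empty error ends the repetition */
    ++*count;
  }
}
```

Rejecting a parser that succeeds on empty input is what keeps the loop from spinning at the same position indefinitely.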

many1(p)

NOTE : many1(p) is same as cons(p, many(p)).

PARSER(List(Char)) many1(char c)
Same as many1(char1(c)).
PARSER(List(String)) many1(const char* s)
Same as many1(string1(s)).
PARSER(List(T)) many1(PARSER(T) p)
Create a parser of PARSER(List(T)) type, which
  • applies p one or more times to a text.
  • returns a list of List(T) type, which consists of each result of p.
  • NOTE : parse(many1(p), src, ex)
    • succeeds if p matched exactly N times (N ≥ 1) against the input from src and the (N+1)-th application of p failed without consuming any input.
    • fails if p succeeded without consuming any input.
    • otherwise fails after consuming some input.
  • NOTE : T must be a member of TYPESET(0)

seq(p, …)

PARSER(List(T)) seq(PARSER(T) p, …)
Create a parser of PARSER(List(T)) type, which
  • applies all of the parsers p, ... to a text, in order.
  • returns a list consisting of the results of p, ....
  • if one of the parsers p, ... fails, throws its error.
  • NOTE : All parsers in p, ... must have the same type; otherwise the behavior is undefined.
  • NOTE : T must be a member of TYPESET(0)

cons(p, ps)

PARSER(List(Char)) cons(char c, const char* cs)
Same as cons(char1(c), string1(cs)).
PARSER(List(Char)) cons(char c, PARSER(List(Char)) ps)
Same as cons(char1(c), ps).
PARSER(List(String)) cons(const char* s, PARSER(List(String)) ps)
Same as cons(string1(s), ps).
PARSER(List(T)) cons(PARSER(T) p, PARSER(List(T)) ps)
Create a parser of PARSER(List(T)) type, which
  • applies p first, and then applies ps to the subsequent text.
  • returns a list consisting of the following:
    • the result of p and
    • the elements of the result of ps.
  • if p or ps fails, throws its error.
  • NOTE : T must be a member of TYPESET(0)

skip(p)

PARSER(None) skip(char c)
Same as skip(char1(c)).
PARSER(None) skip(const char* s)
Same as skip(string1(s)).
PARSER(None) skip(PARSER(T) p)
Create a parser of PARSER(None) type, which
  • applies p and returns NONE.
  • the value returned by p is discarded.
  • if p fails, throws the error of p.
  • NOTE : T must be a member of TYPESET(1)

skip1st(p1, p2)

PARSER(Char) skip1st(char c1, char c2)
Same as skip1st(char1(c1), char1(c2)).
PARSER(Char) skip1st(const char* s, char c)
Same as skip1st(string1(s), char1(c)).
PARSER(Char) skip1st(PARSER(S) p, char c)
Same as skip1st(p, char1(c)).
PARSER(String) skip1st(char c, const char* s)
Same as skip1st(char1(c), string1(s)).
PARSER(String) skip1st(const char* s1, const char* s2)
Same as skip1st(string1(s1), string1(s2)).
PARSER(String) skip1st(PARSER(S) p, const char* s)
Same as skip1st(p, string1(s)).
PARSER(T) skip1st(char c, PARSER(T) p)
Same as skip1st(char1(c), p).
PARSER(T) skip1st(const char* s, PARSER(T) p)
Same as skip1st(string1(s), p).
PARSER(T) skip1st(PARSER(S) p1, PARSER(T) p2)
Create a parser of PARSER(T) type, which
  • applies p1 first, and then applies p2 to the subsequent text.
  • returns the result of p2 if both p1 and p2 succeed.
  • if p1 fails, p2 is not applied and the error of p1 is thrown.
  • if p1 succeeds and then p2 fails, the error of p2 is thrown.
  • NOTE : S and T must each be a member of TYPESET(1)
  • NOTE : S and T may or may not be the same.
    (i.e. p1 and p2 may be parsers of the same type or of different types)

For example:

parseTest(skip1st(char1('a'), string1("bc")), "abc"); // -> "bc"
parseTest(skip1st(string1("ab"), char1('c')), "abc"); // -> 'c'

skip2nd(p1, p2)

PARSER(Char) skip2nd(char c1, char c2)
Same as skip2nd(char1(c1), char1(c2)).
PARSER(Char) skip2nd(char c, const char* s)
Same as skip2nd(char1(c), string1(s)).
PARSER(Char) skip2nd(char c, PARSER(S) p)
Same as skip2nd(char1(c), p).
PARSER(String) skip2nd(const char* s, char c)
Same as skip2nd(string1(s), char1(c)).
PARSER(String) skip2nd(const char* s1, const char* s2)
Same as skip2nd(string1(s1), string1(s2)).
PARSER(String) skip2nd(const char* s, PARSER(S) p)
Same as skip2nd(string1(s), p).
PARSER(T) skip2nd(PARSER(T) p, char c)
Same as skip2nd(p, char1(c)).
PARSER(T) skip2nd(PARSER(T) p, const char* s)
Same as skip2nd(p, string1(s)).
PARSER(T) skip2nd(PARSER(T) p1, PARSER(S) p2)
Create a parser of PARSER(T) type, which
  • applies p1 first, and then applies p2 to the subsequent text.
  • returns the result of p1 if both p1 and p2 succeed.
  • if p1 fails, p2 is not applied and the error of p1 is thrown.
  • if p1 succeeds and then p2 fails, the error of p2 is thrown.
  • NOTE : S and T must each be a member of TYPESET(1)
  • NOTE : S and T may or may not be the same.
    (i.e. p1 and p2 may be parsers of the same type or of different types)

For example:

parseTest(skip2nd(char1('a'), string1("bc")), "abc"); // -> 'a'
parseTest(skip2nd(string1("ab"), char1('c')), "abc"); // -> "ab"

token(p)

NOTE : token(p) is same as skip2nd(p, spaces).

PARSER(Char) token(char c)
Same as token(char1(c)).
PARSER(String) token(const char* s)
Same as token(string1(s)).
PARSER(T) token(PARSER(T) p)
Create a parser of PARSER(T) type, which
  • applies p first, then
  • skips any trailing white-spaces, and
  • returns the result of p.
  • NOTE : T must be a member of TYPESET(1)

either(p1, p2)

PARSER(Char) either(char c1, char c2)
Same as either(char1(c1), char1(c2)).
PARSER(Char) either(char c, PARSER(Char) p)
Same as either(char1(c), p).
PARSER(Char) either(PARSER(Char) p, char c)
Same as either(p, char1(c)).
PARSER(String) either(const char* s1, const char* s2)
Same as either(string1(s1), string1(s2)).
PARSER(String) either(const char* s, PARSER(String) p)
Same as either(string1(s), p).
PARSER(String) either(PARSER(String) p, const char* s)
Same as either(p, string1(s)).
PARSER(T) either(PARSER(T) p1, PARSER(T) p2)
Create a parser of PARSER(T) type, which
  • returns the result of p1 if p1 succeeded,
  • throws the error of p1 if p1 failed after consuming one or more chars,
  • returns the result of p2 if p1 failed without consuming any chars and p2 succeeded, or
  • throws the error of p2 otherwise.
  • NOTE : T must be a member of TYPESET(1)

tryp(p)

PARSER(Char) tryp(char c)
Same as tryp(char1(c)).
PARSER(String) tryp(const char* s)
Same as tryp(string1(s)).
PARSER(T) tryp(PARSER(T) p)
Create a parser of PARSER(T) type, which
  • returns the result of p if p succeeds,
  • otherwise rewinds the input state back to where p started, then throws the error of p.
  • NOTE : T must be a member of TYPESET(1)
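
The interplay between either and tryp is easiest to see with a first alternative that fails after consuming input. The sketch below is a standalone model, not CPARSEC2 code: Input, match_literal, try_literal, and either_literal are hypothetical, and match_literal deliberately consumes char by char (unlike string1, which is documented above as failing without consuming).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical input cursor: text plus the current read position. */
typedef struct {
  const char* text;
  size_t pos;
} Input;

/* Matches s one char at a time; on a mismatch the chars already
 * matched stay consumed (a "consumed error"). */
static bool match_literal(Input* in, const char* s) {
  assert(s != NULL);
  for (; *s != '\0'; ++s) {
    if (in->text[in->pos] != *s) return false;
    in->pos++;
  }
  return true;
}

/* tryp-like wrapper: on failure, rewind to where the attempt started. */
static bool try_literal(Input* in, const char* s) {
  size_t start = in->pos;
  if (match_literal(in, s)) return true;
  in->pos = start;
  return false;
}

/* either-like choice: s2 is attempted only if s1 failed without consuming.
 * With backtrack == true, s1 is wrapped in the tryp-like rewind. */
static bool either_literal(Input* in, const char* s1, const char* s2,
                           bool backtrack) {
  size_t start = in->pos;
  bool ok = backtrack ? try_literal(in, s1) : match_literal(in, s1);
  if (ok) return true;
  if (in->pos != start) return false; /* consumed error: give up */
  return match_literal(in, s2);
}
```

On the input "abd" with alternatives "abc" and "abd", the plain choice fails because the first alternative has already consumed "ab", while the backtracking variant rewinds and lets the second alternative succeed.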

Building block of Parser-class

Declares/Defines new PARSER(T) class

NOTE : This section is currently intended mainly for developers of the CPARSEC2 library, not for its users.

TYPEDEF_PARSER(T, R)
Define new concrete PARSER(T) type and RETURN_TYPE(PARSER(T)).

A parser of type PARSER(T) returns a value of type R when the parser is applied to a text.
(i.e. RETURN_TYPE(PARSER(T)) will be R)

DECLARE_PARSER(T)
Declares functions/methods for PARSER(T).
DEFINE_PARSER(T)
Defines functions/methods for PARSER(T).
void SHOW(T)(R x) { /* print x; */ }
Defines function void SHOW(T)(R x).
  • NOTE : void SHOW(T)(R x) is called by parseTest(p, text) to print x.
  • NOTE : x is the result of parser p applied to the text.

Example: ‘IntParser.h’

#include <cparsec2.h>

/* Defines PARSER(Int) type, and RETURN_TYPE(PARSER(Int)) as int */
TYPEDEF_PARSER(Int, int);
/* Declares functions/methods for PARSER(Int) */
DECLARE_PARSER(Int);

Example: ‘IntParser.c’

#include "IntParser.h"

/* Defines (implement) functions/methods for PARSER(Int) */
DEFINE_PARSER(Int);
/* and defines void SHOW(Int)(int x) */
void SHOW(Int)(int x) {
  printf("%d", x);
}

Construct an instance of PARSER(T) class

PARSER(T) PARSER_GEN(T)(PARSER_FN(T) f, void* arg)
Create new instance of PARSER(T).
f is used as the function body of the parser instance, and arg is the argument passed to f when the parser instance is applied to a text.
PARSER_FN(T)
Type of function body of a parser instance of PARSER(T) type.
PARSER_FN(T) is the type of function pointer RETURN_TYPE(PARSER(T)) (*)(void* arg, Source src, Ctx* ex).

For example, PARSER_GEN(Int) and PARSER_FN(Int) are defined as follows:

typedef int (* PARSER_FN(Int))(void* arg, Source src, Ctx* ex);
PARSER(Int) PARSER_GEN(Int)(PARSER_FN(Int) f, void* arg);

Example of Parser-generator PARSER(Int) mult(int a)

Below is an example of a parser generator mult(int a), which

  • creates a parser of PARSER(Int) type.
    • When the parser is applied to one or more digits,
      • it returns their int value multiplied by a.

Example: ‘mult.h’

#include "IntParser.h"

/* a parser generator 'mult(a)' */
PARSER(Int) mult(int a);

Example: ‘mult.c’

#include <stdint.h> /* intptr_t */
#include <stdlib.h> /* atoi */
#include "IntParser.h"

/* function body of a parser to be generated by mult(a) */
static int mult_func(void* arg, Source src, Ctx* ex) {
  int a = (int)(intptr_t)arg;
  return a * atoi(parse(many1(digit), src, ex));
}

/* a parser generator 'mult(a)' */
PARSER(Int) mult(int a) {
  /* construct an instance of PARSER(Int) */
  return PARSER_GEN(Int)(mult_func, (void*)(intptr_t)a);
}
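
The idea behind PARSER_GEN(T)(f, arg), pairing a function body with a bound argument, can be modelled without the library. The following is a hypothetical sketch: BoundParser, bound_gen, bound_apply, mult_body, and bound_mult are invented names, and the "input" is simplified to a plain string instead of a Source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser object: a function body plus the argument bound to it. */
typedef int (*BoundFn)(void* arg, const char* text);
typedef struct {
  BoundFn fn;
  void* arg;
} BoundParser;

/* Constructs a parser object from a function body and its argument. */
static BoundParser bound_gen(BoundFn fn, void* arg) {
  assert(fn != NULL);
  return (BoundParser){.fn = fn, .arg = arg};
}

/* Applying the parser calls the body with the stored argument. */
static int bound_apply(BoundParser p, const char* text) {
  return p.fn(p.arg, text);
}

/* Function body: recover the int packed into arg and scale the parsed number. */
static int mult_body(void* arg, const char* text) {
  int a = (int)(intptr_t)arg;
  return a * (int)strtol(text, NULL, 10);
}

/* Generator: packs a into the void* slot via intptr_t. */
static BoundParser bound_mult(int a) {
  return bound_gen(mult_body, (void*)(intptr_t)a);
}
```

Casting through intptr_t avoids a heap allocation for a single int; a larger parameter set would need a pointer to a struct instead.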

Apply an instance of PARSER(T) to a text

To apply a parser, use the parse(p, src, ctx), parseTest(p, text), or PARSE_TEST(p, text) macros. These macros are fully generic and easy to use.

The example below uses parse(p, src, ctx).

Example: ‘main.c’

#include <stdio.h>
#include "mult.h"

int main(int argc, char** argv) {
  UNUSED(argc);
  UNUSED(argv);

  /* initialize CPARSEC2 library */
  cparsec2_init();

  Ctx ctx;
  TRY(&ctx) {
    /* input text is "100 200" */
    Source src = Source_new("100 200");
    /* parse the input text */
    int x = parse(mult(1), src, &ctx); /* x = 1 * 100 */
    parse(spaces, src, &ctx);          /* skip white-spaces */
    int y = parse(mult(2), src, &ctx); /* y = 2 * 200 */
    /* print x + y */
    printf("%d\n", x + y);
    return 0;
  }
  else {
    printf("error:%s\n", ctx.msg);
    return 1;
  }
}

Building block of Parser-combinators

Declares/Defines new Parser-combinators

For example in case of many(p) :

/* Name of MANY(T) */
#define MANY(T) CAT(many_, T)

/* Generic macro function `many(p)` */
#define many(p) (GENERIC_P(PARSER_CAST(p), MANY, TYPESET(0))(PARSER_CAST(p)))

// For example:
// - `many("abc")` is expanded to `MANY(String)(string1("abc"))`
// - `many(number)` is expanded to `MANY(Int)(number)`

/* Generic function prototype `MANY(T)(p)` */
#define DECLARE_MANY(T) PARSER(List(T)) MANY(T)(PARSER(T) p)

/* Declares `PARSER(List(T)) MANY(T)(PARSER(T) p);` for each T in TYPESET(0) */
FOREACH(DECLARE_MANY, TYPESET(0));

// `FOREACH(DECLARE_MANY, TYPESET(0));` expands as follows:
// ~~~c
// PARSER(List(Char)) MANY(Char)(PARSER(Char) p);
// PARSER(List(String)) MANY(String)(PARSER(String) p);
// PARSER(List(Int)) MANY(Int)(PARSER(Int) p);
// ~~~

/* Implementation of `MANY(T)(p)` */
#define DEFINE_MANY(T)                          \
  PARSER(List(T)) MANY(T)(PARSER(T) p) {        \
    /* implementation of MANY(T)(p) */          \
  }                                             \
  END_OF_STATEMENTS

/* Defines `PARSER(List(T)) MANY(T)(PARSER(T) p)` for each T in TYPESET(0) */
FOREACH(DEFINE_MANY, TYPESET(0));

// `FOREACH(DEFINE_MANY, TYPESET(0));` expands as follows:
// ~~~c
// PARSER(List(Char)) MANY(Char)(PARSER(Char) p) {
//   /* implementation of MANY(T)(p) */
// }
// _Static_assert(1, "");
// PARSER(List(String)) MANY(String)(PARSER(String) p) {
//   /* implementation of MANY(T)(p) */
// }
// _Static_assert(1, "");
// PARSER(List(Int)) MANY(Int)(PARSER(Int) p) {
//   /* implementation of MANY(T)(p) */
// }
// _Static_assert(1, "");
// ~~~

PARSER_CAST(expr)

PARSER_CAST(expr) casts expr to a parser.

  • if expr was a parser of supported PARSER(T) type, returns expr itself.
  • if expr was a char or const char, returns char1(expr).
  • if expr was a char* or const char*, returns string1(expr).

GENERIC_METHOD(expr, C, F, args…)

GENERIC_METHOD(expr, C, F, args...) is a macro function to define a “C11 _Generic selection” expression.

  • GENERIC_METHOD(expr, C, F, args...) is expanded to _Generic(expr, C(T) : F(T), ...) for each T in args....

GENERIC_P(expr, F, args…)

GENERIC_P(expr, F, args...) is a macro function to define a “C11 _Generic selection” expression.

  • GENERIC_P(expr, F, args...) is expanded to _Generic(expr, PARSER(T) : F(T), ...) for each T in args....
  • Same as GENERIC_METHOD(expr, PARSER, F, args...).
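
For readers unfamiliar with C11 _Generic, the sketch below writes out by hand the kind of `type : function` association list that such a macro would generate. It is an illustration under assumptions, not CPARSEC2 code: DESCRIBE, describe, and the describe_* functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-type functions, named by pasting the type name. */
#define DESCRIBE(T) describe_##T
static const char* DESCRIBE(Char)(char c) { (void)c; return "Char"; }
static const char* DESCRIBE(Int)(int i) { (void)i; return "Int"; }
static const char* DESCRIBE(String)(const char* s) {
  assert(s != NULL);
  return "String";
}

/* One `type : function` association per supported type; the selected
 * function is then called with the original expression. */
#define describe(x)                 \
  _Generic((x),                     \
           char: DESCRIBE(Char),    \
           int: DESCRIBE(Int),      \
           char*: DESCRIBE(String), \
           const char*: DESCRIBE(String))(x)
```

Note that a character constant such as 'a' has type int in C, so describe('a') selects the Int branch while describe((char)'a') selects Char. This is consistent with the explicit (char) casts used in the Person example later in this document.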

FOREACH(F, args…)

FOREACH(F, args...) is a macro function for unrolling statements.

  • FOREACH(F, args...) is expanded to F(T); for each T in args....
#define F(T) T CAT(T, _value)
FOREACH(F, char, int, double);
// `FOREACH(F, x, y, z)` is expanded to `F(x); F(y); F(z)`
// Therefore the above expands as follows:
// ~~~c
// char char_value;
// int int_value;
// double double_value;
// ~~~
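
One common way to build a FOREACH-style macro is to count the arguments and dispatch to a fixed-arity helper. The sketch below shows that technique for up to four arguments; it is not the CPARSEC2 definition, and MY_FOREACH, the FE_* helpers, and DECLARE_VALUE are hypothetical names.

```c
#include <assert.h>

/* Two-step paste so that the argument count is expanded before pasting. */
#define FE_CAT_(a, b) a##b
#define FE_CAT(a, b) FE_CAT_(a, b)

/* Counts 1 to 4 arguments by shifting a descending number list. */
#define FE_NARGS_(_1, _2, _3, _4, N, ...) N
#define FE_NARGS(...) FE_NARGS_(__VA_ARGS__, 4, 3, 2, 1, 0)

/* One helper per arity; each peels off the first argument. */
#define FE_1(F, a) F(a)
#define FE_2(F, a, ...) F(a); FE_1(F, __VA_ARGS__)
#define FE_3(F, a, ...) F(a); FE_2(F, __VA_ARGS__)
#define FE_4(F, a, ...) F(a); FE_3(F, __VA_ARGS__)

/* MY_FOREACH(F, x, y, z) expands to F(x); F(y); F(z) */
#define MY_FOREACH(F, ...) FE_CAT(FE_, FE_NARGS(__VA_ARGS__))(F, __VA_ARGS__)

/* Declares one variable per type name: char_value, int_value, double_value. */
#define DECLARE_VALUE(T) static T T##_value
MY_FOREACH(DECLARE_VALUE, char, int, double);

/* Uses the generated variables to show that they really exist. */
static double sum_values(void) {
  assert(sizeof(int_value) == sizeof(int));
  return char_value + int_value + double_value;
}
```

The last expansion has no trailing semicolon, which is why the invocation itself is written with one, just like an ordinary declaration.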

TYPESET

Set of type-names.

TYPESET(0)
A set of type-names for parser-combinators.
TYPESET(0) is expanded to Char, String, Int, None, Node.
TYPESET(1)
Another set of type-names for parser-combinators.
TYPESET(1) is expanded to Char, String, Int, None, Node, List(String), List(Int), List(None), List(Node).
ELEMENT_TYPESET
A set of type-names for element-type of generic-containers.
ELEMENT_TYPESET is expanded to Ptr, Char, String, Int, None, Node

These macros are convenient and easy to use with

  • GENERIC_METHOD(expr, C, F, ...),
  • GENERIC_P(expr, F, ...), and
  • FOREACH(F, ...).

For example :

GENERIC_P(expr, F, TYPESET(0))(expr);

The above code is expanded as follows:

_Generic(expr, PARSER(Char) : F(Char), PARSER(String) : F(String), PARSER(Int) : F(Int))(expr);

Extends CPARSEC2 library for user defined types

  • Where is List(float)?
  • Where is PARSER(double)?
  • How to add List(UserType), Buff(UserType), and PARSER(UserType)?

Good question!

The CPARSEC2 library is designed to be extended with user-defined types.

To extend the library,

  1. Define CPARSEC2_USER_TYPESET macro, before #include <cparsec2.h>
  2. Call CPARSEC2_DEFINE_USER_TYPESET() macro, after #include <cparsec2.h>
  3. Define void SHOW(U)(U x) function for each type U in CPARSEC2_USER_TYPESET.

The following sections show an example of how to add a user-defined type Person to the CPARSEC2 library. See also example/extend_cparsec2.

Example of extended CPARSEC2 library

Here is an example that extends the CPARSEC2 library with the user-defined type Person.

my_cparsec2.h - a header file of extended CPARSEC2 library:

// (1) Define types
typedef struct {
  const char* name;
  int age;
} Person;

// (2) Define 'CPARSEC2_USER_TYPESET' macro, 
//     which should be a list of type names. (comma-separated)
#define CPARSEC2_USER_TYPESET Person

// (3) Then include 'cparsec2.h'
#include <cparsec2.h>

my_cparsec2.c - an implementation of extended CPARSEC2 library:

// (1) include 'my_cparsec2.h'
#include "my_cparsec2.h"

// (2) Call CPARSEC2_DEFINE_USER_TYPESET() macro, which defines implementation
//     of the following classes/APIs for each U in CPARSEC2_USER_TYPESET:
//     - Buff(U),
//     - List(U),
//     - PARSER(U),
//     - PARSER(List(U)),
//     - All APIs for these classes such as `parse(p, src, ex)`, and
//     - All parser-combinators such as `many(p)`, `either(p1, p2)`, etc.
CPARSEC2_DEFINE_USER_TYPESET();

// (3) Define function `void SHOW(U)(U x)` for each U in CPARSEC2_USER_TYPESET.
void SHOW(Person)(Person x) {
  printf("<Person:.name=\"%s\", .age=%d>", x.name, x.age);
}

That’s all.

Sample application for the extended CPARSEC2 library

Now you can use Buff(Person), List(Person), PARSER(Person), PARSER(List(Person)), all APIs for these classes, and all parser-combinators for PARSER(Person).

Here is the application sample:

main.c - an application using the extended CPARSEC2 library:

#include "my_cparsec2.h"

// implementation of 'person' parser
Person person_fn(void* arg, Source src, Ctx* ex) {
  UNUSED(arg);
  parse(token((char)'('), src, ex);
  const char* name = parse(token(many1(letter)), src, ex);
  parse(token((char)','), src, ex);
  int age = parse(number, src, ex);
  parse(token((char)')'), src, ex);
  return (Person){.name = name, .age = age};
}

int main() {
  cparsec2_init();

  PARSER(Person) person = PARSER_GEN(Person)(person_fn, NULL);
  PARSE_TEST(many(person), "(Alice, 19) (Bob, 24)");
  // -> [<Person:.name="Alice", .age=19>, <Person:.name="Bob", .age=24>]

  cparsec2_end();
}