Tagged unions/product types for results #35

Open
deliciouslytyped opened this issue Dec 9, 2019 · 11 comments

@deliciouslytyped

seq provides a convenient way to return a nicely structured result (close enough to, if not exactly, what you want most of the time), namely by using it with kwargs.
e.g.

>>> seq(a=any_char, b=any_char).parse("ab")
{'a': 'a', 'b': 'b'}

This is basically an encoding of a product type.

It would be nice if there were a similarly convenient way to return something analogous for alternatives (i.e. alt or the | operator), whose result can be encoded as a tagged union, i.e. an encoding of a sum type.

Example pseudocode:

>>> alt(a=string("a"), b=string("b")).parse("a")
{'a': 'a'}
>>> alt(a=string("a"), b=string("b")).parse("b")
{'b': 'b'}

I don't see a way to use this together with an operator without hacks like passing it tuples (e.g. (name, parser) | (name, parser)), but I think this would still be nice to have for the alt() form.

@deliciouslytyped
Author

deliciouslytyped commented Dec 9, 2019

A thrown together example, based on seq:

from parsy import Parser, Result, fail

def alt(*parsers, **kw_parsers):
    if not parsers and not kw_parsers:
        return fail('<empty alt>')

    if parsers and kw_parsers:
        raise ValueError("Use either positional arguments or keyword arguments with alt, not both")

    if parsers:
        @Parser
        def alt_parser(stream, index):
            result = None
            for parser in parsers:
                result = parser(stream, index).aggregate(result)
                if result.status:
                    return result

            return result

        return alt_parser

    else:
        @Parser
        def alt_kwarg_parser(stream, index):
            result = None
            for name, parser in kw_parsers.items():
                result = parser(stream, index).aggregate(result)
                if result.status:
                    # I'm not actually sure if this is the correct way to use aggregate here; I haven't looked at how it's supposed to work.
                    return Result.success(result.index, { name: result.value }).aggregate(result)

            return result

        return alt_kwarg_parser

(This also needs a Python >= 3.6 check, since it relies on keyword arguments preserving their order.)

@jneen
Contributor

jneen commented Dec 12, 2019

I would prefer (name, value) for this, but it seems reasonable to me!

@jneen
Contributor

jneen commented Dec 12, 2019

Reason being: you could follow it up with combine, like alt(a=..., b=...).combine(Node) and would receive Node(type, value)

@deliciouslytyped
Author

deliciouslytyped commented Dec 13, 2019

I'm making this up as I go and as I use parsy more.

What you said sounds reasonable.

My rationale for using dicts is that they let me directly, and somewhat meaningfully, address the component of the result that I want, as opposed to using some "arbitrary" numerical index, which might possibly (?) change.

I don't know right now which should be done, or if it's possible to have both.
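
One way to get both, sketched in plain Python: a namedtuple behaves like a `(name, value)` tuple but also offers named access. The type name `Tagged` and its field names are assumptions for illustration, not anything parsy provides:

```python
from collections import namedtuple

# Hypothetical result type; `Tagged`, `tag` and `value` are assumed names.
Tagged = namedtuple("Tagged", ["tag", "value"])

result = Tagged("word", "abcd")

# Positional access and unpacking, as with a plain (name, value) tuple.
name, value = result
assert result[0] == "word"

# Named access, so calling code does not depend on a numeric index.
assert result.value == "abcd"

# Dict-style form, close to the {name: value} shape from the first comment.
as_dict = {result.tag: result.value}
print(as_dict)  # {'word': 'abcd'}
```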

@spookylukey
Member

@deliciouslytyped Could you give a full example of what you are thinking of? The kind of thing that could potentially go in our docs? For me at least that would help assess whether this would fit into the existing patterns we are encouraging in parsy.

@jneen
Contributor

jneen commented Dec 17, 2019

if i understand correctly, it's something like:

letters = regex(r'[a-zA-Z]+')  # \w would also match digits, so 'word' would win for '1234'
numbers = regex(r'\d+')
atom = alt(word=letters, int=numbers.map(int))

atom.parse('abcd') # => ('word', 'abcd')
atom.parse('1234') # => ('int', 1234)

@jneen
Contributor

jneen commented Dec 17, 2019

without pattern matching, it's difficult for python to actually make use of those tuple values, but those seem more correct in general.

@jneen
Contributor

jneen commented Dec 17, 2019

i mean, something hacky like this would work, though.

match = lambda **fs: lambda v: fs[v[0]](v[1])  # **fs so handlers can be passed by keyword

atom.map(match(word=lambda w: ..., int=lambda i: ...))
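
A slightly more explicit version of this dispatch idea, as a sketch only. The helper name `dispatch` is hypothetical, and it takes the handlers as keyword arguments so that it can be called in the style shown above. It works on plain `(tag, value)` tuples and needs nothing from parsy:

```python
def dispatch(**handlers):
    """Hypothetical helper: build a function that takes a (tag, value) tuple
    and calls the handler registered under that tag."""
    def run(tagged):
        tag, value = tagged
        try:
            handler = handlers[tag]
        except KeyError:
            # Fail loudly if a tag has no handler, instead of a bare KeyError.
            raise ValueError(f"no handler for tag {tag!r}") from None
        return handler(value)
    return run


describe = dispatch(
    word=lambda w: f"word of length {len(w)}",
    int=lambda i: f"integer, doubled: {i * 2}",
)

print(describe(("word", "abcd")))  # word of length 4
print(describe(("int", 1234)))     # integer, doubled: 2468
```

The resulting function can be passed to map on any parser whose result is a `(tag, value)` tuple.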

@spookylukey
Member

Thanks @jneen for the explanation.

Currently we have the tag method that can do something very similar:

letters = regex(r'[a-zA-Z]+')  # \w would also match digits
numbers = regex(r'\d+')
atom = letters.tag('word') | numbers.tag('int')

One of the purposes of this is to be able to pass to combine/combine_dict - see docs - as you suggest.

Regarding building this into alt, I'm wondering whether there is particular value in having another way to do the same thing. Without proper sum types in Python, it's difficult to know exactly what pattern to encourage for building something that emulates them.

@deliciouslytyped
Author

Sorry, I'm super tired right now, I think my use of dicts might have been minimally analogous to pattern matching, but I'm not sure.

@deliciouslytyped
Author

deliciouslytyped commented Dec 17, 2019

Ok tag looks quite relevant. On the other hand you can choose to not tag some elements of the alternative. I'm not sure if that's desirable.

seq does force using one or the other of positional/kwargs, but not mixing.
