Proposal for Sequence, Map, and Array Decomposition #8

Open
wants to merge 8 commits into
base: master
119 changes: 119 additions & 0 deletions tuple-decomposition.md
@@ -0,0 +1,119 @@
# Tuple Decomposition

**Author**: Reece H. Dunn. 67 Bricks.

This proposal allows fixed length sequences and arrays to be decomposed and assigned to separate variables in a single declaration.


## Description

\[Definition: A *tuple sequence* is a fixed length sequence, where the items in the sequence represent distinct parts of an object, not an ordered collection of objects.\] For example, a 2D point could be represented as a tuple sequence, where the first value is the x coordinate and the second value is the y coordinate.

\[Definition: A *tuple array* is a fixed length array, where the items in the array represent distinct parts of an object, not an ordered collection of objects.\] The 2D point example could also be represented as an array with 2 values. Unlike a sequence, an array can hold an empty sequence as one of its members, so an array can also be used for objects that contain optional items.

\[Definition: A *tuple* is a tuple sequence or tuple array.\]

Given a tuple such as `(1, 2, 3)`, the values within that sequence or array cannot be bound to separate variables directly. With the current version of XPath and XQuery, the tuple needs to be assigned to a temporary variable first. For example:

    let $result := get-camera-point()
    let $x := $result[1]
    let $y := $result[2]
    let $z := $result[3]
    return "(" || $x || "," || $y || "," || $z || ")"

This proposal would allow this to be written more concisely as:

    let ($x, $y, $z) := get-camera-point()
    return "(" || $x || "," || $y || "," || $z || ")"

An XPath or XQuery processor may implement this by transforming it to the expanded form above with `$result` being a unique variable that is not visible to the expression.

For tuple sequences, `$tuple[N]` would be used to extract the nth item in the tuple sequence. If the item does not exist, an empty sequence is returned.
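
The empty-sequence behaviour can be checked against the expanded form in XQuery 3.1 as it exists today. This is a minimal sketch; the two-item `$tuple` is an assumed stand-in for a function result that is shorter than the three variables being bound:

```xquery
xquery version "3.1";

(: Assumed input: a tuple sequence with only two items. :)
let $tuple := (10, 20)
let $x := $tuple[1]
let $y := $tuple[2]
let $z := $tuple[3]  (: there is no third item, so $z is the empty sequence :)
return (fn:count($x), fn:count($y), fn:count($z))
(: returns (1, 1, 0) :)
```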

For tuple arrays, `$tuple(N)` or `array:get($tuple, N)` would be used to extract the nth item in the tuple array. If the item does not exist, an `err:FOAY0001` (array index out of bounds) error will be raised.
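
For comparison, the same out-of-range access on an array raises the error rather than yielding an empty sequence. A minimal sketch in XQuery 3.1, with the `err` prefix declared explicitly in case the processor does not predeclare it; the two-member `$tuple` is an assumed input:

```xquery
xquery version "3.1";
declare namespace err = "http://www.w3.org/2005/xqt-errors";

(: Assumed input: a tuple array with only two members. :)
let $tuple := [10, 20]
return
  try {
    $tuple(3)  (: there is no third member :)
  } catch err:FOAY0001 {
    "array index out of bounds"
  }
```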

This would apply to any variable declaration or binding where `:=` is used to bind a variable to the value of an expression. Specifically:

1. variable declarations, including decomposition of default values (XQuery)
1. context item declaration (XQuery)
1. let clauses (XPath/XQuery)
1. for clauses (XPath/XQuery)
1. grouping spec (XQuery)
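
The proposal does not spell out the expansion for each of these binding sites. As one possible reading for the for clause, each item bound by the clause would itself be a tuple array that is then decomposed. The sketch below writes that reading out by hand in XQuery 3.1; the input points and the variable names are assumptions made for illustration:

```xquery
xquery version "3.1";

(: Each item bound by the for clause is a two-member tuple array. :)
for $point in ([1, 2], [3, 4])
let $x := $point(1)
let $y := $point(2)
return $x + $y
(: returns (3, 7) :)
```

A sequence of tuple sequences would not work here, because nested sequences are flattened before the for clause iterates over them.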

### Variation: Separate Tuple Array Decomposition Syntax

If the tuple decomposition is being performed on a tuple array, it may be better to use array syntax to define the decomposition:

    let [$x, $y, $z] := get-camera-point()
    return "(" || $x || "," || $y || "," || $z || ")"

The `(...)` syntax would then be *tuple sequence decomposition*, while the `[...]` syntax would be *tuple array decomposition*.

This would allow XPath/XQuery processors to report an error when tuple sequence decomposition is used on a tuple array, or when tuple array decomposition is used on a tuple sequence.

This would make it clearer to the user when a sequence is expected and when an array is expected, and thus when out of bounds access would result in an empty sequence or an error.

This does add complexity to the language grammar, but it lets a processor determine during the static analysis phase whether the value being decomposed is expected to be a sequence or an array, and therefore how to decompose it.
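
Without separate syntax, a query that needs to guarantee which kind of tuple it received has to check at run time. A minimal sketch in XQuery 3.1, where `local:require-array` and the error name `local:not-a-tuple-array` are hypothetical helpers invented for this illustration and are not part of the proposal:

```xquery
xquery version "3.1";

(: Hypothetical helper: pass an array through unchanged, reject anything else. :)
declare function local:require-array($value as item()*) as array(*) {
  if ($value instance of array(*))
  then $value
  else fn:error(xs:QName("local:not-a-tuple-array"), "expected a tuple array")
};

let $point := local:require-array([1, 2, 3])
return $point(1) + $point(2) + $point(3)
(: returns 6; passing the sequence (1, 2, 3) instead raises the error :)
```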

### Assigning the rest of a sequence or array

It can be useful to only extract part of a sequence or array (e.g. the heading of a table), and store the rest of the items in another variable. For example:

    let ($heading as array(xs:string), $rows as array(xs:string)*) := load-csv("test.csv")

It may be useful to define a shorthand for selecting the rest of the sequence or array. Using the CSV example above:

    let ($heading, $rows*) := load-csv("test.csv")

The other occurrence indicators would also be usable after the last variable binding.
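
For reference, XQuery 3.1 can already split off the first item with `fn:head` and `fn:tail` for sequences, and with `array:head` and `array:tail` for arrays. The sketch below shows what the shorthand would save; the inline `$csv` value is an assumed stand-in for the result of `load-csv("test.csv")`:

```xquery
xquery version "3.1";
declare namespace array = "http://www.w3.org/2005/xpath-functions/array";

(: Assumed stand-in for load-csv("test.csv"): a sequence of row arrays. :)
let $csv := (["name", "age"], ["Ann", "31"], ["Bob", "27"])
let $heading := fn:head($csv)   (: the first row :)
let $rows := fn:tail($csv)      (: every remaining row :)

(: The same split applied to a tuple array. :)
let $point := [1, 2, 3]
let $first := array:head($point)  (: the first member :)
let $rest := array:tail($point)   (: an array of the remaining members :)
return (array:size($heading), fn:count($rows), $first, array:size($rest))
(: returns (2, 2, 1, 2) :)
```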

### Influences

Tuple decomposition is found in various languages such as Python, Scala, and C#. These languages also have support for tuple types.

Python has support for specifying that a variable is assigned the remaining values in the tuple.

## Use Cases

There are many cases where fixed size sequences (*tuple sequences*) may be used, such as points, complex and rational numbers, sin/cos, and mul/div. Tuple decomposition makes extracting data from these simpler, and can also aid readability by assigning descriptive names to each of the tuple items.

### Potential Confusion and Complexity

From a user's perspective it would be confusing if one of the values in a tuple sequence is an empty sequence: because sequences are flattened, the empty value disappears and the items after it are assigned to the wrong variables. However, this is no different from using the long form to extract the values from the tuple sequence.
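
The shift is easy to reproduce with the long form in XQuery 3.1. In this minimal sketch the middle value of the tuple is assumed to be empty:

```xquery
xquery version "3.1";

(: The empty middle value is flattened away, leaving the sequence (1, 3). :)
let $tuple := (1, (), 3)
let $x := $tuple[1]
let $y := $tuple[2]  (: 3, the value that was meant for $z :)
let $z := $tuple[3]  (: the empty sequence :)
return (fn:count($tuple), $y, fn:count($z))
(: returns (2, 3, 0) :)
```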

There is also potential for confusion if a function changes its return type from a sequence to an array, as code that decomposes the result may then raise an `err:FOAY0001` error where it previously produced an empty sequence. The reason for this is hidden from the user.

There is complexity for the semantics of variable declarations, especially those with external values. If this is deemed too complex, it can be dropped from this proposal.


## Examples

Extracting values from a tuple sequence:

    declare function local:sincos($angle as xs:double?) {
        math:sin($angle), math:cos($angle)
    };

    let $angle := math:pi()
    let ($sin, $cos) := local:sincos($angle)
    return $sin || "," || $cos

Extracting values from a tuple array:

    declare function local:sincos($angle as xs:double?) {
        [ math:sin($angle), math:cos($angle) ]
    };

    let $angle := math:pi()
    let ($sin, $cos) := local:sincos($angle)
    return $sin || "," || $cos

Extracting values with typed variables:

    let ($x as xs:double, $y as xs:double) := polar-to-cartesian(1.0, math:pi())
    return "x=" || $x || ", y=" || $y
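
Following the expansion rule from the Description, the typed form would presumably carry each type declaration over to the corresponding `let` binding. The sketch below is written in XQuery 3.1; `local:polar-to-cartesian` is a hypothetical implementation supplied only so that the query is self-contained, since the proposal names the function without defining it:

```xquery
xquery version "3.1";
declare namespace math = "http://www.w3.org/2005/xpath-functions/math";

(: Hypothetical implementation: returns the x and y values as a tuple sequence. :)
declare function local:polar-to-cartesian(
  $r as xs:double,
  $theta as xs:double
) as xs:double+ {
  $r * math:cos($theta), $r * math:sin($theta)
};

let $result := local:polar-to-cartesian(1.0, math:pi())
let $x as xs:double := $result[1]
let $y as xs:double := $result[2]
return "x=" || $x || ", y=" || $y
```

With a type such as `xs:double`, a missing item would fail the type check with a type error instead of silently binding the empty sequence, which may be a reason to prefer typed decomposition for tuple sequences.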


## Grammar

TBD.