Tim Armstrong edited this page Mar 2, 2015 · 43 revisions

Cross-cutting Issues

This section is for global decisions that affect multiple functions

  • camelCase is the standard for library functions. Non-camelcase functions may be retained for compatibility
  • In Swift/T, library functions that have no return value typically return a void output, so that other statements can explicitly depend on the function having run.
  • In Swift/T, we annotate some functions with properties like @pure that are used by the optimizer. Do we need to include this info in the docs? 💦 I was going for all library functions being pure.
  • Implemented functions are annotated with 🍊 for Swift/T and 🐨 for Swift/K once they exactly match the functionality described here.

Issues still needing resolution

  • Random numbers/sequences:
    • changed to non-lazy sequences (Tim, would this work?)
    • non-deterministic random still possible, but not being pushed as part of the standard library
  • Shift/bitwise ops: no use case in sight, so we need to decide something
  • isInt/isFloat:
    • no use case
    • also, Swift being statically-typed with no casting (so no way of doing much run-time manipulation), such a function would be completely static. But then it provides no run-time benefit and it's not hard for the programmer to figure out what these would do statically.
  • join/stringJoin: see comment for string join(string[] sa, string delimiter)
  • boolean exists(V A[K] array, K key): nondeterminism
  • getEnv: empty string vs. variable not set

Math

Constants

PI 🍊

E 🍊

Trig functions

float sin(float x) 🍊

float cos(float x) 🍊

float tan(float x) 🍊

float asin(float x) 🍊

float acos(float x) 🍊

float atan(float x) 🍊

float atan2(float y, float x) 🍊

Exponentials/Powers

float exp(float x) 🍊

float ln(float x) 🍊

float log(float x, float base) 🍊

Note: in Swift/T we currently have log(float x) mean natural log.

float log10(float x) 🍊

float pow(float base, float exponent) 🍊

float pow(int base, int exponent) ❓ Should we support this overload?

float sqrt(float x) 🍊

float cbrt(float x)

Rounding

float ceil(float x) 🍊

float floor(float x) 🍊

float round(float x) 🍊

Misc

int min(int a, int b) 🍊

float min(float a, float b) 🍊

int max(int a, int b) 🍊

float max(float a, float b) 🍊

int abs(int z) 🍊

float abs(float x) 🍊

boolean isNaN(float x) 🍊

❓ Do we need to document details of floating point behavior such as invalid values, etc? 💦 Yes, but we can postpone that. An initial iteration is probably going to have reasonably uniform behavior for both T and K.

Random numbers

This probably needs some discussion.

❗ As primitives we should provide something simpler that doesn't depend on lazy arrays. I have a number of implementation concerns for Swift/T.

💦 Unfortunately I'm not sure how to keep it deterministic so that restart logs would work - Mihael

🎅 Maybe we should avoid having this be the canonical way for now? I'm not confident about implementing it in T because it seems to depend on running arbitrary code lazily when an array is read. How about if we had, as a lowest common denominator, a type random_state and functions random_state seed_random(int seed), (random_state, int) next_random(random_state state), etc. You could use this to fill an array if needed. It would be deterministic. One possible downside is that programmers might accidentally bifurcate the RNG by reusing the same state twice.
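The explicit-state proposal above can be modelled with a short Python sketch. This is only an illustration of the threading pattern, not Swift/T code: the linear congruential generator and its constants are an arbitrary choice made for the example, and the helper fill is hypothetical.

```python
from typing import List, NamedTuple, Tuple

# Assumption: a simple linear congruential generator stands in for
# whatever generator an implementation would really use.
MODULUS = 2 ** 31
MULTIPLIER = 1103515245
INCREMENT = 12345


class RandomState(NamedTuple):
    """Immutable generator state, modelling the proposed random_state type."""
    value: int


def seed_random(seed: int) -> RandomState:
    """Create the initial state from a seed."""
    return RandomState(seed % MODULUS)


def next_random(state: RandomState) -> Tuple[RandomState, int]:
    """Return the successor state together with the next number."""
    new_value = (MULTIPLIER * state.value + INCREMENT) % MODULUS
    return RandomState(new_value), new_value


def fill(seed: int, n: int) -> List[int]:
    """Hypothetical helper: fill an array by threading the state through."""
    state = seed_random(seed)
    out = []
    for _ in range(n):
        state, x = next_random(state)
        out.append(x)
    return out
```

Because the state is a plain value, the same seed always yields the same array. The bifurcation risk is also visible: calling next_random twice on the same state returns the same number both times.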

int randomInt(int seed, int sequenceNum, int min, int max)

Returns element sequenceNum of a deterministic, uniformly distributed random sequence determined by seed.

float randomFloat(int seed, int sequenceNum, float min, float max)

Same, but with floats.

float gaussian(int seed, int sequenceNum)

Normally distributed random with mu = 0 and sigma = 1.
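One way to get values that depend only on (seed, sequenceNum), with no lazy arrays and no hidden state, is to hash the pair. The Python sketch below models that idea under several assumptions that the proposal does not settle: SHA-256 as the mixing function, an inclusive integer range, and a Box-Muller transform for the Gaussian. The helper _unit and its stream parameter are hypothetical.

```python
import hashlib
import math
import struct


def _unit(seed: int, sequence_num: int, stream: int = 0) -> float:
    """Map (seed, sequence_num, stream) to a float in [0, 1)."""
    data = struct.pack("<qqq", seed, sequence_num, stream)
    digest = hashlib.sha256(data).digest()
    # Keep the top 53 bits so the value is exactly representable as a float.
    return (int.from_bytes(digest[:8], "little") >> 11) / float(1 << 53)


def random_float(seed: int, sequence_num: int, lo: float, hi: float) -> float:
    """Uniform float between lo and hi."""
    return lo + (hi - lo) * _unit(seed, sequence_num)


def random_int(seed: int, sequence_num: int, lo: int, hi: int) -> int:
    """Uniform integer in lo..hi (assumption: both ends inclusive)."""
    return lo + int(_unit(seed, sequence_num) * (hi - lo + 1))


def gaussian(seed: int, sequence_num: int) -> float:
    """Standard normal value (mu = 0, sigma = 1) via Box-Muller."""
    u1 = 1.0 - _unit(seed, sequence_num, 1)  # in (0, 1], so log is defined
    u2 = _unit(seed, sequence_num, 2)
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
```

Each call is a pure function of its arguments, so elements can be computed in any order or in parallel and a restarted run sees the same values.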

Stats

float sum(float[] a) 🍊 ❓ Guarantees about how the sum is computed, given that floating point addition is not associative?

  • Don't make guarantees. The only reasonable guarantee would be serial addition from index 0, but parallel addition of partitions would be useful for huge arrays. So leave the order of addition undefined. - Mike
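To see why the order matters, the following Python sketch sums the same four values serially and by recursive halving (a stand-in for parallel addition of partitions). The function names and the sample data are chosen for this illustration only.

```python
import math


def serial_sum(values):
    """Add elements left to right, starting from index 0."""
    total = 0.0
    for v in values:
        total += v
    return total


def pairwise_sum(values):
    """Add by recursive halving, as a partitioned reduction might."""
    if len(values) == 0:
        return 0.0
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return pairwise_sum(values[:mid]) + pairwise_sum(values[mid:])


# Large and small magnitudes mixed, so rounding depends on grouping.
data = [1e16, 1.0, -1e16, 1.0]

serial = serial_sum(data)
pairwise = pairwise_sum(data)
exact = math.fsum(data)  # correctly rounded reference sum
```

With IEEE 754 doubles the two orderings give different answers for this input, and neither matches the correctly rounded sum, which is the case for leaving the order unspecified.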

int sum(int[] a) 🍊

  • Do we want to add - perhaps elsewhere: string sum(string[] a) ?

💦 That would be equivalent to join() - Mihael

float avg(float[] a) 🍊

float avg(int[] a) 🍊

float moment(float[] a, int n, float center)

Returns the n-th moment of an array about a value (center). For example, the mean would be moment(a, 1, 0.0), while the variance is moment(a, 2, avg(a)); the standard deviation is the square root of the latter.

float moment(int[] a, int n, float center)
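A minimal Python model of moment, assuming the result is normalized by the number of elements (the population moment); the proposal does not state the normalization, so treat that as an assumption of this sketch.

```python
def avg(a):
    """Arithmetic mean of a non-empty sequence."""
    return sum(a) / len(a)


def moment(a, n, center):
    """n-th moment of a about center, divided by len(a) (assumed)."""
    if len(a) == 0:
        raise ValueError("moment of an empty array is undefined")
    return sum((x - center) ** n for x in a) / len(a)


values = [1.0, 2.0, 3.0, 4.0]
mean = moment(values, 1, 0.0)              # first moment about zero
variance = moment(values, 2, avg(values))  # second moment about the mean
```

Under this definition the first moment about zero is the mean and the second moment about the mean is the variance.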

❓ Do we need shift and other bitwise operators? One could potentially use division/multiplication to emulate them, but there are subtleties with rounding and signs that might make it tricky.

  • I think if there is a use case common enough that people will need them, we should support them. Emulation seems impractical. -Tim
  • Right, does anybody know of a use case? -Mihael
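One of the sign subtleties mentioned above can be shown in a few lines of Python. The sketch assumes the host language's integer division truncates toward zero (as in C); whether Swift's integer division does so is not stated here, and shift_right_truncating is a hypothetical name.

```python
def shift_right_truncating(value: int, bits: int) -> int:
    """Emulate value >> bits using division that truncates toward zero."""
    magnitude = abs(value) // (1 << bits)
    return magnitude if value >= 0 else -magnitude


# Python's own >> is an arithmetic shift, which rounds toward minus infinity.
arithmetic = -7 >> 1
emulated = shift_right_truncating(-7, 1)
```

For non-negative inputs and for negative inputs that divide evenly the two agree; for other negative inputs they differ by one, so a division-based emulation needs an explicit correction.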

Conversion functions

int toInt(float x) 🍊

float toFloat(int z) 🍊

int parseInt(string s, int base = 10)

float parseFloat(string s) 🍊

string toString(int z) 🍊

string toString(float x) 🍊

string toString(boolean b) ❓ What is exact intended output?

❗ Swift/T includes a function repr that dumps any data type to a string in some implementation-specified form. This may be useful too. Also array_repr which is essentially map repr.

❗ isInt/isFloat/etc functions can be very useful. 💦 Use case?

String functions

❗ Swift/T's strcat takes string|float|int args. This is tied into support for converting floats/ints to strings when concatenating to a string for the + operator. I can disentangle these two things - Tim. 💦 We can have that as default behavior. I believe K does an automatic conversion to string also.

string strcat(string... s) 🍊

int length(string s) 🍊

string[] split(string s, string delimiter)

string[] split(string s, string delimiter, int maxSplit)

string[] splitRe(string s, string regexp)

string[] splitRe(string s, string regexp, int maxSplit)

like split(), except the delimiter is a regular expression

string trim(string s)

string substring(string s, int start)

string substring(string s, int start, int end)

string toUpper(string s)

string toLower(string s)

string join(string[] sa, string delimiter) ❗ Overloading with the array join operation may be confusing. stringJoin?

💦 the delimiter argument is unique to string join()

string replaceAll(string s, string find, string replacement)

string replaceAll(string s, string find, string replacement, int start, int end)

string replaceAllRe(string s, string findRe, string replacementRe)

string replaceAllRe(string s, string findRe, string replacementRe, int start, int end)

❗ This is meant to allow the use of capture groups to, for example, do replacements of the sort "key1=v1, key2=v2,..." → "rep1=v1, rep2=v2,...".
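The capture-group use case can be illustrated with Python's re module. The backreference syntax (\1, \2) is Python's; the proposal does not say which regular expression dialect or replacement syntax Swift would adopt, and the start/end overload is not modelled.

```python
import re


def replace_all_re(s: str, find_re: str, replacement_re: str) -> str:
    """Replace every match of find_re in s, expanding group references."""
    return re.sub(find_re, replacement_re, s)


# Rename the keys while keeping the numbered suffix and the value.
renamed = replace_all_re("key1=v1, key2=v2", r"key(\d+)=(\w+)", r"rep\1=\2")
```

Here the first group carries the key number and the second the value, so only the literal prefix changes.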

int indexOf(string s, string find, int start)

int indexOf(string s, string find, int start, int end)

int lastIndexOf(string s, string find, int start)

int lastIndexOf(string s, string find, int start, int end)

string format(string spec, any... args)

boolean matches(string s, string re)

string[] findAllRe(string s, string re)

❓ Not sure about this one, but because we would not otherwise offer a comprehensive regexp library, this could be used to return an array with all capture groups in a regexp. I do not see how this would be possible with any of the other functions.
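One possible reading of findAllRe, sketched in Python: return the capture groups of every match as a flat array, or the whole match when the pattern has no groups. This behaviour is an assumption made for illustration, since the exact semantics are still open.

```python
import re
from typing import List


def find_all_re(s: str, pattern: str) -> List[str]:
    """Collect capture groups from all matches of pattern in s."""
    out = []
    for m in re.finditer(pattern, s):
        groups = m.groups()
        if groups:
            # Groups that did not participate in the match become "".
            out.extend(g if g is not None else "" for g in groups)
        else:
            out.append(m.group(0))
    return out


pairs = find_all_re("key1=v1, key2=v2", r"(\w+)=(\w+)")
```

A flat array loses the boundary between matches; a two-dimensional result would keep it, at the cost of a more complex return type.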

❓ do we need reverse?

💦 It's not very common as far as I can tell. Is there at least one use case in the whole history of Swift? -Mihael

Array functions

T[K] slice(T[K] a, int start, int end)

T[int][K] split(T[K] a, int n)

Splits an array into chunks of size n. The last element of the returned array may have fewer than n elements.

❓ if the array is sparse or not zero-based, are these indices the keys or the physical indices?

💦 the [int] keys label the chunks. The [K] indices are the actual keys. In other words it splits a hashtable into multiple hashtables without losing the mappings. -Mihael
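The key-preserving behaviour described above can be modelled with Python dictionaries. The sketch assumes the keys can be sorted, which fixes the order in which elements are assigned to chunks; split_array is a hypothetical name.

```python
def split_array(a: dict, n: int) -> dict:
    """Split a into chunks of at most n entries, keeping original keys."""
    if n <= 0:
        raise ValueError("chunk size must be positive")
    chunks = {}
    for position, key in enumerate(sorted(a)):
        # The outer int key labels the chunk; the inner key is unchanged.
        chunks.setdefault(position // n, {})[key] = a[key]
    return chunks


sparse = {10: "a", 20: "b", 30: "c"}
chunks = split_array(sparse, 2)
```

For the sparse input above the result has chunk 0 holding keys 10 and 20 and chunk 1 holding key 30, so no mapping is lost.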

T[int] join(T[K1][K2] a)

❗ Joins a number of arrays into a single array. If an ordering exists on K1 and/or K2, it should be preserved.

T[int] compact(T[K] a)

❗ Returns an int-indexed array containing all elements of T[K]. The exact mapping between the keys in the initial array and the integer keys is not specified, but it is guaranteed that the same input will always return the same output. If a complete ordering exists on K, then the result is stable (i.e. the ordering is preserved). In particular, stability is guaranteed for K == int.
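For key types with a complete ordering, the stable case can be sketched in Python as renumbering the elements in key order. Starting the new indices at 0 is an assumption of this sketch.

```python
def compact(a: dict) -> dict:
    """Renumber the elements of a as 0, 1, 2, ... in key order."""
    return {index: a[key] for index, key in enumerate(sorted(a))}


dense = compact({7: "x", 2: "y", 5: "z"})
```

The element with the smallest key receives index 0, so the relative order of the original keys is preserved.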

T[K][int] zip(T[K] a, T[K] b)

❓ This is based on pairing up keys right? What is behaviour if matching keys not present? Discard them or raise an error?

💦 I would vote for error instead of subtle and hard to debug issues. I'm also not very insistent on having this function.
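A Python model of the error-on-mismatch behaviour voted for above. The use of KeyError, and the inner indices 0 and 1 for the two inputs, are choices made for this sketch only.

```python
def zip_arrays(a: dict, b: dict) -> dict:
    """Pair up values by key; fail if the key sets differ."""
    if a.keys() != b.keys():
        unmatched = sorted(a.keys() ^ b.keys(), key=repr)
        raise KeyError(f"keys present in only one array: {unmatched}")
    # Inner index 0 holds the value from a, inner index 1 the value from b.
    return {key: {0: a[key], 1: b[key]} for key in a}


zipped = zip_arrays({"x": 1, "y": 2}, {"x": 10, "y": 20})
```

Failing early names the offending keys, whereas silently discarding them would hide the mismatch from the caller.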

boolean contains(V A[K] array, K key) 🍊

Returns true if the array contains the key upon closing

❗ Currently this doesn't return until the array is closed, even if the element is assigned. This seems bad.

💦 Yes, I believe that as soon as the assertion that A contains key K could be made, this should return.

boolean exists(V A[K] array, K key) 🍊

Returns true if the array contains the key at the time the function runs (possibly before the array is closed).

❗ Non-deterministic.

💦 I'm not sure I like that. Actually, I'm sure I don't like that.

  • Maybe we should drop this one from the standard library? There was a use case, but it's probably not relevant for K.

I/O

any read(file f, string format="None")

❗ Reads data from a file. A minimum of the following formats should be supported:

  • "None" - read the entire file as a string
  • "FieldAndValue" - field = value, one per line

Other possible/desired formats:

  • "CSV" - read a CSV file (for backwards compatibility with K's readData)
  • "JSON"
    • specify readData() compatibility more clearly. May want to leave readData() for a while to ease transition? - Mike

💦 We leave everything as is for a while. We will need to figure out a smooth transition scheme.

❗ Should format be an enum or set of constants? - Tim

💦 Random string. The implementation should produce an error for an unrecognized format.
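The two minimum formats and the error for unrecognized format strings can be modelled in Python as follows. The parsing details for "FieldAndValue" (whitespace trimmed, blank lines skipped, split on the first "=") are assumptions, and parse_contents is a hypothetical helper separated out so the format logic does not depend on the file system.

```python
def parse_contents(text: str, format: str = "None"):
    """Interpret file contents according to the named format."""
    if format == "None":
        # The whole file as a single string.
        return text
    if format == "FieldAndValue":
        result = {}
        for line in text.splitlines():
            if not line.strip():
                continue  # assumption: blank lines are ignored
            field, separator, value = line.partition("=")
            if not separator:
                raise ValueError(f"expected 'field = value', got: {line!r}")
            result[field.strip()] = value.strip()
        return result
    raise ValueError(f"unrecognized format: {format!r}")


def read(path: str, format: str = "None"):
    """Read the file at path and parse it with parse_contents."""
    with open(path, encoding="utf-8") as handle:
        return parse_contents(handle.read(), format)
```

Keeping the format a plain string means new formats such as "CSV" or "JSON" can be added without changing the signature, at the price of detecting typos only at run time.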

file write(any data, string format="None")

The inverse of read(), with relevant constraints on what "None" and "CSV" can do.

string getEnv(string name)

❓ should we disambiguate between an undefined environment variable and an environment variable set to the empty string, e.g. with an extra output argument? 99% of the time the difference isn't important, but some applications may want to handle them differently. May not be worth the hassle, or we could provide a separate function to check if it was defined.

💦 Excellent point. Why don't we have null?
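The "separate function" option mentioned above could look like this Python model. The name is_env_defined is hypothetical, as are the variable names used in the example; only os.environ itself is a real API.

```python
import os


def get_env(name: str) -> str:
    """Return the variable's value, or the empty string if it is not set."""
    return os.environ.get(name, "")


def is_env_defined(name: str) -> bool:
    """Hypothetical companion: report whether the variable is set at all."""
    return name in os.environ


# A variable set to the empty string and one that is not set both read as "".
os.environ["SWIFT_DEMO_EMPTY"] = ""
os.environ.pop("SWIFT_DEMO_UNSET", None)
```

Most callers would use get_env alone; the few that care about the difference can ask is_env_defined first.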

Blobs

Swift/T supports a range of functions for working with binary blobs. I've included the major functions here for reference.

❗ This is a very low priority to port to Swift/K.

💦 I wonder if this can be merged with read/write, since it's essentially serialization to memory

(int o) blobSize(blob b)

Size in bytes

(blob o) blobNull()

Zero-length blob

(blob o) blobFromString(string s)

Conversion function. Includes null terminator in blob.

(string o) stringFromBlob(blob b)

Conversion function. Expects null terminator in blob.
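The null-terminator convention of the two string conversions can be modelled with Python bytes. UTF-8 encoding and the ValueError on a missing terminator are assumptions of this sketch.

```python
def blob_from_string(s: str) -> bytes:
    """Encode s and append the null terminator."""
    return s.encode("utf-8") + b"\x00"


def string_from_blob(b: bytes) -> str:
    """Decode a blob that ends with a null terminator."""
    if not b.endswith(b"\x00"):
        raise ValueError("blob is not null-terminated")
    return b[:-1].decode("utf-8")


def blob_size(b: bytes) -> int:
    """Size in bytes, terminator included."""
    return len(b)


blob = blob_from_string("abc")
```

Note that the size of the blob is one byte larger than the encoded string, which matters when sizing buffers on the receiving side.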

(blob o) blobFromFloats(float f[])

(blob o) blobFromInts(int i[])

(float f[]) floatsFromBlob(blob b)

(blob o) blobRead(file f)

(file f) blobWrite(blob b)

(blob o) blobZeroesFloat(int n)

Assertions

Assertions may not be to everyone's taste, but they can be very useful, especially in tests.

assertEqual requires equality comparison to be supported. assertLT/assertLTE require the type to be one with a logical order. For now, a limited set of overloads is implemented for TEq in {string, int, float, boolean} and TOrdered in {int, float}.

(void o) assert(boolean condition, string msg="assertion failed") 🍊

(void o) assertEqual(TEq v1, TEq v2, string msg="assertion failed") 🍊

(void o) assertLT(TOrdered v1, TOrdered v2, string msg="assertion failed") 🍊

(void o) assertLTE(TOrdered v1, TOrdered v2, string msg="assertion failed") 🍊
