StandardLibrary
❓ Question: Do we support constants?
- Why not? - Tim
- Just asking - Mihael
- The Swift/K constant and global support is a little different from Swift/T, but we should be able to have a set of constants as part of the library in both, I think... - Tim
This mostly reflects pre-1.5 java.lang.Math.
float sin(float x)
float cos(float x)
float tan(float x)
float asin(float x)
float acos(float x)
float atan(float x)
float atan2(float y, float x)
float exp(float x)
float ln(float x)
float log(float x, float base)
float log10(float x)
float pow(float base, float exponent)
float sqrt(float x)
float cbrt(float x)
float ceil(float x)
float floor(float x)
float round(float x)
int min(int a, int b)
float min(float a, float b)
int max(int a, int b)
float max(float a, float b)
int abs(int z)
float abs(float x)
boolean isNaN(float x)
❓ Do we need to document details of floating point behavior such as invalid values, etc?
This probably needs some discussion.
❗ As primitives we should provide something simpler that doesn't depend on lazy arrays. I have a number of implementation concerns for Swift/T.
💦 Unfortunately I'm not sure how to keep it deterministic so that restart logs would work - Mihael
🎅 Maybe we should avoid having this be the canonical way for now? I'm not confident about implementing it in T because it seems to depend on running arbitrary code lazily when an array is read. How about if we had, as a lowest common denominator, a type random_state and functions random_state seed_random(int seed), (random_state, int) next_random(random_state state), etc.? You could use this to fill an array if needed. It would be deterministic. One possible downside is that programmers might accidentally bifurcate the RNG.
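To make the random_state idea concrete, here is a minimal Python sketch of an explicitly threaded generator. It is only a model of the proposal, not Swift/K or Swift/T code: the linear congruential constants, the 31-bit output width, and the `RandomState` class are assumptions chosen for illustration.

```python
from typing import NamedTuple, Tuple


class RandomState(NamedTuple):
    """Immutable generator state (stands in for the proposed random_state type)."""
    value: int


# Assumed constants for a 64-bit linear congruential generator.
_A = 6364136223846793005
_C = 1442695040888963407
_MASK = (1 << 64) - 1


def seed_random(seed: int) -> RandomState:
    """Build the initial state from an integer seed."""
    return RandomState(seed & _MASK)


def next_random(state: RandomState) -> Tuple[RandomState, int]:
    """Return the successor state and a non-negative 31-bit integer."""
    new_value = (_A * state.value + _C) & _MASK
    return RandomState(new_value), new_value >> 33


# Correct use: thread each returned state into the next call.
s0 = seed_random(42)
s1, first = next_random(s0)
s2, second = next_random(s1)

# Accidental bifurcation: reusing s0 replays the first value.
_, replayed = next_random(s0)
```

Because the state is a plain value, rerunning the script reproduces the same sequence, which is what restart logs need. The bifurcation hazard shows up as `replayed == first`.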
int[] random(int seed, int min, int max)
Returns a (lazy) deterministic uniform random sequence. ❗ The implementation will have to be inefficient unless the values are accessed in order. Well, maybe not. A crypto-hash of some function of the seed and the index could probably work as a democratically-slow random-access array.
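The hash-based idea could look roughly like the Python sketch below. The name `random_at`, the use of SHA-256, and the half-open range [lo, hi) are assumptions; the proposal does not say whether max is inclusive. Reducing with `%` also introduces a small bias that a real implementation would want to address.

```python
import hashlib


def random_at(seed: int, index: int, lo: int, hi: int) -> int:
    """Element `index` of the sequence for `seed`, in the range [lo, hi)."""
    # Hash the (seed, index) pair so any element can be computed on its own.
    digest = hashlib.sha256(f"{seed}:{index}".encode("ascii")).digest()
    value = int.from_bytes(digest[:8], "big")
    return lo + value % (hi - lo)


# Elements can be read in any order and always come out the same.
late = random_at(7, 1000, 0, 10)
early = random_at(7, 0, 0, 10)
```

Every access costs one hash, regardless of the order in which indices are read.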
float[] random(int seed, float min, float max)
Same, but with floats.
🎅 I'm not sure how I feel about the overloading here. It seems surprising to me but I'm not sure why. - Tim
float[] gaussian(int seed)
Normally distributed random with mu = 0 and sigma = 1.
❗ Maybe. We should discuss.
float sum(float[] a)
❓ Guarantees about how the sum is computed, given that floating point addition is not associative?
- Don't make guarantees. The only reasonable guarantee would be serial from 0; parallel addition of partitions would be useful for huge arrays. So leave the order of addition undefined. - Mike
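A small Python illustration of why the order matters: summing the same three values serially from index 0 and summing them by recursive partitioning gives results that differ in the last bit. Both helper names are hypothetical, and the pairwise version is just one of many orders a parallel implementation might pick.

```python
import math


def serial_sum(a):
    """Add the elements one by one, starting from index 0."""
    total = 0.0
    for x in a:
        total += x
    return total


def pairwise_sum(a):
    """Add the two halves separately, as a partitioned parallel sum might."""
    if len(a) == 0:
        return 0.0
    if len(a) == 1:
        return a[0]
    mid = len(a) // 2
    return pairwise_sum(a[:mid]) + pairwise_sum(a[mid:])


values = [0.1, 0.2, 0.3]
serial = serial_sum(values)      # (0.1 + 0.2) + 0.3
pairwise = pairwise_sum(values)  # 0.1 + (0.2 + 0.3)
same_bits = serial == pairwise
close_enough = math.isclose(serial, pairwise)
```

Here `same_bits` is false while `close_enough` is true, so leaving the order undefined means callers can only rely on approximate equality.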
int sum(int[] a)
- Do we want to add - perhaps elsewhere:
string sum(string[] a)
?
float avg(float[] a)
float avg(int[] a)
float moment(float[] a, int n, float center)
Returns the n-th moment of an array about a value (center). For example, the mean would be moment(a, 1, 0), while the variance is moment(a, 2, avg(a)); the standard deviation is the square root of the variance.
float moment(int[] a, int n, float center)
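As a sanity check on the definition, here is a Python sketch of moment() under the assumption that it is the population moment, that is, the average of (x - center)^n over all elements. The normalisation (dividing by the element count rather than count minus one) is an assumption; the proposal does not state it.

```python
import math


def moment(a, n, center):
    """n-th moment of `a` about `center`, normalised by the element count."""
    return sum((x - center) ** n for x in a) / len(a)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean = moment(data, 1, 0.0)       # first moment about zero
variance = moment(data, 2, mean)  # second moment about the mean
std_dev = math.sqrt(variance)
```

With this definition the second central moment is the variance, and the standard deviation needs an explicit square root.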
❓ Do we need shift and other bitwise operators? One could potentially use division/multiplication to emulate them, but there are subtleties with rounding and signs that might make it tricky.
- I think if there is a use case common enough that people will need them, we should support them. Emulation seems impractical. -Tim
- Right, does anybody know of a use case? -Mihael
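The rounding subtlety can be seen in a short Python sketch. It assumes, as in C or Java, that integer division truncates toward zero; whether Swift integer division behaves that way is not stated here. `truncating_divide` is a hypothetical helper that models that behaviour.

```python
def truncating_divide(z: int, divisor: int) -> int:
    """Integer division that rounds toward zero, as in C or Java."""
    quotient = abs(z) // abs(divisor)
    return quotient if (z >= 0) == (divisor > 0) else -quotient


# For non-negative values, dividing by 2 matches a right shift by 1.
positive_shift = 5 >> 1
positive_divide = truncating_divide(5, 2)

# For negative values the two disagree: the shift rounds toward
# negative infinity, the division rounds toward zero.
negative_shift = -5 >> 1
negative_divide = truncating_divide(-5, 2)
```

So an emulation based on division would need a sign-dependent correction to match an arithmetic shift.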
int toInt(float x)
float toFloat(int z)
int parseInt(string s, int base = 10)
float parseFloat(string s)
string toString(int z)
string toString(float x)
string toString(boolean b)
❓ Should we have a separate concatenation function or should overloading "+" be enough?
- Just keep strcat() for compatibility? - Tim
- agreed - Mike
string strcat(string... s)
❗ I like Python's use of negative indices to signify an index relative to the end of the string. However, that doesn't work well with exclusive end indices (like Java's). Perhaps a compromise solution would be to only use -1 to signify length(str).
- Why not? Python uses exclusive end indices. - Tim
- Because if s.charAt(-1) is the last character then in an end-exclusive convention, s.substring(0, 0) would represent the whole string. If -1 represents the length, then s == s.substring(0, -1), but the last character is s.charAt(-2). - Mihael
- In an end-exclusive convention, s.substring(0, 0) should be an empty string though. s.substring(0, 1) is the first character only. I tested this in both Python and Java. There is an issue though, that Python's slice syntax treats a non-existent end index differently from -1. s[0:] is all n characters of the string, s[0:0] is the empty string, s[0:-1] is the first n - 1 characters of the string. Maybe the crux of it is that we can't emulate Python's negative indices by mapping an absent end index to -1? - Tim
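The Python behaviour described in this thread, next to a sketch of the "-1 means length" compromise, looks like this. The `substring` helper is a hypothetical model of the proposed function, not an existing API.

```python
s = "abcdef"

# Python slicing: an absent end index is not the same as -1.
whole = s[0:]          # all characters
empty = s[0:0]         # empty string
all_but_last = s[0:-1]
last_char = s[-1]


def substring(s: str, start: int, end: int = -1) -> str:
    """Model of the compromise: end == -1 stands for length(s)."""
    if end == -1:
        end = len(s)
    return s[start:end]


default_end = substring(s, 0)       # whole string, because -1 maps to len(s)
explicit_empty = substring(s, 0, 0)
middle = substring(s, 1, 3)
```

Under the compromise, -1 is a sentinel rather than a true negative index, so there is no way to write "up to but excluding the last character" without calling length().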
int length(string s)
string[] split(string s, string delimiter, int max = -1)
string[] splitRe(string s, string regexp, int max = -1)
like split(), except the delimiter is a regular expression
string trim(string s)
string substring(string s, int start, int end = -1)
❗ I frequently make the mistake of writing subString in Java. Maybe that is a choice we should consider.
- I think substring is fine, personally - Tim
string toUpper(string s)
string toLower(string s)
string join(string[] sa, string delimiter)
string replaceAll(string s, string find, string replacement, int start = 0, int end = -1)
string replaceAllRe(string s, string findRe, string replacementRe, int start = 0, int end = -1)
❗ This is meant to allow the use of capture groups to, for example, do replacements of the sort "key1=v1, key2=v2,..." → "rep1=v1, rep2=v2,...".
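A Python sketch of the capture-group replacement described above. It assumes Python's regular expression syntax, with `\1` as the back-reference, and assumes that start and end simply restrict the region where replacements happen; neither detail is fixed by the proposal.

```python
import re


def replace_all_re(s, find_re, replacement_re, start=0, end=-1):
    """Replace every match of find_re inside s[start:end]."""
    if end == -1:
        end = len(s)
    replaced = re.sub(find_re, replacement_re, s[start:end])
    return s[:start] + replaced + s[end:]


# Rename the keys while keeping each key's number via capture group 1.
renamed = replace_all_re("key1=v1, key2=v2", r"key(\d+)=", r"rep\1=")

# Restricting the region leaves the first pair untouched.
partial = replace_all_re("key1=v1, key2=v2", r"key(\d+)=", r"rep\1=", start=8)
```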
int indexOf(string s, string find, int start = 0, int end = -1)
int lastIndexOf(string s, string find, int start = -1, int end = 0)
string format(string spec, any... args)
boolean matches(string s, string re)
string[] findAllRe(string s, string re)
❓ Not sure about this one, but because we would not otherwise offer a comprehensive regexp library, this could be used to return an array with all capture groups in a regexp. I do not see how this would be possible with any of the other functions.
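One possible reading of findAllRe, sketched in Python: return the capture groups of the first match as an array of strings. Whether the function should instead return all matches, or include the whole match as element 0, is left open above, so treat this behaviour as an assumption.

```python
import re


def find_all_re(s, pattern):
    """Capture groups of the first match of `pattern` in `s`, or [] if none."""
    match = re.search(pattern, s)
    if match is None:
        return []
    # Groups that did not participate in the match become empty strings.
    return [group or "" for group in match.groups()]


parts = find_all_re("run_0042.dat", r"([a-z]+)_(\d+)\.([a-z]+)")
no_match = find_all_re("README", r"([a-z]+)_(\d+)")
```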
❓ do we need reverse?
💦 It's not very common as far as I can tell. Is there at least one use case in the whole history of Swift? -Mihael
T[K] slice(T[K] a, int start, int end)
T[int][K] split(T[K] a, int n)
Splits an array into chunks of size n. The last element of the returned array could have fewer than n elements.
❓ if the array is sparse or not zero-based, are these indices the keys or the physical indices?
💦 The [int] keys label the chunks. The [K] indices are the actual keys. In other words, it splits a hashtable into multiple hashtables without losing the mappings. -Mihael
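The answer above can be modelled with Python dictionaries standing in for Swift arrays. This is only a sketch: `split_chunks` is a hypothetical name, and the iteration order (insertion order of the dict) stands in for whatever order the implementation uses.

```python
def split_chunks(a: dict, n: int) -> dict:
    """Split mapping `a` into chunks of at most n entries, keeping original keys."""
    chunks = {}
    for position, (key, value) in enumerate(a.items()):
        # The outer int key labels the chunk; the inner key is unchanged.
        chunks.setdefault(position // n, {})[key] = value
    return chunks


sparse = {10: "a", 20: "b", 35: "c", 50: "d", 70: "e"}
chunked = split_chunks(sparse, 2)
```

The sparse keys 10, 20, 35, 50 and 70 survive inside the chunks, and only the chunk labels 0, 1 and 2 are new.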
T[int] join(T[K1][K2] a)
❗ Joins a number of arrays into a single array. If an ordering exists on K1 and/or K2, it should be preserved.
T[int] join(T[K]... arrays)
❗ The overloading is ambiguous. In the second one T could be an array type.
💦 True.
T[int] compact(T[K] a)
❗ Returns an int-indexed array containing all elements of T[K]. The exact mapping between the keys in the initial array and the integer keys is not specified, but it is guaranteed that the same input will always return the same output.
❓ Why not guarantee that the order is stable?
💦 What would "stable" mean here? -Mihael
🎅 If (k1, v1) and (k2, v2) are key/value pairs in the input and (k1', v1), (k2', v2) are key/value pairs in the output, then k1 < k2 <=> k1' < k2' . I.e. the order is preserved. It might be more practical to give the implementation leeway, but I feel like some users might expect this behaviour.
💦 Sure. I think the problem is when there is no clear ordering on the keys. At least in theory, one can have struct types as keys, but I agree that if there is an ordering on K, then it should be "preserved" -Mihael
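For key types that do have an ordering, the order-preserving variant discussed above could be sketched in Python as follows. Dictionaries stand in for Swift arrays, and the behaviour for unordered key types is deliberately left out.

```python
def compact(a: dict) -> dict:
    """Re-index the values of `a` as 0, 1, 2, ... in ascending key order."""
    return {new_key: a[old_key] for new_key, old_key in enumerate(sorted(a))}


compacted = compact({30: "c", 10: "a", 20: "b"})
```

Here k1 < k2 in the input implies k1' < k2' in the output, which is the property stated in the thread.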
T[K][int] zip(T[K] a, T[K] b)
❓ This is based on pairing up keys, right? What is the behaviour if matching keys are not present? Discard them or raise an error?
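Both candidate behaviours for unmatched keys can be compared in a short Python sketch. The `strict` flag is a hypothetical device for showing the two options side by side; it is not part of the proposed signature.

```python
def zip_arrays(a: dict, b: dict, strict: bool = False) -> dict:
    """Pair up values by key; result[k] maps 0 to a[k] and 1 to b[k]."""
    if strict and a.keys() != b.keys():
        raise KeyError("arrays do not have the same key set")
    # Non-strict mode silently discards keys missing from either side.
    return {key: {0: a[key], 1: b[key]} for key in a if key in b}


left = {1: "a", 2: "b"}
right = {2: "x", 3: "y"}
discarded = zip_arrays(left, right)

try:
    zip_arrays(left, right, strict=True)
    raised = False
except KeyError:
    raised = True
```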
any read(file f, string format="None")
❗ Reads data from a file. A minimum of the following formats should be supported:
- "None" - read the entire file as a string
- "FieldAndValue" - field = value, one per line
Other possible/desired formats:
- "CSV" - read a CSV file (for backwards compatibility with K's readData)
- "JSON"
- specify readData() compatibility more clearly. May want to leave readData() for a while to ease transition? - Mike
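For the "FieldAndValue" format listed above, a reader might behave like the Python sketch below. The details are assumptions: blank lines are skipped, whitespace around the field and value is trimmed, values stay as strings, and a line without "=" is an error.

```python
def parse_field_and_value(text: str) -> dict:
    """Parse 'field = value' lines into a dictionary of strings."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        field, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"missing '=' in line: {line!r}")
        result[field.strip()] = value.strip()
    return result


parsed = parse_field_and_value("count = 3\n\nname = run 7\n")
```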
file write(any data, string format="None")
The inverse of read(), with relevant constraints on what "None" and "CSV" can do.
string getEnv(string name)
❓ should we disambiguate between an undefined environment variable and an environment variable set to the empty string, e.g. with an extra output argument? 99% of the time the difference isn't important, but some applications may want to handle them differently. May not be worth the hassle, or we could provide a separate function to check if it was defined.
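The "separate function" option can be sketched in Python, where the same distinction exists. `get_env` and `is_env_defined` are hypothetical names, and the example variable names are made up for the demonstration.

```python
import os


def get_env(name: str) -> str:
    """Value of the variable, or the empty string when it is undefined."""
    return os.environ.get(name, "")


def is_env_defined(name: str) -> bool:
    """True when the variable exists, even if its value is empty."""
    return name in os.environ


# Hypothetical variables used only for this demonstration.
os.environ["STDLIB_DEMO_EMPTY"] = ""
os.environ.pop("STDLIB_DEMO_UNDEFINED", None)

empty_value = get_env("STDLIB_DEMO_EMPTY")
undefined_value = get_env("STDLIB_DEMO_UNDEFINED")
```

Both lookups return the empty string, so only `is_env_defined` tells the two cases apart. This keeps the common call simple while still serving the rare application that cares.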