20454: Allows [] and {} to represent lists and assocs respectively, MINOR (#144)
howsohazard authored Jun 10, 2024
1 parent e4761bb commit ba637e9
Showing 5 changed files with 1,078 additions and 1,006 deletions.
9 changes: 5 additions & 4 deletions docs/index.html
@@ -327,9 +327,10 @@ <h2>Type descriptions</h2>
<h2>Language syntax</h2>
<pre class='ex code'>
<span class='comment'>;comments until the end of the line</span>
[#label] [||][@]( opcode [parameter1] [parameter2] ...)</pre>

# means that the opcode has that label. Multiple labels may be specified on the same node. If the label has any nonalphanumeric character, it may be surrounded by quotations as a string. E.g., <span class='ex code'>#"Some complex label; with \"punctuation!"</span><br>
[#label] [||][@](opcode [parameter1] [parameter2] ...)</pre>
The language generally follows a parse tree in a manner similar to Lisp and Scheme, with opcodes surrounded by parentheses and including parameters in a recursive fashion. The exceptions are that the opcodes list and assoc (associative array, sometimes referred to as a hashmap or dict) may use [] and {} respectively and omit the opcode, though they are still considered identical to <span class='ex code'>(list)</span> and <span class='ex code'>(assoc)</span>.
<p>
The character # means that the opcode has that label. Multiple labels may be specified on the same node. If the label has any nonalphanumeric character, it may be surrounded by quotations as a string. E.g., <span class='ex code'>#"Some complex label; with \"punctuation!"</span><br>
If a label is preceded by more than one #, then it is disabled and thus ignored with regard to its current entity. Some commands will add or remove a # at the beginning to assist ease of specifying labels when creating entities and to help prevent accidental label sharing. If a label starts with a caret (^), e.g. <span class='ex code'>#^method_for_contained_entities</span>, then it can be accessed by contained entities, and they do not need to specify the caret. Parent entities do need to specify the caret (^). For example, if #^foo is a label of a container, a contained entity within could call the label "foo". This adds a layer of security and prevents contained entities from affecting parts of the container that are exposed for its own container's access. Labels starting with an exclamation point (!), e.g. <span class='ex code'>#!private_method</span>, are not accessible by the container and can only be accessed by the entity itself except for by the contained entity getting all of the code, acting as a private label. A label cannot be simultaneously private to its container and accessible to contained entities.
<p>
Variables are accessed from the closest immediate scope, which means if there is a global variable named <span class='ex code'>x</span> and a function parameter named <span class='ex code'>x</span>, the function parameter will be used. Entity labels are considered the global-most scope. If a variable name cannot be found, then it will look at the entity's labels instead. Scope is handled as a stack, and some opcodes may modify the scope.
@@ -342,7 +343,7 @@ <h2>Language syntax</h2>
<p>
The parameters of most opcodes are not guaranteed to be evaluated in order, or to be evaluated at all if not necessary, unless otherwise specified in the opcode (e.g., seq, declare, let, etc. all execute in order). It is generally not recommended to have side effects (variable or entity writes) in opcodes whose parameters are not guaranteed to be evaluated sequentially.
<p>
If the concurrent (parallel) symbol, ||, is specified then the opcode's computations will be executed concurrently if possible. The concurrent execution will be interpreted with regard to the specific opcode, but any function calls may be executed in any order and possibly concurrently.
If the concurrent/parallel symbol, ||, is specified then the opcode's computations will be executed concurrently if possible. The concurrent execution will be interpreted with regard to the specific opcode, but any function calls may be executed in any order and possibly concurrently.
<p>
Each Entity contains code/data via a root node that is executed every call to the entity. An Entity has complete abilities to perform reads and writes to any other Entity contained within it; it is also allowed to create, destroy, access, or modify other entities. The root-most entity has permissions to call system commands and load files, and this is called root permission. These permissions may be set on other entities by an entity with root permission.
<p>
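The documentation text in this hunk states that `[...]` and `{...}` are considered identical to `(list ...)` and `(assoc ...)`. A minimal sketch of that equivalence follows, assuming a heavily simplified tokenizer: `ParsedNode`, `ParseSketch`, `SameTree`, and the other helper names are hypothetical and are not part of Amalgam's `Parser`. Labels, comments, strings, and key/value pairing for assocs are all left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical simplified node for illustration; this is not Amalgam's EvaluableNode.
struct ParsedNode
{
    std::string type;               // "list", "assoc", an opcode name, or "atom"
    std::string text;               // token text, used only when type == "atom"
    bool mismatched_close = false;  // set when ']' or '}' closes the wrong type
    std::vector<std::unique_ptr<ParsedNode>> children;
};

inline bool IsBracket(char c)
{
    return c == '(' || c == ')' || c == '[' || c == ']' || c == '{' || c == '}';
}

inline void SkipSpaces(const std::string &code, std::size_t &pos)
{
    while(pos < code.size() && std::isspace(static_cast<unsigned char>(code[pos])))
        pos++;
}

// Reads characters up to the next whitespace or bracket.
inline std::string ReadIdentifier(const std::string &code, std::size_t &pos)
{
    std::size_t start = pos;
    while(pos < code.size() && !IsBracket(code[pos])
            && !std::isspace(static_cast<unsigned char>(code[pos])))
        pos++;
    return code.substr(start, pos - start);
}

// Type-based check in the spirit of the diff: ']' should close a list, '}' an assoc.
inline bool ClosingMatchesType(char closing, const std::string &type)
{
    if(closing == ']') return type == "list";
    if(closing == '}') return type == "assoc";
    return true;
}

// Parses one expression starting at pos; returns nullptr if there is nothing to parse.
inline std::unique_ptr<ParsedNode> ParseSketch(const std::string &code, std::size_t &pos)
{
    SkipSpaces(code, pos);
    if(pos >= code.size())
        return nullptr;

    char c = code[pos];
    auto node = std::make_unique<ParsedNode>();
    if(c == '(' || c == '[' || c == '{')
    {
        pos++;
        if(c == '[') node->type = "list";           // opcode keyword omitted
        else if(c == '{') node->type = "assoc";     // opcode keyword omitted
        else { SkipSpaces(code, pos); node->type = ReadIdentifier(code, pos); }

        while(true)
        {
            SkipSpaces(code, pos);
            if(pos >= code.size())
                break;  // unterminated input; the sketch simply stops
            char d = code[pos];
            if(d == ')' || d == ']' || d == '}')
            {
                node->mismatched_close = !ClosingMatchesType(d, node->type);
                pos++;
                break;
            }
            node->children.push_back(ParseSketch(code, pos));
        }
        return node;
    }
    if(IsBracket(c))    // stray closing bracket at the top level
    {
        pos++;
        return nullptr;
    }
    node->type = "atom";
    node->text = ReadIdentifier(code, pos);
    return node;
}

// Structural comparison that ignores which delimiters were used in the source text.
inline bool SameTree(const ParsedNode &a, const ParsedNode &b)
{
    if(a.type != b.type || a.text != b.text || a.children.size() != b.children.size())
        return false;
    for(std::size_t i = 0; i < a.children.size(); i++)
        if(!SameTree(*a.children[i], *b.children[i]))
            return false;
    return true;
}
```

Under these assumptions, parsing `[1 2 3]` and `(list 1 2 3)` yields trees for which `SameTree` returns true, and parsing `[1 2}` sets `mismatched_close` on the list node, which corresponds to the point where the real parser prints a warning.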
8 changes: 4 additions & 4 deletions docs/language.js
@@ -867,8 +867,8 @@ var data = [
"new value" : "new",
"concurrency" : true,
"new target scope": true,
"description" : "Evaluates to the list specified by the parameters. Pushes a new target scope such that (target), (current_index), and (current_value) access the list, the current index, and the current value.",
"example" : "(print (list \"a\" 1 \"b\"))"
"description" : "Evaluates to the list specified by the parameters. Pushes a new target scope such that (target), (current_index), and (current_value) access the list, the current index, and the current value. If []'s are used instead of parentheses, the keyword list may be omitted. [] are considered identical to (list).",
"example" : "(print (list \"a\" 1 \"b\"))\n(print [1 2 3])"
},

{
@@ -877,8 +877,8 @@ var data = [
"new value" : "new",
"concurrency" : true,
"new target scope": true,
"description" : "Evaluates to the associative list, where each pair of parameters (e.g., index1 and value1) comprises an index/value pair. Pushes a new target scope such that (target), (current_index), and (current_value) access the assoc, the current index, and the current value. If any of the bstrings do not have reserved characters or spaces, then quotes are optional; if spaces or reserved characters are present, then quotes are required.",
"example" : "(print (assoc b 2 c 3))\n(print (assoc a 1 \"b\\ttab\" 2 c 3 4 \"d\"))"
"description" : "Evaluates to the associative list, where each pair of parameters (e.g., index1 and value1) comprises an index/value pair. Pushes a new target scope such that (target), (current_index), and (current_value) access the assoc, the current index, and the current value. If any of the bstrings do not have reserved characters or spaces, then quotes are optional; if spaces or reserved characters are present, then quotes are required. If {}'s are used instead of parentheses, the keyword assoc may be omitted. {} are considered identical to (assoc).",
"example" : "(print (assoc b 2 c 3))\n(print (assoc a 1 \"b\\ttab\" 2 c 3 4 \"d\"))\n(print {a 1 b 2})"
},

{
156 changes: 111 additions & 45 deletions src/Amalgam/Parser.cpp
@@ -402,6 +402,8 @@ void Parser::SkipToEndOfIdentifier(bool allow_leading_label_marks)
//check language characters
if(cur_char == '#'
|| cur_char == '(' || cur_char == ')'
|| cur_char == '[' || cur_char == ']'
|| cur_char == '{' || cur_char == '}'
|| cur_char == ';')
break;

@@ -425,7 +427,7 @@ std::string Parser::GetNextIdentifier(bool allow_leading_label_marks)
}
}

EvaluableNode *Parser::GetNextToken(EvaluableNode *new_token)
EvaluableNode *Parser::GetNextToken(EvaluableNode *parent_node, EvaluableNode *new_token)
{
if(new_token == nullptr)
new_token = evaluableNodeManager->AllocNode(ENT_NULL);
@@ -439,7 +441,7 @@ EvaluableNode *Parser::GetNextToken(EvaluableNode *new_token)

auto cur_char = (*code)[pos];

if(cur_char == '(') //identifier as command
if(cur_char == '(' || cur_char == '[' || cur_char == '{') //identifier as command
{
pos++;
numOpenParenthesis++;
@@ -450,24 +452,51 @@ EvaluableNode *Parser::GetNextToken(EvaluableNode *new_token)
return nullptr;
}

std::string token_string = GetNextIdentifier();
EvaluableNodeType token_type = GetEvaluableNodeTypeFromString(token_string);
//first see if it's a keyword
new_token->SetType(token_type, evaluableNodeManager, false);
if(cur_char == '(')
{
std::string token = GetNextIdentifier();
EvaluableNodeType token_type = GetEvaluableNodeTypeFromString(token);
new_token->SetType(token_type, evaluableNodeManager, false);

if(IsEvaluableNodeTypeValid(token_type) && !IsEvaluableNodeTypeImmediate(token_type))
return new_token;
if(!IsEvaluableNodeTypeValid(token_type) || IsEvaluableNodeTypeImmediate(token_type))
{
//invalid opcode, warn if possible and store the identifier as a string
if(!originalSource.empty())
std::cerr << "Warning: " << "Invalid opcode \"" << token << "\" at line " << lineNumber + 1 << " of " << originalSource << std::endl;

//invalid opcode, warn if possible and store the identifier as a string
if(!originalSource.empty())
std::cerr << "Warning: " << " Invalid opcode at line " << lineNumber + 1 << " of " << originalSource << std::endl;
new_token->SetType(ENT_STRING, evaluableNodeManager, false);
new_token->SetStringValue(token);
}
}
else if(cur_char == '[')
{
new_token->SetType(ENT_LIST, evaluableNodeManager, false);
}
else if(cur_char == '{')
{
new_token->SetType(ENT_ASSOC, evaluableNodeManager, false);
}

new_token->SetType(ENT_STRING, evaluableNodeManager, false);
new_token->SetStringValue(token_string);
return new_token;
}
else if(cur_char == ')')
else if(cur_char == ')' || cur_char == ']' || cur_char == '}')
{
EvaluableNodeType parent_node_type = ENT_NULL;
if(parent_node != nullptr)
parent_node_type = parent_node->GetType();

//make sure the closing character and type match
if(cur_char == ']')
{
if(parent_node_type != ENT_LIST)
std::cerr << "Warning: " << "Mismatched ] at line " << lineNumber + 1 << " of " << originalSource << std::endl;
}
else if(cur_char == '}')
{
if(parent_node_type != ENT_ASSOC)
std::cerr << "Warning: " << "Mismatched } at line " << lineNumber + 1 << " of " << originalSource << std::endl;
}

pos++; //skip closing parenthesis
numOpenParenthesis--;
FreeNode(new_token);
@@ -529,28 +558,28 @@ void Parser::FreeNode(EvaluableNode *node)
EvaluableNode *Parser::ParseNextBlock()
{
EvaluableNode *tree_top = nullptr;
EvaluableNode *curnode = nullptr;
EvaluableNode *cur_node = nullptr;

//as long as code left
while(pos < code->size())
{
EvaluableNode *n = GetNextToken();
EvaluableNode *n = GetNextToken(cur_node);

//if end of a list
if(n == nullptr)
{
//nothing here at all
if(curnode == nullptr)
if(cur_node == nullptr)
return nullptr;

const auto &parent = parentNodes.find(curnode);
const auto &parent = parentNodes.find(cur_node);

//if no parent, then all finished
if(parent == end(parentNodes) || parent->second == nullptr)
return tree_top;

//jump up to the parent node
curnode = parent->second;
cur_node = parent->second;
continue;
}
else //got some token
@@ -559,48 +588,48 @@ EvaluableNode *Parser::ParseNextBlock()
if(tree_top == nullptr)
{
tree_top = n;
curnode = n;
cur_node = n;
continue;
}

if(curnode->IsOrderedArray())
if(cur_node->IsOrderedArray())
{
curnode->AppendOrderedChildNode(n);
cur_node->AppendOrderedChildNode(n);
}
else if(curnode->IsAssociativeArray())
else if(cur_node->IsAssociativeArray())
{
//n is the id, so need to get the next token
StringInternPool::StringID index_sid = EvaluableNode::ToStringIDTakingReferenceAndClearing(n);

//reset the node type but continue to accumulate any attributes
n->SetType(ENT_NULL, evaluableNodeManager, false);
n = GetNextToken(n);
curnode->SetMappedChildNodeWithReferenceHandoff(index_sid, n, true);
n = GetNextToken(cur_node, n);
cur_node->SetMappedChildNodeWithReferenceHandoff(index_sid, n, true);

//handle case if uneven number of arguments
if(n == nullptr)
{
//nothing here at all
if(curnode == nullptr)
if(cur_node == nullptr)
return nullptr;

const auto &parent = parentNodes.find(curnode);
const auto &parent = parentNodes.find(cur_node);

//if no parent, then all finished
if(parent == end(parentNodes) || parent->second == nullptr)
return tree_top;

//jump up to the parent node
curnode = parent->second;
cur_node = parent->second;
continue;
}
}

parentNodes[n] = curnode;
parentNodes[n] = cur_node;

//if it's not immediate, then descend into that part of the tree, resetting parent index counter
if(!IsEvaluableNodeTypeImmediate(n->GetType()))
curnode = n;
cur_node = n;

//if specifying something unusual, then assume it's just a null
if(n->GetType() == ENT_NOT_A_BUILT_IN_TYPE)
@@ -727,15 +756,17 @@ void Parser::AppendLabels(UnparseData &upd, EvaluableNode *n, size_t indentation
}

void Parser::AppendAssocKeyValuePair(UnparseData &upd, StringInternPool::StringID key_sid, EvaluableNode *n, EvaluableNode *parent,
bool expanded_whitespace, size_t indentation_depth)
bool expanded_whitespace, size_t indentation_depth, bool need_initial_space)
{
if(expanded_whitespace)
{
for(size_t i = 0; i < indentation_depth; i++)
upd.result.push_back(indentationCharacter);
}
else
else if(need_initial_space)
{
upd.result.push_back(' ');
}

auto key_str = string_intern_pool.GetStringFromID(key_sid);

@@ -819,9 +850,10 @@ void Parser::Unparse(UnparseData &upd, EvaluableNode *tree, EvaluableNode *paren
}

//check if it's an immediate/variable before deciding whether to surround with parenthesis
if(IsEvaluableNodeTypeImmediate(tree->GetType()))
EvaluableNodeType tree_type = tree->GetType();
if(IsEvaluableNodeTypeImmediate(tree_type))
{
switch(tree->GetType())
switch(tree_type)
{
case ENT_NUMBER:
upd.result.append(EvaluableNode::ToStringPreservingOpcodeType(tree));
@@ -859,9 +891,21 @@ void Parser::Unparse(UnparseData &upd, EvaluableNode *tree, EvaluableNode *paren
else
{
//emit opcode
upd.result.push_back('(');
upd.result.append(GetStringFromEvaluableNodeType(tree->GetType()));
if(tree_type == ENT_LIST)
{
upd.result.push_back('[');
}
else if(tree_type == ENT_ASSOC)
{
upd.result.push_back('{');
}
else
{
upd.result.push_back('(');
upd.result.append(GetStringFromEvaluableNodeType(tree_type));
}

//decide whether to expand whitespace of child nodes or write all on the same line
bool recurse_expanded_whitespace = expanded_whitespace;
if(expanded_whitespace)
{
@@ -875,7 +919,7 @@ void Parser::Unparse(UnparseData &upd, EvaluableNode *tree, EvaluableNode *paren
{
recurse_expanded_whitespace = false;
}
else if(num_child_nodes <= 6 && num_child_nodes + indentation_depth < 14)
else if(num_child_nodes <= 6 && num_child_nodes + indentation_depth <= 14)
{
//make sure all child nodes are leaf nodes and have no metadata
bool all_leaf_nodes = true;
@@ -912,10 +956,14 @@ void Parser::Unparse(UnparseData &upd, EvaluableNode *tree, EvaluableNode *paren
if(tree->IsAssociativeArray())
{
auto &tree_mcn = tree->GetMappedChildNodesReference();
bool initial_space = (tree_type != ENT_LIST && tree_type != ENT_ASSOC);
if(!upd.sortKeys)
{
for(auto &[k_id, k] : tree_mcn)
AppendAssocKeyValuePair(upd, k_id, k, tree, recurse_expanded_whitespace, indentation_depth + 1);
{
AppendAssocKeyValuePair(upd, k_id, k, tree, recurse_expanded_whitespace, indentation_depth + 1, initial_space);
initial_space = true;
}
}
else //sortKeys
{
@@ -929,23 +977,27 @@ void Parser::Unparse(UnparseData &upd, EvaluableNode *tree, EvaluableNode *paren
for(auto &key_sid : key_sids)
{
auto k = tree_mcn.find(key_sid);
AppendAssocKeyValuePair(upd, k->first, k->second, tree, recurse_expanded_whitespace, indentation_depth + 1);
AppendAssocKeyValuePair(upd, k->first, k->second, tree, recurse_expanded_whitespace, indentation_depth + 1, initial_space);
initial_space = true;
}
}
}
else if(tree->IsOrderedArray())
{
auto &tree_ocn = tree->GetOrderedChildNodesReference();
if(recurse_expanded_whitespace)
{
for(auto &e : tree->GetOrderedChildNodesReference())
for(auto &e : tree_ocn)
Unparse(upd, e, tree, true, indentation_depth + 1, true);
}
else //all on the same line
{
for(auto &e : tree->GetOrderedChildNodesReference())
for(size_t i = 0; i < tree_ocn.size(); i++)
{
upd.result.push_back(' ');
Unparse(upd, e, tree, false, indentation_depth + 1, true);
//if not the first or if it's not a type with a special delimiter, insert a space
if(i > 0 || (tree_type != ENT_LIST && tree_type != ENT_ASSOC))
upd.result.push_back(' ');
Unparse(upd, tree_ocn[i], tree, false, indentation_depth + 1, true);
}
}
}
@@ -959,11 +1011,25 @@ void Parser::Unparse(UnparseData &upd, EvaluableNode *tree, EvaluableNode *paren
for(size_t i = 0; i < indentation_depth; i++)
upd.result.push_back(indentationCharacter);
}
upd.result.append(")\r\n");

if(tree_type == ENT_LIST)
upd.result.push_back(']');
else if(tree_type == ENT_ASSOC)
upd.result.push_back('}');
else
upd.result.push_back(')');

upd.result.push_back('\r');
upd.result.push_back('\n');
}
else
{
upd.result.append(")");
if(tree_type == ENT_LIST)
upd.result.push_back(']');
else if(tree_type == ENT_ASSOC)
upd.result.push_back('}');
else
upd.result.push_back(')');
}
}
}
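The `Unparse` changes above pick the opening and closing delimiter from the node type and suppress the leading space before the first child of a bracketed node, so a list prints as `[1 2 3]` rather than `[ 1 2 3]`. The following is a minimal sketch of that single-line spacing rule only, assuming a simplified tree: `SketchNode`, `Atom`, `Op`, and `UnparseSketch` are hypothetical names, and the expanded-whitespace path, labels, and assoc key handling are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified node for illustration; this is not Amalgam's EvaluableNode.
struct SketchNode
{
    std::string type;   // "list", "assoc", an opcode name, or "atom"
    std::string text;   // used only when type == "atom"
    std::vector<SketchNode> children;   // vector of an incomplete type requires C++17
};

// Builds a leaf node holding a literal token.
inline SketchNode Atom(std::string text)
{
    SketchNode n;
    n.type = "atom";
    n.text = std::move(text);
    return n;
}

// Builds an opcode node ("list", "assoc", or any other opcode name).
inline SketchNode Op(std::string type, std::vector<SketchNode> children)
{
    SketchNode n;
    n.type = std::move(type);
    n.children = std::move(children);
    return n;
}

// Appends a single-line rendering of n to out.
inline void UnparseSketch(const SketchNode &n, std::string &out)
{
    if(n.type == "atom")
    {
        out.append(n.text);
        return;
    }

    const bool is_list = (n.type == "list");
    const bool is_assoc = (n.type == "assoc");

    // lists and assocs get their own delimiter and no opcode keyword
    if(is_list)
        out.push_back('[');
    else if(is_assoc)
        out.push_back('{');
    else
    {
        out.push_back('(');
        out.append(n.type);
    }

    for(std::size_t i = 0; i < n.children.size(); i++)
    {
        // a space is needed after an opcode keyword or between children,
        // but not directly after '[' or '{'
        if(i > 0 || (!is_list && !is_assoc))
            out.push_back(' ');
        UnparseSketch(n.children[i], out);
    }

    out.push_back(is_list ? ']' : (is_assoc ? '}' : ')'));
}
```

With these definitions, `Op("list", {Atom("1"), Atom("2"), Atom("3")})` renders as `[1 2 3]`, and `Op("+", {Atom("1"), Op("list", {})})` renders as `(+ 1 [])`.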