Merge pull request #468 from ScorexFoundation/v2.1
Branch for v2.1
Showing 265 changed files with 19,984 additions and 4,351 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,6 +9,7 @@ | |
*.fdb_latexmk | ||
|
||
*.log | ||
docs/spec/out/ | ||
test-out/ | ||
flamegraphs/ | ||
# sbt specific | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -14,7 +14,7 @@ cache: | |
language: scala | ||
|
||
jdk: | ||
- oraclejdk8 | ||
- oraclejdk9 | ||
|
||
script: | ||
- sbt -jvm-opts .travis.jvmopts test | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
## What should be checked during PR review | ||
|
||
### For each $TypeName.$methodName there should be | ||
|
||
1. test case in SigmaDslTests (checks SigmaDsl <-> ErgoScript equality) | ||
2. test case in CostingSpecification | ||
3. costing rule method in ${TypeName}Coster | ||
4. for each SMethod registration | ||
- .withInfo($description, $argsInfo) | ||
- .withIRInfo($irBuilder, $opDescriptor) | ||
|
||
### For each PredefinedFunc registration there should be | ||
- PredefFuncInfo($irBuilder, $opDescriptor) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,15 @@ | ||
# Sigma: Scala DSL for smart contracts with zero knowledge proof of knowledge | ||
# SigmaDsl: Scala DSL for smart contracts with zero knowledge proof of knowledge | ||
|
||
## Intro | ||
SigmaDsl is a domain-specific language embedded into Scala and designed to be | ||
source code compatible with SigmaScript. This means you can write SigmaDsl | ||
code directly in Scala IDE (e.g. IntelliJ IDEA) and copy-paste code snippets | ||
between SigmaDsl and SigmaScript. | ||
Special Scala macros can also be used to automatically translate SigmaDsl to | ||
Sigma byte code. | ||
|
||
SigmaDsl is implemented as a library in the framework of | ||
[Special](https://github.com/scalan/special) | ||
SigmaDsl is implemented as Scala library using [Special](https://github.com/scalan/special) | ||
framework. | ||
|
||
## See also | ||
[Special](https://github.com/scalan/special) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
|
||
## A log of changes leading to soft-fork | ||
|
||
This list should be updated every time something soft-forkable is added. | ||
|
||
### Changes since 2.0 | ||
|
||
- new type (SGlobal.typeCode = 106) | ||
- new method (SGlobal.groupGenerator.methodId = 1) | ||
- new method (SAvlTree.updateDigest.methodId = 15) | ||
- removed GroupElement.nonce (changed codes of getEncoded, exp, multiply, negate) | ||
- change in Coll.filter serialization format (removed tagged variable id, changed condition type) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
\section{Serialization format of ErgoTree nodes} | ||
\label{sec:appendix:ergotree_serialization} | ||
|
||
\mnote{These subsections are autogenerated from instrumented ValueSerializers} | ||
|
||
\input{generated/ergotree_serialization1.tex} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
\section{Compressed encoding of integer values} | ||
|
||
\subsection{VLQ encoding} | ||
\label{sec:vlq-encoding} | ||
|
||
\begin{verbatim} | ||
public final void putULong(long value) { | ||
while (true) { | ||
if ((value & ~0x7FL) == 0) { | ||
buffer[position++] = (byte) value; | ||
return; | ||
} else { | ||
buffer[position++] = (byte) (((int) value & 0x7F) | 0x80); | ||
value >>>= 7; | ||
} | ||
} | ||
} | ||
\end{verbatim} | ||
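The encoder above emits the value in 7-bit groups, least significant group first, and sets the high bit of every byte except the last. A minimal sketch of the matching decoder follows, assuming that byte layout. The class `VlqSketch` and its method names are hypothetical and are not part of the specification; the encoder is restated over a growable stream only so that the example is self-contained.

```java
import java.io.ByteArrayOutputStream;

/** Illustrative VLQ codec; class and method names are hypothetical. */
final class VlqSketch {
    /** Same loop as putULong, but writing to a growable stream. */
    static byte[] encodeULong(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write(((int) value & 0x7F) | 0x80); // low 7 bits, continuation bit set
            value >>>= 7;                           // unsigned shift: value is treated as unsigned
        }
        out.write((int) value);                     // last group, continuation bit clear
        return out.toByteArray();
    }

    /** Reads one VLQ value from the start of the array. */
    static long decodeULong(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            if (shift > 63) {
                throw new IllegalArgumentException("VLQ value longer than 10 bytes");
            }
            result |= (long) (b & 0x7F) << shift;   // place the 7-bit group
            if ((b & 0x80) == 0) {
                return result;                      // continuation bit clear: done
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated VLQ value");
    }
}
```

For example, 300 is encoded as the two bytes `0xAC 0x02`, values up to 127 take one byte, and a value with the top bit set (such as -1 interpreted as unsigned) takes the full 10 bytes.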
|
||
\subsection{ZigZag encoding} | ||
\label{sec:zigzag-encoding} | ||
|
||
Encodes a signed 64-bit value with ZigZag. ZigZag maps signed integers | |
to unsigned values that can be efficiently encoded with varint. (Otherwise, | |
negative values must be sign-extended to 64 bits to be varint encoded, | |
thus always taking 10 bytes in the buffer.) | |
|
||
Parameter \lst{n} is a signed 64-bit integer. | ||
This Java method returns an unsigned 64-bit integer, stored in a signed \lst{long} because Java has no explicit unsigned support. | |
|
||
\begin{verbatim} | ||
public static long encodeZigZag64(final long n) { | ||
// Note: the right-shift must be arithmetic | ||
return (n << 1) ^ (n >> 63); | ||
} | ||
\end{verbatim} |
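The mapping interleaves non-negative and negative numbers (0, -1, 1, -2, 2 become 0, 1, 2, 3, 4), so values of small magnitude stay small after encoding. The sketch below restates the encoder and adds the inverse; the class name `ZigZagSketch` and the method `decodeZigZag64` are illustrative assumptions, not names taken from the specification.

```java
/** Illustrative ZigZag codec; the class name is hypothetical. */
final class ZigZagSketch {
    /** Same expression as above: the sign bit is moved to the lowest position. */
    static long encodeZigZag64(long n) {
        return (n << 1) ^ (n >> 63);   // n >> 63 is 0 for n >= 0 and -1 (all ones) for n < 0
    }

    /** Inverse mapping: the logical shift drops the sign bit, the XOR restores the sign. */
    static long decodeZigZag64(long n) {
        return (n >>> 1) ^ -(n & 1);   // -(n & 1) is 0 for even n and -1 for odd n
    }
}
```

Combined with VLQ, this means -1 is stored in one byte instead of ten.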
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
\section{Motivations} | ||
\label{sec:appendix:motivation} | ||
|
||
\subsection{Type Serialization format rationale} | ||
\label{sec:appendix:motivation:type} | ||
|
||
Some operations of \ASDag have type parameters, for which concrete types | |
should be specified (since \ASDag is a monomorphic IR). When an operation | |
(such as \hyperref[sec:serialization:operation:ExtractRegisterAs]{\lst{ExtractRegisterAs}}) is serialized, those types should also be | |
serialized as part of the operation. The following encoding is designed to | |
minimize the number of bytes required to represent a type in the serialization | |
format of \ASDag. | |
|
||
In most cases a type term serializes into a single byte. In the intermediate | |
representation of ErgoTree each type is represented by a tree of nodes where | |
leaves are primitive types and other nodes are type constructors. | |
A simple (but sub-optimal) way to serialize a type would be to give each | |
primitive type and each type constructor a unique type code. Then, to | |
serialize a node, we would emit its code and then perform recursive descent | |
to serialize all of its children. | |
However, to save storage space, we use a special encoding scheme that saves bytes | |
for the types that are used more often. | |
|
||
We assume the most frequently used types are: | ||
\begin{itemize} | ||
\item Primitive types (\lst{Int, Byte, Boolean, BigInt, GroupElement, | |
Box, AvlTree}) | |
\item Collections of primitive types (\lst{Coll[Byte]} etc.) | |
\item Options of primitive types (\lst{Option[Int]} etc.) | |
\item Nested collections of primitive types (\lst{Coll[Coll[Int]]} etc.) | |
\item Functions of primitive types (\lst{Box => Boolean} etc.) | |
\item First biased pair of types (\lst{(_, Int)} when we know the first | |
component is a primitive type) | |
\item Second biased pair of types (\lst{(Int, _)} when we know the second | |
component is a primitive type) | |
\item Symmetric pair of types (\lst{(Int, Int)} when we know both types are | |
the same) | |
\end{itemize} | ||
|
||
All the types above should be represented in an optimized way (preferably by a single byte). | |
For other types, we do recursive descent down the type tree as defined in section~\ref{sec:ser:type}. | |
|
||
\subsection{Constant Segregation rationale} | ||
|
||
\subsubsection{Massive script validation} | ||
|
||
Consider a transaction \lst{tx} which has an \lst{INPUTS} collection of boxes to | |
spend. Every input box can have a script protecting it (the \lst{propositionBytes} | |
property). This script should be executed in the context of the current | |
transaction. The simplest transaction has 1 input box. Thus, if we want to | |
sustain block validation at 1000 transactions per second, we need to | |
be able to validate at least 1000 scripts per second. | |
|
||
For every script (of input \lst{box}) the following is done in order to | ||
validate it: | ||
\begin{enumerate} | ||
\item Context is created with \lst{SELF} = box | ||
\item The script is deserialized into ErgoTree | ||
\item ErgoTree is traversed to build costGraph and calcGraph, two graphs for | ||
cost estimation function and script calculation function. | ||
\item Cost estimation is computed by evaluating costGraph with current context data | ||
\item If cost and data size limits are not exceeded, calcGraph is | ||
evaluated using context data to obtain sigma proposition (see | ||
\hyperref[sec:type:SigmaProp]{\lst{SigmaProp}}) | ||
\item Verification procedure is executed | ||
\end{enumerate} | ||
|
||
\subsubsection{Potential for Script processing optimization} | ||
|
||
Before an \langname contract can be stored in a blockchain, it should first be | |
compiled from its source text into ErgoTree and then serialized into a byte | |
array. | |
|
||
Because the language is purely functional and the IR is graph-based, the | |
compilation process has the effect of normalization/unification. This means | |
that different original scripts may have identical ErgoTrees and, as a | |
result, identical serialized bytes. | |
|
||
Because of normalization, and also because of script reusability, the number | |
of conceptually (or logically) different scripts is much smaller than the number | |
of individual scripts in a blockchain. For example, we may have 1000s of | |
different scripts in a blockchain with millions of boxes. | |
|
||
The average reusability ratio is 1000 in this case, and even those different | |
scripts may have different usage frequencies. With a large reusability ratio we | |
can optimize script evaluation by performing steps 1 - 4 only once per unique | |
script. | |
|
||
The compiled calcGraph can be cached in \lst{Map[Array[Byte], Context => | |
SigmaBoolean]}. Every script extracted from an input box can be used as a key | |
in this map to obtain a ready-to-execute graph. | |
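A minimal sketch of such a cache is shown below, under the assumption that the compiled graph can be modelled as an opaque value of type `G` produced by a caller-supplied `compile` function. The class `ScriptCacheSketch`, its fields and its methods are hypothetical. Java arrays use identity-based `equals` and `hashCode`, so the script bytes are wrapped in a `ByteBuffer`, which compares by content.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Illustrative cache of compiled graphs keyed by serialized script bytes (hypothetical names). */
final class ScriptCacheSketch<G> {
    private final Map<ByteBuffer, G> cache = new HashMap<>();
    private final Function<byte[], G> compile; // stands in for deserialization and graph building
    int compilations = 0;                      // counts cache misses, for illustration only

    ScriptCacheSketch(Function<byte[], G> compile) {
        this.compile = compile;
    }

    /** Returns the cached graph for these bytes, compiling only on the first request. */
    G graphFor(byte[] scriptBytes) {
        // Copy the bytes so later mutation of the caller's array cannot corrupt the key.
        ByteBuffer key = ByteBuffer.wrap(scriptBytes.clone());
        return cache.computeIfAbsent(key, k -> {
            compilations++;
            return compile.apply(scriptBytes);
        });
    }
}
```

Two boxes protected by byte-identical scripts then trigger a single compilation; scripts that differ in even one byte are compiled separately.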
|
||
However, there is one obstacle to optimization by caching: constants embedded | |
in contracts. In many cases it is very natural to | |
embed constants in the script body; the most notable scenario is when public keys | |
are embedded. As a result, two functionally identical scripts may serialize to | |
different byte arrays because they have different embedded constants. | |
|
||
\subsubsection{Constant-less ErgoTree} | ||
|
||
The solution to the problem with embedded constants is simple: we don't need | |
to embed constants. Each constant in the body of \ASDag can be replaced | |
with an indexed placeholder (see \hyperref[sec:appendix:primops:ConstantPlaceholder]{\lst{ConstantPlaceholder}}). | |
Each placeholder has an index field. The index of the placeholder is | |
assigned by breadth-first topological order of the graph traversal. | |
|
||
The transformation is part of compilation and is performed ahead of time. | |
Each \ASDag has an array of all the constants extracted from its body. Each | |
placeholder refers to a constant by the constant's index in the array. | |
|
||
Thus the serialized script has the format shown in Figure~\ref{fig:ser:ergotree}, which contains: | |
\begin{enumerate} | ||
\item number of constants | ||
\item constants collection | ||
\item script expression with placeholders | ||
\end{enumerate} | ||
|
||
The constants collection contains serialized constant data (using | ||
ConstantSerializer) one after another. | ||
The script expression is a serialized ErgoTree with placeholders. | ||
|
||
Using this new script format we can use the script expression part as a key in | |
the cache. An observation is that after the constants are extracted, what | |
remains is a template. Thus, instead of applying steps 1-4 to | |
\emph{constant-full} scripts we can apply them to \emph{constant-less} | |
templates. Before applying steps 4 and 5 we need to bind placeholders to | |
actual values taken from the constants collection. |
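The idea can be illustrated with a toy model that is not ErgoTree: a script is a flat list of string tokens, tokens starting with `const:` are constants, and placeholders are written as `ph:<index>`. All names below are hypothetical, and placeholders are numbered left to right rather than in the breadth-first topological order used for the real graph.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of constant segregation over a flat token list (hypothetical names). */
final class ConstantSegregationSketch {
    final List<String> constants = new ArrayList<>(); // extracted constants, in placeholder order
    final List<String> template = new ArrayList<>();  // script with constants replaced by placeholders

    /** Replaces every "const:..." token by "ph:<index>" and collects the constant. */
    static ConstantSegregationSketch segregate(List<String> tokens) {
        ConstantSegregationSketch s = new ConstantSegregationSketch();
        for (String t : tokens) {
            if (t.startsWith("const:")) {
                s.template.add("ph:" + s.constants.size());
                s.constants.add(t);
            } else {
                s.template.add(t);
            }
        }
        return s;
    }

    /** Binds placeholders back to the actual constants before evaluation. */
    List<String> bind() {
        List<String> out = new ArrayList<>();
        for (String t : template) {
            if (t.startsWith("ph:")) {
                out.add(constants.get(Integer.parseInt(t.substring(3))));
            } else {
                out.add(t);
            }
        }
        return out;
    }
}
```

Two scripts that differ only in an embedded public key produce the same template, so the template can serve as the cache key, while the constants collection carries the per-script values.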