From 7511b5f693d350fa0d675f0c527d0d633a0ba560 Mon Sep 17 00:00:00 2001
From: Alasdair Armstrong
Date: Wed, 18 Apr 2018 14:14:24 +0100
Subject: Start working on documentation

---
 doc/tutorial.tex | 558 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 558 insertions(+)
 create mode 100644 doc/tutorial.tex

(limited to 'doc/tutorial.tex')

diff --git a/doc/tutorial.tex b/doc/tutorial.tex
new file mode 100644
index 00000000..61ec73f3
--- /dev/null
+++ b/doc/tutorial.tex
@@ -0,0 +1,558 @@
+\section{Sail Language}
+
+\subsection{Functions}
+\label{sec:functions}
+
+\input{code_myreplicatebits}
+
+In Sail, functions are declared in two parts. First we write the type
+signature for the function using the \ll{val} keyword, then define the
+body of the function via the \ll{function} keyword. In this
+Subsection, we will write our own version of the \ll{replicate_bits}
+function from the Sail library. This function takes a number $n$ and a
+bitvector, and copies that bitvector $n$ times.
+
+\mrbmyreplicatebits
+
+\noindent The general syntax for a function type in Sail is as follows:
+\begin{center}
+  \ll{val} \textit{name} \ll{:} \ll{forall} \textit{type variables} \ll{,} \textit{constraint} \ll{. (} \textit{type$_1$} \ll{,} $\ldots$ \ll{,} \textit{type$_2$} \ll{)} \ll{->} \textit{return type}
+\end{center}
+An implementation for the \ll{replicate_bits} function would be
+follows \mrbfnmyreplicatebits
+
+The general syntax for a function declaration is:
+\begin{center}
+  \ll{function} \textit{name} \ll{(} \textit{argument$_1$} \ll{,} $\ldots$ \ll{,} \textit{argument$_n$} \ll{)} \ll{=} \textit{expression}
+\end{center}
+
+Code for this example can be found in
+the Sail repository in \verb|doc/examples/my_replicate_bits.sail|. To
+test it, we can invoke Sail interactively using the \verb|-i| option,
+passing the above file on the command line. Typing
+\verb|replicate_bits(3, 0xA)| will step through the execution of the
+above function, and eventually return the result \verb|0xAAA|. Typing
+\verb|:run| in the interactive interpreter will cause the expression
+to be evaluated fully, rather than stepping through the execution.
+
+Sail allows type variables to be used directly within expressions, so
+the above could be re-written as \mrbfnmyreplicatebitstwo This
+notation can be succinct, but should be used with caution. What
+happens is that Sail will try to re-write the type variables using
+expressions of the appropriate size that are available within the
+surrounding scope, in this case \ll{n} and \ll{length(xs)}. If no
+suitable expressions are found to trivially rewrite these type
+variables, then additional function parameters will be automatically
+added to pass around this information at runtime. This feature is
+however very useful for implementing functions with implicit
+parameters, e.g. we can implement a zero extension function that
+implicitly picks up its result length from the calling context as
+follows:
+\mrbextzz
+\mrbfnextzz
+
+Notice that we annotated the \ll{val} declaration as a
+\ll{cast}---this means that the type checker is allowed to
+automatically insert it where needed in order to make our code type
+check. This is another feature that must be used carefully, because
+too many implicit casts can quickly result in unreadable code. Sail
+does not make any distinction between expressions and statements, so
+since there is only a single line of code within the foreach block, we
+can drop it and simply write: \mrbfnmyreplicatebitsthree
+
+\subsection{Numeric Types}
+
+Sail has three basic numeric types, \ll{int}, \ll{nat}, and
+\ll{range}. The type \ll{int} is an arbitrary precision mathematical
+integer, and likewise \ll{nat} is an arbitrary precision natural
+number. The type \ll{range('n,'m)} is an inclusive range between the
+\ll{Int}-kinded type variables \ll{'n} and \ll{'m}. The type
+\ll{int('o)} is an integer exactly equal to the \ll{Int}-kinded type
+variable \ll{'n}, i.e. \ll{int('o)} $=$ \ll{range('o,'o)}. These types
+can be used interchangably provided the rules summarised in the below
+diagram are satisfied (via constraint solving).
+
+\begin{center}
+\begin{tikzpicture}
+  [type/.style={rectangle},auto,>=stealth',semithick]
+  \node[type] (int) at (0, 3) {\ll{int}};
+  \node[type] (nat) at (3, 0) {\ll{nat}};
+  \node[type] (range) at (-3, 0) {\ll{range('n,'m)}};
+  \node[type] (atom) at (0, -3) {\ll{int('o)}};
+
+  \draw[->] (nat) -- (int);
+  \draw[->] (range) -- (int);
+  \draw[->] (atom) to [bend left=10] node {} (int);
+  \draw[->] (range) to node {\ll{'n} $\ge$ \ll{0}} (nat);
+  \draw[->] (atom) to node {\ll{'o} $\ge$ \ll{0}} (nat);
+  \draw[->] (atom) to [bend right=35] node {\ll{'n} $\le$ \ll{'o} $\le$ \ll{'m}} (range);
+  \draw[->] (range) to [bend right=35] node [swap] {\ll{'n} $=$ \ll{'m} $=$ \ll{'o}} (atom);
+\end{tikzpicture}
+\end{center}
+
+Note that \ll{bit} isn't a numeric type (i.e. it's not
+\ll{range(0,1)}. This is intentional, as otherwise it would be
+possible to write expressions like \ll{(1 : bit) + 5} which would end
+up being equal to \ll{6 : range(5, 6)}. This kind of implicit casting
+from bits to other numeric types would be highly undesirable.
+
+\subsection{Vector Type}
+
+Sail has the builtin type vector, which is a polymorphic type for
+fixed-length vectors. For example, we could define a vector \ll{v} of
+three integers as follows:
+\begin{lstlisting}
+let v : vector(3, dec, int) = [1, 2, 3]
+\end{lstlisting}
+The first argument of the vector type is a numeric expression
+representing the length of the vector, and the last is the type of the
+vector's elements. But what is the second argument? Sail allows two
+different types of vector orderings---increasing (\ll{inc}) and
+decreasing (\ll{dec}). These two orderings are shown for the bitvector
+0b10110000 below.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+  \draw (0,0) rectangle (1,1) node[pos=.5] {1};
+  \draw (1,0) rectangle (2,1) node[pos=.5] {0};
+  \draw (2,0) rectangle (3,1) node[pos=.5] {1};
+  \draw (3,0) rectangle (4,1) node[pos=.5] {1};
+  \draw (4,0) rectangle (5,1) node[pos=.5] {0};
+  \draw (5,0) rectangle (6,1) node[pos=.5] {0};
+  \draw (6,0) rectangle (7,1) node[pos=.5] {0};
+  \draw (7,0) rectangle (8,1) node[pos=.5] {0};
+
+  \node at (.5,1.5) {0};
+  \node at (7.5,1.5) {7};
+  \draw[->] (1.5,1.5) -- (6.5,1.5);
+  \node at (.5,-0.5) {MSB};
+  \node at (7.5,-0.5) {LSB};
+  \node at (4,2) {\ll{inc}};
+\end{tikzpicture}
+\qquad
+\begin{tikzpicture}[scale=0.7]
+  \draw (0,0) rectangle (1,1) node[pos=.5] {1};
+  \draw (1,0) rectangle (2,1) node[pos=.5] {0};
+  \draw (2,0) rectangle (3,1) node[pos=.5] {1};
+  \draw (3,0) rectangle (4,1) node[pos=.5] {1};
+  \draw (4,0) rectangle (5,1) node[pos=.5] {0};
+  \draw (5,0) rectangle (6,1) node[pos=.5] {0};
+  \draw (6,0) rectangle (7,1) node[pos=.5] {0};
+  \draw (7,0) rectangle (8,1) node[pos=.5] {0};
+
+  \node at (.5,1.5) {7};
+  \node at (7.5,1.5) {0};
+  \draw[->] (1.5,1.5) -- (6.5,1.5);
+  \node at (.5,-0.5) {MSB};
+  \node at (7.5,-0.5) {LSB};
+  \node at (4,2) {\ll{dec}};
+\end{tikzpicture}
+\end{center}
+
+For increasing (bit)vectors, the 0 index is the most significant bit
+and the indexing increases towards the least significant bit. Whereas
+for decreasing (bit)vectors the least significant bit is 0 indexed,
+and the indexing decreases from the most significant to the least
+significant bit. For this reason, increasing indexing is sometimes
+called `most significant bit is zero' or MSB0, while decreasing
+indexing is sometimes called `least significant bit is zero' or
+LSB0. While this vector ordering makes most sense for bitvectors (it
+is usually called bit-ordering), in Sail it applies to all
+vectors. A default ordering can be set using
+\begin{lstlisting}
+default Order dec
+\end{lstlisting}
+and this should usually be done right at the beginning of a
+specification. Most architectures stick to either convention or the
+other, but Sail also allows functions which are polymorphic over
+vector order, like so:
+\begin{lstlisting}
+val foo : forall ('a : Order). vector(8, 'a, bit) -> vector(8, 'a, bit)
+\end{lstlisting}
+
+\paragraph{Bitvector Literals}
+
+Bitvector literals in Sail are written as either
+\lstinline[mathescape]{0x$\textit{hex string}$} or
+\lstinline[mathescape]{0b$\textit{binary string}$}, for example
+\ll{0x12FE} or \ll{0b1010100}. The length of a hex literal is always
+four times the number of digits, and the length of binary string is
+always the exact number of digits, so \ll{0x12FE} has length 16, while
+\ll{0b1010100} has length 7.
+
+\paragraph{Accessing and Updating Vectors}
+
+A vector can be indexed by using the
+\lstinline[mathescape]{$\textit{vector}$[$\textit{index}$]}
+notation. So, in the following code:
+\begin{lstlisting}
+  let v : vector(4, dec, int) = [1, 2, 3, 4]
+  let a = v[0]
+  let b = v[3]
+\end{lstlisting}
+\ll{a} will be \ll{4}, and \ll{b} will be \ll{1} (not that \ll{v} is
+\ll{dec}). By default, Sail will statically check for out of bounds
+errors, and will raise a type error if it cannot prove that all such
+vector accesses are valid.
+
+Vectors can be sliced using the
+%
+\lstinline[mathescape]{$\textit{vector}$[$\textit{index}_{msb}$ .. $\textit{index}_{lsb}$]}
+%
+notation. The indexes are always supplied with the index closest to
+the MSB being given first, so we would take the bottom 32-bits of a
+decreasing bitvector \ll{v} as \ll{v[31 .. 0]}, and the upper 32-bits
+of an increasing bitvector as \ll{v[0 .. 31]}, i.e. the indexing order
+for decreasing vectors decreases, and the indexing order for
+increasing vectors increases.
+
+A vector index can be updated using
+\lstinline[mathescape]{[$\textit{vector}$ with $\textit{index}$ = $\textit{expression}$]}
+notation.
+%
+Similarly, a subrange of a vector can be updated using
+%
+\lstinline[mathescape]{[$\textit{vector}$ with $\textit{index}_{msb}$ .. $\textit{index}_{lsb}$ = $\textit{expression}$]},
+%
+where the order of the indexes is the same as described above for
+increasing and decreasing vectors.
+
+These expressions are actually just syntactic sugar for several
+builtin functions, namely \ll{vector_access}, \ll{vector_subrange},
+\ll{vector_update}, and \ll{vector_update_subrange}.
+
+\subsection{List Type}
+
+In addition to vectors, Sail also has \ll{list} as a builtin type. For
+example:
+\begin{lstlisting}
+let l : list(int) = [|1, 2, 3|]
+\end{lstlisting}
+The cons operator is \ll{::}, so we could equally write:
+\lstinputlisting{examples/list.sail}
+Pattern matching can be used to destructure lists, see Section~\ref{sec:pat}
+
+\subsection{Other Types}
+
+
+\subsection{Pattern Matching}
+\label{sec:pat}
+
+Like most functional languages, Sail supports pattern matching via the
+\ll{match} keyword. For example:
+\begin{lstlisting}
+let n : int = f();
+match n {
+  1 => print("1"),
+  2 => print("2"),
+  3 => print("3"),
+  _ => print("wildcard")
+}
+\end{lstlisting}
+The \ll{match} keyword takes an expression and then branches using a
+pattern based on its value. Each case in the match expression takes
+the form \lstinline[mathescape]{$\textit{pattern}$ => $\textit{expression}$},
+separated by commas. The cases are checked sequentially from top to
+bottom, and when the first pattern matches its expression will be
+evaluated.
+
+\ll{match} in Sail is not currently check for exhaustiveness---after
+all we could have arbitrary constraints on a numeric variable being
+matched upon, which would restrict the possible cases in ways that we
+could not determine with just a simple syntactic check. However, there
+is a simple exhaustiveness checker which runs and gives warnings (not
+errors) if it cannot tell that the pattern match is exhaustive, but
+this check can give false positives. It can be turned off with the
+\verb+-no_warn+ flag.
+
+When we match on numeric literals, the type of the variable we are
+matching on will be adjusted. In the above example in the
+\ll{print("1")} case, $n$ will have the type \ll{int('e)}, where
+\ll{'e} is some fresh type variable, and there will be a constraint
+that \ll{'e} is equal to one.
+
+We can also have guards on patterns, for example we could modify the
+above code to have an additional guarded case like so:
+\begin{lstlisting}
+let n : int = f();
+match n {
+  1 => print("1"),
+  2 => print("2"),
+  3 => print("3"),
+  m if m <= 10 => print("n is less than or equal to 10"),
+  _ => print("wildcard")
+}
+\end{lstlisting}
+The variable pattern m will match against anything, and the guard can
+refer to variables bound by the pattern.
+
+\paragraph{Matching on enums}
+
+Match can be used to match on possible values of an enum, like so:
+\begin{lstlisting}
+enum E = A | B | C
+
+match x {
+  A => print("A"),
+  B => print("B"),
+  C => print("C")
+}
+\end{lstlisting}
+Note that because Sail places no restrictions on the lexical structure
+of enumeration elements to differentiate them from ordinary
+identifiers, pattern matches on variables and enum elements can be
+somewhat ambiguous. This is the primary reason why we have the basic,
+but incomplete, pattern exhaustiveness check mentioned above---it can
+warn you if removing an enum constructor breaks a pattern match.
+
+\paragraph{Matching on unions}
+
+Match can also be used to destructure union constructors, for example
+using the option type from Section~\ref{sec:union}:
+\begin{lstlisting}
+match option {
+  Some(x) => foo(x),
+  None()  => print("matched None()")
+}
+\end{lstlisting}
+Note that like how calling a function with a unit argument can be done
+as \ll{f()} rather than \ll{f(())}, matching on a constructor \ll{C}
+with a unit type can be acheived by using \ll{C()} rather than
+\ll{C(())}.
+
+\paragraph{Matching on bit vectors}
+
+Sail allows numerous ways to match on bitvectors, for example:
+\begin{lstlisting}
+match v {
+  0xFF => print("hex match"),
+  0x0000_0001 => print("binary match"),
+  0xF @ v : bits(4) => print("vector concatentation pattern"),
+  0xF @ [bitone, _, b1, b0] => print("vector pattern"),
+  _ : bits(4) @ v : bits(4) => print("annotated wildcard pattern")
+}
+\end{lstlisting}
+We can match on bitvector literals in either hex or binary forms. We
+also have vector concatantion patterns, of the form
+\lstinline[mathescape]{$\mathit{pattern}$ @ $\ldots$ @ $\mathit{pattern}$}.
+We must be able to infer the length of all the subpatterns in a vector
+concatentation pattern, hence why in the example above all the
+wildcard and variable patterns beneath vector concatenation patterns
+have type annotations. In the context of a pattern the \ll{:} operator
+binds tighter than the \ll{@} operator (as it does elsewhere).
+
+We also have vector patterns, which for bitvectors match on individual
+bits. In the above example, \ll{b0} and \ll{b1} will have type
+\ll{bit}. The pattern \ll{bitone} is a bit literal, with \ll{bitzero}
+being the other bit literal pattern.
+
+Note that because vectors in sail are type-polymorphic, we can also
+use both vector concatentation patterns and vector patterns to match
+against non-bit vectors.
+
+\paragraph{Matching on lists}
+
+Sail allows lists to be destructured using patterns. There are two
+types of patterns for lists, cons patterns and list literal patterns.
+
+\begin{lstlisting}
+match ys {
+  x :: xs => print("cons pattern"),
+  [||]    => print("empty list")
+}
+\end{lstlisting}
+
+\begin{lstlisting}
+match ys {
+  [|1, 2, 3|] => print("list pattern"),
+  _           => print("wildcard")
+}
+\end{lstlisting}
+
+\paragraph{As patterns}
+
+\subsection{Mutable and Immutable Variables}
+
+\subsection{Type declarations}
+
+\subsubsection{Enumerations}
+
+Enumerations can be defined in either a Haskell-like syntax
+(useful for smaller enums) or a more traditional C-like syntax, which
+is often more readable for enumerations with more members. There are
+no lexical constraints on the identifiers that can be part of an
+enumeration. There are also no restrictions on the name of a
+enumeration type, other than it must be a valid identifier. For
+example, the following shows two ways to define the enumeration
+\ll{Foo} with three members, \ll{Bar}, \ll{Baz}, and \ll{quux}:
+
+\lstinputlisting{examples/enum1.sail}
+\lstinputlisting{examples/enum2.sail}
+
+For every enumeration type $E$ sail generates a
+\lstinline[mathescape]{num_of_$E$} function and a
+\lstinline[mathescape]{$E$_of_num} function, which for \ll{Foo} above
+will have the following definitions\footnote{It will ensure that the
+  generated function name \ll{arg} does not clash with any enumeration
+  constructor.}:
+\begin{lstlisting}
+val Foo_of_num : forall 'e, 0 <= 'e <= 2. int('e) -> Foo
+function Foo_of_num(arg) = match arg {
+  0 => Bar,
+  1 => Baz,
+  _ => quux
+}
+
+val num_of_Foo : Foo -> {'e, 0 <= 'e <= 2. int('e)}
+function num_of_Foo(arg) = match arg {
+  Bar  => 0,
+  Baz  => 1,
+  quux => 2
+}
+\end{lstlisting}
+Note that these functions are not automatically made into implicit
+casts.
+
+\subsubsection{Structs}
+
+Structs are defined using the struct keyword like so:
+\lstinputlisting{examples/struct.sail}
+
+If we have a struct \ll{foo : Foo}, its fields can be accessed by
+\ll{foo.bar}, and set as \ll{foo.bar = 0xF}. It can also be updated in
+a purely functional fashion using the construct \ll{\{foo with bar =
+  0xF\}}. There is no lexical restriction on the name of a struct or
+the names of its fields.
+
+\subsubsection{Unions}
+\label{sec:union}
+
+As an example, the \ll{maybe} type \'{a} la Haskell could be defined
+in Sail as follows:
+\begin{lstlisting}
+union maybe ('a : Type) = {
+  Just : 'a,
+  None : unit
+}
+\end{lstlisting}
+Constructors, such as \ll{Just} are called like functions, as in
+\ll{Just(3) : maybe(int)}. The \ll{None} constructor is also called in
+this way, as \ll{None()}. Notice that unlike in other languages, every
+constructor must be associated with a type---there are no nullary
+constructors. As with structs there are no lexical restrictions on the
+names of either the constructors nor the type itself, other than they
+must be valid identifiers.
+
+\subsubsection{Bitfields}
+
+The following example creates a bitfield type called \ll{cr} and a
+register \ll{CR} of that type.
+
+\lstinputlisting{examples/bitfield.sail}
+
+A bitfield definition creates a wrapper around a bit vector type, and
+generates getters and setters for the fields. For the setters, it is
+assumed that they are being used to set registers with the bitfield
+type\footnote{This functionality was originally called \emph{register
+    types} for this reason, but this was confusing because types of
+  registers are not always register types.}. If the bitvector is
+decreasing then indexes for the fields must also be in decreasing
+order, and vice-versa for an increasing vector. For the above example,
+the bitfield wrapper type will be the following:
+
+\begin{lstlisting}
+union cr = { Mk_cr(vector(8, dec, bit)) }
+\end{lstlisting}
+
+The complete vector can be accessed as \ll{CR.bits()}, and a register
+of type \ll{cr} can be set like \ll{CR->bits() = 0xFF}. Getting and
+setting individual fields can be done similarly, as \ll{CR.CR0()} and
+\ll{CR->CR0() = 0xF}. Internally, the bitfield definition will
+generate a \lstinline[mathescape]{_get_$F$} and
+\lstinline[mathescape]{_set_$F$} function for each field
+\lstinline[mathescape]{$F$}, and then overload them as
+\lstinline[mathescape]{_mod_$F$} for the accessor syntax. The setter
+takes the bitfield as a reference to a register, hence why we use the
+\ll{->} notation. For pure updates of values of type \ll{cr} a
+function \lstinline[mathescape]{update_$F$} is also defined. For more
+details on getters and setters, see Section~\ref{sec:getset}. A
+singleton bit in a bitfield definition, such as \ll{LT : 7} will be
+defined as a bitvector of length one, and not as a value of type
+\ll{bit}, which mirrors the behavior of ARM's ASL language.
+
+\subsection{Operators}
+
+Valid operators in Sail are sequences of the following non
+alpha-numeric characters: \verb#!%&*+-./:<>=@^|#. Additionally, any
+such sequence may be suffixed by an underscore followed by any valid
+identifier, so \verb#<=_u# or even \verb#<=_unsigned# are valid
+operator names. Operators may be left, right, or non-associative, and
+there are 10 different precedence levels, ranging from 0 to 9, with 9
+binding the tightest. To declare the precedence of an operator, we use a fixity declaration like:
+\begin{lstlisting}
+infix <=_u 4
+\end{lstlisting}
+For left or right associative operators, we'd use the keywords
+\ll{infixl} or \ll{infixr} respectively. An operator can be used
+anywhere a normal identifier could be used via the \ll{operator}
+keyword. As such, the \verb#<=_u# operator can be defined as:
+\begin{lstlisting}
+val operator <=_u : forall 'n. (bits('n), bits('n)) -> bool
+function operator <=_u(x, y) = unsigned(x) <= unsigned(y)
+\end{lstlisting}
+
+\paragraph{Builtin precedences}
+The precedence of several common operators are built into Sail. These
+include all the operators that are used in type-level numeric
+expressions, as well as several common operations such as equality,
+division, and modulus. The precedences for these operators are
+summarised in Table~\ref{tbl:operators}.
+
+\begin{table}[hbt]
+  \center
+  \begin{tabular}{| c || l | l | l |}
+    \hline
+    Precedence & Left associative & Non-associative & Right associative\\
+    \hline
+    9 & & &\\
+    \hline
+    8 & & & \ll{^}\\
+    \hline
+    7 & \ll{*}, \ll{/}, \ll{\%} & &\\
+    \hline
+    6 & \ll{+}, \ll{-} & &\\
+    \hline
+    5 & & &\\
+    \hline
+    4 & & \ll{<}, \ll{<=}, \ll{>}, \ll{>=}, \ll{!=}, \ll{=}, \ll{==} &\\
+    \hline
+    3 & & & \ll{&}\\
+    \hline
+    2 & & & \ll{|}\\
+    \hline
+    1 & & &\\
+    \hline
+    0 & & &\\
+    \hline
+  \end{tabular}
+  \caption{Default Sail operator precedences}
+  \label{tbl:operators}
+\end{table}
+
+\paragraph{Type operators}
+Sail allows operators to be used at the type level. For example, we
+could define a synonym for the builtin \ll{range} type as:
+\lstinputlisting{examples/type_operator.sail} Note that we can't use
+\ll{..} as an operator name, because that is reserved syntax for
+vector slicing. Operators used in types always share precedence with
+identically named operators at the expression level.
+
+\subsection{Ad-hoc Overloading}
+
+\subsection{Getters and Setters}
+\label{sec:getset}
+
+\subsection{Sizeof and Constraint}
+
+\subsection{Preludes and Default Environment}
+\label{sec:prelude}
-- 
cgit v1.2.3