From b6018c78b25da14d4f44cf10de692f968cba1e98 Mon Sep 17 00:00:00 2001 From: filliatr Date: Tue, 12 Dec 2000 22:36:15 +0000 Subject: Initial revision git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@8143 85f007b7-540e-0410-9357-904b9bb8a0f7 --- doc/Programs.tex | 931 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 931 insertions(+) create mode 100644 doc/Programs.tex (limited to 'doc/Programs.tex') diff --git a/doc/Programs.tex b/doc/Programs.tex new file mode 100644 index 0000000000..e92dd59311 --- /dev/null +++ b/doc/Programs.tex @@ -0,0 +1,931 @@ +\achapter{Proof of imperative programs} +\aauthor{Jean-Christophe Filliâtre} +\label{Addoc-programs} +\index{Imperative programs!proof of} +\comindex{Correctness} + +%%%%%%%%%%%%%%%% +% Introduction % +%%%%%%%%%%%%%%%% + +This chapter describes a new tactic +to prove the correctness and termination of imperative +programs annotated in a Floyd-Hoare logic style. +This tactic is provided in the \Coq\ module \texttt{Programs}, +which does not belong to the initial state of \Coq. So you must import +it when necessary, with the following command: +$$ +\mbox{\texttt{Require Programs.}} +$$ +If you want to use this tactic with the native-code version of \Coq, +you will have to run the version of \Coq\ with all the tactics, through +the command +$$ +\mbox{\texttt{coqtop -full}} +$$ +\emph{Be aware that this tactic is still very experimental}. + + +%%%%%%%%%%%%%%%%%%%%% +% comment ça marche % +%%%%%%%%%%%%%%%%%%%%% + +\asection{How it works} + +Before going on into details and syntax, let us give a quick overview +of how that tactic works. +Its behavior is the following: you give a +program annotated with logical assertions and the tactic will generate +a bundle of subgoals, called \emph{proof obligations}. Then, if you +prove all those proof obligations, you will establish the correctness and the +termination of the program. +The implementation currently supports traditional imperative programs +with references and arrays on arbitrary purely functional datatypes, +local variables, functions with call-by-value and call-by-variable +arguments, and recursive functions. + +Although it behaves as an implementation of Floyd-Hoare logic, it is not. +The idea of the underlying mechanism is to translate the imperative +program into a partial proof of a proposition of the kind +$$ +\forall \vec{x}. P(\vec{x}) + \Rightarrow \exists(\vec{y},v). Q(\vec{x},\vec{y},v) +$$ +where $P$ and $Q$ stand for the pre- and post-conditions of the +program, $\vec{x}$ and $\vec{y}$ the variables used and modified by +the program and $v$ its result. +Then this partial proof is given to the tactic \texttt{Refine} +(see~\ref{Refine}, page~\pageref{Refine}), which effect is to generate as many +subgoals as holes in the partial proof term. + +\medskip + +The syntax to invoke the tactic is the following: +$$ +\mbox{\tt Correctness} ~~ ident ~~ annotated\_program. +$$ +Notice that this is not exactly a \emph{tactic}, since it does not +apply to a goal. To be more rigorous, it is the combination of a +vernacular command (which creates the goal from the annotated program) +and a tactic (which partially solves it, leaving some proof +obligations to the user). + +Whereas \texttt{Correctness} is not a tactic, the following syntax is +available: +$$ +\mbox{\tt Correctness} ~~ ident ~~ annotated\_program ~ ; ~ tactic. +$$ +In that case, the given tactic is applied on any proof +obligation generated by the first command. + + +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +% Syntaxe de programmes annotes % +%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + +\asection{Syntax of annotated programs} + +\asubsection{Programs} + +The syntax of programs is given in figure~\ref{fig:ProgramsSyntax}. +Basically, the programming language is a purely functional kernel +with an addition of references and arrays on purely functional values. +If you do not consider the logical assertions, the syntax coincide +with \ocaml\ syntax, except for elements of arrays which are written +$t[i]$. In particular, the dereference of a mutable variable $x$ is +written $!x$ and assignment is written \verb!:=! +(for instance, the increment of the variable $x$ will +be written \verb@x := !x + 1@). +Actually, that syntax does not really matters, since it would +be extracted later to real concrete syntax, in different programming +languages. + +\begin{figure}[htbp] + \begin{center} + \leavevmode +$$ +\begin{array}{lrl} + prog & ::= & \verb!{! ~ predicate ~ \verb!}! * + ~ statement ~ [ \verb!{! ~ predicate ~ \verb!}! ] \\ + + & & \\[0.1em] + + statement + & ::= & expression \\ + & | & identifier ~ \verb!:=! ~ prog \\ + & | & identifier ~ \verb![! ~ expression ~ \verb!]! + ~ \verb!:=! ~ prog \\ + & | & \verb!let! ~ identifier ~ \verb!=! ~ \verb!ref! ~ + prog ~ \verb!in! ~ prog \\ + & | & \verb!if! ~ prog ~ \verb!then! ~ prog + ~ [ \verb!else! ~ prog ] \\ + & | & \verb!while! ~ prog ~ \verb!do! + ~ loop\_annot ~ block ~ \verb!done! \\ + & | & \verb!begin! ~ block ~ \verb!end! \\ + & | & \verb!let! ~ identifier ~ \verb!=! ~ prog ~ + \verb!in! ~ prog \\ + & | & \verb!fun! ~ binders ~ \verb!->! ~ prog \\ + & | & \verb!let! ~ \verb!rec! ~ identifier ~ binder ~ \verb!:! + ~ value\_type \\ + & & ~~ \verb!{! ~ \verb!variant! ~ wf\_arg ~ \verb!}! + ~ \verb!=! ~ prog ~ [ \verb!in! ~ prog ] \\ + & | & \verb!(! ~ prog ~~ prog ~ \verb!)! \\ + + & & \\[1em] + + expression + & ::= & identifier \\ + & | & \verb@!@ ~ identifier \\ + & | & identifier ~ \verb![! ~ expression ~ \verb!]! \\ + & | & integer \\ + & | & \verb!(! ~ expression+ \verb!)! \\ + + & & \\[1em] + + block & ::= & block\_statement ~ [ ~ \verb!;! ~ block ~ ] \\ + + & & \\[0.1em] + + block\_statement + & ::= & prog \\ + & | & \verb!label! ~ identifier \\ + & | & \verb!assert! ~ \verb!{! ~ predicate ~ \verb!}! \\ + + & & \\[1em] + + binders & ::= & \verb!(! ~ identifier,\dots,identifier ~ \verb!:! + ~ value\_type ~ \verb!)! + \\ + + & & \\[1em] + + loop\_annot + & ::= & \verb!{! ~ \verb!invariant! ~ predicate ~ + \verb!variant! ~ wf\_arg ~ \verb!}! \\ + & & \\[0.1em] + + wf\_arg & ::= & cic\_term ~ [ \verb!for! ~ cic\_term ] \\ + + & & \\[0.1em] + + predicate & ::= & cci\_term ~ [ \verb!as! ~ identifier ] \\ + + \end{array} +$$ + \caption{Syntax of annotated programs} + \label{fig:ProgramsSyntax} + \end{center} +\end{figure} + + +\paragraph{Syntactic sugar.} + +\begin{itemize} + \item \textbf{Boolean expressions:} + + Boolean expressions appearing in programs (and in particular in + \kw{if} and \kw{while} tests) are arbitrary programs of type \texttt{bool}. + In order to make programs more readable, some syntactic sugar is + provided for the usual logical connectives and the usual order + relations over type \texttt{Z}, with the following syntax: + $$ + \begin{array}{lrl} + prog + & ::= & prog ~ \verb!and! ~ prog \\ + & | & prog ~ \verb!or! ~ prog \\ + & | & \verb!not! ~ prog \\ + & | & expression ~ order\_relation ~ expression \\[1em] + order\_relation + & ::= & \verb!>! ~|~ \verb!>=! ~|~ \verb!! \\ + \end{array} + $$ + where the order relations have the strongest precedences, + \verb!not! has a stronger precedence than \verb!and!, + and \verb!and! a stronger precedence than \verb!or!. + + Order relations in other types, like \texttt{lt}, \texttt{le}, \dots in + type \texttt{nat}, should be explicited as described in the + paragraph about \emph{Boolean expressions}, page~\pageref{prog:booleans}. + + \item \textbf{Arithmetical expressions:} + + Some syntactic sugar is provided for the usual arithmetic operator + over type \texttt{Z}, with the following syntax: + $$ + \begin{array}{lrl} + prog + & ::= & prog ~ \verb!*! ~ prog \\ + & | & prog ~ \verb!+! ~ prog \\ + & | & prog ~ \verb!-! ~ prog \\ + & | & \verb!-! ~ prog + \end{array} + $$ + where the unary operator \texttt{-} has the strongest precedence, + and \texttt{*} a stronger precedence than \texttt{+} and \texttt{-}. + + Operations in other arithmetical types (such as type \texttt{nat}) + must be explicitly written as applications, like + \texttt{(plus~a~b)}, \texttt{(pred~a)}, etc. + + \item \texttt{if $b$ then $p$} is a shortcut for + \texttt{if $b$ then $p$ else tt}, where \texttt{tt} is the + constant of type \texttt{unit}; + + \item Values in type \texttt{Z} + may be directly written as integers : 0,1,12546,\dots + Negative integers are not recognized and must be written + as \texttt{(Zinv $x$)}; + + \item Multiple application may be written $(f~a_1~\dots~a_n)$, + which must be understood as left-associa\-tive + i.e. as $(\dots((f~a_1)~a_2)\dots~a_n)$. +\end{itemize} + + +\paragraph{Restrictions.} + +You can notice some restrictions with respect to real ML programs: +\begin{enumerate} + \item Binders in functions (recursive or not) are explicitly typed, + and the type of the result of a recursive function is also given. + This is due to the lack of type inference. + + \item Full expressions are not allowed on left-hand side of assignment, but + only variables. Therefore, you can not write +\begin{verbatim} + (if b then x else y) := 0 +\end{verbatim} + But, in most cases, you can rewrite + them into acceptable programs. For instance, the previous program + may be rewritten into the following one: +\begin{verbatim} + if b then x := 0 else y := 0 +\end{verbatim} +\end{enumerate} + + + +%%%%%%%%%%%%%%% +% Type system % +%%%%%%%%%%%%%%% + +\asubsection{Typing} + +The types of annotated programs are split into two kinds: the types of +\emph{values} and the types of \emph{computations}. Those two types +families are recursively mutually defined with the following concrete syntax: +$$ +\begin{array}{lrl} + value\_type + & ::= & cic\_term \\ + & | & {cic\_term} ~ \verb!ref! \\ + & | & \verb!array! ~ cic\_term ~ \verb!of! ~ cic\_term \\ + & | & \verb!fun! ~ \verb!(! ~ x \verb!:! value\_type ~ \verb!)!\!+ + ~ computation\_type \\ + & & \\ + computation\_type + & ::= & \verb!returns! ~ identifier \verb!:! value\_type \\ + & & [\verb!reads! ~ identifier,\ldots,identifier] + ~ [\verb!writes! ~ identifier,\ldots,identifier] \\ + & & [\verb!pre! ~ predicate] ~ [\verb!post! ~ predicate] \\ + & & \verb!end! \\ + & & \\ + predicate + & ::= & cic\_term \\ + & & \\ +\end{array} +$$ + +The typing is mostly the one of ML, without polymorphism. +The user should notice that: +\begin{itemize} + \item Arrays are indexed over the type \texttt{Z} of binary integers + (defined in the module \texttt{ZArith}); + + \item Expressions must have purely functional types, and can not be + references or arrays (but, of course, you can pass mutables to + functions as call-by-variable arguments); + + \item There is no partial application. +\end{itemize} + + +%%%%%%%%%%%%%%%%%% +% Specifications % +%%%%%%%%%%%%%%%%%% + +\asubsection{Specification} + +The specification part of programs is made of different kind of +annotations, which are terms of sort \Prop\ in the Calculus of Inductive +Constructions. + +Those annotations can refer to the values of the variables +directly by their names. \emph{There is no dereference operator ``!'' in +annotations}. Annotations are read with the \Coq\ parser, so you can +use all the \Coq\ syntax to write annotations. For instance, if $x$ +and $y$ are references over integers (in type \texttt{Z}), you can write the +following annotation +$$ +\mbox{\texttt{\{ `0 < x <= x+y` \}}} +$$ +In a post-condition, if necessary, you can refer to the value of the variable +$x$ \emph{before} the evaluation with the notation $x@$. +Actually, it is possible to refer to the value of a variable at any +moment of the evaluation with the notation $x@l$, +provided that $l$ is a \emph{label} previously inserted in your program +(see below the paragraph about labels). + +You have the possibility to give some names to the annotations, with +the syntax +$$ +\texttt{\{ \emph{annotation} as \emph{identifier} \}} +$$ +and then the annotation will be given this name in the proof +obligations. +Otherwise, fresh names are given automatically, of the kind +\texttt{Post3}, \texttt{Pre12}, \texttt{Test4}, etc. +You are encouraged to give explicit names, in order not to have to +modify your proof script when your proof obligations change (for +instance, if you modify a part of the program). + + +\asubsubsection{Pre- and post-conditions} + +Each program, and each of its sub-programs, may be annotated by a +pre-condition and/or a post-condition. +The pre-condition is an annotation about the values of variables +\emph{before} the evaluation, and the post-condition is an annotation +about the values of variables \emph{before} and \emph{after} the +evaluation. Example: +$$ +\mbox{\texttt{\{ `0 < x` \} x := (Zplus !x !x) \{ `x@ < x` \}}} +$$ +Moreover, you can assert some properties of the result of the evaluation +in the post-condition, by referring to it through the name \emph{result}. +Example: +$$ +\mbox{\texttt{(Zs (Zplus !x !x)) \{ (Zodd result) \}}} +$$ + + +\asubsubsection{Loops invariants and variants} + +Loop invariants and variants are introduced just after the \kw{do} +keyword, with the following syntax: +$$ +\begin{array}{l} + \kw{while} ~ B ~ \kw{do} \\ + ~~~ \{ ~ \kw{invariant} ~ I ~~~ \kw{variant} ~ \phi ~ \kw{for} ~ R ~ + \} \\ + ~~~ block \\ + \kw{done} +\end{array} +$$ + +The invariant $I$ is an annotation about the values of variables +when the loop is entered, since $B$ has no side effects ($B$ is a +purely functional expression). Of course, $I$ may refer to values of +variables at any moment before the entering of the loop. + +The variant $\phi$ must be given in order to establish the termination of +the loop. The relation $R$ must be a term of type $A\rightarrow +A\rightarrow\Prop$, where $\phi$ is of type $A$. +When $R$ is not specified, then $\phi$ is assumed to be of type +\texttt{Z} and the usual order relation on natural number is used. + + +\asubsubsection{Recursive functions} + +The termination of a recursive function is justified in the same way as +loops, using a variant. This variant is introduced with the following syntax +$$ +\kw{let} ~ \kw{rec} ~ f ~ (x_1:V_1)\dots(x_n:V_n) : V + ~ \{ ~ \kw{variant} ~ \phi ~ \kw{for} ~ R ~ \} = prog +$$ +and is interpreted as for loops. Of course, the variant may refer to +the bound variables $x_i$. +The specification of a recursive function is the one of its body, $prog$. +Example: +$$ +\kw{let} ~ \kw{rec} ~ fact ~ (x:Z) : Z ~ \{ ~ \kw{variant} ~ x \} = + \{ ~ x\ge0 ~ \} ~ \dots ~ \{ ~ result=x! ~ \} +$$ + + +\asubsubsection{Assertions inside blocks} + +Assertions may be inserted inside blocks, with the following syntax +$$ +\kw{begin} ~ block\_statements \dots; ~ \kw{assert} ~ \{ ~ P ~ \}; + ~ block\_statements \dots ~ \kw{end} +$$ +The annotation $P$ may refer to the values of variables at any labels +known at this moment of evaluation. + + +\asubsubsection{Inserting labels in your program} + +In order to refer to the values of variables at any moment of +evaluation of the program, you may put some \emph{labels} inside your +programs. Actually, it is only necessary to insert them inside blocks, +since this is the only place where side effects can appear. The syntax +to insert a label is the following: +$$ +\kw{begin} ~ block\_statements \dots; ~ \kw{label} ~ L; + ~ block\_statements \dots ~ \kw{end} +$$ +Then it is possible to refer to the value of the variable $x$ at step +$L$ with the notation $x@L$. + +There is a special label $0$ which is automatically inserted at the +beginning of the program. Therefore, $x@0$ will always refer to the +initial value of the variable $x$. + +\medskip + +Notice that this mechanism allows the user to get rid of the so-called +\emph{auxiliary variables}, which are usually widely used in +traditional frameworks to refer to previous values of variables. + + +%%%%%%%%%%%% +% booléens % +%%%%%%%%%%%% + +\asubsubsection{Boolean expressions}\label{prog:booleans} + +As explained above, boolean expressions appearing in \kw{if} and +\kw{while} tests are arbitrary programs of type \texttt{bool}. +Actually, there is a little restriction: a test can not do some side +effects. +Usually, a test if annotated in such a way: +$$ + B ~ \{ \myifthenelse{result}{T}{F} \} +$$ +(The \textsf{if then else} construction in the annotation is the one +of \Coq\ !) +Here $T$ and $F$ are the two propositions you want to get in the two +branches of the test. +If you do not annotate a test, then $T$ and $F$ automatically become +$B=\kw{true}$ and $B=\kw{false}$, which is the usual annotation in +Floyd-Hoare logic. + +But you should take advantages of the fact that $T$ and $F$ may be +arbitrary propositions, or you can even annotate $B$ with any other +kind of proposition (usually depending on $result$). + +As explained in the paragraph about the syntax of boolean expression, +some syntactic sugar is provided for usual order relations over type +\texttt{Z}. When you write $\kw{if} ~ x \kw{if} $(odd ~ !n)$ \kw{then} $y$ := $!y \times !m$ ;\\ + \> $m$ := $!m \times !m$ ; \\ + \> $n$ := $!n / 2$ \\ + \kw{done} \\ + \end{tabbing} + \end{minipage} +\end{center} + + +\paragraph{Specification part.} + +Here we choose to use the binary integers of \texttt{ZArith}. +The exponentiation $X^n$ is defined, for $n \ge 0$, in the module +\texttt{Zpower}: +\begin{coq_example*} +Require ZArith. +Require Zpower. +\end{coq_example*} + +In particular, the module \texttt{ZArith} loads a module \texttt{Zmisc} +which contains the definitions of the predicate \texttt{Zeven} and +\texttt{Zodd}, and the function \texttt{Zdiv2}. +This module \texttt{ProgBool} also contains a test function +\texttt{Zeven\_odd\_bool} of type +$\forall n. \exists b. \myifthenelse{b}{(Zeven~n)}{(Zodd~n)}$ +derived from the proof \texttt{Zeven\_odd\_dec}, +as explained in section~\ref{prog:booleans}: + +\begin{coq_eval} +Require Ring. +\end{coq_eval} + + +\paragraph{Correctness part.} + +Then we come to the correctness proof. We first import the \Coq\ +module \texttt{Programs}: +\begin{coq_example*} +Require Programs. +\end{coq_example*} +\begin{coq_eval} +Definition Zsquare := [n:Z](Zmult n n). +Definition Zdouble := [n:Z]`2*n`. +\end{coq_eval} + +Then we introduce all the variables needed by the program: +\begin{coq_example*} +Parameter x : Z. +Global Variable n,m,y : Z ref. +\end{coq_example*} + +At last, we can give the annotated program: +\begin{coq_example} +Correctness i_exp + { `n >= 0` } + begin + m := x; y := 1; + while !n > 0 do + { invariant (Zpower x n@0)=(Zmult y (Zpower m n)) /\ `n >= 0` + variant n } + (if not (Zeven_odd_bool !n) then y := (Zmult !y !m)) + { (Zpower x n@0) = (Zmult y (Zpower m (Zdouble (Zdiv2 n)))) }; + m := (Zsquare !m); + n := (Zdiv2 !n) + done + end + { y=(Zpower x n@0) } +. +\end{coq_example} + +The proof obligations require some lemmas involving \texttt{Zpower} and +\texttt{Zdiv2}. You can find the whole proof in the \Coq\ standard +library (see below). +Let us make some quick remarks about this program and the way it was +written: +\begin{itemize} + \item The name \verb!n@0! is used to refer to the initial value of + the variable \verb!n!, as well inside the loop invariant as in + the post-condition; + + \item Purely functional expressions are allowed anywhere in the + program and they can use any purely informative \Coq\ constants; + That is why we can use \texttt{Zmult}, \texttt{Zsquare} and + \texttt{Zdiv2} in the programs even if they are not other functions + previously introduced as programs. +\end{itemize} + + +\asubsection{A recursive program} + +To give an example of a recursive program, let us rewrite the previous +program into a recursive one. We obtain the following program: +\begin{center} + \begin{minipage}{8cm} + \begin{tabbing} + AA\=AA\=AA\=\kill + \kw{let} \kw{rec} $exp$ $x$ $n$ = \\ + \> \kw{if} $n = 0$ \kw{then} \\ + \> \> 1 \\ + \> \kw{else} \\ + \> \> \kw{let} $y$ = $(exp ~ x ~ (n/2))$ \kw{in} \\ + \> \> \kw{if} $(even ~ n)$ \kw{then} $y\times y$ + \kw{else} $x\times (y\times y)$ \\ + \end{tabbing} + \end{minipage} +\end{center} + +This recursive program, once it is annotated, is given to the +tactic \texttt{Correctness}: +\begin{coq_eval} +Reset n. +\end{coq_eval} +\begin{coq_example} +Correctness r_exp + let rec exp (x:Z) (n:Z) : Z { variant n } = + { `n >= 0` } + (if n = 0 then + 1 + else + let y = (exp x (Zdiv2 n)) in + (if (Zeven_odd_bool n) then + (Zmult y y) + else + (Zmult x (Zmult y y))) { result=(Zpower x n) } + ) + { result=(Zpower x n) } +. +\end{coq_example} + +You can notice that the specification is simpler in the recursive case: +we only have to give the pre- and post-conditions --- which are the same +as for the imperative version --- but there is no annotation +corresponding to the loop invariant. +The other two annotations in the recursive program are added for the +recursive call and for the test inside the \textsf{let in} construct +(it can not be done automatically in general, +so the user has to add it by himself). + + +\asubsection{Other examples} + +You will find some other examples with the distribution of the system +\Coq, in the sub-directory \verb!tactics/programs/EXAMPLES! of the +\Coq\ standard library. Those examples are mostly programs to compute +the factorial and the exponentiation in various ways (on types \texttt{nat} +or \texttt{Z}, in imperative way or recursively, with global +variables or as functions, \dots). + +There are also some bigger correctness developments in the +\Coq\ contributions, which are available on the web page +\verb!coq.inria.fr/contribs!. +for the moment, you can find: +\begin{itemize} + \item A proof of \emph{insertion sort} by Nicolas Magaud, ENS Lyon; + \item A proof of \emph{quicksort} and \emph{find} by the author. +\end{itemize} + + +%%%%%%%%%%%%%%% +% BUG REPORTS % +%%%%%%%%%%%%%%% + +\asection{Bugs} + +\begin{itemize} + \item There is no discharge mechanism for programs; so you + \emph{cannot} do a program's proof inside a section (actually, + you can do it, but your program will not exist anymore after having + closed the section). +\end{itemize} + +Surely there are still many bugs in this implementation. +Please send bug reports to \textsf{Jean-Christophe.Filliatre$@$lri.fr}. +Don't forget to send the version of \Coq\ used (given by +\texttt{coqtop -v}) and a script producing the bug. + + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "Reference-Manual" +%%% End: -- cgit v1.2.3