Update + more pedagogical examples

git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@8274 85f007b7-540e-0410-9357-904b9bb8a0f7
author: letouzey 2002-04-11 14:06:31 +0000
committer: letouzey 2002-04-11 14:06:31 +0000
commit: 490792d7f3545be84f99ded5df03ddbeb7106cc3 (patch)
tree: 14f3888e15a764cd46192ca76f6dbafbaa11662e
parent: 9e62dc0d0ce67af1375e1572b9d5f7bebfbce792 (diff)
1 files changed, 318 insertions, 119 deletions
diff --git a/doc/Extraction.tex b/doc/Extraction.tex
index e91dcdaaf1..a98a9c4b9b 100755
--- a/doc/Extraction.tex
+++ b/doc/Extraction.tex
@@ -4,20 +4,15 @@
 \index{Extraction}
 
 \begin{flushleft}
-  \em The status of extraction is experimental. \\
-  Haskell extraction is implemented, but not yet tested.
+  \em The status of extraction is experimental.
 \end{flushleft}
-
-It is possible to use \Coq\ to build certified and relatively
-efficient programs, extracting them from the proofs of their
-specifications. The extracted objects can be 
-obtained at the \Coq\ toplevel with the command {\tt Extraction}
-(see \ref{ExtractionTerm}).
-
-We present here a \Coq\ module, {\tt Extraction}, which translates the
-extracted terms to ML dialects, namely Objective Caml and Haskell. 
-In the following, ``ML'' will be used to refer to any of the target 
-dialects.
+We present here the \Coq\ extraction commands, used
+to build certified and relatively
+efficient ML programs, extracting them from the proofs of their
+specifications. 
+The ML dialects currently available as output languages are
+\ocaml\ and {\sf Haskell}. In the following, ``ML'' will be 
+used to refer to either of the two.
 
 \paragraph{Differences with old versions.}
 The current extraction mechanism is new for version 7.0 of {\Coq}.
@@ -28,28 +23,14 @@ The current mechanism also differs from
 the one in previous versions of \Coq: there is no longer
 an explicit toplevel for the language (formerly called {\sf Fml}). 
 
-\medskip
-In the first part of this document we describe the commands of the
-{\tt Extraction} module, and in the second part we give some examples.
-
 \asection{Generating ML code}
 \comindex{Extraction}
 \comindex{Recursive Extraction}
 \comindex{Extraction Module}
-
-There are many different extraction commands, that can be used for
-rapid preview (section \ref{extraction:com-top}), for generating 
-real Ocaml code (section \ref{extraction:com-ocaml}) or for generating
-real Haskell code (section \ref{extraction:com-haskell}).
-
-
-\asubsection{Preview within \Coq\ toplevel}\label{extraction:com-top}
+\comindex{Recursive Extraction Module}
 
 The next two commands are meant to be used for rapid preview of
-extraction. They both display extracted term(s) inside \Coq\, using 
-an Ocaml syntax. Globals are printed as in the \Coq\ toplevel 
-(thus without any renaming). As a consequence, note that the output 
-cannot be copy-pasted directly into an Ocaml toplevel. 
+extraction. They both display extracted term(s) inside \Coq.
 
 \begin{description}
 \item {\tt Extraction \term.} ~\par
@@ -62,9 +43,8 @@ cannot be copy-pasted directly into an Ocaml toplevel.
 
 %% TODO error messages
 
-\asubsection{Generating real Ocaml files}\label{extraction:com-ocaml}
 
-All the following commands produce real Ocaml files. User can choose to produce
+All the following commands produce real ML files. The user can choose to produce
 one monolithic file or one file per \Coq\ module. 
 
 \begin{description}
@@ -72,7 +52,7 @@ one monolithic file or one file per \Coq\ module.
       \qualid$_1$ \dots\ \qualid$_n$. ~\par
   Recursive extraction of all the globals \qualid$_1$ \dots\
   \qualid$_n$ and all their dependencies in one monolithic file {\em file}.
-  Global and local identifiers are renamed according to the Ocaml
+  Global and local identifiers are renamed according to the chosen ML
   language to fulfill its syntactic conventions, keeping original
   names as much as possible.
 
@@ -90,25 +70,29 @@ one monolithic file or one file per \Coq\ module.
 The list of globals \qualid$_i$ does not need to be
 exhaustive: it is automatically completed into a complete and minimal
 environment. Extraction will fail if it encounters an informative
-axiom not realized (see section \ref{extraction:axioms}).
-
+axiom not realized (see section \ref{extraction:axioms}). 
+A warning will be issued if it encounters a logical axiom, to remind 
+the user that inconsistent logical axioms may lead to incorrect or 
+non-terminating extracted terms. 
 
 
-\asubsection{Generating real Haskell files}\label{extraction:com-haskell}
+\asection{Extraction options}
 
-The commands generating Haskell code are similar to those generating
-Ocaml. A prefix ``Haskell'' is just added, and syntactic conventions
-are Haskell's ones. 
+\asubsection{Setting the ML target language}
+\comindex{Extraction Language}
 
+The ability to set the target ML dialect is the first and most important
+of the extraction options. The default is Ocaml. Besides Haskell, a third
+language called Toplevel is provided. It is a pseudo-Ocaml
+with no renaming of global names, so names are printed as in \Coq.
+This third language is available only at the \Coq\ toplevel.
 \begin{description}
-\item {\tt Haskell Extraction "{\em file}"}  
-      \qualid$_1$ \dots\ \qualid$_n$. ~\par
-\item  {\tt Haskell Extraction Module} \ident. ~\par
-\item {\tt Haskell Recursive Extraction Module} \ident. ~\par
+\item {\tt Extraction Language Ocaml}.
+\item {\tt Extraction Language Haskell}.
+\item {\tt Extraction Language Toplevel}.
 \end{description}
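+
+As a hedged illustration (the file name {\tt "euclid"} and the global
+{\tt eucl\_dev} are only borrowed from the examples section, and the
+exact name of the generated file is not specified here), switching the
+target language before a file extraction could look like this:
+\begin{verbatim}
+Require Euclid.
+Extraction Language Haskell.
+Extraction "euclid" eucl_dev.
+Extraction Language Ocaml.
+\end{verbatim}
+The last command restores the default target, assuming the chosen
+language stays in effect until it is changed again.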
 
-\asection{Extraction options and optimizations}\label{extraction:com-options}
-
+\asubsection{Inlining and optimizations}
 
 Since Objective Caml is a strict language, the extracted
 code has to be optimized in order to be efficient (for instance, when
@@ -129,7 +113,8 @@ produce more readable code.
 All these optimizations are controlled by the following \Coq\ options: 
 
 \begin{description}
-
+\comindex{Set Extraction Optimize}
+\comindex{Unset Extraction Optimize}
 \item {\tt Set Extraction Optimize.}
 \item {\tt Unset Extraction Optimize.} ~\par
 
@@ -138,14 +123,18 @@ Default is Set. This control all optimizations made on the ML terms
 Cases, etc.). Set this option to Unset if you want an ML term as close as 
 possible to the Coq term.
 
+\comindex{Set Extraction AutoInline}
+\comindex{Unset Extraction AutoInline}
 \item {\tt Set Extraction AutoInline.}
-\item {\tt Unset Extraction AutoInline}. ~\par
+\item {\tt Unset Extraction AutoInline.} ~\par
 
 Default is Set, so by default, the extraction mechanism feels free to 
 inline the bodies of some defined constants, according to some heuristics 
 like size of bodies, uselessness of some arguments, etc. Those heuristics are 
 not always perfect; if you want to disable this feature, use Unset. 
 
+\comindex{Extraction Inline}
+\comindex{Extraction NoInline}
 \item {\tt Extraction Inline} \qualid$_1$ \dots\ \qualid$_n$. ~\par 
 \item {\tt Extraction NoInline} \qualid$_1$ \dots\ \qualid$_n$. ~\par
 
@@ -155,11 +144,13 @@ you can forbid the automatic inlining of some specific constants by
 the {\tt Extraction NoInline} command.
 Those two commands enable a precise control of what is inlined and what is not. 
 
+\comindex{Print Extraction Inline}
 \item {\tt Print Extraction Inline}. ~\par
 
 Prints the current state of the table recording the custom inlinings 
 declared by the two previous commands. 
 
+\comindex{Reset Extraction Inline}
 \item {\tt Reset Extraction Inline}. ~\par
 
 Puts the table recording the custom inlinings back to empty. 
@@ -181,7 +172,6 @@ But this same constant may or may not be inlined in the following
 terms, depending on the automatic/custom inlining mechanism.  
 
 
-
 For the constants not explicitly required but needed for dependency
 reasons, there are two cases: 
 \begin{itemize}
@@ -193,9 +183,7 @@ this constant is not declared in the generated file.
 \end{itemize}
 
 
-
-\asection{Realizing axioms}\label{extraction:axioms}
-\comindex{Link}
+\asubsection{Realizing axioms}\label{extraction:axioms}
 
 It is possible to assume some axioms while developing a proof. Since
 these axioms can be any kind of proposition or object type, they may
@@ -251,78 +239,289 @@ Extract Inductive sumbool => bool [ true false ].
 \end{coq_example}
 
 
-% \asubsection{Differences between \Coq\ and ML type systems}
-
-% \subsubsection{ML types that are not \FW\ types}
-
-% Some ML recursive types have no counterpart in the type system of
-% \Coq, like types using the record construction, or non positive types
-% like
-% \begin{verbatim}
-% # type T = C of T->T;;
-% \end{verbatim}
-% In that case, you cannot import those types as inductive types, and
-% the only way to do is to import them as abstract types (with {\tt ML
-% Import}) together with the corresponding building and de-structuring
-% functions (still with {\tt ML Import Constant}).
-
-
-% \subsubsection{Programs that are not ML-typable}
-
-% On the contrary, some extracted programs in \FW\ are not typable in
-% ML. There are in fact two cases which can be problematic:
-% \begin{itemize}
-%   \item If some part of the program is {\em very} polymorphic, there
-%     may be no ML type for it. In that case the extraction to ML works
-%     all right but the generated code may be refused by the ML
-%     type-checker. A very well known example is the {\em distr-pair}
-%     function:
-% $$\mbox{\tt
-% Definition dp := [A,B:Set][x:A][y:B][f:(C:Set)C->C](f A x,f B y).
-% }$$
-% In Caml Light, for instance, the extracted term is 
-% \verb!let dp x y f = pair((f x),(f y))!  and has type
-% $$\mbox{\tt
-% dp : 'a -> 'a -> ('a -> 'b) -> ('b,'b) prod
-% }$$
-% which is not its original type, but a restriction.
-
-%   \item Some definitions of \FW\ may have no counterpart in ML. This
-%     happens when there is a quantification over types inside the type
-%     of a constructor; for example:
-% $$\mbox{\tt
-% Inductive anything : Set := dummy : (A:Set)A->anything.
-% }$$
-% which corresponds to the definition of ML dynamics.
-% \end{itemize}
-
-% The first case is not too problematic: it is still possible to run the
-% programs by switching off the type-checker during compilation. Unless
-% you misused the semantical attachment facilities you should never get
-% any message like ``segmentation fault'' for which the extracted code
-% would be to blame. To switch off the Caml type-checker, use the
-% function {\tt obj\_\_magic} which gives the type {\tt 'a} to any
-% object; but this implies changing a little the extracted code by hand.
-
-% The second case is fatal. If some inductive type cannot be translated
-% to ML, one has to change the proof (or possibly to ``cheat'' by
-% some low-level manipulations we would not describe here). 
-
-% We have to say, though, that in most ``realistic'' programs, these
-% problems do not occur. For example all the programs of the library are
-% accepted by Caml type-checker except {\tt Higman.v}\footnote{Should
-%   you obtain a not ML-typable program out of a self developed example,
-%   we would be interested in seeing it; so please mail us the example at
-%   {\em coq@pauillac.inria.fr}}.
+\asection{Differences between \Coq\ and ML type systems}
+
+
+Due to differences between \Coq\ and ML type systems, 
+some extracted programs are not typable in ML. 
+For example, here are two kinds of problems that can occur: 
+
+\begin{itemize}
+  \item If some part of the program is {\em very} polymorphic, there
+    may be no ML type for it. In that case the extraction to ML works
+    all right but the generated code may be refused by the ML
+    type-checker. A very well known example is the {\em distr-pair}
+    function:
+$$\mbox{\tt
+Definition dp := [A,B:Set][x:A][y:B][f:(C:Set)C->C](f A x,f B y).
+}$$
+In Ocaml, for instance, the extracted term is 
+\verb!let dp x y f = pair((f () x),(f () y))!  and has type
+$$\mbox{\tt
+dp : 'a -> 'a -> (unit -> 'a -> 'b) -> ('b,'b) prod
+}$$
+which is not its original type, but a restriction.
+
+  \item Some definitions of \Coq\ may have no counterpart in ML. This
+    happens when there is a quantification over types inside the type
+    of a constructor; for example:
+$$\mbox{\tt
+Inductive anything : Set := dummy : (A:Set)A->anything.
+}$$
+which corresponds to the definition of ML dynamics.
+\end{itemize}
+
+It is still possible to run the
+programs by switching off the type-checker during compilation. Unless
+you misused the semantical attachment facilities you should never get
+any message like ``segmentation fault'' for which the extracted code
+would be to blame. For example, to bypass the Ocaml type-checker, 
+we can use the
+function {\tt Obj.magic} which gives the type {\tt 'a} to any
+object. Work is underway to generate these {\tt Obj.magic} calls automatically.
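+
+As a minimal hand-written sketch (not actual extraction output; the
+capitalized constructor {\tt Pair} and the type annotation on {\tt p}
+are assumptions made for this illustration), the {\em distr-pair}
+function above can be made usable at two different types again with
+{\tt Obj.magic}:
+\begin{verbatim}
+type ('a,'b) prod = Pair of 'a * 'b
+
+(* Obj.magic hides the type of f, so both uses type-check. *)
+let dp x y f = Pair (Obj.magic f () x, Obj.magic f () y)
+
+(* The caller must state the expected result type. *)
+let p : (int, bool) prod = dp 1 true (fun () z -> z)
+\end{verbatim}
+Here {\tt p} evaluates to {\tt Pair (1, true)}, but nothing prevents
+a wrong annotation, so such code must be written with care.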
+
+We have to say, though, that in most ``realistic'' programs, these
+problems do not occur. For example all the programs of the library are
+accepted by the Caml type-checker (see examples below).
+
 
 
 \asection{Some examples}
 
- A more pedagogical introduction to extraction should appear here in
- the future. In the meanwhile you can have a look at the \Coq\
- contributions. Several of them use extraction to produce certified 
- programs. In particular the following ones have an automatic extraction
- test (just run \verb|make| in those directories): 
+We present here two examples of extractions, taken from the 
+\Coq\ Standard Library. We choose \ocaml\ as the target language, 
+but everything can be done in the other dialects with slight modifications.
+We then indicate where to find other examples and tests of Extraction.
+
+\asubsection{A detailed example: Euclidean division}
+
+The file {\tt Euclid} contains the proof of Euclidean division
+(theorem {\tt eucl\_dev}). The natural numbers defined in the example
+files are unary integers defined by two constructors $O$ and $S$:
+\begin{coq_example*}
+Inductive nat : Set := O : nat | S : nat -> nat.
+\end{coq_example*}
+
+This module contains a theorem {\tt eucl\_dev}, and its extracted term
+is of type 
+ $$\mbox{\tt (b:nat)(gt b O)->(a:nat)(diveucl a b)}$$ 
+where {\tt diveucl} is a type for the pair of the quotient and the remainder.
+We can now extract this program to \ocaml:
+
+\begin{coq_eval}
+Reset Initial.
+\end{coq_eval}
+\begin{coq_example}
+Require Euclid.
+Extraction Inline Wf_nat.gt_wf_rec Wf_nat.lt_wf_rec. 
+Recursive Extraction eucl_dev.
+\end{coq_example}
+
+The inlining of {\tt gt\_wf\_rec} and {\tt lt\_wf\_rec} is not
+mandatory. It only enhances readability of extracted code. 
+You can then copy-paste the output to a file {\tt euclid.ml} or let 
+\Coq\ do it for you with the following command: 
+
+\begin{coq_example}
+Extraction "euclid" eucl_dev.
+\end{coq_example}
+
+Let us run the resulting program:
+
+\begin{verbatim}
+# #use "euclid.ml";;
+type sumbool = Left | Right
+type nat = O | S of nat
+type diveucl = Divex of nat * nat
+val minus : nat -> nat -> nat = <fun>
+val le_lt_dec : nat -> nat -> sumbool = <fun>
+val le_gt_dec : nat -> nat -> sumbool = <fun>
+val eucl_dev : nat -> nat -> diveucl = <fun>
+# eucl_dev (S (S O)) (S (S (S (S (S O)))));;
+- : diveucl = Divex (S (S O), S O)
+\end{verbatim}
+It is easier to test on \ocaml\ integers:
+\begin{verbatim}
+# let rec i2n = function 0 -> O | n -> S (i2n (n-1));;
+val i2n : int -> nat = <fun>
+# let rec n2i = function O -> 0 | S p -> 1+(n2i p);;
+val n2i : nat -> int = <fun>
+# let div a b = 
+     let Divex (q,r) = eucl_dev (i2n b) (i2n a) in (n2i q, n2i r);;
+val div : int -> int -> int * int = <fun>
+# div 173 15;;
+- : int * int = 11, 8
+\end{verbatim}
+
+\asubsection{Another detailed example: Heapsort}
+
+The file {\tt Heap.v}
+contains the proof of an efficient list sorting algorithm described by
+Bjerner. It is an adaptation of the well-known {\em heapsort}
+algorithm to functional languages. The main function is {\tt
+treesort}, whose type is shown below: 
+
+
+\begin{coq_eval}
+Reset Initial.
+Require Relation_Definitions. 
+Require PolyList.
+Require Sorting. 
+Require Permutation.
+\end{coq_eval}
+\begin{coq_example}
+Require Heap.
+Check treesort.
+\end{coq_example}
+
+Let's now extract this function: 
+
+\begin{coq_example}
+Extraction NoInline list_to_heap.
+Extraction "heapsort" treesort.
+\end{coq_example}
+
+Once again, the {\tt Extraction NoInline} directive is cosmetic.
+Without it, everything works, but {\tt list\_to\_heap} is inlined
+inside {\tt treesort}, producing a less readable term.
+Here is the produced file {\tt heapsort.ml}: 
+
+\begin{verbatim}
+type sumbool =
+  | Left
+  | Right
+
+type nat =
+  | O
+  | S of nat
+
+type 'a tree =
+  | Tree_Leaf
+  | Tree_Node of 'a * 'a tree * 'a tree
+
+type 'a list =
+  | Nil
+  | Cons of 'a * 'a list
+
+let rec merge leA_dec eqA_dec x x0 =
+  match x with
+    | Nil -> x0
+    | Cons (a, l) ->
+        let rec f = function
+          | Nil -> Cons (a, l)
+          | Cons (a0, l1) ->
+              (match leA_dec a a0 with
+                 | Left -> Cons (a,
+                     (merge leA_dec eqA_dec l (Cons (a0, l1))))
+                 | Right -> Cons (a0, (f l1)))
+        in f x0
+
+let rec heap_to_list leA_dec eqA_dec = function
+  | Tree_Leaf -> Nil
+  | Tree_Node (a, t, t0) -> Cons (a,
+      (merge leA_dec eqA_dec (heap_to_list leA_dec eqA_dec t)
+        (heap_to_list leA_dec eqA_dec t0)))
+
+let rec insert leA_dec eqA_dec x x0 =
+  match x with
+    | Tree_Leaf -> Tree_Node (x0, Tree_Leaf, Tree_Leaf)
+    | Tree_Node (a, t, t0) ->
+        let h3 = fun x1 -> insert leA_dec eqA_dec t x1 in
+        (match leA_dec a x0 with
+           | Left -> Tree_Node (a, t0, (h3 x0))
+           | Right -> Tree_Node (x0, t0, (h3 a)))
+
+let rec list_to_heap leA_dec eqA_dec = function
+  | Nil -> Tree_Leaf
+  | Cons (a, l) ->
+      insert leA_dec eqA_dec (list_to_heap leA_dec eqA_dec l) a
+
+let treesort leA_dec eqA_dec l =
+  heap_to_list leA_dec eqA_dec (list_to_heap leA_dec eqA_dec l)
+\end{verbatim}
+
+Let's test it: 
+
+\begin{verbatim}
+# #use "heapsort.ml";;
+type sumbool = Left | Right
+type nat = O | S of nat
+type 'a tree = Tree_Leaf | Tree_Node of 'a * 'a tree * 'a tree
+type 'a list = Nil | Cons of 'a * 'a list
+val merge : ('a -> 'a -> sumbool) -> 'b -> 'a list -> 'a list -> 'a list =
+  <fun>
+val heap_to_list : ('a -> 'a -> sumbool) -> 'b -> 'a tree -> 'a list = <fun>
+val insert : ('a -> 'a -> sumbool) -> 'b -> 'a tree -> 'a -> 'a tree = <fun>
+val list_to_heap : ('a -> 'a -> sumbool) -> 'b -> 'a list -> 'a tree = <fun>
+val treesort : ('a -> 'a -> sumbool) -> 'b -> 'a list -> 'a list = <fun>
+\end{verbatim}
+
+Note that the argument of {\tt treesort} corresponding to 
+{\tt eqA\_dec} is never used in the informative part of the terms, 
+only in the logical parts. So the extracted {\tt treesort} never uses
+it, hence this {\tt 'b} argument. We will use {\tt ()} for this
+argument. Only the {\tt leA\_dec}
+argument (of type {\tt 'a -> 'a -> sumbool}) really has to be provided.
+
+\begin{verbatim}
+# let leAdec x y = if x <= y then Left else Right;;
+val leAdec : 'a -> 'a -> sumbool = <fun>
+# let rec listn = function 0 -> Nil
+                         | n -> Cons(Random.int 10000,listn (n-1));;
+val listn : int -> int list = <fun>
+# treesort leAdec () (listn 10);;
+- : int list = Cons (136, Cons (760, Cons (1512, Cons (2776, Cons (3064, 
+Cons (4536, Cons (5768, Cons (7560, Cons (8856, Cons (8952, Nil))))))))))
+\end{verbatim}
+
+Some tests on longer lists (10000 elements) show that the program is
+quite efficient for Caml code.
+
+
+\asubsection{The Standard Library} 
+
+As a test, we propose an automatic extraction of the 
+Standard Library of \Coq. In particular, it covers the
+two previous examples, {\tt Euclid} and {\tt Heapsort}. 
+Go to the directory\\
+{\tt contrib/extraction/test} of the sources of \Coq, and run the commands:\\
+
+\mbox{\tt make tree; make}\\
+
+This will extract all Standard Library files and compile them. 
+It is done via many {\tt Extraction Module} commands, with some customization
+(see subdirectory {\tt custom}).
+
+The result of this extraction of the Standard Library can be browsed
+at the address:\\
+
+\verb!http://www.lri.fr/~letouzey/extraction!\\
+
+The Reals theory is normally not extracted, since it is an axiomatic 
+development. We nonetheless propose a dummy realization of those
+axioms. To test it, run: \\
+
+\mbox{\tt make reals}\\
+
+This test also works with Haskell. In the same directory, run: \\
+
+\mbox{\tt make tree; make -f Makefile.haskell}\\
+
+The Haskell compiler currently used is {\tt hbc}. 
+Any other should also work; just
+adapt the {\tt Makefile.haskell}. In particular, {\tt ghc} is known
+to work.
+
+\asubsection{Extraction's horror museum}
+
+ Some pathological examples of extraction are grouped in the file\\
+ {\tt contrib/extraction/test\_extraction.v} of the sources of \Coq.
+
+\asubsection{Users' Contributions}
+
+ Several of the \Coq\ Users' Contributions use extraction to produce 
+ certified programs. In particular the following ones have an automatic 
+ extraction test (just run {\tt make} in those directories): 
 
  \begin{itemize}
  \item Bordeaux/Additions
@@ -345,6 +544,6 @@ Extract Inductive sumbool => bool [ true false ].
  in ML due to a heavy use of impredicativity. So we realize one
  inductive type using an \verb|Obj.magic| that artificially gives it
  the good type. After compilation this example runs nonetheless,
- thanks to the (desired but not proved) correction of the extraction. 
+ thanks to the correctness of the extraction (proof underway). 
 
 % $Id$