\achapter{Execution of extracted programs in Objective Caml and Haskell} \label{Extraction} \aauthor{Jean-Christophe Filliātre and Pierre Letouzey} \index{Extraction} \begin{flushleft} \em The status of extraction is experimental. \\ Haskell extraction is implemented, but not yet tested. \end{flushleft} It is possible to use \Coq\ to build certified and relatively efficient programs, extracting them from the proofs of their specifications. The extracted objects can be obtained at the \Coq\ toplevel with the command {\tt Extraction} (see \ref{ExtractionTerm}). We present here a \Coq\ module, {\tt Extraction}, which translates the extracted terms to ML dialects, namely Objective Caml and Haskell. In the following, we will not refer to a particular dialect when possible and ``ML'' will be used to refer to any of the target dialects. \Rem The current extraction mechanism is new from version 7.0 of {\Coq}. In particular, the \FW\ toplevel used as intermediate step between \Coq\ and ML has been abandoned. It is also not possible any more to import ML objects in this \FW\ toplevel. The current mechanism also differs from the one in previous versions of \Coq: there is no more an explicit toplevel for the language (formerly called {\sf Fml}). \medskip In the first part of this document we describe the commands of the {\tt Extraction} module, and we give some examples in the second part. \asection{Generating ML code} \comindex{Extraction} \comindex{Recursive Extraction} \comindex{Extraction Module} There are many different extraction commands, that can be used for rapid preview (section \ref{extraction:com-top}), for generating real Ocaml code (section \ref{extraction:com-ocaml}) or for generating real Haskell code (section \ref{extraction:com-haskell}). \asubsection{Preview within \Coq\ toplevel}\label{extraction:com-top} The next two commands are meant to be used for rapid preview of extraction. They both display extracted term(s) inside \Coq\, using an Ocaml syntax. Globals are printed as in the \Coq\ toplevel (thus without any renaming). As a consequence, note that the output cannot be copy-pasted directly into an Ocaml toplevel. \begin{description} \item {\tt Extraction \term.} ~\par Extracts one term in the \Coq\ toplevel. \item {\tt Recursive Extraction \qualid$_1$ \dots\ \qualid$_n$.} ~\par Recursive extraction of all the globals \qualid$_1$ \dots\ \qualid$_n$ and all their dependencies in the \Coq\ toplevel. \end{description} \asubsection{Generating real Ocaml files}\label{extraction:com-ocaml} All following commands produce real Ocaml files. User can choose to produce one monolithic file or one file per \Coq\ module. \begin{description} \item {\tt Extraction "{\em file}"} \qualid$_1$ \dots\ \qualid$_n$. ~\par Recursive extraction of all the globals \qualid$_1$ \dots\ \qualid$_n$ and all their dependencies in one monolithic file {\em file}. Global and local identifiers are renamed accordingly to the Ocaml language to fullfill its syntactic conventions, keeping original names as much as possible. \item {\tt Extraction Module} \ident. ~\par Extraction of the \Coq\ module \ident\ to an ML module {\tt\ident.ml}. In case of name clash, identifiers are here renamed using prefixes \verb!coq_! or \verb!Coq_! to ensure a session-independent renaming. \item {\tt Recursive Extraction Module} \ident. ~\par Extraction of the \Coq\ module \ident\ and all other modules \ident\ depends on. \end{description} The list of globals \qualid$_i$ does not need to be exhaustive: it is automatically completed into a complete and minimal environment. Extraction will fail if it encounters an informative axiom not realized (see section \ref{extraction:axioms}). \asubsection{Generating real Haskell files}\label{extraction:com-haskell} The commands generating Haskell code are similar to those generating Ocaml. A prefix ``Haskell'' is just added, and syntactic conventions are Haskell's ones. \begin{description} \item {\tt Haskell Extraction "{\em file}"} \qualid$_1$ \dots\ \qualid$_n$. ~\par \item {\tt Haskell Extraction Module} \ident. ~\par \item {\tt Haskell Recursive Extraction Module} \ident. ~\par \end{description} \asection{Extraction options and optimizations}\label{extraction:com-options} Since Objective Caml is a strict language, the extracted code has to be optimized in order to be efficient (for instance, when using induction principles we do not want to compute all the recursive calls but only the needed ones). So the extraction mechanism provides an automatic optimization routine that will be called each time the user want to generate Ocaml programs. Essentially, it performs constants inlining and reductions. Therefore some constants may not appear in resulting monolithic Ocaml program (A warning is printed for each such constant). In the case of modular extraction, even if some inling is done, the inlined constant are nevertheless printed, to ensure session-independent programs. Concerning Haskell, such optimizations are less useful because of lazyness. We still make some optimizations, for example in order to produce more readable code. All these optimizations are controled by the following \Coq\ options: \begin{description} \item {\tt Set Extraction Optimize.} \item {\tt Unset Extraction Optimize.} ~\par Default is Set. This control all optimizations made on the ML terms (mostly reduction of dummy beta/iota redexes, but also simplications on Cases, etc). Put this option to Unset if you what a ML term as close as possible to the Coq term. \item {\tt Set Extraction AutoInline.} \item {\tt Unset Extraction AutoInline}. ~\par Default is Set, so by default, the extraction mechanism feels free to inline the bodies of some defined constants, according to some heuristics like size of bodies, useness of some arguments, etc. Those heuristics are not always perfect, you may want to disable this feature, do it by Unset. \item {\tt Extraction Inline} \qualid$_1$ \dots\ \qualid$_n$. ~\par \item {\tt Extraction NoInline} \qualid$_1$ \dots\ \qualid$_n$. ~\par In addition to the automatic inline feature, you can now tell precisely to inline some more constants by the {\tt Extraction Inline} command. Conversely, you can forbid the automatic inlining of some specific constants by the {\tt Extraction NoInline} command. Those two commands enable a precise control of what is inlined and what is not. \item {\tt Print Extraction Inline}. ~\par Prints the current state of the table recording the custom inlinings declared by the two previous commands. \item {\tt Reset Extraction Inline}. ~\par Puts the table recording the custom inlinings back to empty. \end{description} \paragraph{Inlining and printing of a constant declaration.} A user can explicitely asks a constant to be extracted by two means: \begin{itemize} \item by mentioning it on the extraction command line \item by extracting the whole \Coq\ module of this constant. \end{itemize} In both cases, the declaration of this constant will be present in the produced file. But this same constant may or may not be inlined in the following terms, depending on the automatic/custom inling mechanism. For the constants non-explicitely required but needed for dependancy reasons, there are two cases: \begin{itemize} \item If an inlining decision is taken, wether automatically or not, all occurences of this constant are replaced by its extracted body, and this constant is not declared in the generated file. \item If no inlining decision is taken, the constant is normally declared in the produced file. \end{itemize} \asection{Realizing axioms}\label{extraction:axioms} \comindex{Link} It is possible to assume some axioms while developing a proof. Since these axioms can be any kind of proposition or object type, they may perfectly well have some computational content. But a program must be a closed term, and of course the system cannot guess the program which realizes an axiom. Therefore, it is possible to tell the system what ML term corresponds to a given axiom. Of course, it is the responsability of the user to ensure that the ML terms given to realize the axioms do have the expected types. \comindex{Extract Constant} \comindex{Extract Inductive} The system actually provides a more general mechanism to specify ML terms even for defined constants, inductive types and constructors. For instance, the user may want to use the ML native boolean type instead of \Coq\ one. The syntax is the following: \begin{description} \item{\tt Extract Constant \qualid\ => \str.} ~\par Give an ML extraction for the given constant. The \str\ may be an identifier or a quoted string. \item{\tt Extract Inlined Constant \qualid\ => \str.} ~\par Same as the previous one, except that the given ML terms will be inlined everywhere instead of being declared via a let. \item{\tt Extract Inductive \qualid\ => \str\ [ \str\ \dots \str\ ].} ~\par Give an ML extraction for the given inductive type. You must specify extractions for the type itself (first \str) and all its constructors (between square brackets). The ML extraction must be an ML recursive datatype. \end{description} \begin{Remarks} %\item % The given ML terms are always inlined; thus, it may be % useful to introduce definitions in a separate ML module and then to % use the corresponding qualified names. \item The extraction of a module depending on axioms from another module will not fail. It is the responsability of the ``extractor'' of that other module to realize the given axioms. \item Note that now, the {\tt Extract Inlined Constant} command is sugar for an {\tt Extract Constant} followed by a {\tt Extraction Inline}. So be careful with {\tt Reset Extraction Inline}. \end{Remarks} \Example Typical examples are the following: \begin{coq_example} Extract Inductive unit => unit [ "()" ]. Extract Inductive bool => bool [ true false ]. Extract Inductive sumbool => bool [ true false ]. \end{coq_example} % \asubsection{Differences between \Coq\ and ML type systems} % \subsubsection{ML types that are not \FW\ types} % Some ML recursive types have no counterpart in the type system of % \Coq, like types using the record construction, or non positive types % like % \begin{verbatim} % # type T = C of T->T;; % \end{verbatim} % In that case, you cannot import those types as inductive types, and % the only way to do is to import them as abstract types (with {\tt ML % Import}) together with the corresponding building and de-structuring % functions (still with {\tt ML Import Constant}). % \subsubsection{Programs that are not ML-typable} % On the contrary, some extracted programs in \FW\ are not typable in % ML. There are in fact two cases which can be problematic: % \begin{itemize} % \item If some part of the program is {\em very} polymorphic, there % may be no ML type for it. In that case the extraction to ML works % all right but the generated code may be refused by the ML % type-checker. A very well known example is the {\em distr-pair} % function: % $$\mbox{\tt % Definition dp := [A,B:Set][x:A][y:B][f:(C:Set)C->C](f A x,f B y). % }$$ % In Caml Light, for instance, the extracted term is % \verb!let dp x y f = pair((f x),(f y))! and has type % $$\mbox{\tt % dp : 'a -> 'a -> ('a -> 'b) -> ('b,'b) prod % }$$ % which is not its original type, but a restriction. % \item Some definitions of \FW\ may have no counterpart in ML. This % happens when there is a quantification over types inside the type % of a constructor; for example: % $$\mbox{\tt % Inductive anything : Set := dummy : (A:Set)A->anything. % }$$ % which corresponds to the definition of ML dynamics. % \end{itemize} % The first case is not too problematic: it is still possible to run the % programs by switching off the type-checker during compilation. Unless % you misused the semantical attachment facilities you should never get % any message like ``segmentation fault'' for which the extracted code % would be to blame. To switch off the Caml type-checker, use the % function {\tt obj\_\_magic} which gives the type {\tt 'a} to any % object; but this implies changing a little the extracted code by hand. % The second case is fatal. If some inductive type cannot be translated % to ML, one has to change the proof (or possibly to ``cheat'' by % some low-level manipulations we would not describe here). % We have to say, though, that in most ``realistic'' programs, these % problems do not occur. For example all the programs of the library are % accepted by Caml type-checker except {\tt Higman.v}\footnote{Should % you obtain a not ML-typable program out of a self developed example, % we would be interested in seeing it; so please mail us the example at % {\em coq@pauillac.inria.fr}}. \asection{Some examples} A more pedagogical introduction to extraction should appear here in the future. In the meanwhile you can have a look at the \Coq\ contributions. Several of them use extraction to produce certified programs. In particular the following ones have an automatic extraction test (just run \verb|make| in those directories): \begin{itemize} \item Bordeaux/Additions \item Bordeaux/EXCEPTIONS \item Bordeaux/SearchTrees \item Dyade/BDDS \item Lyon/CIRCUITS \item Lyon/FIRING-SQUAD \item Marseille/CIRCUITS \item Nancy/FOUnify \item Rocq/ARITH/Chinese \item Rocq/COC \item Rocq/GRAPHS \item Rocq/HIGMAN \item Sophia-Antipolis/Stalmarck \item Suresnes/BDD \end{itemize} Rocq/HIGMAN is a bit particular. The extracted code is normally not typable in ML due to an heavy use of impredicativity. So we realize one inductive type using an \verb|Obj.magic| that artificially gives it the good type. After compilation this example runs nonetheless, thanks to the (desired but not proved) correction of the extraction. \section{Bugs} Surely there are still bugs in the {\tt Extraction} module. You can send your bug reports directly to the authors (\textsf{Pierre.Letouzey$@$lri.fr} and \textsf{Jean-Christophe.Filliatre$@$lri.fr}) or to the \Coq\ mailing list (at \textsf{coq-bugs$@$pauillac.inria.fr}). % $Id$