\achapter{Execution of extracted programs in Objective Caml and Haskell}
\label{Extraction}
\aauthor{Jean-Christophe Filliātre and Pierre Letouzey}
\index{Extraction}

\begin{flushleft}
  \em The status of extraction is experimental. \\
  Haskell extraction is implemented, but not yet tested.
\end{flushleft}

It is possible to use \Coq\ to build certified and relatively
efficient programs, extracting them from the proofs of their
specifications. The extracted objects can be 
obtained at the \Coq\ toplevel with the command {\tt Extraction}
(see \ref{ExtractionTerm}).

We present here a \Coq\ module, {\tt Extraction}, which translates the
extracted terms to ML dialects, namely Objective Caml and Haskell. 
In the following, we will not refer to a particular dialect
when possible and ``ML'' will be used to refer to any of the target
dialects.

\Rem
The current extraction mechanism is new from version 7.0 of {\Coq}.
In particular, the \FW\ toplevel used as intermediate step between 
\Coq\ and ML has been abandoned.  It is also not possible 
any more to import ML objects in this \FW\ toplevel.
The current mechanism also differs from
the one in previous versions of \Coq: there is no more
an explicit toplevel for the language (formerly called {\sf Fml}). 

\medskip
In the first part of this document we describe the commands of the
{\tt Extraction} module, and we give some examples in the second part.

\asection{Generating ML code}
\comindex{Extraction}
\comindex{Recursive Extraction}
\comindex{Extraction Module}

There are many different extraction commands, that can be used for
rapid preview (section \ref{extraction:com-top}), for generating 
real Ocaml code (section \ref{extraction:com-ocaml}) or for generating
real Haskell code (section \ref{extraction:com-haskell}).


\asubsection{Preview within \Coq\ toplevel}\label{extraction:com-top}

The next two commands are meant to be used for rapid preview of
extraction. They both display extracted term(s) inside \Coq\, using 
an Ocaml syntax. Globals are printed as in the \Coq\ toplevel 
(thus without any renaming). As a consequence, note that the output 
cannot be copy-pasted directly into an Ocaml toplevel. 

\begin{description}
\item {\tt Extraction \term.} ~\par
  Extracts one term in the \Coq\ toplevel.

\item {\tt Recursive Extraction  \qualid$_1$ \dots\ \qualid$_n$.} ~\par
  Recursive extraction of all the globals \qualid$_1$ \dots\
  \qualid$_n$ and all their dependencies in the \Coq\ toplevel.
\end{description}

\asubsection{Generating real Ocaml files}\label{extraction:com-ocaml}

All following commands produce real Ocaml files. User can choose to produce
one monolithic file or one file per \Coq\ module. 

\begin{description}
\item {\tt Extraction "{\em file}"}  
      \qualid$_1$ \dots\ \qualid$_n$. ~\par
  Recursive extraction of all the globals \qualid$_1$ \dots\
  \qualid$_n$ and all their dependencies in one monolithic file {\em file}.
  Global and local identifiers are renamed accordingly to the Ocaml
  language to fullfill its syntactic conventions, keeping original
  names as much as possible.

\item  {\tt Extraction Module} \ident. ~\par
  Extraction of the \Coq\ module \ident\ to an ML module {\tt\ident.ml}.
  In case of name clash, identifiers are here renamed using prefixes
  \verb!coq_! 
  or \verb!Coq_! to ensure a session-independent renaming.

\item {\tt Recursive Extraction Module} \ident. ~\par
  Extraction of the \Coq\ module \ident\ and all other modules \ident\
  depends on. 
\end{description}

The list of globals \qualid$_i$ does not need to be
exhaustive: it is automatically completed into a complete and minimal
environment. Extraction will fail if it encounters an informative
axiom not realized (see section \ref{extraction:axioms}).


\asubsection{Generating real Haskell files}\label{extraction:com-haskell}

The commands generating Haskell code are similar to those generating
Ocaml. A prefix ``Haskell'' is just added, and syntactic conventions
are Haskell's ones. 

\begin{description}
\item {\tt Haskell Extraction "{\em file}"}  
      \qualid$_1$ \dots\ \qualid$_n$. ~\par
\item  {\tt Haskell Extraction Module} \ident. ~\par
\item {\tt Haskell Recursive Extraction Module} \ident. ~\par
\end{description}

\asection{Extraction options and optimizations}\label{extraction:com-options}


Since Objective Caml is a strict language, the extracted
code has to be optimized in order to be efficient (for instance, when
using induction principles we do not want to compute all the recursive
calls but only the needed ones). So the extraction mechanism provides
an automatic optimization routine that will be
called each time the user want to generate Ocaml programs. Essentially,
it performs constants inlining and reductions.  Therefore some
constants may not appear in resulting monolithic Ocaml program (A warning is
printed for each such constant). In the case of modular extraction, 
even if some inling is done, the inlined constant are nevertheless
printed, to ensure session-independent programs. 

Concerning Haskell, such optimizations are less useful because of
lazyness. We still make some optimizations, for example in order to
produce more readable code. 

All these optimizations are controled by the following \Coq\ options: 

\begin{description}

\item {\tt Set Extraction Optimize.}
\item {\tt Unset Extraction Optimize.} ~\par

Default is Set. This control all optimizations made on the ML terms 
(mostly reduction of dummy beta/iota redexes, but also simplications on 
Cases, etc). Put this option to Unset if you what a ML term as close as 
possible to the Coq term.

\item {\tt Set Extraction AutoInline.}
\item {\tt Unset Extraction AutoInline}. ~\par

Default is Set, so by default, the extraction mechanism feels free to 
inline the bodies of some defined constants, according to some heuristics 
like size of bodies, useness of some arguments, etc. Those heuristics are 
not always perfect, you may want to disable this feature, do it by Unset. 

\item {\tt Extraction Inline} \qualid$_1$ \dots\ \qualid$_n$. ~\par 
\item {\tt Extraction NoInline} \qualid$_1$ \dots\ \qualid$_n$. ~\par

In addition to the automatic inline feature, you can now tell precisely to 
inline some more constants by the {\tt Extraction Inline} command. Conversely, 
you can forbid the automatic inlining of some specific constants by
the {\tt Extraction NoInline} command.
Those two commands enable a precise control of what is inlined and what is not. 

\item {\tt Print Extraction Inline}. ~\par

Prints the current state of the table recording the custom inlinings 
declared by the two previous commands. 

\item {\tt Reset Extraction Inline}. ~\par

Puts the table recording the custom inlinings back to empty. 

\end{description}


\paragraph{Inlining and printing of a constant declaration.}


A user can explicitely asks a constant to be extracted by two means: 
\begin{itemize}
\item by mentioning it on the extraction command line
\item by extracting the whole \Coq\ module of this constant.
\end{itemize}
In both cases, the declaration of this constant will be present in the
produced file. 
But this same constant may or may not be inlined in the following
terms, depending on the automatic/custom inling mechanism.  


For the constants non-explicitely required but needed for dependancy
reasons, there are two cases: 
\begin{itemize}
\item If an inlining decision is taken, wether automatically or not, 
all occurences of this constant are replaced by its extracted body, and
this constant is not declared in the generated file.
\item If no inlining decision is taken, the constant is normally
  declared in the produced file. 
\end{itemize}


\asection{Realizing axioms}\label{extraction:axioms}
\comindex{Link}

It is possible to assume some axioms while developing a proof. Since
these axioms can be any kind of proposition or object type, they may
perfectly well have some computational content. But a program must be
a closed term, and of course the system cannot guess the program which
realizes an axiom.  Therefore, it is possible to tell the system
what ML term corresponds to a given axiom. Of course, it is the
responsability of the user to ensure that the ML terms given to
realize the axioms do have the expected types.

\comindex{Extract Constant}
\comindex{Extract Inductive}
The system actually provides a more general mechanism to specify ML
terms even for defined constants, inductive types and constructors.
For instance, the user may want to use the ML native boolean type
instead of \Coq\ one.
The syntax is the following:
\begin{description}
\item{\tt Extract Constant \qualid\ => \str.} ~\par
  Give an ML extraction for the given constant.
  The \str\ may be an identifier or a quoted string.
\item{\tt Extract Inlined Constant \qualid\ => \str.} ~\par
  Same as the previous one, except that the given ML terms will
  be inlined everywhere instead of being declared via a let.
\item{\tt Extract Inductive \qualid\ => \str\ [ \str\ \dots \str\ ].} ~\par
  Give an ML extraction for the given inductive type. You must specify
  extractions for the type itself (first \str) and all its
  constructors (between square brackets). The ML extraction must be an
  ML recursive datatype.
\end{description}

\begin{Remarks}
%\item 
%  The given ML terms are always inlined; thus, it may be
%  useful to introduce definitions in a separate ML module and then to
%  use the corresponding qualified names.
\item
  The extraction of a module depending on axioms from another module
  will not fail. It is the responsability of the ``extractor'' of that
  other module to realize the given axioms.
\item 
  Note that now, the {\tt Extract Inlined Constant} command is sugar
  for an {\tt Extract Constant} followed by a {\tt Extraction Inline}. 
  So be careful with {\tt Reset Extraction Inline}. 
\end{Remarks}

\Example
Typical examples are the following:
\begin{coq_example}
Extract Inductive unit => unit [ "()" ].
Extract Inductive bool => bool [ true false ].
Extract Inductive sumbool => bool [ true false ].
\end{coq_example}


% \asubsection{Differences between \Coq\ and ML type systems}

% \subsubsection{ML types that are not \FW\ types}

% Some ML recursive types have no counterpart in the type system of
% \Coq, like types using the record construction, or non positive types
% like
% \begin{verbatim}
% # type T = C of T->T;;
% \end{verbatim}
% In that case, you cannot import those types as inductive types, and
% the only way to do is to import them as abstract types (with {\tt ML
% Import}) together with the corresponding building and de-structuring
% functions (still with {\tt ML Import Constant}).


% \subsubsection{Programs that are not ML-typable}

% On the contrary, some extracted programs in \FW\ are not typable in
% ML. There are in fact two cases which can be problematic:
% \begin{itemize}
%   \item If some part of the program is {\em very} polymorphic, there
%     may be no ML type for it. In that case the extraction to ML works
%     all right but the generated code may be refused by the ML
%     type-checker. A very well known example is the {\em distr-pair}
%     function:
% $$\mbox{\tt
% Definition dp := [A,B:Set][x:A][y:B][f:(C:Set)C->C](f A x,f B y).
% }$$
% In Caml Light, for instance, the extracted term is 
% \verb!let dp x y f = pair((f x),(f y))!  and has type
% $$\mbox{\tt
% dp : 'a -> 'a -> ('a -> 'b) -> ('b,'b) prod
% }$$
% which is not its original type, but a restriction.

%   \item Some definitions of \FW\ may have no counterpart in ML. This
%     happens when there is a quantification over types inside the type
%     of a constructor; for example:
% $$\mbox{\tt
% Inductive anything : Set := dummy : (A:Set)A->anything.
% }$$
% which corresponds to the definition of ML dynamics.
% \end{itemize}

% The first case is not too problematic: it is still possible to run the
% programs by switching off the type-checker during compilation. Unless
% you misused the semantical attachment facilities you should never get
% any message like ``segmentation fault'' for which the extracted code
% would be to blame. To switch off the Caml type-checker, use the
% function {\tt obj\_\_magic} which gives the type {\tt 'a} to any
% object; but this implies changing a little the extracted code by hand.

% The second case is fatal. If some inductive type cannot be translated
% to ML, one has to change the proof (or possibly to ``cheat'' by
% some low-level manipulations we would not describe here). 

% We have to say, though, that in most ``realistic'' programs, these
% problems do not occur. For example all the programs of the library are
% accepted by Caml type-checker except {\tt Higman.v}\footnote{Should
%   you obtain a not ML-typable program out of a self developed example,
%   we would be interested in seeing it; so please mail us the example at
%   {\em coq@pauillac.inria.fr}}.


\asection{Some examples}

 A more pedagogical introduction to extraction should appear here in
 the future. In the meanwhile you can have a look at the \Coq\
 contributions. Several of them use extraction to produce certified 
 programs. In particular the following ones have an automatic extraction
 test (just run \verb|make| in those directories): 

 \begin{itemize}
 \item Bordeaux/Additions
 \item Bordeaux/EXCEPTIONS
 \item Bordeaux/SearchTrees
 \item Dyade/BDDS
 \item Lyon/CIRCUITS
 \item Lyon/FIRING-SQUAD
 \item Marseille/CIRCUITS
 \item Nancy/FOUnify
 \item Rocq/ARITH/Chinese
 \item Rocq/COC
 \item Rocq/GRAPHS
 \item Rocq/HIGMAN
 \item Sophia-Antipolis/Stalmarck
 \item Suresnes/BDD
 \end{itemize}

 Rocq/HIGMAN is a bit particular. The extracted code is normally not typable
 in ML due to an heavy use of impredicativity. So we realize one
 inductive type using an \verb|Obj.magic| that artificially gives it
 the good type. After compilation this example runs nonetheless,
 thanks to the (desired but not proved) correction of the extraction. 

\section{Bugs}

Surely there are still bugs in the {\tt Extraction} module.  You can
send your bug reports directly to the authors
(\textsf{Pierre.Letouzey$@$lri.fr} and
\textsf{Jean-Christophe.Filliatre$@$lri.fr}) or to the \Coq\ mailing
list (at \textsf{coq-bugs$@$pauillac.inria.fr}).

% $Id$