Merge PR #7041: Sphinx doc chapter 23

author: Théo Zimmermann 2018-04-04 09:46:30 +0200
committer: Théo Zimmermann 2018-04-04 09:46:30 +0200
commit: 7e51ffdaf4340a67c254be7800eb8c68c5d78f2c (patch)
tree: ef53343b5a9494c42ccb5d956716f50619852c6a /doc/sphinx
parent: bea7f8f6a8f4cf902c6f296352d422bc81b7b8a9 (diff)
parent: 1fa0a403cd167796fd082be87828cec610a209ca (diff)
2 files changed, 587 insertions, 0 deletions
diff --git a/doc/sphinx/addendum/extraction.rst b/doc/sphinx/addendum/extraction.rst
new file mode 100644
index 0000000000..d7f97edab1
--- /dev/null
+++ b/doc/sphinx/addendum/extraction.rst
@@ -0,0 +1,586 @@
+.. _extraction:
+
+.. include:: ../replaces.rst
+
+Extraction of programs in OCaml and Haskell
+============================================
+
+:Authors: Jean-Christophe Filliâtre and Pierre Letouzey
+
+We present here the |Coq| extraction commands, used to build certified
+and relatively efficient functional programs, extracting them from
+either |Coq| functions or |Coq| proofs of specifications. The
+functional languages available as output are currently OCaml, Haskell
+and Scheme. In the following, "ML" will be used (abusively) to refer
+to any of the three.
+
+Before using any of the commands or options described in this chapter,
+the extraction framework should first be loaded explicitly
+via ``Require Extraction``, or via the more robust
+``From Coq Require Extraction``.
+Note that in earlier versions of Coq, these commands and options were
+directly available without any preliminary ``Require``.
+
+.. coqtop:: in
+
+   Require Extraction.
+
+Generating ML Code
+-------------------
+
+.. note::
+
+  In the following, a qualified identifier `qualid`
+  can be used to refer to any kind of |Coq| global "object" : constant,
+  inductive type, inductive constructor or module name.
+
+The next two commands are meant to be used for rapid preview of
+extraction. They both display extracted term(s) inside |Coq|.
+
+.. cmd:: Extraction @qualid.
+
+   Extraction of the mentioned object in the |Coq| toplevel.
+
+.. cmd:: Recursive Extraction @qualid ... @qualid.
+
+   Recursive extraction of all the mentioned objects and
+   all their dependencies in the |Coq| toplevel.
+
+All the following commands produce real ML files. The user can choose to
+produce one monolithic file or one file per |Coq| library.
+
+.. cmd:: Extraction "@file" @qualid ... @qualid.
+
+   Recursive extraction of all the mentioned objects and all
+   their dependencies in one monolithic `file`.
+   Global and local identifiers are renamed according to the chosen ML
+   language to fulfill its syntactic conventions, keeping original
+   names as much as possible.
+  
+.. cmd:: Extraction Library @ident.
+
+   Extraction of the whole |Coq| library ``ident.v`` to an ML module
+   ``ident.ml``. In case of name clash, identifiers are here renamed
+   using prefixes ``coq_``  or ``Coq_`` to ensure a session-independent
+   renaming.
+
+.. cmd:: Recursive Extraction Library @ident.
+
+   Extraction of the |Coq| library ``ident.v`` and all other modules 
+   ``ident.v`` depends on.
+
+.. cmd:: Separate Extraction @qualid ... @qualid.
+
+   Recursive extraction of all the mentioned objects and all
+   their dependencies, just as ``Extraction "file"``,
+   but instead of producing one monolithic file, this command splits
+   the produced code in separate ML files, one per corresponding Coq
+   ``.v`` file. This command is hence quite similar to
+   ``Recursive Extraction Library``, except that only the needed
+   parts of Coq libraries are extracted instead of the whole.
+   The naming convention in case of name clash is the same one as
+   ``Extraction Library``: identifiers are here renamed using prefixes
+   ``coq_``  or ``Coq_``.
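+
+As a rough sketch of how these commands fit together, assuming the standard
+library functions ``Nat.add`` and ``Nat.mul`` as targets and a hypothetical
+output file name ``arith``, a session could look like this::
+
+   Require Extraction.
+   (* Preview in the toplevel: one object, then objects plus dependencies. *)
+   Extraction Nat.add.
+   Recursive Extraction Nat.add Nat.mul.
+   (* Write one monolithic file (file name chosen for illustration). *)
+   Extraction "arith" Nat.add Nat.mul.
+   (* Write one ML file per Coq library involved instead. *)
+   Separate Extraction Nat.add Nat.mul.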
+
+The following command is meant to help automatic testing of
+the extraction, see for instance the ``test-suite`` directory
+in the |Coq| sources.
+
+.. cmd:: Extraction TestCompile @qualid ... @qualid.
+
+   All the mentioned objects and all their dependencies are extracted
+   to a temporary OCaml file, just as in ``Extraction "file"``. Then
+   this temporary file and its signature are compiled with the same
+   OCaml compiler used to build |Coq|. This command succeeds only
+   if the extraction and the OCaml compilation succeed. It fails
+   if the current target language of the extraction is not OCaml.
+
+Extraction Options
+-------------------
+
+Setting the target language
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ability to set the target language is the first and most important
+of the extraction options. Default is ``Ocaml``.
+
+.. cmd:: Extraction Language Ocaml.
+.. cmd:: Extraction Language Haskell.
+.. cmd:: Extraction Language Scheme.
+
+Inlining and optimizations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Since OCaml is a strict language, the extracted code has to
+be optimized in order to be efficient (for instance, when using
+induction principles we do not want to compute all the recursive calls
+but only the needed ones). So the extraction mechanism provides an
+automatic optimization routine that will be called each time the user
+wants to generate OCaml programs. The optimizations can be split into two
+groups: the type-preserving ones (essentially constant inlining and
+reductions) and the non type-preserving ones (some function
+abstractions of dummy types are removed when it is deemed safe in order
+to have more elegant types). Therefore some constants may not appear in the
+resulting monolithic OCaml program. In the case of modular extraction,
+even if some inlining is done, the inlined constants are nevertheless
+printed, to ensure session-independent programs.
+
+Concerning Haskell, type-preserving optimizations are less useful
+because of laziness. We still make some optimizations, for example in
+order to produce more readable code.
+
+The type-preserving optimizations are controlled by the following |Coq| options:
+
+.. opt:: Extraction Optimize.
+
+   Default is on. This controls all type-preserving optimizations made on
+   the ML terms (mostly reduction of dummy beta/iota redexes, but also
+   simplifications on Cases, etc). Turn this option off if you want an
+   ML term as close as possible to the Coq term.
+
+.. opt:: Extraction Conservative Types.
+
+   Default is off. This controls the non type-preserving optimizations
+   made on ML terms (which try to avoid function abstraction of dummy
+   types). Turn this option on to make sure that ``e:t``
+   implies that ``e':t'`` where ``e'`` and ``t'`` are the extracted
+   code of ``e`` and ``t`` respectively.
+
+.. opt:: Extraction KeepSingleton.
+
+   Default is off. Normally, when the extraction of an inductive type
+   produces a singleton type (i.e. a type with only one constructor, and
+   only one argument to this constructor), the inductive structure is
+   removed and this type is seen as an alias to the inner type.
+   The typical example is ``sig``. This option allows disabling this
+   optimization when one wishes to preserve the inductive structure of types.
+
+.. opt:: Extraction AutoInline.
+
+   Default is on. The extraction mechanism inlines the bodies of
+   some defined constants, according to some heuristics
+   like size of bodies, uselessness of some arguments, etc.
+   Those heuristics are not always perfect; if you want to disable
+   this feature, turn this option off.
+
+.. cmd:: Extraction Inline @qualid ... @qualid.
+
+   In addition to the automatic inline feature, the constants
+   mentioned by this command will always be inlined during extraction.
+
+.. cmd:: Extraction NoInline @qualid ... @qualid.
+
+   Conversely, the constants mentioned by this command will
+   never be inlined during extraction.
+
+.. cmd:: Print Extraction Inline. 
+
+   Prints the current state of the table recording the custom inlinings 
+   declared by the two previous commands. 
+
+.. cmd:: Reset Extraction Inline.
+
+   Empties the table recording the custom inlinings (see the
+   previous commands).
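+
+The options above are toggled with the usual ``Set`` and ``Unset`` commands.
+A minimal sketch, in which the choice of ``Nat.pred`` and ``Nat.add`` is
+purely illustrative::
+
+   Require Extraction.
+   (* Rely only on explicit inlining requests. *)
+   Unset Extraction AutoInline.
+   Extraction Inline Nat.pred.
+   Extraction NoInline Nat.add.
+   (* Inspect, then clear, the table of custom inlinings. *)
+   Print Extraction Inline.
+   Reset Extraction Inline.
+   (* Restore the default behavior. *)
+   Set Extraction AutoInline.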
+
+**Inlining and printing of a constant declaration:**
+
+A user can explicitly ask for a constant to be extracted by two means:
+
+  * by mentioning it on the extraction command line
+
+  * by extracting the whole |Coq| module of this constant.
+
+In both cases, the declaration of this constant will be present in the
+produced file. But this same constant may or may not be inlined in
+the following terms, depending on the automatic/custom inlining mechanism.  
+
+For the constants non-explicitly required but needed for dependency
+reasons, there are two cases: 
+
+  * If an inlining decision is taken, whether automatically or not,
+    all occurrences of this constant are replaced by its extracted body,
+    and this constant is not declared in the generated file.
+
+  * If no inlining decision is taken, the constant is normally
+    declared in the produced file. 
+
+Extra elimination of useless arguments
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following command provides some extra manual control on the
+code elimination performed during extraction, in a way which
+is independent but complementary to the main elimination
+principles of extraction (logical parts and types).
+
+.. cmd:: Extraction Implicit @qualid [ @ident ... @ident ].
+
+   This experimental command allows declaring some arguments of
+   `qualid` as implicit, i.e. useless in extracted code and hence to
+   be removed by extraction. Here `qualid` can be any function or
+   inductive constructor, and the given `ident` are the names of
+   the concerned arguments. An argument can also be referred to
+   by a number indicating its position, starting from 1.
+
+When an actual extraction takes place, an error is normally raised if the
+``Extraction Implicit`` declarations cannot be honored, that is
+if any of the implicited variables still occurs in the final code.
+This behavior can be relaxed via the following option:
+
+.. opt:: Extraction SafeImplicits.
+
+   Default is on. When this option is off, a warning is emitted
+   instead of an error if some implicited variables still occur in the
+   final code of an extraction. This way, the extracted code may be
+   obtained nonetheless and reviewed manually to locate the source of the issue
+   (in the code, some comments mark the location of these remaining
+   implicited variables).
+   Note that this extracted code might not compile or run properly,
+   depending on the use of these remaining implicited variables.
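+
+A minimal sketch of this command, using a hypothetical function ``second``
+whose first argument is never used::
+
+   Require Extraction.
+   Definition second (n : nat) (b : bool) : bool := b.
+   (* Declare the argument named n as useless in extracted code. *)
+   Extraction Implicit second [ n ].
+   Extraction second.
+
+Since ``n`` does not occur in the body, the declaration can presumably be
+honored, so no error about remaining implicited variables is expected here.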
+
+Realizing axioms
+~~~~~~~~~~~~~~~~
+
+Extraction will fail if it encounters an informative axiom not realized. 
+A warning will be issued if it encounters a logical axiom, to remind the
+user that inconsistent logical axioms may lead to incorrect or
+non-terminating extracted terms. 
+
+It is possible to assume some axioms while developing a proof. Since
+these axioms can be any kind of proposition or object or type, they may
+perfectly well have some computational content. But a program must be
+a closed term, and of course the system cannot guess the program which
+realizes an axiom.  Therefore, it is possible to tell the system
+what ML term corresponds to a given axiom. 
+
+.. cmd:: Extract Constant @qualid => @string.
+
+   Give an ML extraction for the given constant.
+   The `string` may be an identifier or a quoted string.
+
+.. cmd:: Extract Inlined Constant @qualid => @string.
+
+   Same as the previous one, except that the given ML terms will
+   be inlined everywhere instead of being declared via a ``let``.
+
+   .. note::
+
+      This command is sugar for an ``Extract Constant`` followed
+      by an ``Extraction Inline``. Hence a ``Reset Extraction Inline``
+      will have an effect on the realized and inlined axiom.
+
+.. caution:: It is the responsibility of the user to ensure that the ML
+   terms given to realize the axioms do have the expected types. In
+   fact, the strings containing realizing code are just copied to the
+   extracted files. The extraction recognizes whether the realized axiom
+   should become an ML type constant or an ML object declaration. For example:
+
+.. coqtop:: in
+
+   Axiom X:Set.
+   Axiom x:X.
+   Extract Constant X => "int".
+   Extract Constant x => "0".
+
+Notice that in the case of a type scheme axiom (i.e. one whose type is an
+arity, that is, a sequence of products ending in a sort), some type
+variables have to be given (as quoted strings). The syntax is then:
+
+.. cmdv:: Extract Constant @qualid @string ... @string => @string.
+
+The number of type variables is checked by the system. For example:
+
+.. coqtop:: in
+
+   Axiom Y : Set -> Set -> Set.
+   Extract Constant Y "'a" "'b" => " 'a * 'b ".
+
+Realizing an axiom via ``Extract Constant`` is only useful in the
+case of an informative axiom (of sort ``Type`` or ``Set``). A logical axiom
+has no computational content and hence will not appear in extracted
+terms. But a warning is nonetheless issued if extraction encounters a
+logical axiom. This warning reminds the user that inconsistent logical
+axioms may lead to incorrect or non-terminating extracted terms.
+
+If an informative axiom has not been realized before an extraction, a
+warning is also issued and the definition of the axiom is filled with
+an exception labeled ``AXIOM TO BE REALIZED``. The user must then
+search for these exceptions inside the extracted file and replace them with
+real code.
+
+Realizing inductive types
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The system also provides a mechanism to specify ML terms for inductive
+types and constructors. For instance, the user may want to use the ML
+native boolean type instead of the |Coq| one. The syntax is the following:
+
+.. cmd:: Extract Inductive @qualid => @string [ @string ... @string ].
+
+   Give an ML extraction for the given inductive type. You must specify
+   extractions for the type itself (first `string`) and all its
+   constructors (all the `string` between square brackets). In this form,
+   the ML extraction must be an ML inductive datatype, and the native
+   pattern-matching of the language will be used.
+
+.. cmdv:: Extract Inductive @qualid => @string [ @string ... @string ] @string.
+
+   Same as before, with a final extra `string` that indicates how to
+   perform pattern-matching over this inductive type. In this form,
+   the ML extraction could be an arbitrary type.
+   For an inductive type with `k` constructors, the function used to
+   emulate the pattern-matching should expect `(k+1)` arguments, first the `k`
+   branches in functional form, and then the inductive element to
+   destruct. For instance, the match branch ``| S n => foo`` gives the
+   functional form ``(fun n -> foo)``. Note that a constructor with no
+   argument is considered to have one unit argument, in order to block
+   early evaluation of the branch: ``| O => bar`` leads to the functional
+   form ``(fun () -> bar)``. For instance, when extracting ``nat``
+   into OCaml ``int``, the code to provide has type:
+   ``(unit->'a)->(int->'a)->int->'a``.
+
+.. caution:: As for ``Extract Constant``, this command should be used with care:
+
+  * The ML code provided by the user is currently **not** checked at all by
+    extraction, even for syntax errors.
+
+  * Extracting an inductive type to a pre-existing ML inductive type
+    is quite sound. But extracting to a general type (by providing an
+    ad-hoc pattern-matching) will often **not** be fully rigorously
+    correct. For instance, when extracting ``nat`` to OCaml ``int``,
+    it is theoretically possible to build ``nat`` values that are
+    larger than OCaml ``max_int``. It is the user's responsibility to
+    be sure that no overflow or other bad events occur in practice.
+
+  * Translating an inductive type to an arbitrary ML type does **not**
+    magically improve the asymptotic complexity of functions, even if the
+    ML type is an efficient representation. For instance, when extracting
+    ``nat`` to OCaml ``int``, the function ``Nat.mul`` stays quadratic.
+    It might be interesting to associate this translation with
+    some specific ``Extract Constant`` when primitive counterparts exist.
+
+Typical examples are the following:
+
+.. coqtop:: in
+    
+   Extract Inductive unit => "unit" [ "()" ].
+   Extract Inductive bool => "bool" [ "true" "false" ].
+   Extract Inductive sumbool => "bool" [ "true" "false" ].
+
+.. note::
+
+   When extracting to OCaml, if an inductive constructor or type has arity 2 and
+   the corresponding string is enclosed by parentheses, and the string meets
+   OCaml's lexical criteria for an infix symbol, then the rest of the string is
+   used as infix constructor or type.
+
+.. coqtop:: in
+   
+   Extract Inductive list => "list" [ "[]" "(::)" ].
+   Extract Inductive prod => "(*)"  [ "(,)" ].
+
+As an example of translation to a non-inductive datatype, let's turn
+``nat`` into OCaml ``int`` (see caveat above):
+
+.. coqtop:: in
+
+   Extract Inductive nat => int [ "0" "succ" ] "(fun fO fS n -> if n=0 then fO () else fS (n-1))".
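+
+As suggested in the caveats above, such a translation can be combined with
+``Extract Constant`` for operations that have primitive counterparts. The
+following sketch is an assumption about a reasonable setup, not a checked
+one; in particular the realization of ``Nat.sub`` truncates at zero by hand,
+and overflow remains the user's responsibility::
+
+   (* Map arithmetic on nat to OCaml int primitives. *)
+   Extract Constant Nat.add => "(+)".
+   (* The spaces keep OCaml from reading the operator as a comment opener. *)
+   Extract Constant Nat.mul => "( * )".
+   Extract Constant Nat.sub => "fun n m -> max 0 (n-m)".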
+
+Avoiding conflicts with existing filenames
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When using ``Extraction Library``, the names of the extracted files
+depend directly on the names of the |Coq| files. It may happen that
+these filenames are in conflict with already existing files, 
+either in the standard library of the target language or in other
+code that is meant to be linked with the extracted code. 
+For instance the module ``List`` exists both in |Coq| and in OCaml.
+It is possible to instruct the extraction not to use particular filenames.
+
+.. cmd:: Extraction Blacklist @ident ... @ident.
+
+   Instruct the extraction to avoid using these names as filenames
+   for extracted code.
+
+.. cmd:: Print Extraction Blacklist.
+
+   Show the current list of filenames the extraction should avoid.
+
+.. cmd:: Reset Extraction Blacklist.
+
+   Allow the extraction to use any filename.
+
+For OCaml, a typical use of these commands is
+``Extraction Blacklist String List``.
+
+Differences between |Coq| and ML type systems
+----------------------------------------------
+
+Due to differences between |Coq| and ML type systems, 
+some extracted programs are not directly typable in ML. 
+We now solve this problem (at least in OCaml) by adding,
+when needed, some unsafe casting ``Obj.magic``, which gives
+a generic type ``'a`` to any term.
+
+First, if some part of the program is *very* polymorphic, there
+may be no ML type for it. In that case the extraction to ML works
+alright but the generated code may be refused by the ML
+type-checker. A very well known example is the ``distr-pair``
+function:
+
+.. coqtop:: in
+
+   Definition dp {A B:Type}(x:A)(y:B)(f:forall C:Type, C->C) := (f A x, f B y).
+
+In OCaml, for instance, the directly extracted term would be::
+
+   let dp x y f = Pair((f () x),(f () y))
+
+and would have type::
+
+   dp : 'a -> 'a -> (unit -> 'a -> 'b) -> ('b,'b) prod
+
+which is not its original type, but a restriction.
+
+We now produce the following correct version::
+
+   let dp x y f = Pair ((Obj.magic f () x), (Obj.magic f () y))
+
+Secondly, some |Coq| definitions may have no counterpart in ML. This
+happens when there is a quantification over types inside the type
+of a constructor; for example:
+
+.. coqtop:: in
+
+   Inductive anything : Type := dummy : forall A:Set, A -> anything.
+
+which corresponds to the definition of an ML dynamic type.
+In OCaml, we must cast any argument of the constructor ``dummy``
+(no GADTs are produced yet by the extraction).
+
+Even with those unsafe castings, you should never get an error like
+``segmentation fault``. In fact, even if your program may seem
+ill-typed to the OCaml type-checker, it can't go wrong: it comes
+from a well-typed Coq term, so for example inductive types will always
+have the correct number of arguments, etc. Of course, when manually
+launching some extracted function, you should apply it to arguments
+of the right shape (from the |Coq| point of view).
+
+More details about the correctness of the extracted programs can be 
+found in :cite:`Let02`.
+
+We have to say, though, that in most "realistic" programs, these problems do not
+occur. For example, all the programs of the Coq library are accepted by the OCaml
+type-checker without any ``Obj.magic`` (see examples below).
+
+Some examples
+-------------
+
+We present here two examples of extractions, taken from the 
+|Coq| Standard Library. We choose OCaml as target language, 
+but all can be done in the other dialects with slight modifications.
+We then indicate where to find other examples and tests of extraction.
+
+A detailed example: Euclidean division
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The file ``Euclid`` contains the proof of Euclidean division.
+The natural numbers used there are unary integers of type ``nat``,
+defined by two constructors ``O`` and ``S``.
+This module contains a theorem ``eucl_dev``, whose type is::
+
+   forall b:nat, b > 0 -> forall a:nat, diveucl a b
+
+where ``diveucl`` is a type for the pair of the quotient and the
+modulo, plus some logical assertions that disappear during extraction.
+We can now extract this program to OCaml:
+
+.. coqtop:: none
+
+   Reset Initial.
+
+.. coqtop:: all
+
+   Require Extraction.
+   Require Import Euclid Wf_nat.
+   Extraction Inline gt_wf_rec lt_wf_rec induction_ltof2.
+   Recursive Extraction eucl_dev.
+
+The inlining of ``gt_wf_rec`` and others is not
+mandatory. It only enhances readability of extracted code.
+You can then copy-paste the output to a file ``euclid.ml`` or let 
+|Coq| do it for you with the following command::
+
+   Extraction "euclid" eucl_dev.
+
+Let us play with the resulting program (in an OCaml toplevel)::
+
+   #use "euclid.ml";;
+   type nat = O | S of nat
+   type sumbool = Left | Right
+   val sub : nat -> nat -> nat = <fun>
+   val le_lt_dec : nat -> nat -> sumbool = <fun>
+   val le_gt_dec : nat -> nat -> sumbool = <fun>
+   type diveucl = Divex of nat * nat
+   val eucl_dev : nat -> nat -> diveucl = <fun>
+
+   # eucl_dev (S (S O)) (S (S (S (S (S O)))));;
+   - : diveucl = Divex (S (S O), S O)
+
+It is easier to test on OCaml integers::
+
+   # let rec nat_of_int = function 0 -> O | n -> S (nat_of_int (n-1));;
+   val nat_of_int : int -> nat = <fun>
+
+   # let rec int_of_nat = function O -> 0 | S p -> 1+(int_of_nat p);;
+   val int_of_nat : nat -> int = <fun>
+
+   # let div a b = 
+     let Divex (q,r) = eucl_dev (nat_of_int b) (nat_of_int a)
+     in (int_of_nat q, int_of_nat r);;
+   val div : int -> int -> int * int = <fun>
+
+   # div 173 15;;
+   - : int * int = (11, 8)
+
+Note that these ``nat_of_int`` and ``int_of_nat`` are now
+available via a mere ``Require Import ExtrOcamlIntConv`` and then
+adding these functions to the list of functions to extract. This file
+``ExtrOcamlIntConv.v`` and some others in ``plugins/extraction/``
+are meant to help build concrete programs via extraction.
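+
+A sketch of that approach, assuming ``ExtrOcamlIntConv`` provides
+``nat_of_int`` and ``int_of_nat`` under these names::
+
+   Require Extraction.
+   Require Import Euclid ExtrOcamlIntConv.
+   (* Extract the division together with the conversion helpers. *)
+   Extraction "euclid" eucl_dev nat_of_int int_of_nat.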
+
+Extraction's horror museum
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Some pathological examples of extraction are grouped in the file
+``test-suite/success/extraction.v`` of the sources of |Coq|.
+
+Users' Contributions
+~~~~~~~~~~~~~~~~~~~~
+
+Several of the |Coq| Users' Contributions use extraction to produce
+certified programs. In particular the following ones have an automatic
+extraction test:
+
+ * ``additions`` : https://github.com/coq-contribs/additions
+ * ``bdds`` : https://github.com/coq-contribs/bdds
+ * ``canon-bdds`` : https://github.com/coq-contribs/canon-bdds
+ * ``chinese`` : https://github.com/coq-contribs/chinese
+ * ``continuations`` : https://github.com/coq-contribs/continuations
+ * ``coq-in-coq`` : https://github.com/coq-contribs/coq-in-coq
+ * ``exceptions`` : https://github.com/coq-contribs/exceptions
+ * ``firing-squad`` : https://github.com/coq-contribs/firing-squad
+ * ``founify`` : https://github.com/coq-contribs/founify
+ * ``graphs`` : https://github.com/coq-contribs/graphs
+ * ``higman-cf`` : https://github.com/coq-contribs/higman-cf
+ * ``higman-nw`` : https://github.com/coq-contribs/higman-nw
+ * ``hardware`` : https://github.com/coq-contribs/hardware
+ * ``multiplier`` : https://github.com/coq-contribs/multiplier
+ * ``search-trees`` : https://github.com/coq-contribs/search-trees
+ * ``stalmarck`` : https://github.com/coq-contribs/stalmarck
+
+Note that ``continuations`` and ``multiplier`` are a bit particular. They are
+examples of developments where ``Obj.magic`` is needed. This is
+probably due to a heavy use of impredicativity. After compilation, those
+two examples run nonetheless, thanks to the correctness of the
+extraction :cite:`Let02`.
diff --git a/doc/sphinx/index.rst b/doc/sphinx/index.rst
index c875cb7d5c..6f769a4d33 100644
--- a/doc/sphinx/index.rst
+++ b/doc/sphinx/index.rst
@@ -49,6 +49,7 @@ Table of contents
    addendum/type-classes
    addendum/omega
    addendum/micromega
+   addendum/extraction
 
 .. toctree::
    :caption: Reference