aboutsummaryrefslogtreecommitdiff
path: root/parsing/cLexer.mli
AgeCommit message (Collapse)Author
2020-10-30Renaming Numeral into NumberPierre Roux
2020-03-25[parsing] Move comments lexer extensions to base lexer interfaceEmilio Jesus Gallego Arias
This makes sense as a step towards a more functional handling of the state.
2020-03-25[parsing] Move `coq_parsable` custom logic to Grammar.Emilio Jesus Gallego Arias
Latest refactorings allow us to make the signature Coq parser a standard `Grammar.S` one; the only bit needed is to provide the extra capabilities to the `Lexer` signature w.r.t. to comments state. The handling of Lexer state is still a bit ad-hoc, in particular it is global whereas it should be fully attached to the parsable. This may work ok in batch mode but the current behavior may be buggy in the interactive context. This PR doesn't solve that but it is a step towards a more functional solution.
2020-03-22Centralizing all kinds of numeral string management in numTok.ml.Hugo Herbelin
Four types of numerals are introduced: - positive natural numbers (may include "_", e.g. to separate thousands, and leading 0) - integer numbers (may start with a minus sign) - positive numbers with mantisse and signed exponent - signed numbers with mantisse and signed exponent In passing, we clarify that the lexer parses only positive numerals, but the numeral interpreters may accept signed numerals. Several improvements and fixes come from Pierre Roux. See https://github.com/coq/coq/pull/11703 for details. Thanks to him.
2020-03-18Update headers in the whole code base.Théo Zimmermann
Add headers to a few files which were missing them.
2019-06-17Update ml-style headers to new year.Théo Zimmermann
2019-04-02Add parsing of decimal constants (e.g., 1.02e+01)Pierre Roux
Rather than integers '[0-9]+', numeral constant can now be parsed according to the regexp '[0-9]+ ([.][0-9]+)? ([eE][+-]?[0-9]+)?'. This can be used in one of the two following ways: - using the function `Notation.register_rawnumeral_interpreter` in an OCaml plugin - using `Numeral Notation` with the type `decimal` added to `Decimal.v` See examples of each use case in the next two commits.
2019-03-31Multiple payload types in tokensPierre Roux
Instead of just string (and empty strings for tokens without payload)
2019-03-31documentationEnrico Tassi
2019-03-31[parsing] add keyword-protected generic quotationEnrico Tassi
One can now register a quotation using a grammar rule with QUOTATION("name:"). "name:" becomes a keyword and the token is generated for name: followed by a an identifier or a parenthesized text. Eg constr:x string:[....] ltac:(....) ltac:{....} The delimiter is made of 1 or more occurrences of the same parenthesis, eg ((.....)) or [[[[....]]]]. The idea being that if the text happens to contain the closing delimiter, one can make the delimiter longer and avoid confusion (no escaping). Eg string:[[ .. ']' .. ]] Nesting the delimiter is allowed, eg ((..((...))..)) is OK. The text inside the quotation is returned as a string (including the parentheses), so that a third party parser can take care of it. Keywords don't need to end in ':'.
2019-03-31[parsing] Split Tok.t into Tok.t and Tok.patternEnrico Tassi
Tokens were having a double role: - the output of the lexer - the items of grammar entries, especially terminals Now tokens are the output of the lexer, and this paves the way for using a richer data type, eg including Loc.t Patterns, as in Plexing.pattern, only represent patterns (for tokens) and now have a bit more structure (eg the wildcard is represented as None, not as "", while a regular pattern for "x" as Some "x")
2019-03-29[parser] initialization based on Loc.t rather than Loc.sourceEnrico Tassi
In this way one can also set the current offsets in a file, useful if you are parsing a Coq fragment within a file instead of a full file starting from the first line.
2018-12-20Make diffs work for more input stringsJim Fehrle
Diff code uses the lexer to recognize tokens in the inputs, which can be Pp.t's or strings. To add the highlights in the Pp.t, the diff code matches characters in the input to characters in the tokens. Current code fails for inputs containing quote marks or "(*" because the quote marks and comments don't appear in the tokens. This commit adds a "diff mode" to the lexer to return those characters, making the diff routine more robust.
2018-11-21[gramlib] [build] Switch make-based system to packed gramlibEmilio Jesus Gallego Arias
This makes the make-based build system stop linking to Camlp5's gramlib and instead links to our own gramlib. We use the style done in the packing of `Stdlib` in OCaml 4.07. As to introduce a minimal amount of noise in history we use an autogenerated `gramlib__pack` directory. Co-authored-by: Gaëtan Gilbert <gaetan.gilbert@skyskimmer.net>
2018-10-06[api] Remove (most) 8.9 deprecated objects.Emilio Jesus Gallego Arias
A few of them will be of help for future cleanups. We have spared the stuff in `Names` due to bad organization of this module following the split from `Term`, which really difficult things removing the constructors.
2018-03-05Merge PR #6855: Update headers following #6543.Maxime Dénès
2018-02-28[toplevel] Move beautify to its own pass.Emilio Jesus Gallego Arias
We first load the file, then we print it as a post-processing step. This is both more flexible and clearer. We also refactor the comments handling to minimize the logic that is living in the Lexer. Indeed, the main API is now living in the printer, and complex interactions with the state are not possible anymore, including the removal of messing with low-level summary and state setting!
2018-02-27Update headers following #6543.Théo Zimmermann
2018-02-22Rename release_lexer_state to the more descriptive get_lexer_state.Jim Fehrle
2017-09-14Using an algebraic type for distinguishing toplevel input from location in file.Hugo Herbelin
2017-08-01[flags] Remove XML output flag.Emilio Jesus Gallego Arias
This is a second try at removing the hooks for the legacy xml export system which can't currently be tested. It is also not included in the API, so it should either be included in it or this PR be applied.
2017-07-04Bump year in headers.Pierre-Marie Pédrot
2017-04-07[camlpX] Remove camlp4 compat layer.Emilio Jesus Gallego Arias
We remove the camlp4 compatibility layer, and try to clean up most structures. `parsing/compat` is gone. We added some documentation to the lexer/parser interfaces that are often obscured by module includes.
2016-10-21Adding a primitive to recover the set of keywords from the lexer.Pierre-Marie Pédrot
This is useful for debugging purposes.
2016-10-17More on making the lexer more functional (continuing b8ae2de5 andHugo Herbelin
8a8caba3). - Adding cLexer.current_file to the lexer state, i.e. making it a component of the type "coq_parsable" of lexer state (it was forgotten in b8ae2de5 and 8a8caba3). - Inlining save_translator/restore_translator which have now lost most of their substance.
2016-10-17Removing export of location_table outside of cLexer.Hugo Herbelin
It was not used any more by coqdoc since b8194b22 (Dec 2010). The table is now only part of the lexer function closure (and only in the camlp5 case).
2016-10-09Moving Pp.comments to CLexer so that Pp is purer (no more side-effectHugo Herbelin
done by the Ppcmd_comment token) and so that lexing/parsing side-effects are collected at the same place, i.e. in CLexer.
2016-10-09Attaching all extra imperative components of the lexer/parser state toHugo Herbelin
the state of parsable streams, so that different lexing/parsing processes can be started independently without conflicting. Note however that these different lexing/parsing processes cannot be run concurrently as they still work on the same piece of global memory (i.e. calls to entry_parse should remain atomic). To go further, one would typically need to be able to functionally pass the lexing state to each call to the lexer. Note that currently the beautifier is also running in the context of a lexer/parser state (for the mapping of location to comments). In particular, this fixes #5102 (parsing/lexing of bullets depending on the lexing state which was global).
2016-08-30Fix #4941 - ~/.coqrc file confusing locationsMaxime Dénès
2016-06-20Add file name, line number and beginning of line position to locations.Maxime Dénès
Coq locations already had support for this, but were containing dummy information. We now don't need anymore to reconstruct this information by browsing the file when printing an error message or enriching exceptions on the fly. It also became easier to interface with Coq since locations emitted by the lexer now always contain full information. On the API side, Loc.represent disappeared and Loc.t is now exposed as record. It is less error-prone than manipulating a tuple of 5 integers. Also, Loc.create takes 5 arguments instead of 3 and a pair.
2016-05-09Rename Lexer -> CLexer.Pierre-Marie Pédrot