aboutsummaryrefslogtreecommitdiff
path: root/parsing/cLexer.ml
AgeCommit message (Collapse)Author
2021-04-23Relying on the abstract notion of streams with location for parsing.Hugo Herbelin
We also get rid of ploc.ml, now useless, relying a priori on more robust code in lStream.ml for location reporting (see e.g. parse_parsable in grammar.ml).
2020-10-30Renaming Numeral into NumberPierre Roux
2020-09-27Avoid lookup up too many characters.Guillaume Melquiond
This would cause issues in noninteractive mode. For example, when using Drop, the first character of the OCaml code would be read by Coq's REPL instead of OCaml's REPL. The peek_string function is quite inefficient, since the Stream module does not provide any good function to lookup arbitrary characters (or to push back characters).
2020-09-27Make "xxx:{{" always start a quotation, whether registered or not.Guillaume Melquiond
This commit also prevents quotations using "(" and "[" from gobbling sentences. As a consequence, dynamically-registered quotations can no longer modify where Coq sentences stop.
2020-09-11[parsing] Rename token NUMERAL to NUMBERPierre Roux
2020-03-25[parsing] Move `coq_parsable` custom logic to Grammar.Emilio Jesus Gallego Arias
Latest refactorings allow us to make the signature Coq parser a standard `Grammar.S` one; the only bit needed is to provide the extra capabilities to the `Lexer` signature w.r.t. to comments state. The handling of Lexer state is still a bit ad-hoc, in particular it is global whereas it should be fully attached to the parsable. This may work ok in batch mode but the current behavior may be buggy in the interactive context. This PR doesn't solve that but it is a step towards a more functional solution.
2020-03-22Centralizing all kinds of numeral string management in numTok.ml.Hugo Herbelin
Four types of numerals are introduced: - positive natural numbers (may include "_", e.g. to separate thousands, and leading 0) - integer numbers (may start with a minus sign) - positive numbers with mantisse and signed exponent - signed numbers with mantisse and signed exponent In passing, we clarify that the lexer parses only positive numerals, but the numeral interpreters may accept signed numerals. Several improvements and fixes come from Pierre Roux. See https://github.com/coq/coq/pull/11703 for details. Thanks to him.
2020-03-18Update headers in the whole code base.Théo Zimmermann
Add headers to a few files which were missing them.
2019-11-21[coq] Untabify the whole ML codebase.Emilio Jesus Gallego Arias
We also remove trailing whitespace. Script used: ```bash for i in `find . -name '*.ml' -or -name '*.mli' -or -name '*.mlg'`; do expand -i "$i" | sponge "$i"; sed -e's/[[:space:]]*$//' -i.bak "$i"; done ```
2019-08-29Fix a few wrong uses of `msg_notice`Maxime Dénès
2019-08-02[lexer]: improve error message on loct_func misuseEnrico Tassi
2019-06-24Merge PR #10375: [lexer] correctly update line number when lexing QUOTATION ↵Pierre-Marie Pédrot
(fix #10350) Ack-by: gares Reviewed-by: ppedrot
2019-06-18[lexer] correctly update line number when lexing QUOTATION (fix #10350)Enrico Tassi
2019-06-17Update ml-style headers to new year.Théo Zimmermann
2019-04-02Add parsing of decimal constants (e.g., 1.02e+01)Pierre Roux
Rather than integers '[0-9]+', numeral constant can now be parsed according to the regexp '[0-9]+ ([.][0-9]+)? ([eE][+-]?[0-9]+)?'. This can be used in one of the two following ways: - using the function `Notation.register_rawnumeral_interpreter` in an OCaml plugin - using `Numeral Notation` with the type `decimal` added to `Decimal.v` See examples of each use case in the next two commits.
2019-04-02Rename the INT token to NUMERALPierre Roux
In anticipation of future uses of this token for non integer numerals.
2019-03-31Multiple payload types in tokensPierre Roux
Instead of just string (and empty strings for tokens without payload)
2019-03-31[parsing] add keyword-protected generic quotationEnrico Tassi
One can now register a quotation using a grammar rule with QUOTATION("name:"). "name:" becomes a keyword and the token is generated for name: followed by a an identifier or a parenthesized text. Eg constr:x string:[....] ltac:(....) ltac:{....} The delimiter is made of 1 or more occurrences of the same parenthesis, eg ((.....)) or [[[[....]]]]. The idea being that if the text happens to contain the closing delimiter, one can make the delimiter longer and avoid confusion (no escaping). Eg string:[[ .. ']' .. ]] Nesting the delimiter is allowed, eg ((..((...))..)) is OK. The text inside the quotation is returned as a string (including the parentheses), so that a third party parser can take care of it. Keywords don't need to end in ':'.
2019-03-31[parsing] Split Tok.t into Tok.t and Tok.patternEnrico Tassi
Tokens were having a double role: - the output of the lexer - the items of grammar entries, especially terminals Now tokens are the output of the lexer, and this paves the way for using a richer data type, eg including Loc.t Patterns, as in Plexing.pattern, only represent patterns (for tokens) and now have a bit more structure (eg the wildcard is represented as None, not as "", while a regular pattern for "x" as Some "x")
2019-03-29[parser] initialization based on Loc.t rather than Loc.sourceEnrico Tassi
In this way one can also set the current offsets in a file, useful if you are parsing a Coq fragment within a file instead of a full file starting from the first line.
2018-12-20Make diffs work for more input stringsJim Fehrle
Diff code uses the lexer to recognize tokens in the inputs, which can be Pp.t's or strings. To add the highlights in the Pp.t, the diff code matches characters in the input to characters in the tokens. Current code fails for inputs containing quote marks or "(*" because the quote marks and comments don't appear in the tokens. This commit adds a "diff mode" to the lexer to return those characters, making the diff routine more robust.
2018-12-05Removing dead fields from Plexing.lexer.Pierre-Marie Pédrot
2018-11-30[gramlib] Remove `Ploc.t` in favor of `Loc.t`Emilio Jesus Gallego Arias
The types are identical and we have no more reason for the split. Note the following TODOS: - discrepancy of `Ploc.after` with `CLexer.after` - discrepancy of `Ploc.comments` with `CLexer.comments` - `Ploc.dummy` vs `Loc.t option`
2018-11-21[gramlib] [build] Switch make-based system to packed gramlibEmilio Jesus Gallego Arias
This makes the make-based build system stop linking to Camlp5's gramlib and instead links to our own gramlib. We use the style done in the packing of `Stdlib` in OCaml 4.07. As to introduce a minimal amount of noise in history we use an autogenerated `gramlib__pack` directory. Co-authored-by: Gaëtan Gilbert <gaetan.gilbert@skyskimmer.net>
2018-11-04Remove the deprecated Token module and port the corresponding code.Pierre-Marie Pédrot
2018-10-18Removing the Camlp5 macros from CLexer.Pierre-Marie Pédrot
We partially hand-translated so as to result in the minimal diff possible.