@emillon (Collaborator) commented Apr 11, 2025

Fixes #2679

Formatting `mld` files can crash when the input file cannot be lexed as OCaml. This can happen when LaTeX is found in `{m ...}` or C code in `{@c[ ... ]}`, for example.

In that case we would crash when building the list of tokens in the file; however, that list is not used in the `mld` case.
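
A minimal sketch of the idea, not the actual patch: the token list is built by a lexer that may raise, and the fix amounts to falling back to an empty list when it does. `Lex_error`, `lex_as_ocaml`, and `tokens_or_empty` are hypothetical names, and the toy lexer only imitates an OCaml lexer rejecting a stray backslash.

```ocaml
(* Hypothetical stand-ins: none of these names come from ocamlformat. *)
exception Lex_error of string

type token = Word of string

(* Toy lexer: rejects backslashes, much as an OCaml lexer rejects stray LaTeX. *)
let lex_as_ocaml (source : string) : token list =
  if String.contains source '\\' then
    raise (Lex_error "illegal character '\\'")
  else
    String.split_on_char ' ' source
    |> List.filter (fun w -> w <> "")
    |> List.map (fun w -> Word w)

(* Build the token list, falling back to [] when lexing fails.
   This is safe only if the caller does not need the tokens, which is
   what the description says about the mld case. *)
let tokens_or_empty (source : string) : token list =
  try lex_as_ocaml source with Lex_error _ -> []
```

With this sketch, `tokens_or_empty "{m \\alpha}"` yields an empty list instead of raising, while well-formed input is tokenized as before.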

@emillon emillon force-pushed the mld-lexing-errors branch from 794e1c4 to b401c6e Compare April 11, 2025 09:07
@emillon emillon requested a review from Copilot April 11, 2025 09:08
Copilot AI (Contributor) left a comment

Copilot reviewed 1 out of 6 changed files in this pull request and generated 1 comment.

Files not reviewed (5)
  • lib/Parse_with_comments.ml: Language not supported
  • test/passing/refs.default/doc.mld.ref: Language not supported
  • test/passing/refs.janestreet/doc.mld.ref: Language not supported
  • test/passing/refs.ocamlformat/doc.mld.ref: Language not supported
  • test/passing/tests/doc.mld: Language not supported

@emillon emillon requested a review from EmileTrotignon April 11, 2025 09:08
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@EmileTrotignon (Collaborator) commented

I don't really like that we catch the exception even when we are not parsing documentation, but I checked and we cannot match on fragment at this stage. Maybe an `is_documentation` bool argument could fix it?

Still, I think it's okay to merge as-is: the exception will be raised elsewhere when truly parsing an OCaml file.
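
For illustration only, the `is_documentation` suggestion could look roughly like the sketch below. Everything except the flag name is an assumption: `Lex_error`, `lex_as_ocaml`, and `tokens` are hypothetical and do not reflect the real call site.

```ocaml
(* Hypothetical stand-ins; only the flag name comes from the comment above. *)
exception Lex_error of string

(* Toy lexer that fails on a backslash. *)
let lex_as_ocaml (source : string) : string list =
  if String.contains source '\\' then raise (Lex_error "illegal character")
  else String.split_on_char ' ' source

(* Swallow the lexing error only for documentation input; for regular
   OCaml sources the exception still propagates from this function. *)
let tokens ~(is_documentation : bool) (source : string) : string list =
  if is_documentation then
    (try lex_as_ocaml source with Lex_error _ -> [])
  else lex_as_ocaml source
```

The trade-off is an extra argument threaded to the call site in exchange for not masking lexer errors on `.ml` and `.mli` inputs.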

@emillon (Collaborator, Author) commented Apr 11, 2025

yeah I agree it's not ideal. we can't really make the tokens field lazier because it needs the file open, so making it optional would be nice but I'm not sure how to wire that to this call site.

@Julow (Collaborator) left a comment

Let's merge!

@Julow Julow merged commit e879af1 into ocaml-ppx:main May 7, 2025
8 of 12 checks passed
Julow added a commit to Julow/opam-repository that referenced this pull request Oct 24, 2025
….28.1)

CHANGES:

### Highlight

- \* Support for OCaml 5.4
  (ocaml-ppx/ocamlformat#2717, ocaml-ppx/ocamlformat#2720, ocaml-ppx/ocamlformat#2732, ocaml-ppx/ocamlformat#2733, ocaml-ppx/ocamlformat#2735, @Julow, @Octachron, @cod1r, @EmileTrotignon)
  OCamlformat now supports OCaml 5.4 syntax.
  Module packings of the form `((module M) : (module S))` are no longer
  rewritten to `(module M : S)` because these are now two different syntaxes.

- \* Reduce indentation after `|> map (fun` (ocaml-ppx/ocamlformat#2694, @EmileTrotignon)
  Notably, the indentation no longer depends on the length of the infix
  operator, for example:
  ```ocaml
  (* before *)
  v
  |>>>>>> map (fun x ->
              x )
  (* after *)
  v
  |>>>>>> map (fun x ->
      x )
  ```
  `@@ match` can now also be on one line.
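
To make the two module-packing forms from the OCaml 5.4 highlight concrete, here is a small sketch. `S`, `M`, and `read` are hypothetical names chosen for illustration; the two packing expressions are the ones quoted in the changelog entry.

```ocaml
module type S = sig
  val x : int
end

module M = struct
  let x = 42
end

(* Packing with the signature given inside the parentheses. *)
let packed_short = (module M : S)

(* Packing first, then constraining the expression with a package type.
   Per the changelog, ocamlformat no longer rewrites this to the form above. *)
let packed_long = ((module M) : (module S))

(* Either value can be unpacked back into a module. *)
let read (p : (module S)) : int =
  let module N = (val p) in
  N.x
```

Both values have type `(module S)` and unpack to the same module here; the entry is only about which spelling the formatter preserves.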

### Added

- Added the `module-indent` option (ocaml-ppx/ocamlformat#2711, @HPRIOR) to control the indentation
  of items within modules. This affects modules and signatures. For example,
  with `module-indent=4`:
  ```ocaml
  module type M = sig
      type t

      val f : (string * int) list -> int
  end
  ```

- `exp-grouping=preserve` is now the default in `default` and `ocamlformat`
  profiles. This means that it's now possible to use `begin ... end` without
  tweaking ocamlformat. (ocaml-ppx/ocamlformat#2716, @EmileTrotignon)
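
As a small illustration of what preserving `begin ... end` means for source code, the sketch below uses `begin ... end` where parentheses would also be valid. The function `describe` is a hypothetical example, and the exact layout ocamlformat produces for it is not claimed here.

```ocaml
(* `begin ... end` groups an expression exactly like parentheses do. *)
let describe (n : int) : string =
  if n < 0 then begin
    "negative"
  end
  else begin
    match n with
    | 0 -> "zero"
    | _ -> "positive"
  end
```

With `exp-grouping=preserve`, the expectation is that the formatter keeps whichever grouping syntax the author wrote rather than normalizing it to parentheses.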

### Deprecated

- Starting in this release, ocamlformat can use cmdliner >= 2.0.0. When that is
  the case, the tool no longer accepts unambiguous option name prefixes. For
  example, `--max-iter` is not accepted anymore; you have to pass the full
  option `--max-iters`. This does not apply to the keys in the `.ocamlformat`
  configuration files, which have always required the full name.
  See dbuenzli/cmdliner#200.
  (ocaml-ppx/ocamlformat#2680, @emillon)
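
The difference between the two behaviours can be sketched as follows. This is not cmdliner's implementation: `known_options`, `is_prefix`, `resolve_with_prefixes`, and `resolve_exact` are hypothetical, and the option table is a small sample.

```ocaml
(* Sample option table for illustration; not the tool's full option list. *)
let known_options = [ "max-iters"; "margin"; "module-indent" ]

let is_prefix ~(prefix : string) (s : string) : bool =
  String.length prefix <= String.length s
  && String.sub s 0 (String.length prefix) = prefix

(* Prefix-tolerant lookup: an exact name wins; otherwise the name is
   accepted only when it is a prefix of exactly one known option. *)
let resolve_with_prefixes (name : string) : string option =
  if List.mem name known_options then Some name
  else
    match List.filter (is_prefix ~prefix:name) known_options with
    | [ opt ] -> Some opt
    | _ -> None

(* Strict lookup: only the full option name is accepted. *)
let resolve_exact (name : string) : string option =
  if List.mem name known_options then Some name else None
```

Under the prefix-tolerant lookup `max-iter` resolves to `max-iters`, whereas the strict lookup rejects it, which mirrors the change described in the entry above.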

### Changed

- \* The formatting of infix extensions is now consistent with regular
  formatting by construction. This reduces indentation in `f @@ match%e`
  expressions to the level of indentation in `f @@ match`. Other unknown
  inconsistencies might also be fixed. (ocaml-ppx/ocamlformat#2676, @EmileTrotignon)

- \* The spacing of infix attributes is now consistent across keywords. Every
  keyword but `begin`, `function`, and `fun` had attributes stuck to the keyword:
  `match[@A]`, but `fun [@A]`. Now it's also `fun[@A]`. (ocaml-ppx/ocamlformat#2676, @EmileTrotignon)

- \* The formatting of `let a = b in fun ...` is now consistent with other
  contexts like `a ; fun ...`. A check for the syntax `let a = fun ... in ...`
  was made more precise. (ocaml-ppx/ocamlformat#2705, @EmileTrotignon)

- \* `|> begin`, `~arg:begin`, `begin if`, `lazy begin`, `begin match`,
  `begin fun` and `map li begin fun` can now be printed on the same line, with
  one less indentation level for the body of the inner expression.
  (ocaml-ppx/ocamlformat#2664, ocaml-ppx/ocamlformat#2666, ocaml-ppx/ocamlformat#2671, ocaml-ppx/ocamlformat#2672, ocaml-ppx/ocamlformat#2681, ocaml-ppx/ocamlformat#2685, ocaml-ppx/ocamlformat#2693, @EmileTrotignon)
  For example:
  ```ocaml
  (* before *)
  begin
    fun x ->
      some code
  end
  (* after *)
  begin fun x ->
    some code
  end
  ```

- \* `break-struct=natural` now also applies to `sig ... end`. (ocaml-ppx/ocamlformat#2682, @EmileTrotignon)

### Fixed

- Fixed `wrap-comments=true` not working with the janestreet profile (ocaml-ppx/ocamlformat#2645, @Julow)
  Asterisk-prefixed comments are also now formatted the same way as with the
  default profile.

- Fixed `nested-match=align` not working with `match%ext` (ocaml-ppx/ocamlformat#2648, @EmileTrotignon)

- Fixed the AST generated for bindings of the form `let pattern : type = function ...`
  (ocaml-ppx/ocamlformat#2651, @v-gb)

- Print valid syntax for the corner case `(1).a` (ocaml-ppx/ocamlformat#2653, @v-gb)

- `Ast_mapper.default_mapper` now iterates on the location of `in` in `let+ .. in ..`
  (ocaml-ppx/ocamlformat#2658, @v-gb)

- Fix missing parentheses in `let+ (Cstr _) : _ = _` (ocaml-ppx/ocamlformat#2661, @Julow)
  This caused a crash as the generated code wasn't valid syntax.

- Fix bad indentation of `let%ext { ...` (ocaml-ppx/ocamlformat#2663, @EmileTrotignon)
  with `dock-collection-brackets` enabled.

- ocamlformat is now more robust when used as a library to print modified ASTs
  (ocaml-ppx/ocamlformat#2659, @v-gb)

- Fix crash due to edge case with asterisk-prefixed comments (ocaml-ppx/ocamlformat#2674, @Julow)

- Fix crash when formatting `mld` files that cannot be lexed as ocaml (e.g.
  containing LaTeX or C code) (ocaml-ppx/ocamlformat#2684, @emillon)

- \* Fix double parens around module constraint in functor application:
  `module M = F ((A : T))` becomes `module M = F (A : T)`. (ocaml-ppx/ocamlformat#2678, @EmileTrotignon)

- Fix misplaced `;;` due to interaction with floating doc comments.
  (ocaml-ppx/ocamlformat#2691, @EmileTrotignon)

- The formatting of attributes of expressions is now aware of the attributes'
  infix or postfix positions: `((fun [@A] x -> y) [@b])` is formatted without
  moving attributes. (ocaml-ppx/ocamlformat#2676, @EmileTrotignon)

- `begin%e ... end` and `begin [@A] ... end` nodes are always preserved.
  (ocaml-ppx/ocamlformat#2676, @EmileTrotignon)

- `begin end` syntax for `()` is now preserved. (ocaml-ppx/ocamlformat#2676, @EmileTrotignon)

- Fix a crash on `type 'a t = A : 'a. {a: 'a} -> 'a t`. (ocaml-ppx/ocamlformat#2710, @EmileTrotignon)

- Fix a crash where `type%e nonrec t = t` was formatted as `type nonrec%e t = t`,
  which is invalid syntax. (ocaml-ppx/ocamlformat#2712, @EmileTrotignon)

- Fix commandline parsing being quadratic in the number of arguments
  (ocaml-ppx/ocamlformat#2724, @let-def)

- \* Fix `;;` being added after a documentation comment (ocaml-ppx/ocamlformat#2683, @EmileTrotignon)
  This results in more `;;` being inserted, for example:
  ```ocaml
  (* before *)
  print_endline "foo"
  let a = 3

  (* after *)
  print_endline "foo" ;;
  let a = 3
  ```

- Fix dropped comment in `if then (* comment *) begin .. end` (ocaml-ppx/ocamlformat#2734, @Julow)

Development

Successfully merging this pull request may close these issues.

Bug: crash when backslash is in mld file

3 participants