This repository was archived by the owner on Aug 3, 2024. It is now read-only.

Haddock crash when table contains "unicode" symbols #1578

Open
@guibou

The following code is crashing haddock:

{- |

+-----+
| ✅  |
+-----+

-}
module Toto where

$ haddock --version
Haddock version 2.27.0, (c) Simon Marlow 2006
Ported to use the GHC API by David Waern 2006-2008
$ which haddock
/nix/store/80k5c2yalbmmgny0np0y7ayd864xqpj3-ghc-9.4.4/bin/haddock
$ haddock Toto.hs  

<no location info>: error:
    Data.Text.Internal.Fusion.Common.index: Index too large
CallStack (from HasCallStack):
  error, called at libraries/text/src/Data/Text/Internal/Fusion/Common.hs:1180:24 in text-2.0.1:Data.Text.Internal.Fusion.Common
  streamError, called at libraries/text/src/Data/Text/Internal/Fusion/Common.hs:1080:33 in text-2.0.1:Data.Text.Internal.Fusion.Common
  indexI, called at libraries/text/src/Data/Text/Internal/Fusion.hs:249:9 in text-2.0.1:Data.Text.Internal.Fusion
  index, called at libraries/text/src/Data/Text.hs:1839:13 in text-2.0.1:Data.Text
  index, called at utils/haddock/haddock-library/src/Documentation/Haddock/Parser.hs:464:17 in main:Documentation.Haddock.Parser
haddock: Cannot typecheck modules

The crash is highly sensitive to whitespace. For example, the following table, with one more space after the symbol:

+-----+
| ✅   |
+-----+

works.

I suspect the problem is that line lengths are checked by number of bytes or by number of characters, and these do not match because of the encoding.
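To make the mismatch concrete, here is a minimal sketch that measures each row of the two tables above, both in Unicode code points (what `Data.Text.length` counts) and in UTF-8 bytes. It assumes the symbol is the single code point U+2705 and that the rows contain exactly the spaces shown; the names `crashingTable`, `workingTable` and `rowLengths` are made up for this illustration and are not part of haddock.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- Rows of the crashing table. "\x2705" is the check mark symbol
-- (assumed here to be the single code point U+2705).
crashingTable :: [T.Text]
crashingTable = ["+-----+", "| \x2705  |", "+-----+"]

-- Rows of the variant with one extra space, which haddock accepts.
workingTable :: [T.Text]
workingTable = ["+-----+", "| \x2705   |", "+-----+"]

-- Length of a row as (code points, UTF-8 bytes).
rowLengths :: T.Text -> (Int, Int)
rowLengths row = (T.length row, BS.length (TE.encodeUtf8 row))

main :: IO ()
main = do
  print (map rowLengths crashingTable)  -- [(7,7),(6,8),(7,7)]
  print (map rowLengths workingTable)   -- [(7,7),(7,9),(7,7)]
```

Under these assumptions the crashing row is one code point shorter than its border, while the working row has the same number of code points as its border. That would be consistent with the `index` call in the stack trace reading past the end of the shorter row, although I have not confirmed this against the parser source.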

This is known, see #718 (comment), where @phadej says:

There /will/ be a problem with UTF-8 as for tables we need to count characters. I won't do anything for that at this point.

I'm mostly opening the ticket for reference.

That said, it may be possible to make haddock more robust here, either by generating an invalid table or by failing more gracefully.
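As a rough idea of what "more graceful" could look like, here is a sketch of a bounds-checked index and a row-length check that reports a message instead of throwing. Both `safeIndex` and `checkRows` are hypothetical helpers written for this ticket; they are not existing haddock or text functions, and the real parser may need something different.

```haskell
module Main (main) where

import qualified Data.Text as T

-- Hypothetical total replacement for T.index: returns Nothing
-- instead of throwing when the index is out of range.
safeIndex :: T.Text -> Int -> Maybe Char
safeIndex t i
  | i >= 0 && i < T.length t = Just (T.index t i)
  | otherwise = Nothing

-- Hypothetical validation: every row must have as many code points
-- as the first row, otherwise return an error message.
checkRows :: [T.Text] -> Either String [T.Text]
checkRows [] = Right []
checkRows rows@(firstRow : _)
  | all ((== width) . T.length) rows = Right rows
  | otherwise =
      Left ("table rows differ in length; expected "
            ++ show width ++ " characters per row")
  where
    width = T.length firstRow

main :: IO ()
main = do
  -- The crashing table from this report (symbol assumed to be U+2705).
  let bad = map T.pack ["+-----+", "| \x2705  |", "+-----+"]
  print (safeIndex (bad !! 1) 6)                   -- Nothing
  putStrLn (either id (const "ok") (checkRows bad))
```

With a check like this, the table above could be rejected with a warning, or rendered as plain text, rather than aborting the whole run.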

Note: I'm using haddock with ghc 9.4 which uses text 2, but I've also observed the problem with ghc 9.2 and text 1.2.
