Closed
Description
Clarification and motivation
In the past, we've had many bugs related to path equality. A few examples (there are probably more):
- [BUG] Using `./` in paths #106
- [BUG] Trailing slashes in filepaths #195
- [BUG] Anchor checks are broken #120
- 77cba6c
We've solved them mainly by sprinkling the codebase with `normalise`, `normaliseWithNoTrailing`, `expandIndirections`, `dropTrailingPathSeparator`, `canonizeLocalRef`, etc.
But this approach is extremely error-prone.
A more principled approach might be to canonicalize filepaths using `canonicalizePath`.
We should canonicalize:
- All filepaths returned by `git ls-files`
- All local references
Here's a suggestion (haven't actually tried it):
-newtype RepoInfo = RepoInfo (Map FilePath (Maybe FileInfo))
+newtype RepoInfo = RepoInfo (Map CanonicalPath (Maybe FileInfo))
data FileInfo = FileInfo
{ _fiReferences :: [Reference]
, _fiAnchors :: [Anchor]
}
data Reference = Reference
{ rName :: Text
, rLink :: Text
, rAnchor :: Maybe Text
, rPos :: Position
+ , rInfo :: ReferenceInfo
}
+ -- We probably won't need the type `LocationType` after this.
+data ReferenceInfo
+ = RICurrentFile
+ | RIOtherFile CanonicalPath
+ | RIExternal
+ | RIOtherProtocol
+newtype CanonicalPath = UnsafeCanonicalPath { unCanonicalPath :: FilePath }
+ deriving newtype (Eq, Show, Ord)
+
+canonicalizePath :: FilePath -> IO CanonicalPath
+canonicalizePath = fmap UnsafeCanonicalPath . Directory.canonicalizePath
Then we should be able to simply use the `Eq` and `Ord` instances as usual for comparing `CanonicalPath`s, checking if a `CanonicalPath` exists in a `Map`/`Set`, etc.
Let's review the uses of `normalise` and similar functions, and try to get rid of as many as possible.
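As a rough, untested-in-this-codebase sketch of that idea: the block below wraps `System.Directory.canonicalizePath` in the `CanonicalPath` newtype and uses the derived `Eq`/`Ord` instances for comparison and `Map` lookup. The helpers `samePath`, `indexPaths` and `lookupPath` are hypothetical names made up for illustration. It assumes a `directory` version of 1.3.0.0 or later, where `canonicalizePath` no longer preserves a trailing path separator.

```haskell
import qualified Data.Map.Strict as Map
import qualified System.Directory as Directory

-- | A path that has gone through 'Directory.canonicalizePath'.
-- The constructor is "unsafe" because nothing stops a caller from
-- wrapping an arbitrary, non-canonical 'FilePath' with it.
newtype CanonicalPath = UnsafeCanonicalPath { unCanonicalPath :: FilePath }
  deriving (Eq, Ord, Show)

-- | The only intended way to build a 'CanonicalPath'.
canonicalizePath :: FilePath -> IO CanonicalPath
canonicalizePath = fmap UnsafeCanonicalPath . Directory.canonicalizePath

-- | Hypothetical helper: do two raw paths refer to the same location?
samePath :: FilePath -> FilePath -> IO Bool
samePath a b = (==) <$> canonicalizePath a <*> canonicalizePath b

-- | Hypothetical helper: build an index keyed by canonical path.
indexPaths :: [(FilePath, v)] -> IO (Map.Map CanonicalPath v)
indexPaths = fmap Map.fromList . traverse canon
  where
    canon (path, value) = do
      cpath <- canonicalizePath path
      pure (cpath, value)

-- | Hypothetical helper: look up a raw path in such an index.
lookupPath :: FilePath -> Map.Map CanonicalPath v -> IO (Maybe v)
lookupPath path index = do
  cpath <- canonicalizePath path
  pure (Map.lookup cpath index)
```

One caveat worth checking before relying on this: `Directory.canonicalizePath` also resolves symbolic links, and for paths that do not exist it only canonicalizes as much of the path as it can, so a `..` that follows a missing directory may be left in place.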
Acceptance criteria
- We're canonicalizing paths at the boundaries (i.e. after reading filepaths from `git ls-files` and from scanning markdown files)
- We're avoiding manually "massaging" paths using `normalise/dropTrailingPathSeparator/etc` throughout the codebase
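A minimal sketch of what "canonicalizing at the boundaries" could look like, assuming the `CanonicalPath` newtype suggested above. The function names `listTrackedFiles`, `resolveLocalLink` and `splitNul` are hypothetical and not taken from the codebase. It calls `git ls-files -z` so that entries are NUL-separated and unusual file names are not quoted.

```haskell
import qualified System.Directory as Directory
import System.FilePath (takeDirectory, (</>))
import System.Process (readProcess)

newtype CanonicalPath = UnsafeCanonicalPath { unCanonicalPath :: FilePath }
  deriving (Eq, Ord, Show)

canonicalizePath :: FilePath -> IO CanonicalPath
canonicalizePath = fmap UnsafeCanonicalPath . Directory.canonicalizePath

-- | Split NUL-separated output, dropping the empty chunk after the
-- final terminator.
splitNul :: String -> [String]
splitNul s = case break (== '\NUL') s of
  (chunk, [])       -> [chunk | not (null chunk)]
  (chunk, _ : rest) -> chunk : splitNul rest

-- | Boundary 1 (hypothetical): every path reported by @git ls-files@
-- is canonicalized once, right after it is read.
listTrackedFiles :: FilePath -> IO [CanonicalPath]
listTrackedFiles root = do
  out <- readProcess "git" ["-C", root, "ls-files", "-z"] ""
  traverse (canonicalizePath . (root </>)) (splitNul out)

-- | Boundary 2 (hypothetical): a local link found while scanning a
-- markdown file is resolved relative to that file's directory and
-- canonicalized immediately.
resolveLocalLink :: FilePath -> FilePath -> IO CanonicalPath
resolveLocalLink containingFile link =
  canonicalizePath (takeDirectory containingFile </> link)
```

With both boundaries returning `CanonicalPath`, code further in would have no raw `FilePath` left to normalise by hand. The sketch treats every link as relative to the containing file; how repository-rooted links such as `/docs/x.md` should be resolved is left open here.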