Skip to content

use OsPath in NormalizedFilePath #446

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
Aug 31, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 16 additions & 1 deletion .github/workflows/haskell.yml
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,17 @@ jobs:
strategy:
fail-fast: false
matrix:
ghc: ['9.2.1', '9.0.1', '8.10.7', '8.8.4', '8.6.5']
ghc: ['9.2.4', '9.0.2', '8.10.7', '8.8.4', '8.6.5']
os: [ubuntu-latest, macOS-latest, windows-latest]
ospath: [true, false]
exclude:
# newer 'entropy' doesn't work with old 'unix', and it doesn't have a correct version bound.
- ospath: true
ghc: 8.6.5
# "cabal build" always timeout
- ospath: true
ghc: 8.8.4
os: windows-latest

steps:
- uses: actions/checkout@v2
Expand Down Expand Up @@ -41,6 +50,12 @@ jobs:

- name: Cabal update
run: cabal update
- name: Cabal configure
shell: bash
run: |
if [ ${{ matrix.ospath }} = "true" ]; then
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should include this as a flag in the package definition? force-ospath or something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice idea!

cabal configure --constraint="filepath ^>= 1.4.100.0"
fi
- name: Build using cabal
run: cabal build all
- name: Test
Expand Down
2 changes: 2 additions & 0 deletions cabal.project
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@ packages:
package lsp
flags: +demo

index-state: 2022-08-25T22:25:05Z

tests: True
benchmarks: True
test-show-details: direct
Expand Down
2 changes: 1 addition & 1 deletion lsp-test/lsp-test.cabal
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
cabal-version: 2.4
name: lsp-test
version: 0.14.0.3
version: 0.14.1.0
synopsis: Functional test framework for LSP servers.
description:
A test framework for writing tests against
Expand Down
8 changes: 7 additions & 1 deletion lsp-types/lsp-types.cabal
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
cabal-version: 2.2
name: lsp-types
version: 1.5.0.0
version: 1.5.1.0
synopsis: Haskell library for the Microsoft Language Server Protocol, data types

description: An implementation of the types to allow language implementors to
Expand Down Expand Up @@ -68,6 +68,7 @@ library
, Language.LSP.Types.WorkspaceEdit
, Language.LSP.Types.WorkspaceFolders
, Language.LSP.Types.WorkspaceSymbol
, Language.LSP.Types.Uri.OsPath
-- other-extensions:
ghc-options: -Wall
build-depends: base >= 4.11 && < 5
Expand All @@ -89,6 +90,9 @@ library
, text
, template-haskell
, unordered-containers
, exceptions
, safe
, bytestring
hs-source-dirs: src
default-language: Haskell2010
default-extensions: StrictData
Expand All @@ -106,6 +110,7 @@ test-suite lsp-types-test
TypesSpec
URIFilePathSpec
WorkspaceEditSpec
LocationSpec
build-depends: base
, QuickCheck
-- for instance Arbitrary Value
Expand All @@ -117,6 +122,7 @@ test-suite lsp-types-test
, network-uri
, quickcheck-instances
, text
, tuple
build-tool-depends: hspec-discover:hspec-discover
ghc-options: -threaded -rtsopts -with-rtsopts=-N -Wall
default-language: Haskell2010
Expand Down
4 changes: 3 additions & 1 deletion lsp-types/src/Language/LSP/Types.hs
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ module Language.LSP.Types
, module Language.LSP.Types.TextDocument
, module Language.LSP.Types.TypeDefinition
, module Language.LSP.Types.Uri
, module Language.LSP.Types.Uri.OsPath
, module Language.LSP.Types.WatchedFiles
, module Language.LSP.Types.Window
, module Language.LSP.Types.WorkspaceEdit
Expand Down Expand Up @@ -69,8 +70,8 @@ import Language.LSP.Types.Initialize
import Language.LSP.Types.Location
import Language.LSP.Types.LspId
import Language.LSP.Types.MarkupContent
import Language.LSP.Types.Method
import Language.LSP.Types.Message
import Language.LSP.Types.Method
import Language.LSP.Types.Parsing
import Language.LSP.Types.Progress
import Language.LSP.Types.References
Expand All @@ -83,6 +84,7 @@ import Language.LSP.Types.StaticRegistrationOptions
import Language.LSP.Types.TextDocument
import Language.LSP.Types.TypeDefinition
import Language.LSP.Types.Uri
import Language.LSP.Types.Uri.OsPath
import Language.LSP.Types.WatchedFiles
import Language.LSP.Types.Window
import Language.LSP.Types.WorkspaceEdit
Expand Down
112 changes: 73 additions & 39 deletions lsp-types/src/Language/LSP/Types/Uri.hs
Original file line number Diff line number Diff line change
@@ -1,7 +1,11 @@
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE InstanceSigs #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE RecordWildCards #-}
{-# LANGUAGE TypeSynonymInstances #-}

module Language.LSP.Types.Uri
( Uri(..)
, uriToFilePath
Expand All @@ -10,32 +14,39 @@ module Language.LSP.Types.Uri
, toNormalizedUri
, fromNormalizedUri
, NormalizedFilePath
, normalizedFilePath
, toNormalizedFilePath
, fromNormalizedFilePath
, normalizedFilePathToUri
, uriToNormalizedFilePath
, emptyNormalizedFilePath
-- Private functions
, platformAwareUriToFilePath
, platformAwareFilePathToUri
)
where

import Control.DeepSeq
import qualified Data.Aeson as A
import Data.Binary (Binary, Get, put, get)
import qualified Data.Aeson as A
import Data.Binary (Binary, Get, get, put)
import Data.ByteString.Short (ShortByteString)
import qualified Data.ByteString.Short as BS
import Data.Hashable
import Data.List (stripPrefix)
import Data.String (IsString, fromString)
import Data.Text (Text)
import qualified Data.Text as T
import Data.List (stripPrefix)
import Data.String (IsString (fromString))
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Encoding as T
import Data.Text.Encoding.Error (UnicodeException)
import GHC.Generics
import Network.URI hiding (authority)
import qualified System.FilePath as FP
import qualified System.FilePath.Posix as FPP
import qualified System.FilePath.Windows as FPW
import GHC.Stack (HasCallStack)
import Network.URI hiding (authority)
import Safe (tailMay)
import qualified System.FilePath as FP
import qualified System.FilePath.Posix as FPP
import qualified System.FilePath.Windows as FPW
import qualified System.Info


newtype Uri = Uri { getUri :: Text }
deriving (Eq,Ord,Read,Show,Generic,A.FromJSON,A.ToJSON,Hashable,A.ToJSONKey,A.FromJSONKey)

Expand Down Expand Up @@ -67,7 +78,7 @@ isUnescapedInUriPath systemOS c
normalizeUriEscaping :: String -> String
normalizeUriEscaping uri =
case stripPrefix (fileScheme ++ "//") uri of
Just p -> fileScheme ++ "//" ++ (escapeURIPath $ unEscapeString p)
Just p -> fileScheme ++ "//" ++ escapeURIPath (unEscapeString p)
Nothing -> escapeURIString isUnescapedInURI $ unEscapeString uri
where escapeURIPath = escapeURIString (isUnescapedInUriPath System.Info.os)

Expand Down Expand Up @@ -107,17 +118,19 @@ platformAdjustFromUriPath :: SystemOS
-> String -- ^ path
-> FilePath
platformAdjustFromUriPath systemOS authority srcPath =
(maybe id (++) authority) $
if systemOS /= windowsOS || null srcPath then srcPath
else let
firstSegment:rest = (FPP.splitDirectories . tail) srcPath -- Drop leading '/' for absolute Windows paths
drive = if FPW.isDrive firstSegment
then FPW.addTrailingPathSeparator firstSegment
else firstSegment
in FPW.joinDrive drive $ FPW.joinPath rest
maybe id (++) authority $
if systemOS /= windowsOS
then srcPath
else case FPP.splitDirectories <$> tailMay srcPath of
Just (firstSegment:rest) -> -- Drop leading '/' for absolute Windows paths
let drive = if FPW.isDrive firstSegment
then FPW.addTrailingPathSeparator firstSegment
else firstSegment
in FPW.joinDrive drive $ FPW.joinPath rest
_ -> srcPath

filePathToUri :: FilePath -> Uri
filePathToUri = (platformAwareFilePathToUri System.Info.os) . FP.normalise
filePathToUri = platformAwareFilePathToUri System.Info.os . FP.normalise

{-# WARNING platformAwareFilePathToUri "This function is considered private. Use normalizedUriToFilePath instead." #-}
platformAwareFilePathToUri :: SystemOS -> FilePath -> Uri
Expand Down Expand Up @@ -151,27 +164,32 @@ platformAdjustToUriPath systemOS srcPath
FPP.addTrailingPathSeparator (init drv)
| otherwise = drv

-- | Newtype wrapper around FilePath that always has normalized slashes.
-- The NormalizedUri and hash of the FilePath are cached to avoided
-- | A file path that is already normalized. It is stored as an UTF-8 encoded 'ShortByteString'
--
-- The 'NormalizedUri' is cached to avoided
-- repeated normalisation when we need to compute them (which is a lot).
--
-- This is one of the most performance critical parts of ghcide, do not
-- modify it without profiling.
data NormalizedFilePath = NormalizedFilePath NormalizedUri !FilePath
data NormalizedFilePath = NormalizedFilePath !NormalizedUri {-# UNPACK #-} !ShortByteString
deriving (Generic, Eq, Ord)

instance NFData NormalizedFilePath

instance Binary NormalizedFilePath where
put (NormalizedFilePath _ fp) = put fp
get = do
v <- Data.Binary.get :: Get FilePath
let nuri = internalNormalizedFilePathToUri v
return (normalizedFilePath nuri v)
v <- Data.Binary.get :: Get ShortByteString
case decodeFilePath v of
Left e -> fail (show e)
Right v' ->
return (NormalizedFilePath (internalNormalizedFilePathToUri v') v)

encodeFilePath :: String -> ShortByteString
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haddock?

encodeFilePath = BS.toShort . T.encodeUtf8 . T.pack

-- | A smart constructor that performs UTF-8 encoding and hash consing
normalizedFilePath :: NormalizedUri -> FilePath -> NormalizedFilePath
normalizedFilePath nuri nfp = NormalizedFilePath nuri nfp
decodeFilePath :: ShortByteString -> Either UnicodeException String
decodeFilePath = fmap T.unpack . T.decodeUtf8' . BS.fromShort
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can always speed this up with something like utf8-string or rolling our own utf-8 decoding

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


-- | Internal helper that takes a file path that is assumed to
-- already be normalized to a URI. It is up to the caller
Expand All @@ -191,20 +209,36 @@ instance Hashable NormalizedFilePath where
hashWithSalt salt (NormalizedFilePath uri _) = hashWithSalt salt uri

instance IsString NormalizedFilePath where
fromString :: String -> NormalizedFilePath
fromString = toNormalizedFilePath

toNormalizedFilePath :: FilePath -> NormalizedFilePath
toNormalizedFilePath fp = normalizedFilePath nuri nfp
toNormalizedFilePath fp = NormalizedFilePath nuri . encodeFilePath $ nfp
where
nfp = FP.normalise fp
nuri = internalNormalizedFilePathToUri nfp
nfp = FP.normalise fp
nuri = internalNormalizedFilePathToUri nfp

fromNormalizedFilePath :: NormalizedFilePath -> FilePath
fromNormalizedFilePath (NormalizedFilePath _ fp) = fp
-- | Extracts 'FilePath' from 'NormalizedFilePath'.
-- The function is total. The 'HasCallStack' constraint is added for debugging purpose only.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little confusing as it immediately calls error! I get the argument: the invariant of NFP is that it contains a correctly encoded string so this should never fail.

However, I also bet we call this a lot. What would happen if we also cached the decoded filepath in NormalizedFilePath? Would it blow out memory usage?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, memory usage is the point of using ShortByteString or OsPath instead of FilePath. Despite the encoding/decoding, both CPU time and memory usage decreased in benchmarks!

I saw most use sites of toNormalizedFilePath just extracts the FilePath and pass it to some IO function, so the CPU is not the bottleneck.

fromNormalizedFilePath :: HasCallStack => NormalizedFilePath -> FilePath
fromNormalizedFilePath (NormalizedFilePath _ fp) =
case decodeFilePath fp of
Left e -> error $ show e
Right x -> x

normalizedFilePathToUri :: NormalizedFilePath -> NormalizedUri
normalizedFilePathToUri (NormalizedFilePath uri _) = uri

uriToNormalizedFilePath :: NormalizedUri -> Maybe NormalizedFilePath
uriToNormalizedFilePath nuri = fmap (normalizedFilePath nuri) mbFilePath
uriToNormalizedFilePath nuri = fmap (NormalizedFilePath nuri . encodeFilePath) mbFilePath
where mbFilePath = platformAwareUriToFilePath System.Info.os (fromNormalizedUri nuri)

emptyNormalizedUri :: NormalizedUri
emptyNormalizedUri =
let s = "file://"
in NormalizedUri (hash s) s

-- | 'NormalizedFilePath' that contains an empty file path
emptyNormalizedFilePath :: NormalizedFilePath
emptyNormalizedFilePath = NormalizedFilePath emptyNormalizedUri ""

47 changes: 47 additions & 0 deletions lsp-types/src/Language/LSP/Types/Uri/OsPath.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
{-# LANGUAGE CPP #-}
{-# LANGUAGE ScopedTypeVariables #-}

#if MIN_VERSION_filepath(1,4,100)
#define OS_PATH 1
#endif

module Language.LSP.Types.Uri.OsPath
(
#ifdef OS_PATH
osPathToNormalizedFilePath
, normalizedFilePathToOsPath
#endif
) where

#ifdef OS_PATH

import Control.DeepSeq (NFData, force)
import Control.Exception hiding (try)
import Control.Monad.Catch
import Language.LSP.Types.Uri
import System.IO.Unsafe (unsafePerformIO)
import System.OsPath

{-|
Constructs 'NormalizedFilePath' from 'OsPath'. Throws 'IOException' if the conversion fails.
-}
osPathToNormalizedFilePath :: MonadThrow m => OsPath -> m NormalizedFilePath
osPathToNormalizedFilePath = fmap toNormalizedFilePath . unsafePerformIO' . decodeFS
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting a little confused, would you mind writing a comment somewhere explaining the deal with the various encodings? The idea is something like: we always decode OsPaths using the system encoding into a string, and then we re-encode them as UTF8 to store them in NFP?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have missed all the discussion in the PR, so I need a comment explaining why we don't do the obvious thing (which would be storing the OsPath inside NormalizedFilePath).

Comment on lines +28 to +29
Copy link
Collaborator

@pepeiborra pepeiborra Sep 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of decodeFS you could use the pure decodeWith and isolate the unsafePerformIO in a single global:

osPathToNormalizedFilePath :: MonadThrow m => OsPath -> Either EncodingException NormalizedFilePath
osPathToNormalizedFilePath = fmap toNormalizedFilePath . decodeWith systemEnc

{-# NOINLINE systemEnc #-}
systemEnc :: TextEncoding
systemEnc = unsafePerformIO getFileSystemEncoding

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied in #453. Thank you!


{-|
Extracts 'OsPath' from 'NormalizedFilePath'. Throws 'IOException' if the conversion fails.
-}
normalizedFilePathToOsPath :: MonadThrow m => NormalizedFilePath -> m OsPath
normalizedFilePathToOsPath = unsafePerformIO' . encodeFS . fromNormalizedFilePath

unsafePerformIO' :: (MonadThrow m, NFData a) => IO a -> m a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc?

unsafePerformIO' action =
case fp of
Left (e :: SomeException) -> throwM e
Right fp' -> pure fp'
where
fp = unsafePerformIO . try $ do
x <- action
evaluate . force $ x

#endif
Loading