Skip to content

Intern Qualified, use HashMaps for type checking environment #9

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 22 commits into
base: restaumatic
Choose a base branch
from

Conversation

kozak
Copy link

@kozak kozak commented May 27, 2025

Cleaned up.

stat A: min 16ms max 17ms mean 16ms median 16ms stddev 0ms n=18
stat B: min 13ms max 14ms mean 13ms median 13ms stddev 0ms n=18
mean diff -20.6%
ttest detected diff? true p-value 0.000000
stat ratios: min -23.50% max -16.32% mean -20.61% median -20.69% stddev 2.28% n=18
ttest detected ratios diff? true p-value 0.000000

@kozak kozak requested review from zyla and jborkowski May 27, 2025 11:15
@kozak kozak changed the title hash type with interning Intern QualifiedBy use HashMaps for type checking environment May 27, 2025
@kozak kozak changed the title Intern QualifiedBy use HashMaps for type checking environment Intern Qualified, use HashMaps for type checking environment May 27, 2025
{-# LANGUAGE ViewPatterns #-}
{-# LANGUAGE ScopedTypeVariables #-}

module Language.PureScript.Interner where
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's taken from https://github.com/reflex-frp/hash-cons/blob/main/src/Data/HashCons/Internal.hs , right?
Are there modifications? Maybe add a comment indicating the source?
Also, the license probably should be copied here or something

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely should

@@ -93,10 +96,10 @@ data Ident
--
| UnusedIdent
-- |
-- A generated name used only for internal transformations
-- A generated name used only for hashConsal transformations
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A clbuttic mistake

Suggested change
-- A generated name used only for hashConsal transformations
-- A generated name used only for internal transformations

Comment on lines +214 to +226
instance Show ModuleName where
show (ModuleName i) = T.unpack $ unHashCons i

instance Ord ModuleName where
compare (ModuleName a) (ModuleName b) = compare (unHashCons a) (unHashCons b)

instance Serialise ModuleName where
encode (ModuleName i) = encode (unHashCons i)
decode = ModuleName . hashCons <$> decode

instance Hashable ModuleName where
hash (ModuleName i) = hash i
hashWithSalt s (ModuleName i) = hashWithSalt s i
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These instances can probably be handled via newtype deriving.

show (ModuleName i) = T.unpack $ unHashCons i

instance Ord ModuleName where
compare (ModuleName a) (ModuleName b) = compare (unHashCons a) (unHashCons b)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we don't use the "optimized" hash-based Ord? Do we rely on deterministic ordering somewhere?

@@ -229,89 +263,130 @@ toMaybeModuleName (BySourcePos _) = Nothing
-- |
-- A qualified name, i.e. a name with an optional module name
--
data Qualified a = Qualified QualifiedBy a
deriving (Show, Eq, Ord, Functor, Foldable, Traversable, Generic)
data Qualified' a = Qualified' QualifiedBy a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what the convention is in Haskell, but in Rust-land such types ("guts" of a type where the public version is wrapped in something) are named by adding an Inner suffix, i.e. QualifiedInner

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool


instance NFData a => NFData (Qualified a)
instance Serialise a => Serialise (Qualified a)
mapQualified :: Hashable b => (a -> b) -> Qualified a -> Qualified b
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a little annoying (lots of changes caused by losing the Functor instance).

If only the Haskell ecosystem used constrained functors by default...

parseJSON v = byModule <|> bySourcePos <|> byMaybeModuleName'
where
byModule = do
(mn, a) <- parseJSON2 v
pure $ Qualified (ByModuleName mn) a
pure $ mkQualified_ (ByModuleName mn) a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just use the pattern synonym here?

Suggested change
pure $ mkQualified_ (ByModuleName mn) a
pure $ Qualified (ByModuleName mn) a

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and in a lot of other places)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird because I thought I have already done that, maybe it was a different branch..

Comment on lines +53 to +81
newtype PSString = PSString { unPSString :: [Word16] }
deriving (Eq, NFData, Generic)
deriving newtype Hashable

instance NFData PSString
instance Serialise PSString
instance Ord PSString where
compare (PSString a) (PSString b) = compare a b

instance Show PSString where
show = show . codePoints

toUTF16CodeUnits :: PSString -> [Word16]
toUTF16CodeUnits (PSString ps) = ps

mkPSString :: [Word16] -> PSString
mkPSString = PSString


instance Semigroup PSString where
PSString a <> PSString b = PSString (a <> b)

instance Monoid PSString where
mempty = PSString []
mappend = (<>)

instance Codec.Serialise PSString where
encode (PSString s) = Codec.encode s
decode = mkPSString <$> Codec.decode


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume these changes are preparation for also interning PSString?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, this is a leftover from interning. I reverted it but didn't revert the changes.

decode = ProperName . hashCons <$> decode

instance Ord (ProperName a) where
compare (ProperName a) (ProperName b) = compare (unHashCons a) (unHashCons b)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with ModuleName, why can't we use HashCons' Ord?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants