Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add lookuptables for writing efficient lookup tables; make symbolRank, symbolName O(1) #18044

Closed
wants to merge 1 commit into from

Conversation

timotheecour
Copy link
Member

@timotheecour timotheecour commented May 19, 2021

EDIT

it works but don't review yet, i'm improving the API

@timotheecour timotheecour changed the title add lookuptables for writing efficient lookup tables; make symbolRank, symbolName O(1) add lookuptables for writing efficient lookup tables; make symbolRank, symbolName O(1) May 19, 2021
@timotheecour timotheecour force-pushed the pr_lookuptable branch 2 times, most recently from f1c1594 to f421881 Compare May 19, 2021 02:34
@Varriount
Copy link
Contributor

  • Why does this need to be added directly to the standard library? Or alternately, why shouldn't this be an external package?
  • What differences does this have, compared to the other table types?

@timotheecour
Copy link
Member Author

timotheecour commented May 19, 2021

Why does this need to be added directly to the standard library? Or alternately, why shouldn't this be an external package?

because std/enumutils (itself used by std/jsonutils) depends on it, turning an O(N) API into O(1), and in future PRs other stdlib modules will depend on it. Note that the module is private for now (std/private/lookuptables).

What differences does this have, compared to the other table types?

  • it's 2 to 3 times faster than equivalent code using tables (see benchmarks in this PR)
  • it's a self-contained module with no dependency. So instead of depending on tables (which adds these recursive import dependencies: tables,hashes,math,bitops,fenv,algorithm, as shown by --processing:filenames), you only depend on lookuptables

@Clyybber
Copy link
Contributor

There's a much simpler solution to symbolName if we remove symbolRank introduced in #18029 again:
Simply generate an enum type similar to T without the string representation overrides and then convert to that enum and call $ on it.

@timotheecour
Copy link
Member Author

timotheecour commented May 19, 2021

There's a much simpler solution to symbolName

the problem is not with ordinal enums (which don't need symbolRank), the problem is holey enums, for which your approach (which isn't really simpler btw) just cannot work efficiently (it'd be O(N) instead of O(1) as in this PR). And symbolRank (as well as lookuptables) is useful regardless of symbolName.

@Clyybber
Copy link
Contributor

Clyybber commented May 20, 2021

Please reread my proposal. It would work with holey enums and would be O(1) because afaik $ for enums is O(1).

@timotheecour
Copy link
Member Author

timotheecour commented May 20, 2021

Please reread my proposal. It would work with holey enums and would be O(1) because afaik $ for enums is O(1).

please read implementation of reprEnum

# ugh we need a slow linear search:

@Clyybber
Copy link
Contributor

# ugh we need a slow linear search:

I see, but then reprEnum should be improved instead of only improving symbolName/symbolRank.

@Varriount
Copy link
Contributor

So why can't these improvements be applied to the tables module?

@arnetheduck
Copy link
Contributor

because X depends on it,

The solution here is to move X out of the library as well - not introduce Y which soon will be used to motivate including A, B and C as well.

@c-blake
Copy link
Contributor

c-blake commented Jun 26, 2021

If I literally just take his benchmark and impl and compile with -d:nimIntHash1 then I get that Table is faster than this proposal. 88% the time of this new table in genData1 and 66% the time in genData2 on an i7-6700k, nim-devel, gcc-11. Just one cpu/backend/person. A more impulsive person might say "Table is 1.5X faster!"

Even if this lookuptables work did help (and my experiment above suggests it may well hurt), how many enum values are there usually? Is converting to strings in a fast & furious mode even common outside some json microbenchmark? Why is this an optimization so important that it needs a new table? In general, having the whole stdlib captive to any one person's json microbenchmarks seems like a truly terrible plan.

@stale
Copy link

stale bot commented Jun 28, 2022

This pull request has been automatically marked as stale because it has not had recent activity. If you think it is still a valid PR, please rebase it on the latest devel; otherwise it will be closed. Thank you for your contributions.

@stale stale bot added the stale Staled PR/issues; remove the label after fixing them label Jun 28, 2022
@stale stale bot closed this Jul 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Staled PR/issues; remove the label after fixing them
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants