Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: add FoldLeft, FoldLeftX; and ForAllX, ForAnyX, FilteredX, NumberX, PerformX #3357

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

fingolfin
Copy link
Member

@fingolfin fingolfin commented Mar 19, 2019

This is unfinished work that I've been dragging along in some form or another for a couple years now, but I know some people are interested in similar things, so I thought I should perhaps finally finish it; but then I thought that I'd like to first learn if this has any chance of being merged...

This PR adds a function FoldLeft and a corresponding method FoldLeftOp. This generalizes List, Sum, Product.

Moreover, it adds kernel function FOLD_LEFT_X (doesn't need to be a kernel function, but that helps to get competitive performance out of the more general code), and a GAP level wrapper for that called FoldLeftX. This is then used to reimplement ListX, SumX, ProductX with very little code; and then some "obvious" additional "X"-functions are added: ForAllX, ForAnyX, FilteredX, NumberX, PerformX. Note that these new functions are not just added "for completeness"; I've missed (and then later) used every single one of them at some point.

Among the things left to be done:

  • general cleanup of the code
  • more tests
  • devise a method to perform "short circuiting" of the evaluation, so that e.g. ForAllX can immediately abort if it sees false
  • ...

I'd be interested to know what people think about the general thrust of this PR, to determine whether there's any point in polishing it up.

@fingolfin fingolfin added kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements topic: kernel release notes: to be added PRs introducing changes that should be (but have not yet been) mentioned in the release notes gapdays2019-spring Issues and PRs that arose at https://www.gapdays.de/gapdays2019-spring and removed gapdays2019-spring Issues and PRs that arose at https://www.gapdays.de/gapdays2019-spring labels Mar 19, 2019
@ChrisJefferson
Copy link
Contributor

I think this looks very reasonable.

For the shortcutting, I'm tempted to suggest an optional single argument which is a object which is compared using IsIdenticalObj. My reasoning is:

  1. Nice and efficient.
  2. The values I expect most commonly to be tested are 0, true, false and fail, where this will work.
  3. If we make an otherwise unused bag, we can use that value when nothing was given by the user (I thought about using null, but that might mean we short-circut early when given a function which doesn't return, which might be surprising).

@fingolfin
Copy link
Member Author

@ChrisJefferson I like that idea! We could simply pass "" instead of a special "unused bag", as that would be a completely fresh object guaranteed to be non-identical to any other (of course that's a slight inefficiency, too, but I'd imagine not measurable in practice).

@wilfwilson
Copy link
Member

I like this project.

@codecov
Copy link

codecov bot commented Mar 25, 2019

Codecov Report

Merging #3357 (cc79085) into master (12c3bac) will increase coverage by 11.21%.
The diff coverage is 76.27%.

❗ Current head cc79085 differs from pull request most recent head aed891a. Consider uploading reports for the commit aed891a to get more accurate results

@@             Coverage Diff             @@
##           master    #3357       +/-   ##
===========================================
+ Coverage   82.18%   93.39%   +11.21%     
===========================================
  Files         678      716       +38     
  Lines      287772   812207   +524435     
===========================================
+ Hits       236500   758590   +522090     
- Misses      51272    53617     +2345     
Impacted Files Coverage Δ
src/stats.c 94.48% <ø> (-0.41%) ⬇️
src/listfunc.c 92.21% <58.90%> (-5.17%) ⬇️
lib/coll.gi 94.67% <87.36%> (-1.54%) ⬇️
lib/coll.gd 95.46% <100.00%> (+0.13%) ⬆️
lib/memory.gi 43.12% <0.00%> (-41.75%) ⬇️
src/io.c 58.09% <0.00%> (-26.11%) ⬇️
lib/methsel.g 51.16% <0.00%> (-24.70%) ⬇️
lib/colorprompt.g 46.15% <0.00%> (-22.60%) ⬇️
src/finfield.h 83.05% <0.00%> (-13.03%) ⬇️
src/permutat_intern.hh 87.50% <0.00%> (-12.50%) ⬇️
... and 243 more

lib/coll.gd Outdated
@@ -2400,6 +2400,38 @@ DeclareOperation( "AsSSortedListNonstored", [IsListOrCollection] );
DeclareGlobalFunction( "Elements" );


#############################################################################
##
## TODO: document FoldLeft
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: add documentation

lib/coll.gd Show resolved Hide resolved
local tnum, C, func, result, i, l;
l := Length( arg );
if l < 2 or l > 3 or not IsFunction(arg[2]) then
Error( "usage: FoldLeft( <C>, <func>[, <init>] )" );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: should FoldLeft also get an abort mechanism, like FoldLeftX? How should it work? The one for FoldLeftX is a bit adhoc: it works great for the few applications we have, but it's not very general...

What do other languages do for this, if they do anything at all?

lib/coll.gd Outdated
DeclareGlobalFunction( "ForAnyX" );
DeclareGlobalFunction( "FilteredX" );
DeclareGlobalFunction( "NumberX" );
DeclareGlobalFunction( "PerformX" );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: also add FirstX and LastX?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "problem" with FirstX is the same as for ForAnyX and ForAllX: we want to be able to short circuit and abort the search as soon as we know the answer. The mechanism for doing this in this PR is the ugliest part (IMHO) of this PR by far (there is an optional abortValue that one can specify, and for which we check by identity comparison... once encountered, the search aborts. Works, is simply, but... ugly?

lib/coll.gi Outdated Show resolved Hide resolved
while (genIndex <= LEN_PLIST(gens)) {
Obj gen = ELM_PLIST(gens, genIndex);
if (IS_FUNC(gen))
gen = FastCallFuncList(gen, args);
Copy link
Member Author

@fingolfin fingolfin Mar 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: replace FastCallFuncList by just CallFuncList? It was kind of an attempt to bypass call overhead, but now I wonder if that wasn't a premature optimization. Doing some timing measurements probably would be a good idea: comparing performance with FastCallFuncList versus CallFuncList, but also that of ListX, SumX, ProductX before and after this PR.

@coveralls
Copy link

coveralls commented Dec 28, 2019

Coverage Status

Coverage decreased (-0.02%) to 84.715% when pulling 3bc8f92 on fingolfin:mh/fold into 791286a on gap-system:master.

UNB_LIST(args, valIndex);
return 0;
}
else if (CALL_1ARGS(IsListOrCollection, gen) == True) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually here is a chance to generalize ListX and friends to also accept iterators: we just need to accept objects in the filter IsIterator here, that should do it. Note that List and friends already (mostly?) support iterators.


#############################################################################
##
#O FoldLeftX( <gens>, <func>, <init>[, <abortValue>] )
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: replace abortValue by stopValue or stoppingValue?

#F FoldLeft( <coll>, <func> )
#F FoldLeft( <coll>, <func>, <init> )
##
InstallGlobalFunction( FoldLeft,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be redone using InstallEarlyMethod

Error( "gens[",i+1,"] must be a collection, a list, a boolean, ",
"or a function" );
result:= C[1];
for i in [ 2 .. Length( C ) ] do
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contrasting this with the code path used when init is given (which uses for i in C do) I think we get different behavior if C has holes. We probably don't want that difference, do we?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
gapdays2019-spring Issues and PRs that arose at https://www.gapdays.de/gapdays2019-spring kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements release notes: to be added PRs introducing changes that should be (but have not yet been) mentioned in the release notes topic: kernel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants