Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New maxNrAssignment scheme for pruning #1156

Merged
merged 2 commits into from
Apr 14, 2022
Merged

Conversation

varunagrawal
Copy link
Collaborator

As discussed in our meeting yesterday, this PR updates the DecisionTreeFactor::prune method to use the new maxNrAssignments scheme.

Here, we use the record of assignments for each leaf to compute the number of duplicate values in the tree without enumerating all the (exponential) number of assignments, and threshold on the top maxNrAssignments values. To facilitate this, I added a new visitLeaf method to DecisionTree which follows the same idea as visit in that we visit all the leaves (and not assignments), but visitLeaf passes the full leaf object to the functional, allowing us to do leaf.nrAssignments() and leaf.constant() quite easily.

NOTE I have one question: should the functional receive the leaf object or the leaf pointer? It is a const either way, but I've made it as a leaf object const-ref so there shouldn't be a memory hit and it is convenient to use (since we don't declare a LeafPtr typedef).

@varunagrawal varunagrawal self-assigned this Mar 31, 2022
@ProfFan
Copy link
Collaborator

ProfFan commented Apr 6, 2022

@varunagrawal Shall we merge this?

@varunagrawal
Copy link
Collaborator Author

I'm waiting for approval on #1155

Base automatically changed from decisiontree-refactor to develop April 14, 2022 02:24
@varunagrawal varunagrawal merged commit caa14bc into develop Apr 14, 2022
@varunagrawal varunagrawal deleted the decisiontree-prune branch April 14, 2022 03:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants