Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit number of results per depth #70

Open
Gofven opened this issue May 30, 2024 · 6 comments
Open

Limit number of results per depth #70

Gofven opened this issue May 30, 2024 · 6 comments

Comments

@Gofven
Copy link

Gofven commented May 30, 2024

I'd like to use this library, but i just wonder if it's possible to set a limit of e.g. max 10 childrens per root node and two children per child node?

@matthiask
Copy link
Member

You can do something like this by introducing your own def clean() method on nodes which checks for this.

Of course this doesn't check anything on the database level, but maybe it's good enough?

def clean(self):
    if not self.parent_id and Node.objects.filter(Q(parent=None) & ~Q(id=self.id)).count() > 10:
        raise ValidationError(...)
    if self.parent_id and self.parent.children.filter(~Q(id=self.id)).count() > 2:
        raise ValidationError(...)

You could maybe also do something using a database constraint, but you'd have to research this yourself for now.

@Gofven
Copy link
Author

Gofven commented May 30, 2024

Thank you for responding so quickly c:
Although that's not quite what i was wondering, but i can see my question was a bit badly formulated, sorry for that!

What i want is to make a depth first QuerySet that can be paginated, ordered by score, and allows only 10 childrens, and 2 grandchildrens before jumping to next parent node

Although i'm not sure how to explain this, but i'll try! Below is an example of what i want the queryset to return (in ascending order)

  • the "..." is to mark points where to skip
  • "/" describes when a tree node branches to a child node and also when it ends
  • "#" is a comment for guidance c:

├── 1/ # Root node begins here
│ ├── 2
│ ├── 3/
│ │ ├── 4
│ │ ├── 5/
│ │ │ └── ... # grand grand children not allowed
│ │ └── ... # more than two grand children
│ ├── 6
│ ├── 7/
│ │ └── 8
│ ├── 9
│ ├── 10
│ ├── 11
│ ├── 12/
│ │ └── 13/
│ │ └── ... # grand grand children not allowed
│ ├── 14
│ ├── 15
│ └── ... # more than ten children
└── 16/
├── 17
├── 18/
│ ├── 19
│ ├── 20
└── 21/
└── 22

@matthiask
Copy link
Member

Ah, yes.

  • Well, for the total descendants count, you can always query the root node of the current tree (using node.ancestors()[0] if node.parent_id else node) and then check node.descendants().count() for the count of all descendants.
  • The recipe for a maximum amount of children is already in the first response.
  • The tree depth can be limited by checking for node.ancestors().count() >= 2 or something, or maybe without tree queries by using node.parent and node.parent.parent and node.parent.parent.parent (of course with guards etc.)

Pagination is always much harder than it looks at first; easier ways exist maybe per use case, e.g. you could only paginate the root nodes and just accept what descendants you get or whatever.

I'm not sure I understand your requirements completely, but maybe the ideas here will help you with the implementation! It certainly sounds doable with django-tree-queries.

@Gofven
Copy link
Author

Gofven commented May 30, 2024

You're getting closer that's for sure c:

However, i forgot to mention a few things again, i'll try more!

The descendants that i show with the "..." is not a limitation, the model will allow these nodes to exist, which is why a clean function wouldn't work for this!

Also, the root node as described can also be a child, or a grand child and so-on, the QuerySet can start from anywhere on this Tree and give the same structure back, even if it starts on a Node which is at 10, 20, or even a hundred depth

Thanks again for your blazing fast responses!!

@matthiask
Copy link
Member

The problem is then that tree queries will always build a tree of at least all descendants of the current node; you can apply a limit like (untested!) .extra(where=["tree_depth <= 2"]) or something, but I'm not 100% sure if the new .tree_filter() (

def tree_filter(self, *args, **kwargs):
"""
Adds a filter to the TreeQuery rank_table_query
Takes the same arguements as a Django QuerySet .filter()
"""
self.query.__class__ = TreeQuery
self.query._setup_query()
self.query.rank_table_query = self.query.rank_table_query.filter(
*args, **kwargs
)
return self
def tree_exclude(self, *args, **kwargs):
"""
Adds a filter to the TreeQuery rank_table_query
Takes the same arguements as a Django QuerySet .exclude()
"""
self.query.__class__ = TreeQuery
self.query._setup_query()
self.query.rank_table_query = self.query.rank_table_query.exclude(
*args, **kwargs
)
return self
) functions are flexible enough to allow excluding ancestors etc. so that the database can at least avoid building the whole tree in memory before applying filters on it. Since CTEs are still an optimization fence in PostgreSQL that may be a hard problem.

I don't really want to point anyone towards some of the less well maintained alternatives like django-mptt, but it sounds as if something like https://www.postgresql.org/docs/current/ltree.html or maybe some other tree/graph database could be a better fit. I cannot vouch for any Django / ltree integration packages since I have never used one of them. The trees I encounter most of the time are at most a few hundred nodes large so it's no question that the recursive CTE approach is the right one. For (much) larger trees I don't know, but Disqus has certainly also had much success using recursive CTEs for their threaded comments.

Btw, if you know how to do what you want in raw SQL, don't fear dropping down to it if the ORM isn't expressive enough or if there are no libraries around. If you don't know how to do it in raw SQL, maybe first try researching that just to get a feel for the problem and maybe discover ways to slightly alter the requirements to make the implementation more straightforward.

Thanks again for your blazing fast responses!!

No problem, it's certainly an interesting problem!

@rhomboss
Copy link
Contributor

rhomboss commented Jun 4, 2024

The new .tree_filter() methods are not flexible enough on their own to exclude ancestors. In order to do that you would need to modify tree-queries' custom compiler whose default behavior is to start a tree from root nodes that have no parents:

__tree (
{tree_fields_names}
"tree_depth",
"tree_path",
"tree_ordering",
"tree_pk"
) AS (
SELECT
{tree_fields_initial}
0,
array[T.{pk}],
array[T.rank_order],
T."{pk}"
FROM __rank_table T
WHERE T."{parent}" IS NULL

To exclude ancestors you will need to specify in the SQL which nodes you want to be treated as root nodes. The simplest way to do this within the framework of tree-queries is to change the above WHERE statement to filter on a set of primary keys rather than if a node has a null parent. You could then apply something like .extra(where=["tree_depth <= 2"]) to remove any children based on depth.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants