Skip to content

Add nestedRowsLimit parameter to DisplayConfiguration #221

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 23, 2023

Conversation

pacher
Copy link
Contributor

@pacher pacher commented Jan 14, 2023

By default DataFrame<T>.toHTML() renders only first 20 rows of the dataset. We can override it by supplying a DisplayConfiguration object with desired rowsLimit.

However if dataset contains FrameColumn those nested datasets are rendered with hard-coded 5 rows limit and there is no way of changing it.

This PR adds nestedRowsLimit parameter to DisplayConfiguration used to limit rendered rows number of nested datasets in FrameColumn.
Also it allows to use Int.MAX_VALUE as a sentinel value to turn the limit off in order to render complete dataset.

@@ -121,7 +121,7 @@ internal fun AnyFrame.toHtmlData(
val queue = LinkedList<Pair<AnyFrame, Int>>()

fun AnyFrame.columnToJs(col: AnyCol, rowsLimit: Int): ColumnDataForJs {
val values = rows().take(rowsLimit)
val values = if (rowsLimit == Int.MAX_VALUE) rows() else rows().take(rowsLimit)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

take() already just returns the list if n >= size. So this change is not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rows() returns Iterable which is not a Collection. In this case inside take() there is a line

val list = ArrayList<T>(n)

which throws OutOfMemoryError: Requested array size exceeds VM limit if n=Int.MAX_VALUE
So this change is necessary to allow using Int.MAX_VALUE as a sentinel value and turn the limit off if we want to render all the data without guessing the upper limit first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can use df.count() to set the limit to the number of rows for an outer dataset to render it fully, but I can't do this for nested dataframes. So I would like to be able to use Int.MAX_VALUE or any other value to turn the limit off and just render everything there is.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, interesting find! I'd maybe add the check for if rowsLimit < 0, so people can then put in -1 if they want no limit. I think that's more common.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same thought initially so I know where you are coming from. But I am not sure that it is very common in kotlin nowadays. I can't think of any example. On the contrary, request(Int.MAX_VALUE) to disable backpressure in reactive APIs, buffer(Int.MAX_VALUE) to get unlimited buffer in kotlin coroutines etc.

Whenever I see a size-limiting API without explicit "unlimited" option, my first impulse is to use Int.MAX_VALUE which should be for sure big enough to satisfy any kind of if/while (x < limit) expression in the implementation.

Just 2 days ago you yourself expected take(Int.MAX_VALUE) to return the full collection. I did the same!! It's natural and intuitive. It works most of the time without reading the docs. And take(-1) quite expectedly throws IAE.

Special treatment of Int.MAX_VALUE might be good even if just for the sake of avoiding OOME for the users of the library.

We might want to use negative limit values for something else in the future. For example we can treat -n as size - n or last n rows instead of first n rows similar to array slicing in numpy. Or have constants for special cases like optional or conditional limits, row collapsing or whatever else people might come up with.

Currently toHtml() with negative limit will throw IAE. Which is reasonable default behavior, assuming the limit is the result of erroneous user computation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, you convinced me.
I can only think of one edge case, namely if the list you're taking from is larger than Int.MAX_VALUE, but I think you've reached OOM errors long before that. Unfortunately, there's no "Infinity" option for integers, so this is the best option.

Maybe we'll convert the rows iterable into a Collection with size to avoid this error in the future.

I will also pass on this case (Iterable.take()) to the Kotlin team and see what they think about it.

Copy link
Collaborator

@Jolanrensen Jolanrensen Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we could also just change it to rows().toList().take(rowsLimit) since it creates a list anyways. Might be the cleanest solution

Copy link
Collaborator

@Jolanrensen Jolanrensen Jan 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or... make the limit nullable? That would definitely indicate "no limit".

Edit: Okay this is actually the recommendation from the Kotlin team. Let's make it nullable :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I agree that nullable is even more intuitive and hopefully idiomatic :)

@Jolanrensen Jolanrensen added enhancement New feature or request good first issue Good issues to pick-up for newcomers labels Jan 16, 2023
@Jolanrensen Jolanrensen added this to the 0.10.0 milestone Jan 16, 2023
@Jolanrensen
Copy link
Collaborator

Good idea! :D

@pacher pacher requested a review from Jolanrensen January 16, 2023 13:46
@Jolanrensen Jolanrensen merged commit 4ff8406 into Kotlin:master Jan 23, 2023
@pacher pacher deleted the nested-rows-limit branch January 23, 2023 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good issues to pick-up for newcomers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants