Speed extremely slow with populating relation dropdown (several minutes) #5920
Comments
I can share the browser network traffic, but it goes on for ages - 10s of
thousands of requests. I'll try to get this for you. It's definitely not
caching.
I can provide you with a cloned repository for you to try this on, as well.
…On Fri, 12 Nov 2021 at 14:52, Erez Rokah ***@***.***> wrote:
Hi @delwin <https://github.com/delwin>, related to #4635
<#4635>
There seems to be no caching of values between drop-downs despite pulling
from the same source:
Initial load can be slow as it needs to load all entries from the related
connection. Once entries are loaded they should be cached, so that's
definitely something we should look into.
I also experience GitLab to be slower than GitHub (I don't have official
benchmarks though).
I'll try to create a test repo to simulate this case.
Can you share the browser network traffic (by opening the browser
developer tools and going to the network tab)? The browser should indicate
if requests are cached or not. For example this is how Chrome shows cached
requests:
[image: image]
<https://user-images.githubusercontent.com/26760571/141477653-954681f8-5b23-42a3-9799-21e7f5e3dbf9.png>
|
The final 43k requests took almost 6 minutes to "finish" (many with a 429 response from gitlab). Once done I am not getting further requests when using a new dropdown to select a post except for the following requests (after clearing the network traffic from the console, to ensure I get the right requests): |
Thanks @delwin, that's helpful - it seems you're hitting GitLab's rate limits. I'll set up a reproduction and see if we can be more efficient. I'll try to post some answers next week. |
Thanks Erez
…On Fri, 12 Nov 2021 at 17:01, Erez Rokah ***@***.***> wrote:
Thanks @delwin <https://github.com/delwin>, that's helpful - it seems
you're hitting GitLab's rate limits.
Looks like these changed not so long ago:
https://docs.gitlab.com/ee/user/gitlab_com/index.html#gitlabcom-specific-rate-limits
I'll set up a reproduction and see if we can be more efficient. I'll try
to post some answers next week.
|
Ok, so I finally was able to dig into this. What you're seeing (with the 429 response) is GitLab rate limiting the initial load. The CMS identifies this situation and uses a backoff mechanism to handle it (basically slows everything down). This is how it looks in local storage: The requests you're seeing here are used by the CMS to calculate the difference between the local version and the remote version. There isn't a simple solution for this, as the relation widget needs to load all files to create the relation. Please let me know if that helps. |
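To picture the backoff behaviour described in the comment above, here is a generic exponential-backoff sketch. It is not the code from netlify-cms-lib-util; the names `backoffDelayMs` and `requestWithBackoff`, the base delay, the cap, and the retry count are all assumptions chosen for illustration.

```typescript
// Generic exponential backoff sketch (hypothetical; not the CMS implementation).
interface MinimalResponse {
  status: number;
}

// Delay before retry number `attempt` (0-based), doubling each time up to `maxMs`.
// The default values are illustrative assumptions.
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Repeats `doRequest` while the server answers 429, waiting longer each time.
// After `maxRetries` retries the last response is returned as-is.
async function requestWithBackoff<T extends MinimalResponse>(
  doRequest: () => Promise<T>,
  maxRetries: number = 5,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const response = await doRequest();
    if (response.status !== 429 || attempt >= maxRetries) {
      return response;
    }
    const delay = backoffDelayMs(attempt);
    await new Promise<void>(resolve => setTimeout(resolve, delay));
  }
}
```

With tens of thousands of requests in flight, every 429 adds a growing pause, which is consistent with the "slows everything down" effect reported in this thread.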
Hi Erez,
I have tried understanding the couple options you sent, particularly the
one of using a file collection with a list widget. I've looked through and
tried to figure out what you might mean by trying a number of things,
without being able to get anywhere. But maybe you can explain what you
mean, more in depth?
We have a folder collection because editors of the site create new items in
it all the time. I thought maybe you were referencing a new type of file
collection, or making a file collection dynamically from a folder
collection from which we can then select, but I'm not finding a way to make
this work.
So can you expand on that possibility?
We switched to GitLab because on GitHub we had one of two problems
(depending on the version of CMS we pegged things to - a pre-GraphQL
version or the latest version). One issue was the same 429 rate limit, but
in a much more severe form, not allowing editors to proceed in any way.
That was with GraphQL not enabled. The other issue was with the Media
library, which didn't work on GitHub, but worked on Gitlab when allowing
the latest (GraphQL-allowed) CMS versions to run. With those two crippling
errors, we had to switch to GitLab, which, except for this one page, allows
us to function very well.
Delwin
…On Thu, 18 Nov 2021 at 21:06, Erez Rokah ***@***.***> wrote:
The final 43k requests took almost 6 minutes to "finish" (many with a 429
response from gitlab). Once done I am not getting further requests when
using a new dropdown to select a post except for the following requests
(after clearing the network traffic from the console, to ensure I get the
right requests):
Ok, so I finally was able to dig into this.
I created a test repo here
<https://gitlab.com/erezrokah/netlify-cms-reproductions/-/tree/netlify_cms/issue_5920>.
It has a collection referencing another collection that has 500 entries.
What you're seeing (with the 429 response) is GitLab rate limiting the
initial load. The CMS identifies this situation and uses a backoff
<https://github.com/netlify/netlify-cms/blob/05e7629cf413b8fc3cfa9ee2b15b6f5ef09e549c/packages/netlify-cms-lib-util/src/API.ts#L80>
mechanism to handle it (basically slows everything down).
After the initial load, not only does the CMS cache the requests using the
browser IndexedDB (not the built-in cache mechanism my initial image shows
<#5920 (comment)>),
it also saves a local representation
<https://github.com/netlify/netlify-cms/blob/05e7629cf413b8fc3cfa9ee2b15b6f5ef09e549c/packages/netlify-cms-lib-util/src/implementation.ts#L511>
of the git tree for that collection.
That means, that in further attempts to retrieve the collection the CMS
only retrieves the difference (the changes to the folder) instead of
retrieving all the files. The reason for this, is that GitLab paginates
listing of files, so without this optimization we'll need to retrieve all
pages (in comparison GitHub lets you list 100K files in a single request).
This is how it looks in local storage:
[image: image]
<https://user-images.githubusercontent.com/26760571/142483748-e3490f17-620b-4b26-b86e-1d5f4e93cc9b.png>
The requests you're seeing here
<#5920 (comment)>
are used by the CMS to calculate the difference between the local version
and the remote version.
There isn't a simple solution for this, as the relation widget needs to
load all files to create the relation.
One possible option is to use a file collection with a list widget and
reference items in the list (see more here
<https://www.netlifycms.org/docs/widgets/#relation>), instead of a folder
collection with multiple entries.
Another option is to reach out to GitLab to see if the rate limits are
configurable, or to see if they can provide a way to retrieve multiple
files in a single request. This might be possible via the GraphQL API
<https://docs.gitlab.com/ee/api/graphql/>, but will require a lot of
effort.
Please let me know if that helps.
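The "only retrieve the difference" idea above can be sketched as a comparison between a locally stored tree and a fresh remote listing. This is a minimal sketch under the assumption that a tree is a map from file path to blob id; the `Tree` shape and the `diffTrees` name are hypothetical, and the CMS's real implementation may work differently.

```typescript
// Assumed shape: file path -> blob id (hypothetical, for illustration only).
type Tree = Record<string, string>;

interface TreeDiff {
  added: string[];   // present remotely, missing locally
  changed: string[]; // present in both, but with a different blob id
  removed: string[]; // present locally, missing remotely
}

// Compare the cached tree with the remote one so that only files listed in
// `added` and `changed` have to be downloaded again.
function diffTrees(local: Tree, remote: Tree): TreeDiff {
  const added: string[] = [];
  const changed: string[] = [];
  const removed: string[] = [];
  for (const path of Object.keys(remote)) {
    if (!(path in local)) {
      added.push(path);
    } else if (local[path] !== remote[path]) {
      changed.push(path);
    }
  }
  for (const path of Object.keys(local)) {
    if (!(path in remote)) {
      removed.push(path);
    }
  }
  return { added, changed, removed };
}
```

For a collection where only one entry was edited since the last visit, such a diff would ask for a single file instead of the whole folder.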
|
I have done some more testing, re-creating the repository again on Github so that the GraphQL API can be used, to check to see how this helps. The response time to the editor is just as abysmal, to the point of being unusable. The number of requests made to Github is definitely vastly reduced (to around 800 from around 40000 without GraphQL on Gitlab), but this doesn't seem to help with the responsiveness. Other notes:
When going to the Collections list page first I notice that a caching message is indicated the first time going to the collections list. It seems that perhaps the entries are only cached when listing them in the Collections screen(??) When doing that I get an initial "This may take several minutes" message, and when scrolling down through each "page" of entries, a "Loading Entries..." message that takes a minute before being replaced with another page of the entries, finally taking several minutes before I'm able to get to the bottom of the list of 150 items. However, even when these entries are cached for the Collections list screen, that cache does not seem to be used in the drop downs (the relationship widget), as using them even after the caching in the list still takes 5 minutes to populate. So this is no better on Github with GraphQL than it is on Gitlab (where I guess GraphQL is not yet supported by Netlify CMS). Delwin |
Hi @delwin and sorry for the very late response, I was out of office for a week+. For the file collection suggestion - the config will look like:
collections:
  - name: posts
    label: Posts
    label_singular: 'Post'
    folder: content/posts
    create: true
    fields:
      - label: Title
        name: title
        widget: string
      - label: 'Featured Stories⁺'
        singular_label: 'Featured Story'
        hint: '⁺will be displayed in the order provided above.'
        name: 'featured_stories'
        widget: list
        ui: fields
        required: true
        allow_add: true
        fields:
          - label: 'Story'
            name: 'story'
            widget: 'relation'
            collection: 'stories'
            file: 'stories'
            search_fields: ['stories.*.title']
            value_field: 'stories.*.title'
            display_fields: ['stories.*.title']
  - name: stories
    label: Stories
    files:
      - file: content/stories/stories.json
        name: stories
        label: Stories
        fields:
          - label: Stories
            name: stories
            widget: list
            fields:
              - label: Title
                name: title
Notice the wildcard
For differences between GitHub and GitLab when reaching rate limits - GitHub has a 1 hour window, so if you reach the limit you'd have to wait up to 1 hour for that window to reset. As for GitHub with GraphQL - let me look into the issues, but please consider the CMS doesn't currently take full advantage of the GraphQL API. I'll port my test repo to GitHub and see if it works as expected with/without GraphQL and also check the media library issue, |
Thanks @erezrokah So that's what I thought you might mean with the file collection, but that means that every time a story is added, deleted or has its title changed (we don't use title for the relationship field because of this), we'd have to manually update the stories.json file, which isn't feasible for a manual process without errors and seems like an unwieldy process. It is something I have considered doing, having a server process running regularly to generate such a stories.json file and commit its changes to the repository, but it really doesn't seem like an ideal solution. Or am I missing something in your suggestion? |
Hey @erezrokah, As I indicated we already don't use the title for the relationship field, and we do have an id widget that is used for that purpose. The issue I raise is that in order to make this sort of "solution" work, the stories.json file needs to be manually updated every time a story is added, deleted, or a title changed (so that the title being searched appears correctly). This cannot happen within the CMS workflow, and would need to be done outside of the CMS, so it seems very unwieldy. |
Hi, we have the same issue with Gitlab. We have an object consisting of an object and a list of objects which have the
(Edited for clarification) |
Yes, I get the impression more and more that Netlify CMS was a fun experiment that had some hope, but really only works well for small sites with very little structure, unfortunately. The reliance on git was a cool idea, but is obviously not very scalable. |
@erezrokah we are hosting our own Gitlab instance. There are no Gitlab API limits set for authenticated users. Any other Gitlab settings I might tweak? Thanks a lot! |
Hi @delwin and @lorenzode 👋 @lorenzode for the performance issues with GitLab, in order to figure out the relation the CMS needs to read all the files. This is especially slow on initial load (before caching kicks in). The only way I can think of to improve this is to use the GraphQL API. We're currently using the GitLab REST API, which means we need to list all files (paginating through them 100 at a time), and then read each file in a separate request. Let's say you have 1000 files in a collection: we'll need 10 requests just to list them (those need to happen serially) and then another 1000 to read the content of the files (those are done in parallel, but still...). @delwin I'm not sure I understand the need to sync the |
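The request arithmetic in the comment above (100 files per listing page, one read per file) can be written out as a small estimate. The function name `estimateRestRequests` and the result shape are illustrative assumptions; only the page size of 100 and the 1000-file example come from the thread.

```typescript
// Page size of the file listing, as stated in the comment above.
const PAGE_SIZE = 100;

interface RequestEstimate {
  listRequests: number; // serial: one per page of the file listing
  readRequests: number; // parallel: one per file
  total: number;
}

// Rough count of REST requests needed to load a whole collection once.
function estimateRestRequests(fileCount: number, pageSize: number = PAGE_SIZE): RequestEstimate {
  const listRequests = Math.ceil(fileCount / pageSize);
  const readRequests = fileCount;
  return { listRequests, readRequests, total: listRequests + readRequests };
}

// The 1000-file example from the comment: 10 listing requests plus 1000 reads.
const example = estimateRestRequests(1000);
console.log(example.listRequests, example.readRequests, example.total);
```

The estimate covers a single uncached load of one collection; a page with several relation fields pointing at different collections would multiply it accordingly.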
I'm seeing this happen in an environment with only 30 or so files, but many relation fields, the number of requests doesn't seem to stop. Keeps crashing chrome. |
@erezrokah I will explain the need for a synchronization by starting with a more complete structure of the site, although I believe you have seen the complete structure before with a different bug:
So the issue is that the Stories are and must be in a folder collection, since they produce web pages. We cannot transform the Stories to a file collection because they are not simply used in a list on one or two pages. This means that if we want to use a file collection to make the Relationship Widget more performant, we would need to have it created on-the-fly whenever the Folder collection is updated (particularly when a new Story is added, one is deleted or a title is updated). The File collection cannot functionally replace the Folder collection |
Thanks @delwin, that makes sense. Using a file collection will require changing the way pages are generated during the build. The function to optimize would be this one: It's a bit complex at the moment since it does what I described in #5920 (comment) (caching each collection locally and only getting a diff). If we can list files and get their content using a GraphQL API query, that should boost performance quite a bit, and it could be a drop-in replacement for that function. I wonder if at some point we'll hit these limits, but it's worth a try. |
Hey @erezrokah, I have now made a couple of file collections that mirror the folder collections so that we can use and test that. BTW, we use hexo as a site generator, and I have created a small hexo generator plugin to generate the file collection files we would need (one for each type of article - only one of which I've mentioned in this thread). I have thus created file collections for these relationships to use, mirroring the folder collections, but only containing the fields necessary for the relationship widget (Title, ID). Note: This is not yet a complete solution, as I have to manually commit the changes to these file collections to git for them to be picked up. The Stories file collection contains the ~150 stories. Using this configuration, where the relationship field picks up values from a file collection instead of the folder collection, is quite surprisingly no faster when the editor is first loaded for a page (the relationship widgets aren't yet cached). Once the select widget is clicked in and typing is started to search for an article, it still takes several minutes to load the widget with the choices. Once these several minutes have passed, searches only take several seconds (2-3 at best, 5 at worst). This still seems exceedingly slow for a process where only 150 titles are searched for possible matches, so I'm not sure what the widget is doing. This time is consistent across all such widgets on the page (remember the relationship widget is in a list widget, and there are at least two of these lists using the same relationships on this page). Once the editor page is exited, however, going back to the editor for the page starts everything all over - needing several minutes to load data the first time a relationship widget is activated, then several seconds after that.
So there seems to be no caching of a relationship widget that uses identical collections and search conditions between page edits: the cache is rebuilt every time a page is edited, and the time to construct it is exceedingly long considering it now only needs to load one file to get the information it needs (150 records, and about 26kB in size). BTW, I have also made sure that there is no preview configured for the page, in case the preview was adding any time to it. So the "preview" is the default preview, listing values (in this case IDs). Here is the config of the file collection and the corresponding generated collection: Definition of the file collection:
Use of the file collection in a page:
story_index.json:
|
Hi @delwin, thank you for the additional information. I'm currently looking into GitLab's GraphQL API to see if it provides better performance, see #6034 (comment). I'll update once/when I make some progress. |
Thanks @erezrokah ! |
Hello 👋 I currently have a draft PR that uses the GitLab GraphQL API to retrieve 50 files at a time (to avoid hitting the query complexity limit). There are still improvements to be made (like saving to a local cache instead of an in-memory cache), but it should help speed up initial load quite a bit. Also, it will require some more testing to see if query complexity is impacted by the content size of files (my test repo has small files at the moment). It would be great to get some early feedback on this, so if anyone would like to try it out at this stage, you can follow our contributing guide. See https://github.com/netlify/netlify-cms/blob/782e87c48a14937fcc7167ae5a7960a692e8054c/CONTRIBUTING.md#debugging You'll need to have a similar config to this:
|
I'm going to try take a look at this in the next weeks. Thanks @erezrokah |
Initial support for the GitLab GraphQL API is released in Going to keep the issue open to get some feedback. |
Thanks so much! We will test this in the new year and give feedback here.
On 28.12.2021 at 12:51, Erez Rokah ***@***.***> wrote:
Reopened #5920.
|
Hi @erezrokah . Happy New Year! So we did test this, even so much as having the team of editors try it on the live administration. Unfortunately enabling graphql caused
|
We tried 2.10.183 on an instance where we have this structure: List > Object > List > Objects with relation field. We host our own Gitlab. Each relation points to 50 to 100 files. I definitely see an improvement in speed when loading the form although there is still some loading time when revealing the last object widget and until the relations are displayed in the relation widget. It's not snappy but it makes this form usable for us again. Let me know if you want me to supply additional debugging information. Many thanks! |
Hi everyone, sorry for the delay. This issue got pushed back due to other priorities.
This is probably related to us hitting the query complexity limit. We'll need to add some error handling and fall back to retrieving fewer entries (the failure is probably reflected in the query response in the browser network tab). I'll try to improve it by the end of this week. |
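The fallback described above (retrieve files in batches, and retry with fewer entries when a query is rejected) could look roughly like the following. This is a sketch under stated assumptions, not the PR's code: `fetchBatch` is a hypothetical stand-in for one GraphQL query returning several file contents, and it is assumed to reject when the query is too complex. The default of 50 mirrors the batch size mentioned earlier in the thread.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size < 1) {
    throw new Error('size must be at least 1');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

type FileContents = Record<string, string>; // path -> file content
// Hypothetical: one query for several files, rejecting if it is too complex.
type FetchBatch = (paths: string[]) => Promise<FileContents>;

// Fetch all paths batch by batch; a failing batch is retried in halves
// until it succeeds or a single-file batch fails.
async function fetchAllFiles(
  paths: string[],
  fetchBatch: FetchBatch,
  batchSize: number = 50,
): Promise<FileContents> {
  const contents: FileContents = {};
  for (const batch of chunk(paths, batchSize)) {
    try {
      Object.assign(contents, await fetchBatch(batch));
    } catch (error) {
      if (batch.length === 1) {
        throw error; // cannot shrink the batch any further
      }
      const half = Math.ceil(batch.length / 2);
      Object.assign(contents, await fetchAllFiles(batch, fetchBatch, half));
    }
  }
  return contents;
}
```

Halving only the failing batch keeps the request count low for collections with small files while still making progress when a few large files push a query over the limit.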
@erezrokah might this be a related problem? #6410 |
Still having this issue as well |
We still find this a crippling problem for editors, taking minutes (tens of minutes) for relationship fields (relation widgets, displayed as a type of custom drop-down) to populate in trying to select a post in a page. We have tried things like paring down the git repository, ensuring that editors have good amounts of memory for caching, etc. In the most recent trials I have done as a developer who sees the issues on my machine, too, I have found that the issue is no longer requests being made from the browser on the initial use of a relationship field. In fact, I have always told the editors to give the CMS a few minutes (10-15 or more) to populate the cache after the first time they connect, so that all data can be cached properly. This seems to happen. Once the cache is populated, however, the drop-down relation widget still takes 10-15 minutes to populate (or more, depending on computer and browser) the first time one is used (remember, we use relation widgets as part of a list widget, so adding another relation to the list after one has already been added successfully works much faster). The second time a relation is added to the list and a portion of title typed, the drop-down populates within 5-10 seconds. However, using the same template, or coming back to the same page within the same editing session, the initial population time returns to 10-15 minutes or more. During all of this drop-down (relation widget) population time, there is almost no network traffic. None, in fact, other than an occasional "user" ping once every 5 minutes. So this population time is all some sort of JavaScript processing, completely unrelated to Gitlab network traffic and responses. |
There is a possible solution for this problem. We created our own relation widget that only uses the slug as search, display and value field. It is basically a select widget that loads the options by listing all files from a folder collection. Unfortunately, there is no function provided in the props to list all files from a directory in the git repository, so I called the GitLab API directly. We have 1400+ files in the referenced collection and it works within a few seconds. Warning: This code only works for GitLab SaaS but it can be adapted to other backends.
simple_relation/SimpleRelationControl.tsx
simple_relation/schema.ts
cms.ts
usage in config.yml
|
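The code for the widget above did not survive in this thread. As a rough idea of the approach it describes (listing a folder through the GitLab API and using file names as slugs), here is a hypothetical sketch; it is not the commenter's `SimpleRelationControl.tsx`. It assumes GitLab's REST repository tree endpoint with offset pagination (`page`/`per_page`) on gitlab.com and a token accepted as a Bearer header; `treeUrl`, `toSlugOptions`, and `listSlugs` are invented names.

```typescript
// Subset of the fields returned for each repository tree entry.
interface TreeEntry {
  name: string;
  path: string;
  type: 'blob' | 'tree';
}

const PER_PAGE = 100;

// URL for one page of the repository tree listing of `folder` at `ref`.
function treeUrl(project: string, folder: string, ref: string, page: number): string {
  const query = [
    `path=${encodeURIComponent(folder)}`,
    `ref=${encodeURIComponent(ref)}`,
    `per_page=${PER_PAGE}`,
    `page=${page}`,
  ].join('&');
  return `https://gitlab.com/api/v4/projects/${encodeURIComponent(project)}/repository/tree?${query}`;
}

// Keep files only and use the file name without its extension as the slug.
function toSlugOptions(entries: TreeEntry[]): string[] {
  return entries
    .filter(entry => entry.type === 'blob')
    .map(entry => entry.name.replace(/\.[^.]+$/, ''))
    .sort();
}

// Walk the listing page by page until a page comes back short.
async function listSlugs(project: string, folder: string, ref: string, token: string): Promise<string[]> {
  const entries: TreeEntry[] = [];
  for (let page = 1; ; page++) {
    const response = await fetch(treeUrl(project, folder, ref, page), {
      headers: { Authorization: `Bearer ${token}` },
    });
    if (!response.ok) {
      throw new Error(`GitLab responded with ${response.status}`);
    }
    const pageEntries = (await response.json()) as TreeEntry[];
    entries.push(...pageEntries);
    if (pageEntries.length < PER_PAGE) {
      break;
    }
  }
  return toSlugOptions(entries);
}
```

Because only the listing is requested and no file contents are read, 1400 files would need about 15 requests in this sketch, which fits the "within a few seconds" observation; the trade-off is that only the slug is available for search and display.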
Describe the bug
We have a template that uses four different list fields - each one allowing the administrator to specify any number of items from a collection to be displayed on a page. The list is defined with a relation field widget, similar to the code below. When there were fewer than 30 items in the respective collections and fewer than 5 or so items selected for each of the four lists this was slow, but acceptable (taking a second or two to populate the relation field dropdown). With 140 and 60 items in each of the referenced collections (stories and news), and 77 selected from one collection and 8 from the smaller collection the waiting time for drop-down population seems to be on the order of several minutes, and this population takes place every time a new item is added to the list, or an item's relation drop-down is clicked to edit it. There seems to be no caching of values between drop-downs despite pulling from the same source:
To Reproduce
Expected behavior
I would expect that the performance for selecting in this sort of situation be good enough at least through a thousand or more items in the collection and at least a few hundred or more items selected from that collection
Applicable Versions:
CMS configuration
See relevant snippet, above
Additional context
If required, we can provide a copy of the repository with all config and collection files for analysis to a Netlify CMS engineer (not for direct public release)