Skip to content

Improve attribute checking #11554

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

FreekVandeursen
Copy link

On a store with a large number of attribute sets, a lot of repeated checking is done for the same attributes.

Description

Instead of repeating the checks every time, keep track of the attributes that have already been checked and should be skipped (invisible). This we way we don't have to do the same check for the next attribute set.

Fixed Issues (if relevant)

Manual testing scenarios

  1. Create a store with a large number of attribute sets, and add some invisible attributes in those sets
  2. Upload a csv file with products to import, and click the Check Data button

With this change the checking phase will be much faster than without the change, without any functional change.

Contribution checklist

  • Pull request has a meaningful description of its purpose
  • All commits are accompanied by meaningful commit messages
  • All new or changed code is covered with unit/integration tests (if applicable)
  • All automated tests passed successfully (all builds on Travis CI are green)

On a store with a large number of attribute sets, a lot of repeated checking is done for the same attributes.
So instead we can keep track of the attributes we already checked, and skip them the next time.
@magento-cicd2
Copy link
Contributor

magento-cicd2 commented Oct 19, 2017

CLA assistant check
All committers have signed the CLA.

@ishakhsuvarov
Copy link
Contributor

Hi @FreekVandeursen
Do you have any performance comparison numbers to see the benefit of this change?

@FreekVandeursen
Copy link
Author

Hi @ishakhsuvarov,

I'll need to do some work to reproduce the situation in which I noticed the performance problems here. So I cannot give you exact numbers immediately. But to give you a rough estimate: on a product catalog with around 500 attribute sets (and no products yet), checking a csv file would take around 15 min. on my development machine, before it would even reach the row validation. With this change, it was under a minute.

But if you're interested, I can reproduce the same situation to give some more exact numbers.

@ishakhsuvarov
Copy link
Contributor

@FreekVandeursen
Thank you for a quick answer. It is important to us to evaluate the fix from the performance standpoint as well.
Precise scenario you used for testing would be helpful.

@FreekVandeursen
Copy link
Author

I'll try to reproduce the previous scenario tomorrow and give you an update on the results @ishakhsuvarov

@ishakhsuvarov ishakhsuvarov self-assigned this Oct 20, 2017
@FreekVandeursen
Copy link
Author

FreekVandeursen commented Oct 22, 2017

Hey @ishakhsuvarov , I've set up a scenario on my laptop to reproduce this issue.
I've created 230 attributes, and 580 attribute sets containing various combinations of these attributes, resulting in almost 58000 entries in the eav_entity_attribute.
On my development machine, this results in the Check Data step in the product import taking between 18 and 19 minutes with core code. With the change in this pull request, the that goes down to only 30 seconds.

The csv I used to test this only contained a single row, with 2 columns. So the big delay is in the initialization phase of the checking, not in the actual row validation itself.

@FreekVandeursen
Copy link
Author

Good morning @ishakhsuvarov, Do you need any additional information from me regarding this issue, or is this enough to test/review/merge the pull request?

@ishakhsuvarov
Copy link
Contributor

@FreekVandeursen Thank you for the update. We are going to continue with evaluation and let you know if we have any additional questions.

@FreekVandeursen
Copy link
Author

@ishakhsuvarov @okorshenko Any news about this issue? Is there anything preventing the testing and merging of this?

@okorshenko okorshenko modified the milestones: January 2018, February 2018 Feb 7, 2018
@AVoskoboinikov
Copy link
Contributor

Hi @FreekVandeursen
Thanks for your improvement, It looks like very promising but I am concerned with one question.
Lets assume we have 3 attribute sets with next attribute ids:
AttrSet#1 => [1, 2]
AttrSet#2 => [2, 3]
AttrSet#3 => [1, 3]

Looks like after your changes AttrSet#3 will not be added to $this->_attributes (via method _addAttributeParams()) because $unknownAttributeIds will be empty. I think this could be a functional issue.

Have you checked this case? Let me know if you need more details from me.

@okorshenko okorshenko modified the milestones: February 2018, March 2018 Mar 1, 2018
@FreekVandeursen
Copy link
Author

hi @AVoskoboinikov ,
Thanks for the review. Your comment seem to be valid, that it does break some functionality.
I think it can be easily fixed though by moving the _addAttributeParams() call from the attachAttributesById function to the _initAttributes function. I'll test that and push a fix if it works.

@FreekVandeursen
Copy link
Author

hey @AVoskoboinikov ,
On closer inspection, it seems there is no problem at all. The foreach in line 297 will call the _addAttributeParams() anyway. I have tested with the scenario that you mentioned, and the $this->_attributes array is filled normally for all attribute sets.

@AVoskoboinikov
Copy link
Contributor

Hi @FreekVandeursen
Thanks for a great job!
@magento-engcom-team, can you help with this PR?

@okorshenko okorshenko modified the milestones: March 2018, April 2018 Apr 16, 2018
…to2ce into performance/import-attribute-validation
@ishakhsuvarov
Copy link
Contributor

@FreekVandeursen Unfortunately there are still some integration tests failing on this Pull Request. Could you please check those?

@FreekVandeursen
Copy link
Author

Hi @ishakhsuvarov . Is there still any serious interest in this change? See the history above. Every time I fix or answer something, the thread then sits waiting for a couple of months. If that is how it will continue to go I would prefer to just close it and forget about it. But if there's still serious interest I will of course see why the tests fail although they passed before.

@ishakhsuvarov
Copy link
Contributor

Hi @FreekVandeursen
There is always an interest in collaboration. Unfortunately this PR involved some more in-depth performance evaluation than usual, therefore also different flows for processing on our side. I assume it is now resolved and we can continue in a faster pace.
Please let us know if you wish to continue and we will try and speed up things, hopefully.

Thank you

@FreekVandeursen
Copy link
Author

No problem @ishakhsuvarov , I will take a look at it to see what needs to be fixed. It might take me a couple of days before I have time for that though.

@okorshenko okorshenko modified the milestones: April 2018, May 2018 May 18, 2018
@okorshenko okorshenko removed this from the May 2018 milestone May 31, 2018
@magento-engcom-team magento-engcom-team added this to the Release: 2.2.6 milestone Jun 5, 2018
@VladimirZaets VladimirZaets self-assigned this Jun 20, 2018
@ishakhsuvarov ishakhsuvarov removed their assignment Jul 12, 2018
@magento-engcom-team magento-engcom-team merged commit a961726 into magento:2.2-develop Jul 13, 2018
@magento-engcom-team
Copy link
Contributor

Hi @FreekVandeursen. Thank you for your contribution.
We will aim to release these changes as part of 2.2.6.
Please check the release notes for final confirmation.

Please, consider to port this solution to 2.3 release line.
You may use Porting tool to port commits automatically.

@ishakhsuvarov
Copy link
Contributor

Hi @FreekVandeursen
Thanks for your patience.

@FreekVandeursen FreekVandeursen deleted the performance/import-attribute-validation branch July 13, 2018 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants