Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datahub:index:rebuild configName indexes unrelated objects #57

Open
NiklasBr opened this issue May 30, 2024 · 7 comments
Open

datahub:index:rebuild configName indexes unrelated objects #57

NiklasBr opened this issue May 30, 2024 · 7 comments

Comments

@NiklasBr
Copy link

Pimcore version(s) affected

11.2.3

Description

When running the command bin/console datahub:index:rebuild mycomfigname I expect it to re-index DataObjets only in the given config.

Instead it also indexes unrelated DataObjects.

How to reproduce

Example config name:

pimcore_data_hub:
    configurations:
        ci_hub_1:
            general:
                active: 'on'
                type: ciHub
                name: ci_hub_1
                description: ''
                modificationDate: 1716993974
                path: null
                createDate: 1715938788
            schema:
                assets:
                    enabled: 'on'
                    allowOriginalImage: 'on'
                    thumbnails:
                        - card
                        - detail_print
                dataObjectClasses:
                    -
                        id: Product
                        name: Product
                        columnConfig:
                            id:
                                name: id
                                position: 1
                                hidden: false
                                fieldConfig:
                                    key: id
                                    label: ID
                                    type: system
                            articleNumber:
                                name: articleNumber
                                position: 2
                                hidden: false
                                fieldConfig:
                                    key: articleNumber
                                    label: 'Article number (articleNumber)'
                                    type: input
                            GTIN:
                                name: GTIN
                                position: 3
                                hidden: false
                                fieldConfig:
                                    key: GTIN
                                    label: 'GTIN (13 tall) (GTIN)'
                                    type: input
                            productNumber:
                                name: productNumber
                                position: 4
                                hidden: false
                                fieldConfig:
                                    key: productNumber
                                    label: 'Product number (productNumber)'
                                    type: input
                            productDescription:
                                name: productDescription
                                position: 5
                                hidden: false
                                fieldConfig:
                                    key: productDescription
                                    label: 'Product name (productDescription)'
                                    type: input
                            description:
                                name: description
                                position: 6
                                hidden: false
                                fieldConfig:
                                    key: description
                                    label: 'Product description'
                                    type: wysiwyg
                            urlSlug:
                                name: urlSlug
                                position: 7
                                hidden: false
                                fieldConfig:
                                    key: urlSlug
                                    label: 'URL Slug (urlSlug)'
                                    type: urlSlug
                            available:
                                name: available
                                position: 8
                                hidden: false
                                fieldConfig:
                                    key: available
                                    label: 'Available in market'
                                    type: checkbox
                        language: en
            labelSettings:
                -
                    id: system.id
                -
                    id: system.key
                -
                    id: system.fullPath
                -
                    id: system.parentId
                -
                    id: system.type
                -
                    id: system.subtype
                -
                    id: system.hasChildren
                -
                    id: system.creationDate
                -
                    id: system.modificationDate
                -
                    id: system.locked
                -
                    id: system.versionCount
                -
                    id: system.currentVersion
                -
                    id: system.checksum
                -
                    id: system.mimeType
                -
                    id: system.fileSize
                -
                    id: binaryData.original
                -
                    id: binaryData.card
                -
                    id: binaryData.detail_print
                -
                    id: binaryData.galleryThumbnail
                -
                    id: dimensionData.width
                -
                    id: dimensionData.height
            workspaces: {  }
            permissions:
                user: {  }
                role: {  }

Then run bin/console datahub:index:rebuild ci_hub_1

Possible Solution

No response

Additional Context

No response

Copy link

Thank you for reporting this problem!

This is an open source project, and we rely on the community to help us diagnose and fix issues, as it is not possible to investigate and fix every issue reported to us via GitHub.

If possible, please create a pull request that fixes the problem you describe, along with appropriate tests. All pull requests will be promptly reviewed by the BrandOriented team.

Thank you very much!

labudzinski added a commit that referenced this issue Jun 5, 2024
@labudzinski
Copy link

The method used to create the ES index has been changed to be more explicit.
Fixed deficiencies in code.

labudzinski added a commit that referenced this issue Jun 5, 2024
@NiklasBr
Copy link
Author

Seems fixed now.

@NiklasBr
Copy link
Author

Sorry but it is still an issue.

Example:

Installation has about 30 000 assets and 14 000 products (variants and objects). Yet it still indexes over 90 000 elements (which appears to match all dataobjects and all assets in total).

@NiklasBr NiklasBr reopened this Jun 11, 2024
@NiklasBr
Copy link
Author

NiklasBr commented Jun 12, 2024

@labudzinski I think I see the issue here:

if ($rebuildIndexElementMessage->configReader->isObjectIndexingEnabled()) {
$this->rebuildType($rebuildIndexElementMessage, self::TYPE_OBJECT, $hash, $todo);
}

$maxId = $this->getDb()->executeQuery("SELECT MAX(id) FROM {$type}s")->fetchNumeric()[0];

These lines re-indexes all objects in the objects table and does not take into consideration that sometimes a specific set of classes only should be indexed.

@labudzinski
Copy link

labudzinski commented Jun 13, 2024

In the IndexPersistenceService class in the update method:

if(in_array($element, $configReader->getObjectClassNames()) ||

in_array((new \ReflectionClass($this))->getShortName(), $configReader->getObjectClassNames())) {

We check whether the class is declared in the configuration.

@NiklasBr
Copy link
Author

The lines I show above cause a massive unnecessary congestion in the messenger queue by queuing elements which should not be indexed. We can improve performance and reduce strain on the logging system a lot by not doing that in the first place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants