Performance Improvement Proposal for queries related with the Task Table #20505

Open
peresureda opened this issue May 28, 2024 · 25 comments · Fixed by #20545
peresureda commented May 28, 2024

Is your feature request related to a problem? Please describe.
When you have the Trivy scanner activated and want to perform one of the following operations:

  • List the artifacts of a project.
  • Execute a ScanAll.

The following query is executed in the database:

SELECT * FROM task WHERE extra_attrs::jsonb @> cast( $1 as jsonb )

This query becomes very heavy on the database CPU, and in some cases Harbor times out when the number of artifacts is large.

This is because the extra_attrs column is of type json, so the containment query above cannot use an index.
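For illustration (a sketch, not output captured from a real instance), the plan for this query on the json column is a plain sequential scan, because the per-row cast to jsonb prevents any index from being used:

```
EXPLAIN SELECT * FROM task
WHERE extra_attrs::jsonb @> cast('{"report_uuids": ["<some-uuid>"]}' as jsonb);
-- Expected plan shape while extra_attrs is still json:
--   Seq Scan on task
--     Filter: ((extra_attrs)::jsonb @> '{"report_uuids": ["<some-uuid>"]}'::jsonb)
```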

Solution Proposal

Since the json and jsonb data types accept almost identical input and produce the same output, my recommendation is to change the column type to jsonb so it can be indexed:

  1. ALTER TABLE task ALTER COLUMN extra_attrs SET DATA TYPE jsonb
  2. CREATE INDEX indx_task_jsonb ON task USING GIN(extra_attrs jsonb_path_ops);

Improvements
With this improvement, we have achieved the following results:

  • Reduced artifact listing time from 45 seconds to 2 seconds.
  • Reduced scan time by 80%.
  • Reduced CPU consumption during a ScanAll from 60% to 25%.
Vad1mo (Member) commented May 28, 2024

Thank you for the improvement, this makes sense.

One question: does `ALTER TABLE task ALTER COLUMN extra_attrs SET DATA TYPE jsonb` convert existing data?

peresureda (Author) commented

Hi @Vad1mo ,

Important points:

  1. JSONB availability: PostgreSQL 9.4 (2014) added support for the jsonb data type.
  2. PostgreSQL offers two types for storing JSON data: json and jsonb. The json and jsonb data types accept almost identical sets of values as input. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution; while jsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage.

In summary: you can migrate a json column to jsonb simply by altering the table. However, you cannot get the original text back exactly, since jsonb does not store the JSON in its raw form. If you may want to revert, first make a copy of the table:
create table task_aux as select * from task;
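A minimal sketch of the migration and a possible revert path, assuming the task_aux copy created above:

```
-- Migrate the column (the json -> jsonb cast converts the existing rows).
ALTER TABLE task ALTER COLUMN extra_attrs SET DATA TYPE jsonb USING extra_attrs::jsonb;
CREATE INDEX indx_task_jsonb ON task USING GIN (extra_attrs jsonb_path_ops);

-- Possible revert path: drop the jsonb index, cast back to json (semantically
-- equivalent but not byte-identical), then restore the original raw text from
-- the task_aux copy if the exact original form matters.
DROP INDEX indx_task_jsonb;
ALTER TABLE task ALTER COLUMN extra_attrs SET DATA TYPE json USING extra_attrs::json;
UPDATE task t SET extra_attrs = a.extra_attrs FROM task_aux a WHERE a.id = t.id;
```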

flbla (Contributor) commented May 31, 2024

Hi,
We probably have the same issue.
Since we updated Harbor from 2.10.0 to 2.10.2, we see a lot of slowness when we access the artifacts page in the UI, and the same with the artifacts API.
We have also seen huge CPU usage on our Postgres database since the update.
It looks like it's due to this change: https://github.com/goharbor/harbor/pull/20169/files#diff-529262240b8fd8c4239ff9411c3c13ae66343e674ff7ce1fa4978f692079089eR118

EXPLAIN output for one of the SELECT * FROM task queries:
[screenshot: EXPLAIN output]

Difference in CPU usage between today (yellow line) and the previous week (gray dotted line):
[screenshot: CPU usage graph]

peresureda (Author) commented

Thank you for the update; now I think everything is clear. My recommendation is still to change the data type (to jsonb) instead of casting in the query. In our performance tests, changing the data type reduced CPU consumption by almost 50% compared with how it was implemented before.

flbla (Contributor) commented May 31, 2024

Hi @peresureda, we tried it on our staging env. Indeed, with the conversion to jsonb and the added index it is much faster, thank you.

@wy65701436 @Vad1mo, do you think this change could be included in a 2.10 version? Thank you.

AlenversFr commented May 31, 2024

> Hi @peresureda, we tried it on our staging env. Indeed, with the conversion to jsonb and the added index it is much faster, thank you.
>
> @wy65701436 @Vad1mo, do you think this change could be included in a 2.10 version? Thank you.

Query analysis:

Before applying the fix:

                                           Table "public.task"
     Column      |            Type             | Collation | Nullable |             Default              
-----------------+-----------------------------+-----------+----------+----------------------------------
 id              | integer                     |           | not null | nextval('task_id_seq'::regclass)
 execution_id    | integer                     |           | not null | 
 job_id          | character varying(64)       |           |          | 
 status          | character varying(16)       |           | not null | 
 status_code     | integer                     |           | not null | 
 status_revision | integer                     |           |          | 
 status_message  | text                        |           |          | 
 run_count       | integer                     |           |          | 
 extra_attrs     | jsonb                       |           |          | 
 creation_time   | timestamp without time zone |           |          | now()
 start_time      | timestamp without time zone |           |          | 
 update_time     | timestamp without time zone |           |          | 
 end_time        | timestamp without time zone |           |          | 
 vendor_type     | character varying(64)       |           | not null | 
Indexes:
    "task_pkey" PRIMARY KEY, btree (id)
    "idx_task_extra_attrs_report_uuids" gin ((extra_attrs -> 'report_uuids'::text))```


When you run a query like the one just below, the result of the EXPLAIN command indicates a seq scan. It means the index "idx_task_extra_attrs_report_uuids" defined on the task table is not used.

```
explain SELECT * FROM task WHERE extra_attrs::jsonb @> cast('{"report_uuids": ["6526b828-f6ef-4864-a27e-9b0dc38f1502"]}' as jsonb );
```

If you just apply:
```
ALTER TABLE task ALTER COLUMN extra_attrs SET DATA TYPE jsonb
```
you might think the index will be used, but that is not the case: it is still a seq scan.

After applying the new index:
```
CREATE INDEX indx_task_jsonb ON task USING GIN(extra_attrs jsonb_path_ops);
```

Here is the result of the EXPLAIN command:

```
explain SELECT * FROM task WHERE extra_attrs::jsonb @> cast('{"report_uuids": ["6526b828-f6ef-4864-a27e-9b0dc38f1502"]}' as jsonb );
                                                QUERY PLAN                                                
----------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on task  (cost=8.01..10.02 rows=1 width=253)
   Recheck Cond: (extra_attrs @> '{"report_uuids": ["6526b828-f6ef-4864-a27e-9b0dc38f1502"]}'::jsonb)
   ->  Bitmap Index Scan on indx_task_jsonb  (cost=0.00..8.01 rows=1 width=0)
         Index Cond: (extra_attrs @> '{"report_uuids": ["6526b828-f6ef-4864-a27e-9b0dc38f1502"]}'::jsonb)
(4 rows)
```

In our staging environment, here is the performance impact of the modification:
Number of rows in the task table = 13k rows
Average of the painful query before = 175 ms/query
Average of the painful query after (ALTER + INDEX) = 0.25 ms/query

When the task table is heavily loaded (more than 300k rows, like our production), the average query time is around 1 s.


Modifying the type of the column means we have to check all the internal services  of Harbor that might use this table.
It's not easy to evaluate the impact.

Here is a query that helps to validate the correct interactions between the task and execution tables:

```
WITH task_error as (
    select b.execution_id,count(b.id) as cnt from task as b where b.status='Error' group by b.vendor_type, b.execution_id
), task_success as (
    select b.execution_id,count(b.id) as cnt from task as b where b.status='Success' group by b.vendor_type, b.execution_id
), task_stopped as (
    select b.execution_id,count(b.id) as cnt from task as b where b.status='Stopped' group by b.vendor_type, b.execution_id
)
 select a.vendor_type, to_char(a.start_time,'YYYY-MM-DD'), a.status,
 count(distinct a.id) as nb_execution,
 SUM(task_error.cnt) as task_error_cnt,
 SUM(task_success.cnt) as task_success_cnt,
 SUM(task_stopped.cnt) as task_stopped_cnt
 from execution as a
 left Join task_error on task_error.execution_id=a.id
 left Join task_success on task_success.execution_id=a.id
 left Join task_stopped on task_stopped.execution_id=a.id
 group by a.vendor_type, to_char(a.start_time,'YYYY-MM-DD'), a.status
 order by to_char(a.start_time,'YYYY-MM-DD') desc limit 20;
```
 
In order to validate the performance fix, this query should indicate no regression for any execution/task dependencies.

As of now, in our staging environment the fix seems to have no negative impact, but it needs to be confirmed by the maintainers.

```
   vendor_type     |  to_char   | status  | nb_execution | task_error_cnt | task_success_cnt | task_stopped_cnt 
--------------------+------------+---------+--------------+----------------+------------------+------------------
 EXECUTION_SWEEP    | 2024-05-31 | Success |            3 |                |                3 |                 
 GARBAGE_COLLECTION | 2024-05-31 | Success |            1 |                |                1 |                 
 IMAGE_SCAN         | 2024-05-31 | Running |            7 |                |                  |                 
 REPLICATION        | 2024-05-31 | Success |           10 |                |               16 |                 
 SCAN_ALL           | 2024-05-31 | Running |            1 |                |             2082 |                 
 SCHEDULER          | 2024-05-31 | Running |            1 |                |                  |  
```
 
An easy way to test the performance issue is to create a replication that scans all the Harbor projects.
Example of a configuration with a single check on a label:
 
![image](https://github.com/goharbor/harbor/assets/57966036/ee5bec28-e14a-4202-bdd4-bd3fd0cb4a33)


The fix seems legitimate and operational; we just need confirmation from the maintainers and to wait for a new release.

stonezdj added a commit to stonezdj/harbor that referenced this issue Jun 4, 2024
…trs_report_uuids

 fixes goharbor#20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
stonezdj added a commit to stonezdj/harbor that referenced this issue Jun 4, 2024
…trs_report_uuids

  fixes goharbor#20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
stonezdj added a commit to stonezdj/harbor that referenced this issue Jun 4, 2024
…trs_report_uuids

 fixes goharbor#20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
stonezdj added a commit that referenced this issue Jun 4, 2024
…_task_extra_at… (#20546)

Adjust the query by UUID sql so that it can use the idx_task_extra_attrs_report_uuids

  fixes #20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
Vad1mo (Member) commented Jun 4, 2024

@stonezdj @wy65701436, thank you for the swift fix of the issues.

It is not clear to me why, in addition to changing the query, we cannot also change the data type for generally better performance:

ALTER TABLE task ALTER COLUMN extra_attrs SET DATA TYPE jsonb
CREATE INDEX indx_task_jsonb ON task USING GIN(extra_attrs jsonb_path_ops);

peresureda (Author) commented

We are reverting to the previous situation where we cast the object on each execution in order to use an index. Our tests show that CPU consumption decreased dramatically by avoiding this cast and converting the data type. Am I missing any place where this field must stay json and cannot be migrated to jsonb?

@stonezdj, @wy65701436?

stonezdj added a commit to stonezdj/harbor that referenced this issue Jun 4, 2024
…trs_report_uuids

 fixes goharbor#20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
Vad1mo (Member) commented Jun 4, 2024

I would also suggest transitioning to the jsonb data type; it is clearly the recommended option for dealing with JSON in Postgres, specifically for performance reasons. There are also other opportunities to switch to jsonb.

stonezdj (Contributor) commented Jun 5, 2024

The PR merged in 2.11 just fixes the index-not-used issue. For converting json to jsonb and adding more indexes, we need to do more testing to verify it.

stonezdj added a commit to stonezdj/harbor that referenced this issue Jun 5, 2024
…trs_report_uuids

 fixes goharbor#20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
stonezdj added a commit to stonezdj/harbor that referenced this issue Jun 5, 2024
…trs_report_uuids

 fixes goharbor#20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
wy65701436 (Contributor) commented Jun 5, 2024

Since we didn't encounter any performance issues prior to v2.10.1, the index on report_uuids has proven beneficial. The recent commit did not leverage this existing index, leading to performance problems.

The fix is to rewrite the SQL to ensure the DB engine utilizes the report_uuids index. This approach avoids updating the DB schema, which would require additional upgrade work and testing.

This fix will be in v2.9.5, v2.10.3 and v2.11.0.
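For illustration (a sketch with a placeholder UUID, not the exact SQL from the PR), the shape of the change is roughly:

```
-- Query shape that caused the regression: containment on the whole document.
-- It does not match the expression index on (extra_attrs::jsonb -> 'report_uuids'),
-- so it falls back to a sequential scan.
SELECT * FROM task
WHERE extra_attrs::jsonb @> cast('{"report_uuids": ["<uuid>"]}' as jsonb);

-- Rewritten shape: the WHERE clause uses the same expression as the existing
-- idx_task_extra_attrs_report_uuids GIN index, so that index can be used.
SELECT * FROM task
WHERE extra_attrs::jsonb -> 'report_uuids' @> '"<uuid>"';
```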

Vad1mo (Member) commented Jun 5, 2024

@stonezdj @wy65701436, thank you for more info, as a quick fix this is a good way forward.

However, I am quite confident that we can optimize this further.

@flbla @thcdrt @peresureda, you have rather large data sets; can you provide some data points?

This will also open up opportunities for further improvements, as we have other json data that we query.

stonezdj added a commit that referenced this issue Jun 6, 2024
…_task_extra_at… (#20552)

Adjust the query by UUID sql so that it can use the idx_task_extra_attrs_report_uuids

 fixes #20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
stonezdj added a commit that referenced this issue Jun 11, 2024
#20545)

Adjust the query by UUID sql so that it can use the idx_task_extra_attrs_report_uuids

 fixes #20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
peresureda (Author) commented

Hi @Vad1mo

This morning I did extra tests with the different scenarios discussed above, and I can provide more data:

Execution plans:

  • Applied fix https://github.com/goharbor/harbor/pull/20545 (query rewrite only, column still json):
explain SELECT * FROM task WHERE extra_attrs::jsonb -> 'report_uuids' @> '{"artifact_id":10216,"artifact_tag":"1.585.0","report_uuids":["94aed6ca-8ac4-42c8-9325-80e70bf8040f"],"robot_id":14160}'

"Bitmap Heap Scan on task  (**cost=68.01..75.71** rows=2 width=220)"
"  Recheck Cond: (((extra_attrs)::jsonb -> 'report_uuids'::text) @> '{""robot_id"": 14160, ""artifact_id"": 10216, ""artifact_tag"": ""1.585.0"", ""report_uuids"": [""94aed6ca-8ac4-42c8-9325-80e70bf8040f""]}'::jsonb)"
"  ->  Bitmap Index Scan on idx_task_extra_attrs_report_uuids  (cost=0.00..68.01 rows=2 width=0)"
"        Index Cond: (((extra_attrs)::jsonb -> 'report_uuids'::text) @> '{""robot_id"": 14160, ""artifact_id"": 10216, ""artifact_tag"": ""1.585.0"", ""report_uuids"": [""94aed6ca-8ac4-42c8-9325-80e70bf8040f""]}'::jsonb)"

  • Applied fix https://github.com/goharbor/harbor/pull/20545 + change of DATA TYPE to jsonb: this case doesn't make much sense because, even if we change the data type, if the query is the one provided by the fix the performance is the same. The way that query is written, it doesn't use the jsonb GIN index.
explain SELECT * FROM task WHERE extra_attrs::jsonb -> 'report_uuids' @> '{"artifact_id":10216,"artifact_tag":"1.585.0","report_uuids":["94aed6ca-8ac4-42c8-9325-80e70bf8040f"],"robot_id":14160}'

"Bitmap Heap Scan on task  (**cost=72.01..79.73** rows=2 width=250)"
"  Recheck Cond: ((extra_attrs -> 'report_uuids'::text) @> '{""robot_id"": 14160, ""artifact_id"": 10216, ""artifact_tag"": ""1.585.0"", ""report_uuids"": [""94aed6ca-8ac4-42c8-9325-80e70bf8040f""]}'::jsonb)"
"  ->  Bitmap Index Scan on idx_task_extra_attrs_report_uuids  (cost=0.00..72.01 rows=2 width=0)"
"        Index Cond: ((extra_attrs -> 'report_uuids'::text) @> '{""robot_id"": 14160, ""artifact_id"": 10216, ""artifact_tag"": ""1.585.0"", ""report_uuids"": [""94aed6ca-8ac4-42c8-9325-80e70bf8040f""]}'::jsonb)"

  • Proposal: looking at all the results, the real performance improvement comes from the original containment query combined with the proposed change, since it makes use of the JSONB capabilities provided by PostgreSQL. In this case, the artifact listing mentioned earlier is reduced to 2-3 seconds.
explain SELECT * FROM task WHERE extra_attrs::jsonb @> cast( '{"artifact_id":10216,"artifact_tag":"1.585.0","report_uuids":["94aed6ca-8ac4-42c8-9325-80e70bf8040f"],"robot_id":14160}' as jsonb )

"Bitmap Heap Scan on task  (cost=40.01..47.72 rows=2 width=250)"
"  Recheck Cond: (extra_attrs @> '{""robot_id"": 14160, ""artifact_id"": 10216, ""artifact_tag"": ""1.585.0"", ""report_uuids"": [""94aed6ca-8ac4-42c8-9325-80e70bf8040f""]}'::jsonb)"
"  ->  Bitmap Index Scan on indx_task_jsonb  (cost=0.00..40.01 rows=2 width=0)"
"        Index Cond: (extra_attrs @> '{""robot_id"": 14160, ""artifact_id"": 10216, ""artifact_tag"": ""1.585.0"", ""report_uuids"": [""94aed6ca-8ac4-42c8-9325-80e70bf8040f""]}'::jsonb)"

Vad1mo reopened this Jun 14, 2024
flbla (Contributor) commented Jun 14, 2024

Hi @Vad1mo,

I built a version 2.10 with the rollback of the SELECT FROM task query.
I deployed it on my staging environment: ~7K artifacts (for comparison, our prod env has around 125K artifacts).
I reset pg_stat_statements before I started my tests.
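(For reference, the Tot_Calls / Tot_Time / Avg_Time figures below are read from pg_stat_statements; a rough sketch, assuming PostgreSQL 13+ column names — older versions expose total_time / mean_time instead:)

```
-- Reset the statistics before a test run
SELECT pg_stat_statements_reset();

-- After the run: number of calls, total and mean execution time (ms) of the task queries
SELECT query, calls, total_exec_time AS tot_time, mean_exec_time AS avg_time
FROM pg_stat_statements
WHERE query LIKE '%FROM task WHERE extra_attrs%'
ORDER BY total_exec_time DESC;
```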

The size of the task table when I started my tests was 11313 rows.

| Query | Tot_Calls | Tot_Time | Avg_Time |
| --- | --- | --- | --- |
| SELECT * FROM task WHERE extra_attrs::jsonb -> $2 @> $1 | 1056 | 87428.30858300008 | 82.79195888541675 |

After I launched some "GC", "Scan All" and "Replications" actions:
Size of the task table: 10408 rows

| Query | Tot_Calls | Tot_Time | Avg_Time |
| --- | --- | --- | --- |
| SELECT * FROM task WHERE extra_attrs::jsonb -> $2 @> $1 | 10012 | 1864485.1329499988 | 186.22504324310816 |

Then I tested with the same built image + change of DATA TYPE to jsonb + added indexes + reset of pg_stat_statements:

ALTER TABLE task ALTER COLUMN extra_attrs SET DATA TYPE jsonb;
CREATE INDEX indx_task_jsonb ON task USING GIN(extra_attrs jsonb_path_ops);
CREATE INDEX idx_task_extra_attrs_report_uuids ON public.task USING gin (((extra_attrs -> 'report_uuids'::text)));
| Query | Tot_Calls | Tot_Time | Avg_Time |
| --- | --- | --- | --- |
| SELECT * FROM task WHERE extra_attrs::jsonb -> $2 @> $1 | 406 | 185.97064199999988 | 0.45805576847290613 |

After I launched some "GC", "Scan All" and "Replications" actions:
Size of the task table: 10414 rows

| Query | Tot_Calls | Tot_Time | Avg_Time |
| --- | --- | --- | --- |
| SELECT * FROM task WHERE extra_attrs::jsonb -> $2 @> $1 | 8526 | 2642.1774179999943 | 0.30989648346234977 |

As you can see, the average time:
- BEFORE the switch to jsonb + creation of the indexes, it increases with the number of actions done on Harbor.
- AFTER the switch to jsonb + creation of the indexes, it stays below 0.5 ms and does not change with the number of actions done on Harbor.

flbla (Contributor) commented Jun 14, 2024

I then switched back to the 2.10.2 version (the version with slowness), kept the jsonb type and the indexes, and reset pg_stat_statements:
Task table size: 10415 rows

| Query | Tot_Calls | Tot_Time | Avg_Time |
| --- | --- | --- | --- |
| SELECT * FROM task WHERE extra_attrs::jsonb @> cast( $1 as j | 61 | 15.206537 | 0.2492874918032787 |

After some actions on Harbor (scan all, GC, replications, ...):
Task table size: 12282 rows

| Query | Tot_Calls | Tot_Time | Avg_Time |
| --- | --- | --- | --- |
| SELECT * FROM task WHERE extra_attrs::jsonb @> cast( $1 as j | 3520 | 1545.4175580000006 | 0.4390390789772729 |
explain SELECT * FROM task WHERE extra_attrs::jsonb @> cast('{"report_uuids": ["90246945-5f4d-4b04-b57a-c89a0283b487"]}' as jsonb );
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on task  (cost=8.01..10.02 rows=1 width=254)
   Recheck Cond: (extra_attrs @> '{"report_uuids": ["90246945-5f4d-4b04-b57a-c89a0283b487"]}'::jsonb)
   ->  Bitmap Index Scan on indx_task_jsonb  (cost=0.00..8.01 rows=1 width=0)
         Index Cond: (extra_attrs @> '{"report_uuids": ["90246945-5f4d-4b04-b57a-c89a0283b487"]}'::jsonb)
(4 rows)

flbla (Contributor) commented Jun 14, 2024

So from my point of view, it could be good to do the following:

ALTER TABLE task ALTER COLUMN extra_attrs SET DATA TYPE jsonb;
CREATE INDEX indx_task_jsonb ON task USING GIN(extra_attrs jsonb_path_ops);
CREATE INDEX idx_task_extra_attrs_report_uuids ON public.task USING gin (((extra_attrs -> 'report_uuids'::text)));
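As a side note (general PostgreSQL practice, not something tested in this thread): on a live instance the two indexes can also be built without blocking writes, at the cost of a slower build; the ALTER itself still rewrites the table under an exclusive lock.

```
-- CREATE INDEX CONCURRENTLY must run outside a transaction block.
CREATE INDEX CONCURRENTLY indx_task_jsonb ON task USING GIN (extra_attrs jsonb_path_ops);
CREATE INDEX CONCURRENTLY idx_task_extra_attrs_report_uuids ON public.task USING GIN ((extra_attrs -> 'report_uuids'::text));
```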

wy65701436 (Contributor) commented

We have reverted to using the index defined in the execution table. Please upgrade to the latest version (v2.11) to validate the performance changes.

If the performance still does not meet expectations, we can continue considering the use of the JSONB type. However, based on our testing results, we have observed a significant performance improvement after reverting.

mohamedawnallah pushed a commit to mohamedawnallah/harbor that referenced this issue Jul 2, 2024
goharbor#20545)

Adjust the query by UUID sql so that it can use the idx_task_extra_attrs_report_uuids

 fixes goharbor#20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
mohamedawnallah pushed a commit to mohamedawnallah/harbor that referenced this issue Jul 2, 2024
goharbor#20545)

Adjust the query by UUID sql so that it can use the idx_task_extra_attrs_report_uuids

 fixes goharbor#20505

Signed-off-by: stonezdj <stone.zhang@broadcom.com>
peresureda (Author) commented

Hi @wy65701436 ,

After reverting the manual changes (changing the type from jsonb back to json and removing the index) and applying the patch, we have observed that the performance of the ScanAll process is very similar to what we had with the jsonb + index improvement. Where there is a small difference is in resource usage: as we mentioned, the execution cost of casting to jsonb at runtime is higher than already having the data in that format.

CPU usage during ScanAll before the upgrade, with jsonb + index:
Screenshot 2024-07-11 at 09 53 22

CPU usage during ScanAll with 2.11, without the manual improvements:
Screenshot 2024-07-11 at 09 53 36

From here, the decision is yours.
Regards!

wy65701436 (Contributor) commented

@peresureda thanks for your data, it's helpful for us to do further investigation.

The performance problem was resolved after upgrading to v2.11, but you still see high CPU usage; do I understand correctly?
If so, I will investigate further.

cc @stonezdj

stonezdj (Contributor) commented Aug 14, 2024

I created 1 million records in the task table and ran the following test.

| Operation/query | Before (json) | After (jsonb + gin index) |
| --- | --- | --- |
| ScanAll | 25s | 26s |
| SELECT * FROM task WHERE extra_attrs::jsonb -> 'report_uuids' | 0.7ms | 0.6ms |

(Before) Aug 14 05:20:20 ubuntu core[2185963]: [ORM]2024/08/14 09:20:20  -[Queries/default] - [  OK /    db.Query /     0.7ms] - [SELECT * FROM task WHERE extra_attrs::jsonb -> 'report_uuids' @> $1] - `"0e9f119d-30ba-46b6-9b79-4f7df6b06cea"`
(After) Aug 14 05:22:30 ubuntu core[2185963]: [ORM]2024/08/14 09:22:30  -[Queries/default] - [  OK /    db.Query /     0.6ms] - [SELECT * FROM task WHERE extra_attrs::jsonb -> 'report_uuids' @> $1] - `"0e9f119d-30ba-46b6-9b79-4f7df6b06cea"`

The CPU/memory comparison:
Before:
before_cpu_memory

After:
after_cpu_memory

Test: ScanAll with 10000 artifacts
before-after

The performance improvement doesn't seem very significant.

flbla (Contributor) commented Aug 14, 2024

@stonezdj
did you create both indexes?

CREATE INDEX indx_task_jsonb ON task USING GIN(extra_attrs jsonb_path_ops);
CREATE INDEX idx_task_extra_attrs_report_uuids ON public.task USING gin (((extra_attrs -> 'report_uuids'::text)));

stonezdj (Contributor) commented

Yes. After the change it should be:

registry=# \d task;
                                           Table "public.task"
     Column      |            Type             | Collation | Nullable |             Default              
-----------------+-----------------------------+-----------+----------+----------------------------------
 id              | integer                     |           | not null | nextval('task_id_seq'::regclass)
 execution_id    | integer                     |           | not null | 
 job_id          | character varying(64)       |           |          | 
 status          | character varying(16)       |           | not null | 
 status_code     | integer                     |           | not null | 
 status_revision | integer                     |           |          | 
 status_message  | text                        |           |          | 
 run_count       | integer                     |           |          | 
 extra_attrs     | jsonb                       |           |          | 
 creation_time   | timestamp without time zone |           |          | CURRENT_TIMESTAMP
 start_time      | timestamp without time zone |           |          | 
 update_time     | timestamp without time zone |           |          | 
 end_time        | timestamp without time zone |           |          | 
 vendor_type     | character varying(64)       |           | not null | 
Indexes:
    "task_pkey" PRIMARY KEY, btree (id)
    "idx_task_extra_attrs_report_uuids" gin ((extra_attrs -> 'report_uuids'::text))
    "idx_task_job_id" btree (job_id)
    "indx_task_jsonb" gin (extra_attrs jsonb_path_ops)
    "task_execution_id_idx" btree (execution_id)
Foreign-key constraints:
    "task_execution_id_fkey" FOREIGN KEY (execution_id) REFERENCES execution(id)

flbla (Contributor) commented Aug 16, 2024

I don't remember about the CPU and memory, but in my previous tests I had much better times with jsonb + gin indexes: #20505 (comment)
Did you try running `explain select * FROM task WHERE extra_attrs::jsonb -> 'report_uuids' @> '"0e9f119d-30ba-46b6-9b79-4f7df6b06cea"'` to be sure it uses the correct indexes?

stonezdj (Contributor) commented Aug 29, 2024

SQL: `select * FROM task WHERE extra_attrs::jsonb -> 'report_uuids' @> '"0e9f119d-30ba-46b6-9b79-4f7df6b06cea"'`
Both before and after the change this query hits the index, and there is no difference in performance:
"idx_task_extra_attrs_report_uuids" gin ((extra_attrs::jsonb -> 'report_uuids'::text))

The performance of this SQL doesn't impact the ScanAll API, only the list API.

If any user wants to apply this change, it is OK to apply it in the database directly; no code change is required:
#20505 (comment)
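(Presumably the referenced comment is the earlier summary of the database-only change, i.e.:)

```
ALTER TABLE task ALTER COLUMN extra_attrs SET DATA TYPE jsonb;
CREATE INDEX indx_task_jsonb ON task USING GIN(extra_attrs jsonb_path_ops);
CREATE INDEX idx_task_extra_attrs_report_uuids ON public.task USING gin (((extra_attrs -> 'report_uuids'::text)));
```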


This issue is being marked stale due to a period of inactivity. If this issue is still relevant, please comment or remove the stale label. Otherwise, this issue will close in 30 days.
