Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/postgrestore #126

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from
Open

Feat/postgrestore #126

wants to merge 11 commits into from

Conversation

HatemMn
Copy link
Contributor

@HatemMn HatemMn commented Mar 4, 2025

ℹ️ PR ORDER ℹ️

Develop <- #128 <- #129 <- #125

Performance : would using the hstore module be better ?

In theory : 🚫 | Test : (not done)

The answer was no, for the followng reasons :

  • B-tree indexes offer is faster lookups vs hstore GIN src
  • Row-level locking enables higher concurrency than document-level

Performance : would using another index improve perf ?

In theory : 🆗 | Test : 🚫

No. In theory, HASH index would increase performance however after testing we notice that the tests become around 10% slower (probably due to indexing overhead ?) so it is better to leave default values as they are

Performance : would using prepared statement perf ?

In theory : 🆗 | Test : 🚀 🆗

using prepared statements improved performance by around a huge x5 ! making it comparable with sqlite3 interface

Prepared statements are a good idea since they benefit from caching and remove request processing overhead, which is very useful with our requests that are particularely complex and have a lot of CTE

Performance : fillfactor ?

In theory : 🚫 | Test : ?

Let's think about what the fillfactor serves : avoiding fragmentation problems when we do frequent update. GWrite is subject to do frequent updates on certain scenarios.
However, the WORD_LENGTH is fixed, meaning no update will ever cause fragmentation issues because the "new data" will always have fixed size ! so in theory, the fillfactor should stay 100% like the default. For reasons that are yet not known, a small perf increase was seen with FILLFACTOR set to 90%

Performance : storage mode and compression

In theory : 🆗 | Test : Nothing changed

EXTERNAL storage mode stores data out-of-line in a secondary TOAST table but does not compress it
This is pretty cummon tu use with BYTEA values that do not benefit from compression (cpu sees that the compressed data is bigger than the original data and cancels the compression, which is a waste of cpu time - source and benches )

However, changing those parameters didn't make the tests run faster.

PS : the consisty garantees are not violated with this storage method

Alt approach

This is (for reference only) an alternative approach to do G_write

I has, however, poor performance. So similar approaches are to be avoided ( 857.55s, instead of 120 for 1000 workers )

        let (ag, wg) = (guard.0, guard.1.as_ref().map(|w| &w[..]));

        let params = match &wg {
            Some(guard) => vec![&*ag as &(dyn ToSql + Sync), guard],
            None => vec![&*ag, &None::<&[u8]> as &(dyn ToSql + Sync)],
        }
        .iter()
        .copied()
        .chain(bindings.iter().flat_map(|(a, w)| vec![&**a as _, &*w as _]))
        .collect::<Vec<&(dyn ToSql + Sync)>>();

        let query = &format!(
            "
    WITH
    guard_check AS (
        SELECT w FROM findex_db WHERE a = $1::bytea
    ),
    insert_condition AS (
        INSERT INTO findex_db (a, w)
        SELECT * FROM (
            VALUES
            {}
        ) AS new_values(a, w)
WHERE (
    $2::bytea IS NULL AND NOT EXISTS (SELECT 1 FROM guard_check)
) OR (
    $2::bytea IS NOT NULL AND EXISTS (
       SELECT 1 FROM guard_check WHERE w = $2::bytea
    )
)
        ON CONFLICT (a) DO UPDATE SET w = EXCLUDED.w
    )
    SELECT COALESCE((SELECT w FROM guard_check)) AS original_guard_value;
",
            (3..=params.len())
                .step_by(2) // params.len() is always even
                .map(|i| format!("(${}::bytea, ${}::bytea)", i, i + 1))
                .collect::<Vec<String>>()
                .join(",")
        );
        self.client
            .query_opt(query, &params)
            .await?
            .map_or(Ok(None), |row| {
                row.try_get::<_, Option<&[u8]>>(0)?
                    .map(|b| b.try_into())
                    .map_or(Ok(None), |r| Ok(Some(r?)))
            })

@HatemMn HatemMn changed the base branch from main to develop March 4, 2025 17:23
@HatemMn HatemMn force-pushed the feat/postgrestore branch from 218dd42 to 7635d5b Compare March 4, 2025 17:53
@HatemMn HatemMn force-pushed the feat/postgrestore branch 2 times, most recently from 715bc36 to 256b82d Compare March 17, 2025 11:33
@HatemMn HatemMn force-pushed the feat/postgrestore branch from e0f3f88 to 5c673fe Compare March 19, 2025 15:16
This was referenced Mar 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant