Skip to content

Performance optimization skill:Associating Small Fact Table with Big Dimension Table

esProcSPL edited this page Aug 23, 2024 · 1 revision

I Problem introduction & Solution

When we handle the primary-sub table association query, we sometimes find that the filtered fact data is small enough to be wholly loaded in memory or slightly exceeds the memory whereas the dimension table to be associated contains a large volume of data, far larger than the available memory space. In this case, if the dimension table is ordered by its key, we can use binary search to locate the small amount of dimension records corresponding to the fact records all at once rather than traverse all records of the dimension table like what the HASH algorithm does, which improves the operation performance efficiently.

This is exactly the algorithm that SPL provides to us, and in the following part we’ll test the approach and compare it with the HASH JOIN algorithm in Oracle.

However, the algorithm is supposed to retrieve all association key values from the fact table, sort them, then collect as many association key values as possible at a time to search in the dimension table. In the processing, it is the searching that takes up most of the execution time, whereas the association itself is not time-consuming. Therefore, the approach is not sensitive to multi-thread in improving performance, then we’ll test it in both single thread and multi-thread.

II Test environment & computing scenario

The test computer has two Intel2670 CPUs, 2.6G frequency, 16 core in total, 64G memory and an SSD hard disk, where a 16-core virtual machine with 8G memory is set for testing.

On the virtual machine, we create a dimension table named account which consists of 3 fields (accountid, name and state, respectively) with a total of 5 billion records, and three fact tables with the same structure (trade1, trade2 and trade3) which contain 0.3 million, 3 million and 15 million records respectively with 4 fields – tradedate, outid (account from which money is transferred), receiveid (account to which money is transferred) and amount (transfer amount). The accountid in the account table is the foreign key for the outid field and received field in the fact tables, and both are in one-to-many relationships.

Our test goal is to query the total transfer amount between each two states according to the three fact tables with different data volume, respectively. Then we’ll test the multi-thread performance using the fact table trade2.

III Tests

1. Oracle

SQL query statement:
select  /*+ parallel(n) */
    a1.state,
    a2.state,
    sum(amount) as amount
from
    account a1,
    account a2,
    trade1
where
    outid = a1.accountid
    and receiveid = a2.accountid
group by
    a1.state,
    a2.state
order by
    a1.state,
    a2.state;

/*+ parallel(n) */ is used to test the performance of multi-thread processing, and n is the number of parallel tasks.

2. SPL

SPL script:

A
1 =now()
2 =file(path+"account.ctx").open()
3 =file(path+"trade1.ctx").open().cursor@m(outid,receiveid,amount;;1)
4 =A3.joinx@q(outid,A2:accountid,state:out_state;receiveid,A2:accountid,state:receive_state;5000000)
5 =A4.groups(out_state,receive_state;sum(amount):amount)
6 =interval@s(A1,now())

The @q option is added to the joinx function to improve the association performance when the fact data is small, and the dimension data is big. Here the function’s last parameter defines the number of records to be retrieved from the fact table cursor at each association. When there is enough available memory space, the greater this value, the better the performance.

3. Test results

Below are the results of tests over different volumes of fact data (Unit: Sec):

Record number of a filtered fact table 15 million 3 million 0.3 million
Oracle 17277 2747 421
SPL 517 106 72

Here are the results of multi-thread processing tests using trade2 (with 3 million records) (Unit: Sec):

Number of threads 1 2 4 8 16
Oracle 2747 1589 1318 1375 1631
SPL 106 109 117 115 109

IV Summary

From the test results, in single-thread processing, the SPL algorithm is many times faster than Oracle because it avoids the full traversal of the dimension table.

But in multi-thread processing, the parallel-insensitive SPL algorithm has little impact on performance improvement. On the contrary, Oracle does enhance the performance by using parallel processing, but it is still much slower due to the time-consuming full traversal.

Clone this wiki locally