Skip to content

Conversation

@SakiTakamachi
Copy link
Member

@SakiTakamachi SakiTakamachi commented Apr 29, 2024

  • Adjusting the order of structure members
  • Refactor bc_str2num

Very rough speed comparison

code (Increased loop count to 1 million):

<?php

$num1 = '1.2345678901234567890';
$num2 = '-2.12345678901234567890';

$start = microtime(true);
for ($i = 0; $i < 10000000; $i++) {
    bcadd($num1, $num2, 20);
}
echo microtime(true) - $start . PHP_EOL;

$num1 = '12345678901234567890.12345678901234567890';
$num2 = '-212345678901234567890.12345678901234567890';

$start = microtime(true);
for ($i = 0; $i < 10000000; $i++) {
    bcadd($num1, $num2, 20);
}
echo microtime(true) - $start . PHP_EOL;

$num1 = '12345678901234567890.00000000000000000000000000000000000000000000000000';
$num2 = '-212345678901234567890.00000000000000000000000000000000000000000000000000';

$start = microtime(true);
for ($i = 0; $i < 10000000; $i++) {
    bcadd($num1, $num2, 20);
}
echo microtime(true) - $start . PHP_EOL;

Executed 3 times each

before:

// 1
2.5999221801758
3.0079250335693
3.2259128093719

// 2
2.5655269622803
3.0276448726654
3.2609598636627

// 3
2.5622088909149
3.0198249816895
3.3219320774078

after adjusting the order of structure members:

// 1
2.4207520484924
3.0243470668793
3.2665190696716

// 2
2.4135448932648
2.987694978714
3.2447309494019

// 3
2.414500951767
2.9784641265869
3.2391669750214

after refactor bc_str2num:

// 1
2.0390820503235
2.5814299583435
2.589555978775

// 2
2.0322518348694
2.5541210174561
2.5936360359192

// 3
2.0435800552368
2.60458111763
2.5752019882202

@SakiTakamachi SakiTakamachi force-pushed the refactor_bcmath branch 9 times, most recently from 27c292e to e87f1f5 Compare April 30, 2024 01:57
@SakiTakamachi SakiTakamachi marked this pull request as ready for review April 30, 2024 04:54
@SakiTakamachi SakiTakamachi requested a review from Girgias as a code owner April 30, 2024 04:54
@SakiTakamachi
Copy link
Member Author

I plan to conduct a high-precision benchmark.

@SakiTakamachi SakiTakamachi force-pushed the refactor_bcmath branch 4 times, most recently from 89bcbbd to ff77807 Compare April 30, 2024 11:56
Copy link
Member

@Girgias Girgias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commits 1 and 2 are good to land, got questions/remakrs about commit 3

/* Skip Sign */
ptr++;
}
ptr += (*ptr == '-' || *ptr == '+');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true, but adding a comment can't help (i.e. if we have a sign we need to increment the str pointer)

char *ptr, *nptr;
size_t trailing_zeros = 0;
size_t str_scale = 0;
char *ptr, *nptr, *integer_ptr, *integer_end, *fractional_ptr = NULL, *fractional_end = NULL, *decimal_point;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually prefer to have each declaration on its own line

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In release builds this had no effect so I reverted it


/* Convert strings to bc numbers. Base 10 only.*/

bool bc_str2num(bc_num *num, char *str, size_t scale)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to pass the str_len here? As this should save the need to compute strlen() of it in the function body (also you can easily determine if we have reached the end of the str or not.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've already tried that and it was slower. Since the length can be obtained from zend_string, there should be no computational cost, so it's strange that it's slower....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also expect that ZSTR_LEN is faster. Are you using a release build without ASAN & UBSAN instrumentation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh, it was a debug build.... I'll try again

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This worked quite well for the release build, but I ended up not needing the length of str, so I'll leave it unchanged.

However, I would use ZSTR_LEN if future refactoring requires it.

if (decimal_point) {
/* search */
fractional_ptr = decimal_point + 1;
fractional_end = fractional_ptr + strlen(fractional_ptr) - 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See remark at the start of the function declaration

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@nielsdos
Copy link
Member

Cool work Saki! I like the performance gains.
One trick I really like doing is misusing registers to perform SIMD.
For example, this improves performance on top of your patch: https://gist.github.com/nielsdos/a485ace9471c56ea6515dbf8962ce250. I might commit this after this is merged, and maybe you know of some other places where this stuff can be used? 🙂

@SakiTakamachi
Copy link
Member Author

@nielsdos
Thank you, that suggestion worked great!
This is the result of my local test:

// 1
0.97930192947388
1.1109738349915
1.2666010856628

// 2
0.97456121444702
1.1181631088257
1.3485460281372

// 3
0.97628211975098
1.1602928638458
1.337718963623

// 4
0.96974301338196
1.1106779575348
1.2683279514313

BCMath is written in a "highly readable" manner, so I think there are quite a few places where your hack can be used. For example, in this PR change, I could use that hack for a loop that checks trailing zero while decrementing from fractional_end.

result:

// 1
0.9805908203125
1.1358230113983
1.103413105011

// 2
0.97740793228149
1.0924868583679
1.0780000686646

// 3
0.98030591011047
1.1083829402924
1.075413942337

// 4
0.98706197738647
1.1104090213776
1.0623908042908

After your changes are committed, I'll try to commit them :)

@SakiTakamachi
Copy link
Member Author

Since it became slow after making the release build, I will review the code again.

@SakiTakamachi SakiTakamachi force-pushed the refactor_bcmath branch 2 times, most recently from c243421 to cf252e7 Compare May 1, 2024 04:50
@SakiTakamachi
Copy link
Member Author

There's no need to cram too many changes into this PR, so I'll leave it here for now and open another one 😄

@SakiTakamachi SakiTakamachi requested review from Girgias and nielsdos May 1, 2024 09:02
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good just needs some minor style adjustments / comment fixes

while (*ptr == '0') {
ptr++;
}
integer_ptr = ptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll only put this comment once as it applies to some other places too.
When refactoring code, you should merge the declaration and the assignment where possible.
I.e. make this char *integer_ptr = ptr;. Or even const char *integer_ptr = ptr;.
The reason to merge them is so it becomes clearer what the actual scope of the variable is, so it's not accidentally used when it should not be used (yet).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed 386cb7f

}
fractional_ptr = decimal_point + 1;

str_scale = fractional_end - fractional_ptr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may need a comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 386cb7f

SakiTakamachi and others added 2 commits May 1, 2024 21:00
Co-authored-by: Niels Dossche <7771979+nielsdos@users.noreply.github.com>
@SakiTakamachi
Copy link
Member Author

I've added some comments in addition to the places mentioned.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx
I also have 2 patches for improving performance: the one I shared here already (that I might extend); and one that improves _bc_new_num_ex.
I'll open PRs after this is merged.

@SakiTakamachi
Copy link
Member Author

Just in case, I wait for CI to pass before merging. Thx :)

@SakiTakamachi SakiTakamachi deleted the refactor_bcmath branch May 1, 2024 12:58
nielsdos added a commit to nielsdos/php-src that referenced this pull request May 1, 2024
On my i7-4790 with benchmark from php#14076, on top of php#14101 I obtain the
following results:

before (with php#14101):
```
1.672737121582
2.3618471622467
2.3474779129028
```

after (with php#14101 + this):
```
1.5878579616547
2.0568618774414
2.0204811096191
```
nielsdos added a commit that referenced this pull request May 1, 2024
On my i7-4790 with benchmark from #14076, on top of #14101 I obtain the
following results:

before (with #14101):
```
1.672737121582
2.3618471622467
2.3474779129028
```

after (with #14101 + this):
```
1.5878579616547
2.0568618774414
2.0204811096191
```
nielsdos added a commit to nielsdos/php-src that referenced this pull request May 1, 2024
Since freeing can deal with NULL, we can avoid calling bc_init_num and
avoid resetting the number during parsing.

Using benchmark from php#14076.

Before:
```
1.544440984726
2.0288550853729
2.092139005661
```

After:
```
1.5324399471283
1.9081380367279
2.065819978714
```
nielsdos added a commit that referenced this pull request May 1, 2024
Since freeing can deal with NULL, we can avoid calling bc_init_num and
avoid resetting the number during parsing.

Using benchmark from #14076.

Before:
```
1.544440984726
2.0288550853729
2.092139005661
```

After:
```
1.5324399471283
1.9081380367279
2.065819978714
```
@SakiTakamachi SakiTakamachi mentioned this pull request May 2, 2024
nielsdos added a commit to nielsdos/php-src that referenced this pull request May 2, 2024
Using SIMD to accelerate the validation.

Using the benchmark from php#14076.

After:
```
1.3504369258881
1.6206321716309
1.6845638751984
```

Before:
```
1.4750170707703
1.9039781093597
1.9632289409637
```
nielsdos added a commit that referenced this pull request May 3, 2024
Using SIMD to accelerate the validation.

Using the benchmark from #14076.

After:
```
1.3504369258881
1.6206321716309
1.6845638751984
```

Before:
```
1.4750170707703
1.9039781093597
1.9632289409637
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants