Improve throughput when sending FCM Messages by using HTTP/2 #874

jeromegamez · 2024-03-26T21:36:50Z

codecov · 2024-03-26T21:38:40Z

Codecov Report

Attention: Patch coverage is 92.50000% with 3 lines in your changes are missing coverage. Please review.

Project coverage is 91.08%. Comparing base (f1d2809) to head (bfce08a).
Report is 2 commits behind head on 7.x.

Additional details and impacted files

@@             Coverage Diff              @@
##                7.x     #874      +/-   ##
============================================
+ Coverage     91.05%   91.08%   +0.02%     
+ Complexity     1596     1589       -7     
============================================
  Files           156      156              
  Lines          4418     4407      -11     
============================================
- Hits           4023     4014       -9     
+ Misses          395      393       -2

tonysilva16 · 2024-04-17T20:56:02Z

Hi jeromegamez,

I wonder if we need to check if curl lib supports http2 like here

https://github.com/microsoftgraph/msgraph-sdk-php/pull/818/files#diff-849151e7b2ea35fbfeca79419f39a80ebc43ed0c66cd50f309db361f7682c4a9R548

Thank you.

jeromegamez · 2024-04-17T21:42:32Z

HTTP/2 support has been available since cURL 7.47, released in 2016... although PHP 8.x requires just cURL 7.29, I think it's safe to assume that current OSes ship with supported cURL releases. I'd rather not add a check related to a downstream library 🤔

jeromegamez · 2024-04-17T23:23:34Z

Hm. I just tried sending 100 messages with the following script, and ran into errors myself once I tried it with more than ~75 messages:

<?php

use Kreait\Firebase\Factory;
use Kreait\Firebase\Messaging\CloudMessage;

$factory = (new Factory())
    ->withServiceAccount('/path/to/my/service_account.json')
;

$messaging = $factory->createMessaging();

$message = CloudMessage::fromArray([
    'notification' => [
        'title' => 'Hello World!',
        'body' => 'Hello World!',
    ]
]);

$targets = array_fill(0, 70, "<device token>");

$results = $messaging->sendMulticast($message, $targets);

dump($results->successes()->count());
dump($results->failures()->count());

When it works (with fewer messages), and I enable debug for the requests, I see something like

[...]
* Connection cache is full, closing the oldest one
* Connection #61 to host fcm.googleapis.com left intact
< HTTP/2 200
< content-type: application/json; charset=UTF-8
< vary: X-Origin
< vary: Referer
< vary: Origin,Accept-Encoding
< date: Wed, 17 Apr 2024 23:20:18 GMT
< server: scaffolding on HTTPServer2
< cache-control: private
< x-xss-protection: 0
< x-frame-options: SAMEORIGIN
< x-content-type-options: nosniff
< alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
< accept-ranges: none
<
* Connection cache is full, closing the oldest one
* Connection #69 to host fcm.googleapis.com left intact
^ 68
^ 2

It seems that the connection isn't kept alive/reused after all, so I'll need to continue working on this and explore if the situation can be ameliorated with a Guzzle Pools and a (configurable) concurrency.

tonysilva16 · 2024-04-19T15:51:31Z

I tested this locally sending invalid tokens (token1, token2, token3...) to a valid project and i cannot replicate this issue.
I can test this again using other method. Should i use a valid service account with a valid tokens and just duplicate them to try replicate this issue?

<<<<<<<<
HTTP/2 400 Bad Request
vary: X-Origin, Referer, Origin,Accept-Encoding
content-type: application/json; charset=UTF-8
date: Thu, 18 Apr 2024 08:29:55 GMT
server: scaffolding on HTTPServer2
cache-control: private
x-xss-protection: 0
x-frame-options: SAMEORIGIN
x-content-type-options: nosniff
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
accept-ranges: none

{
  "error": {
    "code": 400,
    "message": "The registration token is not a valid FCM registration token",
    "status": "INVALID_ARGUMENT",
    "details": [
      {
        "@type": "type.googleapis.com/google.firebase.fcm.v1.FcmError",
        "errorCode": "INVALID_ARGUMENT"
      }
    ]
  }
}

jeromegamez · 2024-04-20T23:13:13Z

Okay, I could make it work and send 1500 messages in ~5 seconds, without running into failures.

However, I'm not sure if the send reports for successfully sent messages are correct (and I'm not sure they were ever correct).

On failed sends, an Exception is thrown that includes the request that caused the exception. Here, it's easy to re-extract the message from the request body, and create a Failure-SendReport for it.
On successful requests, the API returns a response with HTTP Status code 200, and the response body only includes the message ID.

The mapping between message and response is done by iterating the messages and using the next message to map it to the next successful response. I'm not sure the responses are processed in the same order as the requests were sent, but if this mapping is correct (or ever was), that I can't say for sure.

jeromegamez · 2024-04-20T23:38:40Z

Instead of using iterators, I could use an array of requests and then map messages to responses, but then, sending of thousands of messages at a time would increase the memory usage by a lot, I don't know if people will then come complain 🤔.

Another way would be to only report failures, but if people rely on the successful reports, that would be a BC.

Another possibility would be to introduce a new method with a new result object. I don't know how I feel about that, there are already so many methods to send messages.

The second and third approach would mean that successful responses could not be mapped to the respective messages anymore. Again, I can't say for sure this is the case in the current implementation either.

Hmmm 🤔

rickclephas · 2024-04-22T06:43:36Z

Thanks for working on this @jeromegamez!

On successful requests, the API returns a response with HTTP Status code 200, and the response body only includes the message ID.

Really? According to the docs the response should contain a Message object.

The mapping between message and response is done by iterating the messages and using the next message to map it to the next successful response. I'm not sure the responses are processed in the same order as the requests were sent, but if this mapping is correct (or ever was), that I can't say for sure.

I don't think we can assume that the responses are processed in the same order.
Probably best to use the index parameter to map/store the correct response.

How about using the index and retrieving the message on demand?:

'fulfilled' => function (ResponseInterface $response, $index) use ($messages, &$sendReports) {
    $message = $this->ensureMessage($messages[$index]);
    // ...
    $sendReports[$index] = SendReport::success($message->target(), $json);
},
'rejected' => function (RequestException $reason, $index) use (&$sendReports) {
    // ...
    $sendReports[$index] = SendReport::failure($message->target(), $reason, $message);
}

jeromegamez · 2024-04-22T07:14:36Z

The last time I checked, the docs lied 😅, but I'll have another look.

jeromegamez · 2024-04-22T22:16:46Z

Alright, I've split the two approaches into two commits ("Use generators...", and "Use arrays..."), and did some naive benchmarks:

### Generators, HTTP/2, using GuzzleHTTP's pool

10 messages, Memory usage: 0.719215 MiB, Duration: 276.203834 ms
100 messages, Memory usage: 0.476227 MiB, Duration: 404.786958 ms
1000 messages, Memory usage: 1.351799 MiB, Duration: 2573.935125 ms
10000 messages, Memory usage: 9.478622 MiB, Duration: 23759.530041 ms

### Arrays, HTTP/2, using GuzzleHTTP's pool

10 messages, Memory usage: 0.767624 MiB, Duration: 295.545958 ms
100 messages, Memory usage: 0.893921 MiB, Duration: 536.475875 ms
1000 messages, Memory usage: 6.248581 MiB, Duration: 7520.771083 ms
10000 messages, Memory usage: 58.461571 MiB, Duration: 40931.196458 ms

### Latest Release 7.9.1, HTTP 1.1, async requests

10 messages, Memory usage: 0.695015 MiB, Duration: 287.356417 ms
100 messages, Memory usage: 1.948715 MiB, Duration: 424.529083 ms
PHP Warning:  include(vendor/guzzlehttp/guzzle/src/Exception/ConnectException.php): 
Failed to open stream: Too many open files in vendor/composer/ClassLoader.php on line 576

No need to say that we need this PR one way or the other 😅. However, the jump in execution time and memory usage is high when we use arrays instead of generators.

Given that the current implementation doesn't work reliably at all (results may vary depending on the underlying system), I think that using arrays would be a reasonable choice for the first iteration. Using Generators is the way to go in the long run, though. I'll figure that one out after this PR.

If nobody objects, I'll proceed with the arrays in the next days.

Here's the script I used:

<?php

declare(strict_types=1);

require 'vendor/autoload.php';

use Kreait\Firebase\Factory;
use Kreait\Firebase\Messaging\CloudMessage;
use Kreait\Firebase\Messaging\MessageTarget;
use Kreait\Firebase\Messaging\Notification;

$token = "I'm sending all messages to a single token";
$sizes = [10, 100, 1000, 10000];

$factory = (new Factory())->withServiceAccount('/path/to/service-account.json');

$messaging = $factory->createMessaging();

foreach ($sizes as $size) {
    $targets = array_fill(0, $size, $token);

    $messages = [];
    foreach ($targets as $target) {
        $messages[] = CloudMessage::withTarget(MessageTarget::TOKEN, $target)
            ->withNotification(Notification::create('Title', $target));
    }

    $usageAtStart = memory_get_peak_usage(false) / 1024 / 1024;
    $timeAtStart = hrtime(true);

    $report = $messaging->sendAll($messages);

    $usageAtEnd = memory_get_peak_usage(false) / 1024 / 1024;
    $timeAtEnd = hrtime(true);

    $peakUsage = $usageAtEnd - $usageAtStart;
    $duration = ($timeAtEnd - $timeAtStart) / 1e+6; // nanoseconds to milliseconds

    printf("%d messages, Memory usage: %f MiB, Duration: %f ms\n", $size, $peakUsage, $duration);
}

jeromegamez · 2024-04-22T22:18:47Z

@rickclephas I checked, and the response body after a successful send looks like this:

{
  "name": "projects/my-project/messages/867384e8-9692-40f4-b26c-e6f89990293b"
}

The message ID, which I never knew what to use it for ¯\(ツ)/¯

rickclephas · 2024-04-23T11:42:42Z

Given that the current implementation doesn't work reliably at all (results may vary depending on the underlying system), I think that using arrays would be a reasonable choice for the first iteration. Using Generators is the way to go in the long run, though. I'll figure that one out after this PR.

If nobody objects, I'll proceed with the arrays in the next days.

Sounds great! Some thoughts:

if the responses aren't processed in the request order that would also mean that the order of results won't match with the provided message unless those are mapped as well (at the moment we rely on the order)
maybe only using an array for the messages and keeping everything else a generator will already go a big way in terms of memory

jeromegamez · 2024-04-23T14:53:50Z

Alright, I think we found a nice middleground, great minds think alike, so as a final change, the messages are now stored in an array, the requests given as a Generator, and the mapping between messages and responses is done via the index variable available through the response handler configuration.

10 messages, Memory usage: 0.750603 MiB, Duration: 320.770875 ms
100 messages, Memory usage: 0.617378 MiB, Duration: 366.145416 ms
1000 messages, Memory usage: 2.851311 MiB, Duration: 2722.624125 ms
10000 messages, Memory usage: 24.174843 MiB, Duration: 28863.839500 ms

The numbers are looking good, closer to the "pure" generators approach than the "pure" array approach.

I will let this sit for a few days, or until someone confirms that this works for them without errors or problems.

rickclephas · 2024-04-23T16:33:59Z

I will let this sit for a few days, or until someone confirms that this works for them without errors or problems.

I would be happy to try this in our project and report back. Just have one question:

$messages = [CloudMessage::fromArray(...), ...];
$reports = $messaging->sendAll($messages)->getItems();
// is it guaranteed that report 0 is referring to message 0?
assert($reports[0]->target() === $messages[0]->target());

At the moment we rely on this order to associate the reports to our device records.
If this isn't guaranteed we'll need to change that 😅 (though I believe this was guaranteed with promise based approach).

jeromegamez · 2024-04-23T17:33:45Z

I believe that the order will be preserved, because I understood the $index variable in the response handlers to be in the same order as the promises that are pumped in.

If this is the case, the generated requests are in the same order as the messages in the array, so the index in the response handlers should be the same as the index in the message array.

If it isn't, it means we have to go with the "everything in arrays" approach, right?

I can't test this because I only have one manually created registration token to work with.

I have mentioned it at some occasions, but as a reminder: I haven't used Firebase myself in ~10 years 😅

rickclephas · 2024-04-23T19:50:06Z

I believe that the order will be preserved, because I understood the $index variable in the response handlers to be in the same order as the promises that are pumped in.

Yeah that's how I understand it as well. The thing I am unsure about is the order in which the fulfilled/rejected callbacks are invoked. If those aren't invoked in the same order (e.g. 1, 5, 2, 3, 0, 4) then that would mean the returned reports aren't in the same order as the messages.

Meaning the message inside the report is correct but the message inside report 0 wouldn't be message 0.
Depending on how the reports are used that might not be a problem, but in our case we rely on this order.

I can't test this because I only have one manually created registration token to work with.

I could do some basic testing with multiple tokens on our development environment.
Also have some code to "stresstest" our send logic (but that also uses a single token).

I have mentioned it at some occasions, but as a reminder: I haven't used Firebase myself in ~10 years 😅

No problem, your work and support is much appreciated! Evenmore so if you aren't actively using Firebase yourself.

I do actively use Firebase, but I don't normally use PHP so forgive me if this doesn't make sense, but couldn't we use the index provided by the callbacks to also determine the order of the reports?:

$sendReports[$index] = SendReport::...;

I am not really sure if this has any obvious issues in how PHP handles index based arrays vs custom key based arrays.
But if possible it would also guarantee the order of the reports, right?

jeromegamez · 2024-04-23T22:10:49Z

The reports should now be ordered by the order of messages.

But: I noticed that there is one test randomly failing, because the order is not as expected:

https://github.com/kreait/firebase-php/blob/8e8497467f9bafa16e94bc7af9ec062d9d0e89d8/tests/Integration/MessagingTest.php#L220-L238

I'll have to dive into that - hopefully it's the test that has a thinking error 🤞🏻.

rickclephas · 2024-04-24T11:52:42Z

Alright I think I get why the test is randomly failing.

$array = [];
$array[1] = "One";
$array[0] = "Zero";
var_dump($array); // array(2) { [1]=> string(3) "One" [0]=> string(4) "Zero" }

While the indexes are correct the order in the array isn't.
I guess this sort of makes sense for PHP "arrays".

$report->failures() filters the items with array_filter and array_values.
Effectively reindexing the array and ignoring the explicitly specified index.

Using array_fill to create an array of null values works as expected:

- $sendReports = [];
+ $sendReports = array_fill(0, count($messages), null);

jeromegamez · 2024-04-24T13:37:56Z

Didn't you say you rarely do PHP? This is spot-on! I will implement this as soon as a find the time later!

rickclephas · 2024-04-24T15:23:05Z

Didn't you say you rarely do PHP? This is spot-on! I will implement this as soon as a find the time later!

😅 yeah recently been helping some coworkers, but otherwise not too much PHP experience.

I already did some testing with the "stresstest" script and so far everything else seems to be working fine.
Duration differences are similar to the once you posted.
Batch sizes were previously reduced to 250 due to some random curl errors. It's now back to 500 without any problems.

I'll do some normal testing as well, but I think we are good once the test is green 👍🏻.

jeromegamez · 2024-04-25T00:36:45Z

I added the array_fill() and will merge this now so that I'm not tempted to optimize™ things within the PR 😅. It's 2:35am where I am right now, so I'll wait with the release until later today.

Thank you for the fruitful collaboration 🫶🏻 ! And as always, if the project is useful to you, please consider becoming a sponsor!

jeromegamez · 2024-04-25T10:40:18Z

Released with https://github.com/kreait/firebase-php/releases/tag/7.10.0 🚀

marbelsy · 2024-11-09T08:42:06Z

We are currently running the 5.0.0 version and we currently have a depdendency conflict with lcobucci/jwt atm.
When trying to update are there any breaking changes when upgrading to 7?
Or: can we use sendAll() with the version 5 already without isses?
At peak times there are 5000 messages being sent at a time

Thanks in advance

jeromegamez · 2024-11-09T09:50:47Z

I'd recommend first updating the SDK to 6.x and inspecting the changelog and the upgrade.md before updating to 7.x.

jeromegamez mentioned this pull request Mar 26, 2024

Unknown limit of devices to send to via multicast #873

Closed

jeromegamez force-pushed the http2 branch from 5c4f06b to cb15104 Compare March 26, 2024 21:55

jeromegamez force-pushed the http2 branch from cb15104 to 02679a9 Compare April 17, 2024 22:52

jeromegamez changed the title ~~Try improving performance and reliability by using HTTP 2.0~~ Improve throughput when sending FCM Messages by using HTTP/2 Apr 17, 2024

jeromegamez force-pushed the http2 branch 2 times, most recently from d0377af to 77f78de Compare April 20, 2024 23:10

jeromegamez force-pushed the http2 branch from 77f78de to c89dd70 Compare April 22, 2024 22:02

jeromegamez force-pushed the http2 branch from c89dd70 to 911edc3 Compare April 23, 2024 14:48

jeromegamez force-pushed the 7.x branch from 4341fca to 62c67a2 Compare April 23, 2024 15:01

jeromegamez force-pushed the http2 branch from 911edc3 to 3e2c4b8 Compare April 23, 2024 15:01

jeromegamez force-pushed the 7.x branch from 62c67a2 to 63eb05e Compare April 23, 2024 15:02

jeromegamez force-pushed the http2 branch 3 times, most recently from 103758b to 79d6218 Compare April 23, 2024 15:27

jeromegamez force-pushed the http2 branch 2 times, most recently from d778e23 to 8e84974 Compare April 23, 2024 22:33

jeromegamez force-pushed the http2 branch from 5981f47 to ab173ae Compare April 24, 2024 23:55

jeromegamez force-pushed the 7.x branch from d4c64e9 to 73a74b2 Compare April 25, 2024 00:00

jeromegamez force-pushed the http2 branch from ab173ae to 649575f Compare April 25, 2024 00:01

jeromegamez added 4 commits April 25, 2024 02:24

Send messages asynchronously through a connection pool with HTTP/2

62a0e27

Order send reports by the order of messages

1ad7863

Fill array with null values to ensure order of indices

50220d2

Ensure that each message has a SendReport

bfce08a

jeromegamez force-pushed the http2 branch from 649575f to bfce08a Compare April 25, 2024 00:29

jeromegamez merged commit d2456d7 into 7.x Apr 25, 2024

jeromegamez deleted the http2 branch April 25, 2024 00:38

jeromegamez mentioned this pull request Jun 28, 2024

Fix Messaging for outdated environments #908

Merged

Uh oh!

Improve throughput when sending FCM Messages by using HTTP/2 #874

Improve throughput when sending FCM Messages by using HTTP/2 #874

Uh oh!

Conversation

jeromegamez commented Mar 26, 2024

Uh oh!

codecov bot commented Mar 26, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

tonysilva16 commented Apr 17, 2024

Uh oh!

jeromegamez commented Apr 17, 2024

Uh oh!

jeromegamez commented Apr 17, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tonysilva16 commented Apr 19, 2024

Uh oh!

jeromegamez commented Apr 20, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jeromegamez commented Apr 20, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rickclephas commented Apr 22, 2024

Uh oh!

jeromegamez commented Apr 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jeromegamez commented Apr 22, 2024

Uh oh!

jeromegamez commented Apr 22, 2024

Uh oh!

rickclephas commented Apr 23, 2024

Uh oh!

jeromegamez commented Apr 23, 2024

Uh oh!

rickclephas commented Apr 23, 2024

Uh oh!

jeromegamez commented Apr 23, 2024

Uh oh!

rickclephas commented Apr 23, 2024

Uh oh!

jeromegamez commented Apr 23, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rickclephas commented Apr 24, 2024

Uh oh!

jeromegamez commented Apr 24, 2024

Uh oh!

rickclephas commented Apr 24, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jeromegamez commented Apr 25, 2024

Uh oh!

jeromegamez commented Apr 25, 2024

Uh oh!

marbelsy commented Nov 9, 2024

Uh oh!

jeromegamez commented Nov 9, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

codecov bot commented Mar 26, 2024 •

edited

Loading

jeromegamez commented Apr 17, 2024 •

edited

Loading

jeromegamez commented Apr 20, 2024 •

edited

Loading

jeromegamez commented Apr 20, 2024 •

edited

Loading

jeromegamez commented Apr 22, 2024 •

edited

Loading

jeromegamez commented Apr 23, 2024 •

edited

Loading

rickclephas commented Apr 24, 2024 •

edited

Loading