WebAssembly edits #4401
Enhanced clarity and corrected minor inaccuracies in the text. It is not finalized yet.
Updated date filters and added new metrics for WebAssembly requests.
Super-linter summary
Super-linter detected linting errors in: EDITORCONFIG. For more information, see the GitHub Actions workflow run. Powered by Super-linter.
Super-linter summary
All files and directories linted successfully. For more information, see the GitHub Actions workflow run. Powered by Super-linter.
src/content/en/2025/webassembly.md
Outdated
We follow the same methodology from [the 2021 Web Almanac](../2021/webassembly#methodology), where WebAssembly was introduced for the first time.

### Limitations
**Data Collection:** This chapter relies on the HTTP Archive July 2025 crawl dataset, which is hosted on Google BigQuery, and identifies WebAssembly modules by matching the `Content-Type` (`application/wasm`) and the `.wasm` file extension. Using this method, we identified 233,857 Wasm modules on desktop and 255,060 on mobile.
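The matching rule described in this hunk could be sketched roughly as follows. This is only an illustration, not the query used for the chapter: the function name `is_wasm_response` is hypothetical, and it assumes that either signal (MIME type or file extension) is enough on its own, which the text does not state explicitly.

```python
from typing import Optional
from urllib.parse import urlparse

WASM_MIME = "application/wasm"


def is_wasm_response(url: str, content_type: Optional[str]) -> bool:
    """Return True if the MIME type or the URL path marks the response as Wasm.

    Assumption: either signal alone is treated as sufficient.
    """
    # Drop parameters such as "; charset=..." and normalise case.
    mime = (content_type or "").split(";", 1)[0].strip().lower()
    # Look only at the path so query strings like "?v=1" are ignored.
    path = urlparse(url).path.lower()
    return mime == WASM_MIME or path.endswith(".wasm")
```

A response served as `application/octet-stream` from a `.wasm` URL would still be counted under this assumption, which is one reason the two signals are combined.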
We avoid absolute numbers because they are basically meaningless.
233,857 sounds like a lot, but is it? It depends on how many sites we crawl. If this is from the top million, that's a lot! If it's from the top billion, not so much. The reader has no context here. That's why we use percentages to give that context.
It's maybe OK to use absolute numbers if you also provide context. For example: "Using this method, we identified 233,857 Wasm modules on desktop and 255,060 on mobile across X% of websites in our crawl".
But ultimately I don't get what the point of this paragraph is. Is it to show the scale of our analysis? Again, I'm not sure that's meaningful without understanding what that scale means.
The description was about the method we used to analyse the source language of Wasm modules offline with our tool. To improve the language usage analysis, we downloaded 233,857 Wasm files, validated them, and populated the final stats. Each of these steps carries risks. For example, when downloading from a third-party server with request parameters, the server may return a 404 or a security exception, it may no longer have that particular resource, it may not respond as we expect, or it may return a WebAssembly module that is invalid with respect to the Wasm spec.
Let's resolve your concern by adding more detail and rephrasing the text above with percentages.
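For readers following the thread, the cheapest form of the validation step mentioned above is a check of the 8-byte preamble that every Wasm binary starts with (the magic bytes `\0asm` followed by a little-endian version of 1). The sketch below is an assumption about what such a check could look like, not the code of the offline tool; `has_valid_wasm_header` is a hypothetical name.

```python
import struct

WASM_MAGIC = b"\x00asm"  # first four bytes of every Wasm binary
WASM_VERSION = 1         # binary format version, stored as little-endian u32


def has_valid_wasm_header(body: bytes) -> bool:
    """Check only the 8-byte preamble; this is not full spec validation."""
    if len(body) < 8 or body[:4] != WASM_MAGIC:
        return False
    (version,) = struct.unpack("<I", body[4:8])
    return version == WASM_VERSION
```

This catches the failure modes listed above where a server returns an HTML error page or a truncated body instead of a module. A module that passes this check can still be invalid, so a full validator would be needed for anything stricter.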
}}

These WebAssembly modules differ considerably in size: the smallest are just a few kilobytes, while the largest is 228.102 MB on the desktop client and 166.415 MB on the mobile client.
When examining uncompressed sizes, we observe that while the median module remains lightweight at approximately 30 KB on both platforms, binaries at the 90th percentile are significantly heavier on desktop (897 KB) than on mobile (756 KB).
Is Wasm not typically binary? What compression does it use? Gzip? Brotli? I thought those only really worked on text resources (maybe explaining the relatively small difference between these two charts?) and that binaries shouldn't be further compressed for sending on the wire?
Barry, Wasm is fundamentally a binary instruction format, not a text format. The format sent over the wire is the compact binary format (wasm).
It has built-in compression.
It is observed that applying Gzip or Brotli to an already highly optimized binary file yields minimal additional size savings (often less than 10%, sometimes negligible). Because of this minimal benefit, Wasm files are typically served with an identity encoding, meaning no further general-purpose compression is applied.
Is it worth commenting on this?
At the moment we show raw (which I think should be renamed "compressed", as per the other conversation, to make it more obvious) and uncompressed, but never really explain that the files aren't really expected to be compressed, and that this explains why the charts aren't that different. By presenting both, we almost imply the opposite: that we think it is something to be looked at and expect there to be interesting insights here.
So I went back to the 2021 chapter that we seem to be basing this on and it has this to say:
"...one of the benefits of Wasm bytecode is that it’s highly compressible, and size over the wire is what matters for download speed and billing reasons. Let’s check sizes of raw response bodies as sent by servers instead:"
That's a good narrative for why they included both.
But the 2025 chapter doesn't seem to have a similar narrative and just says "here's the raw bytes, here's the compressed bytes" and I'm not sure what the reader is supposed to take away from that.
Additionally it contradicts what you say about compression (and what I believed too).
Again, as a reader I'm confused as to what to take away from this section?
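Since the two positions in this thread disagree on how compressible Wasm is, one way to settle it would be to measure the savings directly on the downloaded modules. The following is a minimal sketch of such a measurement using gzip from the Python standard library; the helper name `gzip_savings` is hypothetical, and Brotli would need a third-party package, so it is left out here.

```python
import gzip


def gzip_savings(body: bytes, level: int = 9) -> float:
    """Return the fraction of bytes saved by gzip (0.4 means 40% smaller).

    The result is negative when gzip makes the payload larger.
    """
    if not body:
        return 0.0
    compressed = gzip.compress(body, compresslevel=level)
    return 1.0 - len(compressed) / len(body)
```

Running this over the uncompressed module bodies and reporting the median saving would give the chapter a concrete number to support whichever narrative it ends up with.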
- **Library : Microsoft (23.2%)** represents the massive footprint of the Microsoft ecosystem on the web, primarily driven by Blazor WebAssembly. Blazor allows developers to build interactive web UIs using C# and .NET instead of JavaScript. The high percentage reflects many enterprise and business applications that have been ported to the web using Microsoft's specialized Wasm runtime for the .NET framework.

- **Library : RXEngine (6.2%)** is a more specialized entry, often associated with high-performance execution engines used for specific industries like gaming or advanced data processing. While more niche than the top two, its 6.2% share indicates it is a popular choice for developers who need a pre-built, optimized engine to handle computationally intensive tasks (such as real-time analytics or complex UI interactions) without building the entire infrastructure from scratch.
We find that System (43%), Microsoft (23%), RXEngine (6%), and Dotnet (6%) are the most popular libraries or frameworks used in WebAssembly modules, indicating Microsoft's dominance within this ecosystem, driven specifically by the Dotnet and Blazor frameworks.
What is System btw? And who owns that? Seems weird not to mention the biggest one - or is that Microsoft too?
Links for all these would also be helpful as Googling "System Wasm" for example unsurprisingly isn't helpful.
Finally why do you mention "Dotnet" and "Blazor" but not Microsoft which is 23%?
Yes, I agree with you about the googling :) because the term "System" is extremely common in project names, particularly within IT, software development, engineering, and enterprise contexts. It is frequently used to denote a comprehensive, integrated solution rather than a single tool.
Microsoft technologies also use System as their default namespace. It is likely that most of these modules are owned by Microsoft, but we cannot confirm with the current analysis that all the "System" Wasm modules are.
To go a step further and confirm whether the System Wasm modules belong to Microsoft technology, we would need to cross-check with other analysis techniques. For example, what is the source language, and does it correlate with Microsoft technologies? Or we could do a binary comparison against various versions of Microsoft's existing System Wasm libraries.
We will add links as per your suggestion.
Dotnet may also include Mono stacks, stripped-down or obfuscated Blazor stacks, native C++ stacks, and other third-party technologies. To understand the usage of each specific technology, i.e. Dotnet or Blazor, we separated them here. If we grouped them under a single Microsoft technology, we would lose the detailed analysis. However, it would also be good to have stats for overall Microsoft technology usage as the bigger picture.
OK, so System, Microsoft, and DotNet are ALL Microsoft (with caveats about System maybe not being all Microsoft)? If so, that explains why you say Microsoft is such a big player, but I definitely did not get that from the reading. Maybe that's obvious to Wasm experts, but I think it could be made much clearer.
Co-authored-by: Barry Pollard <barrypollard@google.com>
Removed redundant sentence from introduction
Updated the data collection section to reflect the identification of WebAssembly modules on sites analyzed.
Please find the link here: almanac-wasm-stats. Later on, we will raise a pull request to merge the stack under HTTPArchive and enable it in the pipeline with the help of Patrick and Barry.
I have edited the chapter. Could you folks please take a look at it and make further adjustments?
@nimeshgit: we still need the tool "almanac-wasm". Can you please provide it to us?