Try optimizing file processing. #60
Comments
Thanks for reaching out, @tdresser. You are likely right that any slowdowns would come from the parser step. I'm by no means attached to the current parser, xml2js. It was most likely picked because of its significant usage and its TypeScript support. If there's something faster out there, I'm definitely not opposed to switching things up.
File downloaded from https://green-button.github.io/samples/ to test issue #40.
So it looks like with a 26 MB test file, GitHub Actions takes 3.5s and my development machine takes 2.5s. Are your results similar?
I think I'm using the term "Parsing" inaccurately here. What I'm trying to refer to is "Time from file upload until it's done processing and I can see the data on the dashboard." I guess there are a bunch of database writes in there? Oh, that likely means this issue is on the wrong repo. I'm happy to move this over if there is anything worth looking at.
OK, I think I understand now. I'm going to move this issue to the EMILE repository, as the parser doesn't do the uploading, database writing, or visualizing.
Thanks for moving. I've done a little bit of poking around, and I think I've got a fairly minimal change in mind that should improve things a bunch, assuming I'm not completely misunderstanding how this works! The vast majority of the time (based on a couple performance traces) is spent in IIUC, this needs to be run after any asset is updated. Does that make sense? I guess today you might be able to view data from a partially processed file, and this would eliminate that possibility, but on my machine, I think this would be a ~10x speedup. Am I understanding this correctly?
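One possible reading of the change described above is to run the expensive post-update step once per uploaded file instead of once per updated asset. The sketch below only illustrates that idea under stated assumptions: `refreshDerivedData`, `importPerRecord`, and `importBatched` are hypothetical names, not functions from EMILE, and the in-memory array stands in for the database.

```typescript
// Hypothetical stand-in for the expensive step seen in the performance traces.
// It only counts how many times it was invoked.
let refreshCalls = 0;
function refreshDerivedData(): void {
  refreshCalls++;
}

// Assumed current behaviour: the expensive step runs after every record is stored,
// so partially processed data is always viewable.
function importPerRecord(records: number[], store: number[]): void {
  for (const record of records) {
    store.push(record);
    refreshDerivedData();
  }
}

// Assumed proposed behaviour: store everything first, then run the step once.
// Derived data is not up to date until the whole file has been processed.
function importBatched(records: number[], store: number[]): void {
  for (const record of records) {
    store.push(record);
  }
  refreshDerivedData();
}
```

With N records, the first variant calls the expensive step N times and the second calls it once, which is the trade-off mentioned above: a faster import in exchange for not seeing data from a partially processed file.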
Parsing a 10 MB XML file is pretty slow currently.
I don't have a good intuition for which part is slow. Do you think there might be some low-hanging fruit here? Any hunches?
The first step would be profiling.
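As a rough starting point for that profiling, the stages between upload and dashboard could be timed separately to see whether parsing or the later steps dominate. This is a minimal sketch, not the project's code: `timeStage`, `parseXml`, `writeRows`, and `processUpload` are hypothetical names, and the two stage functions are trivial placeholders for the real parser and database writes. It assumes a runtime with a global `performance` object (Node 16+ or a browser).

```typescript
// Accumulated wall-clock time per named stage, in milliseconds.
const stageTotals = new Map<string, number>();

// Runs `fn`, adds its elapsed time to the running total for `stage`,
// and returns whatever `fn` returned. The time is recorded even if `fn` throws.
async function timeStage<T>(stage: string, fn: () => Promise<T> | T): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    const elapsed = performance.now() - start;
    stageTotals.set(stage, (stageTotals.get(stage) ?? 0) + elapsed);
  }
}

// Hypothetical placeholder for the XML parsing step.
async function parseXml(xml: string): Promise<string[]> {
  return xml.split("\n");
}

// Hypothetical placeholder for the database writes; returns the number of rows written.
async function writeRows(rows: string[]): Promise<number> {
  return rows.length;
}

// Times each stage of one upload separately.
async function processUpload(xml: string): Promise<number> {
  const rows = await timeStage("parse", () => parseXml(xml));
  return timeStage("write", () => writeRows(rows));
}
```

After a run, `stageTotals` holds one entry per stage, which gives a first hint about where the time goes before reaching for a full performance trace.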