Using binary search instead of sequential search for rowData #2129

Merged: 3 commits merged into qax-os:master on May 12, 2025

@shcabin shcabin commented May 11, 2025

PR Details

Using binary search instead of sequential search for rowData
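As a rough illustration of the idea in the title (not the code from this PR), the sketch below contrasts a sequential scan with a binary search over rows keyed by row number. The `row` type and both function names are hypothetical stand-ins, and the binary variant assumes the slice is sorted ascending by `R`.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical stand-in for a worksheet row record; R is its row number.
type row struct {
	R int
}

// findRowLinear scans from the front until it finds row number r.
// It returns the slice index, or -1 if no such row exists.
func findRowLinear(rows []row, r int) int {
	for i := range rows {
		if rows[i].R == r {
			return i
		}
	}
	return -1
}

// findRowBinary assumes rows is sorted ascending by R and uses sort.Search
// to locate the first row whose number is >= r.
func findRowBinary(rows []row, r int) int {
	i := sort.Search(len(rows), func(i int) bool { return rows[i].R >= r })
	if i < len(rows) && rows[i].R == r {
		return i
	}
	return -1
}

func main() {
	rows := []row{{R: 1}, {R: 2}, {R: 5}, {R: 9}}
	fmt.Println(findRowLinear(rows, 5), findRowBinary(rows, 5)) // 2 2
	fmt.Println(findRowLinear(rows, 3), findRowBinary(rows, 3)) // -1 -1
}
```

Both functions return the same answer on sorted input; the binary version needs O(log n) comparisons per lookup instead of O(n).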

Description

Related Issue

#1961

Motivation and Context

How Has This Been Tested

Performance testing was conducted on AMD64 under WSL using Go's benchmark tool. The test mentioned in #1961 was run 8 times, with the following results:

goos: linux
goarch: amd64
pkg: github.com/xuri/excelize/v2
cpu: 12th Gen Intel(R) Core(TM) i5-12400F
               │ base.txt   │             new.txt               │
               │   sec/op   │   sec/op    vs base               │
GetFormulas-12   3.951 ± 4%   1.382 ± 3%  -65.02% (p=0.000 n=8)

               │  base.txt    │           new.txt             │
               │     B/op     │     B/op      vs base         │
GetFormulas-12   815.1Mi ± 0%   815.1Mi ± 0%  ~ (p=0.721 n=8)

               │  base.txt   │           new.txt            │
               │  allocs/op  │  allocs/op   vs base         │
GetFormulas-12   10.87M ± 0%   10.87M ± 0%  ~ (p=0.940 n=8)
(The version against which the issue was submitted takes 27 seconds in the same environment.)
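For readers checking the "vs base" column, the relative change can be reproduced from the rounded sec/op figures. This is a small sketch: `percentChange` is a hypothetical helper, and the real tool works from the unrounded samples, so agreement to two decimals is not guaranteed in general.

```go
package main

import "fmt"

// percentChange returns the relative change from base to after, in percent.
// Negative values mean the new measurement is smaller (faster).
func percentChange(base, after float64) float64 {
	return (after - base) / base * 100
}

func main() {
	// Rounded sec/op values taken from the table above.
	fmt.Printf("%.2f%%\n", percentChange(3.951, 1.382)) // -65.02%
}
```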

code:


func BenchmarkGetFormulas(b *testing.B) {
	for i := 0; i < b.N; i++ {
		file, err := OpenFile("4600Formulas.xlsx")
		if err != nil {
			panic(err)
		}
		begin := time.Now()
		cc := 0
		for row := 1; row <= 4600; row++ { // 4600
			for col := 1; col <= 55; col++ { //55
				axis, _ := CoordinatesToCellName(col, row)
				str, err := file.GetCellFormula("Sheet1", axis)
				if err != nil {
					panic(err)
				}
				if str != "" {
					cc++
				}
			}
		}
		assert.Equal(b, 45990, cc)
		b.Logf("GetCellFormula time cost:%dms cc:%v", time.Since(begin).Milliseconds(), cc)
	}
}
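The access pattern of this benchmark (every cell in a 4600 by 55 grid) can be modeled without the workbook. The sketch below counts row-number comparisons only, under the loudly stated assumption that each cell lookup first locates its row in a sorted slice; it says nothing about the library's real internals or about wall-clock time, and all names in it are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

const (
	numRows = 4600 // rows visited by the benchmark loop
	numCols = 55   // columns visited per row
)

// linearSteps counts the comparisons of one front-to-back scan for target.
func linearSteps(rowNums []int, target int) int {
	steps := 0
	for _, r := range rowNums {
		steps++
		if r == target {
			break
		}
	}
	return steps
}

// binarySteps counts the predicate calls sort.Search makes for target.
// rowNums must be sorted ascending.
func binarySteps(rowNums []int, target int) int {
	steps := 0
	sort.Search(len(rowNums), func(i int) bool {
		steps++
		return rowNums[i] >= target
	})
	return steps
}

func main() {
	rowNums := make([]int, numRows)
	for i := range rowNums {
		rowNums[i] = i + 1
	}
	var linear, binary int
	for _, target := range rowNums {
		// Assumption: one row lookup per visited cell, so numCols per row.
		linear += numCols * linearSteps(rowNums, target)
		binary += numCols * binarySteps(rowNums, target)
	}
	fmt.Println("linear comparisons:", linear)
	fmt.Println("binary comparisons:", binary)
}
```

Under this model the sequential scan performs hundreds of millions of comparisons, while the binary search needs at most 13 per lookup. The measured speedup is smaller than that ratio because a lookup involves more work than locating the row.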

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

shcabin commented May 11, 2025

This PR depends on 'SheetData.Row[i].R' being ordered.
The first commit does not have this dependency and still reduces the time by 55% compared to the previous version (3.95 seconds vs. 1.762 seconds).
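The ordering dependency described in this comment matters because a binary search over unordered rows can miss rows that are present. One hypothetical way to guard against that (not something this PR is known to do) is to check the order once and fall back to a sequential scan when it does not hold. The `row` and `sheet` types and their methods are illustrative assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical stand-in for a worksheet row; R is its row number.
type row struct{ R int }

// sheet caches whether its rows are ordered by R so the check runs only once.
type sheet struct {
	rows    []row
	ordered bool
}

// newSheet records up front whether rows are sorted ascending by R.
func newSheet(rows []row) *sheet {
	ordered := sort.SliceIsSorted(rows, func(a, b int) bool {
		return rows[a].R < rows[b].R
	})
	return &sheet{rows: rows, ordered: ordered}
}

// find returns the index of row number r, or -1 if it is absent.
// It uses binary search only when the ordering precondition holds.
func (s *sheet) find(r int) int {
	if s.ordered {
		i := sort.Search(len(s.rows), func(i int) bool { return s.rows[i].R >= r })
		if i < len(s.rows) && s.rows[i].R == r {
			return i
		}
		return -1
	}
	for i := range s.rows {
		if s.rows[i].R == r {
			return i
		}
	}
	return -1
}

func main() {
	sorted := newSheet([]row{{1}, {2}, {5}, {9}})
	shuffled := newSheet([]row{{5}, {1}, {9}, {2}})
	fmt.Println(sorted.ordered, sorted.find(9))     // true 3
	fmt.Println(shuffled.ordered, shuffled.find(2)) // false 3
}
```

The order check costs one linear pass, so it only pays off when it is done once per sheet rather than once per lookup.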

codecov bot commented May 11, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.23%. Comparing base (efdfc52) to head (2650592).
Report is 4 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2129   +/-   ##
=======================================
  Coverage   99.23%   99.23%           
=======================================
  Files          32       32           
  Lines       30196    30236   +40     
=======================================
+ Hits        29965    30005   +40     
  Misses        153      153           
  Partials       78       78           
Flag Coverage Δ
unittests 99.23% <100.00%> (+<0.01%) ⬆️


@xuri xuri added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 11, 2025

@xuri xuri left a comment

Thanks for your pull request. I've left a comment.

xuri commented May 11, 2025

This PR depends on 'SheetData.Row[i].R' being ordered. The first commit does not have this dependency and still reduces the time by 55% compared to the previous version (3.95 seconds vs. 1.762 seconds).

This is awesome! Do these changes increase memory usage? If so, by how much?

shcabin commented May 11, 2025

This is awesome! Do these changes increase memory usage? If so, by how much?

No change in memory usage.

@xuri xuri moved this to Performance in Excelize v2.9.1 May 12, 2025

@xuri xuri left a comment

LGTM, thanks for your contribution.

@xuri xuri merged commit 2c3d13f into qax-os:master May 12, 2025
17 checks passed