perf(util/gconv): Add gconv.Struct cache logic #3673

Open
wants to merge 27 commits into
base: master

Member

@wln32 wln32 commented Jul 1, 2024

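  1. Functions such as gconv.Scan and gconv.Struct (anything that converts to a struct type) now cache the parsed struct information, so the same type does not have to be parsed repeatedly.
  2. Fields of common types are converted directly with functions such as gconv.Int and gconv.String, which avoids repeated type checks.
  3. The cache is controlled by the UseConvCacheExperiment function and is enabled by default. If it causes a bug after being enabled, turn it off; this allows a smooth transition.
  4. Compared with the previous improvement in perf(util/gconv): improve performance for struct converting #3412, performance is much better: it runs roughly 4 times faster and uses far less memory.
  5. For cases that need fuzzy matching, an extra field records the key found last time, so in most cases fuzzy matching is no longer needed.
  6. The assignment strategy is chosen according to the length of paramsMap:
    1. If paramsMap has fewer entries than the struct has fields, loop over paramsMap.
    2. Otherwise, loop over the struct field map.
      The reason is that when the length of paramsMap and the number of struct fields differ greatly, looping over the larger side wastes work.
      For example, take the following code: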
type structType2 struct {
	Name  string
	Score int
	Age   int
	ID    int

	Name1  string
	Score1 int
	Age1   int
	ID1    int
}
m := map[string]any{
	"Name":  "qiang",
	"Score": 600,
	"Age":   98,
	"ID":    199,
}

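If the loop runs over the struct field map, it takes len(struct field map) iterations. Even with an extra counter that tracks the number of assignments and exits at the right moment, there is no guarantee of exiting after 4 assignments: because map iteration is unordered, the first three fields may match right away while the last field, ID, is only matched on the seventh iteration. The iterations in between find no match and still fall back to fuzzy matching, so looping over paramsMap is the better choice here.
The opposite case works the same way.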
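
The choice described above can be sketched as follows. This is a minimal illustration and not the PR's code: bindSmallerSide and the plain map standing in for the struct field map are hypothetical names, and only exact key matches are handled (no fuzzy matching).

```go
package main

import "fmt"

// bindSmallerSide copies every value from params whose key names a known
// field into dst. It loops over whichever collection is smaller and returns
// the number of loop iterations it performed.
// Hypothetical helper: fields stands in for the cached struct field map.
func bindSmallerSide(params map[string]any, fields map[string]struct{}, dst map[string]any) int {
	iterations := 0
	if len(params) < len(fields) {
		// Fewer params than fields: drive the loop from params.
		for key, value := range params {
			iterations++
			if _, ok := fields[key]; ok {
				dst[key] = value
			}
		}
		return iterations
	}
	// Otherwise drive the loop from the field set.
	for name := range fields {
		iterations++
		if value, ok := params[name]; ok {
			dst[name] = value
		}
	}
	return iterations
}

func main() {
	fields := map[string]struct{}{
		"Name": {}, "Score": {}, "Age": {}, "ID": {},
		"Name1": {}, "Score1": {}, "Age1": {}, "ID1": {},
	}
	params := map[string]any{"Name": "qiang", "Score": 600, "Age": 98, "ID": 199}
	dst := map[string]any{}
	n := bindSmallerSide(params, fields, dst)
	fmt.Println("iterations:", n, "assigned:", len(dst))
}
```

With the eight-field struct and four-entry map from the example, the loop body runs four times regardless of map iteration order.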


Below are the performance numbers for this PR; they fluctuate somewhat between runs.

Performance of the latest commit:
Benchmark_doStruct_Fields8_Basic_MapToStruct
Benchmark_doStruct_Fields8_Basic_MapToStruct-8           3223788              1882 ns/op               9 B/op          0 allocs/op


Before commit [26c14b3]:
Benchmark_doStruct_Fields8_Basic_MapToStruct
Benchmark_doStruct_Fields8_Basic_MapToStruct-8            976039              6240 ns/op            1310 B/op         10 allocs/op

---
Previous PR #3412:
Benchmark_Struct_my_Fields8_Basic_MapToStruct       
Benchmark_Struct_my_Fields8_Basic_MapToStruct-8          482611             11708 ns/op            2074 B/op         44 allocs/op

Benchmark_Struct_gf_Fields8_Basic_MapToStruct                                                                                     
Benchmark_Struct_gf_Fields8_Basic_MapToStruct-8           214789             27090 ns/op            7668 B/op        138 allocs/op

@gqcn gqcn added the awesome It's awesome! We keep watching. label Jul 1, 2024
wln32 and others added 9 commits July 2, 2024 08:07
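2. Modify or add some comments
3. Remove unused code
2. Add a fast check for map[string]any on the params argument of doStruct to reduce the number of memory allocations
2. Record the last matched key in an extra field during fuzzy matching to speed up matching
2. Add conversion functions for two common types, bool and []byte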
@wln32 wln32 changed the title util/gconv: Add gconv.Struct cache logic perf(util/gconv): Add gconv.Struct cache logic Jul 5, 2024
Member

gqcn commented Jul 17, 2024

@wln32 Great performance optimization idea 👍 I just took a quick look and the overall implementation logic looks fine. I need to go through the details carefully and learn from them, which will take some time.

elemType = pointerElemReflectValue.Type()
toBeConvertedFieldNameToInfoMap = map[string]toBeConvertedFieldInfo{} // key=elemFieldName
// Indicates that those values have been used and cannot be reused.
usedParamsKeyOrTagNameMap = poolGetUsedParamsKeyOrTagNameMap()
Member

I'm afraid this usedParamsKeyOrTagNameMap cannot be reused globally through a pool. The map should be different for every struct conversion, so reusing it is likely to cause problems.

Member Author

> I'm afraid this usedParamsKeyOrTagNameMap cannot be reused globally through a pool. The map should be different for every struct conversion, so reusing it is likely to cause problems.

@gqcn It can be reused without problems: the map just needs to be cleared after each use, as I wrote in the comment there.

Member

@wln32 I looked at your code: this usedParamsKeyOrTagNameMap is put back into the pool by defer without being cleared. And if clearing is added, reusing it here no longer makes sense; it would be better to allocate a new empty map.
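
For reference, the clear-before-Put pattern being debated looks roughly like this. It is a sketch only: usedKeyPool, getUsedKeys, putUsedKeys and countDistinct are hypothetical names, not the PR's poolGetUsedParamsKeyOrTagNameMap implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// usedKeyPool hands out maps that track which params keys were consumed.
// Hypothetical stand-in for the pool discussed in the review.
var usedKeyPool = sync.Pool{
	New: func() any { return make(map[string]struct{}, 8) },
}

// getUsedKeys returns a map from the pool; it is empty as long as every
// caller returns maps through putUsedKeys.
func getUsedKeys() map[string]struct{} {
	return usedKeyPool.Get().(map[string]struct{})
}

// putUsedKeys deletes every entry before returning the map to the pool,
// so state never leaks from one conversion into the next.
func putUsedKeys(m map[string]struct{}) {
	for k := range m {
		delete(m, k)
	}
	usedKeyPool.Put(m)
}

// countDistinct uses a pooled map as scratch space for a single call.
func countDistinct(keys []string) int {
	used := getUsedKeys()
	defer putUsedKeys(used)
	for _, k := range keys {
		used[k] = struct{}{}
	}
	return len(used)
}

func main() {
	fmt.Println(countDistinct([]string{"Name", "Age", "Name"}))
	fmt.Println(countDistinct([]string{"ID"}))
}
```

Whether the clearing loop costs less than allocating a fresh map is exactly the open question in this thread; a benchmark on realistic key counts would be needed to settle it.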

convCacheExperiment = true
)

func UseConvCacheExperiment(b bool) {
Member

Exposing this switch as an option for users to control is not recommended. Please remove this method.

Member Author

> Exposing this switch as an option for users to control is not recommended. Please remove this method.

ok

// For example:
// if the type of the field is int, then directly cache the [gconv.Int] function
convFunc func(from any, to reflect.Value)
}
Member

Please describe the purpose of each field of the convertFieldInfoBase struct in detail, in Chinese. I will translate the descriptions into English.

// In this case, name shall prevail
isField bool
removeSymbolsFieldName string
}
Member

Please describe the purpose of each field of the convertFieldInfo struct in detail, in Chinese. I will translate the descriptions into English.

Member Author

> Please describe the purpose of each field of the convertFieldInfo struct in detail, in Chinese. I will translate the descriptions into English.

ok
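
The removeSymbolsFieldName field shown above, combined with the PR description's point about remembering the last matched key, suggests a lookup along the following lines. This is a guess at the shape, not the PR's code: fieldInfo, lastFuzzyKey, lookup and the ASCII-only removeSymbols are all hypothetical, and the memo write is not goroutine-safe as written.

```go
package main

import (
	"fmt"
	"strings"
)

// removeSymbols lowercases s and keeps only ASCII letters and digits.
// Assumption: the real normalisation may treat other characters differently.
func removeSymbols(s string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(s) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// fieldInfo is a hypothetical, trimmed-down cached field description.
type fieldInfo struct {
	name                   string
	removeSymbolsFieldName string // precomputed once per field
	lastFuzzyKey           string // params key that matched last time
}

func newFieldInfo(name string) *fieldInfo {
	return &fieldInfo{name: name, removeSymbolsFieldName: removeSymbols(name)}
}

// lookup tries the exact field name, then the remembered key, and only then
// falls back to scanning every params key with fuzzy comparison.
func (f *fieldInfo) lookup(params map[string]any) (any, bool) {
	if v, ok := params[f.name]; ok {
		return v, true
	}
	if f.lastFuzzyKey != "" {
		if v, ok := params[f.lastFuzzyKey]; ok {
			return v, true
		}
	}
	for k, v := range params {
		if removeSymbols(k) == f.removeSymbolsFieldName {
			f.lastFuzzyKey = k // remember for the next conversion
			return v, true
		}
	}
	return nil, false
}

func main() {
	f := newFieldInfo("UserName")
	params := map[string]any{"user_name": "qiang", "age": 98}
	v, ok := f.lookup(params)
	fmt.Println(v, ok, f.lastFuzzyKey)
}
```

After the first call, later inputs that use the same key spelling are resolved with a single map lookup instead of a full scan.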

fieldNamesMap map[string]*convertFieldInfo
}

func (structInfo *convertStructInfo) NoFields() bool {
Member

HasNoFields


func (structInfo *convertStructInfo) GetFieldInfo(fieldName string) *convertFieldInfo {
v := structInfo.fieldAndTagsMap[fieldName]
return v
Member

return structInfo.fieldAndTagsMap[fieldName]

structField reflect.StructField
// Field index used to store duplicate names
// Generally used for nested structures
otherFieldIndex [][]int
Member

Why define a multi-dimensional array?

Member Author

> Why define a multi-dimensional array?

Normally []int is enough, and it also covers nested fields. The reason for using [][]int is that a field of a nested struct may have the same name as a field of the outer struct or of another nested struct.
For example:

type Name struct {
	LastName  string
	FirstName string
}

type User struct {
	Name
	LastName  string
	FirstName string
}

In this case only two fields, LastName and FirstName, are stored, and different indexes are used to represent the different fields.

Member

gqcn commented Aug 8, 2024

I'm still reviewing this. ❤️

gqcn and others added 2 commits August 14, 2024 21:16
…truct converting (#3)

* fix(cmd/gf): fix command `gf up` with `-u` option upgrading packages indirectly required would fail with higher version of go required (gogf#3687)

* up

* up

* refactor(util/gconv): add struct&field cache to improve performance for struct converting

---------

Co-authored-by: 海亮 <739476267@qq.com>

sonarcloud bot commented Aug 18, 2024

Labels
awesome It's awesome! We keep watching.

4 participants