Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Enable Nats-server as a optional MQ for Milvus #24445

Merged
merged 1 commit into from
Jun 7, 2023

Conversation

chyezh
Copy link
Contributor

@chyezh chyezh commented May 26, 2023

Enable Nats-server as a optional MQ for Milvus.

prototype and unit test by @yiwangdr #23606
more unit test, bug fix and pass e2e.
Related issue: #23611

@sre-ci-robot sre-ci-robot added the area/dependency Pull requests that update a dependency file label May 26, 2023
@sre-ci-robot sre-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines. label May 26, 2023
@mergify mergify bot added the dco-passed DCO check passed. label May 26, 2023
@mergify
Copy link
Contributor

mergify bot commented May 26, 2023

@chyezh ut workflow job failed, comment rerun ut can trigger the job again.

@mergify
Copy link
Contributor

mergify bot commented May 26, 2023

@chyezh E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@chyezh chyezh force-pushed the feature_enable_natsmq branch from bc5370b to bf349a9 Compare May 26, 2023 10:04
@mergify
Copy link
Contributor

mergify bot commented May 26, 2023

@chyezh ut workflow job failed, comment rerun ut can trigger the job again.

@chyezh chyezh force-pushed the feature_enable_natsmq branch from bf349a9 to f543ff9 Compare May 26, 2023 10:20
@mergify
Copy link
Contributor

mergify bot commented May 26, 2023

@chyezh ut workflow job failed, comment rerun ut can trigger the job again.

@codecov
Copy link

codecov bot commented May 26, 2023

Codecov Report

Merging #24445 (e9f22af) into master (852e5ce) will increase coverage by 0.45%.
The diff coverage is 82.24%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #24445      +/-   ##
==========================================
+ Coverage   82.18%   82.63%   +0.45%     
==========================================
  Files         777      776       -1     
  Lines      103129   102013    -1116     
==========================================
- Hits        84756    84301     -455     
+ Misses      15358    14678     -680     
- Partials     3015     3034      +19     
Impacted Files Coverage Δ
pkg/mq/msgstream/common_mq_factory.go 57.14% <57.14%> (ø)
internal/util/dependency/factory.go 64.38% <58.13%> (+24.38%) ⬆️
internal/mq/msgstream/mq_factory.go 75.00% <75.00%> (+13.46%) ⬆️
pkg/mq/msgstream/mqwrapper/nmq/nmq_consumer.go 79.78% <79.78%> (ø)
pkg/mq/msgstream/mqwrapper/nmq/nmq_producer.go 80.95% <80.95%> (ø)
pkg/mq/msgstream/mqwrapper/nmq/nmq_client.go 81.25% <81.25%> (ø)
pkg/mq/msgstream/mqwrapper/nmq/nmq_server.go 85.36% <85.36%> (ø)
pkg/mq/msgstream/mqwrapper/nmq/nmq_message.go 86.95% <86.95%> (ø)
pkg/util/paramtable/service_param.go 98.45% <97.89%> (-0.10%) ⬇️
internal/mq/msgstream/mqwrapper/rmq/rmq_client.go 92.30% <100.00%> (+0.81%) ⬆️
... and 2 more

... and 217 files with indirect coverage changes

@chyezh chyezh force-pushed the feature_enable_natsmq branch from f543ff9 to 31eebdf Compare May 27, 2023 03:44
@mergify
Copy link
Contributor

mergify bot commented May 27, 2023

@chyezh ut workflow job failed, comment rerun ut can trigger the job again.

@chyezh chyezh force-pushed the feature_enable_natsmq branch from 31eebdf to 49577f8 Compare May 27, 2023 04:42
@mergify
Copy link
Contributor

mergify bot commented May 27, 2023

@chyezh E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@chyezh
Copy link
Contributor Author

chyezh commented May 27, 2023

/run-cpu-e2e

@mergify mergify bot added the ci-passed label May 27, 2023
@chyezh chyezh requested review from jaime0815 and yiwangdr May 29, 2023 01:58
// Wait for server to be ready for connections
if !Nmq.ReadyForConnections(initializeTimeout) {
finalErr = fmt.Errorf("nmq is not ready with timeout %s", initializeTimeout)
log.Warn("nmq is not ready", zap.Duration("timeout", initializeTimeout), zap.Reflect("options", opts))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might cause some exception behaviors if nmq server is not ready and skip it within a singleton.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log.Fatal is likely better here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new commit changed:
log.Fatal with error field is added into MustInitNatsMQ


f.msgStreamFactory = f.initMQRemoteService(params)
func (f *DefaultFactory) initMQ(standalone bool, params *paramtable.ComponentParam) error {
mqType := mustSelectMQType(standalone, params.MQCfg.Type.GetValue(), mqEnable{params.RocksmqEnable(), params.NatsmqEnable(), params.PulsarEnable(), params.KafkaEnable()})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just invoke validateMQType to check parameters, then create msgstream factory, it is more simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some rule is applied if mqType = default, validateMQType is not enough.

return errors.Newf("mq.type %s is invalid", mqType)
}
if !standalone && (mqType == mqTypeRocksmq || mqType == mqTypeNatsmq) {
return errors.Newf("mq %s is only valid in standalone mode")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mq type

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mq.type is the field of milvus.yaml.

func (p *MQConfig) Init(base *BaseTable) {
p.Type = ParamItem{
Key: "mq.type",
Version: "2.2.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be 2.3.0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new commit changed:
all field of this commit in yaml is changed to 2.3.0

}

// Init sets up a new NatsmqConfig instance using the provided BaseTable
func (r *NatsmqConfig) Init(base *BaseTable) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set version to 2.3.0 for all new parameters

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OOC, where do we read this "version" field?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just indicates this parameter bringing into Milvus with this version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, we probably don't need to have it in every single param. A global param is good enough.

For this pr, we can have a local/global variable instead of specifying "2.2.0"/"2.3.0" per param.

r.Server.MaxMemoryStore = ParamItem{
Key: "natsmq.server.maxMemoryStore",
Version: "2.2.0",
DefaultValue: "536870912",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

256 MB?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new commit changed:
memory storage is removed, it's not used in natsmq.

r.Server.MaxFileStore = ParamItem{
Key: "natsmq.server.maxFileStore",
Version: "2.2.0",
DefaultValue: "68719476736",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it too large?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new commit changed:
16G.

@chyezh
Copy link
Contributor Author

chyezh commented May 30, 2023

move nats implementation into pkg module

commit changed:

  1. move nats implementation into pkg module.
  2. recover some orignal code of rmq implementation.
  3. review is replied, most of the suggestion is applied.

@chyezh chyezh force-pushed the feature_enable_natsmq branch from 8431f07 to f26e1b8 Compare May 30, 2023 09:49
@mergify
Copy link
Contributor

mergify bot commented May 30, 2023

@chyezh E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@chyezh
Copy link
Contributor Author

chyezh commented May 30, 2023

/run-cpu-e2e

@mergify mergify bot added the ci-passed label May 30, 2023
@chyezh chyezh force-pushed the feature_enable_natsmq branch from f26e1b8 to c6b4e89 Compare May 30, 2023 14:54
@mergify mergify bot added ci-passed and removed ci-passed labels May 30, 2023
@yiwangdr
Copy link
Contributor

move nats implementation into pkg module

commit changed:

  1. move nats implementation into pkg module.
  2. recover some orignal code of rmq implementation.
  3. review is replied, most of the suggestion is applied.

Thanks, is it ready for review now?

@chyezh chyezh requested review from jaime0815 and yiwangdr May 31, 2023 01:54
@chyezh
Copy link
Contributor Author

chyezh commented May 31, 2023

/assign @czs007

Signed-off-by: yiwangdr <yiwangdr@gmail.com>

bug fixup, configurable natsmq, add unittest, pass e2e.

Signed-off-by: chyezh <ye.zhen@zilliz.com>

move natsmq to pkg project

Signed-off-by: chyezh <ye.zhen@zilliz.com>
@chyezh chyezh force-pushed the feature_enable_natsmq branch from c6b4e89 to e9f22af Compare June 6, 2023 12:34
@mergify mergify bot removed the ci-passed label Jun 6, 2023
@mergify
Copy link
Contributor

mergify bot commented Jun 6, 2023

@chyezh E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@chyezh
Copy link
Contributor Author

chyezh commented Jun 6, 2023

/run-cpu-e2e

@mergify
Copy link
Contributor

mergify bot commented Jun 6, 2023

@chyezh E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@chyezh
Copy link
Contributor Author

chyezh commented Jun 6, 2023

/run-cpu-e2e

@mergify mergify bot added the ci-passed label Jun 6, 2023
@czs007
Copy link
Collaborator

czs007 commented Jun 7, 2023

/approve
/lgtm

@sre-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chyezh, czs007, yiwangdr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit f97127a into milvus-io:master Jun 7, 2023
@chyezh chyezh deleted the feature_enable_natsmq branch June 7, 2023 02:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved area/dependency Pull requests that update a dependency file ci-passed dco-passed DCO check passed. lgtm size/XXL Denotes a PR that changes 1000+ lines.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants