[Bug]: After restarting, it is unusually slow to be able to use normally #37654

Open
1 task done
fire717 opened this issue Nov 13, 2024 · 4 comments
Labels
help wanted Extra attention is needed

Comments

@fire717

fire717 commented Nov 13, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: milvusdb/milvus:v2.3.4 (docker)
- Deployment mode(standalone or cluster):standalone 
- MQ type(rocksmq, pulsar or kafka):    
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): Ubuntu 
- CPU/Memory: 256G
- GPU: a6000
- Others:

Current Behavior

After restarting, Milvus does not work: calls to the search and delete APIs fail.
The top command shows Milvus CPU usage ranging from 100% to 400%.

Expected Behavior

Milvus works normally after a restart.

Steps To Reproduce

No response

Milvus Log

ERROR:sanic.error:Exception occurred while handling uri: 'http://10.89.134.52:8777/api/local_doc_qa/delete_files'
Traceback (most recent call last):
  File "handle_request", line 97, in handle_request
  File "/workspace/qanything_local/qanything_kernel/qanything_server/handler.py", line 273, in delete_docs
    milvus_kb.delete_files(file_ids)
  File "/workspace/qanything_local/qanything_kernel/connector/database/milvus/milvus_client.py", line 284, in delete_files
    self.sess.delete(expr=f"file_id in {files_id}")
  File "/usr/local/lib/python3.10/dist-packages/pymilvus/orm/collection.py", line 563, in delete
    res = conn.delete(self._name, expr, partition_name, timeout=timeout, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pymilvus/decorators.py", line 129, in handler
    raise e from e
  File "/usr/local/lib/python3.10/dist-packages/pymilvus/decorators.py", line 125, in handler
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pymilvus/decorators.py", line 164, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pymilvus/decorators.py", line 104, in handler
    raise e from e
  File "/usr/local/lib/python3.10/dist-packages/pymilvus/decorators.py", line 68, in handler
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pymilvus/client/grpc_handler.py", line 586, in delete
    raise err from err
  File "/usr/local/lib/python3.10/dist-packages/pymilvus/client/grpc_handler.py", line 580, in delete
    check_status(response.status)
  File "/usr/local/lib/python3.10/dist-packages/pymilvus/client/utils.py", line 54, in check_status
    raise MilvusException(status.code, status.reason, status.error_code)
pymilvus.exceptions.MilvusException: <MilvusException: (code=65535, message=failed to search/query delegator 31 for channel by-dev-rootcoord-dml_0_450679063477500466v0: Timestamp lag too large)>

Anything else?

No response

@fire717 fire717 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 13, 2024
@fire717
Author

fire717 commented Nov 13, 2024

docker logs -f keeps printing output like this:

.019945022 data:-0.07271096 data:-0.042803142 data:-0.05212506 data:-0.04420143 data:0.09912307 data:0.040822234 data:-0.057601687 data:0.011778632 data:0.006976873 data:0.0062680193 data:-0.0032553887 data:0.037947975 data:0.03889959 data:-0.10409476 data:0.0228387 data:-0.00045365456 data:0.02285812 data:-0.02808228 data:0.007758555 data:0.008671327 data:-0.028917369 data:0.02837359 data:0.008292623 data:0.03538445 data:-0.056902543 data:0.023615526 data:-0.0050736484 data:0.015284062 data:0.0926754 data:0.00069914386 data:-0.076245524 data:-0.00015513772 data:0.009647215 data:0.047774833 data:0.0019578456 data:0.07473071 data:-0.020547062 data:0.0038550016 data:-0.038744222 data:-0.029713616 data:-0.0024190864 data:0.011545585 data:-0.004231277 data:0.03567576 data:-0.0064865015 data:-0.0028936788 data:0.028956208 data:0.016585246 data:0.028295906 data:0.07756613 data:0.010283241 data:0.025965426 data:0.022392025 data:0.007282749 data:-0.002660631 data:-0.060126375 data:-0.030548703 data:0.024139885 data:-0.02050822 data:0.05966028 data:0.051076345 data:-0.05997101 data:0.039054953 data:0.06758391 data:0.02021691 data:-0.021323888 data:-0.018100059 data:-0.0060446816 data:0.06253453 data:0.043968383 data:-0.015410297 data:-0.024664242 data:-0.022605652 data:-0.06913756 data:0.0030490442 data:0.014963621 data:0.08537324 data:0.052163903 data:0.029791297 data:-0.011458191 data:-0.0416379 data:-0.009307353 data:-0.023965098 data:0.007889644 data:0.011293116 data:-0.020178068 data:0.039598733 data:-0.010419186 data:0.032141197 data:0.033248175 data:0.002944658 data:0.0038331535 data:-0.003643802 data:0.06179655 data:0.007845948 data:-0.03012145 data:0.036180697 data:-0.0020901489 data:0.0562034 data:-0.0026970445 data:0.047269896 data:-0.004884297 data:0.016614377 data:0.0204888 data:-0.073021695 data:-0.007821673 data:-0.003527278 data:0.06121393 data:-0.059077658 data:0.04194863 data:0.0062194676 data:-0.011933997 data:-0.039754096 data:-0.018808912 
data:0.015080146 data:0.052202743 data:-0.042026315 data:-0.023304796 data:0.013021555 data:-0.013380837 data:0.004777483 data:0.009982221 data:0.012623431 data:-0.02604311 data:-0.019721683 data:0.01260401 data:-0.0056659784 data:-0.04447332 data:0.024974974 data:-0.037773192 data:-0.03773435 data:0.017731065 data:-0.03456878 data:0.0145849185 data:-0.055193525 data:-0.020430539 data:-0.033694852 data:-0.028528955 data:-0.02309117 data:0.04361881 data:0.016245386 data:0.035345607 data:-0.016740613 data:-0.031985834 data:0.0038962706 data:0.0456774 data:0.07100195 data:-0.023634948 data:0.027150087 data:-0.014555788 data:-0.08413032 data:-0.01490536 data:0.0144295525 data:-0.025091497 data:-0.015924944 data:0.069215246 data:0.0463377 data:-0.011671819 data:0.014371291 data:0.018828332 data:-0.0004755028 data:0.03454936 data:0.0060495366 data:-0.009923959 data:0.010749337 data:-0.031111903 data:-0.037889715 data:-0.015643345 data:-0.11621325 data:-0.040861078 data:0.041404855 data:0.014118822 data:-0.026217896 data:-0.00805472 data:-0.056941386 data:-0.035598077 data:-0.021770563 data:0.026878199 data:0.02693646 data:0.035539813 data:-0.06533111 data:-0.020566482 data:-0.0316751 data:-0.056048036 data:-0.023576686 data:0.037909135 data:-0.020702427 data:-0.048435133 data:0.01875065 data:-0.016225964 data:0.021246206 data:-0.05383408 data:0.009472429 data:-0.008982057 data:-0.0108755715 data:0.014623759 data:-0.03452994 data:0.01141935 data:0.0060058404 data:-0.0030223408 data:0.030257393 data:-0.0025611 data:-0.02928636 data:-0.005005676 data:0.054572064 data:0.010176428 data:-0.0013643017 data:-0.039346263 data:0.027499659 data:0.032354824 data:0.06459313 data:0.009938524 data:0.060320582 data:-0.003248106 data:-0.035656337 data:0.057912417 data:-0.009084015 data:-0.05830083 data:-0.048668183 data:0.05181433 data:0.05294073 data:0.014303318 data:0.032160617 data:0.028936788 data:0.0391909 data:-0.015536531 data:-0.015478269 data:0.03800624 data:0.048046723 
data:0.03889959 data:0.04416259 data:-0.009054884 data:-0.05130939 data:-0.06338905 data:-0.026489785 data:0.037132308 data:-0.035656337 data:0.051892012 data:-0.02103258 data:-0.043152712 data:-0.059038818 data:-0.029713616 data:-0.0009825642 data:-0.05476627 data:0.011603846 data:-0.0251886 data:-0.015555952 data:-0.002706755 data:0.034316313 data:-0.06878799 data:-0.023052327 data:0.011176592 data:-0.061563503 data:0.03717115 data:-0.02107142 data:0.026761673 data:-0.040589187 data:-0.020721847 data:-0.01112804 data:0.05503816 data:-0.0638163 data:-0.020430539 data:-0.027324874 data:0.00497169 data:0.046648435 data:-0.03254903 data:0.004197291 data:0.005127055 data:0.044123746 data:-0.024528299 data:-0.007476955 data:-0.011011516 data:-0.0009212677 data:-0.025557593 data:-0.01434216 data:0.012186467 data:0.04451216 data:0.002078011 data:0.009137422 data:-0.00614664 data:0.021013157 data:0.013565334 data:0.018507892 data:0.0028839684 data:-0.0029203822 data:-0.00791392 data:-0.05884461 data:-0.0013448809 data:0.027829811 data:0.025635276 data:0.0021969625 data:-0.036840998 data:0.08917969 data:-0.024547718 data:-0.03194699 data:-0.011079488 data:-0.039656993 data:-0.014623759 data:0.054533225 data:0.007578914 data:0.019119643 data:-0.012370963 data:0.038705382 data:0.026586888 data:0.06793348 data:0.03313165 data:-0.008588788 data:-0.036782738 data:0.0132448925 data:0.05740748 data:-0.06944829 data:0.030160291 data:-0.0028329892 data:0.0059038815 data:-0.029072734 data:0.024644822 data:0.07111847 data:0.013128368 data:0.042919666 data:0.06999207 data:-0.037792612 data:0.03755956 data:0.01843992 data:-0.046143495 data:-0.013099237 data:-0.020430539 data:0.03633606 data:-0.048784707 data:0.09391833 data:0.05950491 data:-0.02930578 data:-0.010768758 data:0.020022703 data:-0.034646463 data:0.013060397 data:0.03388906 data:0.0014189222 data:0.03856944 data:0.0048041865 data:0.015769579 data:-0.034918353 data:-0.0031267267 data:0.030665228 data:-0.022120135 
data:-0.013643016 data:0.0005243579 data:-0.025227442 data:0.044317953 data:0.00089152984 data:-0.054960478 data:0.0010487158 data:0.01758541 data:-0.021323888 data:-0.037443038 data:-0.018944858 data:0.008749009 data:-0.018653547 data:-0.037035204 data:-0.04886239 data:0.025013814 data:-0.041715585 data:0.010700786 data:0.025402227 data:0.011778632 data:-0.011030937 data:0.047269896 data:0.043813016 data:-0.003694781 data:-0.015895814 data:-0.023965098 data:0.017002791 data:-0.0035151402 data:-0.057019066 data:-0.030684648 data:-0.037132308 data:0.035287347 data:0.024664242 data:-0.002281928 data:0.066107936 data:-0.017769907 data:-0.050454885 data:0.08001313 data:0.011166882 data:-0.04975574 data:0.017789328 data:0.011186302 data:-0.033500645 data:0.021828827 data:-0.17245549 data:0.039637573 data:0.027752127 data:0.06404935 data:-0.033733

@fire717
Author

fire717 commented Nov 14, 2024

After waiting 7 hours, it works normally now, but CPU usage is still high.

@yanliang567
Contributor

I guess Milvus is building indexes, which is why CPU usage stays high at the moment. If you have Milvus metrics, you can check them to confirm. Before that, there were "Timestamp lag too large" errors; I think those occurred because there were more insert/delete requests than Milvus could handle in a short time.
Since it has recovered now, I think we don't need to do anything for now.

/assign @fire717
/unassign
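
If metrics are not available, one alternative way to test the index-building guess is to ask Milvus through the SDK. The following is only a minimal sketch, assuming pymilvus 2.x, where `utility.index_building_progress` returns a dict containing `total_rows` and `indexed_rows`. The function names, the default host and port, and the collection name are placeholders and do not come from this issue.

```python
def index_build_fraction(progress: dict) -> float:
    """Return the fraction of rows already indexed, in [0.0, 1.0]."""
    total = progress.get("total_rows", 0)
    indexed = progress.get("indexed_rows", 0)
    if total <= 0:
        return 1.0  # an empty collection has nothing left to index
    return min(indexed / total, 1.0)


def report_index_progress(collection_name: str,
                          host: str = "localhost",
                          port: str = "19530") -> float:
    """Query a running Milvus instance (assumed address) for index progress."""
    # Imported lazily so index_build_fraction can be used without pymilvus.
    from pymilvus import connections, utility

    connections.connect(alias="default", host=host, port=port)
    progress = utility.index_building_progress(collection_name)
    return index_build_fraction(progress)
```

A value that stays below 1.0 and rises over time would be consistent with index building still being in progress; a value of 1.0 while CPU usage remains high would point to some other cause.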

@sre-ci-robot sre-ci-robot assigned fire717 and unassigned yanliang567 Nov 14, 2024
@yanliang567 yanliang567 added help wanted Extra attention is needed and removed kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 14, 2024
@fire717
Author

fire717 commented Nov 14, 2024

My total data is around 10G, and I am sure I did not send any insert or delete requests when I restarted. Why is the CPU usage so high? And is it normal for it to take hours?

After restarting, the log didn't show any errors, but if I try to search or delete, it shows the error below.
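
While the server is still recovering, a client can at least avoid failing on the first "Timestamp lag too large" response by retrying with backoff. The sketch below is a generic helper, not part of pymilvus or QAnything. The names `call_with_retry` and `is_timestamp_lag` are hypothetical, and matching on the message text is an assumption based on the traceback in this issue. It does not address why recovery takes hours.

```python
import time


def is_timestamp_lag(exc: Exception) -> bool:
    # Matches the message text seen in the traceback from this issue.
    return "Timestamp lag too large" in str(exc)


def call_with_retry(op, is_transient, attempts=5, base_delay=1.0,
                    sleep=time.sleep):
    """Call op(); on a transient error, wait and retry with exponential backoff."""
    for attempt in range(attempts):
        try:
            return op()
        except Exception as exc:
            # Re-raise on the last attempt or when the error is not recognised.
            if attempt == attempts - 1 or not is_transient(exc):
                raise
            sleep(base_delay * (2 ** attempt))
```

With a pymilvus `Collection` object named `collection`, the failing call could be wrapped as `call_with_retry(lambda: collection.delete(expr), is_timestamp_lag)`, where `expr` is the same filter expression the application already builds.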

2 participants