
[Question]: Bypassing/purging problematic tasks ? #1383

Closed
Randname666 opened this issue Jul 5, 2024 · 3 comments
Labels
question Further information is requested

Comments

@Randname666

Describe your problem

A problematic task (how it was generated is unknown) is blocking all other newly dispatched tasks, including non-PDF ones. The task does not appear anywhere in the WebUI, so it cannot be canceled there. The backend is currently emitting errors like the following constantly:

ragflow-server  | [WARNING] Load term.freq FAIL!
ragflow-server  | Traceback (most recent call last):
ragflow-server  |   File "/ragflow/rag/svr/task_executor.py", line 375, in <module>
ragflow-server  |     main()
ragflow-server  |   File "/ragflow/rag/svr/task_executor.py", line 294, in main
ragflow-server  |     rows = collect()
ragflow-server  |   File "/ragflow/rag/svr/task_executor.py", line 117, in collect
ragflow-server  |     assert tasks, "{} empty task!".format(msg["id"])
ragflow-server  | AssertionError: 2077fa703a6311efbc6f0242ac120006 empty task!
ragflow-mysql   | 2024-07-05T01:08:14.129120Z 28 [Note] Aborted connection 28 to db: 'rag_flow' user: 'root' host: '172.19.0.6' (Got an error reading communication packets)

Running docker compose down followed by docker compose up does not resolve the issue.

Is there a way to manually remove this problematic task? Additionally, is there an internal mechanism for purging/canceling tasks when they error out?

Randname666 added the question label on Jul 5, 2024
@guoyuhao2330
Contributor

This problem is caused by dirty data generated as a result of multiple reboots. However, it does not affect operation, so you can ignore it.

@Randname666
Author

This problem is caused by dirty data generated as a result of multiple reboots. However, it does not affect operation, so you can ignore it.

Unfortunately, that one problematic task is blocking all other newly dispatched tasks. Does it simply go away after waiting?

I ended up purging all of the Docker volumes used by RAGFlow. That fixed the issue, but of course all the documents are gone as well, which is definitely not something you want to do if a lot of documents have already been processed.

This was referenced Aug 23, 2024
KevinHuSh added a commit that referenced this issue Aug 23, 2024
### What problem does this PR solve?
#1383

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
@Sephieroth

I have the same problem.

I finally solved the problem by deleting the data in Redis.

import redis

r = redis.Redis(host="0.0.0.0", port=6379, password='infini_rag_flow')
keys = r.keys('*')  # keys are [b"rag_flow_svr_queue"]
obj = r.delete('rag_flow_svr_queue')

After deleting the data, the parsing process works well.
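
For anyone hitting the same state, a slightly more cautious variant is to inspect the key before dropping it. This is only a minimal sketch: the host, port, and password below are assumed to match the stock docker-compose defaults, and queue_key is the queue name reported above. Deleting it discards every pending task, not just the broken one.

import redis

# Check the task queue key before deleting it; connection details assume
# the default RAGFlow docker-compose setup and may need adjusting.
r = redis.Redis(host="127.0.0.1", port=6379, password="infini_rag_flow")

queue_key = "rag_flow_svr_queue"
print(r.exists(queue_key))   # 1 if the task queue key is present
print(r.type(queue_key))     # how the queue is stored (e.g. b'stream')

# Deleting the key discards every queued task, not only the stuck one.
print(r.delete(queue_key))   # number of keys removed (1 on success)

Any documents that were still stuck in a parsing state will likely need to be re-parsed from the WebUI once the queue is cleared.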
