
When too much data is imported, timeouts may easily occur when executing the embedding model. #1199

Closed
@impactCn

Description

java.util.concurrent.TimeoutException: Channel response timed out after 60000 milliseconds.
	at com.azure.core.http.netty.implementation.AzureSdkHandler.responseTimedOut(AzureSdkHandler.java:202) ~[azure-core-http-netty-1.15.1.jar:1.15.1]
	at com.azure.core.http.netty.implementation.AzureSdkHandler.lambda$startResponseTracking$2(AzureSdkHandler.java:187) ~[azure-core-http-netty-1.15.1.jar:1.15.1]
	at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:153) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.runTask$$$capture(AbstractEventExecutor.java:173) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute$$$capture(AbstractEventExecutor.java:166) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:566) ~[netty-transport-4.1.101.Final.jar:4.1.101.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.101.Final.jar:4.1.101.Final]
	at java.base/java.lang.Thread.run(Thread.java:833) ~[na:na]

2024-08-09T17:36:36.604+08:00 ERROR 4252 --- [nio-8083-exec-4] c.a.c.http.netty.NettyAsyncHttpClient    : java.util.concurrent.TimeoutException: Channel response timed out after 60000 milliseconds.

When I tried to insert 800 items, the call timed out and no data at all was written to the PostgreSQL database.

Generally speaking, each model has its own timeout, but the timeout cannot be raised indefinitely. When I was using PgVectorStore.add, I saw that the embeddings for all of the data are computed in a single pass. A caller of this method has no way of knowing how much data is safe to pass in.

Once a timeout occurs, none of the data is inserted. So I think the logic here should process the data in batches, for example embedding and inserting 10 items at a time, to avoid problems like this.
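
To make the suggestion concrete, here is a minimal sketch of the batching idea done on the caller's side. It is not Spring AI's implementation: `BatchedAdd`, `addInBatches` and the `sink` parameter are hypothetical names used only for illustration, and the batch size of 10 is just the value proposed above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not part of Spring AI): feeds a list to a sink in fixed-size batches. */
final class BatchedAdd {

    private BatchedAdd() {
    }

    /**
     * Passes items to the sink in consecutive batches of at most batchSize elements
     * and returns the number of items handed over. If the sink throws (for example
     * on a timeout), earlier batches have already been processed and the exception
     * propagates to the caller.
     */
    static <T> int addInBatches(List<T> items, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int processed = 0;
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the window so the sink never holds a view of the caller's list.
            List<T> batch = new ArrayList<>(items.subList(start, end));
            sink.accept(batch);
            processed += batch.size();
        }
        return processed;
    }
}
```

Assuming `vectorStore.add` accepts a list of documents, a call might look like `BatchedAdd.addInBatches(documents, 10, vectorStore::add)`. With 800 items that would be 80 smaller embedding requests instead of one large one, and a timeout in one batch would no longer discard the batches that were already inserted.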
