We just investigated this together with @JavierGOrdonnez.
We were not able to reproduce the error. What we observed is that if the `map` endpoint in the api-server returns 200, then all jobs succeed. We also tried running this as two users at the same time and it worked fine. What does sometimes fail is the `map` endpoint in the api-server itself. Here are the two reasons we observed for that happening:
- The 60s timeout in the api-server when waiting to create a job often causes issues. This is definitely a problem for studies with large amounts of data (see https://github.com/ITISFoundation/osparc-simcore/blob/85a2e9d08db780e3a3a840f0f63cb31e4aed3bc0/services/api-server/src/simcore_service_api_server/services_http/webserver.py#L225).
- Sometimes the RPC endpoints time out (see https://monitoring.osparc-staging.io/graylog/messages/graylog_250/b4041241-46c6-11f0-ab1d-0242c0a80103).
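
To make the first failure mode concrete, here is a minimal sketch of how a fixed deadline interacts with a job-creation step whose duration grows with the amount of data. This is not the api-server code: `create_job`, `create_job_with_deadline`, `run`, and the `seconds_per_mb` scaling are hypothetical names and numbers chosen for illustration, and the timeouts are scaled down from 60s so the example runs quickly.

```python
import asyncio


async def create_job(payload_mb: float, seconds_per_mb: float) -> str:
    # Hypothetical stand-in for job creation: the wait grows with input size.
    await asyncio.sleep(payload_mb * seconds_per_mb)
    return "created"


async def create_job_with_deadline(
    payload_mb: float, timeout_s: float, seconds_per_mb: float = 0.001
) -> str:
    # A fixed deadline, analogous in spirit to the 60s wait described above
    # (assumed behaviour, scaled down here).
    try:
        return await asyncio.wait_for(
            create_job(payload_mb, seconds_per_mb), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        # The caller only sees a failure, even though the work might have
        # finished if it had been given more time.
        return "timed out"


def run(payload_mb: float, timeout_s: float) -> str:
    # Synchronous helper so the sketch can be called from a plain script.
    return asyncio.run(create_job_with_deadline(payload_mb, timeout_s))
```

With these assumed numbers, a small study finishes well inside the deadline, while a large one hits it regardless of whether the backend is healthy. That would match what we saw: the `map` call fails, but jobs that do get created succeed.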
Originally posted by @bisgaard-itis in #1898