Model description
google/embeddinggemma-300m

Starting the server with google/embeddinggemma-300m fails during application startup because the transformers release bundled in the image does not recognize the `gemma3_text` model type:
embedding-server | INFO: Waiting for application startup.
embedding-server | INFO 2025-09-18 15:07:20,029 infinity_emb INFO: infinity_server.py:84
embedding-server | Creating 2 engines:
embedding-server | ['google/embeddinggemma-300m',
embedding-server | 'Qwen/Qwen3-Reranker-0.6B']
embedding-server | INFO 2025-09-18 15:07:20,031 infinity_emb INFO: telemetry.py:34
embedding-server | DO_NOT_TRACK=1 registered. Anonymized usage statistics
embedding-server | are disabled.
embedding-server | INFO 2025-09-18 15:07:20,034 infinity_emb INFO: select_model.py:66
embedding-server | model=`google/embeddinggemma-300m` selected, using
embedding-server | engine=`torch` and device=`cuda`
embedding-server | ERROR: Traceback (most recent call last):
embedding-server | File "/app/.venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1082, in from_pretrained
embedding-server | config_class = CONFIG_MAPPING[config_dict["model_type"]]
embedding-server | File "/app/.venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 784, in __getitem__
embedding-server | raise KeyError(key)
embedding-server | KeyError: 'gemma3_text'
embedding-server |
embedding-server | During handling of the above exception, another exception occurred:
embedding-server |
embedding-server | Traceback (most recent call last):
embedding-server | File "/app/.venv/lib/python3.10/site-packages/starlette/routing.py", line 693, in lifespan
embedding-server | async with self.lifespan_context(app) as maybe_state:
embedding-server | File "/usr/lib/python3.10/contextlib.py", line 199, in __aenter__
embedding-server | return await anext(self.gen)
embedding-server | File "/app/infinity_emb/infinity_server.py", line 88, in lifespan
embedding-server | app.engine_array = AsyncEngineArray.from_args(engine_args_list) # type: ignore
embedding-server | File "/app/infinity_emb/engine.py", line 306, in from_args
embedding-server | return cls(engines=tuple(engines))
embedding-server | File "/app/infinity_emb/engine.py", line 71, in from_args
embedding-server | engine = cls(**engine_args.to_dict(), _show_deprecation_warning=False)
embedding-server | File "/app/infinity_emb/engine.py", line 56, in __init__
embedding-server | self._model_replicas, self._min_inference_t, self._max_inference_t = select_model(
embedding-server | File "/app/infinity_emb/inference/select_model.py", line 83, in select_model
embedding-server | loaded_engine = unloaded_engine.value(engine_args=engine_args_copy)
embedding-server | File "/app/infinity_emb/transformer/embedder/sentence_transformer.py", line 62, in __init__
embedding-server | attempt_bt = check_if_bettertransformer_possible(engine_args)
embedding-server | File "/app/infinity_emb/transformer/acceleration.py", line 40, in check_if_bettertransformer_possible
embedding-server | config = AutoConfig.from_pretrained(
embedding-server | File "/app/.venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1084, in from_pretrained
embedding-server | raise ValueError(
embedding-server | ValueError: The checkpoint you are trying to load has model type `gemma3_text` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
embedding-server |
embedding-server | You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`
embedding-server |
embedding-server | ERROR: Application startup failed. Exiting.
WARN[0010] optional dependency "embedding-server" failed to start: container embedding-server exited (3)
embedding-server exited with code 3
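Both failures reduce to the same root cause: the pinned transformers release predates the `gemma3_text` and `qwen3` architectures. A minimal sketch of the version check, outside infinity (the minimum versions below are assumptions about roughly when each architecture landed upstream; confirm them against the model cards):

```python
# Sketch: check whether an installed transformers release is new enough
# for the two model types in the tracebacks above.

def parse_version(v: str) -> tuple:
    """'4.51.0rc1' -> (4, 51, 0); rc/dev suffixes are ignored for this sketch."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at the first non-numeric character ('rc1', 'dev0', ...)
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

# Assumed minimum transformers releases per model_type (not authoritative).
ASSUMED_MIN = {
    "gemma3_text": (4, 50, 0),
    "qwen3": (4, 51, 0),
}

def supports(model_type: str, installed: str) -> bool:
    """True if `installed` meets the assumed floor for `model_type`."""
    return parse_version(installed) >= ASSUMED_MIN[model_type]

print(supports("qwen3", "4.44.2"))        # False: too old, hence KeyError: 'qwen3'
print(supports("gemma3_text", "4.51.3"))  # True
```

With the transformers version actually installed in the container (e.g. via `pip show transformers`), this reproduces the go/no-go decision that `AutoConfig.from_pretrained` makes when it looks up `config_dict["model_type"]`.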
Qwen/Qwen3-Reranker-0.6B

The same startup failure occurs with only Qwen/Qwen3-Reranker-0.6B configured; here the unrecognized model type is `qwen3`:
embedding-server | INFO: Started server process [1]
embedding-server | INFO: Waiting for application startup.
embedding-server | INFO 2025-09-18 15:11:31,772 infinity_emb INFO: infinity_server.py:84
embedding-server | Creating 1 engines: ['Qwen/Qwen3-Reranker-0.6B']
embedding-server | INFO 2025-09-18 15:11:31,774 infinity_emb INFO: telemetry.py:34
embedding-server | DO_NOT_TRACK=1 registered. Anonymized usage statistics
embedding-server | are disabled.
embedding-server | INFO 2025-09-18 15:11:31,777 infinity_emb INFO: select_model.py:66
embedding-server | model=`Qwen/Qwen3-Reranker-0.6B` selected, using
embedding-server | engine=`torch` and device=`cuda`
embedding-server | ERROR: Traceback (most recent call last):
embedding-server | File "/app/.venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1082, in from_pretrained
embedding-server | config_class = CONFIG_MAPPING[config_dict["model_type"]]
embedding-server | File "/app/.venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 784, in __getitem__
embedding-server | raise KeyError(key)
embedding-server | KeyError: 'qwen3'
embedding-server |
embedding-server | During handling of the above exception, another exception occurred:
embedding-server |
embedding-server | Traceback (most recent call last):
embedding-server | File "/app/.venv/lib/python3.10/site-packages/starlette/routing.py", line 693, in lifespan
embedding-server | async with self.lifespan_context(app) as maybe_state:
embedding-server | File "/usr/lib/python3.10/contextlib.py", line 199, in __aenter__
embedding-server | return await anext(self.gen)
embedding-server | File "/app/infinity_emb/infinity_server.py", line 88, in lifespan
embedding-server | app.engine_array = AsyncEngineArray.from_args(engine_args_list) # type: ignore
embedding-server | File "/app/infinity_emb/engine.py", line 306, in from_args
embedding-server | return cls(engines=tuple(engines))
embedding-server | File "/app/infinity_emb/engine.py", line 71, in from_args
embedding-server | engine = cls(**engine_args.to_dict(), _show_deprecation_warning=False)
embedding-server | File "/app/infinity_emb/engine.py", line 56, in __init__
embedding-server | self._model_replicas, self._min_inference_t, self._max_inference_t = select_model(
embedding-server | File "/app/infinity_emb/inference/select_model.py", line 83, in select_model
embedding-server | loaded_engine = unloaded_engine.value(engine_args=engine_args_copy)
embedding-server | File "/app/infinity_emb/transformer/embedder/sentence_transformer.py", line 62, in __init__
embedding-server | attempt_bt = check_if_bettertransformer_possible(engine_args)
embedding-server | File "/app/infinity_emb/transformer/acceleration.py", line 40, in check_if_bettertransformer_possible
embedding-server | config = AutoConfig.from_pretrained(
embedding-server | File "/app/.venv/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1084, in from_pretrained
embedding-server | raise ValueError(
embedding-server | ValueError: The checkpoint you are trying to load has model type `qwen3` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
embedding-server |
embedding-server | You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`
embedding-server |
embedding-server | ERROR: Application startup failed. Exiting.
WARN[0011] optional dependency "embedding-server" failed to start: container embedding-server exited (3)
embedding-server exited with code 3
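Until the image ships a newer transformers pin, one possible workaround (a sketch, not an official fix) is to extend the image and upgrade transformers there. The base image tag and the `>=4.51` floor below are assumptions; use the tag you actually deploy and the minimum version stated on the model cards:

```dockerfile
# Hypothetical override image -- base tag is an assumption.
FROM michaelf34/infinity:latest
# Upgrade past the release that added the gemma3_text/qwen3 configs
# (version floor is an assumption; see the ValueError message above).
RUN pip install --no-cache-dir --upgrade "transformers>=4.51"
```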
Open source status & huggingface transformers
- The model implementation is available on transformers
- The model weights are available on huggingface-hub
- I verified that the model is currently not running in the latest version (`pip install infinity_emb[all] --upgrade`)
- I made the authors of the model aware that I want to use it with infinity_emb & checked whether they are aware of the issue