Issue running on RTX 3090 + A100 #2

@ayunami2000

Hi,

I ran the demo code on Python 3.13, and it always seems to freeze. On my local machine with an RTX 3090 and 32 GB of RAM, the process is eventually killed with SIGKILL; in a Kubernetes cluster with an A100 and 16 GB of RAM (I can increase this for testing if needed, let me know), it eventually terminates the entire container.

Here is what it said when I ran it locally:
Image

This happened right after the music I had playing on the local machine stuttered a bit, which suggests there may be RAM requirements not stated in your docs.
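A rough sanity check on the RAM question: the sketch below estimates how much memory the weights alone would need at different precisions. It assumes the "8b" in the model name means roughly 8 billion parameters, which is not confirmed by the docs, and it ignores activations, framework overhead, and any temporary copies made while loading.

```python
# Rough estimate only. PARAMS is an assumption based on the "8b" in the
# model name; the actual parameter count may differ.
PARAMS = 8_000_000_000
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_gib(params: int, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in GiB."""
    return params * bytes_per_param / 2**30


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype}: {weight_gib(PARAMS, nbytes):.1f} GiB")
```

Under that assumption, float32 weights alone come to about 29.8 GiB and 16-bit weights to about 14.9 GiB, so staging the weights in host RAM could plausibly exhaust both the 32 GB machine and the 16 GB container.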

Here are the contents of my test.py:

from steerling import SteerlingGenerator, GenerationConfig

generator = SteerlingGenerator.from_pretrained("guidelabs/steerling-8b")

text = generator.generate(
    "The key to understanding neural networks is",
    GenerationConfig(max_new_tokens=100, seed=42),
)
print(text)
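To check whether host RAM is what runs out before the kill, one option is to log the process's peak resident memory while the script runs. This is a minimal sketch using only the Python standard library on Linux or macOS; `peak_rss_mib` and `start_rss_logger` are hypothetical helper names, not part of steerling, and the logger may print less often if the loading code holds the interpreter lock for long stretches.

```python
import resource
import sys
import threading


def peak_rss_mib() -> float:
    """Peak resident set size of this process so far, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak / (1024 * 1024 if sys.platform == "darwin" else 1024)


def start_rss_logger(interval_s: float = 1.0) -> threading.Event:
    """Print peak RSS to stderr every interval_s seconds until the event is set."""
    stop = threading.Event()

    def run() -> None:
        # Event.wait returns False on timeout and True once stop is set.
        while not stop.wait(interval_s):
            print(f"peak RSS: {peak_rss_mib():.0f} MiB", file=sys.stderr, flush=True)

    threading.Thread(target=run, daemon=True).start()
    return stop


stop = start_rss_logger()
# ... run the from_pretrained / generate calls from test.py here ...
stop.set()
```

If the last value printed before the process dies is close to the machine's or container's memory limit, that points to the kernel's out-of-memory handling as the source of the SIGKILL.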
