Test replit-code-v1-3b model #1299

Open
abetlen opened this issue May 3, 2023 · 7 comments
Labels
help wanted Extra attention is needed model Model specific

Comments

@abetlen
Collaborator

abetlen commented May 3, 2023

Replit recently trained a 3B parameter Llama-style code model with some very promising results. Weights have been released here

@execveat

execveat commented May 3, 2023

The demo looks extremely underwhelming. Nowhere close to Codex/Copilot: https://huggingface.co/spaces/replit/replit-code-v1-3b-demo

@ggerganov
Owner

Would be nice to get a breakdown of the differences with the LLaMA architecture to get a feeling of how big of a task this would be

@Heath123

Heath123 commented May 3, 2023

The demo looks extremely underwhelming. Nowhere close to Codex/Copilot: huggingface.co/spaces/replit/replit-code-v1-3b-demo

Do you expect it to be at 3B parameters? It might not be as good as Copilot but it's still very good

@abetlen
Collaborator Author

abetlen commented May 3, 2023

Would be nice to get a breakdown of the differences with the LLaMA architecture to get a feeling of how big of a task this would be

Ah, looks like I misread their press release: I took "Llama-style" to mean the exact architecture, not that it was trained past Chinchilla optimality. I'll try to look into this.

The demo looks extremely underwhelming. Nowhere close to Codex/Copilot: https://huggingface.co/spaces/replit/replit-code-v1-3b-demo

I think the better comparison is against the Salesforce CodeGen models, as they're the best option for self-hosted code completion. It would be cool to build something like turbopilot, but this may require a separate ggml implementation for this model.

@Green-Sky
Collaborator

replit-code-v1-3b is powered by state-of-the-art LLM techniques, such as: Flash Attention for fast training and inference, AliBi positional embeddings to support variable context length at inference time, LionW optimizer, etc.

alibi got merged recently
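
For readers unfamiliar with the technique mentioned in the quoted model card: ALiBi skips positional embeddings and instead adds a per-head linear penalty to the attention scores, proportional to the distance between query and key. Below is a minimal pure-Python sketch of that bias, assuming a power-of-two head count (the simple case from the ALiBi paper). The function names `alibi_slopes` and `alibi_bias` are hypothetical, chosen for illustration; this is not the ggml operator or Replit's code, and folding the causal mask into the bias as `-inf` is a simplification made here.

```python
def alibi_slopes(n_heads):
    """Per-head slopes: a geometric sequence starting at 2^(-8/n_heads).

    Assumption: n_heads is a power of two (the simple case in the ALiBi paper).
    """
    if n_heads <= 0 or n_heads & (n_heads - 1) != 0:
        raise ValueError("this sketch only handles power-of-two head counts")
    start = 2.0 ** (-8.0 / n_heads)
    return [start ** (h + 1) for h in range(n_heads)]


def alibi_bias(n_heads, seq_len):
    """Bias added to attention scores, indexed as [head][query][key].

    Visible keys (key <= query) get -slope * (query - key); future keys get
    -inf, which folds a causal mask into the same tensor for convenience.
    """
    neg_inf = float("-inf")
    return [
        [
            [-m * (q - k) if k <= q else neg_inf for k in range(seq_len)]
            for q in range(seq_len)
        ]
        for m in alibi_slopes(n_heads)
    ]
```

Because the penalty depends only on relative distance, the same slopes apply to any sequence length, which is what lets the model card claim variable context length at inference time.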

@v3ss0n

v3ss0n commented May 4, 2023

The demo looks extremely underwhelming. Nowhere close to Codex/Copilot: https://huggingface.co/spaces/replit/replit-code-v1-3b-demo

Can you show your results? Mine look quite good and already usable; this with a chat-tuned LoRA will be quite amazing.

""" SQLAlchemy model for relationship betweek tasks and projects """ # < thats the only input i gave
class Project(Base):
    __tablename__ = 'project'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    description = Column(String, nullable=False)
    tasks = relationship('Task', back_populates='project')

    def __init__(self, name, description):
        self.name = name
        self.description = description

    def __repr__(self):
        return f"<Project(name='{self.name}', description='{self.description}')>"

class Task(Base):
    __tablename__ = 'task'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    description = Column(String, nullable=False)
    project_id = Column(Integer, ForeignKey('project.id'))
    project = relationship('Project', back_populates='tasks')

    def __init__(self, name, description, project_id):
        self.name = name
        self.description = description
        self.project_id = project_id

    def __repr__(self):
        return f"<Task(name='{self.name}', description='{self.description}', project_id='{self.project_id}')>"

@ElYaiko
Contributor

ElYaiko commented May 4, 2023

The demo looks extremely underwhelming. Nowhere close to Codex/Copilot: https://huggingface.co/spaces/replit/replit-code-v1-3b-demo

Can you show your results? Mine look quite good and already usable; this with a chat-tuned LoRA will be quite amazing.

""" SQLAlchemy model for relationship betweek tasks and projects """ # < thats the only input i gave
class Project(Base):
    __tablename__ = 'project'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    description = Column(String, nullable=False)
    tasks = relationship('Task', back_populates='project')

    def __init__(self, name, description):
        self.name = name
        self.description = description

    def __repr__(self):
        return f"<Project(name='{self.name}', description='{self.description}')>"

class Task(Base):
    __tablename__ = 'task'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    description = Column(String, nullable=False)
    project_id = Column(Integer, ForeignKey('project.id'))
    project = relationship('Project', back_populates='tasks')

    def __init__(self, name, description, project_id):
        self.name = name
        self.description = description
        self.project_id = project_id

    def __repr__(self):
        return f"<Task(name='{self.name}', description='{self.description}', project_id='{self.project_id}')>"

Yes, it gives ok results
Is there any chance of using this model with ggml?


7 participants