Conversation

@carmocca (Contributor) commented May 4, 2023

Generation works.

For simplicity, I removed all files that have not been updated yet. We can port them from upstream on demand.

## License

Lit-LLaMA is released under the [Apache 2.0](https://github.com/Lightning-AI/lightning-llama/blob/main/LICENSE) license.
# FIXME

@carmocca (Author) commented:

We probably want to refresh this.

> **Note**
> All scripts support argument [customization](customize_paths.md)
### FIXME: update this

@carmocca (Author) commented:

Need to try this on an A100.
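
The "argument customization" note above refers to the scripts' CLI entry points. A minimal sketch of the pattern, assuming the jsonargparse-style `CLI` wiring used in upstream lit-llama; the parameter names here are illustrative, not this PR's exact signatures:

```python
# Hypothetical entry point: jsonargparse's CLI turns the function signature
# into command-line flags, so any path can be overridden at invocation time,
# e.g. `python generate.py --checkpoint_path /my/checkpoints/model.pth`.
from pathlib import Path


def main(
    prompt: str = "Hello, my name is",
    checkpoint_path: Path = Path("checkpoints/lit-llama.pth"),
    tokenizer_path: Path = Path("checkpoints/tokenizer.model"),
) -> None:
    ...  # load the checkpoint and tokenizer, then generate


if __name__ == "__main__":
    from jsonargparse import CLI

    CLI(main)
```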

@lantiga (Contributor) left a comment:
Fantastic work @carmocca!


if hasattr(self, "bias"):
    # causal self-attention; Self-attend: (B, nh, T, hs) x (B, nh, hs, T) -> (B, nh, T, T)
    # NOTE: cannot use flash attention because it takes q.size(-1) as the norm factor which is different to the

@lantiga commented:

Why is this conditioned on `bias` being there?

@carmocca (Author) replied:

See #2.
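
For context on the thread above: `bias` is presumably the registered causal-mask buffer (nanoGPT-style naming), and a manual attention path is kept because `F.scaled_dot_product_attention` hard-codes `1 / sqrt(q.size(-1))` as its softmax scale (an explicit `scale` argument only arrived in later PyTorch releases). A minimal sketch of the two paths, with illustrative names rather than the PR's exact code:

```python
# Illustrative sketch, not the PR's code. `mask` stands in for the `bias`
# buffer: a boolean causal mask that is True where attention is allowed.
import math

import torch
import torch.nn.functional as F


def attend(q, k, v, mask=None, norm_factor=None):
    # q, k, v: (B, nh, T, hs)
    if mask is not None:
        # manual path: required whenever the softmax scale must differ from
        # the 1/sqrt(q.size(-1)) that the fused kernel hard-codes
        scale = norm_factor if norm_factor is not None else 1.0 / math.sqrt(q.size(-1))
        att = (q @ k.transpose(-2, -1)) * scale  # (B, nh, T, T)
        att = att.masked_fill(~mask[:, :, : q.size(2), : k.size(2)], float("-inf"))
        att = F.softmax(att, dim=-1)
        return att @ v  # (B, nh, T, hs)
    # fused path: fine only when the default scale is the desired one
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)
```

Checking for the mask buffer then doubles as selecting the manual (custom-scale) path.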


class Tokenizer:
    def __init__(self, vocabulary_path: Path, config_path: Path) -> None:
        # https://github.com/Stability-AI/StableLM/blob/e60081/configs/stablelm-base-alpha-3b.yaml#L108

@lantiga commented:

I think we should just vendor the YAML file in the repo directly.

@carmocca (Author) replied:

Are you suggesting this as a showcase of the configs used? Because this is a GPT-NeoX config, we don't need to use it.

Or do you want to add support for running the scripts by passing it?
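
For reference, a minimal sketch of what a tokenizer wrapper along these lines could look like, assuming a Hugging Face `tokenizers` backend and PyYAML; the config key is hypothetical and the PR's actual class may differ:

```python
# Sketch only: the `eos_token_id` key is a hypothetical example of why
# `config_path` is taken at all; the linked StableLM YAML carries tokenizer
# settings that would otherwise have to be hard-coded.
from pathlib import Path

import torch
import yaml
from tokenizers import Tokenizer as HFTokenizer


class Tokenizer:
    def __init__(self, vocabulary_path: Path, config_path: Path) -> None:
        self.processor = HFTokenizer.from_file(str(vocabulary_path))
        with open(config_path) as fp:
            config = yaml.safe_load(fp)
        self.eos_id = config.get("eos_token_id", 0)  # hypothetical key

    def encode(self, string: str) -> torch.Tensor:
        return torch.tensor(self.processor.encode(string).ids, dtype=torch.int)

    def decode(self, tensor: torch.Tensor) -> str:
        return self.processor.decode(tensor.tolist())
```

Vendoring the YAML, as suggested, would make `config_path` resolvable from the repo itself instead of requiring a download.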

@lantiga (Contributor) commented May 5, 2023

There's a test failing on Windows and the README still to complete. I can work on the README.

Co-authored-by: Luca Antiga <luca@lightning.ai>
@lantiga lantiga merged commit f172f8d into main May 5, 2023
@lantiga lantiga deleted the carmocca/initial-commit branch May 5, 2023 12:34
gkroiz pushed a commit to gkroiz/lit-gpt that referenced this pull request Aug 31, 2023
* Make trainer configurable and add docker file

* Fix bugs

* Add dockerignore

* Fix config

* Fix bug

* Fix bug

* Fix bug

* Fix bug

* Try dlprof

* fix bug

* Add pytorch logger

* Fix import

* Add pytorch profiler

* Fix bug

* Reorder docker file

* Fix bug

* Make pytorch profiler optional

* Try to fix profiler

* Pytorch profiler working. Shunt torch comms again

* tune profiler params

* Make pt profiler configurable and run for global batches

* Fix bug

* Fix batch offset

* Fix bug

* Debug print issues

* More print stuff

* Add nvtx ranges

* Adjust model sizes

* tune validation iters
@carmocca carmocca self-assigned this Nov 1, 2023
@sadrafh sadrafh mentioned this pull request Nov 14, 2024
Andrei-Aksionov added a commit that referenced this pull request Jan 7, 2025