Added llnotes experiment
#42
Merged
✅ 5/5 passed, 1 skipped, 14s total. Running from acceptance #128
Summary
The
`go-libs/llnotes` package has been updated with several new features, including the addition of three new files: `chain.go`, `pull_request.go`, and `release_notes.go`. The `chain.go` file introduces a `History` type, which is a list of messages with methods to manage and manipulate the messages. Three concrete types, `SystemMessage`, `UserMessage`, and `AssistantMessage`, have been added to represent different roles in a chat conversation. The `pull_request.go` file introduces the `PullRequest()` method, which fetches and processes pull request diffs using GitHub's GraphQL API and downloads the diff data. It then iterates through the file diffs and processes them one by one, generating a `UserMessage` with the diff content and appending it to the `History`. The `release_notes.go` file introduces the `ReleaseNotes()` method, which generates a release note blog post for the specified GitHub repository. The `llnotes` directory has also been added to the `use` directive in the `go.work` file, and a new `README.md` file has been added to provide documentation for the `llnotes` tool. Additionally, the `go.mod` file specifies a new dependency on the `github.com/databrickslabs/sandbox` package version `v0.1.0-alpha.1`. Overall, these changes add new functionality to the `llnotes` package for processing pull request diffs, generating release notes for GitHub repositories, and interfacing with the Databricks Serving Endpoints API.

Details
This is a new Go source file,
`chain.go`, added to the `go-libs/llnotes` package. The file defines several types, including `message`, `SystemMessage`, `UserMessage`, `AssistantMessage`, and `History`, along with methods for these types. The `message` type is an interface with a single method, `ChatMessage`, that returns a `serving.ChatMessage`. The `SystemMessage`, `UserMessage`, and `AssistantMessage` types implement the `message` interface and are used to represent different roles in a chat conversation. The `History` type holds a slice of `message`s and provides methods for working with the chat history, such as `Messages`, which converts the history to an array of `serving.ChatMessage`s, and `With`, which appends a new `message` to the history while ensuring the total number of tokens in the history does not exceed 32768. The `messageTokens` method calculates the number of tokens in a given message, and `totalTokens` computes the total number of tokens in the history. The `Last` method returns the content of the most recent message in the history.

This is a new Go source file,
`pull_request.go`, added to the `go-libs/llnotes` package. The file introduces a new method, `PullRequest(ctx context.Context, number int)`, that fetches a GitHub pull request's diff and feeds it to a language model for summarization. The method first retrieves the pull request using the `lln.gh.GetPullRequest` function and then fetches the diff via an HTTP GET request to the GitHub API. The file diff is then parsed and processed, with each file diff being passed to the `Talk` function for summarization by a language model. The language model's responses are normalized and accumulated in a `History` slice. The method then initiates another conversation with the language model for reducing the accumulated summaries to a single paragraph, which is then returned as part of the `History` slice. Additionally, the file contains constants and variables used for regex-based normalization of the language model's responses.

This is a new Go source file,
`release_notes.go`, added to the `go-libs/llnotes` package. The file introduces a new method, `ReleaseNotes(ctx context.Context)`, that generates release notes for a GitHub repository. The method first retrieves the repository's versions and compares the latest tagged version to the default branch, collecting commit messages along the way. These commit messages are then passed to the `Talk` function for summarization by the language model. The language model's responses are expected to summarize the most important features in a fluent, multi-sentence paragraph, suitable for a release note blog post. The `blogPrompt` constant is used to frame the conversation with the language model, emphasizing the need for a coherent, engaging, and informative summary. The generated release notes are then returned as part of the `History` slice.

This is a Go source file,
`talk.go`, added to the `go-libs/llnotes` package. The file introduces several types and functions to facilitate communication with a language model through a Databricks endpoint. The `Settings` struct holds the necessary configuration to instantiate a new `llNotes` instance, including Databricks and GitHub configurations, the organization, repository names, commit references, and the identifier for the language model. The `httpclient` package is used to create an `httpclient.ApiClient` for communication, while the `github` and `databricks-sdk-go` packages are utilized to interact with the GitHub and Databricks APIs, respectively. The `New(cfg *Settings)` function initializes a new `llNotes` instance and sets the Databricks HTTP timeout to 300 seconds. It then returns a pointer to the new `llNotes` instance. The `llNotes` struct contains a `databricks.WorkspaceClient`, `github.GitHubClient`, `httpclient.ApiClient`, the language model identifier, the GitHub organization, and repository names. The `Talk(ctx context.Context, h History)` method is defined on the `llNotes` struct and sends a query to the specified language model using the Databricks API. The method returns a response containing the language model's generated content as part of the `History` slice. If an error occurs, the method returns an error.

The
`go.work` file is used to configure the Go workspace, which allows managing dependencies and build settings for Go projects. This specific change modifies the `use` section of the `go.work` file, adding a new entry for the `./llnotes` directory. This change indicates that the `llnotes` directory is now part of the Go workspace, allowing other modules in the workspace to import and use packages within the `llnotes` directory. Overall, this change integrates the `llnotes` package into the Go workspace, making it available for other modules within the workspace to use.

This change creates a new file,
`README.md`, in the `llnotes` directory. The `README.md` file provides documentation for the `llnotes` package. The file starts with YAML front matter, which sets the title, language, author, date, and tags for the documentation. The main content begins with a header, "Generate GitHub release notes with LLMs hosted on Databricks Model Serving", which is also the title. The documentation provides a brief description of the functionality provided by the `llnotes` package, which is to generate GitHub release notes using a large language model (LLM) hosted on Databricks Model Serving. The `README.md` file follows the Markdown format, which allows for formatting and styling the text with headers, paragraphs, and other markdown elements. The `README.md` is an important file for documenting and introducing the purpose and functionality of the `llnotes` package.

This change creates a
`go.mod` file for the `llnotes` package. The `module` line specifies the module's name as `github.com/databrickslabs/sandbox/llnotes`. The `go` line specifies the required Go version as `1.21.0`. The `require` section lists the required dependencies and their specific versions. The dependencies for this package are:
* `github.com/databricks/databricks-sdk-go` version `v0.33.0`
* `github.com/sourcegraph/go-diff` version `v0.7.0`
* `github.com/spf13/pflag` version `v1.0.5`

The `go.mod` file is automatically generated by the `go` command, and it specifies the package's dependencies and their versions. This information is used for dependency management and build isolation in Go modules. The `go.mod` file is essential for the proper functioning and management of the `llnotes` package and its dependencies.

This change creates a new file,
`main.go`, in the `llnotes` package. The `main.go` file contains the `main()` function, which is the entry point for the package. The `main()` function initializes a context, sets the product name and version, and then initializes and runs a new `lite.Init[llnotes.Settings]` instance. The `lite.Init[llnotes.Settings]` instance creates a new root command, adds two subcommands, "pull-request" and "release-notes", and then runs the root command.
* The "pull-request" subcommand extracts a pull request number and calls the `llnotes.PullRequest()` function, printing the summary of the pull request.
* The "release-notes" subcommand calls the `llnotes.ReleaseNotes()` function, printing the release notes.

The `lite.New[llnotes.Settings](...)` function initializes a new `lite` CLI instance with the specified configuration and subcommands. The `lite.Command[llnotes.Settings, req]{...}` instances define the two subcommands, their flags, and their corresponding handlers. The `main.go` file provides the `llnotes` package with a user-facing CLI. The CLI allows users to generate GitHub release notes and pull request summaries using a large language model (LLM) hosted on Databricks Model Serving.