Enhanced Configuration and Development Workflow with Poetry, Markdown Prompts, and YAML Config #432

Open
wants to merge 27 commits into
base: main

@use-the-fork use-the-fork commented Dec 29, 2023

Summary:

This pull request introduces significant enhancements to the development workflow and configuration management of our application. Key changes include the integration of Poetry for dependency handling, the introduction of Markdown-based prompts, and a comprehensive overhaul of the configuration system from JSON to YAML.

Key Changes:

  1. Poetry Integration:

    • Implemented Poetry as the primary tool for dependency management.
    • Updated all actions to support Poetry, ensuring a seamless development experience.
  2. Markdown-based Prompts:

    • Added the capability to use Markdown for prompts.
    • Created Markdown-based prompts and added them to the resources directory.
    • Introduced a toggle in the configuration to switch between traditional and Markdown prompts, providing flexibility to developers.
  3. Configuration System Overhaul:

    • Transitioned from a JSON-based configuration file to a more readable and maintainable YAML format.
    • Developed an initialization command that automatically creates a YAML configuration file at the git root.
    • Refined the configuration structure to distinguish between immutable and mutable properties based on the session.
    • Organized the configuration into distinct sections: AI, Runtime, and Parser, under a unified base configuration.
  4. User Session Management:

    • Introduced a new UserSession object, allowing developers greater flexibility in setting and retrieving values. This object supersedes the traditional context passed to commands.
    • Implemented a safety feature where new values must be explicitly set via the user_session.set function to prevent accidental overwriting of the context.
  5. CLI and Testing Adjustments:

    • Modified the CLI to utilize the Click package, enhancing usability and maintainability.
    • Updated and adjusted all tests to align with the new configuration object.
  6. Quality Assurance:

    • Ensured that all Pytest tests and Pyright checks pass.

  7. Boot Logo: Because all apps need one!
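
To make the User Session Management item concrete, here is a minimal sketch of a session object that allows attribute reads but rejects plain assignment, so new values must go through an explicit `set` call. The class body, the `get` helper, and the stored `config` value are assumptions for illustration, not the code in this PR.

```python
from typing import Any


class UserSession:
    """Hypothetical session store: read via attributes, write only via set()."""

    def __init__(self) -> None:
        # Bypass our own __setattr__ guard when creating the backing dict.
        object.__setattr__(self, "_values", {})

    def set(self, key: str, value: Any) -> None:
        """Explicitly store a value in the session."""
        self._values[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        """Return a stored value, or `default` if the key was never set."""
        return self._values.get(key, default)

    def __getattr__(self, key: str) -> Any:
        # Only called when normal attribute lookup fails.
        try:
            return self._values[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key: str, value: Any) -> None:
        # Plain assignment is rejected so values are not overwritten by accident.
        raise AttributeError(f"use set({key!r}, ...) instead of assignment")


user_session = UserSession()
user_session.set("config", {"model": "gpt-4"})  # illustrative value only
```

With this shape, `user_session.config` reads the stored value, while `user_session.config = {}` raises `AttributeError` and leaves the session unchanged.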

greg-assa and others added 25 commits December 18, 2023 08:57
The dev-requirements.txt and requirements.txt files have been deleted and replaced with a poetry.lock file. This switch to using Poetry for dependency management will simplify dependencies' declaration and lock them for consistent installs across different environments. This update includes moving all previously listed dependencies to the newly generated Poetry lock file.
The setup.py file has been removed as part of this migration, with all necessary information and dependencies now defined in the updated pyproject.toml file. This change enables more modern, standardized Python packaging and dependency management.
The project has been updated to use Poetry for managing dependencies in lieu of pip. This applies to the build, lint, test, benchmark, and release workflows. All pyright, pytest, black, isort, and other tests now run with poetry instead of python and pip commands.
Changed the package manager for installation from pip to poetry in the README file. This update reflects the switch to the more modern package manager, poetry.
The GitHub workflows for benchmarking, release, and linting/testing have been updated to include a step for installing poetry using pipx. This change ensures that poetry, a necessary dependency for this project, is properly installed during these workflow processes.
This change modifies the run command under the license checking script step of the GitHub Action workflow. Specifically, it alters the way we execute the license_check.py script, using 'poetry run python' instead of 'poetry run' alone.
The benchmark script has been updated to use 'poetry run' for executing pytest. This change ensures that pytest runs in the context of the poetry environment, appropriately respecting the project
This commit includes the addition of markdown formatted coding project prompts, as well as updating all file references in the Python files to reflect these changes. It also includes the installation of the "dataclasses-json" dependency. Lastly, the plain text prompt files were moved to a separate directory for better organization.
Updated default prompts for the coding project to a markdown file format and adjusted all related file references. New dependency, "dataclasses-json", was installed to handle potential future data classes with the JSON format. Moved previously existing plain text format prompts to a distinct directory for easier management.
The prompt type for each parser has been moved from hardcoded value within the parser to a centralized configuration through 'mentat.config.py'. This improves maintainability by centralizing configuration. Additionally, the use of 'rich' library has been expanded in 'session.py' and 'code_context.py' to improve readability and color-coding of messages.
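
As a rough illustration of a centrally configured prompt type, the sketch below picks a prompt file by extension based on a single setting. The function name, the `PROMPT_EXTENSIONS` mapping, and the file layout are hypothetical and are not taken from this PR.

```python
from pathlib import Path

# Hypothetical mapping from a configured prompt type to a file extension.
PROMPT_EXTENSIONS = {"text": ".txt", "markdown": ".md"}


def read_prompt(resources_dir: Path, name: str, prompt_type: str = "text") -> str:
    """Load the prompt called `name` in the format selected by `prompt_type`."""
    try:
        extension = PROMPT_EXTENSIONS[prompt_type]
    except KeyError:
        raise ValueError(f"unknown prompt type: {prompt_type!r}") from None
    return (resources_dir / f"{name}{extension}").read_text(encoding="utf-8")
```

Either way the function returns a plain string, so the caller sends the same kind of text to the model regardless of the format chosen.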
…ature/markdown-prompts

# Conflicts:
#	dev-requirements.txt
#	mentat/code_context.py
#	mentat/code_feature.py
#	mentat/config.py
#	mentat/conversation.py
#	mentat/feature_filters/llm_feature_filter.py
#	mentat/include_files.py
#	mentat/llm_api_handler.py
#	mentat/session.py
#	mentat/terminal/client.py
#	mentat/utils.py
#	requirements.txt
#	scripts/run_and_upload_benchmarks.sh
The code has been refactored to use a global configuration object instead of a local one. This change standardizes how the config is accessed across multiple modules and simplifies the code by reducing redundant variable assignments. Along with this, color print formatting has been updated to use the 'rich' module's syntax.
This commit removes 'config' import from 'mentat/session_context.py' and adds it directly in other files where it's used, refactoring relevant lines accordingly. This change cleans up the codebase and makes config usage more immediate and intuitive. Additionally, minor code adjustments were made in several other files for consistency and readability.
The debug function in mentat/utils.py has been replaced with the inspect function from the rich library. The inspect function provides a more detailed view of objects for easy debugging and visualization. The former custom debug function which uses pprint for pretty printing and handles exceptions has been commented out for reference.
Replaced Typer with Click for command line interface of terminal client, added Click to poetry dependencies, and rearranged and removed redundant code in terminal client for better optimization. Also, fixed asynchronous task in the client exit listener. This commit simplifies the command line interface code while enhancing the application's functionality and efficiency.
The rich module has been replaced with termcolor for text coloring and formatting in various Python files. The change improves consistency across the codebase and streamlines the process of sending colored text with the stream.send() function. The changes include updating function and method calls, adjusting import statements, and modifying line color settings.
The path for the configuration file in the Mentat project was updated to use the Git root, rather than the application root. This change also resulted in the removal of unnecessary code that previously loaded and merged the configuration.
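
One way to resolve a configuration path relative to the Git root is to walk up from a starting directory until a `.git` entry is found. This is a sketch under assumptions: `find_git_root`, `config_path`, and the `CONFIG_FILENAME` placeholder are hypothetical names, and the PR may locate the root differently.

```python
from pathlib import Path
from typing import Optional

# Placeholder file name for illustration; the real name is not stated here.
CONFIG_FILENAME = "example_config.yaml"


def find_git_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that contains `.git`."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        # `.git` is a directory in a normal clone and a file in worktrees.
        if (candidate / ".git").exists():
            return candidate
    return None


def config_path(start: Path) -> Optional[Path]:
    """Return the config file path at the Git root, or None outside a repo."""
    root = find_git_root(start)
    return root / CONFIG_FILENAME if root is not None else None
```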
The code has been updated to fetch configuration from user_session instead of importing it directly from mentat.config. Several test cases have been altered to reflect this change. The Dumper and Dumper functions in utils have been enhanced to handle possible exceptions. A utility function was added in __init__.py to expose user_session as part of the package's public API.
This commit refactors the way configuration is handled in the application. This includes implementing settings for parsers as well as AI models, and modifying how the configuration is retrieved throughout the app. Mid-session configuration changes are now more consistent and manageable, enhancing overall usability and code readability.
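
A sectioned configuration with both fixed and session-adjustable settings could be modelled with dataclasses, as sketched below. All field names and defaults, and the choice of which sections are frozen, are assumptions for illustration; the nested mapping stands in for whatever a YAML parser would return.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class AIConfig:
    # Frozen: treated here as fixed for the whole session (assumption).
    model: str = "example-model"


@dataclass(frozen=True)
class ParserConfig:
    # Frozen as well; the field name is hypothetical.
    parser_type: str = "block"


@dataclass
class RuntimeConfig:
    # Not frozen: may be changed mid-session in this sketch.
    auto_context: bool = False


@dataclass
class BaseConfig:
    ai: AIConfig = field(default_factory=AIConfig)
    runtime: RuntimeConfig = field(default_factory=RuntimeConfig)
    parser: ParserConfig = field(default_factory=ParserConfig)


def config_from_mapping(data: Dict[str, Dict[str, Any]]) -> BaseConfig:
    """Build a config from a nested mapping, e.g. a parsed YAML document."""
    return BaseConfig(
        ai=AIConfig(**data.get("ai", {})),
        runtime=RuntimeConfig(**data.get("runtime", {})),
        parser=ParserConfig(**data.get("parser", {})),
    )
```

Assigning to a field of a frozen section raises `dataclasses.FrozenInstanceError`, while fields of `RuntimeConfig` can be reassigned freely.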
Various changes have been made to improve the readability and maintainability of the code. These changes include reformatting lists and function arguments for better visibility, updating syntax to meet PEP8 standards, and making sure all diffs are correctly formatted.
The main update to 'session.py' involves changing the message content sent via self.stream and altering its color. In 'config.py', the significant addition is a new function 'load_model' that is tasked with checking if a model is known and setting the value of maximum context appropriately. Also, code formatting has been improved in a few places to maintain readability.
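
The commit message does not show `load_model` itself, so the following is only a guess at the general idea: look the model up in a table of known models and set the maximum context from it. The model names, context sizes, and the clamping rule are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Made-up model names and sizes; not the project's actual table.
KNOWN_CONTEXT_SIZES = {"example-small": 4096, "example-large": 8192}


@dataclass
class ModelSettings:
    model: str
    maximum_context: Optional[int] = None


def load_model(settings: ModelSettings) -> ModelSettings:
    """Fill in or clamp `maximum_context` when the model is known."""
    known_limit = KNOWN_CONTEXT_SIZES.get(settings.model)
    if known_limit is None:
        # Unknown model: keep whatever the user configured.
        return settings
    if settings.maximum_context is None:
        settings.maximum_context = known_limit
    else:
        # Never exceed what the known model supports.
        settings.maximum_context = min(settings.maximum_context, known_limit)
    return settings
```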
The change ensures that the mentat command in the lint and test workflow runs in the correct directory. The previous command was not specifying the directory which may have caused some unexpected behaviors.
@jakethekoenig
Member

Thanks for your interest in the project. Some quick thoughts:

  1. What is the purpose of prompt_type? At the end of the day, Markdown prompts are still plaintext prompts that are sent to GPT. There's an empirical question about whether writing them in Markdown improves LLM performance, and there are ergonomic reasons for developers to write them that way. But either way there doesn't seem to be any reason for mentat to "know" whether a given prompt is Markdown-formatted or not.
  2. I'm not sure we're interested in using poetry. Can you talk more about the benefits?
  3. It seems not ideal to combine 1-7 into one PR. They don't seem logically connected and it's hard to review such a massive change.
  4. I'm not a big fan of the current JSON config file either, but I'm not sure YAML is the best choice. TOML might be simpler.

3 participants