Bug: aider crashes on files with non-ASCII characters (when building the repository map) #82

Closed
@cgrothaus

Description

When aider builds the repository map for a git repository containing filenames with non-ASCII characters, it crashes with a FileNotFoundError in the run_ctags function. The filename passed to os.path.getmtime is wrapped in double quotes and contains the non-ASCII character as escaped octal bytes (\\303\\244 in the traceback below, which is the UTF-8 encoding of ä), so the path does not match any file on disk.
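The escaped name in the error message looks like the C-style quoting that git applies to paths by default (controlled by core.quotePath). Whether aider actually obtains the name from git output is an assumption here, not something the traceback confirms. As a minimal sketch, the hypothetical helper unquote_git_path below (not part of aider) reverses that quoting and shows that the escaped bytes decode back to the real filename:

```python
# Hypothetical helper, not part of aider: undo git-style C quoting of a path.
# Assumes the path bytes are UTF-8 and the escapes are well formed.

_SIMPLE_ESCAPES = {
    "a": 0x07, "b": 0x08, "f": 0x0C, "n": 0x0A, "r": 0x0D,
    "t": 0x09, "v": 0x0B, '"': 0x22, "\\": 0x5C,
}


def unquote_git_path(quoted: str) -> str:
    """Return the real path for a name such as "doc/f\\303\\244nny/x.md"."""
    if not (len(quoted) >= 2 and quoted.startswith('"') and quoted.endswith('"')):
        return quoted  # not quoted, nothing to undo

    body = quoted[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out += ch.encode("utf-8")
            i += 1
        elif body[i + 1] in _SIMPLE_ESCAPES:
            # Two-character escape such as \n or \"
            out.append(_SIMPLE_ESCAPES[body[i + 1]])
            i += 2
        else:
            # Three-digit octal escape for a single byte, e.g. \303
            out.append(int(body[i + 1 : i + 4], 8))
            i += 4
    return out.decode("utf-8")


if __name__ == "__main__":
    # Raw string: single backslashes, as in the actual (non-repr) filename.
    print(unquote_git_path(r'"doc/f\303\244nny_dirname/README.md"'))
```

The traceback shows doubled backslashes only because Python prints the repr of the filename; the string itself contains single backslashes.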

Steps to Reproduce

  1. Create a git repository with a file with a non-ASCII character in the file or directory name (e.g., doc/fänny_dirname/README.md), or clone such a repository (demo repo: https://github.com/cgrothaus/sample-repo-demonstrate-aider-bug-special-filenames).
  2. Run aider on the repository.
  3. Run the /tokens command, which causes the repo map to be built.

aider crashes with this error output:

Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniconda/base/envs/aider/bin/aider", line 33, in <module>
    sys.exit(load_entry_point('aider-chat', 'console_scripts', 'aider')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/christoph.grothaus/projects/aider/main.py", line 371, in main
    coder.run()
  File "/Users/christoph.grothaus/projects/aider/coders/base_coder.py", line 382, in run
    new_user_message = self.run_loop()
                       ^^^^^^^^^^^^^^^
  File "/Users/christoph.grothaus/projects/aider/coders/base_coder.py", line 446, in run_loop
    return self.commands.run(inp)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/christoph.grothaus/projects/aider/commands.py", line 60, in run
    return self.do_run(matching_commands[0][1:], rest_inp)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/christoph.grothaus/projects/aider/commands.py", line 45, in do_run
    return cmd_method(args)
           ^^^^^^^^^^^^^^^^
  File "/Users/christoph.grothaus/projects/aider/commands.py", line 113, in cmd_tokens
    repo_content = self.coder.repo_map.get_repo_map(self.coder.abs_fnames, other_files)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/christoph.grothaus/projects/aider/repomap.py", line 107, in get_repo_map
    res = self.choose_files_listing(chat_files, other_files)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/christoph.grothaus/projects/aider/repomap.py", line 138, in choose_files_listing
    files_listing = self.get_ranked_tags_map(chat_files, other_files)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/christoph.grothaus/projects/aider/repomap.py", line 381, in get_ranked_tags_map
    ranked_tags = self.get_ranked_tags(chat_fnames, other_fnames)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/christoph.grothaus/projects/aider/repomap.py", line 289, in get_ranked_tags
    data = self.run_ctags(fname)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/christoph.grothaus/projects/aider/repomap.py", line 175, in run_ctags
    file_mtime = os.path.getmtime(filename)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen genericpath>", line 55, in getmtime
FileNotFoundError: [Errno 2] No such file or directory: '/Users/christoph.grothaus/projects/sample-repo-demonstrate-aider-bug-special-filenames/"doc/f\\303\\244nny_dirname/README.md"'
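
The last frame can be reproduced without aider or ctags. The sketch below creates the file from the reproduction steps in a temporary directory and then calls os.path.getmtime on both the real path and the quoted, escaped form seen in the error message. The function name check_paths is made up for illustration.

```python
import os
import tempfile


def check_paths() -> tuple[bool, bool]:
    """Return (real path found, quoted path found) for a demo file."""
    with tempfile.TemporaryDirectory() as root:
        # Create doc/fänny_dirname/README.md as in the reproduction steps.
        real_abs = os.path.join(root, "doc", "fänny_dirname", "README.md")
        os.makedirs(os.path.dirname(real_abs))
        with open(real_abs, "w", encoding="utf-8") as f:
            f.write("demo\n")

        # The quoted, octal-escaped form that appears in the error message.
        quoted_rel = r'"doc/f\303\244nny_dirname/README.md"'
        quoted_abs = os.path.join(root, quoted_rel)

        results = []
        for path in (real_abs, quoted_abs):
            try:
                os.path.getmtime(path)
                results.append(True)
            except OSError:  # FileNotFoundError is a subclass of OSError
                results.append(False)
        return results[0], results[1]


if __name__ == "__main__":
    real_ok, quoted_ok = check_paths()
    print("real path found:  ", real_ok)
    print("quoted path found:", quoted_ok)
```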

Expected Behavior

aider should correctly handle filenames with non-ASCII characters and not crash when calling the run_ctags function.

Actual Behavior

aider crashes with a FileNotFoundError when calling the run_ctags function on a repository containing filenames with non-ASCII characters. The error message indicates that the filename passed to os.path.getmtime contains escaped non-ASCII characters.

Possible Solution

Ensure that the filename is correctly encoded and escaped at all points in the code where it's used. This might involve changing how the filename is read from the file system, how it's stored in the cache, and how it's passed to the ctags command.
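
One way to avoid the escaping at the source, assuming the file list comes from git, is to ask git for NUL-terminated output: git ls-files -z terminates each entry with a NUL byte and does not quote filenames. The following is a sketch under that assumption, not aider's actual code. The function names are hypothetical, and paths are assumed to be UTF-8 encoded.

```python
import subprocess


def parse_ls_files_z(raw: bytes) -> list[str]:
    """Split NUL-terminated `git ls-files -z` output into path strings."""
    # Assumption: path bytes are UTF-8. surrogateescape keeps any
    # undecodable bytes round-trippable instead of raising.
    return [
        entry.decode("utf-8", errors="surrogateescape")
        for entry in raw.split(b"\0")
        if entry
    ]


def list_tracked_files(repo_dir: str) -> list[str]:
    """Return tracked files of the repository at repo_dir, unquoted."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "ls-files", "-z"],
        check=True,
        capture_output=True,
    )
    return parse_ls_files_z(result.stdout)


if __name__ == "__main__":
    sample = "doc/fänny_dirname/README.md\0README.md\0".encode("utf-8")
    print(parse_ls_files_z(sample))
```

Setting core.quotePath to false is an alternative, but it depends on each user's git configuration, whereas -z works per invocation.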

Additional Context

This issue was discovered during a chat session with aider. The issue occurs regardless of the specific non-ASCII characters in the filenames.
