Items in .gitignore should still be searchable via file finder #4745

Open
1 task done
mifopen opened this issue May 13, 2023 · 19 comments · May be fixed by #16852
Labels
bug [core label] file finder Feedback for file management, navigation, etc git Git integration feedback

Comments

@mifopen

mifopen commented May 13, 2023

Check for existing issues

  • Completed

Describe the bug / provide steps to reproduce it

  1. Create a file named .env.local in the open directory
  2. Open the file search palette
  3. Search for env or local

Environment

Zed: v0.85.4 (stable)
OS: macOS 13.3.1
Memory: 16 GiB
Architecture: aarch64

If applicable, add mockups / screenshots to help explain present your vision of the feature

No response

If applicable, attach your ~/Library/Logs/Zed/Zed.log file to this issue.

If you only need the most recent lines, you can run the zed: open log command palette action to see the last 1000.

No response

@mifopen mifopen added admin read Pending admin review bug [core label] triage Maintainer needs to classify the issue labels May 13, 2023
@hovsater hovsater added file finder Feedback for file management, navigation, etc and removed triage Maintainer needs to classify the issue labels May 15, 2023
@JosephTLyons JosephTLyons removed the admin read Pending admin review label May 15, 2023
@JosephTLyons
Collaborator

JosephTLyons commented May 15, 2023

Hey @mifopen, by chance, do you have the .env.local file .gitignored?

@mifopen
Author

mifopen commented May 15, 2023

@JosephTLyons yes, 100%

@mifopen
Author

mifopen commented May 15, 2023

And I see where it could come from. But I've been using JetBrains IDEs for a decade and believe that "having something in .gitignore" doesn't equal "excluded from the global project search". In JetBrains IDEs it's a separate setting (you can mark directories/files as "excluded"), and that feels natural.

@JosephTLyons
Collaborator

JosephTLyons commented May 15, 2023

I think that behavior is currently baked into Zed on purpose, to prevent things like node_modules (and similar setups in other languages) from absolutely polluting the file finder results. That being said, we should probably consider some sort of setting here to give more flexibility.

@mifopen
Author

mifopen commented May 16, 2023

Totally agree

@JosephTLyons JosephTLyons changed the title from ".env.local file doesn't appear in file search" to "Items in .gitignore should still be searchable via file finder" Sep 25, 2023
@TwanLuttik

I agree as well, but maybe add an option in the settings to override it.

SomeoneToIgnore referenced this issue Dec 21, 2023

Deals with https://github.com/zed-industries/community/issues/2347
Part of https://github.com/zed-industries/community/issues/1538

Now file finder will match all gitignored worktree entries.
Zed does not traverse gitignored dirs by default, which means that not
all gitignored files will be matched, but all that were toggled in
project panel and all root non-directory gitignored entries will now be
used, hopefully causing fewer questions.

Release Notes:

- Improved file finder to match all gitignored files that were added
into worktrees (e.g. due to opening gitignored directories in project
panel)
@JosephTLyons JosephTLyons transferred this issue from zed-industries/community Jan 24, 2024
@abejfehr

abejfehr commented Jan 24, 2024

Piggybacking to share this related behaviour: in VS Code, doing "Find in folder" and searching for a query works if a folder is ignored, but in Zed that doesn't seem to be the case.

Even if build/dist folders are grayed out in the editor, sometimes I want to make sure that something built correctly by searching for a string directly in an ignored folder.

Similar thing if I'm learning how a node_module works, sometimes I want to perform a scoped search in the node_modules folder even though it's ignored

@fvsch

fvsch commented Jan 26, 2024

I have this issue as well. I understand that excluding all files matched in .gitignore is helpful to exclude directories with installed dependencies (node_modules, vendor, etc.), and temporary outputs like build artifacts and caches.

Still, there are gitignored files which are legitimately useful to be able to open quickly.

Some possible heuristics that could balance those two needs (at the cost of some complexity):

  • Include the file in results if the query is sufficiently precise. For instance, include the file only if the query matches enough of the file name and the start of the file name, so typing env will not return .env.local but typing .env or .env. might?
  • Include the file in results but deprioritize it (i.e. put gitignored results at the end, so that you have to have a sufficiently precise query to filter out other not-ignored matches).
  • Only exclude gitignored files which are children or descendants of a gitignored directory. That would typically exclude dependencies and caches, but include config files.
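
For illustration, the second and third heuristics could be combined roughly as below. This is only a sketch: `Candidate`, its two flags and `rank_candidates` are hypothetical names made up for this example, not Zed's actual types, and it assumes the candidates arrive already ordered by fuzzy-match score.

```rust
/// Hypothetical, simplified view of a file finder candidate (not Zed's `Entry`).
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    path: String,
    /// The file itself matches a .gitignore rule.
    is_ignored: bool,
    /// Some ancestor directory is gitignored (e.g. `node_modules/`).
    inside_ignored_dir: bool,
}

/// Heuristic 3: drop only files that live under a gitignored directory.
/// Heuristic 2: keep the remaining gitignored files, but rank them last.
fn rank_candidates(mut candidates: Vec<Candidate>) -> Vec<Candidate> {
    candidates.retain(|c| !c.inside_ignored_dir);
    // `sort_by_key` is stable, so the incoming order is preserved within each
    // group; `false < true` places non-ignored files first.
    candidates.sort_by_key(|c| c.is_ignored);
    candidates
}
```

With `.env.local` (ignored, in the root), `node_modules/env/index.js` (inside an ignored directory) and `src/env.rs` (tracked) as input, this returns `src/env.rs` followed by `.env.local`.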

@alex-astronomer

alex-astronomer commented Mar 21, 2024

I have started work on this issue. The solution that I came up with is adding an "Ignore Included" toggle to the FileFinder picker modal.

image

Please leave comments if you have design feedback on this. I believe that a clean solution to this issue will involve different pickers each having their own search options. File Finder can have gitignore toggled for example, or project symbols could have case sensitivity toggled. That last example I just made up.

The reason that this is a clean solution is because each Picker has its own delegate for the different functions. We can modify the PickerDelegate trait in order to add optional search options for each type of picker in order to make this extensible and re-use components that are already written (SearchOptions).

Developers: I accept any and all feedback about design!
Users: Let me know if this would solve the problems that you're facing.

@SomeoneToIgnore
Contributor

This is a relatively hard issue to tackle, if put in a generic form as "[any] items in .gitignore should still be searchable via file finder".
The hardest part would be keeping file finder searches very fast while tracking all gitignored files (#7504).

First of all, node_modules/, target/, foo_bar_output, .env or other files are all alike to Zed: it has no heuristics to tell them apart.

Consider https://github.com/microsoft/vscode-eslint project as a web example, after installing the dependencies with npm i:

❯ find . -type f ! -path './node_modules/*' | wc -l
     870

❯ find . -type f |wc -l
    7882

❯ du -ha node_modules
........snip
 90M	node_modules

❯ du -ha .
........snip
104M	.

node_modules is a gitignored directory there; it contains roughly an order of magnitude more files than the project itself and accounts for ~90% of the project's size on disk.

Also note that this repo is a relatively small project that does not include Angular or React + somethingX + .. in its package.json.

State in Zed

Zed used to track all gitignored files at some point, but it soon turned out that it became unresponsive relatively quickly, because it had to react to all related FS events and also [re]scan directories:

// Continue processing events until the worktree is dropped.

Now Zed scans only the non-ignored directories:

fn should_scan_directory(&self, entry: &Entry) -> bool {
plus whatever the expand_entry -> refresh_entries_for_paths -> forcibly_load_paths chain of actions in the worktree loads, so all currently expanded gitignored directories are added into the same collection of worktree entries:
entries_by_path: SumTree<Entry>,
entries_by_id: SumTree<PathEntry>,

and will be used when searching or displaying in the project panel, project search, file finder and various entry-related iterations.

This way, Zed only "indexes" and uses gitignored files if they are in the worktree/"project" root (which gets opened by default) or in other directories that were opened (e.g. by autonavigating in the project tree to the entry corresponding to the opened editor).
This index is central to how files are addressed, so things might get slow if more entries are added to it.
There seem to be some scaling issues with the current model already: #8242, and adding more (10x at least) on top is not easy, if possible at all.

Zed does not use the "gitignored files" concept very often: it shows a different icon in the project panel (hence the whole "load gitignored directories via expand_entry call" story) and allows running a project search over regular files plus the gitignored ones.
The project search part is done by a separate background thread that walks the gitignored tree roots and matches the files, and there's a limit on the number of results:

const MAX_SEARCH_RESULT_FILES: usize = 5_000;
const MAX_SEARCH_RESULT_RANGES: usize = 10_000;

Design considerations

First, we need to understand what to display and how. So far, it seems that some people expect Zed to show an arbitrary node_modules/foo/bar.ts file in the file finder if queried, while others will be happy if just their .env.* files from the project root open in the file finder.

The latter is simple to fix with #9760 or similar, but keeping the same design, how could the former be done?
I currently think it's not really possible: we cannot bloat the main cache with so many extra entries, yet we have to be able to answer fuzzy path queries over a 10x larger set of files.

Traversing such file trees in real time for fuzzy match queries does not sound possible, and caching seems hard too because of invalidation.
The current worktree entry SumTree cache tracks every related FS event for that, which would not work here, so either some other strategy has to be picked or a better way of approaching the problem found; it seems rather wasteful to cache 10x of the repo size just to enable file finder queries.

None of the current editors known to me seems to provide any similar functionality, but VSCode has a "whitelist" of directories to track. We could solve the issue with yet another config option, but that would not be very discoverable and might still slow things down overall on a large enough node_modules if we reuse the same worktree entry SumTree cache.
On the bright side, with reasonable defaults (we can add all .env*-like files there explicitly) it will work for many people.

One idea that seems worth exploring is to add more interactivity to the file finder and ask the user to type a gitignored root first before looking inside it: match regularly until the query contains node_modules or any other gitignored root that was not scanned, then start proposing node_modules/* directories that match further file finder input, e.g.
for node_modules/pr, the picker will show the first N directories with names starting with pr.
Since typing this out is relatively tedious, some list completion would be needed, and a background indexing task might index a subset and switch to the regular fuzzy matching mode.
While this might be a blast if implemented properly, it sounds rather complicated (and would require more design than anything else above), and the "whitelist" + new file finder input toggle sound more feasible to do.
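
To make the node_modules/pr example concrete, here is a rough sketch of the "propose the first N matching directories" step. Everything in it is an assumption for illustration: `propose_children` is a made-up helper, and the list of child directory names is assumed to have been read lazily from disk (for instance with `std::fs::read_dir`) rather than taken from the worktree index.

```rust
/// Given the direct child directory names of an unscanned gitignored `root`
/// and a query such as `node_modules/pr`, propose the first `limit` children
/// whose names start with the part typed after `root/`.
/// Returns nothing if the query does not start with `root/`.
fn propose_children<'a>(
    root: &str,
    children: &[&'a str],
    query: &str,
    limit: usize,
) -> Vec<&'a str> {
    let typed_prefix = match query
        .strip_prefix(root)
        .and_then(|rest| rest.strip_prefix('/'))
    {
        Some(prefix) => prefix,
        None => return Vec::new(),
    };
    children
        .iter()
        .copied()
        .filter(|name| name.starts_with(typed_prefix))
        .take(limit)
        .collect()
}
```

Because only one directory level is listed per keystroke, nothing under the gitignored root has to be indexed up front; the cost is that the user navigates by prefix rather than by fuzzy match until a background task catches up.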

@solventak

Ohhh interesting.... Thank you so much for replying so quickly to this issue. I will take a look through the (incredibly extensive) comment that you left and I'll let you know if I have additional questions.

@CurbaiCode

CurbaiCode commented Mar 30, 2024

#6927 seems like it could be related to this. Also, #5029.

@aarroisi

I think excluding files from .gitignore generally is a good approach. 95% of my time searching files, I don't want to search in the files / folders included in the .gitignore.

But I also frequently need to open files like .env that are not checked into git. So my proposed solution is a config that "forces" files/folders to be included in search, even though they're gitignored. Probably something like this:

{
  ...
  "always_include_in_search":
  [
    "**/.env"
  ],
  ...
}

So it acts sort of like a reverse gitignore, and can also use the same patterns used in .gitignore files for looking matched files / folders.
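
A minimal sketch of how such an override could be evaluated, assuming the proposed (not existing) `always_include_in_search` setting. The matcher here is deliberately tiny and hypothetical: it only understands `*` within a path segment and a leading `**/` followed by a single file-name pattern, which is far less than real .gitignore globbing.

```rust
/// Matches a single path segment; `*` matches any run of characters.
fn segment_matches(pattern: &[u8], text: &[u8]) -> bool {
    match pattern.split_first() {
        None => text.is_empty(),
        Some((&b'*', rest)) => (0..=text.len()).any(|i| segment_matches(rest, &text[i..])),
        Some((&p, rest)) => match text.split_first() {
            Some((&t, tail)) => t == p && segment_matches(rest, tail),
            None => false,
        },
    }
}

/// `**/name` matches `name` in any directory (including the root);
/// any other pattern is compared with the relative path segment by segment.
fn glob_matches(pattern: &str, path: &str) -> bool {
    if let Some(name_pattern) = pattern.strip_prefix("**/") {
        let file_name = path.rsplit('/').next().unwrap_or(path);
        return segment_matches(name_pattern.as_bytes(), file_name.as_bytes());
    }
    let mut pattern_segments = pattern.split('/');
    let mut path_segments = path.split('/');
    loop {
        match (pattern_segments.next(), path_segments.next()) {
            (None, None) => return true,
            (Some(p), Some(t)) if segment_matches(p.as_bytes(), t.as_bytes()) => continue,
            _ => return false,
        }
    }
}

/// A file is searchable if it is not gitignored, or if one of the
/// (hypothetical) `always_include_in_search` patterns forces it back in.
fn is_searchable(path: &str, is_gitignored: bool, always_include: &[&str]) -> bool {
    !is_gitignored || always_include.iter().any(|pattern| glob_matches(pattern, path))
}
```

With `["**/.env"]` as the override list, a gitignored `apps/web/.env` becomes searchable again while `node_modules/foo/index.js` stays hidden.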

@Hemant-Mann

I think the solution by @aarroisi should work.

I also need to search some files which are ignored in .gitignore, like configurations or some internal dependency. Giving the user a configuration option would give them more control over what they want to search. The default behaviour is okay, but we must have an option to override it if needed!

@ottodevs

ottodevs commented Jun 18, 2024

Another idea could be to reuse the already proven and mature .gitignore syntax so users are able to do things like:

"file_scan_exclusions": [
  "!.env*"
  // (...your other exclusions) 
]

Wherever the editor reads the .gitignore file, it would need to apply the file_scan_exclusions over it to come up with the final exclusions list. Even just concatenating both lists if that makes sense.

This brings great flexibility and avoids introducing new settings.

Examples:

  1. Empty setting for file_scan_exclusions to just use the .gitignore (although some sane defaults like the current ones seem nice to keep).

    {
      "file_scan_exclusions": []
    }
  2. Setting file_scan_exclusions to ["!.env*"] to negate current .gitignore settings.

    {
      "file_scan_exclusions": [
        "!.env*"
      ]
    }
  3. Add more files to the file_scan_exclusions to be excluded from the file scan apart from the ones in the .gitignore.

    {
      "file_scan_exclusions": [
        "*.tmp",
        "*.bak",
        "/logs/*",
        "!important.log"
      ]
    }
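
As a sketch of the "concatenate both lists" idea: in .gitignore, the last matching pattern decides and a leading `!` re-includes, so appending `file_scan_exclusions` after the .gitignore patterns lets the setting override them. The code below is an assumption-laden illustration, not how Zed evaluates this setting today: `wildcard_match` and `is_excluded` are made-up helpers that match against the file name only and ignore anchoring (`/logs/*`), `**`, and git's rule that a file cannot be re-included when its parent directory is excluded.

```rust
/// Minimal wildcard match on a file name: `*` matches any run of characters.
fn wildcard_match(pattern: &str, name: &str) -> bool {
    match pattern.split_once('*') {
        None => pattern == name,
        Some((prefix, rest)) => match name.strip_prefix(prefix) {
            None => false,
            Some(tail) => (0..=tail.len())
                .filter(|&i| tail.is_char_boundary(i))
                .any(|i| wildcard_match(rest, &tail[i..])),
        },
    }
}

/// Applies `patterns` in order; the last matching pattern wins and a
/// leading `!` turns an exclusion back into an inclusion.
fn is_excluded(file_name: &str, patterns: &[&str]) -> bool {
    let mut excluded = false;
    for pattern in patterns {
        let (negated, glob) = match pattern.strip_prefix('!') {
            Some(glob) => (true, glob),
            None => (false, *pattern),
        };
        if wildcard_match(glob, file_name) {
            excluded = !negated;
        }
    }
    excluded
}
```

For example, with .gitignore patterns `.env*` and `*.log` followed by the setting's `!.env*`, `*.tmp` and `!important.log`, the files `.env.local` and `important.log` come back into the scan, while `debug.log` and `scratch.tmp` stay excluded.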

@thnt

thnt commented Aug 19, 2024

I think we just need to allow searching for a file by relative path, e.g. ./.env.local, to open an ignored file.

@michaelaguiar

Re: @ottodevs's proposal above to reuse .gitignore syntax in "file_scan_exclusions" (e.g. "!.env*" to re-include ignored files).

Has something like this been implemented? I am hoping for a quick way to open log files in the search project files dialog, that are currently ignored by git. This would be exactly what I need.

@Tobbe

Tobbe commented Aug 27, 2024

I think @abejfehr's comment is worth more attention.

If I explicitly list a .gitignored directory in my "include" filter I think it makes sense to override the exclusion of that directory and actually include files inside it when searching, even though it's part of my .gitignore

@maxsbelt

E.g. in VS Code my workaround is to use a .ignore file to unignore the things that I want to see in quick search:

.gitignore

*.local

.ignore

# enables ability to jump to .env.local file through VSCode quick search
!.env.local
