Skip to content

Improve SuperPMI script robustness #65586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

BruceForstall
Copy link
Contributor

@BruceForstall BruceForstall commented Feb 18, 2022

  1. Be more robust to temp directory removal failure

If we fail to remove a temporary directory, don't crash. Log the failure
and the set of directories and files still remaining, and continue.

We have seen this failure in at least one Linux x64 PMI coreclr_tests
SuperPMI collection:

[Errno 39] Directory not empty: '/datadisks/disk1/work/B18E0979/t/tmpov3b4_qa'
  1. Limit the length of file names created by make_safe_filename. We were creating
    file names out of full path names, where those paths contained long temporary directory
    paths, and thus we were exceeding the maximum allowed file name component.

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 18, 2022
@ghost ghost assigned BruceForstall Feb 18, 2022
@ghost
Copy link

ghost commented Feb 18, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

Issue Details

If we fail to remove a temporary directory, don't crash. Log the failure
and the set of directories and files still remaining, and continue.

We have seen this failure in at least one Linux x64 PMI coreclr_tests
SuperPMI collection:

[Errno 39] Directory not empty: '/datadisks/disk1/work/B18E0979/t/tmpov3b4_qa'
Author: BruceForstall
Assignees: -
Labels:

area-CodeGen-coreclr

Milestone: -

@BruceForstall
Copy link
Contributor Author

@dotnet/jit-contrib Improve SPMI collection robustness

except Exception as ex:
logging.warning("Warning: failed to remove directory \"%s\": %s", self.mydir, ex)
# Print out all the remaining files and directories.
for dirpath, dirnames, filenames in os.walk(self.mydir):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any point in printing this? What if os.walk fails for that directory as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I wanted to print something in case that provides clues to why the rmtree is failing in the first place; being completely silent to the failure seems wrong. It seemed like walk would be "safer", but I guess I could be paranoid and put try/except around that, too, if you think that would help.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that will be good because I have seen things fail in APIs around files/folders. We can't do much about it except to catch those exceptions and move forward.

1. Be more robust to temp directory removal failure

If we fail to remove a temporary directory, don't crash. Log the failure
and the set of directories and files still remaining, and continue.

We have seen this failure in at least one Linux x64 PMI coreclr_tests
SuperPMI collection:
```
[Errno 39] Directory not empty: '/datadisks/disk1/work/B18E0979/t/tmpov3b4_qa'
```

2. Limit the length of file names created by `make_safe_filename`. We were creating
file names out of full path names, where those paths contained long temporary directory
paths, and thus we were exceeding the maximum allowed file name component.
@BruceForstall BruceForstall force-pushed the AddDirectoryRemovalRobustness branch from 26bf29d to a6fc7d1 Compare February 18, 2022 21:13
@BruceForstall BruceForstall changed the title Be more robust to temp directory removal failure Improve SuperPMI script robustness Feb 18, 2022
@BruceForstall
Copy link
Contributor Author

@kunalspathak I added the extra try/except, and also added code to handle "too long" file names that was causing a failure on OSX.

# Typically, a max filename length is 256, but let's limit it far below that, because callers typically add additional
# strings to this.
max_allowed_file_name_length = 150
s = "".join(safe_char(c) for c in s)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
s = "".join(safe_char(c) for c in s)
s = "".join(safe_char(c) for c in s)[-max_allowed_file_name_length:]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to leave this as-is so people (like me) not too knowledgeable about Python precedence rules can understand it more easily...

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@BruceForstall BruceForstall merged commit cf6dbde into dotnet:main Feb 18, 2022
@BruceForstall BruceForstall deleted the AddDirectoryRemovalRobustness branch February 18, 2022 21:33
@ghost ghost locked as resolved and limited conversation to collaborators Mar 21, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants