Skip to content

Java: Improve the QHelp for java/path-injection. #15409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jan 25, 2024

Conversation

erik-krogh
Copy link
Contributor

@erik-krogh erik-krogh commented Jan 23, 2024

The previous suggestion heavily suggested that paths should be checked to be within a safe folder, even though that is not always relevant.
I added some more examples, and iterated on the text in the QHelp to try and improve it.

See the backref for more context and motivation.

I checked in the markdown render of the QHelp to get a nice diff that's easy to review, and deleted it again in the last commit.

@max-schaefer: Can you have a first look at this?

Copy link
Contributor

QHelp previews:

java/ql/src/Security/CWE/CWE-022/TaintedPath.qhelp

Uncontrolled data used in path expression

Accessing paths controlled by users can allow an attacker to access unexpected resources. This can result in sensitive information being revealed or deleted, or an attacker being able to influence behavior by modifying unexpected files.

Paths that are naively constructed from data controlled by a user may be absolute paths or contain unexpected special characters, such as "..". Such a path may potentially point anywhere on the file system.

Recommendation

Validate user input before using it to construct a file path.

Common validation methods include checking that the normalized path is relative and does not contain any ".." components, or that the path is contained within a safe folder. The validation method to use depends on how the path is used in the application and whether the path is supposed to be a single path component.

If the path is supposed to be a single path component (such as a file name) you can check for the existence of any path separators ("/" or "\") or ".." sequences in the input, and reject the input if any are found.

Note that removing "../" sequences is not sufficient, since the input could still contain a path separator followed by "..". For example, the input ".../...//" would still result in the string "../" if only "../" sequences are removed.

Finally, the simplest (but most restrictive) option is to use an allow list of safe patterns and make sure that the user input matches one of these patterns.

Example

In this example, a file name is read from a java.net.Socket and then used to access a file and send it back over the socket. However, a malicious user could enter a file name anywhere on the file system, such as "/etc/passwd" or "../../../etc/passwd".

public void sendUserFile(Socket sock, String user) {
	BufferedReader filenameReader = new BufferedReader(
			new InputStreamReader(sock.getInputStream(), "UTF-8"));
	String filename = filenameReader.readLine();
	// BAD: read from a file without checking its path
	BufferedReader fileReader = new BufferedReader(new FileReader(filename));
	String fileLine = fileReader.readLine();
	while(fileLine != null) {
		sock.getOutputStream().write(fileLine.getBytes());
		fileLine = fileReader.readLine();
	}
}

If the input is just supposed to be a file name, you can check that it doesn't contain any path separators or ".." sequences.

public void sendUserFileGood(Socket sock, String user) {
	BufferedReader filenameReader = new BufferedReader(
			new InputStreamReader(sock.getInputStream(), "UTF-8"));
	String filename = filenameReader.readLine();
	// GOOD: ensure that the filename has no path separators or parent directory references
	if (filename.contains("..") || filename.contains("/") || filename.contains("\\")) {
		throw new IllegalArgumentException("Invalid filename");
	}
	BufferedReader fileReader = new BufferedReader(new FileReader(filename));
	String fileLine = fileReader.readLine();
	while(fileLine != null) {
		sock.getOutputStream().write(fileLine.getBytes());
		fileLine = fileReader.readLine();
	}	
}

If the input is supposed to be found within a specific directory, you can check that the resolved path is still contained within that directory.

public void sendUserFileGood(Socket sock, String user) {
	BufferedReader filenameReader = new BufferedReader(
			new InputStreamReader(sock.getInputStream(), "UTF-8"));
	String filename = filenameReader.readLine();

	Path publicFolder = Paths.get("/home/" + user + "/public").normalize().toAbsolutePath();
	Path filePath = publicFolder.resolve(filename).normalize().toAbsolutePath();

	// GOOD: ensure that the path stays within the public folder
	if (!filePath.startsWith(publicFolder)) {
		throw new IllegalArgumentException("Invalid filename");
	}
	BufferedReader fileReader = new BufferedReader(new FileReader(filePath.toString()));
	String fileLine = fileReader.readLine();
	while(fileLine != null) {
		sock.getOutputStream().write(fileLine.getBytes());
		fileLine = fileReader.readLine();
	}
}

References

java/ql/src/Security/CWE/CWE-022/ZipSlip.qhelp

Arbitrary file access during archive extraction ("Zip Slip")

Extracting files from a malicious zip file, or similar type of archive, is at risk of directory traversal attacks if filenames from the archive are not properly validated.

Zip archives contain archive entries representing each file in the archive. These entries include a file path for the entry, but these file paths are not restricted and may contain unexpected special elements such as the directory traversal element (..). If these file paths are used to create a filesystem path, then a file operation may happen in an unexpected location. This can result in sensitive information being revealed or deleted, or an attacker being able to influence behavior by modifying unexpected files.

For example, if a zip file contains a file entry ..\sneaky-file, and the zip file is extracted to the directory c:\output, then naively combining the paths would result in an output file path of c:\output\..\sneaky-file, which would cause the file to be written to c:\sneaky-file.

Recommendation

Ensure that output paths constructed from zip archive entries are validated to prevent writing files to unexpected locations.

The recommended way of writing an output file from a zip archive entry is to verify that the normalized full path of the output file starts with a prefix that matches the destination directory. Path normalization can be done with either java.io.File.getCanonicalFile() or java.nio.file.Path.normalize(). Prefix checking can be done with String.startsWith(..), but it is better to use java.nio.file.Path.startsWith(..), as the latter works on complete path segments.

Another alternative is to validate archive entries against a whitelist of expected files.

Example

In this example, a file path taken from a zip archive item entry is combined with a destination directory. The result is used as the destination file path without verifying that the result is within the destination directory. If provided with a zip file containing an archive path like ..\sneaky-file, then this file would be written outside the destination directory.

void writeZipEntry(ZipEntry entry, File destinationDir) {
    File file = new File(destinationDir, entry.getName());
    FileOutputStream fos = new FileOutputStream(file); // BAD
    // ... write entry to fos ...
}

To fix this vulnerability, we need to verify that the normalized file still has destinationDir as its prefix, and throw an exception if this is not the case.

void writeZipEntry(ZipEntry entry, File destinationDir) {
    File file = new File(destinationDir, entry.getName());
    if (!file.toPath().normalize().startsWith(destinationDir.toPath()))
        throw new Exception("Bad zip entry");
    FileOutputStream fos = new FileOutputStream(file); // OK
    // ... write entry to fos ...
}

References

max-schaefer
max-schaefer previously approved these changes Jan 23, 2024
Copy link
Contributor

@max-schaefer max-schaefer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, this is much clearer than what was there before!

Path filePath = publicFolder.resolve(filename).normalize().toAbsolutePath();

// GOOD: ensure that the path stays within the public folder
if (!filePath.startsWith(publicFolder)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth refining this check by adding a path separator after publicFolder?

@erik-krogh erik-krogh marked this pull request as ready for review January 23, 2024 13:46
@erik-krogh erik-krogh requested a review from a team as a code owner January 23, 2024 13:46
Copy link
Contributor

@atorralba atorralba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! Looping the docs team in as well.

There's one caveat though: users will take these code snippets as prime examples of what should be done to get rid of alerts, but in TaintedPathGoodFolder.java we actually raise an alert when publicFolder.resolve(filename) is called, because Path.resolve is currently a sink (it shouldn't, but that's another debate).

This effectively means that we can't create a Path from user input without raising an alert, so the Path-related validation methods (like resolve + normalize to then check that the path is within an expected directory) can't be recommended until we sort this out.

To catch this sort of thing, it's good practice that we add any QHelp examples as test cases.

Anyway, I would keep the TaintedPathGoodFolder.java example out of this PR for now.

@atorralba atorralba added the ready-for-doc-review This PR requires and is ready for review from the GitHub docs team. label Jan 25, 2024
@erik-krogh
Copy link
Contributor Author

There's one caveat though

I know, and I tried to see if I could rewrite the example to do effectively the same thing, but without constructing any Path objects, but that ended up very ugly (and possibly buggy on Windows).

I want an example of how to sanitize a path that can contain multiple path elements by ensuring that the path is within a safe folder. Because such an example is relevant to the real world.
I want to keep the example, even though it gives a false positive in the analysis, because it is the correct way of fixing that specific scenario.

subatoi
subatoi previously approved these changes Jan 25, 2024
Copy link
Contributor

@subatoi subatoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I only had minor suggestions and I've left reasoning for all of them, but if you have any questions about any of them then please just ask me. Similarly if you need me to take another look after you've made any further changes. Thanks!

Comment on lines 10 to 11
<p>Paths that are naively constructed from data controlled by a user may be absolute paths or contain
unexpected special characters, such as "..". Such a path may potentially point anywhere on the file system.</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<p>Paths that are naively constructed from data controlled by a user may be absolute paths or contain
unexpected special characters, such as "..". Such a path may potentially point anywhere on the file system.</p>
<p>Paths that are naively constructed from data controlled by a user may be absolute paths, or may contain unexpected special characters such as "..". Such a path could point anywhere on the file system.</p>

Minor: main suggestion here is just to use "could" instead of "may potentially" for emphasis and less redundancy (I realise "may potentially" was in the original, already-approved, text). Other than that I've just added some commas to break the sentence up a bit.

In any case, like all my suggestions, it's not urgent.

Comment on lines 18 to 20
<p>Common validation methods include checking that the normalized path is relative and does not contain
any ".." components, or that the path is contained within a safe folder. The validation method to use depends
on how the path is used in the application and whether the path is supposed to be a single path component.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<p>Common validation methods include checking that the normalized path is relative and does not contain
any ".." components, or that the path is contained within a safe folder. The validation method to use depends
on how the path is used in the application and whether the path is supposed to be a single path component.
<p>Common validation methods include checking that the normalized path is relative and does not contain any ".." components, or checking that the path is contained within a safe folder. The method you should use depends on how the path is used in the application, and whether the path should be a single path component.

Main suggestion is just to make emphasis on "you" (the user).

Comment on lines 23 to 25
<p>If the path is supposed to be a single path component (such as a file name) you can check for the existence
of any path separators ("/" or "\") or ".." sequences in the input, and reject the input if any are found.
</p>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<p>If the path is supposed to be a single path component (such as a file name) you can check for the existence
of any path separators ("/" or "\") or ".." sequences in the input, and reject the input if any are found.
</p>
<p>If the path should be a single path component (such as a file name), you can check for the existence of any path separators ("/" or "\"), or ".." sequences in the input, and reject the input if any are found.
</p>

Very minor: commas to break the sentence up a bit

Comment on lines 46 to 47
If the input is just supposed to be a file name, you can check that it doesn't contain any path separators
or ".." sequences.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If the input is just supposed to be a file name, you can check that it doesn't contain any path separators
or ".." sequences.
If the input should only be a file name, you can check that it doesn't contain any path separators or ".." sequences.

V minor: suggestion to change "just supposed to be" to "should be", to make it slightly less idiomatic (ie "supposed to be" might be tricky for some non-native English speakers)

Comment on lines 53 to 54
If the input is supposed to be found within a specific directory, you can check that the resolved path
is still contained within that directory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If the input is supposed to be found within a specific directory, you can check that the resolved path
is still contained within that directory.
If the input should be within a specific directory, you can check that the resolved path
is still contained within that directory.

Same as above

@subatoi subatoi self-assigned this Jan 25, 2024
@erik-krogh
Copy link
Contributor Author

Looks great! I only had minor suggestions and I've left reasoning for all of them, but if you have any questions about any of them then please just ask me. Similarly if you need me to take another look after you've made any further changes. Thanks!

Thanks 👍
I've applied all your suggested changes.

Now I think it's up to the Java team what we do about the FP.

Copy link
Contributor

github-actions bot commented Jan 25, 2024

QHelp previews:

java/ql/src/Security/CWE/CWE-022/TaintedPath.qhelp

Uncontrolled data used in path expression

Accessing paths controlled by users can allow an attacker to access unexpected resources. This can result in sensitive information being revealed or deleted, or an attacker being able to influence behavior by modifying unexpected files.

Paths that are naively constructed from data controlled by a user may be absolute paths, or may contain unexpected special characters such as "..". Such a path could point anywhere on the file system.

Recommendation

Validate user input before using it to construct a file path.

Common validation methods include checking that the normalized path is relative and does not contain any ".." components, or checking that the path is contained within a safe folder. The method you should use depends on how the path is used in the application, and whether the path should be a single path component.

If the path should be a single path component (such as a file name), you can check for the existence of any path separators ("/" or "\"), or ".." sequences in the input, and reject the input if any are found.

Note that removing "../" sequences is not sufficient, since the input could still contain a path separator followed by "..". For example, the input ".../...//" would still result in the string "../" if only "../" sequences are removed.

Finally, the simplest (but most restrictive) option is to use an allow list of safe patterns and make sure that the user input matches one of these patterns.

Example

In this example, a file name is read from a java.net.Socket and then used to access a file and send it back over the socket. However, a malicious user could enter a file name anywhere on the file system, such as "/etc/passwd" or "../../../etc/passwd".

public void sendUserFile(Socket sock, String user) {
	BufferedReader filenameReader = new BufferedReader(
			new InputStreamReader(sock.getInputStream(), "UTF-8"));
	String filename = filenameReader.readLine();
	// BAD: read from a file without checking its path
	BufferedReader fileReader = new BufferedReader(new FileReader(filename));
	String fileLine = fileReader.readLine();
	while(fileLine != null) {
		sock.getOutputStream().write(fileLine.getBytes());
		fileLine = fileReader.readLine();
	}
}

If the input should only be a file name, you can check that it doesn't contain any path separators or ".." sequences.

public void sendUserFileGood(Socket sock, String user) {
	BufferedReader filenameReader = new BufferedReader(
			new InputStreamReader(sock.getInputStream(), "UTF-8"));
	String filename = filenameReader.readLine();
	// GOOD: ensure that the filename has no path separators or parent directory references
	if (filename.contains("..") || filename.contains("/") || filename.contains("\\")) {
		throw new IllegalArgumentException("Invalid filename");
	}
	BufferedReader fileReader = new BufferedReader(new FileReader(filename));
	String fileLine = fileReader.readLine();
	while(fileLine != null) {
		sock.getOutputStream().write(fileLine.getBytes());
		fileLine = fileReader.readLine();
	}	
}

If the input should be within a specific directory, you can check that the resolved path is still contained within that directory.

public void sendUserFileGood(Socket sock, String user) {
	BufferedReader filenameReader = new BufferedReader(
			new InputStreamReader(sock.getInputStream(), "UTF-8"));
	String filename = filenameReader.readLine();

	Path publicFolder = Paths.get("/home/" + user + "/public").normalize().toAbsolutePath();
	Path filePath = publicFolder.resolve(filename).normalize().toAbsolutePath();

	// GOOD: ensure that the path stays within the public folder
	if (!filePath.startsWith(publicFolder + File.separator)) {
		throw new IllegalArgumentException("Invalid filename");
	}
	BufferedReader fileReader = new BufferedReader(new FileReader(filePath.toString()));
	String fileLine = fileReader.readLine();
	while(fileLine != null) {
		sock.getOutputStream().write(fileLine.getBytes());
		fileLine = fileReader.readLine();
	}
}

References

java/ql/src/Security/CWE/CWE-022/ZipSlip.qhelp

Arbitrary file access during archive extraction ("Zip Slip")

Extracting files from a malicious zip file, or similar type of archive, is at risk of directory traversal attacks if filenames from the archive are not properly validated.

Zip archives contain archive entries representing each file in the archive. These entries include a file path for the entry, but these file paths are not restricted and may contain unexpected special elements such as the directory traversal element (..). If these file paths are used to create a filesystem path, then a file operation may happen in an unexpected location. This can result in sensitive information being revealed or deleted, or an attacker being able to influence behavior by modifying unexpected files.

For example, if a zip file contains a file entry ..\sneaky-file, and the zip file is extracted to the directory c:\output, then naively combining the paths would result in an output file path of c:\output\..\sneaky-file, which would cause the file to be written to c:\sneaky-file.

Recommendation

Ensure that output paths constructed from zip archive entries are validated to prevent writing files to unexpected locations.

The recommended way of writing an output file from a zip archive entry is to verify that the normalized full path of the output file starts with a prefix that matches the destination directory. Path normalization can be done with either java.io.File.getCanonicalFile() or java.nio.file.Path.normalize(). Prefix checking can be done with String.startsWith(..), but it is better to use java.nio.file.Path.startsWith(..), as the latter works on complete path segments.

Another alternative is to validate archive entries against a whitelist of expected files.

Example

In this example, a file path taken from a zip archive item entry is combined with a destination directory. The result is used as the destination file path without verifying that the result is within the destination directory. If provided with a zip file containing an archive path like ..\sneaky-file, then this file would be written outside the destination directory.

void writeZipEntry(ZipEntry entry, File destinationDir) {
    File file = new File(destinationDir, entry.getName());
    FileOutputStream fos = new FileOutputStream(file); // BAD
    // ... write entry to fos ...
}

To fix this vulnerability, we need to verify that the normalized file still has destinationDir as its prefix, and throw an exception if this is not the case.

void writeZipEntry(ZipEntry entry, File destinationDir) {
    File file = new File(destinationDir, entry.getName());
    if (!file.toPath().normalize().startsWith(destinationDir.toPath()))
        throw new Exception("Bad zip entry");
    FileOutputStream fos = new FileOutputStream(file); // OK
    // ... write entry to fos ...
}

References

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Java ready-for-doc-review This PR requires and is ready for review from the GitHub docs team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants