Skip to content

[clang-tidy] Add readability-string-view-substr check #120055

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
1 change: 1 addition & 0 deletions clang-tools-extra/clang-tidy/readability/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,7 @@ add_clang_library(clangTidyReadabilityModule STATIC
StaticAccessedThroughInstanceCheck.cpp
StaticDefinitionInAnonymousNamespaceCheck.cpp
StringCompareCheck.cpp
StringViewSubstrCheck.cpp
SuspiciousCallArgumentCheck.cpp
UniqueptrDeleteReleaseCheck.cpp
UppercaseLiteralSuffixCheck.cpp
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@
#include "StaticAccessedThroughInstanceCheck.h"
#include "StaticDefinitionInAnonymousNamespaceCheck.h"
#include "StringCompareCheck.h"
#include "StringViewSubstrCheck.h"
#include "SuspiciousCallArgumentCheck.h"
#include "UniqueptrDeleteReleaseCheck.h"
#include "UppercaseLiteralSuffixCheck.h"
Expand Down Expand Up @@ -146,6 +147,8 @@ class ReadabilityModule : public ClangTidyModule {
"readability-static-definition-in-anonymous-namespace");
CheckFactories.registerCheck<StringCompareCheck>(
"readability-string-compare");
CheckFactories.registerCheck<StringViewSubstrCheck>(
"readability-stringview-substr");
CheckFactories.registerCheck<readability::NamedParameterCheck>(
"readability-named-parameter");
CheckFactories.registerCheck<NonConstParameterCheck>(
Expand Down
201 changes: 201 additions & 0 deletions clang-tools-extra/clang-tidy/readability/StringViewSubstrCheck.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,201 @@
//===--- StringViewSubstrCheck.cpp - clang-tidy------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "StringViewSubstrCheck.h"
#include "clang/AST/ASTContext.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/Lex/Lexer.h"

using namespace clang::ast_matchers;

namespace clang::tidy::readability {

void StringViewSubstrCheck::registerMatchers(MatchFinder *Finder) {
const auto HasStringViewType = hasType(hasUnqualifiedDesugaredType(recordType(
hasDeclaration(recordDecl(hasName("::std::basic_string_view"))))));

// Match assignment to string_view's substr
Finder->addMatcher(
cxxOperatorCallExpr(
hasOverloadedOperatorName("="),
hasArgument(0, expr(HasStringViewType).bind("target")),
hasArgument(
1, cxxMemberCallExpr(callee(memberExpr(hasDeclaration(
cxxMethodDecl(hasName("substr"))))),
on(expr(HasStringViewType).bind("source")))
.bind("substr_call")))
.bind("assignment"),
this);
}

void StringViewSubstrCheck::check(const MatchFinder::MatchResult &Result) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try to do as much of the checking inside the matchers. I think everything can be done inside the matcher.

Copy link
Contributor Author

@hjanuschka hjanuschka Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all other feedback addressed, and pushed, will go and kind of mentally restart this, to get the most logic moved to matchers() 🙈

const auto *Assignment =
Result.Nodes.getNodeAs<CXXOperatorCallExpr>("assignment");
const auto *Target = Result.Nodes.getNodeAs<Expr>("target");
const auto *Source = Result.Nodes.getNodeAs<Expr>("source");
const auto *SubstrCall =
Result.Nodes.getNodeAs<CXXMemberCallExpr>("substr_call");

if (!Assignment || !Target || !Source || !SubstrCall) {
return;
}

// Get the DeclRefExpr for the target and source
const auto *TargetDRE = dyn_cast<DeclRefExpr>(Target->IgnoreParenImpCasts());
const auto *SourceDRE = dyn_cast<DeclRefExpr>(Source->IgnoreParenImpCasts());

if (!TargetDRE || !SourceDRE) {
return;
}

const Expr *StartArg = SubstrCall->getArg(0);
const Expr *LengthArg =
SubstrCall->getNumArgs() > 1 ? SubstrCall->getArg(1) : nullptr;

std::string StartText =
Lexer::getSourceText(
CharSourceRange::getTokenRange(StartArg->getSourceRange()),
*Result.SourceManager, Result.Context->getLangOpts())
.str();

const bool IsSameVar = (TargetDRE->getDecl() == SourceDRE->getDecl());

// Case 1: Check for remove_prefix pattern
if (!LengthArg || isa<CXXDefaultArgExpr>(LengthArg)) {
if (IsSameVar) {
std::string Replacement = TargetDRE->getNameInfo().getAsString() +
".remove_prefix(" + StartText + ")";
diag(Assignment->getBeginLoc(), "prefer 'remove_prefix' over 'substr' "
"for removing characters from the start")
<< FixItHint::CreateReplacement(Assignment->getSourceRange(),
Replacement);
}
return;
}

// Check if StartArg resolves to 0
bool IsZero = false;

// Handle cases where StartArg is an integer literal
if (const auto *IL =
dyn_cast<IntegerLiteral>(StartArg->IgnoreParenImpCasts())) {
IsZero = IL->getValue() == 0;
}

// Optional: Handle cases where StartArg evaluates to zero
if (!IsZero) {
// Add logic for other constant evaluation (e.g., constexpr evaluation)
const auto &EvalResult = StartArg->EvaluateKnownConstInt(*Result.Context);
IsZero = !EvalResult.isNegative() && EvalResult == 0;
}

// If StartArg resolves to 0, handle the case
if (IsZero) {
bool isFullCopy = false;

// Check for length() or length() - expr pattern
if (const auto *BinOp = dyn_cast<BinaryOperator>(LengthArg)) {
if (BinOp->getOpcode() == BO_Sub) {
const Expr *LHS = BinOp->getLHS();
const Expr *RHS = BinOp->getRHS();

// Check for length() call
if (const auto *LengthCall = dyn_cast<CXXMemberCallExpr>(LHS)) {
if (const auto *LengthMethod =
dyn_cast<CXXMethodDecl>(LengthCall->getDirectCallee())) {
if (LengthMethod->getName() == "length") {
const Expr *LengthObject =
LengthCall->getImplicitObjectArgument();
const auto *LengthDRE =
dyn_cast<DeclRefExpr>(LengthObject->IgnoreParenImpCasts());

if (!LengthDRE || LengthDRE->getDecl() != SourceDRE->getDecl()) {
return;
}

// Check if RHS is 0 or evaluates to 0
bool IsZero = false;
if (const auto *IL =
dyn_cast<IntegerLiteral>(RHS->IgnoreParenImpCasts())) {
IsZero = IL->getValue() == 0;
}

if (IsZero) {
isFullCopy = true;
} else if (IsSameVar) {
// remove_suffix case (only for self-assignment)
std::string RHSText =
Lexer::getSourceText(
CharSourceRange::getTokenRange(RHS->getSourceRange()),
*Result.SourceManager, Result.Context->getLangOpts())
.str();

std::string Replacement =
TargetDRE->getNameInfo().getAsString() + ".remove_suffix(" +
RHSText + ")";
diag(Assignment->getBeginLoc(),
"prefer 'remove_suffix' over 'substr' for removing "
"characters from the end")
<< FixItHint::CreateReplacement(
Assignment->getSourceRange(), Replacement);
return;
}
}
}
}
}
} else if (const auto *LengthCall =
dyn_cast<CXXMemberCallExpr>(LengthArg)) {
// Handle direct length() call
if (const auto *LengthMethod =
dyn_cast<CXXMethodDecl>(LengthCall->getDirectCallee())) {
if (LengthMethod->getName() == "length") {
const Expr *LengthObject = LengthCall->getImplicitObjectArgument();
const auto *LengthDRE =
dyn_cast<DeclRefExpr>(LengthObject->IgnoreParenImpCasts());

if (LengthDRE && LengthDRE->getDecl() == SourceDRE->getDecl()) {
isFullCopy = true;
}
}
}
}
if (isFullCopy) {
if (IsSameVar) {
// Remove redundant self-copy, including the semicolon
SourceLocation EndLoc = Assignment->getEndLoc();
while (EndLoc.isValid()) {
const char *endPtr = Result.SourceManager->getCharacterData(EndLoc);
if (*endPtr == ';')
break;
EndLoc = Lexer::getLocForEndOfToken(EndLoc, 0, *Result.SourceManager,
Result.Context->getLangOpts());
}
if (EndLoc.isValid()) {
diag(Assignment->getBeginLoc(), "redundant self-copy")
<< FixItHint::CreateRemoval(CharSourceRange::getCharRange(
Assignment->getBeginLoc(),
EndLoc.getLocWithOffset(
1))); // +1 to include the semicolon.
}
} else {
// Simplify copy between different variables
std::string Replacement = TargetDRE->getNameInfo().getAsString() +
" = " +
SourceDRE->getNameInfo().getAsString();
diag(Assignment->getBeginLoc(), "prefer direct copy over substr")
<< FixItHint::CreateReplacement(Assignment->getSourceRange(),
Replacement);
}

return;
}
}
}

} // namespace clang::tidy::readability
39 changes: 39 additions & 0 deletions clang-tools-extra/clang-tidy/readability/StringViewSubstrCheck.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
//===--- StringViewSubstrCheck.h - clang-tidy---------------------*- C++
//-*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_READABILITY_STRINGVIEWSUBSTRCHECK_H
#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_READABILITY_STRINGVIEWSUBSTRCHECK_H

#include "../ClangTidyCheck.h"

namespace clang::tidy::readability {

/// Finds string_view substr() calls that can be replaced with remove_prefix()
/// or remove_suffix().
///
/// For the user-facing documentation see:
/// https://clang.llvm.org/extra/clang-tidy/checks/readability/string-view-substr.html
///
/// The check matches two patterns:
/// sv = sv.substr(N) -> sv.remove_prefix(N)
/// sv = sv.substr(0, sv.length() - N) -> sv.remove_suffix(N)
///
/// These replacements make the intent clearer and are more efficient as they
/// modify the string_view in place rather than creating a new one.
class StringViewSubstrCheck : public ClangTidyCheck {
public:
StringViewSubstrCheck(StringRef Name, ClangTidyContext *Context)
: ClangTidyCheck(Name, Context) {}
void registerMatchers(ast_matchers::MatchFinder *Finder) override;
void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
};

} // namespace clang::tidy::readability

#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_READABILITY_STRINGVIEWSUBSTRCHECK_H
7 changes: 7 additions & 0 deletions clang-tools-extra/docs/ReleaseNotes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -136,6 +136,13 @@ New checks
Gives warnings for tagged unions, where the number of tags is
different from the number of data members inside the union.

- New :doc:`readability-string-view-substr
<clang-tidy/checks/readability/string-view-substr>` check.

Finds ``std::string_view::substr()`` calls that can be replaced with clearer
alternatives using ``remove_prefix()`` or ``remove_suffix()``. This makes the
intent clearer and is more efficient as it modifies the string_view in place.

- New :doc:`modernize-use-integer-sign-comparison
<clang-tidy/checks/modernize/use-integer-sign-comparison>` check.

Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
.. title:: clang-tidy - readability-string-view-substr

readability-string-view-substr
==============================

Finds ``string_view substr()`` calls that can be replaced with clearer alternatives
using ``remove_prefix()`` or ``remove_suffix()``.

The check suggests the following transformations:

=========================================== =======================
Expression Replacement
=========================================== =======================
``sv = sv.substr(n)`` ``sv.remove_prefix(n)``
``sv = sv.substr(0, sv.length()-n)`` ``sv.remove_suffix(n)``
``sv = sv.substr(0, sv.length())`` Redundant self-copy
``sv1 = sv.substr(0, sv.length())`` ``sv1 = sv``
=========================================== =======================
Loading
Loading