Skip to content

Conversation

@koesie10
Copy link
Member

@koesie10 koesie10 commented Feb 7, 2024

This removes the ReturnValue as a valid access path for constructors. In a constructor, I believe the ReturnValue is meaningless since it's equal to Argument[self].

The main reason that I ran into this is that getAReturnNode() for a constructor returns every statement in the initialize method as a return node, so we were getting duplicate ReturnValue rows.

If ReturnValue is a valid value for constructors, I think the easiest solution would be to return getSelfParameter() as the node instead of the return node.

@koesie10 koesie10 added the no-change-note-required This PR does not need a change note label Feb 7, 2024
@github-actions github-actions bot added the Ruby label Feb 7, 2024
@koesie10 koesie10 marked this pull request as ready for review February 7, 2024 14:07
@koesie10 koesie10 requested a review from a team as a code owner February 7, 2024 14:07
@hmac
Copy link
Contributor

hmac commented Feb 7, 2024

Thinking a bit about this, I wonder if we want to include constructors (by which I think you mean the new class method, which we internally link to the initialize method) at all? Ruby analysis already knows that A.new returns an instance of A for all A, and I don't think it's common or necessary to create flow/source/sink summaries for such methods. If you wanted to mark A.new as a source I think you'd just write add a source model like A,,remote, which I think would target any instance of A (which is what you want).

@hmac
Copy link
Contributor

hmac commented Feb 8, 2024

In a constructor, I believe the ReturnValue is meaningless since it's equal to Argument[self].

I wasn't sure about this, so I tested it.

class A
end

# A.new as a source

a = A.new
sink a # $ hasValueFlow=

class B
    def initialize(x)
    end
end

# B.new(x) as a sink for x

x = source(1)
b = B.new(x) # $ hasValueFlow=1

class C
    def initialize(x)
    end
end

# C.new(x) as a flow summary, where data flows from x to the C instance

x = source(2)
c = C.new(x)
sink c # $ hasValueFlow=2
extensions:
  - addsTo:
      pack: codeql/ruby-all
      extensible: sourceModel
    data:
    # - ["A!", "Method[new]", "remote"] # this doesn't work
      - ["A", "", "remote"]

  - addsTo:
      pack: codeql/ruby-all
      extensible: sinkModel
    data:
      - ["B!", "Method[new].Argument[0]", "code-injection"]

  - addsTo:
      pack: codeql/ruby-all
      extensible: summaryModel
    data:
    # - ["C!", "Method[new]", "Argument[0]", "Argument[self]", "value"] # this doesn't work
      - ["C!", "Method[new]", "Argument[0]", "ReturnValue", "value"]

So for some flow summaries, it seems we will need ReturnValue.

@koesie10
Copy link
Member Author

koesie10 commented Feb 8, 2024

Thanks for the investigation! I've changed the query to always return a ReturnValue for an initialize method, regardless of how many return nodes it has. Since it seems like Argument[self] is meaningless for an initialize method, I've removed it from being returned. This should allow for all the scenarios in your example.

@koesie10 koesie10 merged commit e596862 into main Feb 8, 2024
@koesie10 koesie10 deleted the koesie10/ruby-access-path-constructor-returnvalue branch February 8, 2024 15:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

no-change-note-required This PR does not need a change note Ruby

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants