Blog #1 #144

Merged
merged 1 commit into from
Sep 28, 2024
5 changes: 5 additions & 0 deletions docs/astro.config.mjs
@@ -27,6 +27,11 @@ export default defineConfig({
label: 'CLI Reference',
slug: 'cli-reference'
}]
},{
label: 'Blog',
autogenerate: {
directory: 'blog'
}
}, {
label: 'PathFinder Queries',
autogenerate: {
Binary file added docs/public/assets/cpf-illustration.jpg
68 changes: 68 additions & 0 deletions docs/src/content/docs/blog/codeql-oss-alternative.mdx
@@ -0,0 +1,68 @@
---
title: Code PathFinder - Open Source CodeQL Alternative
description: "A short blog post about Code PathFinder, a CodeQL OSS alternative"
template: splash
---

import PostHogLayout from '../../../layouts/PostHogLayout.astro';

<PostHogLayout>
</PostHogLayout>

![Code Pathfinder Illustration](/assets/cpf-illustration.jpg)

## What is Code PathFinder?

Code PathFinder is a code analysis tool that helps you find exact code patterns and paths in your codebase. There are several ways to
grep source code, but breaking the source down into individual entities and building a graph whose edges capture the relationships
between those entities comes closer to the way a human reads code.


### How do security engineers interact with a codebase today?

Broadly, engineers interact with a codebase like this:

1. Start by searching for a symbol
2. Resolve the symbol to an entity such as a class or function
3. Find the entity's definition
4. Find the entity's references across the codebase and often across multiple repositories
5. Determine the flow of the code
- 5A. Have a source in mind, such as user input, a database, a file, or a network operation
- 5B. Have a sink in mind, such as the symbol's definition found above
- 5C. Trace the flow of the code, including method jumps, method calls, and method returns
- 5D. Identify any blockers in between, such as conditions and loops
6. Identify the variables that are modified and the variables that are used within the flow

Representing this as a graph makes it easier to find the flow of the code. Moreover, the relationships between entities, stored as edges,
can be used as conditions to focus on the paths that are relevant to the source and sink.
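
As a rough illustration of the graph idea, the sketch below models entities as nodes and relationships as directed edges, then searches for a path from a source to a sink. It is written in Python with hypothetical entity names (`readRequest`, `parseBody`, `buildMessage`, `Socket.send`) and a made-up `CodeGraph` class. It is only a sketch under those assumptions and does not reflect how Code PathFinder stores its graph internally.

```python
from collections import defaultdict, deque


class CodeGraph:
    """Toy graph: nodes are entity names, edges are 'flows into' relationships."""

    def __init__(self):
        self.edges = defaultdict(list)

    def add_edge(self, src, dst):
        self.edges[src].append(dst)

    def find_path(self, source, sink):
        """Breadth-first search; returns one shortest path or None."""
        queue = deque([[source]])
        visited = {source}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node == sink:
                return path
            for nxt in self.edges[node]:
                if nxt not in visited:
                    visited.add(nxt)
                    queue.append(path + [nxt])
        return None


# Hypothetical entities: a request handler (source) that reaches Socket.send (sink).
graph = CodeGraph()
graph.add_edge("readRequest", "parseBody")
graph.add_edge("parseBody", "buildMessage")
graph.add_edge("buildMessage", "Socket.send")
graph.add_edge("readRequest", "logRequest")

path = graph.find_path("readRequest", "Socket.send")
print(path)
```

Edges that do not lead to the sink, such as the one into `logRequest`, are explored but never appear in the returned path, which is the sense in which the graph narrows attention to the relevant flow.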

For example: find the code pattern where the `Socket` class is instantiated and the `send` method is called on the instance, and return all enclosing methods.

```sql
SELECT MethodInvocation AS mi, MethodDeclaration AS md, ClassInstanceExpr AS ci
WHERE
ci.getClassInstanceExpr().getClassName() = "Socket" &&
mi.getMethodName() = "send" && mi.getEnclosingMethod() = md &&
mi.getMethodInvocation().getObject() = ci
SELECT MethodDeclaration AS md, MethodInvocation AS mi
```
The above query returns every method that encloses a call to `send` on a `Socket` instance, together with the `send` invocation itself.
`MethodInvocation`, `MethodDeclaration`, and `ClassInstanceExpr` are called entities, and they are represented as nodes in the graph.
The relationships between entities are represented as edges between the nodes.

### How does Code PathFinder work?

Code PathFinder uses tree-sitter to parse the source code and build a graph of the code. The graph is then used to answer queries.
Similar to SQL, Code PathFinder uses a query language to filter graph nodes and apply conditions to them logically. Sometimes it generates
the cartesian product of the graph nodes to retrieve all possible combinations, then applies the conditions to find the paths in code.
Many APIs are yet to be implemented, and support for classes and inheritance is still missing, but Code PathFinder currently offers
the following features:

- Predicates
- Complex conditions
- Aliases
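
To make the cartesian-product step concrete, here is a minimal Python sketch that evaluates conditions like those in the `Socket` query from earlier in this post. The dataclasses, their field names, and the `run_query` function are assumptions made for illustration; they mirror the entity names in the query but are not Code PathFinder's actual data model or API.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ClassInstanceExpr:
    class_name: str
    variable: str  # hypothetical: variable the new instance is assigned to


@dataclass(frozen=True)
class MethodDeclaration:
    name: str


@dataclass(frozen=True)
class MethodInvocation:
    method_name: str
    target: str            # hypothetical: variable the method is called on
    enclosing_method: str  # hypothetical: name of the method containing the call


def run_query(invocations, declarations, instances):
    """Try every (mi, md, ci) combination and keep those passing all conditions."""
    results = []
    for mi, md, ci in product(invocations, declarations, instances):
        if (ci.class_name == "Socket"
                and mi.method_name == "send"
                and mi.enclosing_method == md.name
                and mi.target == ci.variable):
            results.append((md, mi))
    return results


instances = [ClassInstanceExpr("Socket", "sock"), ClassInstanceExpr("File", "f")]
declarations = [MethodDeclaration("notify"), MethodDeclaration("save")]
invocations = [
    MethodInvocation("send", "sock", "notify"),
    MethodInvocation("write", "f", "save"),
]

matches = run_query(invocations, declarations, instances)
for md, mi in matches:
    print(md.name, "->", mi.method_name)
```

With two entities of each kind the product has only eight combinations, but it grows multiplicatively with the number of entities, which is why conditions that prune combinations early matter on real codebases.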

If you are interested in contributing to Code PathFinder, please check out the [Code PathFinder](https://github.com/shivasurya/code-pathfinder) repository.
Give it a try and file an issue if you find any bugs or have any suggestions.

