Commit 01e2af7

Merge pull request #1 from chris-rutkowski/v102
V102
2 parents c622781 + 56a505b commit 01e2af7

3 files changed, 54 additions and 22 deletions

README.md

Lines changed: 28 additions & 16 deletions
@@ -14,7 +14,26 @@
 
 ## 🛠️ Usage
 
-### 1. **Create an Ignore File**
+### 1. **Add the GitHub Action**
+Create a GitHub Actions workflow in `.github/workflows/duplicate_guard.yml`:
+
+```yaml
+name: Duplicate Guard
+on:
+  pull_request:
+    branches:
+      - main
+  workflow_dispatch:
+
+jobs:
+  duplicate_guard:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Duplicate Guard
+        uses: chris-rutkowski/duplicate-guard@v1.0.2
+```
+
+### 2. **Create an ignore file** (optional)
 Add a `duplicate_guard.ignore` file to the root of your repository to define patterns for files or directories to exclude from duplicate checks. The syntax follows `.gitignore` conventions.
 
 **Example `duplicate_guard.ignore`:**
@@ -26,38 +45,31 @@ logs/*
 
 ---
 
-### 2. **Add the GitHub Action**
-Create a GitHub Actions workflow in `.github/workflows/duplicate_guard.yml`:
+## ♻️ Find existing duplicates
+
+Run the action manually using the `workflow_dispatch` event to scan and find duplicates in your repository.
 
 ```yaml
 name: Duplicate Guard
 on:
-  pull_request:
-    branches:
-      - master
   workflow_dispatch:
+  pull_request:
 
-jobs:
-  filesize_guard:
-    runs-on: ubuntu-latest
-    steps:
-      - name: Duplicate Guard
-        uses: chris-rutkowski/duplicate-guard@v1.0.1
+...
 ```
 
 ---
 
 ## ⚙️ Configuration
 
-### **Specify a Custom Ignore File Path**
-If your `duplicate_guard.ignore` file is not in the root directory, specify its location using the `ignore_file` input:
+### **Specify a custom ignore file path**
 
 ```yaml
 steps:
   - name: Duplicate Guard
-    uses: chris-rutkowski/filesize-guard@v1.0.1
+    uses: chris-rutkowski/duplicate-guard@v1.0.2
     with:
-      ignore_file: ./my/path/my_filesize_guard.ignore
+      ignore_file: ./my/path/my_duplicate_guard.ignore
 ```
 
 ---
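
To illustrate how ignore patterns such as `logs/*` could be applied, here is a minimal sketch built on Python's `fnmatch`, which is what the `should_ignore` helper in this commit's `duplicate_guard.py` calls. `fnmatch` does shell-style globbing rather than full `.gitignore` matching: `*` also matches `/`, and a leading `!` does not negate a pattern. The `filter_ignored` helper, the `*.lock` pattern, and the sample paths are made up for illustration.

```python
import fnmatch


def should_ignore(file, patterns):
    # Same check as the helper shown in the duplicate_guard.py diff.
    return any(fnmatch.fnmatch(file, pattern) for pattern in patterns)


def filter_ignored(files, patterns):
    """Hypothetical helper: keep only the paths that no pattern matches."""
    return [f for f in files if not should_ignore(f, patterns)]


# ".git/*" is the default pattern added in this commit; "logs/*" appears in
# the README example; "*.lock" is an illustrative assumption.
patterns = [".git/*", "logs/*", "*.lock"]
sample_files = [
    "logs/app.log",
    "logs/2024/app.log",  # matched too, because fnmatch's * crosses "/"
    "src/main.py",
    ".git/config",
    "yarn.lock",
]

print(filter_ignored(sample_files, patterns))
```

Because `*` crosses directory separators here, `logs/*` excludes nested files as well, which is broader than the same pattern in a real `.gitignore`.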

action.yml

Lines changed: 9 additions & 1 deletion
@@ -14,14 +14,22 @@ runs:
       uses: actions/checkout@v4
 
     - name: Get changed files
+      if: ${{ github.event_name == 'pull_request' }}
       id: changed-files
       uses: tj-actions/changed-files@v45
       with:
         json: true
         write_output_files: true
         safe_output: false
 
+    - name: Get all files
+      if: ${{ github.event_name != 'pull_request' }}
+      run: |
+        mkdir -p .github/outputs/
+        find . -type f | sed 's|^\./||' | jq -R . | jq -s . > .github/outputs/all_repo_files.json
+      shell: bash
+
     - name: Run Duplicate Guard
       run: |
-        python3 ${GITHUB_ACTION_PATH}/duplicate_guard.py ${{ inputs.ignore_file }} .github/outputs/modified_files.json .github/outputs/added_files.json
+        python3 ${GITHUB_ACTION_PATH}/duplicate_guard.py ${{ inputs.ignore_file }} .github/outputs/modified_files.json .github/outputs/added_files.json .github/outputs/all_repo_files.json
       shell: bash
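
The new `Get all files` step builds a JSON array of every file path with a shell pipeline (`find`, `sed`, `jq`). As a rough illustration of what that pipeline produces, here is a minimal Python sketch. The names `list_repo_files` and `write_all_repo_files` are hypothetical and not part of the action. The sketch sorts the paths, whereas `find` makes no ordering guarantee.

```python
import json
import os


def list_repo_files(root="."):
    """Return every regular file under root as a path relative to root."""
    # Hypothetical stand-in for: find . -type f | sed 's|^\./||'
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # find -type f skips symlinks, so skip them here as well.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            rel = os.path.relpath(full, root)
            paths.append(rel.replace(os.sep, "/"))
    return sorted(paths)


def write_all_repo_files(root=".", out_path=".github/outputs/all_repo_files.json"):
    """Write the file list as a JSON array, like the jq -R / jq -s stages."""
    files = list_repo_files(root)  # list first so the output file is not included
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    with open(out_path, "w") as f:
        json.dump(files, f)
    return files
```

Calling `write_all_repo_files()` from the repository root would write a file in the same location the workflow step uses, which `duplicate_guard.py` then receives as its last argument.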

duplicate_guard.py

Lines changed: 17 additions & 5 deletions
@@ -4,9 +4,21 @@
 import os
 import sys
 
+DEFAULT_IGNORE_FILE = "./duplicate_guard.ignore"
+
 def load_ignore_patterns(ignore_file):
+    default_patterns = [".git/*"]
+
+    if not os.path.exists(ignore_file):
+        if ignore_file == DEFAULT_IGNORE_FILE:
+            return default_patterns
+
+        print(f"Error: The specified ignore file '{ignore_file}' does not exist.", file=sys.stderr)
+        sys.exit(1)
+
     with open(ignore_file, "r") as f:
-        return [line.strip() for line in f if line.strip() and not line.startswith("#")]
+        patterns = [line.strip() for line in f if line.strip() and not line.startswith("#")]
+        return default_patterns + patterns
 
 def should_ignore(file, patterns):
     return any(fnmatch.fnmatch(file, pattern) for pattern in patterns)
@@ -31,6 +43,8 @@ def get_all_repository_files(ignore_patterns):
 def load_files_from_json(file_paths):
     files = []
     for file_path in file_paths:
+        if not os.path.exists(file_path):
+            continue
         with open(file_path, "r") as f:
             files.extend(json.load(f))
     return files
@@ -43,7 +57,8 @@ def load_files_from_json(file_paths):
     checksums = {}
     for file in get_all_repository_files(ignore_patterns):
         checksum = calculate_checksum(file)
-        checksums[checksum] = file
+        if checksum not in checksums:
+            checksums[checksum] = file
     print(f"Done, {len(checksums)} checksums")
 
     # Step 2: Check new/modified files against the repository and themselves
@@ -54,11 +69,8 @@ def load_files_from_json(file_paths):
             continue
 
         if should_ignore(file, ignore_patterns):
-            print(f"Ignoring: '{file}'")
             continue
 
-        print(f"Processing: '{file}'")
-
         checksum = calculate_checksum(file)
 
         if checksum in checksums:
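
The change to Step 1 keeps the first path seen for each checksum instead of overwriting it with the last one. The following is a minimal sketch of that indexing idea, not the action's actual code. The commit does not show `calculate_checksum`, so the SHA-256 hashing here is an assumption. The helper names `build_checksum_index` and `find_duplicates`, and the `original != file` comparison, are also illustrative.

```python
import hashlib


def calculate_checksum(path, chunk_size=65536):
    """Hash a file's bytes. SHA-256 is an assumption, not taken from the commit."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_checksum_index(files):
    """Map each checksum to the first file seen with that content."""
    checksums = {}
    for file in files:
        checksum = calculate_checksum(file)
        if checksum not in checksums:  # keep the first occurrence, as in the diff
            checksums[checksum] = file
    return checksums


def find_duplicates(candidates, checksums):
    """Return (candidate, original) pairs for content indexed under another path.

    Hypothetical helper: comparing against the indexed path is an assumption
    about how a match would be reported.
    """
    duplicates = []
    for file in candidates:
        original = checksums.get(calculate_checksum(file))
        if original is not None and original != file:
            duplicates.append((file, original))
    return duplicates
```

With a first-occurrence index, scanning the whole repository against itself reports each later copy against the earliest file. The earliest file matches only its own entry, so it is not flagged.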
