@exurd (Contributor) commented Sep 29, 2024

This PR fixes issue #36: Initial URLs aren't de-duplicated.

Since index.txt is deduplicated when the script starts and can be reused later when resuming, it makes sense to also deduplicate the $file variable during initialization.
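
For illustration, here is a minimal sketch of order-preserving de-duplication of a newline-separated URL list held in a shell variable. It is not the code from this patch: the helper name `dedupe_lines` is hypothetical, and it assumes `$file` holds one URL per line.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not taken from spn.sh): print each input line
# only the first time it appears, preserving the original order.
dedupe_lines() {
    awk '!seen[$0]++'
}

# Assumption: "file" holds newline-separated URLs.
file='https://mesamatrix.net/
https://example.net/
https://example.net/
https://mesamatrix.net/'

# Replace the variable's contents with the de-duplicated list.
file=$(printf '%s\n' "$file" | dedupe_lines)

printf '%s\n' "$file"
```

Unlike `sort -u`, this keeps the first occurrence of each URL in its original position, which matters if submission order should follow the input list.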

@mskiptr commented Sep 29, 2024

Thanks for taking care of it! I have tested your patch and it all works well.

Details
$ curl -O https://raw.githubusercontent.com/overcast07/wayback-machine-spn-scripts/7d6774366a9ec3e4e75a6f9de9aa9bcf45425b26/spn.sh
$ chmod +x spn.sh
$ cat - >urls <<EOF
https://mesamatrix.net/
https://example.net/
https://example.net/
https://mesamatrix.net/
EOF
$ ./spn.sh -p 1 urls


== Starting spn.sh ==
Data folder: /home/piotr/.local/share/spn-data/2024-09/1727595599

2024-09-29 07:40:04 [Job submitted] https://mesamatrix.net/
2024-09-29 07:40:21 [Job completed] https://mesamatrix.net/
2024-09-29 07:40:25 [Job submitted] https://example.net/
2024-09-29 07:41:52 [Job completed] https://example.net/


== Ending spn.sh ==
Data folder: /home/piotr/.local/share/spn-data/2024-09/1727595599

So Tested-by: Piotr Masłowski <piotr@maslowski.xyz> if you like those Git tags I guess.

@overcast07 merged commit baea8c9 into overcast07:main Sep 29, 2024
@exurd deleted the dedupe-list branch September 29, 2024 09:50
3 participants