Recommended List of Malicious Websites #4667
Comments
@hagezi I'll check in the morning if it's possible to use the website as a source for my blocklist |
Thanks @tanmarpn and @jarelllama |
@jarelllama I think you need to scrape it |
Unable to scrape due to Google bot verification |
@jarelllama Thank you for checking. |
@tanmarpn it seems I am unable to access the dataset download: https://data.moi.gov.tw/MoiOD/System/DownloadFile.aspx?DATA=3BB8E3CE-8223-43AF-B1AB-5824FA889883. Even if I could, I am not sure the URL stays the same across updates; if it changes, I would not be able to scrape the data automatically. |
What I can access is the CSV: https://quality.data.gov.tw/dq_download_csv.php?nid=160055&md5_url=45ab3c35d9f3f23d0166ba8f5ab9fd6d (last updated December 3rd, 2024). I am not sure if this is the entire dataset or just a part of it. I will try to scrape the domains, but I will have to monitor whether the URL changes after each update. If it does, I have tested that I can scrape https://data.gov.tw/dataset/160055 directly to get the CSV URL. |
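The fallback mentioned above (reading the dataset page to find the current CSV URL) could look roughly like the sketch below. It is only an illustration, not the code used for the blocklist: it assumes the page links to the CSV through an `href` containing `dq_download_csv.php`, as in the URL quoted above, and `find_csv_url` and `sample_page` are hypothetical names. The page markup is not shown in this thread, so the sample HTML is made up for the example.

```python
import html
import re
from typing import Optional

# Assumption: the CSV link appears in an href attribute and contains
# "dq_download_csv.php", matching the URL quoted in the comment above.
CSV_LINK_RE = re.compile(r'href="([^"]*dq_download_csv\.php[^"]*)"')


def find_csv_url(page_html: str) -> Optional[str]:
    """Return the first CSV download link found in the page, or None."""
    match = CSV_LINK_RE.search(page_html)
    if match is None:
        return None
    # HTML attributes encode "&" as "&amp;", so decode before using the URL.
    return html.unescape(match.group(1))


# Made-up page fragment for illustration only.
sample_page = (
    '<a href="https://quality.data.gov.tw/dq_download_csv.php'
    '?nid=160055&amp;md5_url=45ab3c35d9f3f23d0166ba8f5ab9fd6d">CSV</a>'
)
print(find_csv_url(sample_page))
```

Returning `None` when no link is found lets the caller fail the build loudly instead of silently publishing an empty source.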
The CSV has 21922 domains dating back to 2022. I will only add those from 2024 onwards. |
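Keeping only entries from 2024 onwards can be sketched as follows. This is a rough illustration under stated assumptions: the real CSV column names and date format are not given in this thread, so the `domain` and `date` columns, the year-first date format, and the `domains_since` helper are all hypothetical.

```python
import csv
import io
from typing import List


def domains_since(csv_text: str, min_year: int = 2024,
                  domain_col: str = "domain",
                  date_col: str = "date") -> List[str]:
    """Return lowercased domains whose date starts with a year >= min_year.

    Assumption: column names and a year-first date format (e.g. 2024-03-15)
    are placeholders; adjust them to the actual CSV header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = []
    for row in reader:
        date = (row.get(date_col) or "").strip()
        domain = (row.get(domain_col) or "").strip().lower()
        # Skip rows with no domain or no parseable leading year.
        if not domain or len(date) < 4 or not date[:4].isdigit():
            continue
        if int(date[:4]) >= min_year:
            kept.append(domain)
    return kept


# Made-up sample rows for illustration only.
sample = (
    "domain,date\n"
    "old.example,2022-05-01\n"
    "new.example,2024-03-15\n"
    "Newer.example,2024-12-03\n"
)
print(domains_since(sample))
```

Rows with a missing or malformed date are dropped rather than guessed at, which errs on the side of not blocking entries of unknown age.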
Thank you very much @jarelllama, let me know when it is integrated and your list is updated online. |
Testing build now |
Source: 165 Anti-fraud
Raw: 11219 Final: 10790 Whitelisted: 0 Excluded: 3 Toplist: 4
Processing time: 2 second(s)
All good 👍. Thanks @tanmarpn |
@jarelllama Thank you very much. I have gathered some information that I hope will be helpful to you. |
Option number 2 seems to work fine. I will update the code in a bit. @hagezi I'll let you know when I am done so you can close this issue. |
Where can I check the total number of entries? I'm currently pulling about 35847. |
Anyway, it doesn't matter. I will only pull domains from 2024 onwards. |
@jarelllama Currently, it seems that the total number of entries can only be calculated by visiting the API in method 1, as they do not provide an additional column or annotation for the total count. |
Thank you for your support. The issue is scheduled to be fixed in the next release. You will be notified when the issue is finally fixed. |
Thanks @jarelllama @tanmarpn |
This issue has been fixed in release 2024.358.60841 |
Which domain(s) should be blocked?
www.kcczai.com
www.bitbitquark.com
bvox11.cc
www.fxnovus.co
uslasry.com
www.hkeusy.com
www.maicoinmn.com
Why should these domain(s) be blocked?
This is an anti-fraud website operated by the national government of Taiwan (Taipei). The site regularly publishes links to websites identified as fraudulent or malicious by national monitoring systems.
I have browsed through multiple entries and noticed that some of these websites are not blocked by TIF.
Hope you can make good use of the information here to ensure safer browsing for our regional users.
Uri:
https://165.npa.gov.tw/#/articles/subclass/3
I confirm ...