Add a cleaning rule for URL names, such as Amazon.com -> Amazon.it #758
Open
Description
Similar to #736, we should discard sentence pairs in which the domain suffix of a website gets translated, e.g. Amazon.com -> Amazon.it
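With a case-insensitive regex such as /[a-z]+\.com\b/i
we could identify a URL on the English side and check that the same URL appears unchanged on the other side. Without the i flag, the pattern would only match "mazon.com" in "Amazon.com".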
Edit: We also have other English-speaking countries to consider, e.g. Amazon.co.uk
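
A rough sketch of what such a rule could look like, in Python. The names `DOMAIN_RE`, `domains` and `keep_pair` are hypothetical, and the suffix list (`com`, `co.uk`) is only an illustrative assumption; this is not the project's existing cleaning code. It does not handle longer suffixes such as `.com.br`.

```python
import re

# Assumption: a small illustrative set of English-side suffixes (.com, .co.uk).
# The real rule would probably need a longer list.
DOMAIN_RE = re.compile(r"\b[a-z0-9-]+\.(?:com|co\.uk)\b", re.IGNORECASE)


def domains(text):
    """Return the lower-cased domain names found in text."""
    return {m.group(0).lower() for m in DOMAIN_RE.finditer(text)}


def keep_pair(src, trg):
    """Keep the pair only if every domain on the English side
    also appears, unchanged, on the other side."""
    return domains(src) <= domains(trg)
```

For example, `keep_pair("Buy it on Amazon.com today.", "Compralo su Amazon.it oggi.")` would reject the pair, while a target that keeps `Amazon.com` would pass. Pairs without any URL on the English side are left alone.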