is '||' a typo? #2771
Labels: upstream (The issue is upstream)
Comments
Hello! Thank you for opening your first issue in this repo. It’s people like you who make these host files better!
Update your scripts; you should *always* validate your input.
The short answer is: restrict allowable names to ASCII letters, digits,
period, dash and underscore, and reject everything else.
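A minimal sketch of that allow-list in Python, assuming the usual "address name" layout of these hosts files. The function names `is_allowed_name` and `filter_hosts_lines` are made up for illustration and are not part of any existing tool:

```python
import re

# Allow-list from the short answer: ASCII letters, digits,
# period, dash and underscore. Everything else is rejected.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]+")


def is_allowed_name(name: str) -> bool:
    """Return True if the whole name uses only allow-listed characters."""
    return ALLOWED_NAME.fullmatch(name) is not None


def filter_hosts_lines(lines):
    """Split hosts-file lines into (accepted, rejected) lists.

    Assumes each entry looks like "0.0.0.0 some.name"; blank lines
    and "#" comments are skipped rather than rejected.
    """
    accepted, rejected = [], []
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        names = entry.split()[1:]  # everything after the address
        if names and all(is_allowed_name(n) for n in names):
            accepted.append(entry)
        else:
            rejected.append(entry)
    return accepted, rejected
```

With this filter, an entry such as `0.0.0.0 www.||immediate-bitwave.com` ends up in the rejected list instead of reaching awk or an unbound config.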
The long answer starts with a question -- what is a valid host name?
If Steven Black is giving us really & truly host names then the answer is
simple: they have to be ASCII characters in the set of
- upper or lower case letters
- digits
- hyphen (dash)
- period or dot
ref:
https://datatracker.ietf.org/doc/html/rfc1034#section-3.5
https://datatracker.ietf.org/doc/html/rfc1123#page-13
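For anyone who wants the strict host-name rule rather than a looser allow-list, here is a hedged Python sketch. It assumes the usual reading of RFC 952 / RFC 1123: labels of 1 to 63 letters, digits or hyphens, no leading or trailing hyphen. The 253-character overall cap is the commonly cited practical limit and is an assumption here, and `is_strict_host_name` is a hypothetical name:

```python
import re

# One label: 1-63 characters of letters, digits or hyphen,
# neither starting nor ending with a hyphen.
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_strict_host_name(name: str) -> bool:
    """Check a name against the strict host-name syntax (no underscore)."""
    if name.endswith("."):
        name = name[:-1]  # tolerate a trailing root dot
    # 253 is an assumed practical limit for the dotted text form.
    if not name or len(name) > 253:
        return False
    return all(LABEL.fullmatch(label) for label in name.split("."))
```

Under this check a name containing an underscore or `||` is rejected, while a plain name like `utopia.duth.gr` passes.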
But if they're DNS names then.. things start to get interesting because
there are almost _no_ restrictions on DNS names.
For example, an underscore is illegal in a host name but ohNoes?? this name
resolves to an address:
$ dig @1.1.1.1 zn_77ycxjaq1e0122v-cbs.siteintercept.qualtrics.com
; <<>> DiG 9.16.50-Debian <<>> @1.1.1.1 zn_77ycxjaq1e0122v-cbs.siteintercept.qualtrics.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60661
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;zn_77ycxjaq1e0122v-cbs.siteintercept.qualtrics.com. IN A
;; ANSWER SECTION:
zn_77ycxjaq1e0122v-cbs.siteintercept.qualtrics.com. 13197 IN CNAME siteintercept.qprod2.net.
siteintercept.qprod2.net. 597 IN CNAME prodlb.siteintercept.qualtrics.com.cdn.cloudflare.net.
prodlb.siteintercept.qualtrics.com.cdn.cloudflare.net. 127 IN A 104.17.208.240
prodlb.siteintercept.qualtrics.com.cdn.cloudflare.net. 127 IN A 104.17.209.240
The DNS name with an underscore is a CNAME, i.e. a name that points to
something else.
That something else is also a CNAME but, at last, that name points to a host
name, which is all ASCII letters and periods, so it meets the definition of a
host name.
Then you get into unicode characters - which are also allowed. eg.
$ dig @1.1.1.1 ουτοπία.δπθ.gr
; <<>> DiG 9.16.50-Debian <<>> @1.1.1.1 ουτοπία.δπθ.gr
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55077
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;ουτοπία.δπθ.gr. IN A
;; ANSWER SECTION:
ουτοπία.δπθ.gr. 10800 IN CNAME utopia.duth.gr.
utopia.duth.gr. 10800 IN A 192.108.114.44
But things like ουτοπία.δπθ.gr can also be represented as
xn--kxae4bafwg.xn--pxaix.gr
see: https://en.wikipedia.org/wiki/Internationalized_domain_name
$ dig @1.1.1.1 xn--kxae4bafwg.xn--pxaix.gr
; <<>> DiG 9.16.50-Debian <<>> @1.1.1.1 xn--kxae4bafwg.xn--pxaix.gr
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22558
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;ουτοπία.δπθ.gr. IN A
;; ANSWER SECTION:
ουτοπία.δπθ.gr. 10800 IN CNAME utopia.duth.gr.
utopia.duth.gr. 10800 IN A 192.108.114.44
So if Steven Black converts all IDN names to Punycode names, you can still
restrict everything to ASCII letters, digits, dash, underscore and period.
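A downstream script could also do that conversion itself before validating. The sketch below assumes Python's built-in `idna` codec, which implements the older IDNA 2003 rules, is good enough for the names in question; `normalize_and_check` is a hypothetical helper, not something from the hosts repository:

```python
import re

# Same allow-list as the short answer: ASCII letters, digits,
# period, dash and underscore.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]+")


def normalize_and_check(name: str):
    """Convert a possibly-Unicode name to Punycode, then validate it.

    Returns the ASCII form if it passes the allow-list, else None.
    """
    try:
        # The "idna" codec leaves pure-ASCII names as they are and
        # turns non-ASCII labels into their xn-- form.
        ascii_name = name.encode("idna").decode("ascii")
    except UnicodeError:
        return None  # e.g. an empty or over-long label
    if ALLOWED_NAME.fullmatch(ascii_name):
        return ascii_name
    return None
```

A name like ουτοπία.δπθ.gr would come back in its xn-- form, while `www.||immediate-bitwave.com` returns `None` because `|` is not in the allow-list.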
Regards,
Lee
On Tue, Nov 12, 2024 at 3:03 AM Alexis Huxley ***@***.***> wrote:
In
https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/fakenews-gambling/hosts,
there are two occurrences of '||' in a domain name:
0.0.0.0 www.||immediate-bitwave.com
...
0.0.0.0 ||immediate-bitwave.com
Is this a copy-and-paste error or really part of the domain name?
(I'm asking because my downstream scripts are choking on this and I don't
know whether to await an upstream fix or update my scripts.)
Thank you for this @alexishuxley, good catch! And thank you Lee @ler762 for the assist. I've already contacted the fine folks upstream about this. This should be resolved shortly if this is a mistake.
moviuro added a commit to moviuro/moviuro.bin that referenced this issue Nov 14, 2024
See also: StevenBlack/hosts#2771
> 0.0.0.0 www.||immediate-bitwave.com
That line was present in a file we download by default, causing issues on OpenBSD (awk: illegal primary in regular expression), and on Linux the `||` were landing in the unbound conf!
Thank you again Alexis @alexishuxley, this is fixed in the latest release.