UDP/TCP confusion with v0.23 #1337

Closed
tjay opened this issue Jan 14, 2024 · 6 comments
Labels
🐞 bug Something isn't working

@tjay

tjay commented Jan 14, 2024

Hi,

Since v0.23 there may be a problem with some domains and EDNS TCP/UDP usage.
On the first request blocky returns SERVFAIL; any further request is answered correctly from the blocky (or upstream) cache.

blocky debug output:

DEBUG server: new request
DEBUG blacklist_resolver: checking groups for request client_ip=10.0.1.20 client_names=10.0.1.20 groupsToCheck=ads; local question=A (www.deekeep.com.)
DEBUG strict: using upstream 'tcp+udp:10.0.1.1' as resolver client_ip=10.0.1.20 client_names=10.0.1.20 question=A (www.deekeep.com.)
DEBUG upstream: received response from upstream answer= net=tcp+udp protocol=UDP response_time_ms=216 return_code=SERVFAIL upstream=tcp+udp:10.0.1.1 upstream_ip=10.0.1.1
DEBUG strict: using response from resolver answer= client_ip=10.0.1.20 client_names=10.0.1.20 question=A (www.deekeep.com.) resolver={resolver:0xc00037b040 lastErrorTime:{v:{wall:0 ext:62135596800 loc:0xc0000d1570}}}

tcpdump

 1	0.000000	10.0.1.20	10.0.1.1	DNS	98	Standard query 0xcf46 A www.deekeep.com OPT
 2	0.411579	10.0.1.1	10.0.1.20	DNS	116	Standard query response 0xcf46 A www.deekeep.com CNAME deekeep.com A 44.205.225.91 OPT

using dig on upstream server

dig www.deekeep.com OPT @10.0.1.1

; <<>> DiG 9.18.21 <<>> www.deekeep.com OPT @10.0.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: FORMERR, id: 25083
;; flags: qr rd ad; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; WARNING: EDNS query returned status FORMERR - retry with '+noedns'

;; QUESTION SECTION:
;www.deekeep.com.               IN      OPT

;; Query time: 3 msec
;; SERVER: 10.0.1.1#53(10.0.1.1) (UDP)
;; WHEN: Sun Jan 14 17:46:25 CET 2024
;; MSG SIZE  rcvd: 33

The upstream server in this case is an AVM Fritzbox. All EDNS options in blocky are at their defaults.

@ThinkChaos
Collaborator

Thanks for all the details.

My first impression is this is an issue with your upstream since both blocky and dig say the upstream returned an error.
I don't see an error in the tcpdump output though. I'm guessing it's supposed to be for the query from the blocky log?
But they don't exactly match, since tcpdump says it's an OPT query but blocky logs A.

That might explain why dig and blocky don't see the same error:

  • blocky got SERVFAIL for an A query
  • dig got FORMERR and logs WARNING: EDNS query returned status FORMERR - retry with '+noedns' for an OPT query

Can you try to do an A query using dig to see if it matches blocky?
dig @10.0.1.1 www.deekeep.com should be enough to match blocky.

And get an OPT query from blocky's log?

A couple relevant things about blocky:

  • OPT records are never cached
  • SERVFAIL responses aren't cached either
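
The two caching rules above could be expressed as a small filter. This is only a minimal Go sketch under stated assumptions, not blocky's actual code: the `Record` and `Message` types and the `cacheableRecords` function are hypothetical names made up for illustration. Only the numeric codes (SERVFAIL = 2, OPT = 41) come from the DNS specifications.

```go
package main

import "fmt"

// Codes from the DNS specifications (RFC 1035, RFC 6891).
const (
	RcodeSuccess  = 0
	RcodeServFail = 2
	TypeA         = 1
	TypeCNAME     = 5
	TypeOPT       = 41
)

// Record and Message are hypothetical, stripped-down types for this sketch.
type Record struct {
	Name string
	Type uint16
	TTL  uint32
}

type Message struct {
	Rcode  int
	Answer []Record
	Extra  []Record
}

// cacheableRecords returns the records that may be stored in a cache.
// It returns nil for SERVFAIL responses and always skips OPT records.
func cacheableRecords(m Message) []Record {
	if m.Rcode == RcodeServFail {
		return nil // never cache a server failure
	}
	var out []Record
	for _, section := range [][]Record{m.Answer, m.Extra} {
		for _, rr := range section {
			if rr.Type != TypeOPT {
				out = append(out, rr)
			}
		}
	}
	return out
}

func main() {
	resp := Message{
		Rcode: RcodeSuccess,
		Answer: []Record{
			{Name: "www.deekeep.com.", Type: TypeCNAME, TTL: 600},
			{Name: "deekeep.com.", Type: TypeA, TTL: 600},
		},
		Extra: []Record{{Name: ".", Type: TypeOPT}},
	}
	fmt.Println(len(cacheableRecords(resp)), "records would be cached")
}
```

With rules like these, a SERVFAIL on the first request leaves nothing in the cache, so the next request goes to the upstream again.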

I'll make a PR to fix resolver being broken in the blocky log:
resolver={resolver:0xc00037b040 lastErrorTime:{v:{wall:0 ext:62135596800 loc:0xc0000d1570}}}

@tjay
Author

tjay commented Jan 14, 2024

The blocky log corresponds 1:1 to the tcpdump log, but the tcpdump was filtered to UDP (because blocky said so). Without the filter it gets clearer:

UDP:	25	0.853897	10.0.1.20	10.0.1.1	DNS	98	Standard query 0x7d10 A www.deekeep.com OPT
TCP: 	29	0.855039	10.0.1.20	10.0.1.1	DNS	124	Standard query 0x7d10 A www.deekeep.com OPT
TCP: 	31	1.101330	10.0.1.1	10.0.1.20	DNS	101	Standard query response 0x7d10 Server failure A www.deekeep.com
UDP:	36	1.379457	10.0.1.1	10.0.1.20	DNS	116	Standard query response 0x7d10 A www.deekeep.com CNAME deekeep.com A 44.205.225.91 OPT

This log belongs to a single dig @10.0.1.20 www.deekeep.com

blocky logs SERVFAIL first. Further queries match the response of the upstream (after the TTL expires, a new SERVFAIL is triggered):

;; ANSWER SECTION:
www.deekeep.com.        482     IN      CNAME   deekeep.com.
deekeep.com.            482     IN      A       44.205.225.91

I cannot reproduce the SERVFAIL with dig.
dig www.deekeep.com OPT @10.0.1.1 +tcp uses TCP, but I got the same answer as already posted...

@ThinkChaos
Collaborator

I cannot reproduce the SERVFAIL with dig.

Can you try with an A query? dig @10.0.1.1 A www.deekeep.com +tcp

@tjay
Author

tjay commented Jan 15, 2024

~]$ dig @10.0.1.1 A www.deekeep.com +tcp

; <<>> DiG 9.18.21 <<>> @10.0.1.1 A www.deekeep.com +tcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 17988
;; flags: qr rd ad; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;www.deekeep.com.               IN      A

;; Query time: 3 msec
;; SERVER: 10.0.1.1#53(10.0.1.1) (TCP)
;; WHEN: Mon Jan 15 12:18:45 CET 2024
;; MSG SIZE  rcvd: 33

There is no SERVFAIL when using UDP (even after TTL refresh).

Asking the upstream behind 10.0.1.1 (the provider DNS):

~]$ dig @89.246.64.8 A www.deekeep.com +tcp
;; communications error to 89.246.64.8#53: end of file
;; communications error to 89.246.64.8#53: end of file
;; communications error to 89.246.64.8#53: end of file

; <<>> DiG 9.18.21 <<>> @89.246.64.8 A www.deekeep.com +tcp
; (1 server found)
;; global options: +cmd
;; no servers could be reached

second request:

~]$ dig @89.246.64.8 A www.deekeep.com +tcp

; <<>> DiG 9.18.21 <<>> @89.246.64.8 A www.deekeep.com +tcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15351
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;www.deekeep.com.               IN      A

;; ANSWER SECTION:
www.deekeep.com.        600     IN      CNAME   deekeep.com.
deekeep.com.            600     IN      A       44.205.225.91

;; Query time: 446 msec
;; SERVER: 89.246.64.8#53(89.246.64.8) (TCP)
;; WHEN: Mon Jan 15 12:25:40 CET 2024
;; MSG SIZE  rcvd: 74

So maybe my provider has a problem, or the DNS entry for deekeep.com does (I am not affiliated with this domain).
But blocky v0.22 handles this problem differently, so it does not leak to blocky's clients...

@tjay
Author

tjay commented Jan 15, 2024

The tcpdump with v0.22 shows only one UDP query:

1	0.000000	10.0.1.20	10.0.1.1	DNS	98	Standard query 0x9c56 A www.deekeep.com OPT
2	0.001961	10.0.1.1	10.0.1.20	DNS	116	Standard query response 0x9c56 A www.deekeep.com CNAME deekeep.com A 44.205.225.91 OPT

tjay changed the title from "EDNS confusion with v0.23" to "UDP/TCP confusion with v0.23" on Jan 15, 2024

@ThinkChaos
Collaborator

Yeah, the difference is because of #1302: blocky now tries both TCP and UDP at the same time and uses the first response that didn't fail (in the network-failure sense, not in the sense of a non-success DNS response code).
I'll change it to also wait for the second response if the first returned SERVFAIL.

The main issue is likely in the software you run on the server upstream from blocky (10.0.1.1) and I encourage you to report a bug there as anything we do on our side is just a workaround.
