bind/named dying after 24-48 hrs. "assertion failure"?
David C. Rankin
2018-12-07 08:27:49 UTC
All,

Has anyone else encountered the bind 9.13.4-1 named daemon dying with an
"assertion failure" in the past week? I have encountered the problem twice.
When named dies, status reports:

● named.service - Internet domain name server
Loaded: loaded (/usr/lib/systemd/system/named.service; enabled; vendor
preset: disabled)
Active: failed (Result: signal) since Thu 2018-12-06 10:35:51 CST; 15h ago
Process: 23007 ExecStart=/usr/bin/named -f -u named (code=killed, signal=ABRT)
Main PID: 23007 (code=killed, signal=ABRT)

Dec 06 10:35:51 phoinix named[23007]: #1 0x7f078ea0fcaa in ??
Dec 06 10:35:51 phoinix named[23007]: #2 0x7f078ebc295b in ??
Dec 06 10:35:51 phoinix named[23007]: #3 0x7f078ebc806e in ??
Dec 06 10:35:51 phoinix named[23007]: #4 0x7f078ebcbaba in ??
Dec 06 10:35:51 phoinix named[23007]: #5 0x7f078ea2f349 in ??
Dec 06 10:35:51 phoinix named[23007]: #6 0x7f078df1ea9d in ??
Dec 06 10:35:51 phoinix named[23007]: #7 0x7f078de4eb23 in ??
Dec 06 10:35:51 phoinix named[23007]: exiting (due to assertion failure)
Dec 06 10:35:51 phoinix systemd[1]: named.service: Main process exited,
code=killed, status=6/ABRT
Dec 06 10:35:51 phoinix systemd[1]: named.service: Failed with result 'signal'.

This hasn't happened before. A simple restart makes it happy again -- but
for how long?

# systemctl start named
● named.service - Internet domain name server
Loaded: loaded (/usr/lib/systemd/system/named.service; enabled; vendor
preset: disabled)
Active: active (running) since Fri 2018-12-07 02:04:03 CST; 4s ago
Main PID: 26487 (named)
Tasks: 10 (limit: 4915)
Memory: 15.5M
CGroup: /system.slice/named.service
└─26487 /usr/bin/named -f -u named

Dec 07 02:04:03 phoinix named[26487]: managed-keys-zone: loaded serial 129
Dec 07 02:04:03 phoinix named[26487]: zone 0.0.127.in-addr.arpa/IN: loaded
serial 42
Dec 07 02:04:03 phoinix named[26487]: zone rlfpllc.com/IN: loaded serial
2017000017
Dec 07 02:04:03 phoinix named[26487]: zone localhost/IN: loaded serial 42
Dec 07 02:04:03 phoinix named[26487]: zone 7.168.192.in-addr.arpa/IN: loaded
serial 2017000011
Dec 07 02:04:03 phoinix named[26487]: all zones loaded
Dec 07 02:04:03 phoinix named[26487]: running
Dec 07 02:04:04 phoinix named[26487]: managed-keys-zone: Key 19036 for zone .
is now trusted (acceptance timer com>
Dec 07 02:04:04 phoinix named[26487]: managed-keys-zone: Key 20326 for zone .
is now trusted (acceptance timer com>
Dec 07 02:04:04 phoinix named[26487]: resolver priming query complete

I don't know if this is somehow related to the timers and keys that may be
timing out after some period of time. If no one else has seen this, I'll keep
watching, but if anybody has any ideas, I'd welcome your thoughts.
--
David C. Rankin, J.D.,P.E.
Ismael Bouya
2018-12-07 08:37:10 UTC
Post by David C. Rankin
Has anyone else encountered a bind 9.13.4-1/named daemon dying with
"assertion failure" in the past week. I have encountered the problem twice.
When named dies, status reports: (...)
Hey there,
It happened to me once, with the exact same symptoms, but I took it as a
random failure. Now that you mention it too, maybe there is something to
look at.

It happened to me after the bind upgrade (9.13.3-3 -> 9.13.4-1).
Post by David C. Rankin
I don't know if this is somehow related to the timers and keys that may be
timing out after some period of time. If no one else has seen this, I'll keep
watching, but if anybody has any ideas, I'd welcome your thoughts.
I have no specific setup for keys/timers.

Kind regards,
--
Ismael
Amish via arch-general
2018-12-07 08:41:30 UTC
Post by David C. Rankin
All, Has anyone else encountered a bind 9.13.4-1/named daemon dying
with "assertion failure" in the past week. I have encountered the
problem twice.
It has happened to me every morning for about a week, maybe more.

Dec 07 07:40:47 amish named[768]: resolver.c:10484: REQUIRE(fetchp !=
((void *)0) && *fetchp == ((void *)0)) failed, back trace
Dec 07 07:40:47 amish named[768]: #0 0x55ea39e304d2 in ??
Dec 07 07:40:47 amish named[768]: #1 0x7f889343ccaa in ??
Dec 07 07:40:47 amish named[768]: #2 0x7f88935ef95b in ??
Dec 07 07:40:47 amish named[768]: #3 0x7f88935f506e in ??
Dec 07 07:40:47 amish named[768]: #4 0x7f88935f8aba in ??
Dec 07 07:40:47 amish named[768]: #5 0x7f889345c349 in ??
Dec 07 07:40:47 amish named[768]: #6 0x7f889294ba9d in ??
Dec 07 07:40:47 amish named[768]: #7 0x7f889287bb23 in ??
Dec 07 07:40:47 amish named[768]: exiting (due to assertion failure)
Post by David C. Rankin
A simple restart makes it happy again -- but for how long?
It does not happen again after I restart it, until I shut down the laptop at
night.

Then the next morning it happens again, within an hour of turning on
the laptop.

As the log above shows, a REQUIRE assertion in resolver.c fails. I haven't
bothered to find the exact reason though.
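
For anyone collecting details for a bug report: the line that names the source
file and the failed check is the useful part, and it is easy to miss when only
the tail of the log is shown. Below is a minimal sketch, assuming the journal
output for named has been saved as plain text in the same layout as the log
above. The names ASSERT_RE and find_assertions are made up for this example
and are not part of bind or systemd.

```python
import re

# Matches the start of an assertion line as it appears in the log above, e.g.
# "... named[768]: resolver.c:10484: REQUIRE(fetchp != ..."
# Back-trace lines ("#0 0x... in ??") and the "exiting" line do not match.
ASSERT_RE = re.compile(r"named\[\d+\]: ([\w.]+):(\d+): ([A-Z]+)\(")


def find_assertions(log_text):
    """Return a (source_file, line_number, check_kind) tuple per assertion line."""
    return [(m.group(1), int(m.group(2)), m.group(3))
            for m in ASSERT_RE.finditer(log_text)]


if __name__ == "__main__":
    sample = (
        "Dec 07 07:40:47 amish named[768]: resolver.c:10484: REQUIRE(fetchp !=\n"
        "((void *)0) && *fetchp == ((void *)0)) failed, back trace\n"
        "Dec 07 07:40:47 amish named[768]: #0 0x55ea39e304d2 in ??\n"
        "Dec 07 07:40:47 amish named[768]: exiting (due to assertion failure)\n"
    )
    print(find_assertions(sample))  # [('resolver.c', 10484, 'REQUIRE')]
```

If the saved text holds only back-trace frames, the function returns an empty
list, which suggests the assertion line was cut off and a longer excerpt of
the journal is needed.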

I have made only two changes to the default named.conf that ships with Arch:

I allowed recursion from any client and set Cloudflare as the forwarder.

-     allow-recursion { 127.0.0.1; };
+     allow-recursion { any; };
+     forward first;
+     forwarders { 1.0.0.1 ; 1.1.1.1 ; }; //Cloudflare

Regards,

Amish.
Søren Rindom Andersen via arch-general
2018-12-07 09:41:15 UTC
Hi

This seems to be the same as this bug report:
https://bugs.archlinux.org/task/60913?project=1&string=bind

BR,
Søren
