Discussion:
Python and /etc/resolv.conf changes
Jesse Keating
2009-01-20 23:48:30 UTC
I'm looking at what might be a bug in python. The anaconda installer
launches python, then we get network configs going and write out some
files such as /etc/resolv.conf. However that original python process
can't seem to resolve anything after this file has been written out,
whereas new python processes started on a different terminal can indeed
use the new /etc/resolv.conf data.

My thought is that the original python process is using stale
information, and that something like a res_init() is needed, but google
doesn't seem to have any real connection between python and calling
res_init. The guys in #python on freenode aren't exactly sure what to
do here either.

In fact, I just did a simple test of an strace on python, where I import
socket, do a lookup, modify /etc/resolv.conf, do more lookups and review
the results. Strace shows that python opens /etc/resolv.conf exactly
once (after I've imported socket), and never again, so it never sees any
of the changes made.
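
The test described above can be approximated with a short script. This is only a sketch of the idea, not the exact commands that were run: the helper names `try_resolve` and `compare_lookups` are made up for illustration, and the hostname passed in is whatever name you want to test.

```python
import socket


def try_resolve(host):
    """Return the sorted addresses for host, or None if the lookup fails."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return None
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


def compare_lookups(host, wait_for_edit):
    """Resolve host, let the caller change resolv.conf, then resolve again.

    wait_for_edit is any callable that blocks until the file has been edited,
    for example: lambda: input("Edit /etc/resolv.conf, then press Enter ")
    """
    before = try_resolve(host)
    wait_for_edit()
    after = try_resolve(host)
    return before, after
```

Running this in one long-lived interpreter under strace should show whether /etc/resolv.conf is opened again for the second lookup.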

Can anybody confirm what I'm seeing as buggy and in need of fixing?
--
Jesse Keating
Fedora -- Freedom² is a feature!
identi.ca: http://identi.ca/jkeating
Colin Walters
2009-01-21 01:19:41 UTC
Post by Jesse Keating
I'm looking at what might be a bug in python. The anaconda installer
launches python, then we get network configs going and write out some
files such as /etc/resolv.conf. However that original python process
can't seem to resolve anything after this file has been written out,
whereas new python processes started on a different terminal can indeed
use the new /etc/resolv.conf data.
My thought is that the original python process is using stale
information, and that something like a res_init() is needed, but google
doesn't seem to have any real connection between python and calling
res_init. The guys in #python on freenode aren't exactly sure what to
do here either.

Nothing specific to Python here as far as I know; glibc just caches
its first read of /etc/resolv.conf. Everything is affected. The only
program that mostly works is Firefox because they go out of their way
to unbreak things (i.e. call res_init when they get notification from
NetworkManager).
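
For a long-running Python process, the same res_init workaround can be sketched with ctypes. This assumes a glibc-based Linux system where the C library exports `__res_init` (the symbol behind the `res_init()` macro) or `res_init`; the function name `reload_resolver_config` is hypothetical, and the code returns False instead of failing where neither symbol is available.

```python
import ctypes
import ctypes.util


def reload_resolver_config():
    """Ask the C library to re-read /etc/resolv.conf.

    Returns True if a res_init entry point was found and reported success,
    False otherwise.
    """
    try:
        # find_library may return None; CDLL(None) then exposes the symbols
        # already loaded into this process on POSIX systems.
        libc = ctypes.CDLL(ctypes.util.find_library("c"))
    except (OSError, TypeError):
        return False
    # Assumption: glibc exposes the function as __res_init; other C
    # libraries may use the plain name, or not provide it at all.
    for symbol in ("__res_init", "res_init"):
        func = getattr(libc, symbol, None)
        if func is not None:
            func.restype = ctypes.c_int
            return func() == 0  # res_init returns 0 on success, -1 on error
    return False
```

The caller would invoke this right after writing a new /etc/resolv.conf and before the next lookup.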

The previous thread was here:

http://www.mailinglistarchive.com/fedora-devel-list at redhat.com/msg39340.html

I think we were stuck on the choice between nscd, bind, and some other
caching package.

Both Debian/Ubuntu
(http://patches.ubuntu.com/g/glibc/extracted/any/local-dynamic-resolvconf.diff)
and OpenSUSE (http://download.opensuse.org/distribution/11.0/repo/src-oss/suse/src/glibc-2.8-14.1.src.rpm,
resolv.dynamic.diff) ship a patch to glibc which stats() resolv.conf.
We do not.
Colin Walters
2009-01-21 14:26:42 UTC
Post by Colin Walters
Both Debian/Ubuntu
(http://patches.ubuntu.com/g/glibc/extracted/any/local-dynamic-resolvconf.diff)
and OpenSUSE (http://download.opensuse.org/distribution/11.0/repo/src-oss/suse/src/glibc-2.8-14.1.src.rpm,
resolv.dynamic.diff) ship a patch to glibc which stats() resolv.conf.
We do not.

Thinking about this a bit more this morning on the shuttle, there's a
strong argument that this is a glibc bug, and that the stat() approach
is a correct fix, if not necessarily the most ideal one. That
argument is simply that glibc is caching data without a mechanism for
invalidation; and a cache without invalidation is always a bug.
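
The stat() idea can be illustrated at the application level. The sketch below is not the Debian or OpenSUSE patch; it only shows the invalidation check those patches are described as doing, with a hypothetical `ResolvConfWatcher` class and a caller-supplied `on_change` callback (which could, for instance, trigger a res_init call).

```python
import os


class ResolvConfWatcher:
    """Detect changes to a resolver config file by comparing stat() results."""

    def __init__(self, path="/etc/resolv.conf", on_change=None):
        self.path = path
        self.on_change = on_change
        self._signature = self._stat_signature()

    def _stat_signature(self):
        """Return a tuple that changes when the file is rewritten or replaced."""
        try:
            st = os.stat(self.path)
        except OSError:
            return None  # a missing file is treated as its own state
        return (st.st_mtime_ns, st.st_size, st.st_ino)

    def check(self):
        """Return True (and call on_change) if the file changed since last check."""
        current = self._stat_signature()
        if current == self._signature:
            return False
        self._signature = current
        if self.on_change is not None:
            self.on_change()
        return True
```

Calling `check()` before each lookup costs one stat() per call, which is the overhead the thread is debating.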
Dan Williams
2009-01-22 16:00:14 UTC
Post by Colin Walters
Post by Colin Walters
Both Debian/Ubuntu
(http://patches.ubuntu.com/g/glibc/extracted/any/local-dynamic-resolvconf.diff)
and OpenSUSE (http://download.opensuse.org/distribution/11.0/repo/src-oss/suse/src/glibc-2.8-14.1.src.rpm,
resolv.dynamic.diff) ship a patch to glibc which stats() resolv.conf.
We do not.
Thinking about this a bit more this morning on the shuttle, there's a
strong argument that this is a glibc bug, and that the stat() approach
is a correct fix, if not necessarily the most ideal one. That
argument is simply that glibc is caching data without a mechanism for
invalidation; and a cache without invalidation is always a bug.

This was discussed with the glibc maintainers a long time ago, and was
rejected for various reasons (see below). Their answer at the time was
to use "lwresd", a lightweight caching nameserver, or nscd to provide
this functionality. This was back in 2004, so perhaps things have
changed, and maybe it's time to strike up the conversation again.
However, I suspect the answer is still "use nscd".

Dan

--------------------------------------
Post by Colin Walters
Post by Colin Walters
1) make glibc stat() /etc/resolv.conf on every call that does name
lookups
2) make glibc re-read /etc/resolv.conf every time something does a name
lookup
3) use nscd instead, and modify nscd to do either (1) or (2)?

None of this is an option. There is no way we are going to make
everybody pay the price for the needs of a few people who want
everything to happen automatically.
The solution is to use nscd and have some external code explicitly flush
the cache with

    service nscd reload

This has already been possible for, I guess, 5-6 years.
Mike McGrath
2009-01-21 02:02:01 UTC
Permalink
Post by Jesse Keating
I'm looking at what might be a bug in python. The anaconda installer
launches python, then we get network configs going and write out some
files such as /etc/resolv.conf. However that original python process
can't seem to resolve anything after this file has been written out,
whereas new python processes started on a different terminal can indeed
use the new /etc/resolv.conf data.
My thought is that the original python process is using stale
information, and that something like a res_init() is needed, but google
doesn't seem to have any real connection between python and calling
res_init. The guys in #python on freenode aren't exactly sure what to
do here either.
In fact, I just did a simple test of an strace on python, where I import
socket, do a lookup, modify /etc/resolv.conf, do more lookups and review
the results. Strace shows that python opens /etc/resolv.conf exactly
once (after I've imported socket), and never again, so it never sees any
of the changes made.
Can anybody confirm what I'm seeing as buggy and in need of fixing?

I believe this might be the same thing that happens to preupgrade and
smolt. It's a hacky fix but here's how we dealt with it:

http://tinyurl.com/8lz8gh

-Mike