The low-level protocols beneath HTTP identify machines by IP
addresses, sequences of four 8-bit integers such as 199.232.41.10 [1].
HTTP, on the other hand, and most application protocols, manipulate
host names, strings such as www.polipo.org.
The domain name service (DNS) is a distributed database that maps host names to IP addresses. When an application wants to make use of the DNS, it invokes a resolver, a local library or process that contacts remote name servers.
The Unix interface to the resolver is provided by the
gethostbyname(3) library call (getaddrinfo(3) on recent systems),
which was designed at a time when a host lookup consisted of
searching for one of five hosts in a HOSTS.TXT file. The
gethostbyname call is blocking, meaning that all activity must cease
while a host lookup is in progress. When the call eventually returns,
it doesn't provide a time to live (TTL) value to indicate how long
the address may be cached. For these reasons, gethostbyname is hardly
useful for programs that need to contact more than a few hosts [2].
To avoid the issues with gethostbyname(3), Polipo usually tries to
speak the DNS protocol itself rather than using the system resolver.
Its precise behaviour is controlled by the value of
dnsUseGethostbyname. If dnsUseGethostbyname is false, Polipo never
uses the system resolver. If it is reluctantly (the default), Polipo
tries to speak DNS and falls back to the system resolver if a name
server could not be contacted. If it is happily, Polipo tries to
speak DNS, and falls back to the system resolver if the host couldn't
be found for any reason. Finally, if dnsUseGethostbyname is true,
Polipo never tries to speak DNS itself and uses the system resolver
straight away.
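
The four settings can be read as a small fallback policy. The sketch
below is only an illustration of that reading, not Polipo's actual
code: the function resolve, the two exception classes and the
resolver callables are all hypothetical names introduced here.

```python
from typing import Callable, List


class NameServerUnreachable(Exception):
    """Hypothetical error: no name server could be contacted."""


class HostNotFound(Exception):
    """Hypothetical error: the DNS lookup completed but found no host."""


Resolver = Callable[[str], List[str]]


def resolve(host: str, policy: str,
            internal_dns: Resolver, system_resolver: Resolver) -> List[str]:
    """Pick a resolver according to a dnsUseGethostbyname-style policy."""
    if policy not in ("false", "reluctantly", "happily", "true"):
        raise ValueError("unknown policy: %r" % policy)
    if policy == "true":
        # Never speak DNS directly; go to the system resolver straight away.
        return system_resolver(host)
    try:
        return internal_dns(host)
    except NameServerUnreachable:
        # Both "reluctantly" and "happily" fall back in this case.
        if policy in ("reluctantly", "happily"):
            return system_resolver(host)
        raise
    except HostNotFound:
        # Only "happily" falls back when the host simply was not found.
        if policy == "happily":
            return system_resolver(host)
        raise
```

With policy set to false, either failure is reported to the caller
and the system resolver is never consulted.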
If the internal DNS support is used, Polipo must be given a recursive
name server to speak to. By default, this information is taken from
the /etc/resolv.conf file; however, if you wish to use a different
name server, you may set the variable dnsNameServer to an IP
address [3].
When the reply to a DNS request is slow to arrive, Polipo retries
multiple times using an exponentially increasing timeout. The maximum
timeout used before Polipo gives up is defined by dnsMaxTimeout
(default 60s); the total time before Polipo gives up on a DNS query
will be roughly twice dnsMaxTimeout.
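
The "roughly twice" figure follows from summing a doubling series
capped at the maximum. The sketch below assumes a first timeout of one
second and a doubling factor of two; both are illustrative guesses,
not values taken from Polipo, and timeout_schedule is a hypothetical
helper.

```python
from typing import List


def timeout_schedule(max_timeout: float, initial: float = 1.0) -> List[float]:
    """Return successive per-attempt timeouts, doubling up to max_timeout.

    The initial value and the factor of two are assumptions made for
    illustration only.
    """
    if initial <= 0 or max_timeout <= 0:
        raise ValueError("timeouts must be positive")
    timeouts = []
    t = min(initial, max_timeout)
    while True:
        timeouts.append(t)
        if t >= max_timeout:
            break  # the attempt that used the maximum timeout is the last
        t = min(t * 2, max_timeout)
    return timeouts


schedule = timeout_schedule(60.0)
total = sum(schedule)  # 1 + 2 + 4 + 8 + 16 + 32 + 60
```

Under these assumptions the attempts add up to 123 seconds for a
maximum of 60 seconds, which is close to twice dnsMaxTimeout.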
The variable dnsNegativeTtl specifies the time during which negative
DNS information (information that a host doesn't exist) will be
cached; this defaults to 120s. Increasing this value reduces both
latency and network traffic but may cause a failed host not to be
noticed when it comes back up.
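
The trade-off is easier to see in a minimal negative cache. This is a
sketch under stated assumptions, not Polipo's data structure: the
class NegativeCache and its methods are hypothetical, and the clock is
injectable only so that the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Dict


class NegativeCache:
    """Remember hosts known not to exist for a fixed time."""

    def __init__(self, ttl: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl          # plays the role of dnsNegativeTtl, in seconds
        self.clock = clock
        self._expires: Dict[str, float] = {}

    def mark_missing(self, host: str) -> None:
        """Record that a lookup for host failed just now."""
        self._expires[host] = self.clock() + self.ttl

    def is_known_missing(self, host: str) -> bool:
        """True while the negative entry is still fresh."""
        expiry = self._expires.get(host)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._expires[host]  # stale entry: allow a fresh lookup
            return False
        return True
```

While is_known_missing returns True, no query is sent at all, so a
host that comes back within the TTL stays invisible until the entry
expires.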
The variable dnsQueryIPv6 specifies whether to query for IPv4 or IPv6
addresses. If dnsQueryIPv6 is false, only IPv4 addresses are queried.
If dnsQueryIPv6 is reluctantly, both types of addresses are queried,
but IPv4 addresses are preferred. If dnsQueryIPv6 is happily (the
default), IPv6 addresses are preferred. Finally, if dnsQueryIPv6 is
true, only IPv6 addresses are queried.
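
One way to picture the four settings is as an ordering over candidate
addresses. The sketch assumes that "preferred" means "tried first",
which the text above does not spell out; order_addresses is a
hypothetical helper, not part of Polipo.

```python
from typing import List, Sequence


def order_addresses(policy: str,
                    ipv4: Sequence[str], ipv6: Sequence[str]) -> List[str]:
    """Order candidate addresses for a dnsQueryIPv6-style policy.

    Assumes a preferred address family is simply listed first.
    """
    if policy == "false":
        return list(ipv4)               # IPv4 only
    if policy == "reluctantly":
        return list(ipv4) + list(ipv6)  # both, IPv4 first
    if policy == "happily":
        return list(ipv6) + list(ipv4)  # both, IPv6 first
    if policy == "true":
        return list(ipv6)               # IPv6 only
    raise ValueError("unknown policy: %r" % policy)
```

The addresses passed in would come from separate A and AAAA queries;
with false or true, only one of the two queries is needed.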
If the system resolver is used, the value dnsGethostbynameTtl
specifies the time during which a gethostbyname reply will be cached
(default 5 minutes).
[1] Or sequences of eight 16-bit integers if you are running IPv6.
[2] Recent systems replace gethostbyname(3) with getaddrinfo(3),
which is reentrant. While this removes one important problem that
multi-threaded programs encounter, it doesn't solve any of the other
issues with gethostbyname.
[3] While Polipo does its own caching of DNS data, I strongly
recommend that you run a local caching name server. I am very happy
with pdnsd, notwithstanding its somewhat bizarre handling of TCP
connections.