Gnut Manual: Ideas for Advanced Use




6. Advanced Topics


6.1 High Availability Connection Point

Gnut can be used as a publicized connection point into the main gnutella network. Set max_incoming to somewhere between 10 and 20, and set redirect to 1. It is probably best not to share any files in this case, as the gnutella traffic itself is liable to consume large amounts of bandwidth.


6.2 Scriptable Activity

Using expect, it should be possible to automate certain types of tasks in gnut. For example, if there is a file with a certain name that you are trying to get, but the server it is on is not often available, you could write an expect script that runs gnut, issues a search for the exact filename, verifies that a file is found, and starts (or resumes) a download.

There is also a lot that can be done with the recently added load and eval commands and backquote command substitution. One user suggested putting an eval command in his .gnutrc startup file that loads a blacklist file from an FTP server somewhere and then runs it -- this is an easy way to keep up-to-date on which sites have been blacklisted.


6.3 Network Topology

gnut users often ask how to maximize their search results. In general, increasing min_connections and ttl increases the number of hosts your searches will reach, but there is more to it than that.
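
As a rough illustration of why both settings matter, the sketch below computes an upper bound on the number of hosts a search could reach. It rests on simplifying assumptions that are not from gnut itself: every host keeps exactly the same number of connections, a ttl of t lets a search travel t hops, and the network has no loops at all. The function name max_hosts_reached is made up for this example.

```python
def max_hosts_reached(connections: int, ttl: int) -> int:
    """Upper bound on hosts reached by a search in an idealized network.

    Assumptions (illustrative only): every host has exactly `connections`
    open connections, and no two paths ever lead to the same host.
    """
    if connections < 1 or ttl < 1:
        return 0
    total = 0
    frontier = connections  # hosts exactly 1 hop away
    for _ in range(ttl):
        total += frontier
        # Each host forwards on every connection except the one the
        # search arrived on.
        frontier *= connections - 1
    return total


if __name__ == "__main__":
    for conns in (3, 4, 5):
        for ttl in (3, 5, 7):
            print(conns, ttl, max_hosts_reached(conns, ttl))
```

Real numbers will be lower than this bound, because the network does contain loops and hosts differ in how many connections they keep.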

If you do an update command twice in a row, giving it 10 minutes to run each time, you'll see that the number of hosts changes, often significantly.

The results change because the network is always changing. Connections close, and new connections get created. Every change affects the number of hosts you can reach.

The biggest factor affecting the number of hosts you can reach is whether two of your neighbors are connected by a small loop:

          e           f
           \         /
      d --- YOU --- A --- g
             |      |
             |      |
      k ---- B ---- C --- h
            /        \
           j          i

Notice that there are two paths from you to C: one through A and one through B. Also, note the 8 hosts (d,e,f,g,h,i,j,k) that are in the outer ring. If there were no loop, nodes B and C would each have an additional branch they could use to reach out to more nodes:

          e           f
           \         /
      d --- YOU --- A --- g
             |      |
             |      |
      k ---- B      C --- h
            / \    / \
           j   l  m   i

In general, the more loops there are, the fewer nodes you can reach.

However, you also need loops for the nodes to be able to reach each other. In the first diagram, every node can reach every other node in at most 4 hops, and i and j are only 3 hops apart. In the second diagram, some nodes (like i and j, or l and m) are 5 hops away from each other. If their ttl were 4, they wouldn't be able to search each other's shared files.
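
The hop counts in the two diagrams can be checked with a short breadth-first search. The sketch below encodes both diagrams as edge lists (node names are taken from the diagrams; the helper names build, hops_from and diameter are made up for this example) and reports the number of nodes and the largest hop distance between any pair. By this count the farthest pairs in the first diagram, such as d and h, are 4 hops apart.

```python
from collections import deque


def build(edges):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def hops_from(graph, start):
    """Breadth-first search: hop distance from start to every node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def diameter(graph):
    """Largest hop distance between any two nodes."""
    return max(max(hops_from(graph, n).values()) for n in graph)


# Links that appear in both diagrams.
COMMON = [("YOU", "A"), ("YOU", "B"), ("A", "C"),
          ("YOU", "d"), ("YOU", "e"), ("A", "f"), ("A", "g"),
          ("C", "h"), ("C", "i"), ("B", "j"), ("B", "k")]

with_loop = build(COMMON + [("B", "C")])                 # first diagram
without_loop = build(COMMON + [("B", "l"), ("C", "m")])  # second diagram

if __name__ == "__main__":
    for name, g in (("with loop", with_loop), ("without loop", without_loop)):
        print(name, "nodes:", len(g), "max hops:", diameter(g),
              "i to j:", hops_from(g, "i")["j"])
```

Removing the loop adds two nodes to the network but pushes i and j from 3 hops apart to 5.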

This looks like a contradiction: loops seem to both increase and decrease the number of reachable nodes. The ideal balance is for there to be lots of loops of large size, but no really small loops. In fact, the most efficient large-scale multiprocessor supercomputers do exactly that -- they use a hypercube or N-dimensional torus network.

gnut dynamically adjusts its network connections so as to avoid small loops. It does this by watching the duplicate-packet statistics on all the connections, and closing the connections with the most duplicate packets. It does not deliberately form large loops, but it actively breaks small loops when it finds that it is part of one, and large loops always form naturally.
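
The idea of closing the connection with the most duplicates can be sketched as follows. This is not gnut's actual code: the ConnStats record, the use of a duplicate ratio rather than a raw count, and the min_packets and threshold values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConnStats:
    """Hypothetical per-connection counters."""
    host: str
    packets: int     # total packets received on this connection
    duplicates: int  # packets already seen on another connection

    @property
    def duplicate_ratio(self) -> float:
        return self.duplicates / self.packets if self.packets else 0.0


def pick_connection_to_close(conns: List[ConnStats],
                             min_packets: int = 100,
                             threshold: float = 0.5) -> Optional[ConnStats]:
    """Return the connection most likely to be part of a small loop.

    Connections with too little traffic are skipped so that a handful of
    early duplicates does not trigger a disconnect. Returns None when no
    connection reaches the (assumed) duplicate-ratio threshold.
    """
    candidates = [c for c in conns
                  if c.packets >= min_packets
                  and c.duplicate_ratio >= threshold]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c.duplicate_ratio)


if __name__ == "__main__":
    stats = [ConnStats("hostA", 1000, 50),
             ConnStats("hostB", 1000, 700),
             ConnStats("hostC", 10, 9)]
    worst = pick_connection_to_close(stats)
    print(worst.host if worst else "nothing to close")
```

A high share of duplicates on one connection suggests that the same packets are also arriving by a short alternate path, which is what a small loop looks like from the inside.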

Small loops would be rare if all Gnutella clients picked hosts randomly, but many clients don't. Instead, they connect to hosts in whatever order they appear in the host list, and the host list usually comes from PONG packets, so nearby hosts usually end up at the beginning of the list.
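
The difference between the two selection strategies can be sketched in a few lines. The function names and the host list below are made up for this example; the code only contrasts taking hosts from the front of the list with sampling them at random.

```python
import random


def choose_hosts_in_order(host_list, count):
    """Take hosts from the front of the list, as many clients do."""
    return list(host_list[:count])


def choose_hosts_randomly(host_list, count, rng=None):
    """Pick distinct hosts uniformly at random from the whole list."""
    rng = rng or random.Random()
    hosts = list(host_list)
    return rng.sample(hosts, min(count, len(hosts)))


if __name__ == "__main__":
    # Hypothetical host list; nearby hosts tend to sit at the front.
    hosts = ["10.0.0.%d" % i for i in range(50)]
    print(choose_hosts_in_order(hosts, 4))
    print(choose_hosts_randomly(hosts, 4))
```

With in-order selection, two clients holding similar host lists connect to the same few hosts, which is how small loops arise.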




WWW: http://www.mrob.com/
EMail: mrob at mrob com (If you aren't a spambot you can rewrite this yourself)
Send junk mail to: rpm@mrob.com

© 1996-2000 Robert P. Munafo. 3