To install the Twisted.Web server, you'll need to have installed Twisted.
Twisted servers, like the web server, do not have configuration files. Instead, you instantiate the server and store it into a 'Pickle' file, web.tap. This file will then be loaded by the Twisted Daemon.
% mktap web --path /path/to/web/content
If you just want to serve content from your own home directory, the following will do:
% mktap web --path ~/public_html/
Some other configuration options are available as well:
--port: Specify the port for the web server to listen on. This defaults to 8080.
--logfile: Specify the path to the log file.
The full set of options that are available can be seen with:
% mktap web --help
Once you've created your web.tap file and done any configuration, you can start the server:
% twistd -f web.tap
You can stop the server at any time by going back to the directory you started it in and running the command:
% kill `cat twistd.pid`
Twisted.Web serves flat HTML files just as it does any other flat file.
A Resource script is a Python file ending with the extension .rpy, which is required to create an instance of a (subclass of a) twisted.web.resource.Resource.
Resource scripts have 3 special variables:
__file__: The name of the .rpy file, including the full path. This variable is automatically defined and present within the namespace.
registry: An object of class static.Registry. It can be used to access and set persistent data keyed by a class.
resource: The variable which must be defined by the script and set to the resource instance that will be used to render the page.
A very simple Resource Script might look like:
from twisted.web import resource

class MyGreatResource(resource.Resource):
    def render(self, request):
        return "<html>foo</html>"

resource = MyGreatResource()
A slightly more complicated resource script, which accesses some persistent data, might look like:
from twisted.web import resource
from SillyWeb import Counter

counter = registry.getComponent(Counter)
if not counter:
    registry.setComponent(Counter, Counter())
counter = registry.getComponent(Counter)

class MyResource(resource.Resource):
    def render(self, request):
        counter.increment()
        return "you are visitor %d" % counter.getValue()

resource = MyResource()
This assumes you have a SillyWeb module whose Counter class is implemented something like the following:
class Counter:
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1

    def getValue(self):
        return self.value
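To make the idea of persistent data "keyed by a class" more concrete, here is a minimal stand-alone sketch. ToyRegistry and visit are hypothetical names written for this illustration; ToyRegistry is not Twisted's static.Registry and only mimics the getComponent and setComponent calls used in the script above. The sketch runs in plain Python without Twisted.

```python
class Counter:
    """Same shape as the Counter class shown above."""
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1

    def getValue(self):
        return self.value


class ToyRegistry:
    """Hypothetical stand-in for the registry: one stored object per class."""
    def __init__(self):
        self._components = {}

    def getComponent(self, cls):
        # Return the instance stored under this class, or None if none is stored.
        return self._components.get(cls)

    def setComponent(self, cls, instance):
        self._components[cls] = instance


def visit(registry):
    """Mimic what the resource script does for each request."""
    counter = registry.getComponent(Counter)
    if counter is None:
        counter = Counter()
        registry.setComponent(Counter, counter)
    counter.increment()
    return "you are visitor %d" % counter.getValue()


registry = ToyRegistry()
first = visit(registry)
second = visit(registry)
print(first)
print(second)
```

Because the counter lives in the registry rather than in the script's own namespace, its value carries over from one call to the next, which is the property the resource script relies on.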
The Woven API is an advanced system for giving web UIs to your application with something resembling MVC and templates. See its documentation for more details.
One of the most interesting applications of Twisted.Web is the distributed webserver; multiple servers can all answer requests on the same port, using the twisted.spread package for "spreadable" computing. In two different directories, run the commands:
% mktap web --user
% mktap web --personal [other options, if you desire]
Both of these create a web.tap; you need to run both at the same time. Once you have, go to http://localhost:8080/your_username.twistd/ and you will see the front page from the server you created with the --personal option. What's happening here is that the request you've sent is being relayed from the central (User) server to your own (Personal) server, over a PB connection. This technique can be highly useful for small community sites; using the code that makes this demo work, you can connect one HTTP port to multiple resources running with different permissions on the same machine, on different local machines, or even over the internet to a remote site.
Everything related to CGI is located in the twisted.web.twcgi module, and it's here you'll find the classes that you need to subclass in order to support the language of your (or somebody else's) taste. You'll also need to create your own kind of resource if you are using a non-Unix operating system (such as Windows), or if the default resources have the wrong pathnames to the parsers.
The following snippet is a .rpy that serves Perl files. Look at twisted.web.twcgi for more examples regarding twisted.web and CGI.
from twisted.web import static, twcgi

class PerlScript(twcgi.FilteredScript):
    filter = '/usr/bin/perl'  # Points to the perl parser

resource = static.File("/perlsite")  # Points to the perl website
resource.processors = {".pl": PerlScript}  # Files that end with .pl will be
                                           # processed by PerlScript
resource.indexNames = ['index.pl']
It is common to use one server (for example, Apache) on a site with multiple names, which then uses a reverse proxy (in Apache, via mod_proxy) to reach different internal web servers, possibly on different machines. However, naive configuration causes miscommunication: the internal server firmly believes it is running on internal-name:port, and will generate URLs to that effect, which will be completely wrong when received by the client.
While Apache has the ProxyPassReverse directive, it is really a hack and is nowhere near comprehensive enough. Instead, when the internal web server is Twisted.Web, the recommended practice is to use VHostMonster.
From the Twisted side, using VHostMonster is easy: just drop a file named (for example) vhost.rpy containing the following:
from twisted.web import vhost
resource = vhost.VHostMonsterResource()
Of course, an equivalent .trp can also be used. Make sure the web server is configured with the correct processors for the rpy or trp extensions (the web server mktap web --path generates by default is so configured).
From the Apache side, instead of using the following ProxyPass directive:
<VirtualHost ip-addr>
ProxyPass / http://localhost:8538/
ServerName example.com
</VirtualHost>
Use the following directive:
<VirtualHost ip-addr>
ProxyPass / http://localhost:8538/vhost.rpy/http/example.com:80/
ServerName example.com
</VirtualHost>
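The extra path segments in the second ProxyPass directive carry the externally visible protocol and host name to the internal server. As a rough sketch of that idea (a simplified model in plain Python, not the actual VHostMonsterResource code), the function below peels those segments off a proxied path. The name split_vhost_path, its return shape, and the default-port fallback are assumptions made for this illustration.

```python
def split_vhost_path(path, script_name="vhost.rpy"):
    """Split '/vhost.rpy/<scheme>/<host[:port]>/<rest>' into its parts.

    Returns (scheme, host, port, rest_path). Hypothetical helper, for
    illustration only.
    """
    segments = path.lstrip("/").split("/")
    if len(segments) < 3 or segments[0] != script_name:
        raise ValueError("not a vhost-prefixed path: %r" % path)
    scheme, hostport = segments[1], segments[2]
    host, sep, port_text = hostport.partition(":")
    if sep:
        port = int(port_text)
    else:
        # Assumption of this sketch: fall back to the scheme's usual port.
        port = 443 if scheme == "https" else 80
    # Whatever follows the host segment is the path the client asked for.
    rest_path = "/" + "/".join(segments[3:])
    return scheme, host, port, rest_path


result = split_vhost_path("/vhost.rpy/http/example.com:80/docs/index.html")
print(result)
```

With the protocol and host recovered from the path, the internal server no longer has to guess the name under which the client reached it.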
Here is an example for Twisted.Web's reverse proxy:
from twisted.internet import app
from twisted.web import proxy, server, vhost

vhostName = 'example.com'
reverseProxy = proxy.ReverseProxyResource('internal', 8538,
                                          '/vhost.rpy/http/'+vhostName+'/')
root = vhost.NameVirtualHost()
root.addHost(vhostName, reverseProxy)
site = server.Site(root)
application = app.Application('web-proxy')
application.listenTCP(80, site)
Sometimes it is convenient to modify the content of the Request object before passing it on. Because this is most often used to rewrite the URL, the similarity to Apache's mod_rewrite has inspired the twisted.web.rewrite module. This module is used by wrapping a resource with a twisted.web.rewrite.RewriterResource, which then has rewrite rules. Rewrite rules are functions which accept a request object and possibly modify it. After all rewrite rules run, the child resolution chain continues as if the wrapped resource, rather than the RewriterResource, was the child.
Here is an example, using the only rule currently supplied by Twisted itself:
default_root = rewrite.RewriterResource(default, rewrite.tildeToUsers)
This causes the URL /~foo/bar.html to be treated like /users/foo/bar.html. If done after setting default's users child to a distrib.UserDirectory, it gives a configuration similar to the classical configuration of web servers, common since the first NCSA servers.
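To show what a rewrite rule of this kind does to a request, here is a small stand-alone sketch. ToyRequest and tilde_to_users are hypothetical names for this illustration and model only the path bookkeeping; this is not the source of rewrite.tildeToUsers, and it runs without Twisted.

```python
class ToyRequest:
    """Hypothetical stand-in holding only the path fields a rule touches."""
    def __init__(self, path):
        self.path = path
        self.prepath = []
        self.postpath = path.lstrip("/").split("/")


def tilde_to_users(request):
    """Rewrite a leading '~name' segment into the two segments 'users', 'name'."""
    if request.postpath and request.postpath[0].startswith("~"):
        user = request.postpath[0][1:]
        request.postpath[:1] = ["users", user]
        request.path = "/" + "/".join(request.prepath + request.postpath)


req = ToyRequest("/~foo/bar.html")
tilde_to_users(req)
print(req.path)

untouched = ToyRequest("/index.html")
tilde_to_users(untouched)
print(untouched.path)
```

A rule is just a function that receives the request and edits it in place; requests that do not match are left unchanged.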
Sometimes it is useful to know when the other side has broken the connection. Here is an example which does that:
from twisted.web.resource import Resource
from twisted.web import server
from twisted.internet import reactor
from twisted.python.util import println

class ExampleResource(Resource):
    def render(self, request):
        request.write("hello world")
        d = request.notifyFinish()
        d.addCallback(lambda _: println("finished normally"))
        d.addErrback(println, "error")
        reactor.callLater(10, request.finish)
        return server.NOT_DONE_YET

resource = ExampleResource()
This will allow us to run statistics on the log-file to see how many users are frustrated after merely 10 seconds.
Sometimes, you want to be able to send headers and status directly. While you can do this with a ResourceScript, an easier way is to use AsIsProcessor. Use it by, for example, adding it as a processor for the .asis extension. Here is a sample file:
HTTP/1.0 200 OK
Content-Type: text/html

Hello world
Twisted Web serves Python objects that implement the interface IResource.
Site objects create HTTPChannel instances to parse the HTTP request and begin the object lookup process. They contain the root Resource, the resource which represents the URL / on the site.
The IResource interface describes the methods a Resource object must implement in order to participate in the object publishing process.
Site objects serve as the glue between a port to listen for HTTP requests on, and a root Resource object.
When using mktap web --path /foo/bar/baz, a Site object is created with a root Resource that serves files out of the given path.
You can also create a Site instance by hand, passing it a Resource object which will serve as the root of the site:
from twisted.web import server, resource
from twisted.internet import reactor

class Simple(resource.Resource):
    isLeaf = True
    def render(self, request):
        return "<html>Hello, world!</html>"

site = server.Site(Simple())
reactor.listenTCP(8080, site)
reactor.run()
Resource objects represent a single URL segment of a site. During URL parsing, getChild is called on the current Resource to produce the next Resource object.
When the leaf Resource is reached, either because there were no more URL segments or a Resource had isLeaf set to True, the leaf Resource is rendered by calling render(request).
During the Resource location process, the URL segments which have already been processed and those which have not yet been processed are available in request.prepath and request.postpath.
A Resource can know where it is in the URL tree by looking at request.prepath, a list of URL segment strings.
A Resource can know which path segments will be processed after it by looking at request.postpath.
If the URL ends in a slash, for example http://example.com/foo/bar/, the final URL segment will be an empty string. Resources can thus know if they were requested with or without a final slash.
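As a rough illustration of how the two lists change during lookup, the sketch below splits a URL path into segments and moves them one at a time from a postpath list to a prepath list. The function name walk is hypothetical, and this is a simplified model of the bookkeeping rather than Twisted's own code.

```python
def walk(url_path):
    """Return (prepath, postpath) snapshots taken as segments are consumed."""
    prepath = []
    postpath = url_path.lstrip("/").split("/")
    snapshots = [(list(prepath), list(postpath))]
    while postpath:
        # The segment being looked up moves from postpath to prepath.
        prepath.append(postpath.pop(0))
        snapshots.append((list(prepath), list(postpath)))
    return snapshots


steps = walk("/foo/bar/")
for pre, post in steps:
    print(pre, post)
```

For /foo/bar/ the last segment is the empty string, so the final prepath ends in '' while the same walk over /foo/bar does not; that difference is how a resource can tell whether it was requested with a trailing slash.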
Here is a simple Resource object:
from twisted.web.resource import Resource

class Hello(Resource):
    def getChild(self, name, request):
        if name == '':
            return self
        return Resource.getChild(self, name, request)

    def render(self, request):
        return """<html>
Hello, world! I am located at %r.
</html>""" % (request.prepath)

resource = Hello()
Resources can be arranged in trees using putChild. putChild puts a Resource instance into another Resource instance, making it available at the given path segment name:
root = Hello()
root.putChild('fred', Hello())
root.putChild('bob', Hello())
If this root resource is served as the root of a Site instance, the following URLs will all be valid:
http://example.com/
http://example.com/fred
http://example.com/bob
http://example.com/fred/
http://example.com/bob/
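To see why all five URLs resolve, the following stand-alone sketch models the lookup with a toy tree. ToyResource and locate are hypothetical names for this illustration; a missing child is reported as None here, which is a simplification and not how Twisted itself answers unknown paths.

```python
class ToyResource:
    """Hypothetical stand-in for a Resource that holds named children."""
    def __init__(self, label):
        self.label = label
        self.children = {}

    def putChild(self, name, child):
        self.children[name] = child

    def getChild(self, name):
        # Like the Hello example: an empty segment (trailing slash) maps to self.
        if name == "":
            return self
        return self.children.get(name)


def locate(root, url_path):
    """Follow each URL segment from the root; None means no such child."""
    current = root
    for segment in url_path.lstrip("/").split("/"):
        current = current.getChild(segment)
        if current is None:
            return None
    return current


root = ToyResource("root")
root.putChild("fred", ToyResource("fred"))
root.putChild("bob", ToyResource("bob"))

for path in ["/", "/fred", "/bob", "/fred/", "/bob/", "/alice"]:
    found = locate(root, path)
    print(path, "->", found.label if found else "not found")
```

The trailing-slash forms work only because getChild returns the resource itself for the empty segment, mirroring the getChild method of the Hello class.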
Files with the extension .rpy are Python scripts which, when placed in a directory served by Twisted Web, will be executed when visited through the web.
An .rpy script must define a variable, resource, which is the Resource object that will render the request.
.rpy files are very convenient for rapid development and prototyping. Since they are executed on every web request, defining a Resource subclass in an .rpy will make viewing the results of changes to your class visible simply by refreshing the page:
from twisted.web import resource

class MyResource(resource.Resource):
    def render(self, request):
        return "<html>Hello, world!</html>"

resource = MyResource()
However, it is often a better idea to define Resource subclasses in Python modules. In order for changes in modules to be visible, you must either restart the Python process, or reload the module:
import myresource

## Comment out this line when finished debugging
reload(myresource)

resource = myresource.MyResource()
Creating a Twisted Web server which serves a directory is easy:
% mktap web --path /Users/dsp/Sites
% twistd -nf web.tap
Resource rendering occurs when Twisted Web locates a leaf Resource object to handle a web request. A Resource object may do various things to produce output which will be sent back to the browser:
Return a string.
Call request.write("stuff") as many times as desired, then call request.finish() and return server.NOT_DONE_YET. (This is deceptive, since you are in fact done with the request, but is the correct way to do this.)
Obtain a Deferred, return server.NOT_DONE_YET, and call request.write("stuff") and request.finish() later, in a callback on the Deferred.
HTTP is a stateless protocol; every request-response is treated as an individual unit, distinguishable from any other request only by the URL requested. With the advent of Cookies in the mid-nineties, dynamic web servers gained the ability to distinguish between requests coming from different browser sessions by sending a Cookie to a browser. The browser then sends this cookie whenever it makes a request to a web server, allowing the server to track which requests come from which browser session.
Twisted Web provides an abstraction of this browser-tracking behavior called the Session object. Calling request.getSession() checks to see if a session cookie has been set; if not, it creates a unique session id, creates a Session object, stores it in the Site, and returns it. If a session object already exists, the same session object is returned. In this way, you can store data specific to the session in the session object.
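The lookup described above can be modeled in a few lines of plain Python. Everything in this sketch (ToySession, ToySite, ToyRequest, the cookie name TOY_SESSION, and the use of uuid for ids) is a hypothetical stand-in chosen for illustration; it follows the steps in the paragraph but is not Twisted's implementation.

```python
import uuid


class ToySession:
    """Hypothetical per-browser storage object."""
    def __init__(self, uid):
        self.uid = uid
        self.data = {}


class ToySite:
    """Hypothetical site that remembers sessions by id."""
    def __init__(self):
        self.sessions = {}


class ToyRequest:
    """Hypothetical request with received cookies and cookies to send back."""
    COOKIE_NAME = "TOY_SESSION"  # made-up cookie name for this sketch

    def __init__(self, site, cookies=None):
        self.site = site
        self.received_cookies = dict(cookies or {})
        self.cookies_to_send = {}
        self._session = None

    def getSession(self):
        if self._session is not None:
            return self._session
        uid = self.received_cookies.get(self.COOKIE_NAME)
        session = self.site.sessions.get(uid)
        if session is None:
            # No cookie, or an unknown id: create and remember a new session.
            uid = uuid.uuid4().hex
            session = ToySession(uid)
            self.site.sessions[uid] = session
            self.cookies_to_send[self.COOKIE_NAME] = uid
        self._session = session
        return session


site = ToySite()

first = ToyRequest(site)  # the browser has no cookie yet
session = first.getSession()
session.data["visits"] = 1

# The browser echoes the cookie back on its next request.
second = ToyRequest(site, cookies=first.cookies_to_send)
same_session = second.getSession()
print(same_session is session, same_session.data)
```

Keeping the sessions on the site object rather than on the request is what lets a later request from the same browser find the data stored by an earlier one.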