WSGI

Web Server Gateway Interface

This presentation, with sample source code, may be downloaded here:

http://archimedeanco.com/wsgi-tutorial.tar.gz

© 2009 Chris Rossi

WSGI: Web Server Gateway Interface

The specification

PEP 333 http://www.python.org/dev/peps/pep-0333/

A WSGI application is a callable with the following signature:

def application(environ, start_response)

environ is a dictionary of CGI-style variables which contain data about the HTTP request, plus a few WSGI-specific keys. Some servers, including the wsgiref reference server, also copy in everything from os.environ.

start_response is a callable which the application calls to set the status code and headers for a response, before returning the body of the response.

The return value for the WSGI application is an iterable object. Each item in the iterable is a chunk of the HTTP response body as a byte string.
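
To make the contract concrete, here is a rough sketch of what a server does with an application: build an environ, supply a start_response, and join the chunks that come back. The call_app helper is a made-up name for illustration, and its start_response ignores the legacy write() callable, so treat it as a test harness rather than a real server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # Minimal application, used only to demonstrate the calling convention
    start_response("200 OK", [("Content-type", "text/plain")])
    return [b"Hello ", b"World!"]


def call_app(app, environ=None):
    """Call a WSGI app the way a server would; return (status, headers, body)."""
    environ = dict(environ or {})
    setup_testing_defaults(environ)  # fill in the required CGI and wsgi.* keys
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    chunks = app(environ, start_response)
    try:
        body = b"".join(chunks)
    finally:
        # The spec asks servers to call close() on the iterable if it has one
        if hasattr(chunks, "close"):
            chunks.close()
    return captured["status"], captured["headers"], body


status, headers, body = call_app(application)
print(status, headers, body)
```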

environ

The environ dict has pretty much everything you would expect if you were using CGI.

In fact, the canonical method for parsing form parameters involves using the cgi module in the standard library.

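All HTTP headers sent with the request are available as variables named HTTP_{header_name}, with the header name upper-cased and hyphens replaced by underscores (for example, HTTP_USER_AGENT). The exceptions are Content-Type and Content-Length, which appear as CONTENT_TYPE and CONTENT_LENGTH.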

Some variables specific to WSGI are present as well and are named wsgi.*.

The short, exhaustive list of variables is available in the excellent specification document.
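
As a small illustration of poking around in environ, the sketch below fills a dict with the testing defaults from wsgiref and then pulls the request headers back out. The request_headers helper and the example-client/1.0 user agent are invented for the example.

```python
from wsgiref.util import setup_testing_defaults


def request_headers(environ):
    """Collect HTTP request headers from a WSGI environ as a plain dict."""
    headers = {}
    for key, value in environ.items():
        if key.startswith("HTTP_"):
            # HTTP_USER_AGENT -> User-Agent
            name = key[5:].replace("_", "-").title()
            headers[name] = value
    # These two headers are stored without the HTTP_ prefix
    if environ.get("CONTENT_TYPE"):
        headers["Content-Type"] = environ["CONTENT_TYPE"]
    if environ.get("CONTENT_LENGTH"):
        headers["Content-Length"] = environ["CONTENT_LENGTH"]
    return headers


environ = {"HTTP_USER_AGENT": "example-client/1.0", "CONTENT_TYPE": "text/plain"}
setup_testing_defaults(environ)

print(environ["REQUEST_METHOD"], environ["PATH_INFO"])
print(request_headers(environ))
print(sorted(k for k in environ if k.startswith("wsgi.")))
```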

Processing request parameters

The wsgi.input variable is a file-like object which can be read to retrieve the body of an HTTP POST request.

The following code creates an instance of cgi.FieldStorage from which we can get request parameters.

import cgi
form = cgi.FieldStorage(fp=environ['wsgi.input'], 
                        environ=environ)

Because we pass in the environ, the above code will also parse any query string for our request, integrating GET and POST parameters into a single data structure.

The other way to process request parameters, one that is rapidly becoming canonical, is to use webob.Request.

More on webob later.
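
Where cgi.FieldStorage is not available (the cgi module was removed from the standard library in Python 3.13), a similar merge of GET and POST parameters can be approximated with urllib.parse. This is only a sketch: get_params is a hypothetical helper, it assumes UTF-8 form data, and it handles URL-encoded forms only, not multipart file uploads.

```python
import io
from urllib.parse import parse_qs


def get_params(environ):
    """Merge query-string and URL-encoded form parameters into one dict.

    Each key maps to a list of values, as returned by parse_qs.
    """
    params = parse_qs(environ.get("QUERY_STRING", ""))

    content_type = environ.get("CONTENT_TYPE", "")
    if content_type.startswith("application/x-www-form-urlencoded"):
        # Read only as many bytes as the client declared
        try:
            length = int(environ.get("CONTENT_LENGTH") or 0)
        except ValueError:
            length = 0
        body = environ["wsgi.input"].read(length).decode("utf-8")
        for key, values in parse_qs(body).items():
            params.setdefault(key, []).extend(values)
    return params


# A hand-built environ standing in for a real POST request
post_body = b"color=fuchsia&size=large"
environ = {
    "REQUEST_METHOD": "POST",
    "QUERY_STRING": "page=2&color=teal",
    "CONTENT_TYPE": "application/x-www-form-urlencoded",
    "CONTENT_LENGTH": str(len(post_body)),
    "wsgi.input": io.BytesIO(post_body),
}
params = get_params(environ)
print(params)
```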

start_response

start_response is a callable which has the following signature:

start_response(status, response_headers, exc_info=None)

start_response is widely regarded as being quite weird and will likely not be present in version 2.0 of the WSGI spec, if and when there ever is one.

status is a string which contains the numeric HTTP status code of the response followed by the standard HTTP message for that status code. This is straight up HTTP.
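
If you would rather not type the reason phrases by hand, one option on Python 3 is to build the string from http.HTTPStatus. The status_line helper below is a made-up name, shown as a sketch.

```python
from http import HTTPStatus


def status_line(code):
    """Return a WSGI status string such as '404 Not Found'."""
    status = HTTPStatus(code)
    return "%d %s" % (status.value, status.phrase)


for code in (200, 404, 500):
    print(status_line(code))
```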

A few common statuses look like this:

"200 OK"
"404 Not Found"
"500 Internal Server Error"

start_response: response_headers

response_headers is a list of tuples where each tuple is the name of an HTTP response header followed by its value.

For example:

response_headers = [
    ("Content-type", "text/html"),
    ("Content-length", str(len(body)) ),
    ]
    
start_response("200 OK", response_headers)
return [body,]

The response headers may be modified by the server or any middleware that is downstream of your response.

At a bare minimum you should set the Content-type header, although the server will probably substitute text/plain if you fail to set this.

It is also good practice to set the Content-length header. This allows the client to reuse the same socket connection for another request and to show the user download progress information.
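
If juggling a raw list of tuples feels clumsy, the standard library's wsgiref.headers.Headers class offers a dict-like wrapper around one. A minimal sketch, assuming a UTF-8 HTML body:

```python
from wsgiref.headers import Headers


def application(environ, start_response):
    body = "<h1>Hello World!</h1>".encode("utf-8")

    # Headers wraps a list of (name, value) tuples in a dict-like interface
    headers = Headers([])
    headers["Content-Type"] = "text/html; charset=utf-8"
    # Content-Length counts bytes, so measure the encoded body
    headers["Content-Length"] = str(len(body))

    # items() gives back the plain list of tuples that start_response expects
    start_response("200 OK", headers.items())
    return [body]
```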

start_response: exc_info

The exc_info argument is optional and may be used to communicate traceback data to downstream components.

If used, exc_info must be a tuple of the form returned by sys.exc_info().

If your application catches an exception and generates an HTTP error response, it is a good idea to call sys.exc_info() and pass the result here.

The server or other downstream components can potentially use this to provide error logging or pretty HTML stack traces.

I'm not sure how widely this is used in the wild. Set it anyway.
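
Here is a sketch of that pattern. The risky_view function is a stand-in for whatever application logic might fail; the try/finally that clears exc_info follows the advice in PEP 333 about avoiding reference cycles through the traceback.

```python
import sys


def risky_view(environ):
    # Hypothetical application logic that fails
    raise ValueError("something went wrong")


def application(environ, start_response):
    try:
        body = risky_view(environ)
        start_response("200 OK", [("Content-type", "text/plain")])
        return [body]
    except Exception:
        exc_info = sys.exc_info()
        try:
            # Passing exc_info lets the server replace headers that were set
            # earlier, or re-raise if part of the response was already sent
            start_response(
                "500 Internal Server Error",
                [("Content-type", "text/plain")],
                exc_info,
            )
        finally:
            # Drop the traceback reference to avoid a reference cycle
            exc_info = None
        return [b"Internal Server Error"]
```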

Example 1: Hello World!

The following is about as simple a working WSGI app as you can write.

def application(environ, start_response):
    start_response("200 OK", [("Content-type", "text/plain")])
    # The body must be a byte string (bytes on Python 3)
    return [b"Hello World!"]

That's cool, but how do we run it?

There are several server components out there that are able to run WSGI apps, but for simple testing purposes we can just use the reference implementation included in Python's standard library.

if __name__ == '__main__':
    from wsgiref.simple_server import make_server
    server = make_server('localhost', 8080, application)
    server.serve_forever()

Voila.

Example 2: Using a generator for the response body

When serving an HTML page or similar, the most common approach is probably to render a template to a string and then return a one-item list which contains that string.

body = template.render(context=context, view=view)
return [body,]

What if you are delivering a large payload and don't want to store the whole thing in memory at once? Use an iterable!

An easy way to make an iterator in Python is to use a generator. Let's say we have an application that serves static files. We could write a generator like this that serves files one chunk at a time, so we only have to store BLOCK_SIZE bytes of the file in memory at any given moment.

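BLOCK_SIZE = 4096  # bytes per chunk; the value is an arbitrary choice

def send_file(file_path, size):
    # size is not needed for reading; it is kept to match the caller below
    # Open in binary mode so that each chunk is a byte string
    with open(file_path, "rb") as f:
        block = f.read(BLOCK_SIZE)
        while block:
            yield block
            block = f.read(BLOCK_SIZE)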

The part of our WSGI application that uses the generator might look like this:

size = os.path.getsize(file_path)
headers = [
    ("Content-type", mimetype),
    ("Content-length", str(size)),
]
start_response("200 OK", headers)
return send_file(file_path, size)
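
Putting the two fragments together, a complete (if simplistic) application might look like the sketch below. The make_file_app name, the 4096-byte block size, and the use of mimetypes.guess_type to pick the content type are all assumptions made for the example; the sample source may do this differently.

```python
import mimetypes
import os
import tempfile

BLOCK_SIZE = 4096  # assumed chunk size


def send_file(file_path):
    with open(file_path, "rb") as f:
        block = f.read(BLOCK_SIZE)
        while block:
            yield block
            block = f.read(BLOCK_SIZE)


def make_file_app(file_path):
    """Return a WSGI app that always serves the one file at file_path."""
    def application(environ, start_response):
        mimetype = mimetypes.guess_type(file_path)[0] or "application/octet-stream"
        size = os.path.getsize(file_path)
        start_response("200 OK", [
            ("Content-type", mimetype),
            ("Content-length", str(size)),
        ])
        return send_file(file_path)
    return application


# Try it on a throwaway file larger than one block
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    tmp.write(b"x" * 10000)

app = make_file_app(tmp.name)
chunks = list(app({}, lambda status, headers, exc_info=None: None))
print([len(c) for c in chunks])
os.remove(tmp.name)
```

The body arrives in pieces no larger than BLOCK_SIZE, so the whole file is never held in memory by the application.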

WSGI Filters (aka Middleware)

Because the WSGI spec is so compact and just specifies a signature for a callable, it is easy to conceive of and implement WSGI applications which call other WSGI applications, modifying the request on the way in, the response on the way out, or both.

Such applications are often called middleware applications or filters. Filters can be chained together and the resulting chain is generally referred to as a pipeline.

The generalized form of a filter might look something like this:

class Filter(object):
    def __init__(self, application):
        self.application = application
        
    def __call__(self, environ, start_response):
        # Do something here to modify request
        pass
        
        # Call the wrapped application
        app_iter = self.application(environ, 
                                    self._sr_callback(start_response))
        
        # Do something to modify the response body
        pass
        
        # Return modified response
        return app_iter
        
    def _sr_callback(self, start_response):
        def callback(status, headers, exc_info=None):
            # Do something to modify the response status or headers
            pass
        
            # Call upstream start_response and pass back the write()
            # callable that it returns
            return start_response(status, headers, exc_info)
        return callback
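
For a concrete instance of the pattern, here is a sketch of a filter that appends one header to every response. AddHeader and the X-Inner / X-Outer header names are invented for illustration. Nesting two of them shows how a pipeline forms.

```python
class AddHeader(object):
    """Middleware that appends one header to every response."""

    def __init__(self, application, name, value):
        self.application = application
        self.header = (name, value)

    def __call__(self, environ, start_response):
        def callback(status, headers, exc_info=None):
            # Modify the headers, then hand off to the upstream start_response
            return start_response(status, list(headers) + [self.header], exc_info)

        return self.application(environ, callback)


def hello(environ, start_response):
    start_response("200 OK", [("Content-type", "text/plain")])
    return [b"Hello World!"]


# Filters nest, which is what makes a pipeline
pipeline = AddHeader(AddHeader(hello, "X-Inner", "1"), "X-Outer", "2")

seen = {}


def fake_start_response(status, headers, exc_info=None):
    seen["status"], seen["headers"] = status, headers


body = b"".join(pipeline({}, fake_start_response))
print(seen["headers"])
```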

Example 3: Serve static files from middleware

We can convert our standalone file_server application into a middleware application that calls a dynamic application if a static file cannot be found for the request.

Our constructor changes from this:

def __init__(self, path):
    """ path is directory where static files are stored
    """
    self.path = path

To this:

def __init__(self, application, path):
    """ path is directory where static files are stored
    """
    self.path = path
    self.application = application

Instead of returning a 404 if we don't find a file, we delegate to the wrapped application. So this:

# If file does not exist, return 404
if not os.path.exists(file_path):
    return self._not_found(start_response)

Becomes this:

# If file does not exist, delegate to wrapped application
if not os.path.isfile(file_path):
    return self.application(environ, start_response)
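
Assembled into one piece, the middleware version might look roughly like this. It is a sketch, not the FileServer from the sample source: the path handling, the check that refuses paths outside the static directory, and the dynamic_app stand-in are all assumptions.

```python
import mimetypes
import os
import tempfile


class FileServer(object):
    def __init__(self, application, path):
        self.application = application
        self.path = os.path.abspath(path)

    def __call__(self, environ, start_response):
        relative = environ.get("PATH_INFO", "").lstrip("/")
        file_path = os.path.abspath(os.path.join(self.path, relative))

        # Refuse paths that escape the static directory, and fall through
        # to the wrapped application when there is no such file
        inside = file_path.startswith(self.path + os.sep)
        if not inside or not os.path.isfile(file_path):
            return self.application(environ, start_response)

        mimetype = mimetypes.guess_type(file_path)[0] or "application/octet-stream"
        with open(file_path, "rb") as f:
            body = f.read()  # small files only; see the generator example
        start_response("200 OK", [
            ("Content-type", mimetype),
            ("Content-length", str(len(body))),
        ])
        return [body]


def dynamic_app(environ, start_response):
    start_response("200 OK", [("Content-type", "text/plain")])
    return [b"dynamic"]


static_dir = tempfile.mkdtemp()
with open(os.path.join(static_dir, "logo.txt"), "wb") as f:
    f.write(b"static")

app = FileServer(dynamic_app, static_dir)
ignore = lambda status, headers, exc_info=None: None
print(app({"PATH_INFO": "/logo.txt"}, ignore))
print(app({"PATH_INFO": "/missing"}, ignore))
```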

Webob

WSGI was intentionally designed to be minimal for ease of implementation for web servers and for ease of adoption by application programmers.

No one likes to manipulate environ directly, nor does anyone really like using start_response very much. The API provided by WSGI, while easy to implement, is not particularly satisfying to use semantically.

Nearly every application or framework I've seen based on WSGI has invented some sort of request and response objects to wrap manipulation of environ and start_response in a more semantic, less error-prone shell.

Webob has emerged as the canonical implementation for request and response objects. Webob makes WSGI easier and more satisfying to use.

Webob has thorough documentation on the web here:

http://pythonpaste.org/webob/

Webob: Request and Response objects

The webob.Request class wraps environ and lets us do all sorts of useful things.

from webob import Request
request = Request(environ)
path_info = request.path_info
form_param = request.params["form_param"]

The webob.Response class represents an HTTP response in a semantic way.

webob.Response objects are WSGI callables. The response object handles calling start_response for us.

from webob import Response
response = Response("Hello World!", "200 OK", [
    ("Content-type", "text/plain"),
    ])
return response(environ, start_response)

Example 4: Convert to Webob

We can convert our previous example to use Webob request and response objects. First, in our FileServer filter we can encapsulate the conversion from the WSGI API to Webob in one place and then use just Webob throughout the rest of the application:

def __call__(self, environ, start_response):
    """ WSGI entry point
    """
    request = Request(environ)
    response = self._service(request)
    return response(environ, start_response)
    
def _service(self, request):
    assert isinstance(request, Request)
    
    path_info = request.path_info
    if not path_info:
        return request.get_response(self.application)

Decorators are another nice way to encapsulate your WSGI entry points to use Webob. This is what I've used in the printenv application:

def wsgi_app(a):
    def wrapper(environ, start_response):
        request = Request(environ)
        response = a(request)
        return response(environ, start_response)
    return wrapper

@wsgi_app
def app(request):
    body = html % (dict_to_string(request.params), 
                   dict_to_string(request.environ))
    
    return Response(body=body, 
                    headerlist=[
                        ("Content-type", "text/html"),
                        ("Content-length", str(len(body))),
                    ])

PasteDeploy

PasteDeploy provides some nice glue for composing your WSGI components into a functioning web application.

PasteDeploy: Factories

WSGI contains only a specification for calling WSGI applications. It does not contain a specification for constructing applications.

In order to be able to configure WSGI applications declaratively, PasteDeploy needs a specification for constructing applications. To this end, PasteDeploy has defined signatures for factory methods. Implementing a factory interface means your application can be constructed by PasteDeploy.

The signature for an application factory in PasteDeploy is as follows:

def app_factory(global_config, **local_conf)

The signature for a filter application factory in PasteDeploy is:

def filter_app_factory(app, global_conf, **local_conf)

In both of these signatures, the global configuration argument (spelled global_config or global_conf above; PasteDeploy passes it positionally, so either name works) is a dictionary containing configuration parameters declared globally in the PasteDeploy configuration for your site. local_conf holds keyword arguments containing the configuration parameters set for your component in PasteDeploy.
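
To show the two signatures in action without involving PasteDeploy itself, the sketch below defines one factory of each kind and calls them by hand. The greeting and powered_by settings are hypothetical, as is the debug entry in the global configuration.

```python
def app_factory(global_conf, **local_conf):
    """Application factory: build an app from configuration."""
    greeting = local_conf.get("greeting", "Hello World!")

    def application(environ, start_response):
        body = greeting.encode("utf-8")
        start_response("200 OK", [
            ("Content-type", "text/plain"),
            ("Content-length", str(len(body))),
        ])
        return [body]
    return application


def filter_app_factory(app, global_conf, **local_conf):
    """Filter factory: wrap an existing app using configuration."""
    header = ("X-Powered-By", local_conf.get("powered_by", "WSGI"))

    def filtered(environ, start_response):
        def callback(status, headers, exc_info=None):
            return start_response(status, list(headers) + [header], exc_info)
        return app(environ, callback)
    return filtered


# Roughly what PasteDeploy does after reading its configuration file:
# values arrive as strings, one keyword argument per setting
global_conf = {"debug": "false"}
app = app_factory(global_conf, greeting="Hi there")
app = filter_app_factory(app, global_conf, powered_by="tutorial")
```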

Example 5: Deploy our sample app with PasteDeploy

A factory is any callable that creates and returns an object. Specifically, in this case, we are concerned with factories that create and return WSGI applications.

All classes are factories. They can be called and they return objects.

In the case of our FileServer class, to make it work with PasteDeploy we only need to change the signature of the __init__ method from this:

def __init__(self, application, path):

To this:

def __init__(self, application, global_conf, path):

path is configured in the PasteDeploy configuration file and is passed in by PasteDeploy here as a keyword argument.

Some care must be taken to make sure configuration parameters declared in the PasteDeploy config match up with keyword arguments in the factory callables.

For the printenv application, we show another way to create an application factory.

Here we use a closure to make our configuration parameters available to our WSGI callable:

def factory(global_config, favorite_color):
    @wsgi_app
    def app(request):
        body = html % (favorite_color,
                       dict_to_string(request.params), 
                       dict_to_string(request.environ))
        
        return Response(body=body, 
                        headerlist=[
                            ("Content-type", "text/html"),
                            ("Content-length", str(len(body))),
                        ])
    return app

In the previous example we wired our filter to our application using Python code. This is what's known, generally, as imperative configuration.

With PasteDeploy we will use declarative config to wire our filter up to our application. We can get rid of application.py and we add helloworld.ini, which is read by PasteDeploy.

Our helloworld.ini looks like this:

[DEFAULT]

[app:main]
use = egg:helloworld#printenv
filter-with = file_server
favorite_color = fuchsia

[filter:file_server]
use = egg:helloworld#file_server
path = /home/chris/proj/wsgi-tutorial/static-files

[server:main]
use = egg:Paste#http
host = 0.0.0.0
port = 8080
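
The egg:helloworld#printenv and egg:helloworld#file_server lines are resolved through setuptools entry points declared by the helloworld package, in the paste.app_factory and paste.filter_app_factory groups. A sketch of what that declaration could look like follows; the module paths on the right-hand side are guesses for illustration, not taken from the sample source.

```python
# Entry points that would let PasteDeploy resolve the "egg:helloworld#..."
# lines in helloworld.ini. The module paths on the right-hand side are
# hypothetical; use wherever the factories actually live.
entry_points = {
    "paste.app_factory": [
        "printenv = helloworld.printenv:factory",
    ],
    "paste.filter_app_factory": [
        "file_server = helloworld.file_server:FileServer",
    ],
}

# In setup.py this dict would be passed along as:
#     setup(name="helloworld", ..., entry_points=entry_points)
```

With the package installed and PasteScript available, running paster serve helloworld.ini should then start the server declared in [server:main].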