  Routes 2 Spec
Added by Mike Orr, last edited by Mike Orr on Apr 29, 2008

Warning

This article describes a development version of Routes. The API is subject to change.

The implementation is in a Mercurial repository. To download it:

hg clone https://www.knowledgetap.com/hg/routes2-dev

Run "nosetests" in the top-level directory to run the tests. It's recommended to do this in a virtualenv with only Nose installed, to prevent Python from importing another version of Routes.

Note: the current implementation does not have Unicode handling yet. So you may encounter errors if you pass Unicode values containing non-ASCII characters. This will be addressed in a "Unicode pass" after the basic routing functionality is complete.

Goals

  • Simpler and more deterministic than Routes 1.
  • "Explicit is better than implicit."
  • Calling url with a route name should generate that route and no other.
  • Eliminate "minimization gotcha" where an earlier route unexpectedly matches a URL due to minimization.
  • Make RESTful routes more straightforward and flexible.

For more info see Why Routes Changed at the bottom of this article.

Vocabulary

A path variable is a variable name embedded in a route path. The extra variables are the dict passed as the third argument to .connect. A variable is either of these, or perhaps a temporary value during calculation.

After a successful match the terminology changes. The match produces a match dict which contains routing variables. Do not use the term routing args except for the WSGI environ key "wsgiorg.routing_args". This prevents user confusion with method arguments.

params are the query parameters attached to a URL.

RouteMap class

The RouteMap class allows users to add routes thus:

m = RouteMap()
m.connect("home", "", dict(controller="main", action="index"))
m.connect("archives", "archives/{year}/{month}/{day}",
    dict(controller="archives", action="view", year=2004),
    requirements=dict(year=r"\d{2,4}", month=r"\d{1,2}"))
m.connect("atom_feed", "feeds/{category}/atom.xml", dict(controller="feeds", action="atom"))
m.connect("article", "article/{section}/{slug}/{page}.html",
          dict(controller="article", action="view"))
m.connect(None, "{controller}/{action}")
m.connect("google", "http://www.google.com/", match=False)
m.connect("image", "/images/{record_id}/{filename}.jpg")

This sets a series of routes, which will be tried in order for matching. Features are as follows:

  • The .connect method creates a route used for both matching and generation.
  • If match is specified and false, the route will be used only for generation.
  • The first arg is the route name. Pass None to create an unnamed route. (In Routes 1 you omitted the first arg to create an unnamed route.)
  • The second arg is the route path. The path starts at the application root, and should not have an initial slash. (If it does have an initial slash, it's ignored.) This arg is required.
  • The third arg is an optional dict of extra variables to be used in matching and generation.
  • Any path containing a colon with no slash preceding it (e.g., "http://example.com/") is an "external route". External routes are allowed only if match=False; otherwise you'll get a ValueError.
  • The path may include variable placeholders like this: "{year}". These will be returned in the match dict. The value may not contain a slash. Matching is greedy.
  • The path may include wildcard placeholders like this: "{*url}". These work like variable placeholders except the value may contain slashes. Matching is greedy. (Or should it be non-greedy?)
  • Arg if_match is a dict of variable name : regex pattern. Each variable's value must match its regular expression or the URL will not match the route. ("^" and "$" will be added around the pattern if not present, to make it match the entire value rather than a substring.) This arg is not allowed if match is false.
  • Arg if_function is a function which will be called to see if the route matches. The function should have the following signature: func(environ, routing_args) => bool. environ is the WSGI environment or an empty dict. routing_args is the dict of variables Routes would have returned if if_function had not been specified. The function can modify this dict in place to influence what routing args are returned. This arg is not allowed if match is false.
  • Arg if_subdomain: XXXMO Needs documentation. This arg is not allowed if match is false.
  • Arg if_method is a list of HTTP methods. The route matches the URL only if environ["REQUEST_METHOD"] is one of these. If environ is empty, the route won't match. This arg is not allowed if match is false.
  • Arg generate_filter is a function taking a dict of generator arguments (keyword args), and returning the same or another dict. This is used during generation.
  • RouteMap has a constructor arg required_args, which is an iterable of variable names. If present, these variables must be defined for every non-external route, either in the route path or in the extra variables. Pylons will want to set this to ["controller", "action"]. TurboGears2 will want ["controller"].
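
The placeholder and if_match rules in the list above can be sketched as a small path compiler. This is only an illustration under stated assumptions, not the Routes 2 implementation: compile_path and match_path are hypothetical names, matching is assumed to be greedy, and values matched in the path are assumed to override extra variables of the same name.

```python
import re

# Hypothetical helpers for illustration; not part of the Routes 2 API.
_PLACEHOLDER = re.compile(r"\{(\*?)(\w+)\}")


def _strip_anchors(pattern):
    """Drop a leading "^" and an unescaped trailing "$" from a pattern."""
    if pattern.startswith("^"):
        pattern = pattern[1:]
    if pattern.endswith("$") and not pattern.endswith("\\$"):
        pattern = pattern[:-1]
    return pattern


def compile_path(path, if_match=None):
    """Compile a route path such as "archives/{year}/{*rest}" to a regex."""
    if_match = if_match or {}
    path = path.lstrip("/")              # an initial slash is ignored
    parts = []
    pos = 0
    for m in _PLACEHOLDER.finditer(path):
        parts.append(re.escape(path[pos:m.start()]))   # literal text
        wildcard, name = m.groups()
        if name in if_match:
            pattern = _strip_anchors(if_match[name])
        elif wildcard:
            pattern = ".+"               # wildcard: slashes allowed
        else:
            pattern = "[^/]+"            # plain variable: no slashes
        parts.append("(?P<%s>%s)" % (name, pattern))
        pos = m.end()
    parts.append(re.escape(path[pos:]))
    return re.compile("".join(parts))


def match_path(regex, url_path, extra=None):
    """Return a match dict, or None if the URL does not match."""
    m = regex.fullmatch(url_path.lstrip("/"))   # whole URL must match
    if m is None:
        return None
    result = dict(extra or {})
    result.update(m.groupdict())         # assumption: path values win
    return result


archives = compile_path("archives/{year}/{month}/{day}",
                        {"year": r"\d{2,4}", "month": r"\d{1,2}"})
found = match_path(archives, "archives/2007/12/27",
                   {"controller": "archives", "action": "view", "year": 2004})
```

Because the whole URL is matched with fullmatch and every variable is bounded by literal text, each if_match pattern has to cover the entire value of its variable, which is the effect the "^" and "$" rule describes.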

url object

The url object is an instance of URLGenerator. The Routes middleware creates this for a specific web request. Its usage is thus:

url["home"]()
url["archives"](year=2007, month=12, day=27)
url["google"]()
url["google"](q="spooks")

These return a URL which will match the named route. The application's path prefix is prepended unless the URL is external. Keyword args specify values for the path variables in the route. If a path variable does not have a keyword arg, a default value is taken from the extra variables defined for the route, or KeyError is raised. Leftover keyword args and extra variables that are not used in the path become query parameters, with keyword args overriding. Referring to a nonexistent named route will also raise KeyError.

url.?(**variables)

This generates a route via variable matching. It's mainly used with unnamed routes. (Should it be restricted to unnamed routes?) XXX Need method name: .generate, .unnamed, .by_variables, or what?

url.current(*path_components, **variables)

This generates the URL of the current page including query parameters. Query args may be used to add or modify the query params returned. A keyword arg with a None value excludes that param from the return value even if it's in the current URL. Special arg _fragment generates an HTML fragment; e.g., _fragment=foo generates "#foo".

Any positional args will be appended to the URL using urljoin. ".." and "." are allowed but will be left as is for the browser to interpret.
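
The query-parameter handling of url.current can be sketched like this. It is a sketch under assumptions, not the real method: current_url is a hypothetical function that takes the current path and query string explicitly, it ignores positional path components and fragments, and it keeps only the last value of a repeated parameter.

```python
from urllib.parse import parse_qsl, urlencode


def current_url(path, query_string, **params):
    """Rebuild the current URL, adding, changing, or dropping query params."""
    merged = dict(parse_qsl(query_string))   # params of the current request
    for name, value in params.items():
        if value is None:
            merged.pop(name, None)           # None removes the param
        else:
            merged[name] = str(value)        # otherwise add or override
    query = urlencode(merged)
    return path + ("?" + query if query else "")


next_page = current_url("/faq", "page=2&sort=date", page=3, sort=None)
```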

The following syntax does not look up a route, but merely formats the arguments into a URL.

url(component, *additional_components, **query_params)

Example:

url("faq/a", page=2)
url("/faq", "a")

The positional args are urljoin'd together and the query parameters added. If the URL is not external and does not begin with a slash, the application prefix is prepended.
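
A rough sketch of this non-route form, assuming the standard library's urljoin is the joining function. plain_url, is_external, and the prefix keyword are hypothetical names, and the external test reuses the "colon with no slash preceding it" rule from the RouteMap section.

```python
from functools import reduce
from urllib.parse import urlencode, urljoin


def is_external(url):
    """True if the URL has a colon with no slash before it."""
    colon = url.find(":")
    return colon != -1 and "/" not in url[:colon]


def plain_url(component, *additional, prefix="", **query_params):
    """Join the components and append query params, without route lookup."""
    url = reduce(urljoin, additional, component)
    if not is_external(url) and not url.startswith("/"):
        url = prefix.rstrip("/") + "/" + url     # relative: add the prefix
    if query_params:
        url += "?" + urlencode(sorted(query_params.items()))
    return url


faq_page = plain_url("faq/a", page=2, prefix="/myapp")
```

One thing to keep in mind with the standard urljoin: a base without a trailing slash has its last segment replaced, so joining "/faq" and "a" gives "/a", while joining "/faq/" and "a" gives "/faq/a".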

All numeric values appearing in arguments or in variables stored in the route are implicitly converted to strings when generating the URL. This is currently done via str(), which may not be Unicode-safe in some cases. Unicode issues throughout Routes will be addressed after the basic functionality is complete.

Any of these syntaxes can have a keyword arg "anchor", which is converted to a URL fragment ("#anchor").

If the route was defined with a generate filter, it is called to preprocess the keyword args before generating the route. This can be used, e.g., to expand an object to actual route variables based on its attributes. Then if the object changes its interface, you only have to change the filter once rather than changing all your url() calls.
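
As an illustration of the idea, a generate filter might look like the sketch below. The Article type, the "article" keyword, and the article_filter name are all hypothetical; only the dict-in, dict-out shape comes from the description of generate_filter.

```python
from collections import namedtuple

# Hypothetical application object used only for this example.
Article = namedtuple("Article", ["section", "slug"])


def article_filter(kwargs):
    """Expand an "article" keyword arg into the route's path variables."""
    kwargs = dict(kwargs)                    # do not mutate the caller's dict
    article = kwargs.pop("article", None)
    if article is not None:
        # Explicit keyword args still win over the object's attributes.
        kwargs.setdefault("section", article.section)
        kwargs.setdefault("slug", article.slug)
    return kwargs


expanded = article_filter({"article": Article("news", "hello"), "page": 1})
```

With a filter like this registered on the "article" route, callers could pass the object itself plus a page number, and only the filter would need to change if Article's attributes were renamed.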

redirect_to takes the same arguments and attributes as url, but causes a redirect to the URL. The status may be set with the _status argument, an int between 300-399, default 301 ("Moved Permanently"). (XXX Does this belong in Routes or in Pylons?)

Fix the url_for Unicode, spaces, and slashes bugs. It converts spaces to "+", which is correct in the query string but not in the path portion of the URL. It should convert spaces in the path portion to "%20". The original bug is either in urllib, or url_for is calling the wrong urllib function (quote_plus instead of quote). This must be fixed.
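
The difference between the two quoting functions is easy to see in isolation. The sketch below uses the Python 3 location of these functions (urllib.parse); the sample path and query values are made up.

```python
from urllib.parse import quote, quote_plus, urlencode

# Path portion: spaces must become "%20"; quote() leaves "/" alone by default.
path_part = quote("my files/release notes.txt")

# quote_plus() turns spaces into "+", which is only right in a query string.
wrong_for_path = quote_plus("release notes.txt")

# urlencode() applies quote_plus() to query values, which is correct there.
query_part = urlencode({"q": "release notes"})

full_url = "/" + path_part + "?" + query_part
```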

Minimization and implicit variables are eliminated. These caused too many problems in Routes 1.

Redirect routes

The .redirect method defines a redirect route:

m.redirect(None, "faq",  "/static/faq/index")
m.redirect(None, "outsourced_dept", "http://outsourcing-r-us.example.com/acme/")
m.redirect(None, "favicon.ico", "/images/temp-icon.png", 302)
m.redirect(None, "faq/{section}", "/static/faq/{section}.html")
m.redirect(None, "note/{letter}", "/musical-notes/{letter}",
    _if_match={"letter": r"[A-G]"})

Arguments are as follows:

  • The first positional arg is the route name or None.
  • The second positional arg is the route path. It's required.
  • The third positional arg is the destination URL. If not external, the application prefix will be added. This arg may also be a function returning the destination URL. The function signature should be func(environ, routing_args) => url. (This is similar but not identical to the _if_function signature.)
  • The fourth positional arg is optional and is the HTTP status. It must be an int between 300-399, defaulting to 301 ("Moved Permanently").
  • Keyword arg message may be an HTML error message.
  • if_match etc. work as in regular routes.
  • Additional keyword args may be specified only if the destination arg is a function. These are passed to the function.

Implementation

The Route injects _status, _location, and _message variables into the routing args, containing the HTTP status (int), the URL to redirect to (absolute), and the message arg.
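
A sketch of how those variables could be computed after a redirect route matches. redirect_variables is a hypothetical helper, not the Route class; it assumes destination placeholders are filled from the matched variables, and it stops at prepending the application prefix rather than building a fully absolute URL with scheme and host.

```python
def redirect_variables(destination, match_vars, environ=None, prefix="",
                       status=301, message=""):
    """Return the match dict for a redirect route, with the _ keys injected."""
    if not 300 <= status <= 399:
        raise ValueError("redirect status must be between 300 and 399")
    if callable(destination):
        # Function destinations receive (environ, routing_args).
        location = destination(environ or {}, match_vars)
    else:
        # Assumption: {name} placeholders are filled from the match.
        location = destination.format(**match_vars)
    colon = location.find(":")
    external = colon != -1 and "/" not in location[:colon]
    if not external:
        location = prefix.rstrip("/") + "/" + location.lstrip("/")
    result = dict(match_vars)
    result.update(_status=status, _location=location, _message=message)
    return result


faq = redirect_variables("/static/faq/{section}.html", {"section": "billing"},
                         prefix="/myapp")
```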

Failure routes

Failure routes signal a 4xx or 5xx status.

m.fail("something", 404)
m.fail("something2", 500, "message")

Args are as follows:

  • The first arg is required and is the route path.
  • The second arg is required and is the HTTP status. It must be an int between 400-599.
  • The third arg is an optional error message. The middleware/application may incorporate this into the error page as an HTML fragment, or it may ignore it. The default message is "". This is only for simple messages like "Under maintenance". More complicated behavior should be handled in the application.

Implementation

These routes set _status and _message in the match dict, similar to redirect routes.

Middleware

The Routes middleware will test an incoming URL against each route in turn. If the URL matches the route, a dict of route variables will be put in the WSGI environment. Routes does not know what the variables mean, but some frameworks require certain variables to be present. For instance, Pylons requires "controller" and "action". It is presumed RouteMap and each route will have a match method to facilitate this, and one or more generate methods.

The existing Middleware class just needs a few patches. It should add two more keys to the environ dict:

routes.url
A URLGenerator instance attuned to the current request.
routes.legacy_url
A LegacyURLGenerator instance attuned to the current request.

If the special key "_status" exists in the match dict, you can either pass "_status", "_location", and "_message" to the application unchanged, or perform the status change yourself and bypass the application. I'm not sure which is better. Maybe there should be a boolean constructor arg to choose. Bypassing the app is more efficient, but we'd have to implement the headers, and it would also prevent the app from logging HTTP errors if it wished to. "_status" is the HTTP status (3xx for redirect routes, 4xx or 5xx for failure routes). "_location" is an absolute URL for redirects. "_message" is an HTML error message, which the middleware/application may choose to display or ignore.
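
The "bypass the application" option could look roughly like the sketch below. respond_to_status is a hypothetical helper, not part of the existing Middleware class; it assumes the message is served as UTF-8 HTML and takes the reason phrase from the standard library's HTTPStatus, which raises ValueError for codes it does not know.

```python
from http import HTTPStatus


def respond_to_status(match, start_response):
    """Answer a redirect or failure route directly.

    Returns a WSGI body iterable, or None if the match dict has no
    "_status" key and the request should go on to the application.
    """
    status = match.get("_status")
    if status is None:
        return None
    body = match.get("_message", "").encode("utf-8")
    headers = [("Content-Type", "text/html; charset=utf-8"),
               ("Content-Length", str(len(body)))]
    if "_location" in match:
        headers.append(("Location", match["_location"]))   # redirects only
    start_response("%d %s" % (status, HTTPStatus(status).phrase), headers)
    return [body]


calls = []
body = respond_to_status({"_status": 404, "_message": "Under maintenance"},
                         lambda status, headers: calls.append((status, headers)))
```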

Pylons Implementation

pylons.url and pylons.legacy_url should be StackedObjectProxies like the other Pylons globals. These should be initialized to environ["routes.url"] and environ["routes.legacy_url"] for each request.

The standard controller template should have: from pylons import url. (Users can change this to legacy_url if they wish.)

Both url and legacy_url should be in the standard template namespace.

url_for is no longer imported into WebHelpers 6, and is no longer needed in myapp/lib/helpers.py, but users can add it if they wish.

Pylons should handle "_status", "_location", and "_message" appropriately.

RESTful services

Reimplement map.resource() from Routes 1 as map.atom_resource(). It should create a set of routes corresponding to the Atom REST format.

Consider using an expandable class to create the routes, so that other resource methods can create different kinds of non-Atom-compliant routes.

Formats

Formats allow a URL to have a suffix (e.g., rss), and the content is served in that format. Routes 1 has a formatting route paired with every resource route. We'll need to duplicate that functionality. The 'formats' arg already exists to specify which formats are allowed.

Logging

map.resource should log all the routes it creates at DEBUG level. The user should be able to log this alone, or log all routes created.

Option: @validate bugfix

Steg has a bugfix for @validate at http://pylonshq.com/pasties/689 which indirectly relates to Routes.

Option: WSGI routes

Status: under consideration for Routes 2.0 or later.

This would allow Routes to dispatch directly to a WSGI application, bypassing the controller (in Pylons). A .connect arg (wsgi_app?) would specify a WSGI app. The dispatching could be done by the Routes middleware, which would choose an alternate middleware path rather than the normal one. The advantage is it would allow supplemental apps to be mounted at any URL without having to go through the formality of a pseudo-controller. However, for Pylons it would also bypass setting the special globals (pylons.request etc) unless this were done in an earlier middleware. This would be a problem if the supplemental app is also Pylons.

Ben Bangert has an alternate implementation in: http://pylonshq.com/pasties/791 . It modifies Pylons instead of Routes. It requires changeset 8394a0ed494b in Routes 2. In it, you define a route with a 'callable' variable pointing to a WSGI app. (In verbal discussion it was agreed to rename the variable to 'wsgi_app'.) The route path should have a path_info wildcard like so: "myawesomewsgiapp/*path_info". This triggers special handling in Routes: the part to the left of the wildcard is appended to SCRIPT_NAME, and the wildcard value is put in PATH_INFO. The 'path_info' routing arg is set as usual.
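
The SCRIPT_NAME and PATH_INFO adjustment described above can be sketched as a small function. mount_environ is a hypothetical name and this is not the code from the pastie; it assumes the matched prefix and the wildcard value are already known, and it returns a copy of the environ rather than changing it in place.

```python
def mount_environ(environ, matched_prefix, path_info):
    """Shift the matched prefix from PATH_INFO onto SCRIPT_NAME."""
    new_environ = dict(environ)
    prefix = matched_prefix.strip("/")
    # The part left of the wildcard is appended to SCRIPT_NAME.
    new_environ["SCRIPT_NAME"] = environ.get("SCRIPT_NAME", "") + "/" + prefix
    # The wildcard value becomes the mounted app's PATH_INFO.
    new_environ["PATH_INFO"] = "/" + path_info.lstrip("/")
    return new_environ


request_environ = {"SCRIPT_NAME": "/myapp",
                   "PATH_INFO": "/myawesomewsgiapp/reports/2008"}
mounted = mount_environ(request_environ, "myawesomewsgiapp", "reports/2008")
```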

A third alternative would be to have a map.wsgi(name, prefix, wsgi_app) method that internally does the same thing as above.

Option: regular expression syntax

Status: Rejected. Could be revived after Routes 2.0 as an alternative syntax.

Some Django users have requested regex syntax in route paths, which is what Django uses. This is a rather verbose and unfriendly syntax. It also presents problems for creating a generation template. Is there anything you can do with regex syntax that you can't do as easily with regular route syntax?

Option: multiple query parameters with the same name

Status: under consideration

Some users need url() to format multiple query parameters with the same name; e.g., "checkbox=1&checkbox=2". One scenario is for keyword args containing list/tuple values to be formatted this way. See the following pylons-discuss thread: http://groups.google.com/group/pylons-discuss/browse_thread/thread/9e46c30f9ef5e8a3
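
For reference, the standard library already supports the list/tuple scenario through urlencode's doseq flag. The sketch below shows only that behaviour; whether url() would adopt it is still the open question above.

```python
from urllib.parse import urlencode

params = {"checkbox": [1, 2]}

# With doseq=True each element of a list or tuple gets its own name=value pair.
repeated = urlencode(params, doseq=True)

# Without it, the list is converted with str() and quoted as a single value.
single = urlencode(params)
```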

Why Routes Changed

From an IRC discussion 2008-04-23 on #pylons:

mcdonc

so general question... why the syntax changes in routes 2? just personal preference or was there some design flaw in the old syntax? (he asks, clueless on so many levels)

sluggo206

Mainly to separate out the different things Routes does. url_for does four different things: generate a named route by name, generate an unnamed route by variable matching (finding the route with the least number of variables not in common between the route definition and the keyword args), retrieving the current URL and adding query params to it, and putting the prefix on a static route. So it has to guess which mode is appropriate, and sometimes it guesses wrong. So Routes 2 has a url object with a distinct method for each mode. With the connect method, the path argument moves position depending on whether there's a name argument before it. That makes it harder to read down a list of routes and see which are the paths. The :variable syntax changed to {variable} so that you no longer need route groups. Minimization and implicit arguments were omitted because they were contributing to the errors (generating the wrong route).
