Abstract
As of version 0.9.6.1, Pylons natively supports large file uploads through some cgi.FieldStorage and tempfile.TemporaryFile magic. It is reasonably efficient, though not fully optimized. Moreover, there is little or no support for restricting file size, which means that a user could upload a single extremely large file through a Pylons app and eat up all of your server's free disk space, with little you can do to stop it. (If I'm wrong, please, please correct me and tear this article in half!) Here's a hacky way to solve the problem. Source code and a sample app are provided. (I hope to have a chance to merge this into Pylons' trunk.)
Let the Game Begin
There are several ways to restrict the size of a file (form field) that is about to be uploaded. I'll first create a sample project called Hello, and then show you how to do the job.
Open a terminal and enter the following commands:
    paster create -t pylons Hello
    cd Hello
    paster controller file
Open Hello/hello/controllers/file.py in your preferred text editor, and modify the code like this:
    import logging

    from hello.lib.base import *

    log = logging.getLogger(__name__)

    class FileController(BaseController):

        def index(self):
            # Return a rendered template
            #   return render('/some/template.mako')
            # or, Return a response
            return 'Hello World'

        def upload(self):
            # The form posts to /file/receive, which maps to receive() below
            return """
    <form action="/file/receive" enctype="multipart/form-data" method="post">
    <h2>Large File Upload Test</h2>
    File: <input name="myfile" type="file" />
    <input type="submit" />
    </form>
    """

        def receive(self):
            return "We are going to return something meaningful here."
Switch back to the terminal and, in the Hello directory, type:
    paster serve --reload development.ini
Visit http://127.0.0.1:5000/file/upload in your web browser, and you should see a web page with a file upload form. Now we have a working environment to continue with.
The Essentials of File Uploading
In the above example, the uploaded file can be accessed in a controller method via request.POST['myfile'].file, which is a file-like object in Python. Here's where request.POST comes from:
    class WSGIRequest(object):
        # ...omitted for brevity...

        def _POST(self):
            return parse_formvars(self.environ, include_get_vars=False)

        def POST(self):
            # ...docstring omitted...
            params = self._POST()
            if self.charset:
                params = UnicodeMultiDict(params, encoding=self.charset,
                                          errors=self.errors,
                                          decode_keys=self.decode_param_names)
            return params
        POST = property(POST, doc=POST.__doc__)

        # ...omitted for brevity...
The UnicodeMultiDict is basically a Python dict wrapper, so the most important part is parse_formvars, which comes from:
    def parse_formvars(environ, include_get_vars=True):
        """Parses the request, returning a MultiDict of form variables.

        (Here's a simplified version for demonstration only!)
        """
        source = environ['wsgi.input']
        if 'paste.parsed_formvars' in environ:
            parsed, check_source = environ['paste.parsed_formvars']
            if check_source == source:
                return parsed
        input = environ['wsgi.input']
        fs = cgi.FieldStorage(fp=input,
                              environ=environ,
                              keep_blank_values=1)
        formvars = MultiDict()
        if isinstance(fs.value, list):
            for name in fs.keys():
                values = fs[name]
                if not isinstance(values, list):
                    values = [values]
                for value in values:
                    if not value.filename:
                        value = value.value
                    formvars.add(name, value)
        environ['paste.parsed_formvars'] = (formvars, source)
        return formvars
environ['wsgi.input'] is a stream representing the HTTP request body. It is provided by the WSGI server and passed as the fp argument (short for "file pointer", a file-like object) to the cgi.FieldStorage class. The input stream is then parsed and cached by a cgi.FieldStorage instance, and the result is wrapped in a MultiDict, which is returned.
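To make the role of environ['wsgi.input'] more concrete, here is a minimal sketch that builds a fake WSGI environ by hand and reads the body the way a parser might. The helper names make_environ and read_body and the sample body are made up for illustration; in a real deployment the WSGI server supplies the environ, and cgi.FieldStorage does the reading.

```python
import io

def make_environ(body, content_type="application/octet-stream"):
    """Build a tiny, hypothetical WSGI environ around an in-memory body."""
    return {
        "REQUEST_METHOD": "POST",
        "CONTENT_TYPE": content_type,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }

def read_body(environ, chunk_size=4):
    """Read the request body in chunks, never past CONTENT_LENGTH."""
    remaining = int(environ.get("CONTENT_LENGTH") or 0)
    stream = environ["wsgi.input"]
    chunks = []
    while remaining > 0:
        data = stream.read(min(chunk_size, remaining))
        if not data:
            break  # client sent less than it declared
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)

environ = make_environ(b"hello upload")
body = read_body(environ)
```

Once the body has been read, the stream is exhausted: a second read returns nothing. That is why the parsed result is cached in environ['paste.parsed_formvars'] instead of being parsed again on every access.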
The Quick & Easy Way, with Drawbacks
We can restrict upload length quickly and easily via the cgi module, which defines a module-level variable named maxlen. As the source code comment says, this is the "maximum input we will accept when REQUEST_METHOD is POST", in bytes. It takes effect immediately on several functions and classes of the cgi module, including FieldStorage, which raises ValueError if the input stream exceeds the limit. The default value is 0, which means "unlimited input".
For our example, request.POST['myfile'] actually returns a cgi.FieldStorage instance; the input stream is parsed the first time request.POST is accessed, and cached in cgi.FieldStorage instances. Let's write some code to set a maximum upload size of 1,000,000 bytes:
    # ...omitted for brevity...

        def receive(self):
            import cgi
            cgi.maxlen = 1000000
            return request.POST['myfile'].filename
Save file.py, visit http://127.0.0.1:5000/file/upload and try to upload a file that is slightly larger than 1MB. You'll get a fancy debug page describing the ValueError exception. You can then write some code to catch the exception and do whatever you like.
However, this scheme has some serious drawbacks:
- The limit applies to the entire POST body, not to a single field.
- The effect of the restriction is global (for every request).
- Modifying cgi.maxlen is not thread-safe.
Obviously, we need a better solution.
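The first drawback follows from how such a limit is checked: it is compared with the declared length of the whole request, not with any one field. The function below is only a sketch of that idea, not the cgi module's actual code; the name body_exceeds_limit and the sample environ dicts are hypothetical.

```python
def body_exceeds_limit(environ, maxlen):
    """Return True if the declared body size is over maxlen (0 = unlimited)."""
    try:
        declared = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        declared = 0  # unparsable header: treat as unknown, do not reject
    return maxlen > 0 and declared > maxlen

# Hypothetical requests: one small form, one upload of about 1.5 MB
small = {"CONTENT_LENGTH": "2048"}
large = {"CONTENT_LENGTH": "1500000"}
```

With a check like this, a form that carries one 900KB file plus a few hundred kilobytes of other fields is rejected by a 1,000,000-byte limit even though no single field is over it.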
Clean up the Barriers
Before we move on, there are problems to solve. Keep the above code unchanged (so the size limit stays in place) and try to upload a REALLY large file, say, 200MB. After you click the "Submit" button, the hard drive starts to churn, the CPU fan spins up, and after a minute or two the fancy debug page pops up. If you upload a 400MB file, it takes twice as long. What happened? It seems that the WHOLE file was uploaded before the exception was raised. What happened to the 1MB size limit?
The problem comes from the paste.cascade.Cascade middleware, which is set up in Hello/hello/config/middleware.py:
    from paste.cascade import Cascade
    # ...omitted for brevity...

    def make_app(global_conf, full_stack=True, **app_conf):
        # ...omitted for brevity...

        # Static files
        javascripts_app = StaticJavascripts()
        static_app = StaticURLParser(config['pylons.paths']['static_files'])
        app = Cascade([static_app, javascripts_app, app])
        return app
The Cascade middleware copies the whole input stream into a temporary file before doing the rest of its work. Therefore, even if we set a file size restriction, a user could still fill up the server's hard drive by uploading an extremely large file, because the whole file is cached on disk before we can stop it. This is a problem.
Since Cascade is a general-purpose middleware, it needs to cache the input stream so that each app in the cascade can read it again. Luckily, StaticURLParser and StaticJavascripts aren't interested in the request body, so I created a substitute named DirectCascade, which is a subclass of Cascade that doesn't copy the input stream. I put it in Hello/hello/lib/fieldmaxlen.py:
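The effect of that copy can be sketched in a few lines. This is a simplified illustration under my own assumptions, not Paste's actual code; spool_input is a hypothetical helper. The point is that every byte of the body reaches the disk before any application, and therefore any size check inside it, gets to run.

```python
import io
import tempfile

def spool_input(environ, chunk_size=8192):
    """Copy all of wsgi.input into a temp file and swap it into environ.

    Returns the number of bytes written to disk.
    """
    length = int(environ.get("CONTENT_LENGTH") or 0)
    source = environ["wsgi.input"]
    spool = tempfile.TemporaryFile()
    copied = 0
    while copied < length:
        data = source.read(min(chunk_size, length - copied))
        if not data:
            break
        spool.write(data)
        copied += len(data)
    spool.seek(0)  # rewind so the next reader starts at the beginning
    environ["wsgi.input"] = spool
    return copied

payload = b"x" * 50000
environ = {"CONTENT_LENGTH": str(len(payload)),
           "wsgi.input": io.BytesIO(payload)}
copied = spool_input(environ)
```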
    from paste.cascade import Cascade

    class DirectCascade(Cascade):
        """Cascade-like middleware which doesn't copy wsgi.input.

        When the app handles large file uploads, this will save a considerable
        amount of time and resources.

        (Code mostly pilfered from *paste.cascade.Cascade*)
        """

        def __call__(self, environ, start_response):
            failed = []

            def repl_start_response(status, headers, exc_info=None):
                code = int(status.split(None, 1)[0])
                if code in self.catch_codes:
                    failed.append(None)
                    return _consuming_writer
                return start_response(status, headers, exc_info)

            def _consuming_writer(s):
                pass

            for app in self.apps[:-1]:
                environ_copy = environ.copy()
                failed = []
                try:
                    v = app(environ_copy, repl_start_response)
                    if not failed:
                        return v
                    else:
                        if hasattr(v, 'close'):
                            # Exhaust the iterator first:
                            list(v)
                            # then close:
                            v.close()
                except self.catch_exceptions, e:
                    pass
            return self.apps[-1](environ, start_response)
And the code of middleware.py becomes:
    #from paste.cascade import Cascade
    from hello.lib.fieldmaxlen import DirectCascade
    # ...omitted for brevity...

    def make_app(global_conf, full_stack=True, **app_conf):
        # ...omitted for brevity...

        # Static files
        javascripts_app = StaticJavascripts()
        static_app = StaticURLParser(config['pylons.paths']['static_files'])
        #app = Cascade([static_app, javascripts_app, app])
        app = DirectCascade([static_app, javascripts_app, app])
        return app
Now the barriers are cleaned up, so let's do something really cool.
Create a series of restricted objects
After spending some time investigating the Pylons source code, I found that the best place to restrict file upload size is the cgi.FieldStorage class. So I created the first version of RestrLenFieldStorage, which is a subclass of cgi.FieldStorage:
    import os
    import cgi
    # ...other imports omitted for brevity...

    maxlendict_key = 'rockallite.maxlen_dict'

    class FieldTooLongError(Exception):
        pass

    class RestrLenFieldStorage(cgi.FieldStorage):
        """FieldStorage-like class with field length restriction."""

        def __init__(self, fp=None, headers=None, outerboundary="",
                     environ=os.environ, keep_blank_values=0, strict_parsing=0):
            if headers and 'content-disposition' in headers:
                cdisp, pdict = cgi.parse_header(headers['content-disposition'])
                if 'name' in pdict:
                    name = pdict['name']
                    maxlen_dict = environ.get(maxlendict_key, {})
                    if name in maxlen_dict:
                        self.maxlen = maxlen_dict[name]
            # Since cgi.FieldStorage is an old-style class
            cgi.FieldStorage.__init__(self, fp, headers, outerboundary,
                                      environ, keep_blank_values, strict_parsing)

        def read_binary(self):
            maxlen = getattr(self, 'maxlen', 0)
            if maxlen > 0 and self.length > maxlen:
                raise FieldTooLongError, "Maximum field length exceeded, " \
                        "field name [%s]" % self.name
            # Since cgi.FieldStorage is an old-style class
            cgi.FieldStorage.read_binary(self)

        def _FieldStorage__write(self, line):
            # Override the internal __write() method, thus break encapsulation
            maxlen = getattr(self, 'maxlen', 0)
            if maxlen > 0 and self.file.tell() + len(line) > maxlen:
                raise FieldTooLongError, "Maximum field length exceeded, " \
                        "field name [%s]" % self.name
            # Since cgi.FieldStorage is an old-style class
            cgi.FieldStorage._FieldStorage__write(self, line)

        def __del__(self):
            fileobj = getattr(self, 'file', None)
            if fileobj:
                fileobj.close()

        def __repr__(self):
            """Monkey patch for FieldStorage.__repr__

            (Borrowed from Pylons)
            """
            if self.file:
                return "FieldStorage(%r, %r)" % (
                        self.name, self.filename)
            return "FieldStorage(%r, %r, %r)" % (
                     self.name, self.filename, self.value)
It does almost the same thing as its superclass, except that it raises a FieldTooLongError when the length of a field exceeds the limit we specified. In our example, we can specify a maximum length of 1,000,000 bytes for the myfile field in this way:
    # ...omitted for brevity...

        def receive(self):
            # Must be set before accessing request.POST or request.params
            request.environ['rockallite.maxlen_dict'] = {'myfile': 1000000}
            return request.POST['myfile'].filename
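The heart of the per-field check is the test performed before each write: would the bytes already stored plus the new chunk pass the limit? The sketch below isolates that test in a small wrapper that works without cgi or Pylons. LimitedWriter and FieldTooLong are hypothetical names for illustration, not part of the sample app.

```python
import io

class FieldTooLong(Exception):
    """Hypothetical stand-in for the article's FieldTooLongError."""

class LimitedWriter(object):
    """Wrap a file object and refuse writes that would pass maxlen bytes."""

    def __init__(self, fileobj, maxlen=0):
        self.fileobj = fileobj
        self.maxlen = maxlen  # 0 means unlimited

    def write(self, data):
        # Same comparison as in _FieldStorage__write above
        if 0 < self.maxlen < self.fileobj.tell() + len(data):
            raise FieldTooLong("limit of %d bytes exceeded" % self.maxlen)
        self.fileobj.write(data)

writer = LimitedWriter(io.BytesIO(), maxlen=10)
writer.write(b"12345")
writer.write(b"67890")  # exactly at the limit, still accepted
try:
    writer.write(b"!")  # one byte too many
    rejected = False
except FieldTooLong:
    rejected = True
```

Because the check runs on every chunk, the error is raised as soon as the limit is crossed, not after the whole field has been received.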
There's still something to do before we can use this. First make a copy of parse_formvars and modify it to take advantage of RestrLenFieldStorage:
    def parse_formvars(environ):
        """Parses the request, returning a MultiDict of form variables.

        (This is a monkey-patched version of paste.request.parse_formvars,
        using RestrLenFieldStorage instead of cgi.FieldStorage.)
        """
        # ...same as the original, omitted for brevity...
        fs = RestrLenFieldStorage(fp=input,
                                  environ=environ,
                                  keep_blank_values=1)
        # ...same as the original, omitted for brevity...
Then, create a subclass of WSGIRequest which utilizes the modified version of parse_formvars. I also added two extra methods, set_field_maxlen and maxlen_fields, which simplify the modification and representation of request.environ['rockallite.maxlen_dict']:
    from paste.wsgiwrappers import WSGIRequest
    # ...other imports omitted for brevity...

    maxlendict_key = 'rockallite.maxlen_dict'

    def parse_formvars(environ):
        # ...omitted for brevity...

    class RestrLenWSGIRequest(WSGIRequest):
        """WSGIRequest-like class with field length restriction."""

        def _POST(self):
            return parse_formvars(self.environ)

        def set_field_maxlen(self, name, length=None, kb=None, mb=None, gb=None):
            """Set maximum length (in bytes) of a field.

            If length <= 0, the maximum length is unlimited (key deleted from
            the environ dict).
            """
            # Check the arguments correctness
            if sum(arg!=None for arg in [length, kb, mb, gb]) != 1:
                raise ValueError, \
                    "One and only one of [length, kb, mb, gb] should be specified"
            if length:
                length = long(length)
            if kb:
                length = long(float(kb) * 1024)
            if mb:
                length = long(float(mb) * 1048576)
            if gb:
                length = long(float(gb) * 1073741824)

            env = self.environ
            if length <= 0:
                env.get(maxlendict_key, {}).pop(name, None)
            else:
                env.setdefault(maxlendict_key, {})[name] = length
            if not env.get(maxlendict_key, None):
                env.pop(maxlendict_key, None)

        @property
        def maxlen_fields(self):
            """Return a copy of the maxlen dict."""
            # DO NOT alter the returned dictionary!
            return self.environ.get(maxlendict_key, {}).copy()
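The unit handling in set_field_maxlen can be tried on its own. Here is a standalone sketch of the same arithmetic; to_bytes is a hypothetical helper written for this illustration (it uses int where the method above uses long), not a function from the sample app.

```python
def to_bytes(length=None, kb=None, mb=None, gb=None):
    """Convert exactly one of length/kb/mb/gb into a byte count."""
    given = [arg for arg in (length, kb, mb, gb) if arg is not None]
    if len(given) != 1:
        raise ValueError("exactly one of length, kb, mb, gb must be given")
    if length is not None:
        return int(length)
    if kb is not None:
        return int(float(kb) * 1024)
    if mb is not None:
        return int(float(mb) * 1048576)
    return int(float(gb) * 1073741824)

one_mb = to_bytes(mb=1)
one_and_a_half_kb = to_bytes(kb=1.5)
```

Note that the units are binary: mb=1 means 1,048,576 bytes, which is slightly more than a plain length of 1,000,000.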
Meanwhile, create a subclass of PylonsBaseWSGIApp which utilizes RestrLenWSGIRequest as the global parameter request in controllers:
    from paste.wsgiwrappers import WSGIRequest
    import pylons
    from pylons.wsgiapp import PylonsBaseWSGIApp
    # ...other imports omitted for brevity...

    class RestrLenWSGIRequest(WSGIRequest):
        # ...omitted for brevity...

    class RestrLenBaseWSGIApp(PylonsBaseWSGIApp):
        """PylonsBaseWSGIApp-like class with field length restriction."""

        def setup_app_env(self, environ, start_response):
            # PylonsBaseWSGIApp is a new-style class
            super(RestrLenBaseWSGIApp, self).setup_app_env(environ,
                                                           start_response)
            registry = environ['paste.registry']
            req = RestrLenWSGIRequest(environ)
            registry.register(pylons.request, req)
Finally, modify middleware.py to take advantage of RestrLenBaseWSGIApp:
    from hello.lib.fieldmaxlen import RestrLenBaseWSGIApp, DirectCascade
    # ...other imports omitted for brevity...

    def make_app(global_conf, full_stack=True, **app_conf):
        # ...omitted for brevity...

        # The Pylons WSGI app
        #app = PylonsApp()
        app = PylonsApp(base_wsgi_app=RestrLenBaseWSGIApp)

        # CUSTOM MIDDLEWARE HERE (filtered by error handling middlewares)

        # ...omitted for brevity...

        javascripts_app = StaticJavascripts()
        static_app = StaticURLParser(config['pylons.paths']['static_files'])
        #app = Cascade([static_app, javascripts_app, app])
        app = DirectCascade([static_app, javascripts_app, app])
        return app
And the controller code can be written as:
    # ...omitted for brevity...

        def receive(self):
            # Must be set before accessing request.POST or request.params
            # Roughly the same (mb=1 means 1,048,576 bytes):
            #   request.set_field_maxlen('myfile', mb=1)
            request.set_field_maxlen('myfile', 1000000)
            return request.POST['myfile'].filename
We have now finished the first version of our restricted app. Visit http://127.0.0.1:5000/file/upload and try to upload a 200MB file. The upload will be interrupted quickly because of the 1MB limit on the field length, and a URL pointing to the fancy debug page for the FieldTooLongError exception will be shown in the console. But wait! Why is there a "connection reset" page in the web browser? Shouldn't it be the famous fancy debug page?
After googling a bit, I found some interesting links:
LimitRequestBody and closed connections
Hi everybody,
I am looking for advise on using LimitRequestBody in Apache conf file.
So I set LimitRequestBody to a certain number to control mammoth-size posts. When I try to post tons of data via form using a Web browser it just cuts off like the server dropped a connection. I don't see expected 413 status code and the corresponding error doc that I set up. Both IE and FF do that. However when I tried to post the same request via curl it did show me 413 in headers and the error doc as well. Any idea what's going on here? Is it the browser's fault? Ideally, I would like the browser to show 413 error message otherwise the end-user gets an impression that the server just died. Your help is greatly appreciated. Thanks!
Server: Apache/2.0.55 (Ubuntu) mod_ssl/2.0.55 OpenSSL/0.9.8a
And a more detailed explanation:
...
1.5. Problems with ErrorDocument 413 processing
Normally, there are two components in the web server which can detect the post size limit problem:
...
The HTTP protocol also allows the browser to detect the problem. The optimal way to handle such limits is for the browser to implement the optional Expect: 100-continue handshake with the server so that the browser first tells the server the size of the upload, then the server tells the browser whether or not it is okay based on the size, then the browser either continues with the upload or displays an error to the user.
Popular web browsers such as Internet Explorer do not implement the Expect: 100-continue handshake. Instead, they simply start uploading the data, however large, and continue uploading until done. It is only at the end of the upload that they may see a 413 response from IBM HTTP Server (probably sent long before).
During the time that the web browser is still uploading the request body, IBM HTTP Server may drop the connection (after having already sent an error response). If the browser sees that the connection is dropped during the upload, it often will not see the error message which was sent previously, so the user will not see any error message.
IBM HTTP Server will not read unlimited amounts of request body when it has already been identified as too big. That ties up web server resources for too long and could constitute a Denial of Service. Given that the web server may drop the connection before the entire request body has been read, and that the web browser will not process the error response if it finds out that the connection has been dropped before it finishes uploading, the error message from the web server may or may not be displayed, depending on:
- size of upload (smaller uploads increase likelihood of seeing error response)
- size of error response (larger error responses increase likelihood of seeing error response)
- network bandwidth
A plug-in module could be written for IBM HTTP Server 2.0 and above which will cause the web server to read the entire request body before the error is sent. That usually results in the web browser displaying the desired message. However, this is not recommended for production use since it can tie up web server resources for a long period of time (as long as it takes the client to upload an arbitrary amount of data). It could be used for a denial of service attack.
...
The conclusion is that the (improper) HTTP POST implementation in "popular web browsers" such as IE or Firefox stops them from reading the response when the web server decides the upload is too big. After the server has done what it should do, which is to send a 413, a 500, or whatever response suits a request body that is too big, all it can do is drop the connection in order to free server resources. So the web browser thinks that the server "just died" and shows an unfriendly built-in "connection reset" error page. The behavior is the same for this scheme and for the previous cgi.maxlen approach.
Make it graceful, make it safe
To make the upload size restriction friendlier, there are two main kinds of improvement:
- Client-side: we do some client-side scripting (AJAX, etc.) so that the web browser detects whether the upload finished or was interrupted, and shows an appropriate message.
- Server-side: the web server gracefully accepts the whole request body no matter how large it is, but when the upload exceeds the size limit it silently discards the excess data and leaves an indicator for later processing. This is similar to the "plug-in module" mentioned in the IBM recipe above.
Either approach has drawbacks. The former is more complicated, harder to debug, and has cross-browser issues. The latter may tie up web server resources, although not as seriously as the IBM recipe describes (because we keep only as much data as the limit allows and throw away the rest). However, the server-side approach is easier to implement. Since this is an article about hacking, I'll leave the first one to you and go for the server-side solution.
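Stripped of Pylons and cgi, the server-side idea looks roughly like the sketch below: keep reading until the client is done, store data only while it fits under the limit, and raise a flag instead of an exception. The name receive_with_cap is hypothetical, and this simplified version drops everything after the first overflowing chunk, which is a little stricter than the per-line check in the class that follows.

```python
import io

def receive_with_cap(stream, maxlen, chunk_size=8192):
    """Read stream to the end, keeping at most maxlen bytes.

    Returns (kept_bytes, toolong).
    """
    kept = io.BytesIO()
    toolong = False
    while True:
        data = stream.read(chunk_size)
        if not data:
            break  # client finished sending
        if toolong:
            continue  # keep draining, store nothing
        if kept.tell() + len(data) > maxlen:
            toolong = True  # remember the overflow for later processing
            continue
        kept.write(data)
    return kept.getvalue(), toolong

upload = io.BytesIO(b"a" * 100)
kept, toolong = receive_with_cap(upload, maxlen=30, chunk_size=10)
```

Memory and disk use stay bounded by the limit, but the server still spends the full transfer time on the connection, which is the resource cost mentioned above.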
I modified RestrLenFieldStorage a bit, and created the second version:
    import os
    import cgi
    # ...other imports omitted for brevity...

    maxlendict_key = 'rockallite.maxlen_dict'
    maxlensafe_key = 'rockallite.maxlen_safe'

    class FieldTooLongError(Exception):
        pass

    class RestrLenFieldStorage(cgi.FieldStorage):
        """FieldStorage-like class with field length restriction."""

        def __init__(self, fp=None, headers=None, outerboundary="",
                     environ=os.environ, keep_blank_values=0, strict_parsing=0):
            if headers and 'content-disposition' in headers:
                cdisp, pdict = cgi.parse_header(headers['content-disposition'])
                if 'name' in pdict:
                    name = pdict['name']
                    maxlen_dict = environ.get(maxlendict_key, {})
                    if name in maxlen_dict:
                        self.maxlen = maxlen_dict[name]
                        self.maxlen_safe = environ.get(maxlensafe_key, False)
                        # Set to True when too long
                        self.toolong = False
            # Since cgi.FieldStorage is an old-style class
            cgi.FieldStorage.__init__(self, fp, headers, outerboundary,
                                      environ, keep_blank_values, strict_parsing)

        def dumb_read(self):
            """Internal: read and discard data."""
            todo = self.length
            while todo > 0:
                data = self.fp.read(min(todo, self.bufsize))
                if not data:
                    self.done = -1
                    break
                todo -= len(data)

        def read_binary(self):
            maxlen = getattr(self, 'maxlen', 0)
            if maxlen > 0 and self.length > maxlen:
                self.toolong = True
                # If "safe" semaphore is set, we continue gracefully to receive
                # the stream (thus the user won't get the unfriendly "connection
                # reset" page), but discard the data
                if self.maxlen_safe:
                    self.dumb_read()
                    return
                raise FieldTooLongError, "Maximum field length exceeded, " \
                        "field name [%s]" % self.name
            # Since cgi.FieldStorage is an old-style class
            cgi.FieldStorage.read_binary(self)

        def _FieldStorage__write(self, line):
            # Override the internal __write() method, thus break encapsulation
            maxlen = getattr(self, 'maxlen', 0)
            if 0 < maxlen < self.file.tell() + len(line):
                self.toolong = True
                # If "safe" semaphore is set, we continue gracefully to receive
                # the stream (thus the user won't get the unfriendly "connection
                # reset" page), but discard the data
                if self.maxlen_safe:
                    return
                raise FieldTooLongError, "Maximum field length exceeded, " \
                        "field name [%s]" % self.name
            # Since cgi.FieldStorage is an old-style class
            cgi.FieldStorage._FieldStorage__write(self, line)

        def __del__(self):
            fileobj = getattr(self, 'file', None)
            if fileobj:
                fileobj.close()

        def __repr__(self):
            """Monkey patch for FieldStorage.__repr__

            (Borrowed from Pylons)
            """
            if self.file:
                return "FieldStorage(%r, %r)" % (
                        self.name, self.filename)
            return "FieldStorage(%r, %r, %r)" % (
                     self.name, self.filename, self.value)
Here we can set request.environ['rockallite.maxlen_safe'] = True to activate the scheme, that is, to read the whole input stream and discard the excess data. I also added a method to RestrLenWSGIRequest for convenience:
    from paste.wsgiwrappers import WSGIRequest
    # ...other imports omitted for brevity...

    maxlensafe_key = 'rockallite.maxlen_safe'

    class RestrLenWSGIRequest(WSGIRequest):
        """WSGIRequest-like class with field length restriction."""

        # ...other methods omitted for brevity...

        def set_maxlen_safe(self, safe=True):
            """Set the "safe" semaphore for field length restriction.

            Without this, the upload stream will be interrupted if a user
            tries to upload a file that is too large, resulting in a
            "connection reset" page. If the "safe" semaphore is set to True,
            the upload continues, but the (overflow of) data will be
            discarded; moreover, the field will have a `toolong` attribute
            which is set to True.

            Note: the `toolong` attribute is set on every field that has a
            maximum length defined, with a default value of False.

            See `RestrLenFieldStorage` class for more details.
            """
            env = self.environ
            if safe:
                env[maxlensafe_key] = True
            else:
                env.pop(maxlensafe_key, None)

        # ...other methods omitted for brevity...
Once again, the controller code becomes:
    # ...omitted for brevity...

        def receive(self):
            # Must be set before accessing request.POST or request.params
            # Roughly the same (mb=1 means 1,048,576 bytes):
            #   request.set_field_maxlen('myfile', mb=1)
            request.set_field_maxlen('myfile', 1000000)

            # After the maximum field length is set, the connection will be
            # reset when a user tries to upload a file that is too large.
            # Very unfriendly. Without `set_maxlen_safe(True)`, we should
            # really use a client-side script to handle this.
            request.set_maxlen_safe(True)

            post = request.POST
            filename = post['myfile'].filename
            toolong = post['myfile'].toolong
            return 'filename: %s, toolong: %s' % (filename, toolong)
Visit http://127.0.0.1:5000/file/upload again and try to upload a 200MB file. The upload will take a minute or two, and then you will see a web page showing the name of the file you uploaded and the words "toolong: True", without errors.
Get ready for production
The above approach is much friendlier, but it may not be suitable for a production environment. Besides making the user wait needlessly for the final response, it may tie up web server resources for a long time and leave an opening for DoS attacks.
Thanks to Max Ischenko's comment, I found that the Apache web server provides some features for restricting upload size. (Just google LimitRequestBody or read the Apache Core Features documentation.) So when we deploy a Pylons application behind Apache, we can specify a suitable maximum request body size in Apache's configuration file, for example 5 to 10MB, and enforce the real (and smaller) restriction in the application via request.set_field_maxlen. This prevents web server resources from being tied up by extremely big file uploads, at the cost of making "popular web browsers" unhappy enough to show a "connection reset" page in those cases. On the other hand, the app can display a friendly message when a file upload only slightly exceeds the limit. If you don't use Apache, or you're unable to configure it, using cgi.maxlen together with the DirectCascade middleware is theoretically equivalent.
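The two-tier policy can be summed up as a small decision function. This is only a model of the reasoning above, with hypothetical names and labels; it is not code from the sample app, and it assumes that the application runs with set_maxlen_safe(True).

```python
def classify_upload(size, app_limit, server_limit):
    """Describe what the user sees for an upload of `size` bytes.

    server_limit models the front-end cap (e.g. Apache's LimitRequestBody);
    app_limit models the smaller cap set with request.set_field_maxlen.
    """
    if size > server_limit:
        return "connection reset"  # the front end drops the connection
    if size > app_limit:
        return "friendly error"    # the app drains the body and reports it
    return "accepted"

APP_LIMIT = 1000000          # 1,000,000 bytes, as in the example controller
SERVER_LIMIT = 10 * 1048576  # 10MB, an assumed front-end cap
```

For instance, a 3MB upload falls between the two limits and gets a friendly error, while a 200MB upload is cut off by the front end.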
You can download the sample app and try it. See Hello/README.txt for instructions.
Conclusion
We can hack the source code to make upload size restriction smoother in Pylons. In a production environment, Apache can also help us restrict the maximum number of bytes allowed in a request body. Because of the way popular web browsers such as IE and Firefox implement HTTP POST, we need to take extra care over how we return a response when a file upload is too big.
Comments (4)
Dec 03, 2007
Max Ischenko says:
re: file size restrictions. If you're behind Apache it will handle this for you.
Dec 16, 2007
Rockallite Wulf says:
Thanks for your comment! Did you mean the LimitRequestBody parameter in Apache's configuration file? I did some research on that topic after you mentioned it, and found that although it can do the job, it doesn't avoid the "connection reset" error page.
Jun 03, 2008
Greg Hazel says:
I tried running the sample app with "paster serve --reload development.ini" with Pylons 0.9.6.2
The sample app gets a 404 when the form is submitted, because it links to "/hello/file/receive" which does not exist. Fixing that to be "/file/receive" allows the post to hit the right controller, but I get "<type 'exceptions.AttributeError'>: toolong".
Aug 05, 2008
Michael Bayer says:
the DirectCascade approach wasn't working for me, it was keeping the browser connection opened when the request body was too long, and even finding the ultimate socket connection and calling close() on it didn't seem to work correctly; only fully reading wsgi.input seemed to fix it.
So based on a suggestion from the Paste list I wrote a much simpler and non-intrusive method which is described at A Better Way To Limit File Upload Size.