This repository has been archived by the owner on Sep 16, 2022. It is now read-only.

Inconsistent handling of unicode for open / listbucket / delete #39

Open
davidwtbuxton opened this issue Jul 20, 2016 · 0 comments

davidwtbuxton commented Jul 20, 2016

Hi,

cloudstorage.listbucket(..) returns GCSFileStat objects and decodes UTF-8 encoded object names for you, so GCSFileStat.filename is a unicode instance. This is nice.

But passing a unicode instance to the open or delete functions raises a KeyError if the string includes non-ASCII characters:

Traceback (most recent call last):
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/bottle.py", line 862, in _handle
    return route.call(**args)
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/bottle.py", line 1732, in wrapper
    rv = callback(*a, **ka)
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/wsgi.py", line 32, in create_utf8
    return create_file(u'Señor') #.encode('utf-8'))
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/wsgi.py", line 38, in create_file
    with cloudstorage.open(dest, 'w') as fh:
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/cloudstorage/cloudstorage_api.py", line 91, in open
    filename = api_utils._quote_filename(filename)
  File "/base/data/home/apps/e~davidwtbuxton-test/cloudstorage-utf8-bug.394332959138233059/cloudstorage/api_utils.py", line 94, in _quote_filename
    return urllib.quote(filename)
  File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib.py", line 1263, in quote
    return ''.join(map(quoter, s))
KeyError: u'\xf1'

It would be nice if the cloudstorage library automatically encoded unicode object names to UTF-8, as well as decoding them.
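A minimal sketch of what that could look like, assuming the fix belongs in the quoting step shown in the traceback. The name `quote_filename` is hypothetical and is not the library's `api_utils._quote_filename`; it only illustrates encoding text to UTF-8 before calling `quote`, so that byte strings pass through unchanged.

```python
try:
    from urllib import quote        # Python 2, as in the traceback
except ImportError:
    from urllib.parse import quote  # Python 3

def quote_filename(filename):
    """Hypothetical helper: percent-quote an object name, accepting text or bytes."""
    # Text (unicode on Python 2, str on Python 3) is encoded to UTF-8 first,
    # so quote() only ever sees bytes and never hits the KeyError.
    if not isinstance(filename, bytes):
        filename = filename.encode('utf-8')
    return quote(filename)

print(quote_filename(u'/bucket/Se\xf1or'))  # /bucket/Se%C3%B1or
```

With this, a name that came back from listbucket as unicode and a name already encoded by the caller would both quote to the same path.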

For example, in this test project, which creates objects with UTF-8 encoded names, the filename has to be encoded again before it can be passed to delete when deleting all objects in a bucket.
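The workaround looks roughly like the sketch below. `delete_all` is a hypothetical helper, not code from the test project, and passing the module in as `gcs` is an assumption made so the snippet can be run without the App Engine SDK; in an app you would pass the imported `cloudstorage` module.

```python
def delete_all(gcs, bucket_path):
    """Hypothetical helper: delete every object under bucket_path (e.g. '/my-bucket')."""
    deleted = []
    for stat in gcs.listbucket(bucket_path):
        # listbucket decodes names to unicode, but delete fails on non-ASCII
        # unicode, so the name is encoded back to UTF-8 bytes first.
        name = stat.filename.encode('utf-8')
        gcs.delete(name)
        deleted.append(name)
    return deleted
```

If the library encoded unicode names itself, the `.encode('utf-8')` call here would be unnecessary.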

Thank you,

David B.
