How to percent-encode URL parameters in Python?

Question:

If I do

url = "" + urllib.quote(query)
  1. It doesn’t encode / to %2F (breaks OAuth normalization)
  2. It doesn’t handle Unicode (it throws an exception)

Is there a better library?

Answer #1:

Python 2

From the docs:

urllib.quote(string[, safe])

Replace special characters in string
using the %xx escape. Letters, digits,
and the characters ‘_.-’ are never
quoted. By default, this function is
intended for quoting the path section
of the URL. The optional safe parameter
specifies additional characters that
should not be quoted — its default
value is ‘/’

That means passing '' (an empty string) for safe will solve your first issue:

>>> import urllib
>>> urllib.quote('/test')
'/test'
>>> urllib.quote('/test', safe='')
'%2Ftest'

About the second issue, there is a bug report about it. Apparently it was fixed in Python 3. In Python 2 you can work around it by encoding the string as UTF-8 first, like this:

>>> query = urllib.quote(u"Müller".encode('utf8'))
>>> print urllib.unquote(query).decode('utf8')
Müller

By the way, have a look at urlencode.

Python 3

The same, except replace urllib.quote with urllib.parse.quote.
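
For the OAuth case raised in the question, the Python 3 version of this advice might look like the sketch below. It assumes the goal is to escape everything outside the RFC 3986 unreserved set (letters, digits, and -._~); percent_encode is a hypothetical helper name, not part of urllib.

```python
from urllib.parse import quote


def percent_encode(value: str) -> str:
    """Percent-encode everything except RFC 3986 unreserved characters.

    Hypothetical helper for illustration; not a urllib function.
    """
    # '/' is deliberately left out of safe, so it becomes %2F.
    # safe="~" keeps the tilde literal on Python versions before 3.7,
    # where quote() would otherwise escape it.
    return quote(value, safe="~")


print(percent_encode("/test"))   # %2Ftest
print(percent_encode("Müller"))  # M%C3%BCller
print(percent_encode("a b~c"))   # a%20b~c
```

Strings are encoded as UTF-8 before escaping, which is the default for quote in Python 3.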

Answered By: Nadia Alramli

Answer #2:

In Python 3, urllib.quote has been moved to urllib.parse.quote, and it handles Unicode by default.

>>> from urllib.parse import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'
>>> quote('/El Niño/')
'/El%20Ni%C3%B1o/'
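
As a small follow-up sketch, the same module also provides unquote for the reverse direction and quote_plus for form-style encoding, where a space becomes + instead of %20. The variable names here are illustrative only.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

original = "/El Niño/"

# Path-style escaping with no safe characters: '/' -> %2F, space -> %20.
escaped = quote(original, safe="")

# Form-style escaping: quote_plus has no safe characters by default,
# and encodes a space as '+'.
escaped_plus = quote_plus(original)

print(escaped)       # %2FEl%20Ni%C3%B1o%2F
print(escaped_plus)  # %2FEl+Ni%C3%B1o%2F

# Each decoder reverses its matching encoder.
print(unquote(escaped) == original)            # True
print(unquote_plus(escaped_plus) == original)  # True
```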
Answered By: Paolo Moretti

Answer #3:

My answer is similar to Paolo’s answer.

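I think the requests module is much better. It’s based on urllib3. Note that on Python 3, requests.utils.quote is the standard urllib.parse.quote re-exported, so the results are the same.
You can try this: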

>>> from requests.utils import quote
>>> quote('/test')
'/test'
>>> quote('/test', safe='')
'%2Ftest'
Answered By: Aminah Nuraini

Answer #4:

If you’re using Django, you can use urlquote:

>>> from django.utils.http import urlquote
>>> urlquote(u"Müller")
u'M%C3%BCller'

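Note that changes to Python since this answer was published mean that this is now a legacy wrapper; it was deprecated in Django 3.0 and removed in Django 4.0, so newer code should call urllib.parse.quote directly. From the Django 2.1 source code for django.utils.http: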

A legacy compatibility wrapper to Python's urllib.parse.quote() function.
(was used for unicode handling on Python 2)
Answered By: Rick Westera

Answer #5:

It is better to use urlencode here. It makes little difference for a single parameter, but IMHO it makes the code clearer. (A function named quote_plus looks confusing, especially to those coming from other languages.)

In [21]: query='lskdfj/sdfkjdf/ksdfj skfj'
In [22]: val=34
In [23]: from urllib.parse import urlencode
In [24]: encoded = urlencode(dict(p=query,val=val))
In [25]: print(f"{encoded}")
p=lskdfj%2Fsdfkjdf%2Fksdfj+skfj&val=34

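By default urlencode escapes with quote_plus, so a space becomes +. If strict percent-encoding (%20) is needed, as in the OAuth case from the question, one option is the quote_via parameter (Python 3.5+). The sketch below assumes a placeholder base URL; BASE_URL and build_url are hypothetical names for illustration.

```python
from urllib.parse import quote, urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://example.com/search"


def build_url(base: str, params: dict) -> str:
    """Append params to base as a percent-encoded query string."""
    # quote_via=quote encodes spaces as %20 instead of '+'.
    # urlencode passes its own safe='' default through to quote,
    # so '/' inside a value is still escaped as %2F.
    return f"{base}?{urlencode(params, quote_via=quote)}"


url = build_url(BASE_URL, {"p": "lskdfj/sdfkjdf/ksdfj skfj", "val": 34})
print(url)
# https://example.com/search?p=lskdfj%2Fsdfkjdf%2Fksdfj%20skfj&val=34
```

Non-string values such as the integer 34 are converted with str() before being escaped.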
Answered By: balki
