URL blank encode problem #808

Closed
wanggp3 opened this issue Feb 29, 2016 · 4 comments

@wanggp3

wanggp3 commented Feb 29, 2016

# testfile.py
import asyncio

import aiohttp

# The query string contains a raw (unencoded) space in "Sam Edelman".
url = 'http://www.gilt.com/search?q.query=Sam Edelman&q.rows=48&q.category=womens-coats&q.start=-48'


@asyncio.coroutine
def fetch():
    session = aiohttp.ClientSession()
    try:
        r = yield from session.get(url)
        body = yield from r.read()  # read the body so the connection is released
        print('aiohttp status = %s' % r.status)
    finally:
        session.close()


def get():
    import requests
    r = requests.get(url)
    print('requests status = %s' % r.status_code)


# Fetch the same URL with requests first, then with aiohttp, to compare statuses.
get()
print('*' * 10)

loop = asyncio.get_event_loop()
loop.run_until_complete(fetch())
loop.close()

From RFC 1738:

Unsafe:
Characters can be unsafe for a number of reasons. The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.
The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems. The character "#" is unsafe and should
always be encoded because it is used in World Wide Web and in other
systems to delimit a URL from a fragment/anchor identifier that might
follow it. The character "%" is unsafe because it is used for
encodings of other characters. Other characters are unsafe because
gateways and other transport agents are known to sometimes modify
such characters. These characters are "{", "}", "|", "\", "^", "~",
"[", "]", and "`".
All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.
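
For illustration, here is a small sketch of how the characters that RFC 1738 lists as unsafe come out when percent-encoded with Python's standard `urllib.parse.quote`. The names `UNSAFE` and `encoded` are made up for this example and are not part of aiohttp. One caveat: Python 3.7 and later follow RFC 3986, which treats `~` as unreserved, so `quote` leaves it unescaped.

```python
from urllib.parse import quote

# Characters RFC 1738 section 2.2 calls unsafe (illustrative list for this sketch).
UNSAFE = ' <>"#%{}|\\^~[]`'

# safe='' asks quote() to escape everything except unreserved characters.
encoded = {ch: quote(ch, safe='') for ch in UNSAFE}

for ch, esc in encoded.items():
    print(repr(ch), '->', esc)
```

The space from the URL above would become `%20`.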

If we pass a URL that contains a space, aiohttp fails to fetch it, but requests succeeds. So, should aiohttp percent-encode the space in the URL first?
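
As a possible workaround on the caller's side, the URL can be percent-encoded before it is handed to `session.get()`. The sketch below uses only the standard library. `encode_unsafe` is a hypothetical helper name, and the choice of which delimiters to leave untouched is an assumption that suits this particular URL rather than a general-purpose rule.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_unsafe(url):
    """Percent-encode spaces and similar characters in the path and query."""
    parts = urlsplit(url)
    # '%' is kept safe so escapes that are already present are not encoded twice.
    path = quote(parts.path, safe='/%')
    # '=' and '&' are kept so the key=value&key=value structure survives.
    query = quote(parts.query, safe='=&%')
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


url = 'http://www.gilt.com/search?q.query=Sam Edelman&q.rows=48&q.category=womens-coats&q.start=-48'
print(encode_unsafe(url))
```

The only change to this URL is that `Sam Edelman` becomes `Sam%20Edelman`; running the helper on an already-encoded URL leaves it unchanged.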

@asvetlov
Member

Agreed. Would you provide a pull request?

@wanggp3
Author

wanggp3 commented Mar 1, 2016

Yes, I have submitted one.

@asvetlov
Member

Fixed by #1217

@lock

lock bot commented Oct 29, 2019

This thread has been automatically locked since there has not been
any recent activity after it was closed. Please open a new issue for
related bugs.

If you feel that there are important points made in this discussion,
please include those excerpts in that new issue.

@lock lock bot added the outdated label Oct 29, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 29, 2019