dr_brown
Chevereto Member
🚶Reproduction steps
I am writing a Python parser that copies text content from HTML pages. While processing the text and extracting image links, I run into image URLs that contain spaces. When I try to upload an image through the API using such a URL, I get an error. A browser's address bar opens these links without problems: every space is replaced with %20 and everything works.
Here is the part of the function where I process the text and upload images through the API:
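Python:
import urllib.parse
from urllib.error import HTTPError
from urllib.request import Request, urlopen

for image in images:
    try:
        # Percent-encode every character outside the RFC 3986 set
        image_clr = urllib.parse.quote(image['src'], safe='-._~:/?#[]@!$&\'()*+,;=')
        pic_req = Request(url_api.format(key_api, image_clr))
        pic = urlopen(pic_req).read().decode('UTF-8')
        # Temporary workaround: skip images the API rejects
        if pic == 'Invalid file source':
            continue
    except HTTPError:
        continue
    else:
        upload_image.append(image['src'])
        pic_image.append(pic)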
As you can see, I run each URL through a string-processing function that percent-encodes disallowed characters (such as spaces) and leaves untouched only the characters permitted by RFC 3986.
Python:
image_clr = urllib.parse.quote(image['src'], safe='-._~:/?#[]@!$&\'()*+,;=')
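To make the behavior of this call concrete, here is a minimal sketch that runs it on both example links from this report. One detail worth noting: `%` is not in the safe set, so a URL that is already encoded gets encoded a second time (`%20` becomes `%2520`). The `normalize` helper is hypothetical and not part of the original parser; it decodes first and then encodes once, which assumes the URL contains no escapes (such as `%2F`) whose decoding would change its meaning.

```python
from urllib.parse import quote, unquote

# Same safe set as in the post: RFC 3986 unreserved and reserved characters.
RFC3986_SAFE = "-._~:/?#[]@!$&'()*+,;="


def encode_once(url: str) -> str:
    """The call from the post: percent-encode everything outside RFC3986_SAFE."""
    return quote(url, safe=RFC3986_SAFE)


def normalize(url: str) -> str:
    """Hypothetical helper: decode existing escapes, then encode exactly once."""
    return quote(unquote(url), safe=RFC3986_SAFE)


raw = "https://3dnews.ru/assets/external/illustrations/2019/10/30/996521/Annotation 2019-10-30 085631.jpg"
encoded = "https://3dnews.ru/assets/external/illustrations/2019/10/30/996521/Annotation%202019-10-30%20085631.jpg"

print(encode_once(raw) == encoded)      # True: spaces become %20
print(encode_once(encoded) == encoded)  # False: "%20" becomes "%2520"
print(normalize(raw) == normalize(encoded) == encoded)  # True
```

With `normalize`, a link gives the same result whether the page wrote it with spaces or with `%20`.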
For now I have added a temporary workaround (a "crutch") that simply skips any image that triggers this error:
Python:
if pic == 'Invalid file source':
    continue
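A possible variation on this workaround, sketched under assumptions: instead of silently skipping, keep a list of the rejected sources so they can be inspected or retried later. The `upload_all` function and the `upload` callable are hypothetical names; `upload` stands in for the real API request and is expected to return the response text.

```python
from typing import Callable, Iterable, List, Tuple


def upload_all(
    sources: Iterable[str], upload: Callable[[str], str]
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Upload each source; return (uploaded, failed) instead of skipping silently."""
    uploaded, failed = [], []
    for src in sources:
        try:
            response = upload(src)
        except OSError as exc:  # urllib's HTTPError and URLError derive from OSError
            failed.append((src, str(exc)))
            continue
        if response == "Invalid file source":
            failed.append((src, response))
        else:
            uploaded.append((src, response))
    return uploaded, failed


def fake_upload(src: str) -> str:
    """Stand-in for the real API call: rejects any source containing a space."""
    return "Invalid file source" if " " in src else "https://img.example/" + src


uploaded, failed = upload_all(["a.jpg", "b c.jpg"], fake_upload)
print(uploaded)
print(failed)
```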
😢Unexpected result
Example links (neither form is accepted):
Code:
https://3dnews.ru/assets/external/illustrations/2019/10/30/996521/Annotation 2019-10-30 085631.jpg
Code:
https://3dnews.ru/assets/external/illustrations/2019/10/30/996521/Annotation%202019-10-30%20085631.jpg
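One thing that might be worth checking, offered as an unverified guess: if the image URL is placed into the API request's query string as is, the receiving side will typically decode `%20` back into a space, so both forms above would arrive with spaces. The sketch below only models that decoding locally with `parse_qs`; the endpoint, the parameter names (`key`, `source`, `format`), and the template are placeholders, since the real `url_api` is not shown in this report.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumptions: placeholder endpoint and parameter names, not taken from the post.
API_ENDPOINT = "https://example.com/api/1/upload/"
NAIVE_TEMPLATE = API_ENDPOINT + "?key={}&source={}&format=txt"


def build_upload_url(api_key: str, image_url: str) -> str:
    """Build the request URL, escaping image_url as a single query value."""
    query = urlencode({"key": api_key, "source": image_url, "format": "txt"})
    return f"{API_ENDPOINT}?{query}"


image_url = "https://3dnews.ru/assets/external/illustrations/2019/10/30/996521/Annotation%202019-10-30%20085631.jpg"

# Naive formatting: the query decoder turns %20 back into a space.
naive_url = NAIVE_TEMPLATE.format("MY_KEY", image_url)
naive_received = parse_qs(urlsplit(naive_url).query)["source"][0]
print(" " in naive_received)  # True

# urlencode escapes "%" itself, so the value survives one round of decoding.
safe_url = build_upload_url("MY_KEY", image_url)
received = parse_qs(urlsplit(safe_url).query)["source"][0]
print(received == image_url)  # True
```

Whether the server actually behaves like `parse_qs` here is an assumption; the sketch only shows how the parameter value can be kept intact through one round of query-string decoding.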
📃Error log message
Invalid file source