I have been using Amazon’s Product Advertising API to generate URLs that contain prices for a given book. One URL that I have generated is the following:
When I click on the link or paste it into the address bar, the web page loads fine. However, when I execute the following code, I get an error:
import urllib2
url = "http://www.amazon.com/gp/offer-listing/0415376327%3FSubscriptionId%3DAKIAJZY2VTI5JQ66K7QQ%26tag%3Damaztest04-20%26linkCode%3Dxm2%26camp%3D2025%26creative%3D386001%26creativeASIN%3D0415376327"
html_contents = urllib2.urlopen(url)
The error is urllib2.HTTPError: HTTP Error 503: Service Unavailable. First of all, I don’t understand why I even get this error, since the web page loads successfully in the browser.
Also, another weird behavior that I have noticed is that the following code sometimes does and sometimes does not give the stated error:
html_contents = urllib2.urlopen("http://www.amazon.com/gp/offer-listing/0415376327%3FSubscriptionId%3DAKIAJZY2VTI5JQ66K7QQ%26tag%3Damaztest04-20%26linkCode%3Dxm2%26camp%3D2025%26creative%3D386001%26creativeASIN%3D0415376327")
I am totally lost as to why this happens. Is there any fix or workaround for this? My goal is to read the HTML contents of the URL.
Kenil Vasani
Amazon is rejecting the default User-Agent for urllib2. One workaround is to use the requests module.
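A minimal sketch of the requests approach, assuming the third-party requests package is installed. The helper names build_headers and fetch_html_with_requests, the timeout value, and the "Mozilla/5.0" User-Agent string are illustrative assumptions, not something from the question. requests sends its own default User-Agent, and whether Amazon accepts that default is not guaranteed, so the sketch sets one explicitly.

```python
def build_headers(user_agent="Mozilla/5.0"):
    """Return request headers with a browser-like User-Agent (an assumed value)."""
    return {"User-Agent": user_agent}


def fetch_html_with_requests(url, timeout=10):
    """Fetch `url` with the requests library and return the response body as text."""
    # Imported here so build_headers stays usable even without requests installed.
    import requests

    response = requests.get(url, headers=build_headers(), timeout=timeout)
    # Raises requests.exceptions.HTTPError for 4xx/5xx replies, such as a 503.
    response.raise_for_status()
    return response.text


# Usage (performs a real network request, so it is left commented out):
# html_contents = fetch_html_with_requests(url)
```

Here url is the same string as in the question. A server can still answer 503 for other reasons, such as rate limiting, so the call may need a retry or a delay between requests.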
If you insist on using urllib2, you can get the same result by sending a fake User-Agent header.
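One way to do this, sketched under assumptions: build an opener and replace its default headers before opening the URL. The helper names make_opener and fetch_html_with_urllib2 and the "Mozilla/5.0" value are hypothetical choices for illustration. The import fallback is only there so the same sketch also runs on Python 3, where these functions live in urllib.request.

```python
try:
    import urllib2  # Python 2, as used in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 location of the same functions


def make_opener(user_agent="Mozilla/5.0"):
    """Build an opener that sends `user_agent` instead of the Python default."""
    opener = urllib2.build_opener()
    # Overwrite the default ("User-agent", "Python-urllib/x.y") entry.
    opener.addheaders = [("User-agent", user_agent)]
    return opener


def fetch_html_with_urllib2(url, opener=None):
    """Open `url` with the given opener and return the raw response body."""
    opener = opener or make_opener()
    response = opener.open(url)
    try:
        # Returns str on Python 2 and bytes on Python 3.
        return response.read()
    finally:
        response.close()


# Usage (performs a real network request, so it is left commented out):
# html_contents = fetch_html_with_urllib2(url)
```

Setting addheaders on the opener applies the header to every request made through it, which is convenient when several pages are fetched in a row.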
Don’t worry about Stack Overflow editing the URL. They explain that they are doing this.