This post is inspired by my previous post on using urllib2 to download a sequence of files programmatically. As you probably know, the transition from Python2 to Python3 has left many people struggling to port their code, so I thought I would rehash some of my old posts and provide Python3 versions of my code examples. One resource I found recently that really helped me is the online version of Mark Pilgrim’s “Dive into Python3”, specifically the chapter on porting your 2.x code to Python3.
The example below shows how to use the urllib library included with Python3 to download a sequence of image files, with comments describing each step.
# Import the urllib request module
import urllib.request
# Import urllib error handling
from urllib.error import HTTPError, URLError


# Function that downloads a file
def downloadFile(file_name, file_mode, base_url):
    # Create the url
    url = base_url + file_name

    # Open the url
    try:
        f = urllib.request.urlopen(url)
        print("downloading ", url)

        # Open our local file for writing
        local_file = open(file_name, "w" + file_mode)
        # Write to our local file
        local_file.write(f.read())
        local_file.close()

    # Handle errors
    except HTTPError as e:
        print("HTTP Error:", e.code, url)
    except URLError as e:
        print("URL Error:", e.reason, url)


# Set the range of images to 1-50. It says 51 because the
# range function never gets to the endpoint.
image_range = list(range(1, 51))

# Iterate over image range
for index in image_range:
    base_url = 'http://www.techniqal.com/'
    # Create file name based on known pattern
    file_name = str(index) + ".jpg"
    # Now download the image. In Python3, f.read() returns bytes,
    # so keep "b" as the second param even for text files;
    # passing '' would make write() raise a TypeError.
    downloadFile(file_name, "b", base_url)
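As a possible refinement of the function above, here is a minimal sketch that uses with blocks so the response and the local file are closed even if something goes wrong partway through. The name download_file_streamed and its True/False return value are my own assumptions for illustration, not part of the original example.

```python
import shutil
import urllib.request
from urllib.error import HTTPError, URLError


def download_file_streamed(url, local_path):
    """Copy the body at `url` into `local_path`; return True on success.

    Hypothetical helper: the name and return value are illustrative only.
    """
    try:
        # The response and the local file are both closed automatically.
        # The local file is always opened in binary mode because the
        # response yields bytes.
        with urllib.request.urlopen(url) as response, \
                open(local_path, "wb") as local_file:
            # Copy in chunks instead of reading the whole body into memory.
            shutil.copyfileobj(response, local_file)
        return True
    except HTTPError as e:
        print("HTTP Error:", e.code, url)
    except URLError as e:
        print("URL Error:", e.reason, url)
    return False
```

Calling download_file_streamed(base_url + file_name, file_name) inside the loop would take the place of downloadFile(file_name, "b", base_url).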
The key changes involved in converting my old example to the new one are outlined below. This was a learning exercise for me, and will hopefully provide enough context for you to understand how to port your own code to Python3.
- There are obvious changes in how to use urllib vs the old urllib2 methods. Take a peek at “Dive into Python3” for more details. Mark Pilgrim does a much better job describing it than I ever could.
- print is now a function rather than a statement, so its arguments go in parentheses.
Python2:
print "My Variable is equal to " + myVariable
Python3:
print("My Variable is equal to ", myVariable)
- Except blocks in a try/except use a different syntax for binding the exception to a name.
Python2:
except HTTPError, e:
    print "HTTP Error:", e.code, url
Python3:
except HTTPError as e:
    print("HTTP Error:", e.code, url)
- The range() function used to return a list, but now returns a lazy range object. If you still want to get a list from the range function, wrap it in list().
Python2:
myRangeList = range(1,100)
Python3:
myRangeList = list(range(1,100))
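To see the three syntax changes from the list side by side, here is a small Python3 sketch. The variable names and the ValueError example are my own, chosen only so the snippet runs without a network connection.

```python
import io

# print is a function: extra arguments are joined with a space, and the
# file= keyword redirects the output (here into an in-memory buffer).
buffer = io.StringIO()
my_variable = 42  # hypothetical value for illustration
print("My Variable is equal to", my_variable, file=buffer)
printed = buffer.getvalue()

# "except ... as e" binds the exception to a name inside the block.
try:
    int("not a number")
except ValueError as e:
    error_message = str(e)

# range() is lazy; wrap it in list() only when a real list is needed.
my_range = range(1, 100)
my_range_list = list(my_range)
```

A for loop can iterate over my_range directly, so the list() call is only needed when the code really depends on list behavior such as append or sort.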
I’m not a software engineer by trade, so please excuse any syntax oddities. I appreciate any feedback, or more graceful ways to write this code. Leave them in the comments and I’ll happily update my example.