Python File Read Write with Urllib2

Update: Looking for how to download files using Python 3 and urllib? Check out my post here.

While checking out my great stats on Lijit recently, I started to see a pattern. I was able to determine that a large part of my Re-Search (search engine) traffic was coming from a post I did about reading and writing to files in Python back in 2005. In an effort to shamelessly attract more traffic on this topic, I have decided to flesh this post out a bit.

A common task that I run into, both in my work life and in my personal life, revolves around programmatically downloading content from the interwebs. This little code example will illustrate how to use urllib2 to download a file and write/save the file contents locally. You may be saying to yourself, “Self, can’t I do this in my favorite web browser?” The answer is “YES”, but it’s a pain in the ass if you have more than 5 files you want to download.

Assume there is a set of images on your favorite website, and they are all named image1.jpg, image2.jpg, image3.jpg, etc. Now imagine there are 50 images using this naming convention. How do you download them all using Python, without struggling to do it one image at a time in your browser? Look below!

python

# Let's create a function that downloads a file, and saves it locally.
# This function accepts a file name, a read/write mode(binary or text),
# and the base url.

def stealStuff(file_name,file_mode,base_url):
	from urllib2 import Request, urlopen, URLError, HTTPError
	
	#create the url and the request
	url = base_url + file_name
	req = Request(url)
	
	# Open the url
	try:
		f = urlopen(req)
		print "downloading " + url
		
		# Open our local file for writing. The with block (Python 2.6+)
		# closes the file even if the write fails.
		with open(file_name, "w" + file_mode) as local_file:
			# Write the downloaded contents to our local file
			local_file.write(f.read())
		
	#handle errors
	except HTTPError, e:
		print "HTTP Error:",e.code , url
	except URLError, e:
		print "URL Error:",e.reason , url


# Set the range of images to 1-50. It says 51 because the
# range function stops just before its endpoint.
image_range = range(1,51)

# Iterate over image range
for index in image_range:
	
	base_url = 'http://www.techniqal.com/'
	# Create the file name based on the known pattern (image1.jpg, image2.jpg, ...)
	file_name = "image" + str(index) + ".jpg"
	# Now download the image. If these were text files,
	# or other ascii types, just pass an empty string
	# for the second param, e.g. stealStuff(file_name,'',base_url)
	stealStuff(file_name,"b",base_url)

That’s it. It not only reports on any errors it encountered while downloading, but think of all of the time you just saved… Really though, how important is your time to you if you’re reading this blog???
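
For anyone on Python 3, here is a rough sketch of the same trick. In Python 3 the urllib2 pieces live in urllib.request and urllib.error, so treat this as one possible port rather than the definitive way to do it. The function names build_image_urls and download_file are made up for this example, and the image1.jpg, image2.jpg naming pattern is the same assumed convention as above.

```python
# Python 3 sketch of the downloader above.
# build_image_urls and download_file are hypothetical names for this example.
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def build_image_urls(base_url, first, last, prefix="image", ext=".jpg"):
    """Return the URLs for prefix<first>ext through prefix<last>ext, inclusive."""
    # range() stops just before its endpoint, so add 1 to include `last`.
    return [base_url + prefix + str(i) + ext for i in range(first, last + 1)]


def download_file(url):
    """Save url into the current directory; return the file name, or None on error."""
    # Use the last path segment of the URL as the local file name.
    file_name = url.rsplit("/", 1)[-1]
    try:
        # urlopen runs first, so no local file is created if the request fails.
        # Both handles are closed when the with block exits.
        with urlopen(url) as response, open(file_name, "wb") as local_file:
            print("downloading " + url)
            local_file.write(response.read())
    except HTTPError as e:
        # HTTPError is a subclass of URLError, so it has to be caught first.
        print("HTTP Error:", e.code, url)
        return None
    except URLError as e:
        print("URL Error:", e.reason, url)
        return None
    return file_name
```

To grab all fifty images, loop over build_image_urls('http://www.techniqal.com/', 1, 50) and pass each URL to download_file. The file is always written in binary mode here, which works for images and plain text alike since the bytes are saved untouched.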

Time Flies

My wife and I just celebrated our 5th wedding anniversary. The whole “5 years” thing hit me, and caused me to reflect. If I look at any other 5-year snippet of my life, it doesn’t even start to compare. The last 5 years have been the best, and as good times usually go, they flew by.

To celebrate, Mary and I enjoyed a wedding gift from a friend. It took some willpower to wait five years, but it was worth it.

Thanks for the cool gift Jennifer.

And to all future wedding attendees, you should try this out for a gift:

Bottle of Champagne (for 1 year anniv.)

Bottle of Wine (for 5 year anniv.)

Bottle of Port (for 10 year anniv.)

It is a risky bet in a country with a 50% divorce rate, but at the very least your newly divorced friend may invite you over to drink it with them. Or have them accelerate the schedule (1 month, 6 months, 1 year)? Either way, it’s a cool idea.

Thanks for an awesome 5+ years Mary…

Lucas and the Laptop




Originally uploaded by Tarable1

Great pic from the Lijit 2nd birthday party.
If you look real close, you can even see that he is writing a blog post for the Lijit blog.
It’s about time that little monster started earning his keep.

What are the Odds?

Read this pretty hilarious article on nytimes.com.

Either the first guy inspired the second guy, or the second guy went crazy due to his job in IT (you have to be a little crazy just to get hired in this industry). Basically, two people thought it would be cool to climb up the NY Times building yesterday. I am pretty sure there is a larger than normal mass of crazy in New York right now. My favorite part is the quote from the police inspector in charge of the scene/arrest: “That’s the last climber today.” At least he isn’t over-committing. He totally gets the whole “under promise, over deliver” work ethic.

Windows Live Writer Tech Preview

I am always looking for a better blog posting tool. The built-in TinyMCE stuff on WordPress is cool, but there are still some barriers to actually making blog posts. Image uploading can be tedious (although better in 2.5).

I decided to try Windows Live Writer today. The main reason I even considered it was the cool image editing and transformation tools it sports. You can easily insert pictures and video, and then give the images custom borders (rounded corners, drop shadow, etc.).

The ability to do this editing inline makes it a valuable tool.

If I can keep it from crashing when switching from preview mode to edit mode, I may stick with it for a while. It is, after all, a tech preview, and I am stuck with a bastardized IE8 install on my win32 box. If you want to give it a try, you know what to do.

UPDATE:

In the event someone is having the same crashing issues I had, it could be the MS Script Debugger for Internet Explorer. You can disable script debugging at Tools->Internet Options->Advanced. Once the JavaScript errors stopped popping up, the Live Writer client would stay alive.

UPDATE 2:

The kind folks on the Windows Live Writer team let me know that this particular issue should be fixed in the next public release. Kudos to them for paying attention to their users, and reaching out when they didn’t have to. They have at least one convert, and I am recommending it to my blogging co-workers as well.

Posted from 1050 Walnut, Boulder, CO
