The urllib2 module contains a number of utilities for simple access to data on the web.
For instance, the example below lets the user enter a series of URLs, and displays both the metadata and the contents of each document.
import sys
import urllib2
# read a url (strip() removes the trailing newline left by readline())
print "Enter the first url (or just enter to quit):"
chosenurl = sys.stdin.readline().strip()
# keep processing urls as long as the user enters them
while len(chosenurl) > 0:
    urlfile = urllib2.urlopen(chosenurl)
    print "The associated metadata is: ", urlfile.info()
    print "The file content is:"
    for nextline in urlfile:
        # the trailing comma avoids doubling the newline already in nextline
        print nextline,
    urlfile.close()
    print "Enter the next url (or just enter to quit):"
    chosenurl = sys.stdin.readline().strip()
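The loop above stops with a traceback as soon as a URL is mistyped or unreachable. The following is a minimal sketch of one way to guard the fetch. The helper name fetch_lines is hypothetical (it is not part of urllib2), and the import fallback is only there so the sketch also runs on Python 3, where these names live in urllib.request and urllib.error.

```python
try:
    # Python 2, as used in the rest of this section
    from urllib2 import urlopen, URLError
except ImportError:
    # Python 3 moved these names into urllib.request and urllib.error
    from urllib.request import urlopen
    from urllib.error import URLError


def fetch_lines(url):
    """Hypothetical helper: return the lines of url, or None on failure."""
    try:
        urlfile = urlopen(url)
    except (URLError, ValueError):
        # URLError: the server could not be reached or reported an error
        # ValueError: the string is not a usable url (e.g. no "http://")
        return None
    try:
        return urlfile.readlines()
    finally:
        urlfile.close()
```

Inside the loop, a result of None could be reported to the user before asking for the next URL, rather than letting the script exit.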
Similarly, here is a simple Python script that allows you to access a URL that requires basic authentication (i.e. the user is supposed to provide a username and password when requesting the URL).
#! /usr/bin/env python
import urllib2, sys, base64
# get the url desired, as well as the username/password to be used
# (strip() removes the newline that readline() leaves on each value)
print "Enter the url:"
chosenurl = sys.stdin.readline().strip()
print "Enter your username:"
username = sys.stdin.readline().strip()
print "Enter the password (warning: this is not encrypted)"
password = sys.stdin.readline().strip()
# request the url
request = urllib2.Request(chosenurl)
# respond to the authentication request with the name/pwd
# (b64encode, unlike encodestring, never inserts newlines)
base64string = base64.b64encode('%s:%s' % (username, password))
request.add_header("Authorization", "Basic %s" % base64string)
# open the resulting resource and read its contents
htmlFile = urllib2.urlopen(request)
htmlData = htmlFile.read()
print htmlData
htmlFile.close()
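The header-building step can be checked on its own, without contacting a server. The sketch below assumes a hypothetical helper named basic_auth_header and assumes the credentials are plain ASCII text. It only constructs the header value and does not send a request.

```python
import base64


def basic_auth_header(username, password):
    """Hypothetical helper: build the value of a Basic Authorization header."""
    # strip the trailing newline that sys.stdin.readline() leaves on input
    credentials = "%s:%s" % (username.strip(), password.strip())
    # b64encode works on bytes and adds no newline to its result
    token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
    return "Basic %s" % token


# the well-known example credentials from the HTTP authentication RFCs
print(basic_auth_header("Aladdin\n", "open sesame\n"))
# prints: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Because base64 is an encoding and not encryption, anyone who sees the header can recover the password, which is why the script above warns the user.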
Some of the other commonly used routines from urllib include:
Python and email
While the email module contains tools for handling more sophisticated message structures,
even the smtplib module contains a number of mail handling utilities, e.g. to send mail:
import smtplib
server = smtplib.SMTP('localhost')
# a blank line must separate the message headers from the body
server.sendmail('someonesending@somewhere', 'someonegetting@somewhereelse',
"""To: someonegetting@somewhereelse
From: someonesending@somewhere

The body of the email.
""")
server.quit()
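For comparison, here is a minimal sketch of building the same message with the email module instead of writing the headers by hand. The addresses are the placeholders used above, and the Subject line is a hypothetical addition for illustration. Nothing is sent here: the sketch only produces the message text.

```python
from email.mime.text import MIMEText

# the addresses are the placeholders from the example above
msg = MIMEText("The body of the email.")
msg["To"] = "someonegetting@somewhereelse"
msg["From"] = "someonesending@somewhere"
msg["Subject"] = "A test message"  # hypothetical subject, not in the original

# as_string() produces the headers, a blank line, then the body
message_text = msg.as_string()
print(message_text)
```

The resulting message_text could then be passed as the third argument to server.sendmail in place of the hand-written string.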
Python for CGI
Output is typically generated by a Python CGI script simply using the print statement, e.g.
print "Content-type: text/html"
print
print "<html><body>"
print "Hi!"
print "</body></html>"
Note that the empty line printed after the Content-type header marks the end of the headers.
For obtaining form data passed to the CGI script, life is simplest if we use the Python cgi module:
import cgi
Note: if, during debugging, you want to have the browser display
error information that results from bugs in your scripts,
include the following additional line:
import cgitb; cgitb.enable()
If, on the other hand, you'd like to have this information dumped to a file
instead of displayed on the browser, include this line instead:
import cgitb; cgitb.enable(display=0, logdir="/tmp")
Submitted form data is available via the FieldStorage class, so we can capture that information with statements like:
myformdata = cgi.FieldStorage()
To extract the data from form field "name", we can use statements like:
namevalue = myformdata["name"].value
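Note that sometimes a field contains multiple values. In that case
myformdata["name"] is itself a list and has no .value attribute, so it is
safer to fetch the data with getvalue and check whether the result is a list or not:
namevalue = myformdata.getvalue("name")
if isinstance(namevalue, list):
    # several values were submitted for this field
    firstvalue = namevalue[0]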
There are a huge number of other possibilities, but this should at least give you a start at Python CGI.
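The difference between single and multiple values can be tried out without a web server. The sketch below does not use cgi.FieldStorage. It swaps in the standard library's parse_qs function and applies it to a hypothetical query string, purely to illustrate how a repeated field yields several values.

```python
try:
    from urlparse import parse_qs  # Python 2.6+
except ImportError:
    from urllib.parse import parse_qs  # Python 3

# a hypothetical query string in which "name" was submitted twice
query = "name=alice&name=bob&city=Paris"
fields = parse_qs(query)

# unlike FieldStorage, parse_qs always maps each field to a list of values
for field in sorted(fields):
    values = fields[field]
    if len(values) > 1:
        print("%s has several values: %s" % (field, ", ".join(values)))
    else:
        print("%s = %s" % (field, values[0]))
```

Since parse_qs always returns lists, no isinstance check is needed there. With FieldStorage the check remains necessary, because a field with one value comes back as a plain string.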
Cookie handling in Python
Naturally Python also provides support for handling cookies. Here is a very simple example showing how to check for existing cookies, modify them if necessary, and create them if they don't exist.
#! /usr/bin/env python
# grab the modules we'll use
import os, cgi, Cookie, time
from Cookie import SimpleCookie
# Looking up, creating, and adjusting simple cookies
# ==================================================
# a cookie that counts how often the user has visited this page
# check to see if there is already a defined cookie
if 'HTTP_COOKIE' in os.environ:
    countercookie = SimpleCookie(os.environ['HTTP_COOKIE'])
# otherwise we'll define one here
else:
    countercookie = SimpleCookie()
# if a value already exists for a cookie named 'counter' grab that
# otherwise we want to set one with value 0
initialvalues = {'counter': 0 }
for key in initialvalues.keys():
    if key not in countercookie:
        countercookie[key] = initialvalues[key]
# increment the 'counter' component of the cookie by 1
countercookie['counter'] = int(countercookie['counter'].value) + 1
# if we want to set an expiry date as N seconds in the future,
# simply use the following (here with N = 86400 seconds, or 24 hours)
countercookie['counter']['expires'] = 86400
# send the updated cookie back to the browser: the Set-Cookie header
# must be printed before the blank line that ends the headers
print countercookie
# print our HTML form, specifying the number of visits
print "Content-type: text/html"
print
print " Visit number: "
print countercookie['counter'].value
print " "
# Aside: here's how to grab the current time
# if you're planning on using/manipulating that
currtime = time.gmtime(time.time())
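The counter logic can also be tried outside a web server by passing in the cookie text directly instead of reading HTTP_COOKIE. This is a minimal sketch: the helper name next_visit_cookie is hypothetical, and the import fallback is only there so it also runs on Python 3, where the Cookie module is named http.cookies.

```python
try:
    from Cookie import SimpleCookie  # Python 2
except ImportError:
    from http.cookies import SimpleCookie  # Python 3


def next_visit_cookie(cookie_header):
    """Hypothetical helper: return a cookie whose counter is one higher."""
    cookie = SimpleCookie(cookie_header)
    # start from 0 when the browser sent no 'counter' cookie
    if "counter" not in cookie:
        cookie["counter"] = 0
    cookie["counter"] = int(cookie["counter"].value) + 1
    return cookie


# simulate a browser that sent back "counter=4"
cookie = next_visit_cookie("counter=4")
header_line = cookie.output()
print(header_line)
# prints: Set-Cookie: counter=5
```

The line produced by output() is what a CGI script has to print ahead of the blank line that ends the headers. Otherwise the browser never receives the updated value.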