Python WWW Macro
Solution 1:
twill is almost exactly what the question asks for.
twill is a simple language that allows users to browse the Web from a command-line interface. With twill, you can navigate through Web sites that use forms, cookies, and most standard Web features.
twill supports automated Web testing and has a simple Python interface.
(pyparsing, mechanize, and BeautifulSoup are included with twill for convenience.)
A Python API example:
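from twill.commands import go, showforms, formclear, fv, submit

# Open the demo site, then follow a relative link to the form page.
go('http://issola.caltech.edu/~t/qwsgi/qwsgi-demo.cgi/')
go('./widgets')
showforms()                       # list the forms on the page

# Clear form 1, then fill in its fields: fv(form, field, value).
formclear('1')
fv("1", "name", "test")
fv("1", "password", "testpass")
fv("1", "confirm", "yes")
showforms()                       # show the form again with the new values

submit('0')                       # click submit button 0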
Solution 2:
Use mechanize. It can't execute JavaScript in a page, but other than that it's pretty good.
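For illustration, here is a minimal sketch of filling in and submitting a form with mechanize. The function name submit_form, its parameters, and the example URL and field names are hypothetical; it assumes the target page has at least one form and that the fields are plain text inputs.

```python
def submit_form(url, fields, form_index=0):
    """Open url, fill in the chosen form, submit it, and return the response body.

    Hypothetical helper: `fields` maps form field names to string values.
    """
    # Imported here so the sketch can be loaded even if mechanize is not installed.
    import mechanize

    browser = mechanize.Browser()
    browser.open(url)
    browser.select_form(nr=form_index)   # pick a form by its position on the page
    for name, value in fields.items():
        browser[name] = value            # set a text field by name
    response = browser.submit()
    return response.read()
```

A call might look like submit_form("http://example.com/login", {"name": "test", "password": "testpass"}), where the URL and field names are placeholders for whatever the real page uses.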
Solution 3:
Another thing to consider is writing your own script. It's actually not too tough once you get the hang of it, and since it avoids pulling in half a dozen huge libraries it might even be faster (though I haven't measured it). I use a web debugger called "Charles" to browse the websites that I want to scrape. It logs all outgoing and incoming HTTP traffic, and I use those records to reverse-engineer the query strings. Manipulating them in Python makes for quite speedy, flexible scraping.
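As a rough sketch of that approach using only the standard library: once the debugger shows which query parameters a page sends, the request can be rebuilt and replayed directly. The helper names build_url and fetch, the example URL, and the parameter names are all hypothetical, and the sketch assumes the captured request was a simple GET.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_url(base_url, params):
    """Append URL-encoded query parameters to base_url."""
    return base_url + "?" + urlencode(params)


def fetch(base_url, params, user_agent="Mozilla/5.0"):
    """Replay a GET request seen in the debugger and return the body as text."""
    request = Request(build_url(base_url, params),
                      headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Placeholder URL and parameters; substitute the ones captured in the log.
    print(build_url("http://example.com/search", {"q": "twill", "page": 2}))
```

Keeping the URL construction separate from the network call makes it easy to check that the rebuilt query string matches what the debugger logged before sending anything.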