So I was talking with Decius the other day about the NYT. If you folks haven't noticed, you can access NYT articles through Google News without needing an account. Google simply adds a parameter, "partner=google", to the end of the NYT URL. I figured Decius and I could write a script that checks recommended memes for NYT stories and inserts this at the end, to allow people without accounts to read the stories.

However, this is not what the NYT checks for. If you go to a NYT article through a standard Google search, "partner=google" is not added to the URL, but you can still access the story. So how does the NYT do this? With the "Referer" field on a standard HTTP GET request.

So tonight I had an idea. WGET! It has a nice little option, "--referer=". Sure enough, you can grab NYT stories using WGET. For example, to read the Theory-vs-reality story, using:

wget 'http://www.nytimes.com/2004/02/23/opinion/23HERB.html&OQ=pagewantedQ3DprintQ26positionQ3D'

will save the login screen, but

wget --referer=http://www.google.com 'http://www.nytimes.com/2004/02/23/opinion/23HERB.html&OQ=pagewantedQ3DprintQ26positionQ3D'

will grab the page. (The quotes around the URL keep the shell from treating the "&" as a background operator.)

Yes yes yes, I know, as Decius told me: "Just get a freaky account dude."
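
For anyone who would rather script this than shell out to wget, here is a rough Python sketch of the same Referer trick. It only uses the standard library's urllib.request. The function names build_request and fetch are made up for this example, and the google.com default simply mirrors the wget command above. Whether a given site still lets such requests through is an assumption, not something this code can promise.

```python
import urllib.request


def build_request(url, referer="http://www.google.com"):
    """Return a GET request for `url` that carries a Referer header.

    `build_request` is a hypothetical helper name; the default referer
    copies the value passed to wget's --referer option in the post.
    """
    req = urllib.request.Request(url)
    # This is the header that wget's --referer option sets.
    req.add_header("Referer", referer)
    return req


def fetch(url, referer="http://www.google.com", timeout=10):
    """Send the request and return the raw response body as bytes."""
    req = build_request(url, referer)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


# Example use (performs a real network request, so it is left commented out):
# html = fetch("http://www.nytimes.com/2004/02/23/opinion/23HERB.html")
```

Splitting request construction from the actual fetch makes it easy to check which headers would be sent without touching the network.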