Eric A proof of concept showing just what lengths you sometimes have to go to in order to access public domain data provided by the Gov in an open format. This is what I was working on earlier; the write-up shows the awkward steps it took to get at the data. At one point I had to disable Chrome's security measures. Yikes.
Dongsung Kim Nice writing. I was all sad face reading through the article, but the apology to John Doe still made me laugh!
Eric Thanks, I'm slowly getting the hang of writing in a more reader-friendly way :)
Martijn It almost sounds like it might have been easier to write a dedicated scraper, but really nice stuff there! Maybe you should slowly collect the high-resolution images and put them up on the Internet Archive if they aren't there yet? That would then allow people to torrent them etc. without ever burdening the original service again.
Eric I'm half tempted to do just a few records a night, but there are so many... I had it set at 10-second intervals. If it wasn't a Government site I might be a bit more gung-ho about it, but since it's 'the feds' I have reservations about intensive scraping. I didn't want to develop a dedicated scraper as such, since I wanted to see what could be done with readily available tools.
