Eric vild.org/blog/post... A proof of concept showing just what lengths you sometimes have to go to in order to access public domain data provided by the Gov in an open format. This is what I was working on earlier; the write-up shows the awkward steps it took to get at the data. At one point I had to disable Chrome's security measures. Yikes.
Dongsung Kim Nice writing. I was all sad face reading through the article, nevertheless the apology to John Doe made me laugh!
8y, 5w 1 reply
Eric Thanks, I'm slowly getting the hang of writing in a more reader-friendly way :)
8y, 5w
Martijn It almost sounds like it might have been easier to write a dedicated scraper, but really nice stuff there! Maybe you should slowly collect the high-resolution images and put them up on the Internet Archive if they aren't there yet? That would then allow people to torrent them etc. without ever burdening the original service again.
8y, 5w 1 reply
Eric I'm half tempted to do just a few records a night, but there are so many. I had it set at 10-second intervals. If it wasn't a Government site I might be a bit more gung-ho about it, but since it's 'the feds' I have reservations about intensive scraping. I didn't want to develop a dedicated scraper as such, since I wanted to see what could be done with readily available tools.
8y, 5w