#archiveteam<efnet> log for 2013-01-12


04:57:00	<omf_>	I just tried the tracker site and got a blank page. Is it down?
05:27:00	<S[h]O[r]T>	godane if you give me the g4tv url list ill download it all
08:17:00	<godane1>	S[h]O[r]T: https://archive.org/details/g4tv.com-video-url-list-1
08:17:00	<godane1>	i uploaded the list
10:55:00	<IR5611>	www.jizzday.com
11:04:00	<GLaDOS>	I wouldn't set jizzday up on an autoban list..
12:39:00	<SketchCow>	hi.
12:39:00	<SketchCow>	we are bringing back the JSTOR downloader
12:39:00	<SketchCow>	for aaron.
12:40:00	<SketchCow>	I believe alard and underscor have the code?
13:09:00	<alard>	SketchCow: I must have it somewhere, yes.
13:19:00	<SketchCow>	let's do it.
13:23:00	<Cameron_D>	yes, when I first heard about it I wondered if we'd ever completed the JSTOR stuff
13:23:00	<Cameron_D>	so lets do that
13:44:00	<SketchCow>	get the code, prep the bookmarklet. please, someone check if jstor changed tos to address what we are doing.
13:45:00	<SketchCow>	ill provide a non archive.org box access for this.
13:45:00	<SketchCow>	And after I nap a little, I will write verbiage for the page.
13:47:00	<GLaDOS>	(d) undertake any activity such as computer programs that automatically download or export Content, commonly known as web robots, spiders, crawlers, wanderers or accelerators that may interfere with, disrupt or otherwise burden the JSTOR server(s) or any third-party server(s) being used or accessed in connection with JSTOR
13:48:00	<GLaDOS>	The only part in the prohibited activities clause which could conflict.
13:48:00	<kennethre>	odd timing http://www.webpronews.com/jstor-opens-up-its-archive-kinda-sorta-2013-01
13:48:00	<Cameron_D>	Won't be automated, IIRC it was something that had to be manually triggered for each file
13:49:00	<GLaDOS>	AFAIK, we're ok
13:49:00	<Cameron_D>	At least, bookmarklet implies that
13:53:00	<SketchCow>	ok.
13:53:00	<SketchCow>	did it change? is that old or new info?
13:53:00	<GLaDOS>	Just fetched it
13:54:00	<GLaDOS>	Wait
13:54:00	<GLaDOS>	"The Content can be read online (but not printed or downloaded) as further described in Section 2.1 below."
13:55:00	<godane>	i'm capturing thefeed from g4tv.com and its taking a very long time
13:56:00	<godane>	this capture is without the images in it
13:56:00	<godane>	its readly 733mb
13:56:00	<godane>	*alreadly
13:57:00	<GLaDOS>	(f) download or print, or attempt to download or print an entire issue of a journal (unless such entire issue has been purchased through the Publisher Sales Service) or substantial portions of the entire run of a journal, except for the specific case in which the complete contents of a journal issue or a substantial portion of Textual Content (e.g. a series of scholarly essays) is relevant to the particular research
13:57:00	<GLaDOS>	(c) incorporate Content into an unrestricted database or website, except that authors or other Content creators may incorporate their Content into such sites with prior permission from the publisher and other applicable rights holders
13:57:00	<GLaDOS>	Any of these new?
14:00:00	<SketchCow>	check wayback
14:02:00	<alard>	https://twitter.com/JSTOR/status/174155323668574208
14:03:00	<GLaDOS>	Newest version in wayback is may 31
14:03:00	<GLaDOS>	http://web.archive.org/web/20120531065004/http://about.jstor.org/participate-jstor/individuals/early-journal-content
14:03:00	<GLaDOS>	Wait, mind mixed order of messages up
14:04:00	<GLaDOS>	Blocked by robots.txt
14:04:00	<Cameron_D>	http://www.jstor.org/robots.txt it won't be in wayback?
14:10:00	<SketchCow>	I don't want to move too rashly on this. I've done that in the past, not always for good.
14:11:00	<SketchCow>	a part of me wants to make it so it violates the agreement, so thousands of people commit the felony.
14:11:00	<SketchCow>	ok, rest
14:11:00	<Cameron_D>	Yeah, and looking at point (c) we may not be able to, although there are no past versions of the ToC to compare to
14:59:00	<balrog_>	I'm wondering if something exists that just stores any PDFs you're viewing in browser together with a little bit of metadata
15:40:00	<riordan>	Is this where OpAaronSW is going down?
15:47:00	<balrog_>	to some extent
17:27:00	<SketchCow>	I've put a slight waiting period on it to understand the best thing to do.
17:27:00	<SketchCow>	But I want his stuff in away from keyboard on archive.org, so we are definitely doing that.
17:45:00	<riordan>	SketchCow: totally - thank you man
18:18:00	<godane>	uploaded: http://archive.org/details/www.aaronsw.com-20130112-mirror
21:46:00	<SketchCow>	Hi.
21:46:00	<SketchCow>	OK, so.
21:49:00	<SketchCow>	#1. He deleted some sites, before hanging himself.
21:49:00	<SketchCow>	#2. Making a collection now.
21:50:00	<SketchCow>	#3. Soooooo angry still, but running out of people to blame
21:52:00	<SketchCow>	I've cooked up a plan, working it out with alard.
21:53:00	<SketchCow>	Here's the plan.
21:53:00	<SketchCow>	Bookmarklet, like the JSTOR downloader. You run it, and it downloads one document.
21:53:00	<SketchCow>	You write something about aaron when you do it.
21:53:00	<SketchCow>	And so it gets uploaded, with your memorial.
21:53:00	<SketchCow>	Then everyone commits a felony
21:53:00	<SketchCow>	And says their piece.
21:58:00	<chronomex>	nice
22:02:00	<dashcloud>	SketchCow: if the goal is to download everything, can't we just have something that would take a group of people months to complete (i.e, low profile enough to avoid detection until the end?)
22:06:00	<alard>	One document seems like a nice idea. So people can also leave their name and a message?
22:06:00	<alard>	(They'll still have to install the bookmarklet, even if there's only one document.)
22:28:00	<balrog_>	alard: I'd like to see something I mentioned above to be done
22:28:00	<balrog_>	basically like RECAP but for more than just PACER
22:41:00	<SketchCow>	yes
22:42:00	<balrog_>	it bothers me greatly when PDFs (and content in general) that I browsed when doing research even recently goes dark
22:42:00	<SketchCow>	dashcloud: goal is not torape jstor todeath
22:42:00	<SketchCow>	sorry, ipad
22:43:00	<balrog_>	often a lot of the older stuff is on very sketchy sites to begin with :/
22:43:00	<balrog_>	look at chip datasheets for example...
22:48:00	<philpem>	yeah, the EAB archive is one such site
22:48:00	<philpem>	bloody huge collection of databooks and so on, sitting behind someone's cable modem.
22:48:00	<philpem>	if I had the details of the guy who ran it, I'd offer to send hima a
22:49:00	<philpem>	*him a Peli hardcase and a bunch of hard drives in exchange for a copy.
22:49:00	<SketchCow>	n 20
22:51:00	<SketchCow>	OK, so, this is what I would like.
22:51:00	<SketchCow>	1. JSTOR bookmarklet. You add it, click it, and it downloads the article, asking you for a message about aaron.
22:52:00	<SketchCow>	2. If someone has a virtual instance alard can use, I'd like you to coordinate with him. He has a lot done.
22:52:00	<SketchCow>	3. When the bookmarklet is used again, banner thanking people, and then a link to the Wikipedia article on Aaron.
22:52:00	<SketchCow>	Make sense?
22:58:00	<fault>	I've got some server capacity that can be used
23:01:00	<fault>	Send me a message if you need somewhere to dump it, I can set up nginx/cgi, whatever stack you need
23:05:00	<alard>	Actually, it's almost bedtime for me. I have little time tomorrow. So if there's anyone who wants to take over, please do.
23:05:00	<alard>	I've done the following so far:
23:06:00	<alard>	There's a bookmarklet that does a form POST with the PDF and the message to a script somewhere. What's needed is a server-side thing that receives the POST data, stores it and adds it to the memorial page.
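The server-side piece alard describes (receive the POSTed PDF and message, store them, add them to the memorial page) could be sketched roughly as below. This is only an illustration, not the code alard wrote: the function names `store_submission` and `render_memorial`, the `entries.jsonl` log file, and the hash-based file naming are all assumptions, and the HTTP layer that would parse the form POST is left out.

```python
import hashlib
import html
import json
from pathlib import Path


def store_submission(base_dir, pdf_bytes, message):
    """Save one uploaded PDF plus its message; return the stored file name.

    Hypothetical helper: the layout (one directory, one JSON-lines log)
    is an assumption made for this sketch.
    """
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)
    # Cheap sanity check: PDF files begin with the bytes "%PDF".
    if not pdf_bytes.startswith(b"%PDF"):
        raise ValueError("upload does not look like a PDF")
    # Name the file by content hash so different uploads do not collide.
    name = hashlib.sha256(pdf_bytes).hexdigest()[:16] + ".pdf"
    (base / name).write_bytes(pdf_bytes)
    # Append one JSON record per submission to a simple log file.
    with open(base / "entries.jsonl", "a", encoding="utf-8") as log:
        log.write(json.dumps({"file": name, "message": message}) + "\n")
    return name


def render_memorial(base_dir):
    """Build the memorial list as HTML from the stored entries."""
    log_path = Path(base_dir) / "entries.jsonl"
    if not log_path.exists():
        return "<ul>\n</ul>"
    items = []
    for line in log_path.read_text(encoding="utf-8").splitlines():
        entry = json.loads(line)
        # Escape user text so a message cannot inject markup into the page.
        items.append(
            "<li>%s (%s)</li>"
            % (html.escape(entry["message"]), html.escape(entry["file"]))
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

A web handler (CGI script, WSGI app, or similar) would call `store_submission` with the uploaded bytes and the message field, then serve the output of `render_memorial` as the memorial page.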
23:10:00	<chronomex>	this seems like a fit for tracker.archiveteam.org
23:13:00	<SketchCow>	Who can take over?
23:53:00	<Nemo_bis>	Some warrior instances getting killed for not enough memory.
23:54:00	<Nemo_bis>	Ah, looks like I lost a user of which I downloaded some 10-15 GiB.
