#archiveteam<efnet> log for 2012-12-29


09:20:00	<kennethre>	is there any way to upload something to archive.org without creative commons?
09:21:00	<DFJustin>	sure
09:21:00	<kennethre>	i see, it's optional
09:27:00	<kennethre>	there's no generic 'data' category?
09:27:00	<kennethre>	has to be audio, movie, or text?
09:29:00	<Coderjoe>	you're using the form, aren't you?
09:29:00	<kennethre>	yes
09:29:00	<kennethre>	is there an api?
09:29:00	<kennethre>	sorry, i've never really investigated this before :)
09:29:00	<Coderjoe>	there are other categories, just not available through the web form
09:30:00	<kennethre>	ah excellent
09:30:00	<Coderjoe>	http://archive.org/help/abouts3.txt
09:30:00	<kennethre>	oh god, perfect
09:30:00	<kennethre>	thank you
09:32:00	<kennethre>	i'm building a 'blackbox' system for everything i ever create
09:32:00	<kennethre>	and the goal is for it to be as permanent as possible
09:33:00	<Coderjoe>	however, unless you are an admin, you can only upload to one of a few collections
09:33:00	<kennethre>	Coderjoe: wonder if i can get a collection added for myself
09:33:00	<Coderjoe>	(which the web form picked via the category you chose)
09:34:00	<kennethre>	that'd be ideal
09:49:00	<kennethre>	ideally i'll have a warc for everything too
09:49:00	<kennethre>	but we'll see
10:10:00	<chronomex>	Coderjoe: you can be added to the approve list for a collection, of course
10:27:00	<Nemo_bis>	mediatype can be set to anything by anyone
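
The upload path discussed above (the S3-like API from abouts3.txt, with a free-form mediatype) could be sketched roughly as follows. This is a minimal sketch, not an official client: the header names and the endpoint are as recalled from abouts3.txt and should be checked against that document, and the item name, file name, keys, the "data" mediatype, and the "opensource" collection are placeholder assumptions. The function only builds the request and sends nothing.

```python
import urllib.request

# Endpoint as given in abouts3.txt (verify against the current document).
S3_ENDPOINT = "http://s3.us.archive.org"


def build_upload_request(item, filename, data, access_key, secret,
                         mediatype="data", collection="opensource"):
    """Build (but do not send) a PUT request for archive.org's S3-like API.

    All argument values are hypothetical placeholders; "data" and
    "opensource" are assumed defaults, not values confirmed in this log.
    """
    headers = {
        # archive.org uses its own "LOW key:secret" authorization scheme.
        "authorization": f"LOW {access_key}:{secret}",
        # Ask the API to create the item (bucket) if it does not exist yet.
        "x-amz-auto-make-bucket": "1",
        # Item metadata travels as x-archive-meta-* headers.
        "x-archive-meta-mediatype": mediatype,
        "x-archive-meta-collection": collection,
    }
    url = f"{S3_ENDPOINT}/{item}/{filename}"
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")


if __name__ == "__main__":
    req = build_upload_request("example-item", "notes.txt", b"hello",
                               "ACCESS_KEY", "SECRET")
    print(req.get_method(), req.full_url)
    # To actually upload, one would pass req to urllib.request.urlopen().
```

Building the request separately from sending it makes it easy to inspect the headers before anything reaches archive.org.
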
10:27:00	<godane>	i'm starting to hate the speed of ftp
10:27:00	<Nemo_bis>	godane: only now?
10:28:00	<godane>	it normally works fine
10:28:00	<Nemo_bis>	No. It doesn't.
10:28:00	<godane>	for me it does
10:28:00	<godane>	but every so often the speed becomes very slow
10:28:00	<Nemo_bis>	Maybe you're the only user left. https://archive.org/~tracey/mrtg/ftp.html
10:29:00	<Nemo_bis>	Every time a single other person tries to use it, you're both ruined. ;)
10:29:00	<Famicoman>	I'm using it
10:30:00	<godane>	i'm not that good with the scripting uploads to s3
10:30:00	<Famicoman>	I kept getting errors that the drive was full earlier
10:30:00	<kennethre>	is there anyone here i should bother for a 'kennethreitz' collection, or should i go through the normal process?
10:30:00	<kennethre>	/cc @chronomex
10:30:00	<chronomex>	hi
10:31:00	<chronomex>	I think underscor or SketchCow are the people to ask
10:31:00	<kennethre>	/cc underscor :)
11:40:00	<godane>	i think s3 is very slow too
11:41:00	<godane>	not just ftp
11:41:00	<SketchCow>	What does this collection have?
11:42:00	<GLaDOS>	WARCs of everything he's done.
12:14:00	<Coderjoe>	what the
12:14:00	<Coderjoe>	the ia donate page no longer has the 3-to-1 match blurb
12:15:00	<ersi>	That's unfortunate, because maybe there's a few holding out to the absolute last day for some reason
12:15:00	<Coderjoe>	the amounts reflect it, and the blog post about it says it goes to the 31st
12:16:00	<Coderjoe>	but the progress meter is gone
12:17:00	<Famicoman>	maybe the goal was reached?
12:18:00	<ersi>	It was lacking 17k yesterday
12:18:00	<Famicoman>	ah, doubtful then
12:55:00		SmileyG looks in
13:13:00	<kennethre>	SketchCow: i'm working on a continual archive of everything i create, including articles, tweets, photos, music, etc
13:13:00	<kennethre>	SketchCow: the plan is to have it back itself up to archive.org in case I have an untimely demise :)
13:18:00	<kennethre>	it's coming along quite nicely so far
13:18:00	<kennethre>	http://blackbox.kennethreitz.org/records/1e7f3c62-96e4-4be4-a26a-f62c61ce939d
13:18:00	<kennethre>	http://blackbox.kennethreitz.org/records/1e7f3c62-96e4-4be4-a26a-f62c61ce939d/download
14:03:00		Nemo_bis has 1200 tasks waiting for admin. :/
15:45:00	<push>	i think web archive should open up old 90s versions of sites, it sucks now that some domains seem to be totally gone due to a NEW robots.txt put on the active site?
15:45:00	<ersi>	bla bla bla whine old bla bla
15:45:00	<ersi>	It's been iterated over a billion times already.
15:47:00	<push>	ah sorry, didnt think about that
15:48:00	<ersi>	But I agree that it's unfortunate that some new owner of a domain can make the previous owner's data hidden in the Wayback Machine.
15:49:00	<ersi>	There's a lot of data public for what I know, look in the crawldata collection @ IA. It's not everything though, I think. And besides, the data will continue to exist - it's just hidden/darkened (until it's public again, if IA undarks or robots.txt goes away)
15:51:00	<push>	yeah, theres still a chance to see some of it some time later i guess
15:51:00	<push>	it hasnt been a huge thing or anything, only a few sites
15:51:00	<ersi>	Yeah, but it comes up so often it makes me almost angry every time it comes up
15:52:00	<push>	i have had a similar reaction :P
15:52:00	<ersi>	^_^
15:53:00	<push>	it's hard to solve though i would think, sometimes a legitimate owner wants to block the whole history and i reckon he should be able to
15:53:00	<push>	i think other times they dont even know about IA maybe
15:54:00	<push>	some have forbidden everything by default and it seems senseless
15:54:00	<ersi>	I know that the Wayback Machine does an HTTP GET on the robots.txt when it's going to serve something from a crawled domain - every time
15:54:00	<push>	ah
15:55:00	<ersi>	Maybe I'm wrong, but I have a faint memory of that from fiddling with the code and trying to set Wayback Machine up (http://github.com/internetarchive/wayback/)
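
The behaviour described above (the live site's current robots.txt deciding whether old captures are shown) can be illustrated with a small offline sketch. This is not the Wayback Machine's code, and ersi himself was unsure of the details; it only shows how the same archived URL flips from visible to hidden once a new owner publishes a blanket Disallow. The user agent "ia_archiver" and the example URL are assumptions, and the robots.txt bodies are passed in as strings rather than fetched.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, user_agent="ia_archiver"):
    """Return True if the given robots.txt text forbids user_agent from url.

    robots_txt is the file's content as a string; nothing is fetched here.
    The default user agent is an assumption for illustration.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical robots.txt of the original site: nothing is disallowed.
old_robots = "User-agent: *\nDisallow:\n"
# Hypothetical robots.txt put up by a later owner: everything is disallowed.
new_robots = "User-agent: *\nDisallow: /\n"

if __name__ == "__main__":
    page = "http://example.com/old-page.html"
    print("blocked under old robots.txt:", is_blocked(old_robots, page))
    print("blocked under new robots.txt:", is_blocked(new_robots, page))
```

A before/after test on a real domain, as suggested in the channel, would follow the same shape with the live robots.txt fetched at request time.
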
15:57:00	<push>	guess it can also be tested, i have a couple old domains indexed i could set them up again and do before/after robots.txt
15:57:00	<push>	but it does feel that way
15:57:00	<push>	it was restrictive just earlier, a site is blocked and i was totally excited to see it
15:57:00	<push>	some very old site
15:57:00	<push>	brb
15:57:00	<push>	ehe
15:59:00	<ersi>	Yeah, sucks when you run into the problem
16:43:00	<SketchCow>	That's an interesting tactic, kennethre
16:43:00	<kennethre>	SketchCow: thanks, i like it more the longer i think about it
17:47:00	<tef>	push: archive should have old copies of robots.txt ?
19:25:00	<balrog_>	anyone here familiar with archiving yahoo groups?
19:25:00	<balrog_>	I found this tool: http://grabyahoogroup.sourceforge.net
20:12:00	<balrog_>	it's giving me error 500s though