09:20:00<kennethre>is there any way to upload something to archive.org without creative commons?
09:21:00<DFJustin>sure
09:21:00<kennethre>i see, it's optional
09:27:00<kennethre>there's no generic 'data' category?
09:27:00<kennethre>has to be audio, movie, or text?
09:29:00<Coderjoe>you're using the form, aren't you?
09:29:00<kennethre>yes
09:29:00<kennethre>is there an api?
09:29:00<kennethre>sorry, i've never really investigated this before :)
09:29:00<Coderjoe>there are other categories, just not available through the web form
09:30:00<kennethre>ah excellent
09:30:00<Coderjoe>http://archive.org/help/abouts3.txt
09:30:00<kennethre>oh god, perfect
09:30:00<kennethre>thank you
09:32:00<kennethre>i'm building a 'blackbox' system for everything i ever create
09:32:00<kennethre>and the goal is for it to be as permanent as possible
09:33:00<Coderjoe>however, unless you are an admin, you can only upload to one of a few collections
09:33:00<kennethre>Coderjoe: wonder if i can get a collection added for myself
09:33:00<Coderjoe>(which the web form picked via the category you chose)
09:34:00<kennethre>that'd be ideal
09:49:00<kennethre>ideally i'll have a warc for everything too
09:49:00<kennethre>but we'll see
10:10:00<chronomex>Coderjoe: you can be added to the approve list for a collection, of course
10:27:00<Nemo_bis>mediatype can be set to anything by anyone
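
A minimal sketch of what an upload through the S3-like API discussed above might look like, assuming the header names and endpoint described in the linked abouts3.txt (`authorization: LOW key:secret`, `x-archive-auto-make-bucket`, `x-archive-meta-mediatype`, `x-archive-meta-collection`, `http://s3.us.archive.org`). The item identifier, file name, keys, and the `opensource` collection are placeholders and assumptions, not values from this log; check the current documentation before relying on any of them. The request is only built here, not sent.

```python
from urllib.request import Request

# Assumed endpoint for the S3-like API; verify against abouts3.txt.
S3_ENDPOINT = "http://s3.us.archive.org"


def build_upload_request(identifier, filename, body, access_key, secret_key,
                         mediatype="data", collection="opensource"):
    """Build (but do not send) a PUT request that uploads one file to an item.

    The mediatype is passed as item metadata, so it is not limited to the
    audio/movie/text choices offered by the web form.
    """
    headers = {
        # Assumed auth scheme: "LOW <accesskey>:<secret>".
        "authorization": f"LOW {access_key}:{secret_key}",
        # Ask the service to create the item if it does not exist yet.
        "x-archive-auto-make-bucket": "1",
        "x-archive-meta-mediatype": mediatype,
        # Non-admins can only target a few collections; this one is a guess.
        "x-archive-meta-collection": collection,
    }
    url = f"{S3_ENDPOINT}/{identifier}/{filename}"
    return Request(url, data=body, headers=headers, method="PUT")


# Hypothetical identifier, file name, and credentials.
req = build_upload_request("example-blackbox-item", "record.json", b"{}",
                           "ACCESSKEY", "SECRET")
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` with real credentials.
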
10:27:00<godane>i'm starting to hate the speed of ftp
10:27:00<Nemo_bis>godane: only now?
10:28:00<godane>it normally works fine
10:28:00<Nemo_bis>No. It doesn't.
10:28:00<godane>for me it does
10:28:00<godane>but every so often the speed becomes very slow
10:28:00<Nemo_bis>Maybe you're the only user left. https://archive.org/~tracey/mrtg/ftp.html
10:29:00<Nemo_bis>Every time a single other person tries to use it, you're both ruined. ;)
10:29:00<Famicoman>I'm using it
10:30:00<godane>i'm not that good with the scripting uploads to s3
10:30:00<Famicoman>I kept getting errors that the drive was full earlier
10:30:00<kennethre>is there anyone here i should bother for a 'kennethreitz' collection, or should i go through the normal process?
10:30:00<kennethre>/cc @chronomex
10:30:00<chronomex>hi
10:31:00<chronomex>I think underscor or SketchCow are the people to ask
10:31:00<kennethre>/cc underscor :)
11:40:00<godane>i think s3 is very slow too
11:41:00<godane>not just ftp
11:41:00<SketchCow>What does this collection have?
11:42:00<GLaDOS>WARCs of everything he's done.
12:14:00<Coderjoe>what the
12:14:00<Coderjoe>the ia donate page no longer has the 3-to-1 match blurb
12:15:00<ersi>That's unfortunate, because maybe there's a few holding out to the absolute last day for some reason
12:15:00<Coderjoe>the amounts reflect it, and the blog post about it says it goes to the 31st
12:16:00<Coderjoe>but the progress meter is gone
12:17:00<Famicoman>maybe the goal was reached?
12:18:00<ersi>It was lacking 17k yesterday
12:18:00<Famicoman>ah, doubtful then
12:55:00* SmileyG looks in
13:13:00<kennethre>SketchCow: i'm working on a continual archive of everything i create, including articles, tweets, photos, music, etc
13:13:00<kennethre>SketchCow: the plan is to have it back itself up to archive.org in case I have an untimely demise :)
13:18:00<kennethre>it's coming along quite nicely so far
13:18:00<kennethre>http://blackbox.kennethreitz.org/records/1e7f3c62-96e4-4be4-a26a-f62c61ce939d
13:18:00<kennethre>http://blackbox.kennethreitz.org/records/1e7f3c62-96e4-4be4-a26a-f62c61ce939d/download
14:03:00* Nemo_bis has 1200 tasks waiting for admin. :/
15:45:00<push>i think web archive should open up old 90s versions of sites, it sucks now that some domains seem to be totally gone due to a NEW robots.txt put on the active site?
15:45:00<ersi>bla bla bla whine old bla bla
15:45:00<ersi>It's been iterated over a billion times already.
15:47:00<push>ah sorry, didnt think about that
15:48:00<ersi>But I agree that it's unfortunate that some new owner of a domain can make the previous owner's data hidden in the Wayback Machine.
15:49:00<ersi>There's a lot of data public from what I know, look in the crawldata collection @ IA. It's not everything though, I think. And besides, the data will continue to exist - it's just hidden/darkened (until it's public again, if IA undarks or robots.txt goes away)
15:51:00<push>yeah, theres still a chance to see some of it some time later i guess
15:51:00<push>it hasnt been a huge thing or anything, only a few sites
15:51:00<ersi>Yeah, but it comes up so often it makes me almost angry every time it comes up
15:52:00<push>i have had a similar reaction :P
15:52:00<ersi>^_^
15:53:00<push>it's hard to solve though i would think, sometimes a legitimate owner wants to block the whole history and i reckon he should be able to
15:53:00<push>i think other times they dont even know about IA maybe
15:54:00<push>some have forbidden everything by default and it seems senseless
15:54:00<ersi>I know that the Wayback Machine does an HTTP GET on the robots.txt when it's going to serve something from a crawled domain - every time
15:54:00<push>ah
15:55:00<ersi>Maybe I'm wrong, but I have a faint memory of that from fiddling with the code and trying to set Wayback Machine up (http://github.com/internetarchive/wayback/)
15:57:00<push>guess it can also be tested, i have a couple old domains indexed i could set them up again and do before/after robots.txt
15:57:00<push>but it does feel that way
15:57:00<push>it was restrictive just earlier, a site is blocked and i was totally excited to see it
15:57:00<push>some very old site
15:57:00<push>brb
15:57:00<push>ehe
15:59:00<ersi>Yeah, sucks when you run into the problem
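
The before/after robots.txt test mentioned above can be approximated offline. This is a rough sketch using Python's standard `urllib.robotparser`; it is not the Wayback Machine's actual code. The `ia_archiver` user agent and the example URL are assumptions, and `is_blocked` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt, url, user_agent="ia_archiver"):
    """Return True if the given robots.txt text disallows url for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# "before": the old site had no restrictions at all.
before = ""
# "after": a new owner forbids everything by default.
after = "User-agent: *\nDisallow: /\n"

url = "http://example.com/old/page.html"
print("blocked before:", is_blocked(before, url))
print("blocked after: ", is_blocked(after, url))
```

If the live robots.txt is consulted on every playback request, as described above, then the "after" file alone would be enough to hide captures made while the "before" file was in place.
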
16:43:00<SketchCow>That's an interesting tactic, kennethre
16:43:00<kennethre>SketchCow: thanks, i like it more the longer i think about it
17:47:00<tef>push: archive should have old copies of robots.txt ?
19:25:00<balrog_>anyone here familiar with archiving yahoo groups?
19:25:00<balrog_>I found this tool: http://grabyahoogroup.sourceforge.net
20:12:00<balrog_>it's giving me error 500s though