00:10:00 | <SketchCow> | Here is why we can't have nice things. |
00:10:00 | <SketchCow> | I have 400 videotapes in my house, from GDC (Game Developers Conference) that I've been digitizing. |
00:11:00 | <SketchCow> | Now, what to do when they're done. Don't want to throw them out, don't want to return them because they have no space. They'd throw them out. |
00:11:00 | <SketchCow> | So I suggested Stamford University, which has a games archive and which I have worked with extensively. |
00:11:00 | <SketchCow> | So that was going on |
00:11:00 | <SketchCow> | Now it is not. |
00:12:00 | <SketchCow> | Why? Because Stamford wants GDC to sign a contract saying "We are fine giving you these tapes." |
00:12:00 | <chronomex> | goddamnit |
00:12:00 | <SketchCow> | GDC legal says "We never got authorization from these people to give away these tapes" |
00:12:00 | <SketchCow> | So now they go "Can you supply it to another archive?" |
00:12:00 | <SketchCow> | And I'm going "well, I can call them, but every legit archive wants SOMETHING saying 'thanks for the tapes'" |
00:12:00 | <SketchCow> | Anyway, so that's where we are. |
00:13:00 | <chronomex> | goddamnit |
00:13:00 | <SketchCow> | Regardless, I'm digitizing all the fucking tapes and they're all going into archive.org |
00:13:00 | <SketchCow> | So fuck everybody |
00:13:00 | <chronomex> | yep |
00:13:00 | <chronomex> | fuck em all, let god sort them out |
00:13:00 | <chronomex> | erm |
00:13:00 | <chronomex> | yeah |
00:13:00 | <SketchCow> | Fuck them all, let God type in the metadata |
00:14:00 | <chronomex> | dictation, motherfuckers |
00:14:00 | <chronomex> | TOTALLY METADATA GRADE |
00:14:00 | <joepie91> | lol |
00:15:00 | <joepie91> | SketchCow: set up your own physical archive :D |
00:15:00 | <SketchCow> | Wayyyy ahead of you |
00:15:00 | <SketchCow> | But my archive wants to give them away |
00:15:00 | <SketchCow> | Ha ha, I could totally.... |
00:15:00 | <SketchCow> | hahaha |
00:15:00 | <SketchCow> | I could sign a contract |
00:15:00 | <SketchCow> | Then turn around and give them to stamford |
00:15:00 | <chronomex> | hahahaha |
00:15:00 | <SketchCow> | and sign the contract |
00:15:00 | <chronomex> | cross-archive donation |
00:16:00 | <chronomex> | I like this |
00:16:00 | <SketchCow> | No, it means I take on the burden |
00:16:00 | <SketchCow> | OH NO |
00:16:00 | <SketchCow> | These things in my house stay in my house |
00:16:00 | <SketchCow> | fuck everybody |
00:17:00 | <chronomex> | fuck em all, let god sort them out |
00:17:00 | <SketchCow> | God uses RDF, he's fucked |
00:18:00 | <chronomex> | at least it's not xml-encoded asn.1 |
00:28:00 | <SketchCow> | Just so you can see what these videos look like: |
00:28:00 | <SketchCow> | http://archive.org/details/2004-gdc-deferred-shading-on-dx9-hardware-xbox |
00:28:00 | <SketchCow> | I'm uploading these very quickly. |
00:29:00 | <BlueMax> | time to shove off an email for JSTP I guess |
00:35:00 | <SketchCow> | Tabblo has gone 100% into Wayback |
00:35:00 | <SketchCow> | Take that, bitches |
00:35:00 | <BlueMax> | What about Webshots? :D |
00:35:00 | <SketchCow> | Webshots is partially in |
00:35:00 | <SketchCow> | But some previous ones have to be handled. |
00:36:00 | <SketchCow> | And I'm focusing on other stuff right now, stuff no longer up. |
00:36:00 | <BlueMax> | sorry, that was meant to be a joke. |
00:36:00 | <no2pencil> | is there a url for this file format project you posted about earlier? |
00:37:00 | <SketchCow> | http://www.archiveteam.org/index.php?title=Just_Solve_the_Problem_2012 |
00:47:00 | <BlueMax> | SketchCow, question for you: I assume the results of Just Solve The Problem will be laid out in a separate wiki (correct me if I'm wrong) - do we have a particular layout for each page yet? |
00:57:00 | <SketchCow> | No |
00:57:00 | <SketchCow> | That will happen very shortly |
00:57:00 | <SketchCow> | wiki is about to be set up this weekend. |
01:01:00 | <BlueMax> | good to know SketchCow |
01:09:00 | <DFJustin> | 14.6 gb avi fuck yeah |
02:04:00 | <SketchCow> | OK SELF-DIRECTED PROJECT |
02:04:00 | <SketchCow> | http://www.pummelvision.com/ |
02:04:00 | <SketchCow> | If you can figure out how to save it, let's save it. |
02:06:00 | <creativec> | What's this pummelvision supposed to be? |
02:06:00 | <creativec> | This video is just a bunch of what appears to be Facebook pictures... |
02:07:00 | <SketchCow> | Yeah |
02:07:00 | <SketchCow> | It's not impressive. |
02:07:00 | <SketchCow> | Someone wrote me and said "could you save it!!!!" |
02:08:00 | <SketchCow> | And it's like............. |
02:08:00 | <SketchCow> | .................no |
02:08:00 | <creativec> | heh |
02:09:00 | <joepie91> | http://techcrunch.com/2010/12/23/pummelvision/ |
02:10:00 | <creativec> | I would assume that it is unsavable if we don't have access to the source code...? |
02:10:00 | <joepie91> | I'm not sure what there is to save in the first place |
02:11:00 | <joepie91> | it used external sources |
02:13:00 | <creativec> | eh, it looks easily reproducible. I don't see a reason to save it. |
02:19:00 | <godane> | SketchCow: i grabbed www.apdl.co.uk today |
02:20:00 | <godane> | there is tons of demo ware and pd ware for risc os in these warc |
02:27:00 | <joepie91> | has oldversion.com ever been archived |
02:28:00 | <godane> | not really |
02:29:00 | <godane> | the Wayback Machine has its last snapshot from 2009 |
02:30:00 | <joepie91> | okay, so |
02:30:00 | <joepie91> | I'd like to archive it |
02:30:00 | <joepie91> | but the fuckers |
02:30:00 | <joepie91> | use javascript for the downloads |
02:30:00 | <joepie91> | so I need to figure out how to script wget-lua :P |
03:09:00 | <joepie91> | seriously? SERIOUSLY? |
03:09:00 | <joepie91> | these oldversion guys |
03:09:00 | <joepie91> | for fucks sake |
03:09:00 | <joepie91> | they really REALLY try to discourage crawling/archiving |
03:10:00 | <joepie91> | alard, SketchCow, whenever either of you gets here, is there a way to create warcs in python? |
03:11:00 | <balrog_> | joepie91: how are they doing so? |
03:11:00 | <balrog_> | oh, js... |
03:12:00 | <balrog_> | joepie91: there's a trick |
03:12:00 | <balrog_> | http://www.oldversion.com/main_download.php?sid=N |
03:12:00 | <balrog_> | and you get the file |
03:12:00 | <balrog_> | N seems to be sequential :D |
03:15:00 | <joepie91> | yeah, no |
03:15:00 | <joepie91> | 302s to the main page |
03:16:00 | <joepie91> | unless you've gone through the whole sequence of download pages |
03:16:00 | <joepie91> | :| |
03:16:00 | <joepie91> | @ balrog_ |
03:16:00 | <joepie91> | and I have nfi how to script that in lua |
03:17:00 | <balrog_> | can't you use regular expressions or bash or python? |
03:17:00 | <joepie91> | problem is |
03:17:00 | <joepie91> | can't use python in wget |
03:17:00 | <joepie91> | don't know how to make warcs in python |
03:17:00 | <joepie91> | :P |
03:17:00 | <joepie91> | can you see my issue? |
03:17:00 | <joepie91> | and regular expressions don't do much if you have to make certain page requests to be able to download the file in the first place |
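The "walk the download pages first" requirement described above can be sketched in Python. This is only a sketch under assumptions: it assumes the site gates the file on a cookie-based session (not confirmed in the log), and `make_session_opener`, `fetch_after_walk`, and the `step_urls` list are hypothetical names, since the actual page sequence is not given here.

```python
import urllib.request
from http.cookiejar import CookieJar


def make_session_opener():
    """Build an opener that keeps cookies between requests, like a browser."""
    return urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))


def fetch_after_walk(opener, step_urls, file_url):
    """Visit step_urls in order, then fetch file_url in the same session.

    Returns the file bytes, or None if the server redirected the final
    request somewhere else (for example, a 302 back to the main page).
    step_urls is a hypothetical list; the real sequence must be found by hand.
    """
    for url in step_urls:
        with opener.open(url) as resp:
            resp.read()  # drain the page so any session cookies get stored
    with opener.open(file_url) as resp:
        # urllib follows redirects, so compare the final URL to the request.
        if resp.geturl() != file_url:
            return None
        return resp.read()
```

If the gate turns out to depend on something other than cookies (a Referer header, a per-page token), the same loop still applies, but each request would need that extra piece added.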
03:17:00 | <balrog_> | ah, a dl timer |
03:17:00 | <balrog_> | bleh |
03:18:00 | <joepie91> | well no, not a timer per se |
03:18:00 | <balrog_> | yeah you may not be able to use warc here |
03:18:00 | <joepie91> | what's the format of a warc like? |
03:18:00 | <joepie91> | in simple terms |
03:18:00 | <balrog_> | you may have to hack up something involving jdownloader/slimrat/plowshare :| |
03:18:00 | <joepie91> | oh, I can write my own downloader, the warc thing is the only problem :P |
03:18:00 | <joepie91> | what I'm thinking of... |
03:18:00 | <joepie91> | is just writing a download script specifically for the downloads |
03:18:00 | <joepie91> | then wget-warcing the main site |
03:18:00 | <joepie91> | and afterwards modifying the warc to point to the files directly |
03:19:00 | <joepie91> | and adding the files |
03:19:00 | <joepie91> | but I don't know how modifiable a warc file is |
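On the "what's the format of a warc like" question: in simple terms, a WARC file is a series of records placed one after another, each being a block of text headers, a blank line, the payload bytes, and two CRLFs. Because records are simply concatenated, adding separately downloaded files usually means appending records, not editing existing ones. A minimal sketch using only the standard library follows; `build_warc_record` and `append_record` are hypothetical helper names, and a plain `resource` record is used for simplicity (wget's own WARCs store full HTTP `response` records, and some tools may also expect a `warcinfo` record or per-record gzip).

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri, payload,
                      content_type="application/octet-stream"):
    """Return one WARC/1.0 'resource' record as bytes."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        "WARC-Record-ID: <urn:uuid:%s>" % uuid.uuid4(),
        "WARC-Date: %s" % now,
        "WARC-Target-URI: %s" % target_uri,
        "Content-Type: %s" % content_type,
        "Content-Length: %d" % len(payload),
    ]
    # Header lines end in CRLF; a blank line separates headers from payload.
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    # Every record is terminated by two CRLFs.
    return head + payload + b"\r\n\r\n"


def append_record(path, record):
    """Append a record to an uncompressed .warc file."""
    with open(path, "ab") as f:
        f.write(record)
```

For example, `append_record("extra.warc", build_warc_record("http://example.com/file.bin", data))` would add one file under a placeholder URL; whether Wayback-style tooling accepts the result as-is is an assumption that should be checked against a real reader.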
03:24:00 | <joepie91> | anyway, time to sleep |
03:24:00 | <joepie91> | balrog_: thanks for the slimrat/plowshare stuff btw |
03:24:00 | <joepie91> | wasn't aware of its existence |
03:24:00 | <joepie91> | goodnight :P |
03:58:00 | <joepie91> | ugh I hate this - have to sleep, but not tired :( |
04:42:00 | <bsmith094> | joepie91: been there |
07:31:00 | <Nemo_bis> | oh wonderful, it's getting a habit http://www.us.archive.org/log_show.php?task_id=128637767 |
07:38:00 | <alard> | joepie91: To write warcs in Python, you have http://code.hanzoarchives.com/warc-tools (I've only used that for reading warcs, though). |
07:40:00 | <alard> | joepie91: There is no Wget-Lua documentation yet. You could look at examples, https://github.com/alard/wget-lua/tree/lua/lua-example and the recent *-grab projects, and in the Wget side of the Lua hooks: https://github.com/alard/wget-lua/blob/lua/src/luahooks.c . |
07:42:00 | <alard> | (And just ask if you have a question; most of the documentation is still in my head. You may be the first who writes a Wget-Lua script.) |
09:34:00 | <SketchCow> | http://www.dailydot.com/news/livejournal-shut-down-us-office/ |
09:35:00 | <C-Keen> | signs of decay? |
14:15:00 | <joepie91> | alard: will have a look, thanks |
14:16:00 | <joepie91> | hey, um, SketchCow, brainfart: have several people across the world accept old magazines/manuals/CDs/whatever and collectively digitize it |
14:17:00 | <joepie91> | several people across the world == lower shipping costs |
14:18:00 | <BlueMax> | main problem would be volunteers for this joepie91 |
14:18:00 | <joepie91> | ofc, but I can imagine that there are at least a few people that have a few spare hours of time where they're bored out of their skull |
14:19:00 | <joepie91> | so they may as well scan and categorize stuff :P |
14:19:00 | <joepie91> | (includes me) |
14:21:00 | <BlueMax> | fair enough, you may want to get a few more people to make it worthwhile |
19:53:00 | <SketchCow> | http://archive.org/details/archiveteam-umich-save |
19:53:00 | <SketchCow> | As you can see, now a pile of "WARC" versions, all of which will get into the wayback. |
19:56:00 | <godane> | i'm home today |
19:57:00 | <godane> | i uploaded two linux format isos earlier today |
19:57:00 | <godane> | now uploading a 3rd |
20:14:00 | <SketchCow> | http://archive.org/details/atariforcecomics-205 |
22:29:00 | <SketchCow> | I just proposed the "hand it to jason, jason will hand it to Stamford" approach |
22:29:00 | <SketchCow> | Artifact laundering. 21st century. |
22:30:00 | <BlueMax> | All you need is some form of cash involved |
22:36:00 | <DFJustin> | is Stamford like Harfurd http://dilbert.com/strips/comic/1994-03-15/ |
22:42:00 | <chronomex> | laundering++ |
22:53:00 | <joepie91> | SketchCow: do you have a second? |
22:54:00 | <joepie91> | preferably several :P |
22:57:00 | <BlueMax> | I was actually wondering when the JSTP Wiki was gonna get underway |
23:17:00 | <SketchCow> | I have occasional seconds. |
23:19:00 | <chronomex> | brb 3rds of cookie |
23:20:00 | <chronomex> | DFJustin: please allow me to introduce you to /fast/: http://dilbert.com/fast/1994-03-15/ |
23:28:00 | <joepie91> | SketchCow: whoop, missed your response - anyway, did you see my brainfart last night? regarding the accepting old materials by snail mail and digitizing them |
23:28:00 | <joepie91> | seems you already have some experience with that judging from the presentation about the lawsuit |