18:03:00<SketchCow>Which gallery
18:04:00<schbiridi>http://www.ratedesi.com/albumrecentpics.php <- NSFW penises, nothing worse though
18:04:00<SketchCow>Yes, someone should immediately grab this.
18:04:00<SketchCow>Do we have an effective way to grab vbulletin?
18:05:00<schbiridi>nice, looks like it: http://archiveteam.org/index.php?title=VBulletin
18:06:00<schbiridi>that gallery is separate though and i think so are the user profiles.
18:06:00<schbiridi>gotta go, take care
18:06:00<SketchCow>Let's start with what makes sense.
18:06:00<SketchCow>WHO WANTS TO DOWNLOAD RATEDESI.COM
18:08:00<edsu>SketchCow: who listens to info@archive.org ?
18:08:00<SketchCow>It's a general mailbox that allows whoever's on duty to route questions to the right internal person.
18:09:00<SketchCow>xk_id: Be aware, you'll violate the TOS to do it. We're all for it, but your advisor needs to get in on it, sadly.
18:09:00<SketchCow>Unless of course you're doing this freestyle, then just do it
18:19:00<alard>SketchCow: https://github.com/ArchiveTeam/3frame-grab
18:19:00<balrog_>https://twitter.com/eevblog/status/294389522836381696
18:19:00<SketchCow>Great
18:23:00<alard>(Actually, that's not a good strategy to get 3frames. It doesn't include numbers.)
18:25:00<xk_id>SketchCow: no, I have a supervisor. Thank you
18:25:00<xk_id>I'll speak to him
18:25:00<xk_id>wow. How come scholars never discuss this issue?
18:25:00<xk_id>I've never seen it mentioned in any articles from my area
18:28:00<chronomex>anonymisation is a very hard problem, btw
18:28:00<chronomex>people keep messing it up in all sorts of ways
18:28:00<chronomex>viz., the aol searches dump
18:28:00<xk_id>In my case, it's pretty easy.
18:28:00<chronomex>what are you working with?
18:28:00<xk_id>because I need to crawl an online social network, and extract the social graph. No nodes will have usernames/names
18:29:00<xk_id>Just making my own. Finished coding the worker and I'm starting to look into distributing it over EC2
18:29:00<edsu>SketchCow: underscor is helping me out over in #internetarchive now, so I think I'm sorted
18:31:00<SketchCow>I saw and I saw him hijacking internal chat to get this going, so yes.
18:31:00<SketchCow>But info@archive.org would have worked too.
18:31:00<SketchCow>xk_id: Sociologists have a massive amount of mores and issues regarding this. And rulesets.
18:32:00<SketchCow>xk_id: The problem is that we moved into programmatic research, that is, the ability of programs and other observational items to go into general computing platforms, without those rules following. So it's easy to scrape something but TOSes get in the way.
18:34:00<xk_id>I'm really surprised scholarly literature does not mention this issue
18:37:00<alard>xk_id: You probably already know about these? http://snap.stanford.edu/data/
18:37:00xk_id nods
18:37:00<xk_id>I want to make my own dataset
18:37:00<xk_id>It is more worthwhile :)
18:38:00<xk_id>but SNAP (and the others) are my backup plan
18:38:00<alard>It's always good to do it yourself.
18:38:00<alard>There's also our http://archive.org/details/friendster-dataset-201107 and http://archive.org/details/friendster-groups-201107
18:39:00<xk_id>oh, cool. don't you need an account for accessing the friendster network?
18:41:00<alard>This is from before it changed into a gaming site.
18:42:00<xk_id>alard: that's a very interesting dataset. has it been used so far?
18:42:00<xk_id>I didn't know it's a gaming site now
18:42:00<alard>xk_id: Not that I know of. I tried to get it listed on that snap site, sent them an email but never got a response.
18:43:00<alard>They have a friendster dataset, but it's much smaller. (And that for a repository of "large" datasets. Ha.)
18:43:00<xk_id>academics are a bit cliquey too i think
18:49:00<SketchCow>Well yeah
18:58:00<alard>xk_id: What kind of research are you doing?
19:00:00<edsu>SketchCow: i will remember info@archive.org for the future, sorry if I subverted the normal procedure there
19:28:00<SketchCow>It's not a big deal, I'm just telling you the easiest way to ensure stuff gets handled. I subvert the process 12 times a day
19:42:00<chronomex>heheh
20:04:00<edsu>SketchCow: nice :)
21:34:00<godane>so all 2007 episodes of tekzilla are uploaded now
21:48:00<SketchCow>I've been integrating as fast as I can.
21:48:00<SketchCow>How's the new toy?
21:58:00<godane>good
21:58:00<godane>i have to use it in windows
21:58:00<godane>for some reason slitaz can't detect it
22:15:00<godane>so i'm also mirroring thefeed images from my thefeed articles dump
22:16:00<balrog_>what's the model again?
22:16:00<balrog_>Plustek OpticBook 3800?
22:17:00<balrog_>or 4800?
22:22:00<godane>4800
22:40:00<SketchCow>4800
22:40:00<SketchCow>godane: Go to http://www.hamrick.com/ and grab the trial software
22:42:00<godane>i have it
22:43:00<godane>i tried vuescan on linux and it didn't detect the scanner
22:44:00<godane>i think i just have to update my slitaz-tank distro
22:44:00<balrog_>I don't see any OpticBooks in http://www.hamrick.com/vuescan/vuescan.htm#plustek
23:18:00<SketchCow>Twitter is shutting down Posterous.
23:18:00<SketchCow>Archive Team ahoy
23:18:00<SketchCow>And I thought it was going to be a quiet fuckin' year
23:22:00<SketchCow>Well, explore anyway, no official date set yet.
23:22:00<SketchCow>http://posterous.uservoice.com/knowledgebase/articles/56001-acquisition-faq
23:23:00<chronomex>posterous? fuck
23:23:00<chronomex>I don't see anything about shutdown there
23:24:00<chronomex>I mean it hints at it
23:24:00<chronomex>but that was in march
23:26:00<SketchCow>http://socialnewsdaily.com/7309/posterous-not-accepting-new-accounts-twitter-reveals-nothing/
23:27:00<chronomex>weird.