00:00:01dm4v quits [Client Quit]
00:00:56<Ryz>From my personal experience archiving those forums with that forum software, I consider these things absolute hell~
00:01:28<@JAA>Let me guess, Lithium?
00:01:35dm4v joins
00:01:37dm4v quits [Changing host]
00:01:37dm4v (dm4v) joins
00:03:04<@JAA>(Narrator: It was Lithium.)
00:03:26<Ryz>Yes, /that/ Lithium
00:04:08<@JAA>Please add it to Deathwatch anyway. Post IDs go to over 7 million if I'm reading it correctly.
00:06:21<@JAA>Hmm, or not.
00:06:27<@JAA>Either way, lots of content.
00:16:16<Ryz>It's not just the forums, it's also the blogs too D:
00:21:09<h2ibot>Ryz edited Deathwatch (+185, /* 2021 */ Added user forums and user blogs of…): https://wiki.archiveteam.org/?diff=47242&oldid=47241
00:21:25<Ryz>JAA ^
00:26:22<@arkiver>JAA: overall i'm not sure about attempting to archive a website with a location it's not actually reachable through
00:26:27<@arkiver>we should be careful with that.
00:39:44<Ryz>Any progress on archiving https://peertube.social/ ? It's gonna shut down in 7-8 days~
00:43:13<@JAA>arkiver: I agree that it needs to be done carefully. At the very least, it needs to be documented clearly. Here's the one I did a while back: https://archive.org/details/www.seniorcitizens.9f.com_20190514
00:44:12<@JAA>It's certainly no worse than the random https://www/ records due to search directives etc.
00:52:49<@arkiver>Ryz: going to check it out, thanks for the ping
01:03:02dm4v quits [Read error: Connection reset by peer]
01:04:07dm4v joins
01:04:09dm4v quits [Changing host]
01:04:09dm4v (dm4v) joins
01:33:17Stiletto quits [Ping timeout: 265 seconds]
02:01:48celestial quits [Ping timeout: 265 seconds]
02:03:06<Ryz>arkiver, there's also this: "October 15: The Tor Project will release a new version of the client that drops support for version 2 of Onion Services, effectively rendering them inaccessible once the network upgrades." - not sure how feasible that is though...
02:13:27celestial joins
03:04:21qw3rty__ joins
03:08:04qw3rty_ quits [Ping timeout: 252 seconds]
03:34:07HP_Archivist quits [Ping timeout: 265 seconds]
03:55:27qwertyasdfuiopghjkl joins
04:55:39HP_Archivist (HP_Archivist) joins
05:11:45HP_Archivist quits [Ping timeout: 265 seconds]
06:39:39Stiletto joins
06:56:17dominikh joins
06:59:44<dominikh>hey. I'm trying to find information on "A Smattering of Tweets", i.e. the archiveteam_twitter collection on archive.org. I can't find any reference to this project on archiveteam.org, nor find anything about its status anywhere on the internet. is this actually an archive team project, and why am I having such a hard time finding anything about it?
07:25:29<rewby>Ryz: That sounds like a logistical nightmare. Especially given the speed and latency of tor.
07:55:07<@OrIdow6>rewby: Not super familiar with Tor's design, but maybe we could reduce the number of hops going in?
07:55:23<@OrIdow6>https://tor.stackexchange.com/questions/1312/how-to-decrease-number-of-tor-hops
07:56:03<@OrIdow6>Something like that could be applied, don't think many people care about the workers being anonymous
07:57:19<@OrIdow6>I do worry that indiscriminately archiving a huge chunk of .onion could be like swinging a hammer into a space between some trees and hoping you don't hit a beehive
07:57:49<rewby>Yeeaahhh. There's that too
07:58:08<@OrIdow6>Several groups of people that could annoy
07:58:26<rewby>Yeah, I hadn't considered that aspect
07:58:39<rewby>Probably best from a legal and safety pov to just not...
08:01:05<@OrIdow6>It would be nice to get at least some of it, though
08:01:33<@OrIdow6>Manually curating it
08:35:22qwertyasdfuiopghjkl quits [Ping timeout: 244 seconds]
08:56:16qwertyasdfuiopghjkl joins
09:21:52KiyoshIWJ quits [Ping timeout: 244 seconds]
09:33:56<@OrIdow6>Also, on peertube.social: there's the problem of selection
09:37:23<@OrIdow6>Or whether to select at all
09:43:06<@OrIdow6>Well, looks like it only comes out to the TiB range
09:43:19<@OrIdow6>Very roughly
10:51:47pabs quits [Remote host closed the connection]
10:52:45pabs (pabs) joins
11:21:33wizards quits [Ping timeout: 258 seconds]
11:23:20wizards joins
12:21:22Ruthalas quits [Ping timeout: 252 seconds]
12:39:30Ruthalas (Ruthalas) joins
13:08:06HP_Archivist (HP_Archivist) joins
13:11:59h3ndr1k quits [Quit: ]
13:12:26h3ndr1k (h3ndr1k) joins
13:15:16Ruthalas0 (Ruthalas) joins
13:16:55Ruthalas quits [Ping timeout: 252 seconds]
13:19:55Ruthalas0 quits [Ping timeout: 265 seconds]
14:16:26second (second) joins
14:17:30sec^nd quits [Ping timeout: 258 seconds]
14:17:30second is now known as sec^nd
14:37:03Arcorann_ quits [Ping timeout: 258 seconds]
14:51:31Myself quits [Ping timeout: 252 seconds]
15:59:46<rewby>OrIdow6: Is peertube really that small?
16:04:10<@JAA>1488.4 GB as of last November: https://write.tedomum.net/peertubesocial/
16:04:21<rewby>Huh
16:05:03<rewby>Ah that's just a specific instance of peertube
16:05:05<russss>that's just that one instance, though, not all peertube instances
16:05:08<@JAA>Yeah
16:06:50<@JAA>It all sits on a Hetzner storage box according to https://peertube.social/about/instance
16:45:20HP_Archivist quits [Ping timeout: 265 seconds]
16:53:38HP_Archivist (HP_Archivist) joins
17:22:16HP_Archivist quits [Ping timeout: 258 seconds]
17:37:37HP_Archivist (HP_Archivist) joins
17:41:22Wingy quits [Remote host closed the connection]
17:42:19Wingy (Wingy) joins
17:52:33HP_Archivist quits [Ping timeout: 258 seconds]
18:40:36billy549 quits [Quit: ZNC - https://znc.in]
18:43:40HP_Archivist (HP_Archivist) joins
18:46:54superkuh_ quits [Quit: the neuronal action potential is an electrical manipulation of reversible abrupt phase changes in the lipid bilayer]
18:59:28billy549 (Billy549) joins
19:07:53<gazorpazorp>Is discogs.com (basically IMDb for music) archived or even easily archivable? There's a lot of useful info there that has helped me categorize music more easily
19:09:43<@JAA>The website wasn't well-archived last time I checked. Some things (e.g. edit history) are only visible to logged-in users.
19:09:57<@JAA>They do have regular data dumps, and I've archived them a couple times I think.
19:11:22<gazorpazorp>Like official data dumps? Nice
19:11:54<@JAA>Yep, monthly dumps of artist, label, master, and release data.
19:12:00<@JAA>http://data.discogs.com/
19:14:00<@JAA>(Doesn't work in the WBM because it rewrites the XML response, but the data is all there as of August 2020 via AB.)
19:14:32<@JAA>https://web.archive.org/web/202008*/https://discogs-data.s3-us-west-2.amazonaws.com/*
19:22:21<gazorpazorp>Awesome, thanks a lot!
19:28:34wizards_ joins
19:31:50wizards quits [Ping timeout: 258 seconds]
19:45:15HP_Archivist quits [Ping timeout: 258 seconds]
20:11:15superkuh joins
20:24:42qwertyasdfuiopghjkl quits [Client Quit]
20:27:00qwertyasdfuiopghjkl joins
20:27:40superkuh quits [Client Quit]
20:33:42HP_Archivist (HP_Archivist) joins
21:00:34HP_Archivist quits [Ping timeout: 252 seconds]
21:13:57<tzt>Also here, but access restricted: https://archive.org/details/discogs-web
21:31:26Wingy quits [Remote host closed the connection]
21:32:22Wingy (Wingy) joins
22:08:10<@arkiver>HCross: would it be possible to bring the archivebot tor pipeline back?
22:08:21<@arkiver>Ryz: regarding tor ^
22:08:42<@arkiver>looks like peertube has webseeds next to torrents
22:19:23<Ryz>Didn't JAA say there were complications...?
22:20:32<@JAA>With Tor? Nope, just needs to be set up correctly, and the data doesn't go into the normal collection or the WBM.
22:27:12qwertyasdfuiopghjkl quits [Ping timeout: 244 seconds]
22:39:17<h2ibot>FMecha edited 4chan (+336, b4k dropped /jp/ last November): https://wiki.archiveteam.org/?diff=47243&oldid=46337
22:44:58HP_Archivist (HP_Archivist) joins
23:20:52Arcorann_ joins
23:50:55lapki joins
23:54:02lapki quits [Client Quit]