| 00:00:01 | | dm4v quits [Client Quit] |
| 00:00:56 | <Ryz> | From my personal experience archiving those forums with that forum software, I consider these things absolute hell~ |
| 00:01:28 | <@JAA> | Let me guess, Lithium? |
| 00:01:35 | | dm4v joins |
| 00:01:37 | | dm4v is now authenticated as dm4v |
| 00:01:37 | | dm4v quits [Changing host] |
| 00:01:37 | | dm4v (dm4v) joins |
| 00:03:04 | <@JAA> | (Narrator: It was Lithium.) |
| 00:03:26 | <Ryz> | Yes, /that/ Lithium |
| 00:04:08 | <@JAA> | Please add it to Deathwatch anyway. Post IDs go to over 7 million if I'm reading it correctly. |
| 00:06:21 | <@JAA> | Hmm, or not. |
| 00:06:27 | <@JAA> | Either way, lots of content. |
| 00:16:16 | <Ryz> | It's not just the forums, it's the blogs too D: |
| 00:21:09 | <h2ibot> | Ryz edited Deathwatch (+185, /* 2021 */ Added user forums and user blogs of…): https://wiki.archiveteam.org/?diff=47242&oldid=47241 |
| 00:21:25 | <Ryz> | JAA ^ |
| 00:26:22 | <@arkiver> | JAA: overall i'm not sure about attempting to archive a website with a location it's not actually reachable through |
| 00:26:27 | <@arkiver> | we should be careful with that. |
| 00:39:44 | <Ryz> | Any progress on archiving https://peertube.social/ ? It's gonna shut down in 7-8 days~ |
| 00:43:13 | <@JAA> | arkiver: I agree that it needs to be done carefully. At the very least, it needs to be documented clearly. Here's the one I did a while back: https://archive.org/details/www.seniorcitizens.9f.com_20190514 |
| 00:44:12 | <@JAA> | It's certainly no worse than the random https://www/ records due to search directives etc. |
| 00:52:49 | <@arkiver> | Ryz: going to check it out, thanks for the ping |
| 01:03:02 | | dm4v quits [Read error: Connection reset by peer] |
| 01:04:07 | | dm4v joins |
| 01:04:09 | | dm4v is now authenticated as dm4v |
| 01:04:09 | | dm4v quits [Changing host] |
| 01:04:09 | | dm4v (dm4v) joins |
| 01:33:17 | | Stiletto quits [Ping timeout: 265 seconds] |
| 02:01:48 | | celestial quits [Ping timeout: 265 seconds] |
| 02:03:06 | <Ryz> | arkiver, there's also this: "October 15: The Tor Project will release a new version of the client that drops support for version 2 of Onion Services, effectively rendering them inaccessible once the network upgrades." - not sure how feasible that is though... |
| 02:13:27 | | celestial joins |
| 03:04:21 | | qw3rty__ joins |
| 03:08:04 | | qw3rty_ quits [Ping timeout: 252 seconds] |
| 03:34:07 | | HP_Archivist quits [Ping timeout: 265 seconds] |
| 03:55:27 | | qwertyasdfuiopghjkl joins |
| 04:55:39 | | HP_Archivist (HP_Archivist) joins |
| 05:11:45 | | HP_Archivist quits [Ping timeout: 265 seconds] |
| 06:39:39 | | Stiletto joins |
| 06:56:17 | | dominikh joins |
| 06:59:44 | <dominikh> | hey. I'm trying to find information on "A Smattering of Tweets", i.e. the archiveteam_twitter collection on archive.org. I can't find any reference to this project on archiveteam.org, nor find anything about its status anywhere on the internet. is this actually an archive team project, and why am I having such a hard time finding anything about it? |
| 07:25:29 | <rewby> | Ryz: That sounds like a logistical nightmare. Especially given the speed and latency of tor. |
| 07:55:07 | <@OrIdow6> | rewby: Not super familiar with Tor's design, but maybe we could reduce the number of hops going in? |
| 07:55:23 | <@OrIdow6> | https://tor.stackexchange.com/questions/1312/how-to-decrease-number-of-tor-hops |
| 07:56:03 | <@OrIdow6> | Something like that could be applied, don't think many people care about the workers being anonymous |
| 07:57:19 | <@OrIdow6> | I do worry that indiscriminately archiving a huge chunk of .onion could be like swinging a hammer into a space between some trees and hoping you don't hit a beehive |
| 07:57:49 | <rewby> | Yeeaahhh. There's that too |
| 07:58:08 | <@OrIdow6> | Several groups of people that could annoy |
| 07:58:26 | <rewby> | Yeah, I hadn't considered that aspect |
| 07:58:39 | <rewby> | Probably best from a legal and safety POV to just not... |
| 08:01:05 | <@OrIdow6> | It would be nice to get at least some of it, though |
| 08:01:33 | <@OrIdow6> | Manually curating it |
| 08:35:22 | | qwertyasdfuiopghjkl quits [Ping timeout: 244 seconds] |
| 08:56:16 | | qwertyasdfuiopghjkl joins |
| 09:21:52 | | KiyoshIWJ quits [Ping timeout: 244 seconds] |
| 09:33:56 | <@OrIdow6> | Also, on peertube.social: there's the problem of selection |
| 09:37:23 | <@OrIdow6> | Or whether to select at all |
| 09:43:06 | <@OrIdow6> | Well, looks like it only comes out to the TiB range |
| 09:43:19 | <@OrIdow6> | Very roughly |
| 10:51:47 | | pabs quits [Remote host closed the connection] |
| 10:52:45 | | pabs (pabs) joins |
| 11:21:33 | | wizards quits [Ping timeout: 258 seconds] |
| 11:23:20 | | wizards joins |
| 12:21:22 | | Ruthalas quits [Ping timeout: 252 seconds] |
| 12:39:30 | | Ruthalas (Ruthalas) joins |
| 13:08:06 | | HP_Archivist (HP_Archivist) joins |
| 13:11:59 | | h3ndr1k quits [Quit: ] |
| 13:12:26 | | h3ndr1k (h3ndr1k) joins |
| 13:15:16 | | Ruthalas0 (Ruthalas) joins |
| 13:16:55 | | Ruthalas quits [Ping timeout: 252 seconds] |
| 13:19:55 | | Ruthalas0 quits [Ping timeout: 265 seconds] |
| 14:16:26 | | second (second) joins |
| 14:17:30 | | sec^nd quits [Ping timeout: 258 seconds] |
| 14:17:30 | | second is now known as sec^nd |
| 14:37:03 | | Arcorann_ quits [Ping timeout: 258 seconds] |
| 14:51:31 | | Myself quits [Ping timeout: 252 seconds] |
| 15:59:46 | <rewby> | OrIdow6: Is peertube really that small? |
| 16:04:10 | <@JAA> | 1488.4 GB as of last November: https://write.tedomum.net/peertubesocial/ |
| 16:04:21 | <rewby> | Huh |
| 16:05:03 | <rewby> | Ah that's just a specific instance of peertube |
| 16:05:05 | <russss> | that's just that one instance, though, not all peertube instances |
| 16:05:08 | <@JAA> | Yeah |
| 16:06:50 | <@JAA> | It all sits on a Hetzner storage box according to https://peertube.social/about/instance |
| 16:45:20 | | HP_Archivist quits [Ping timeout: 265 seconds] |
| 16:53:38 | | HP_Archivist (HP_Archivist) joins |
| 17:22:16 | | HP_Archivist quits [Ping timeout: 258 seconds] |
| 17:37:37 | | HP_Archivist (HP_Archivist) joins |
| 17:41:22 | | Wingy quits [Remote host closed the connection] |
| 17:42:19 | | Wingy (Wingy) joins |
| 17:52:33 | | HP_Archivist quits [Ping timeout: 258 seconds] |
| 18:40:36 | | billy549 quits [Quit: ZNC - https://znc.in] |
| 18:43:40 | | HP_Archivist (HP_Archivist) joins |
| 18:46:54 | | superkuh_ quits [Quit: the neuronal action potential is an electrical manipulation of reversible abrupt phase changes in the lipid bilayer] |
| 18:59:28 | | billy549 (Billy549) joins |
| 19:07:53 | <gazorpazorp> | Is discogs.com (basically IMDb for music) archived or even easily archivable? There's a lot of useful info there that has helped me categorize music more easily |
| 19:09:43 | <@JAA> | The website wasn't well-archived last time I checked. Some things (e.g. edit history) are only visible to logged-in users. |
| 19:09:57 | <@JAA> | They do have regular data dumps, and I've archived them a couple times I think. |
| 19:11:22 | <gazorpazorp> | Like official data dumps? Nice |
| 19:11:54 | <@JAA> | Yep, monthly dumps of artist, label, master, and release data. |
| 19:12:00 | <@JAA> | http://data.discogs.com/ |
| 19:14:00 | <@JAA> | (Doesn't work in the WBM because it rewrites the XML response, but the data is all there as of August 2020 via AB.) |
| 19:14:32 | <@JAA> | https://web.archive.org/web/202008*/https://discogs-data.s3-us-west-2.amazonaws.com/* |
| 19:22:21 | <gazorpazorp> | Awesome, thanks a lot! |
| 19:28:34 | | wizards_ joins |
| 19:31:50 | | wizards quits [Ping timeout: 258 seconds] |
| 19:45:15 | | HP_Archivist quits [Ping timeout: 258 seconds] |
| 20:11:15 | | superkuh joins |
| 20:24:42 | | qwertyasdfuiopghjkl quits [Client Quit] |
| 20:27:00 | | qwertyasdfuiopghjkl joins |
| 20:27:40 | | superkuh quits [Client Quit] |
| 20:33:42 | | HP_Archivist (HP_Archivist) joins |
| 21:00:34 | | HP_Archivist quits [Ping timeout: 252 seconds] |
| 21:13:57 | <tzt> | Also here, but access restricted: https://archive.org/details/discogs-web |
| 21:31:26 | | Wingy quits [Remote host closed the connection] |
| 21:32:22 | | Wingy (Wingy) joins |
| 22:08:10 | <@arkiver> | HCross: would it be possible to bring the archivebot tor pipeline back? |
| 22:08:21 | <@arkiver> | Ryz: regarding tor ^ |
| 22:08:42 | <@arkiver> | looks like peertube has webseeds next to torrents |
| 22:19:23 | <Ryz> | Didn't JAA say there were complications...? |
| 22:20:32 | <@JAA> | With Tor? Nope, just needs to be set up correctly, and the data doesn't go into the normal collection or the WBM. |
| 22:27:12 | | qwertyasdfuiopghjkl quits [Ping timeout: 244 seconds] |
| 22:39:17 | <h2ibot> | FMecha edited 4chan (+336, b4k dropped /jp/ last November): https://wiki.archiveteam.org/?diff=47243&oldid=46337 |
| 22:44:58 | | HP_Archivist (HP_Archivist) joins |
| 23:20:52 | | Arcorann_ joins |
| 23:50:55 | | lapki joins |
| 23:54:02 | | lapki quits [Client Quit] |