| 01:01:59 | | HP_Archivist (HP_Archivist) joins |
| 02:17:39 | <anarcat> | https://www.vice.com/en/article/xgwqgw/facebooks-powerful-large-language-model-leaks-online-4chan-llama |
| 02:56:00 | | qwertyasdfuiopghjkl quits [Ping timeout: 265 seconds] |
| 03:00:14 | | shreyasminocha quits [Ping timeout: 245 seconds] |
| 03:00:14 | | thehedgeh0g quits [Ping timeout: 245 seconds] |
| 03:00:14 | | Kevin quits [Client Quit] |
| 03:00:18 | | shreyasminocha (shreyasminocha) joins |
| 03:00:18 | | thehedgeh0g (mrHedgehog0) joins |
| 03:04:32 | | njha1 quits [Ping timeout: 251 seconds] |
| 03:04:32 | | @AlsoJAA quits [Ping timeout: 251 seconds] |
| 03:04:32 | | Jonimus quits [Ping timeout: 251 seconds] |
| 03:04:32 | | FalconK quits [Ping timeout: 251 seconds] |
| 03:04:32 | | kpcyrd quits [Ping timeout: 251 seconds] |
| 03:04:42 | | AlsoJAA (JAA) joins |
| 03:04:42 | | @ChanServ sets mode: +o AlsoJAA |
| 03:04:42 | | FalconK (FalconK) joins |
| 03:04:43 | | kpcyrd (kpcyrd) joins |
| 03:04:51 | | Jonimus joins |
| 03:06:01 | | njha1 joins |
| 03:09:50 | | Arcorann (Arcorann) joins |
| 03:48:57 | | pabs quits [Quit: Don't rest until all the world is paved in moss and greenery.] |
| 03:51:00 | | pabs (pabs) joins |
| 04:30:32 | | hackbug quits [Remote host closed the connection] |
| 04:32:50 | | hackbug (hackbug) joins |
| 05:05:02 | | systwi_ (systwi) joins |
| 05:05:37 | | systwi quits [Ping timeout: 252 seconds] |
| 05:06:10 | | Terbium quits [Ping timeout: 252 seconds] |
| 05:15:27 | | Terbium joins |
| 05:25:09 | <pabs> | https://www.livescience.com/lost-georges-lemaitre-interview-recovered |
| 05:27:03 | | systwi_ is now known as systwi |
| 05:30:01 | <pabs> | is there a way to archive youtube? |
| 05:36:54 | <pabs> | "Tell HN: Freenom (the operator of .tk, .ml, .ga, .cf, .gq TLDs) is falling apart" https://news.ycombinator.com/item?id=34194555 |
| 05:37:29 | <pabs> | time for a new archiving project? |
| 05:38:11 | <pabs> | also https://krebsonsecurity.com/2023/03/sued-by-meta-freenom-halts-domain-registrations/ https://news.ycombinator.com/item?id=35062806 |
| 05:52:22 | | jamesatjaminit quits [Read error: Connection reset by peer] |
| 05:52:22 | | jamesatjaminit_ (jamesatjaminit) joins |
| 05:54:48 | | Pichu0202 quits [Remote host closed the connection] |
| 05:58:15 | | Pichu0102 joins |
| 05:58:15 | | Pichu0102 is now authenticated as Pichu0102 |
| 06:01:29 | | hitgrr8 joins |
| 06:23:11 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 06:49:42 | | Island quits [Read error: Connection reset by peer] |
| 07:13:16 | | Gereon62009 (Gereon) joins |
| 07:15:58 | | Gereon6200 quits [Ping timeout: 252 seconds] |
| 07:15:58 | | Gereon62009 is now known as Gereon6200 |
| 07:33:34 | <thuban> | pabs: #youtubearchive |
| 08:03:59 | | raxxy-137409 quits [Quit: raxxy-137409] |
| 08:06:04 | | raxxy-137409 joins |
| 08:08:22 | <pabs> | thanks |
| 08:51:07 | | thuban quits [Ping timeout: 252 seconds] |
| 08:58:42 | | LeGoupil joins |
| 09:04:40 | | thuban joins |
| 09:08:05 | | Jake quits [Client Quit] |
| 09:08:20 | | Jake (Jake) joins |
| 10:08:00 | <pabs> | what's the difference between #youtubearchive and #down-the-tube ? |
| 10:20:20 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 11:06:05 | <dieserniko> | I think the first one was for archiving the videos themselves and the second one was archiving metadata in response to the removal of the dislike button |
| 12:08:43 | | Jake quits [Client Quit] |
| 12:08:43 | | LeGoupil quits [Remote host closed the connection] |
| 12:08:46 | | Jake (Jake) joins |
| 12:08:49 | | LeGoupil1 joins |
| 12:11:12 | | LeGoupil1 is now known as LeGoupil |
| 12:52:34 | | Arcorann quits [Ping timeout: 252 seconds] |
| 13:51:55 | | hitgrr8 quits [Client Quit] |
| 14:06:07 | <pabs> | ah |
| 14:26:04 | | lennier1 quits [Ping timeout: 252 seconds] |
| 14:27:00 | | lennier1 (lennier1) joins |
| 14:47:05 | | HackMii quits [Remote host closed the connection] |
| 14:47:42 | | HackMii (hacktheplanet) joins |
| 14:48:37 | | Gereon6200 quits [Ping timeout: 252 seconds] |
| 15:02:56 | | thuban quits [Ping timeout: 265 seconds] |
| 15:15:32 | | ehmry joins |
| 15:16:27 | | Gereon6200 (Gereon) joins |
| 15:21:43 | | thuban joins |
| 15:29:08 | | HackMii quits [Ping timeout: 276 seconds] |
| 15:34:02 | | HackMii (hacktheplanet) joins |
| 15:37:19 | | Island joins |
| 16:10:39 | | hitgrr8 joins |
| 16:20:36 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
| 16:52:19 | | LeGoupil quits [Client Quit] |
| 16:57:13 | | umgr036 quits [Remote host closed the connection] |
| 16:57:39 | | second (second) joins |
| 16:58:08 | | sec^nd quits [Remote host closed the connection] |
| 16:58:08 | | second is now known as sec^nd |
| 17:00:48 | | umgr036 joins |
| 17:03:12 | | umgr036 quits [Remote host closed the connection] |
| 17:05:38 | <@JAA> | pabs: What dieserniko said is correct, but the practical difference is that #down-the-tube data goes into the Wayback Machine with working video playback but can only somewhat selectively archive videos (cf. wiki page for guidelines), while #youtubearchive goes into storage that isn't publicly accessible and is a bit less strictly limited. |
| 17:18:42 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 18:05:50 | | Mateon1 quits [Remote host closed the connection] |
| 18:07:17 | | Mateon1 joins |
| 18:54:02 | | qwertyasdfuiopghjkl quits [Client Quit] |
| 19:00:25 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 19:26:00 | | ehmry quits [Ping timeout: 252 seconds] |
| 19:39:50 | | HP_Archivist quits [Read error: Connection reset by peer] |
| 21:00:42 | | miko joins |
| 21:15:45 | | miko leaves |
| 21:16:01 | | mikolaj joins |
| 21:18:43 | <mikolaj> | hello |
| 21:19:26 | <mikolaj> | I've been writing my own open source forum downloader tool. Far from done, but I have very basic Discourse, PhpBB, SMF, HyperKitty, Pipermail extractors, and will now be working on Hypermail, Proboards, vBulletin, IP.Board extractors |
| 21:19:48 | <mikolaj> | I'd like to ask here if there's any application for such a tool. So that I can be sure I'm not sinking my energy into oblivion |
| 21:20:20 | <mikolaj> | I've started writing it because I found wget and httrack limited in usability for downloading forums. But today I learned wpull exists, which might defeat the purpose of my project |
| 21:23:43 | <mikolaj> | oh -- the primary selling point is that it can dump posts to JSON, instead of only downloading entire pages |
| 21:25:16 | <mikolaj> | my WIP project is here if anyone cares: https://github.com/mikwielgus/forum-dl |
| 21:27:39 | <@JAA> | mikolaj: That's definitely useful and something I wanted to write for a while. Ideally, there'd be a way to still capture the raw network traffic and write WARCs as well, but that's easy to get wrong if you're not very familiar with the details, so that's not a recommendation to add it now. |
| 21:28:29 | <TheTechRobo> | JAA: Could warcprox effectively automate the WARC writing? Or no, since that hasn't been audited yet? |
| 21:28:53 | <@JAA> | TheTechRobo: Yeah, audit needed, but potentially yes. |
| 21:34:54 | <mikolaj> | JAA: can you describe how it could be useful in your use cases? Feedback would be great, so that I can know what to focus on |
| 21:44:57 | | mikolaj|m joins |
| 21:50:36 | <@JAA> | mikolaj: I like stuff that can go into the Wayback Machine. :-) |
| 21:51:11 | <@JAA> | Also, preserving the raw original data would allow for reprocessing in the future if a bug is found in the extractor. |
| 22:01:23 | <mikolaj> | sorry, I didn't mean to ask why you need WARCs, but rather what particular use you'd have for my tool (a bundle of forum-software-specific downloaders) |
| 22:12:17 | | hitgrr8 quits [Client Quit] |
| 22:14:14 | <@JAA> | mikolaj: Ah, right, nothing specific, just extracting the post contents into a common machine-readable data format. |
| 22:16:22 | <@JAA> | Allows for indexing, replaying, etc. across forum softwares. |
| 22:16:50 | <@JAA> | Someone else here was working on an indexer of forum pages from WARCs a while ago. Don't think it went anywhere though. |
| 22:19:46 | <mikolaj> | I just browsed your IRC logs and found a person named avoozl talking about developing such a project. Haven't found any repo though |
| 22:20:00 | <mikolaj> | avoozl: how is your project going? Do you have a public repository? |
| 22:24:42 | <mikolaj|m> | (I've permanently connected here via Matrix now so I'll disconnect via IRC now) |
| 22:24:56 | | mikolaj quits [Remote host closed the connection] |
| 22:30:47 | <@JAA> | avoozl: fg |
| 22:31:12 | <@JAA> | (Sorry, ignore) |
| 22:33:06 | | Larsenv quits [Quit: ZNC 1.8.2+deb2build5 - https://znc.in] |
| 22:35:36 | | Larsenv (Larsenv) joins |
| 23:09:53 | | BlueMaxima joins |
| 23:16:29 | | Mateon2 joins |
| 23:16:32 | | Mateon1 quits [Remote host closed the connection] |
| 23:16:32 | | Mateon2 is now known as Mateon1 |
| 23:25:33 | | Mateon1 quits [Remote host closed the connection] |
| 23:25:43 | | Mateon1 joins |
| 23:59:44 | <pabs> | JAA: hmm, sounds like #down-the-tube is preferable |
| 23:59:58 | | pabs wonders if forum-dl can output Maildir :) |