| 00:04:57 | | sec^nd quits [Remote host closed the connection] |
| 00:06:19 | | sec^nd (second) joins |
| 00:22:46 | | hackbug quits [Remote host closed the connection] |
| 00:30:00 | | Sluggs quits [Read error: Connection reset by peer] |
| 00:30:58 | | Sluggs joins |
| 00:46:41 | <pokechu22> | myself: https://archive.org/details/wiki-wavesharecom_w - this has all of the files and page history. The big trick I used to deal with the rate limiting is that wikiteam tools can export 50 revisions at a time... but I still had to deal with the 12-second wait when downloading files (I made sure to leave a minimum gap of 12 seconds, but I included the time spent downloading |
| 00:46:44 | <pokechu22> | the file, and some files took way longer (e.g. Game-hat-3D-drawing.7z which is 15 MB took 72 seconds to download)). Some files are missing as one was detected as malware and others are victims of case-insensitive filesystems though (I should look into fixing the latter problem at some point). I am running the files through archivebot as well, so those should still end up |
| 00:46:46 | <pokechu22> | being saved (but it'll take even longer) |
| 00:52:10 | | AlsoTheTechRobo quits [Remote host closed the connection] |
| 00:52:51 | | AlsoTheTechRobo (TheTechRobo) joins |
| 00:59:21 | <AlsoTheTechRobo> | thuban: I just modified an existing script to add the post argument /shrug |
| 00:59:36 | | AlsoTheTechRobo quits [Remote host closed the connection] |
| 01:01:08 | | TheTechRobo (TheTechRobo) joins |
| 01:46:31 | | Justin[home] quits [Ping timeout: 252 seconds] |
| 01:58:26 | <pabs> | is there a project for GitLab archiving? this just got removed from the Mozilla add-ons store https://gitlab.com/magnolia1234/bypass-paywalls-clean-filters |
| 01:58:40 | <pabs> | https://gitlab.com/magnolia1234/bypass-paywalls-firefox-clean/-/issues/905 |
| 01:58:46 | <pabs> | https://nits.readthefinemanual.net/Magnolia1234B/status/1624091040570306561 |
| 01:58:58 | <pabs> | https://www.ghacks.net/2023/02/13/mozilla-removes-bypass-paywalls-clean-extension-from-its-add-ons-repository/ |
| 02:04:21 | <@JAA> | I wonder if the .xpi is still accessible. |
| 02:05:16 | <@JAA> | Nope, they purged those, too. |
| 02:05:54 | <tomodachi94> | JAA: They're in the GitLab releases, but no telling if they're the same as the Store: https://gitlab.com/magnolia1234/bypass-paywalls-firefox-clean/-/releases |
| 02:06:12 | <@JAA> | Yeah, I meant the ones signed by Mozilla. |
| 02:08:19 | | Icyelut (Icyelut) joins |
| 02:08:21 | <@JAA> | Archiving GitLab is a pain because of all the scripting. I don't think we have any tooling for that currently. |
| 02:11:21 | <pabs> | the glab tool is useful for command-line API access: https://gitlab.com/gitlab-org/cli |
| 02:12:54 | <@JAA> | Yeah, not the web interface into the WBM though. |
| 02:15:17 | | umgr036 joins |
| 02:27:25 | <pabs> | yep. I guess SPN is the only JS-to-WBM thing there is? |
| 02:27:59 | | pabs wonders if the WBM will ever get DOM dumps in addition to screenshots etc |
| 02:34:59 | | lennier1 quits [Client Quit] |
| 02:35:14 | | lennier1 (lennier1) joins |
| 02:37:28 | <@JAA> | Kind of, yeah. SPN uses Brozzler, which is essentially a WARC-writing MITM proxy and a Chromium browser (plus machinery around it). And that kind of approach is really the only thing that's sensible for heavily interactive sites. Unfortunately, WARC supports neither HTTP/2 (or /3) nor WebSockets, so it still breaks down quickly. |
| 02:37:50 | <@JAA> | And of course, this is extremely resource-intensive compared to the crawls we normally do. |
| 02:48:12 | <@JAA> | FWIW, I've dumped a bundle of the Git repo itself onto IA. |
| 02:49:21 | <pokechu22> | Hmm, I thought I saw websockets work before? I think with a ws_ suffix on the timestamp? |
| 02:49:26 | | DiscantX joins |
| 02:49:40 | <@JAA> | The WARC format does not support WS at all. |
| 02:50:01 | <@JAA> | It'd be an unofficial extension with no public specification (that I'm aware of). |
| 02:50:20 | <@JAA> | Not that this stops people. There are also people writing fake HTTP/1.1 responses for HTTP/2 traffic. |
| 02:51:18 | <TheTechRobo> | I love the idea of WARC, but it's horrendously outdated and the tooling is meh |
| 02:51:45 | <@JAA> | Yeah, it's certainly slow-moving. |
| 02:51:50 | <TheTechRobo> | Storing raw data sent and received is a great idea, but it doesn't matter if you can't store any modern protocol. |
| 02:52:00 | <TheTechRobo> | Eventually people will abandon HTTP/1.1. |
| 02:52:12 | <TheTechRobo> | Might take ages, but it's going to happen. |
| 02:52:27 | <TheTechRobo> | And a lot of servers already don't support it. |
| 02:52:43 | <@JAA> | You could store pcaps + SSL pre-master keys. That would be the most generic capture format possible. It'd be even worse to work with though. |
| 02:52:50 | <TheTechRobo> | WebSockets are also super important, and we don't have that. |
| 02:53:16 | <TheTechRobo> | JAA: Yeah, that'd make an annoying-to-use format just dumb to use. Nobody's going to use WARC if it's that difficult to extract a damn webpage. |
| 02:53:24 | | DiscantX quits [Client Quit] |
| 02:53:39 | <@JAA> | I haven't come across many HTTP servers that didn't accept 1.1 connections. In fact, I can't think of any right now. Buttflare's 1.1 implementation is flawed, but that's about it. |
| 02:53:42 | | DiscantX joins |
| 02:54:03 | | DiscantX quits [Client Quit] |
| 02:54:23 | <TheTechRobo> | JAA: It will happen though, it's just a matter of time. Websites are so awfully bloated that HTTP/1.1 won't be able to keep up. |
| 02:54:36 | <@JAA> | The problem is that people have very different ideas of what 'a webpage' even is. |
| 02:55:03 | <@JAA> | And that's especially true for interactive pages. |
| 02:55:08 | <TheTechRobo> | ^ |
| 02:55:33 | <pokechu22> | Ah, what I was thinking of is that e.g. https://web.archive.org/web/20230214025333/https://en.js.cx/article/websocket/chat/ redirects wss://web.archive.org/web/20230214025333ws_/wss://javascript.info/article/websocket/chat/ws to https://web.archive.org/web/20221010222504ws_/http://javascript.info/article/websocket/chat/ws (which doesn't work here) |
| 02:57:22 | <@JAA> | I've had that little idea in my head for a while of modifying a browser in a way to make things deterministic. E.g. random seed values, current timestamp, and whatnot. Mocking all external side effects, essentially. Then, on playback, you could in theory play things back exactly as they were captured. However, this wouldn't necessarily be what people actually want anyway, and it'd be a *lot* of work. |
| 02:57:44 | <TheTechRobo> | Yeah |
| 02:58:31 | <TheTechRobo> | I just want to go back to the geocities era, where this wasn't nearly as much of a problem. Sign my guestbook, I guess... |
| 02:58:59 | <@JAA> | Agreed, but also delete JavaScript engines from the browser, thank. |
| 03:01:18 | <@JAA> | Well, and ActiveX and Flash and all that nonsense. :-) |
| 03:01:34 | <TheTechRobo> | And Java! |
| 03:01:44 | <TheTechRobo> | I wasn't even on the Internet during that era and I miss it. |
| 03:01:44 | <@JAA> | Oh god yeah, Java applets... |
| 03:02:37 | <@JAA> | Just HTML and CSS. If you squint, it's even Turing-complete, so it's not like it limits what you can do with the page. :-) |
| 03:02:44 | <TheTechRobo> | hah |
| 03:03:03 | <TheTechRobo> | Just make it all server-side and be done with it. |
| 03:03:18 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 03:03:40 | <TheTechRobo> | Maybe add an html feature to add a loading screen to the page while something's loading, e.g. a link or form submit, but other than that, no client-side scripting kthxbai. |
| 03:03:49 | <@JAA> | Oh, also, POST forms for navigation are banned. |
| 03:19:10 | | pabs hugs the URL extraction of yt-dlp: yt-dlp --verbose --dump-json |
| 03:51:20 | | Ketchup901 quits [Ping timeout: 276 seconds] |
| 03:57:12 | | Ketchup901 (Ketchup901) joins |
| 04:38:23 | | dan_a quits [Remote host closed the connection] |
| 05:19:19 | | monoxane7 (monoxane) joins |
| 05:21:53 | | monoxane quits [Ping timeout: 265 seconds] |
| 05:21:54 | | monoxane7 is now known as monoxane |
| 05:36:51 | | sonick quits [Client Quit] |
| 05:52:48 | | Ketchup902 (Ketchup901) joins |
| 05:52:53 | | Ketchup901 quits [Ping timeout: 276 seconds] |
| 06:04:26 | | DopefishJustin joins |
| 06:04:26 | | DopefishJustin is now authenticated as DopefishJustin |
| 06:05:11 | | Ketchup902 quits [Remote host closed the connection] |
| 06:05:30 | | Ketchup901 (Ketchup901) joins |
| 06:29:48 | <Barto> | not sure if throwing https://community.mycroft.ai/ into AB will work; it's running Discourse. |
| 06:30:00 | <pabs> | that was done recently already |
| 06:33:02 | <Barto> | :-) |
| 07:16:02 | | hackbug (hackbug) joins |
| 07:17:31 | | sec^nd quits [Remote host closed the connection] |
| 07:18:31 | | sec^nd (second) joins |
| 07:35:13 | | hitgrr8 joins |
| 07:36:57 | | hackbug quits [Client Quit] |
| 07:42:50 | | benjinsm joins |
| 07:43:45 | | hackbug (hackbug) joins |
| 07:44:48 | | Island quits [Read error: Connection reset by peer] |
| 07:46:13 | | benjins quits [Ping timeout: 252 seconds] |
| 07:53:05 | | Arcorann (Arcorann) joins |
| 08:01:52 | | hackbug quits [Ping timeout: 265 seconds] |
| 08:08:11 | | umgr036 quits [Remote host closed the connection] |
| 08:12:21 | | umgr036 joins |
| 08:16:28 | | LeGoupil joins |
| 08:27:18 | | @dxrt quits [Quit: ZNC - http://znc.sourceforge.net] |
| 08:27:40 | | dxrt joins |
| 08:27:42 | | dxrt is now authenticated as dxrt |
| 08:27:42 | | dxrt quits [Changing host] |
| 08:27:42 | | dxrt (dxrt) joins |
| 08:27:42 | | @ChanServ sets mode: +o dxrt |
| 09:51:26 | | HackMii_ quits [Ping timeout: 276 seconds] |
| 09:53:28 | | HackMii_ (hacktheplanet) joins |
| 09:54:26 | | datechnoman quits [Quit: The Lounge - https://thelounge.chat] |
| 09:55:32 | | datechnoman (datechnoman) joins |
| 10:01:20 | | benjinsmi joins |
| 10:04:49 | | benjinsm quits [Ping timeout: 252 seconds] |
| 10:05:01 | | umgr036 quits [Remote host closed the connection] |
| 10:05:16 | | umgr036 joins |
| 11:12:28 | | xkey quits [Client Quit] |
| 11:13:04 | | KateBush joins |
| 11:13:25 | | KateBush quits [Remote host closed the connection] |
| 11:13:41 | | StrangePhenomena joins |
| 11:13:46 | <StrangePhenomena> | Hello all |
| 11:15:10 | | StrangePhenomena quits [Remote host closed the connection] |
| 11:16:33 | | xkey (xkey) joins |
| 11:23:51 | | dan_a (dan_a) joins |
| 11:54:16 | | LeGoupil quits [Ping timeout: 252 seconds] |
| 12:29:19 | | benjinsmi is now known as benjins |
| 12:29:20 | | benjins is now authenticated as benjins |
| 12:42:00 | | LeGoupil joins |
| 12:56:13 | | Arcorann quits [Ping timeout: 265 seconds] |
| 12:57:31 | | hackbug (hackbug) joins |
| 13:18:16 | | daxxy quits [Quit: bye] |
| 13:18:30 | | daxxy (daxxy) joins |
| 13:20:37 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
| 13:22:16 | | lennier1 quits [Ping timeout: 252 seconds] |
| 13:22:27 | | lennier1 (lennier1) joins |
| 13:24:36 | | lennier2_ joins |
| 13:26:40 | | lennier1 quits [Ping timeout: 252 seconds] |
| 13:28:45 | | Icyelut quits [Client Quit] |
| 13:28:52 | | lennier2_ quits [Ping timeout: 252 seconds] |
| 13:31:43 | | lennier2_ joins |
| 13:31:48 | | lennier2_ is now known as lennier1 |
| 13:32:28 | | Icyelut (Icyelut) joins |
| 13:35:39 | | lennier2 joins |
| 13:38:02 | | lennier1 quits [Ping timeout: 252 seconds] |
| 13:39:16 | | lennier2_ joins |
| 13:39:36 | | lennier2_ quits [Read error: Connection reset by peer] |
| 13:39:51 | | lennier2_ joins |
| 13:39:51 | | lennier2_ is now known as lennier1 |
| 13:42:08 | | lennier2 quits [Ping timeout: 265 seconds] |
| 13:56:47 | | eroc1990 quits [Quit: The Lounge - https://thelounge.chat] |
| 13:57:29 | | eroc1990 (eroc1990) joins |
| 14:12:41 | | Chris5010 quits [Quit: Ping timeout (120 seconds)] |
| 14:12:41 | | franga2000 quits [Quit: Ping timeout (120 seconds)] |
| 14:12:41 | | nomad-geek quits [Client Quit] |
| 14:12:41 | | Arachnophine quits [Quit: Ping timeout (120 seconds)] |
| 14:12:41 | | Jake quits [Client Quit] |
| 14:12:41 | | birdjj quits [Quit: Ping timeout (120 seconds)] |
| 14:12:41 | | eroc1990 quits [Client Quit] |
| 14:12:41 | | CraftByte quits [Quit: Ping timeout (120 seconds)] |
| 14:12:41 | | umgr036 quits [Remote host closed the connection] |
| 14:12:41 | | VerifiedJ quits [Quit: Ping timeout (120 seconds)] |
| 14:12:41 | | ave quits [Quit: Ping timeout (120 seconds)] |
| 14:12:41 | | flashfire42 quits [Quit: Ping timeout (120 seconds)] |
| 14:12:41 | | CraftByte6 (DragonSec|CraftByte) joins |
| 14:12:41 | | nomad-geek1 joins |
| 14:12:41 | | Jake7 (Jake) joins |
| 14:12:41 | | birdjj3 joins |
| 14:12:41 | | flashfire423 (flashfire42) joins |
| 14:12:41 | | VerifiedJ7 (VerifiedJ) joins |
| 14:12:42 | | CraftByte6 is now known as CraftByte |
| 14:12:42 | | VerifiedJ7 is now known as VerifiedJ |
| 14:12:42 | | nomad-geek1 is now known as nomad-geek |
| 14:12:42 | | Jake7 is now known as Jake |
| 14:12:42 | | flashfire423 is now known as flashfire42 |
| 14:12:42 | | birdjj3 is now known as birdjj |
| 14:12:43 | | umgr036 joins |
| 14:12:43 | | franga2000 joins |
| 14:12:45 | | Arachnophine (Arachnophine) joins |
| 14:12:47 | | Chris5010 (Chris5010) joins |
| 14:12:49 | | eroc1990 (eroc1990) joins |
| 14:12:49 | | ave (ave) joins |
| 14:13:43 | | Nulo joins |
| 14:14:31 | | Sluggs quits [Ping timeout: 241 seconds] |
| 14:15:15 | | Sluggs joins |
| 14:28:49 | | lennier1 quits [Ping timeout: 252 seconds] |
| 14:29:45 | | lennier1 (lennier1) joins |
| 14:37:14 | | benjinsm joins |
| 14:40:22 | | benjins quits [Ping timeout: 252 seconds] |
| 14:42:33 | | lennier1 quits [Ping timeout: 265 seconds] |
| 14:44:30 | | lennier1 (lennier1) joins |
| 14:44:39 | | sonick (sonick) joins |
| 14:56:05 | | lennier1 quits [Ping timeout: 265 seconds] |
| 14:56:42 | | lennier1 (lennier1) joins |
| 15:06:06 | | lennier2 joins |
| 15:07:52 | | lennier1 quits [Ping timeout: 252 seconds] |
| 15:07:57 | | lennier2 is now known as lennier1 |
| 15:25:00 | | lennier2 joins |
| 15:25:32 | | sec^nd quits [Ping timeout: 276 seconds] |
| 15:25:55 | | HackMii_ quits [Remote host closed the connection] |
| 15:26:29 | | HackMii_ (hacktheplanet) joins |
| 15:27:40 | | lennier1 quits [Ping timeout: 252 seconds] |
| 15:27:41 | | lennier2 is now known as lennier1 |
| 15:29:49 | | sec^nd (second) joins |
| 15:34:16 | | lennier1 quits [Ping timeout: 252 seconds] |
| 15:35:27 | | lennier1 (lennier1) joins |
| 15:36:35 | | Island joins |
| 15:55:05 | | benjinsm is now known as benjins |
| 15:55:07 | | benjins is now authenticated as benjins |
| 15:56:39 | | jabagawee joins |
| 16:20:05 | | jabagawee quits [Client Quit] |
| 16:22:11 | | lennier2 joins |
| 16:23:13 | | lennier1 quits [Ping timeout: 252 seconds] |
| 16:23:16 | | lennier2 is now known as lennier1 |
| 16:54:00 | | LeGoupil quits [Client Quit] |
| 16:54:34 | | lennier1 quits [Ping timeout: 252 seconds] |
| 17:05:43 | | lennier1 (lennier1) joins |
| 17:07:05 | | qwertyasdfuiopghjkl joins |
| 17:12:10 | | lennier1 quits [Ping timeout: 252 seconds] |
| 17:12:57 | | lennier1 (lennier1) joins |
| 17:50:19 | | umgr036 quits [Remote host closed the connection] |
| 17:54:31 | | Barto quits [Ping timeout: 252 seconds] |
| 18:01:12 | | lennier2 joins |
| 18:01:18 | | lennier1 quits [Ping timeout: 252 seconds] |
| 18:01:22 | | lennier2 is now known as lennier1 |
| 18:56:48 | | Barto (Barto) joins |
| 19:37:08 | <h2ibot> | JustAnotherArchivist edited Deathwatch (+595, /* 2023 */ Add Terminal Boredom): https://wiki.archiveteam.org/?diff=49462&oldid=49461 |
| 20:21:44 | | edisondotme quits [Remote host closed the connection] |
| 20:24:24 | | benjins quits [Remote host closed the connection] |
| 20:24:24 | | franga2000 quits [Client Quit] |
| 20:24:24 | | Chris5010 quits [Client Quit] |
| 20:24:24 | | benjinsm joins |
| 20:24:24 | | eroc1990 quits [Client Quit] |
| 20:24:24 | | franga20000 joins |
| 20:24:24 | | ave quits [Client Quit] |
| 20:24:25 | | franga20000 is now known as franga2000 |
| 20:24:30 | | ave (ave) joins |
| 20:24:30 | | Chris5010 (Chris5010) joins |
| 20:24:44 | | eroc1990 (eroc1990) joins |
| 21:48:56 | <Barto> | pabs: thanks for doing it. Sorry I missed your work earlier. |
| 22:46:22 | | hitgrr8 quits [Client Quit] |
| 23:11:33 | | BlueMaxima joins |
| 23:32:04 | | Atom-- quits [Read error: Connection reset by peer] |
| 23:48:51 | | lennier1 quits [Client Quit] |
| 23:49:16 | | lennier1 (lennier1) joins |
| 23:51:20 | | lennier1 quits [Client Quit] |
| 23:57:39 | | benjinsm is now known as benjins |
| 23:57:40 | | benjins is now authenticated as benjins |
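
A rough sketch of the pacing pokechu22 describes at 00:46 for the wiki file downloads: keep at least 12 seconds between request starts, and let the time spent downloading count toward that gap. This is one reading of the message, not the actual wikiteam code. The names `remaining_wait` and `fetch_all`, and the injectable `fetch`, `clock`, and `sleep` parameters, are hypothetical and only there for illustration.

```python
import time
from typing import Callable, Iterable, List

MIN_GAP_SECONDS = 12.0  # the 12-second figure quoted in the log


def remaining_wait(started_at: float, now: float,
                   min_gap: float = MIN_GAP_SECONDS) -> float:
    """Seconds still to wait so request starts are at least min_gap apart."""
    return max(0.0, min_gap - (now - started_at))


def fetch_all(urls: Iterable[str],
              fetch: Callable[[str], object],
              min_gap: float = MIN_GAP_SECONDS,
              clock: Callable[[], float] = time.monotonic,
              sleep: Callable[[float], None] = time.sleep) -> List[object]:
    """Call fetch(url) for each URL, pausing so starts are min_gap apart.

    `fetch` is a hypothetical caller-supplied download function.
    """
    results = []
    for url in urls:
        started_at = clock()
        results.append(fetch(url))
        # Download time counts toward the gap, so a slow download
        # (e.g. the 72-second one mentioned in the log) adds no extra wait.
        sleep(remaining_wait(started_at, clock(), min_gap))
    return results
```

With this reading, a download that finishes in 5 seconds is followed by a 7-second pause, while one that takes 72 seconds is followed by none.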