| 00:00:32 | | Arcorann (Arcorann) joins |
| 00:06:05 | | Perk quits [Client Quit] |
| 00:09:13 | | Perk joins |
| 00:13:37 | | Perk quits [Client Quit] |
| 00:15:13 | | DLoader quits [Quit: DLoader] |
| 00:15:33 | | DLoader (DLoader) joins |
| 00:15:39 | | Perk joins |
| 00:24:26 | <eightthree> | should I use wpull, wget-lua or something else for downloading random website domains? wpull seems not in the aur, but will it work just fine in a python venv if my distro discourages installing directly through pip3 install? |
| 00:24:46 | <eightthree> | I'd be looking to get as close to browsing the actual website (only using local data and maybe even code, perhaps the same search engine like lucene or whatever the website uses), |
| 00:26:52 | | Perk quits [Client Quit] |
| 00:28:11 | | Perk joins |
| 00:49:57 | | Wohlstand quits [Client Quit] |
| 01:08:35 | <thuban> | eightthree: you probably want https://github.com/ArchiveTeam/grab-site/ (and a viewer like https://replayweb.page/ to browse the resulting warc) |
| 01:09:46 | <pabs> | eightthree: if it was me I'd just ask folks to run it in ArchiveBot, that doesn't have a JS interpreter or do page interactions though, which are often needed to get everything |
| 01:14:39 | <thuban> | fireonlive: you're still monitoring the archivebot websocket for project urls, correct? |
| 01:15:01 | <fireonlive> | indeed |
| 01:15:17 | | pabs is too (for code, wikis and Mailman/2) |
| 01:15:26 | <thuban> | cool, ty |
| 01:15:35 | <thuban> | oh lol, that explains why i couldn't remember who was doing it |
| 01:16:16 | <fireonlive> | :) |
| 01:16:34 | <thuban> | just double-checking since i noticed mediafire links in some of these scanlation blog jobs |
| 01:17:57 | <fireonlive> | ah ye |
| 01:36:20 | | MrMcNuggets joins |
| 01:37:13 | | MrMcNuggets quits [Client Quit] |
| 01:47:32 | <eightthree> | pabs: I'm confused, if I want to save the page as if it were downloaded/viewed with js and everything as though a real user browsing in a browser, are you saying wpull and wget-lua don't reproduce the resulting page "bug for bug" and "bit for bit", or is it the archivebot and the specific grab-site that don't? Or both? I don't think my website has a grab-site project associated with it, unless I just use the generic one that... |
| 01:47:37 | <eightthree> | ... isn't tailored to any specific site? |
| 01:48:38 | <@JAA> | eightthree: None of these tools know what JS is. |
| 01:49:17 | <@JAA> | If a site is very script-heavy, you may need brozzler. |
| 01:49:58 | <@JAA> | But exact and functional reproduction of script-heavy sites is hard to impossible. |
| 01:50:08 | <@JAA> | In the general case, anyway. |
| 01:51:08 | <eightthree> | thuban: I think I'll try locally hosting that https://github.com/webrecorder/replayweb.page, hopefully it works just as well as the site, thanks! |
| 01:52:27 | <@JAA> | That's only for playback, not for archival. |
| 01:53:04 | <pabs> | eightthree: think about a game written in JS, if you don't play it in a browser to the end and do all the side quests, you won't get everything. some JS websites are similar |
| 01:53:28 | <@JAA> | That's a good analogy! :-) |
| 01:53:44 | <pabs> | horrifying one but yeah :) |
| 01:55:25 | <nicolas17> | I remember a point-and-click Flash game (Myst style), each possible place you could be standing at was a different .swf |
| 01:56:05 | <pabs> | whoa |
| 01:56:28 | <pabs> | eightthree: which sites are you interested in btw? |
| 01:58:57 | <nicolas17> | pabs: well that's better than actual Myst/Riven where each possible place you can be standing at and each possible *state* the room can be in (door open / door closed?) is a different bitmap image |
| 02:05:08 | <TheTechRobo> | JAA: Would you happen to still have your process for retrieving the highest-quality audio from The Artists Union on WBM? |
| 02:05:47 | | TheTechRobo would rather not dig through logs with TheLounge's awful search functionality |
| 02:06:35 | <@JAA> | You're the second one tonight to confuse its perceived awful search with its actual general awfulness. ;-) |
| 02:06:58 | <@JAA> | I did it with the CDX. |
| 02:07:11 | <TheTechRobo> | JAA: Oh, yeah, TL sucks, but its search sucks especially. :-) |
| 02:07:23 | <TheTechRobo> | Trouble is, I still haven't gotten around to making my replacement for it. |
| 02:07:34 | <thuban> | TheTechRobo: recent discussion re tau: https://hackint.logs.kiska.pw/archiveteam-bs/20240224 |
| 02:07:35 | <TheTechRobo> | Whether I like it or not, it checks the most boxes out of any IRC client I've seen so far. |
| 02:07:37 | <@JAA> | Find the relevant WARC via WBM headers, sort the corresponding CDX by offset (field 10 or whatever it is), then look at the nearby responses. |
| 02:07:52 | <TheTechRobo> | thuban: Ah thanks, forgor about public logs. |
| 02:08:17 | <@JAA> | There are at least two or three different URL patterns for the audio URLs, so that's the most reliable method. |
| 02:08:34 | <nicolas17> | my phone autocomplete learned the word "forgor" recently |
| 02:08:40 | <fireonlive> | The Lounge best irc client |
| 02:08:46 | <TheTechRobo> | Fun fact about The Lounge I learned a few weeks ago: Its sqlite database option is literally just a table of JSON objects. No wonder it can only show messages it's loaded into memory already. |
| 02:09:12 | <@JAA> | Yeah, the awfulness is fractal. |
| 02:09:28 | <fireonlive> | yeah.... best.. database.. design.. 🥲 |
| 02:11:04 | <TheTechRobo> | TheLounge-- |
| 02:11:05 | <eggdrop> | [karma] 'TheLounge' now has -1 karma! |
| 02:11:11 | <TheTechRobo> | or is it with a space |
| 02:11:20 | <fireonlive> | !karma The Lounge |
| 02:11:21 | <eggdrop> | [karma] "The Lounge" has -1 karma |
| 02:11:28 | <fireonlive> | The Lounge-- |
| 02:11:30 | <eggdrop> | [karma] 'The Lounge' now has -3 karma! |
| 02:11:33 | <fireonlive> | some ppl like https://github.com/glowing-bear/glowing-bear |
| 02:11:37 | <fireonlive> | but it uploads to imgur i think |
| 02:11:38 | <@JAA> | The automatic history deletion in the next release is going to surprise a bunch of people. |
| 02:12:04 | <TheTechRobo> | fireonlive: That actually looks really cool |
| 02:12:06 | <TheTechRobo> | JAA: The what? |
| 02:12:10 | <@JAA> | :-) |
| 02:12:19 | <fireonlive> | rip me history |
| 02:12:31 | <fireonlive> | it's disabled by default though isn't it? |
| 02:12:38 | <@JAA> | The 'data hoarder' option only deletes 'low-value' messages. |
| 02:12:39 | <TheTechRobo> | Where's that meme when I need it? |
| 02:12:48 | <@JAA> | As I said, the awfulness is fractal. |
| 02:12:52 | <@JAA> | https://github.com/thelounge/thelounge/pull/4799 |
| 02:13:11 | <thuban> | ಠ_ಠ |
| 02:13:18 | <@JAA> | fireonlive: Might be, yeah. I wouldn't trust them to not enable it by default though. |
| 02:13:27 | <fireonlive> | true |
| 02:13:34 | <fireonlive> | maybe i should look into glowing bear |
| 02:13:43 | <fireonlive> | and replacing imgur with.. not that |
| 02:13:59 | <TheTechRobo> | yeah that'll be way easier than what I wanted to do |
| 02:14:14 | <@JAA> | The Lounge-- |
| 02:14:15 | <eggdrop> | [karma] 'The Lounge' now has -4 karma! |
| 02:15:20 | <TheTechRobo> | > and then provides some nice features on top of that, like embedding images, videos, and other content |
| 02:15:20 | <TheTechRobo> | please tell me that's configurable |
| 02:15:20 | | Lord_Nightmare (Lord_Nightmare) joins |
| 02:17:15 | <fireonlive> | the ddos must continue |
| 02:22:50 | <eightthree> | I'm trying to join #down-the-tube and have failed 4+ times through heisenbridge. Is there anything different about the room, like +R or +r, just in case? Is the room functional? Otherwise I'll keep trying to sort it out with other bridge users in case it's failing at that level |
| 02:26:19 | <icedice> | Convos is another option: https://convos.chat/ |
| 02:28:12 | <@JAA> | eightthree: #down-the-tube has the same modes as this channel, -p. |
| 02:47:10 | | ^ quits [Ping timeout: 255 seconds] |
| 02:47:49 | | ^ (^) joins |
| 02:49:35 | <TheTechRobo> | icedice: This feels imposing https://lounge.thetechrobo.ca/uploads/3ffe46ce618ea206/image.png |
| 02:52:38 | <h2ibot> | Ryz edited List of websites excluded from the Wayback Machine (+28, Added https://www.tamindir.com/): https://wiki.archiveteam.org/?diff=51989&oldid=51982 |
| 03:00:39 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=51990&oldid=51989 |
| 03:14:10 | | Perk quits [Read error: Connection reset by peer] |
| 03:16:32 | | Perk joins |
| 03:16:33 | | Perk7 joins |
| 03:16:34 | | Perk quits [Remote host closed the connection] |
| 03:16:34 | | Perk7 is now known as Perk |
| 03:38:59 | | ^ quits [Remote host closed the connection] |
| 03:39:13 | | ^ (^) joins |
| 03:48:34 | | fireonlive is now known as \ |
| 03:48:40 | | \ is now known as fireonlive |
| 04:04:32 | <thuban> | aw, this (spanish-language) scanlation blog has a bunch of taringa links :( |
| 04:05:48 | <@JAA> | F |
| 04:36:41 | <thuban> | and this other one a bunch of zippyshare :( |
| 04:46:44 | | GNU_world joins |
| 05:14:51 | | grid joins |
| 06:15:14 | | qwertyasdfuiopghjkl quits [Client Quit] |
| 06:16:22 | <nicolas17> | thuban: not on WBM? |
| 06:17:25 | <thuban> | didn't check, since we can't do anything about it at this point |
| 06:24:47 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 06:28:36 | | Ruthalas59 quits [Client Quit] |
| 06:39:21 | | bladem quits [Read error: Connection reset by peer] |
| 06:39:31 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 06:42:05 | | Naruyoko5 quits [Quit: Leaving] |
| 06:53:46 | | Ruthalas59 (Ruthalas) joins |
| 06:59:58 | | Naruyoko joins |
| 07:03:13 | | pabs quits [Ping timeout: 255 seconds] |
| 07:05:09 | | pabs (pabs) joins |
| 07:24:45 | | grid quits [Client Quit] |
| 07:26:12 | <thuban> | hm, do we have a regex for pastebin we can add to the url lists template? |
| 07:32:53 | <fireonlive> | i've just been using the imgur one but with pastebin\.com instead |
| 08:19:29 | | GNU_world quits [Ping timeout: 272 seconds] |
| 09:00:01 | | Bleo182600 quits [Client Quit] |
| 09:01:22 | | Bleo182600 joins |
| 09:27:39 | | GNU_world joins |
| 09:32:02 | | line joins |
| 09:35:29 | | line_ quits [Ping timeout: 272 seconds] |
| 09:47:02 | <icedice> | <TheTechRobo> icedice: This feels imposing https://lounge.thetechrobo.ca/uploads/3ffe46ce618ea206/image.png |
| 09:47:03 | <icedice> | oof |
| 09:47:14 | <icedice> | See you guys later |
| 09:47:16 | | icedice quits [Client Quit] |
| 10:15:23 | | VickoSaviour joins |
| 10:18:18 | <VickoSaviour> | I was just searching the wiki for fun when I checked the Friendster archive. Looking at it, surely it was a big and a close one, like a Google+ project... Now, when I checked the website that was supposed to be offline, I saw that it is online (probably relaunching, copyright 2023) and it has an early access participation. Someone could edit the |
| 10:18:19 | <VickoSaviour> | wiki page... |
| 10:22:11 | <VickoSaviour> | Also, while I'm still on IRC, can someone tell me what happened to the DeviantArt project? We have about 1 million items, and it is stuck at 1.31 items... Did we finish backing up the groups or did it get stuck on those items? |
| 10:24:44 | | JaffaCakes118 (JaffaCakes118) joins |
| 10:28:25 | | JaffaCakes118_2 quits [Ping timeout: 255 seconds] |
| 10:33:17 | | JaffaCakes118 quits [Remote host closed the connection] |
| 10:47:12 | | JaffaCakes118 (JaffaCakes118) joins |
| 11:02:33 | <imer> | VickoSaviour: I believe deviantart blocked the UA and the deadline ran out, so code patch to change UA again hasn't been applied since it's too late? |
| 11:02:51 | <imer> | nevermind lol, arkiver just patched it |
| 11:04:37 | <qwertyasdfuiopghjkl> | According to https://www.deviantart.com/team/journal/Convert-your-group-to-a-new-design-994388001, "all Groups will be migrated by [2024-04-08]" |
| 11:05:03 | <imer> | ah, topic said 25th |
| 11:05:38 | <imer> | we should head over to #devianttart if there's more to talk about :) |
| 11:15:45 | | kiryu quits [Remote host closed the connection] |
| 11:17:29 | | kiryu joins |
| 11:17:29 | | kiryu is now authenticated as kiryu |
| 11:17:29 | | kiryu quits [Changing host] |
| 11:17:29 | | kiryu (kiryu) joins |
| 11:30:18 | | vukky quits [Quit: @ERROR: max connections (-1) reached -- try again later] |
| 11:37:43 | | Wohlstand (Wohlstand) joins |
| 11:54:12 | | kiryu quits [Client Quit] |
| 11:59:42 | | kiryu joins |
| 11:59:42 | | kiryu is now authenticated as kiryu |
| 11:59:42 | | kiryu quits [Changing host] |
| 11:59:42 | | kiryu (kiryu) joins |
| 12:06:20 | | vukky (vukky) joins |
| 12:08:14 | | Wohlstand quits [Client Quit] |
| 12:48:22 | | GNU_world quits [Ping timeout: 255 seconds] |
| 12:57:13 | | Letur quits [Quit: Client Quit] |
| 12:59:25 | | Arcorann quits [Ping timeout: 272 seconds] |
| 13:00:03 | | Letur joins |
| 13:05:12 | | sec^nd quits [Ping timeout: 255 seconds] |
| 13:10:21 | | sec^nd (second) joins |
| 13:40:14 | | GNU_world joins |
| 14:02:37 | | zhongfu quits [Ping timeout: 255 seconds] |
| 14:22:45 | | zhongfu (zhongfu) joins |
| 14:28:43 | | line quits [Ping timeout: 272 seconds] |
| 14:30:22 | | line joins |
| 15:05:52 | | JaffaCakes118 quits [Remote host closed the connection] |
| 15:33:53 | | wickerz quits [Quit: The Lounge - https://thelounge.chat] |
| 15:34:15 | | wickerz joins |
| 16:10:41 | | tzt quits [Ping timeout: 272 seconds] |
| 16:11:30 | | tzt (tzt) joins |
| 16:56:17 | | abirkill- (abirkill) joins |
| 16:58:07 | | abirkill quits [Ping timeout: 255 seconds] |
| 16:58:07 | | abirkill- is now known as abirkill |
| 17:01:03 | <pokechu22> | from #archivebot: 14:53 <youbanana> Have you guys gotten google podcasts yet? It'll be shutting down in 3 days and I couldn't find a full scrape of it in the viewer. |
| 17:06:48 | <pokechu22> | do we have anything going on for that? |
| 17:07:29 | <pokechu22> | deathwatch says date is unknown |
| 17:40:07 | <pokechu22> | https://podcasts.google.com/ says the date is April 2 though |
| 17:40:58 | <h2ibot> | Pokechu22 edited Deathwatch (+53, /* 2024 */ April 2 for Google Podcasts): https://wiki.archiveteam.org/?diff=51991&oldid=51956 |
| 17:43:25 | <c3manu> | !ig 4bu2dpgytytcjp7bnhjfbudc2 ^https?://www\.tametick\.com/ |
| 18:21:28 | | Doranwen (Doranwen) joins |
| 18:24:08 | | Unholy23613166180851599738 (Unholy2361) joins |
| 18:24:25 | | Unholy23613166180851599738 quits [Client Quit] |
| 18:24:46 | | Unholy23613166180851599738 (Unholy2361) joins |
| 18:25:35 | | Dango360 quits [Ping timeout: 272 seconds] |
| 18:27:26 | | icedice (icedice) joins |
| 18:33:49 | | Dango360 (Dango360) joins |
| 18:37:47 | | Dango360_ joins |
| 18:41:37 | | Dango360 quits [Ping timeout: 255 seconds] |
| 19:10:31 | | icedice quits [Client Quit] |
| 19:26:18 | | icedice (icedice) joins |
| 19:44:21 | | VickoSaviour quits [Client Quit] |
| 19:52:57 | | nertzy joins |
| 20:12:41 | <fireonlive> | -+rss- A. K. Dewdney has died: https://lfpress.remembering.ca/obituary/alexander-dewdney-1089463499 https://news.ycombinator.com/item?id=39886272 |
| 20:13:33 | <fireonlive> | Dan Lynch Has Died (SRI, Arpanet, Internet) https://www.internethalloffame.org/2021/04/19/dan-lynchs-love-brilliant-complexity-fuels-early-internet-development-growth/ https://news.ycombinator.com/item?id=39887275 |
| 20:14:22 | <h2ibot> | That lurker edited List of websites excluded from the Wayback Machine (+22, Added ntcore.com): https://wiki.archiveteam.org/?diff=51992&oldid=51990 |
| 20:16:22 | | grid joins |
| 20:20:09 | | Dango360_ quits [Client Quit] |
| 20:20:29 | | Dango360 (Dango360) joins |
| 20:28:00 | | jacksonchen666 quits [Ping timeout: 255 seconds] |
| 20:39:44 | | that_lurker scrolls up and wonders why lounge got minus karma and then found out why |
| 20:39:49 | <that_lurker> | https://lounge.kuhaon.fun/folder/36e26e279eeb5fd9/fuuuck-nicolas-cage.gif |
| 20:44:08 | | tt joins |
| 20:44:47 | | jacksonchen666 (jacksonchen666) joins |
| 20:45:08 | | tt quits [Client Quit] |
| 20:52:18 | | eightthree quits [Remote host closed the connection] |
| 20:54:07 | | eightthree joins |
| 20:56:20 | | eightthree quits [Remote host closed the connection] |
| 21:00:28 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=51993&oldid=51992 |
| 21:01:24 | | eightthree joins |
| 21:09:36 | | jacksonchen666 quits [Remote host closed the connection] |
| 21:10:08 | | jacksonchen666 (jacksonchen666) joins |
| 21:15:36 | | eightthree quits [Remote host closed the connection] |
| 21:16:49 | | eightthree joins |
| 21:21:03 | | JaffaCakes118 (JaffaCakes118) joins |
| 21:27:14 | | pixel leaves [Error from remote client] |
| 21:27:15 | | pixel (pixel) joins |
| 21:28:20 | | eightthree quits [Remote host closed the connection] |
| 21:29:26 | | eightthree joins |
| 21:33:41 | <nicolas17> | fireonlive: SRI is where Siri came from |
| 21:38:48 | | eightthree quits [Remote host closed the connection] |
| 21:50:57 | | eightthree joins |
| 21:51:48 | | eightthree quits [Remote host closed the connection] |
| 21:52:36 | | eightthree joins |
| 22:03:38 | | eightthree quits [Remote host closed the connection] |
| 22:04:49 | | eightthree joins |
| 22:09:33 | | eightthree quits [Remote host closed the connection] |
| 22:11:58 | | eightthree joins |
| 22:13:12 | | BlueMaxima joins |
| 22:21:25 | <fireonlive> | :o |
| 22:21:35 | <fireonlive> | no wonder it's shitty, apple didn't make it |
| 22:24:22 | | teacold66 joins |
| 22:24:29 | <teacold66> | Could someone archive https://forum.kaspersky.com/ with archivebot please (not much coverage after mid 2023) |
| 22:28:47 | | midou quits [Ping timeout: 272 seconds] |
| 22:30:54 | | teacold66 quits [Client Quit] |
| 22:36:15 | | grid quits [Client Quit] |
| 22:38:44 | | nertzy quits [Remote host closed the connection] |
| 22:38:49 | | midou joins |
| 22:40:58 | <@JAA> | thuban: https://gitea.arpa.li/JustAnotherArchivist/little-things/src/branch/master/extract-urls-for-archiveteam-projects |
| 22:44:13 | <@JAA> | Oof re Google Podcasts |
| 23:05:35 | | eightthree quits [Remote host closed the connection] |
| 23:06:56 | | eightthree joins |
| 23:09:27 | | Bleo182600 quits [Client Quit] |
| 23:09:46 | | Bleo182600 joins |
| 23:33:45 | | Arcorann (Arcorann) joins |
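
JAA's 02:07:37 description of the CDX step (sort the entries for one WARC by offset, then look at the neighbouring responses) could be sketched roughly as below. This is only an illustration, not JAA's actual process: it assumes the common space-separated 11-field CDX layout, in which the offset is field 10 and the WARC filename is field 11, and the names `CdxRecord`, `parse_cdx`, and `nearby_records` are made up for this example. Finding the right WARC through the WBM headers is not covered.

```python
from typing import NamedTuple


class CdxRecord(NamedTuple):
    timestamp: str
    url: str
    offset: int
    filename: str


def parse_cdx(lines):
    """Parse 11-field CDX lines; skip the header and anything malformed."""
    records = []
    for line in lines:
        fields = line.split()
        # The " CDX N b a m ..." header splits into 12 tokens, so it is skipped here.
        if len(fields) != 11 or not fields[9].isdigit():
            continue
        # Assumed layout: field 2 = timestamp, 3 = original URL,
        # 10 = offset in the WARC, 11 = WARC filename (all 1-indexed).
        records.append(CdxRecord(fields[1], fields[2], int(fields[9]), fields[10]))
    return records


def nearby_records(records, target_url, window=2):
    """Return the records stored within `window` positions of target_url in the same WARC."""
    hits = [r for r in records if r.url == target_url]
    if not hits:
        return []
    target = hits[0]
    # Records written close together in a WARC were usually fetched close together.
    same_warc = sorted(
        (r for r in records if r.filename == target.filename),
        key=lambda r: r.offset,
    )
    i = same_warc.index(target)
    return same_warc[max(0, i - window): i + window + 1]
```

Calling `nearby_records(parse_cdx(open("some.cdx")), page_url)` with a hypothetical CDX file and page URL would list the responses written next to that page, which is where the audio requests would be expected to show up regardless of which URL pattern they used.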