00:00:32 | | Arcorann (Arcorann) joins |
00:06:05 | | Perk quits [Client Quit] |
00:09:13 | | Perk joins |
00:13:37 | | Perk quits [Client Quit] |
00:15:13 | | DLoader quits [Quit: DLoader] |
00:15:33 | | DLoader (DLoader) joins |
00:15:39 | | Perk joins |
00:24:26 | <eightthree> | should I use wpull, wget-lua or something else for downloading random website domains? wpull seems not in the aur, but will it work just fine in a python venv if my distro discourages installing directly through pip3 install? |
00:24:46 | <eightthree> | I'd be looking to get as close to browsing the actual website (only using local data and maybe even code, perhaps the same search engine like lucene or whatever the website uses), |
00:26:52 | | Perk quits [Client Quit] |
00:28:11 | | Perk joins |
00:49:57 | | Wohlstand quits [Client Quit] |
01:08:35 | <thuban> | eightthree: you probably want https://github.com/ArchiveTeam/grab-site/ (and a viewer like https://replayweb.page/ to browse the resulting warc) |
01:09:46 | <pabs> | eightthree: if it was me I'd just ask folks to run it in ArchiveBot, that doesn't have a JS interpreter or do page interactions though, which are often needed to get everything |
01:14:39 | <thuban> | fireonlive: you're still monitoring the archivebot websocket for project urls, correct? |
01:15:01 | <fireonlive> | indeed |
01:15:17 | | pabs is too (for code, wikis and Mailman/2) |
01:15:26 | <thuban> | cool, ty |
01:15:35 | <thuban> | oh lol, that explains why i couldn't remember who was doing it |
01:16:16 | <fireonlive> | :) |
01:16:34 | <thuban> | just double-checking since i noticed mediafire links in some of these scanlation blog jobs |
01:17:57 | <fireonlive> | ah ye |
01:36:20 | | MrMcNuggets joins |
01:37:13 | | MrMcNuggets quits [Client Quit] |
01:47:32 | <eightthree> | pabs: im confused, if I want to save the page as if it were downloaded/viewed with js and everything as though a real user browsing in a browser, are you saying wpull and wget-lua don't reproduce the resulting page "bug for bug" and "bit for bit", or is it the archivebot and the specific grab-site that doesn't? Or both? I don't think my website has a grab-site project associated with it, unless I just use the generic one that... |
01:47:37 | <eightthree> | ... isn't tailored to any specific site? |
01:48:38 | <@JAA> | eightthree: None of these tools know what JS is. |
01:49:17 | <@JAA> | If a site is very script-heavy, you may need brozzler. |
01:49:58 | <@JAA> | But exact and functional reproduction of script-heavy sites is hard to impossible. |
01:50:08 | <@JAA> | In the general case, anyway. |
01:51:08 | <eightthree> | thuban: I think Ill try locally hosting that https://github.com/webrecorder/replayweb.page, hopefully it works just as well as the site, thanks! |
01:52:27 | <@JAA> | That's only for playback, not for archival. |
01:53:04 | <pabs> | eightthree: think about a game written in JS, if you don't play it in a browser to the end and do all the side quests, you won't get everything. some JS websites are similar |
01:53:28 | <@JAA> | That's a good analogy! :-) |
01:53:44 | <pabs> | horrifying one but yeah :) |
01:55:25 | <nicolas17> | I remember a point-and-click Flash game (Myst style), each possible place you could be standing at was a different .swf |
01:56:05 | <pabs> | whoa |
01:56:28 | <pabs> | eightthree: which sites are you interested in btw? |
01:58:57 | <nicolas17> | pabs: well that's better than actual Myst/Riven where each possible place you can be standing at and each possible *state* the room can be in (door open / door closed?) is a different bitmap image |
02:05:08 | <TheTechRobo> | JAA: Would you happen to still have your process for retrieving the highest-quality audio from The Artists Union on WBM? |
02:05:47 | | TheTechRobo would rather not dig through logs with TheLounge's awful search functionality |
02:06:35 | <@JAA> | You're the second one tonight to confuse its perceived awful search with its actual general awfulness. ;-) |
02:06:58 | <@JAA> | I did it with the CDX. |
02:07:11 | <TheTechRobo> | JAA: Oh, yeah, TL sucks, but its search sucks especially. :-) |
02:07:23 | <TheTechRobo> | Trouble is, I still haven't gotten around to making my replacement for it. |
02:07:34 | <thuban> | TheTechRobo: recent discussion re tau: https://hackint.logs.kiska.pw/archiveteam-bs/20240224 |
02:07:35 | <TheTechRobo> | Whether I like it or not, it checks the most boxes out of any IRC client I've seen so far. |
02:07:37 | <@JAA> | Find the relevant WARC via WBM headers, sort the corresponding CDX by offset (field 10 or whatever it is), then look at the nearby responses. |
02:07:52 | <TheTechRobo> | thuban: Ah thanks, forgor about public logs. |
02:08:17 | <@JAA> | There are at least two or three different URL patterns for the audio URLs, so that's the most reliable method. |
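The procedure JAA describes above (find the WARC, sort its CDX by offset, then look at the neighbouring responses) can be sketched roughly as below. This is only an illustration under assumptions: it expects the common 11-field, space-separated CDX layout in which the offset is the 10th field (consistent with the "field 10 or whatever it is" remark, but not confirmed in the log), and the function names and sample records are hypothetical.

```python
# Sketch: order CDX records by their offset in the WARC and list the
# records stored next to a known URL. Assumes the 11-field layout
# "N b a m s k r M S V g" (original URL = field 3, offset = field 10).

def parse_cdx(lines, url_field=3, offset_field=10):
    """Return a list of dicts with 'url', 'offset' and 'fields' per record."""
    records = []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and the " CDX N b a ..." header line.
        if not stripped or stripped.startswith("CDX "):
            continue
        fields = stripped.split()
        if len(fields) < offset_field or not fields[offset_field - 1].isdigit():
            continue  # not a record in the assumed layout
        records.append({
            "url": fields[url_field - 1],
            "offset": int(fields[offset_field - 1]),
            "fields": fields,
        })
    return records


def nearby(records, target_url, window=2):
    """Return records within `window` positions of target_url, in offset order."""
    ordered = sorted(records, key=lambda r: r["offset"])
    for i, rec in enumerate(ordered):
        if rec["url"] == target_url:
            return ordered[max(0, i - window): i + window + 1]
    return []


# Hypothetical sample data, not taken from any real WARC.
SAMPLE = [
    " CDX N b a m s k r M S V g",
    "com,example)/page 20240101000000 https://example.com/page text/html 200 AAA - - 500 1000 x.warc.gz",
    "com,example)/audio/hq.mp3 20240101000002 https://example.com/audio/hq.mp3 audio/mpeg 200 BBB - - 9000 1500 x.warc.gz",
    "com,example)/other 20240101000001 https://example.com/other text/html 200 CCC - - 400 100 x.warc.gz",
]

if __name__ == "__main__":
    for rec in nearby(parse_cdx(SAMPLE), "https://example.com/page", window=1):
        print(rec["offset"], rec["url"])
```

Sorting by offset rather than by URL matters here because, as noted above, the audio URLs follow several different patterns; records fetched together tend to sit next to each other in the WARC regardless of what their URLs look like.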
02:08:34 | <nicolas17> | my phone autocomplete learned the word "forgor" recently |
02:08:40 | <fireonlive> | The Lounge best irc client |
02:08:46 | <TheTechRobo> | Fun fact about The Lounge I learned a few weeks ago: Its sqlite database option is literally just a table of JSON objects. No wonder it can only show messages it's loaded into memory already. |
02:09:12 | <@JAA> | Yeah, the awfulness is fractal. |
02:09:28 | <fireonlive> | yeah.... best.. database.. design.. 🥲 |
02:11:04 | <TheTechRobo> | TheLounge-- |
02:11:05 | <eggdrop> | [karma] 'TheLounge' now has -1 karma! |
02:11:11 | <TheTechRobo> | or is it with a space |
02:11:20 | <fireonlive> | !karma The Lounge |
02:11:21 | <eggdrop> | [karma] "The Lounge" has -1 karma |
02:11:28 | <fireonlive> | The Lounge-- |
02:11:30 | <eggdrop> | [karma] 'The Lounge' now has -3 karma! |
02:11:33 | <fireonlive> | some ppl like https://github.com/glowing-bear/glowing-bear |
02:11:37 | <fireonlive> | but it uploads to imgur i think |
02:11:38 | <@JAA> | The automatic history deletion in the next release is going to surprise a bunch of people. |
02:12:04 | <TheTechRobo> | fireonlive: That actually looks really cool |
02:12:06 | <TheTechRobo> | JAA: The what? |
02:12:10 | <@JAA> | :-) |
02:12:19 | <fireonlive> | rip me history |
02:12:31 | <fireonlive> | it's disabled by default though isn't it? |
02:12:38 | <@JAA> | The 'data hoarder' option only deletes 'low-value' messages. |
02:12:39 | <TheTechRobo> | Where's that meme when I need it? |
02:12:48 | <@JAA> | As I said, the awfulness is fractal. |
02:12:52 | <@JAA> | https://github.com/thelounge/thelounge/pull/4799 |
02:13:11 | <thuban> | ಠ_ಠ |
02:13:18 | <@JAA> | fireonlive: Might be, yeah. I wouldn't trust them to not enable it by default though. |
02:13:27 | <fireonlive> | true |
02:13:34 | <fireonlive> | maybe i should look into glowing bear |
02:13:43 | <fireonlive> | and replacing imgur with.. not that |
02:13:59 | <TheTechRobo> | yeah that'll be way easier than what I wanted to do |
02:14:14 | <@JAA> | The Lounge-- |
02:14:15 | <eggdrop> | [karma] 'The Lounge' now has -4 karma! |
02:15:20 | <TheTechRobo> | > and then provides some nice features on top of that, like embedding images, videos, and other content |
02:15:20 | <TheTechRobo> | please tell me that's configurable |
02:15:20 | | Lord_Nightmare (Lord_Nightmare) joins |
02:17:15 | <fireonlive> | the ddos must continue |
02:22:50 | <eightthree> | Im trying to join #down-the-tube and failing 4+ times through heisenbridge, is there anything different with the room like +R +r, just in case? the room is functional? Otherwise I'll keep trying to sort it out with other bridge users in case it's failing at that level |
02:26:19 | <icedice> | Convos is another option: https://convos.chat/ |
02:28:12 | <@JAA> | eightthree: #down-the-tube has the same modes as this channel, -p. |
02:47:10 | | ^ quits [Ping timeout: 255 seconds] |
02:47:49 | | ^ (^) joins |
02:49:35 | <TheTechRobo> | icedice: This feels imposing https://lounge.thetechrobo.ca/uploads/3ffe46ce618ea206/image.png |
02:52:38 | <h2ibot> | Ryz edited List of websites excluded from the Wayback Machine (+28, Added https://www.tamindir.com/): https://wiki.archiveteam.org/?diff=51989&oldid=51982 |
03:00:39 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=51990&oldid=51989 |
03:14:10 | | Perk quits [Read error: Connection reset by peer] |
03:16:32 | | Perk joins |
03:16:33 | | Perk7 joins |
03:16:34 | | Perk quits [Remote host closed the connection] |
03:16:34 | | Perk7 is now known as Perk |
03:38:59 | | ^ quits [Remote host closed the connection] |
03:39:13 | | ^ (^) joins |
03:48:34 | | fireonlive is now known as \ |
03:48:40 | | \ is now known as fireonlive |
04:04:32 | <thuban> | aw, this (spanish-language) scanlation blog has a bunch of taringa links :( |
04:05:48 | <@JAA> | F |
04:36:41 | <thuban> | and this other one a bunch of zippyshare :( |
04:46:44 | | GNU_world joins |
05:14:51 | | grid joins |
06:15:14 | | qwertyasdfuiopghjkl quits [Client Quit] |
06:16:22 | <nicolas17> | thuban: not on WBM? |
06:17:25 | <thuban> | didn't check, since we can't do anything about it at this point |
06:24:47 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
06:28:36 | | Ruthalas59 quits [Client Quit] |
06:39:21 | | bladem quits [Read error: Connection reset by peer] |
06:39:31 | | BlueMaxima quits [Read error: Connection reset by peer] |
06:42:05 | | Naruyoko5 quits [Quit: Leaving] |
06:53:46 | | Ruthalas59 (Ruthalas) joins |
06:59:58 | | Naruyoko joins |
07:03:13 | | pabs quits [Ping timeout: 255 seconds] |
07:05:09 | | pabs (pabs) joins |
07:24:45 | | grid quits [Client Quit] |
07:26:12 | <thuban> | hm, do we have a regex for pastebin we can add to the url lists template? |
07:32:53 | <fireonlive> | i've just been using the imgur one but with pastebin\.com instead |
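As a rough illustration of the kind of pattern being discussed, the sketch below pulls pastebin.com links out of a block of text. The regex is an assumption written for this example; it is not the imgur pattern from the URL lists template, which does not appear in the log, and the function name is hypothetical.

```python
import re

# Assumed pattern: http or https, optional "www.", then pastebin\.com and a path.
PASTEBIN_RE = re.compile(r"https?://(?:www\.)?pastebin\.com/[A-Za-z0-9/_?=&.-]+")


def extract_pastebin_urls(text):
    """Return pastebin.com URLs found in text, deduplicated, in first-seen order."""
    # Strip a trailing "." so a sentence-ending period is not kept as part of the URL.
    found = (m.rstrip(".") for m in PASTEBIN_RE.findall(text))
    return list(dict.fromkeys(found))


if __name__ == "__main__":
    sample = (
        "see https://pastebin.com/abc123 and http://www.pastebin.com/raw/XyZ789, "
        "also https://imgur.com/a/foo and https://pastebin.com/abc123."
    )
    print(extract_pastebin_urls(sample))
```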
08:19:29 | | GNU_world quits [Ping timeout: 272 seconds] |
09:00:01 | | Bleo182600 quits [Client Quit] |
09:01:22 | | Bleo182600 joins |
09:27:39 | | GNU_world joins |
09:32:02 | | line joins |
09:35:29 | | line_ quits [Ping timeout: 272 seconds] |
09:47:02 | <icedice> | <TheTechRobo> icedice: This feels imposing https://lounge.thetechrobo.ca/uploads/3ffe46ce618ea206/image.png |
09:47:03 | <icedice> | oof |
09:47:14 | <icedice> | See you guys later |
09:47:16 | | icedice quits [Client Quit] |
10:15:23 | | VickoSaviour joins |
10:18:18 | <VickoSaviour> | I was just searching the wiki for fun when I checked the Friendster archive. Looking at it, surely it was a big and a close one, like the Google+ project... Now, when I checked the website that was supposed to be offline, I saw that it is online (probably relaunching, copyright 2023) and it has an early access sign-up. Someone could edit the |
10:18:19 | <VickoSaviour> | wiki page... |
10:22:11 | <VickoSaviour> | Also, while I'm still on IRC, can someone tell me what happened to the DeviantArt project? We have about 1 million items, and it is stuck at 1.31 items... Did we finish backing up the groups, or did it get stuck on those items? |
10:24:44 | | JaffaCakes118 (JaffaCakes118) joins |
10:28:25 | | JaffaCakes118_2 quits [Ping timeout: 255 seconds] |
10:33:17 | | JaffaCakes118 quits [Remote host closed the connection] |
10:47:12 | | JaffaCakes118 (JaffaCakes118) joins |
11:02:33 | <imer> | VickoSaviour: I believe deviantart blocked the UA and the deadline ran out, so code patch to change UA again hasn't been applied since it's too late? |
11:02:51 | <imer> | nevermind lol arkiver just patched it |
11:04:37 | <qwertyasdfuiopghjkl> | According to https://www.deviantart.com/team/journal/Convert-your-group-to-a-new-design-994388001, "all Groups will be migrated by [2024-04-08]" |
11:05:03 | <imer> | ah, topic said 25th |
11:05:38 | <imer> | we should head over to #devianttart if there's more to talk about :) |
11:15:45 | | kiryu quits [Remote host closed the connection] |
11:17:29 | | kiryu joins |
11:17:29 | | kiryu is now authenticated as kiryu |
11:17:29 | | kiryu quits [Changing host] |
11:17:29 | | kiryu (kiryu) joins |
11:30:18 | | vukky quits [Quit: @ERROR: max connections (-1) reached -- try again later] |
11:37:43 | | Wohlstand (Wohlstand) joins |
11:54:12 | | kiryu quits [Client Quit] |
11:59:42 | | kiryu joins |
11:59:42 | | kiryu is now authenticated as kiryu |
11:59:42 | | kiryu quits [Changing host] |
11:59:42 | | kiryu (kiryu) joins |
12:06:20 | | vukky (vukky) joins |
12:08:14 | | Wohlstand quits [Client Quit] |
12:48:22 | | GNU_world quits [Ping timeout: 255 seconds] |
12:57:13 | | Letur quits [Quit: Client Quit] |
12:59:25 | | Arcorann quits [Ping timeout: 272 seconds] |
13:00:03 | | Letur joins |
13:05:12 | | sec^nd quits [Ping timeout: 255 seconds] |
13:10:21 | | sec^nd (second) joins |
13:40:14 | | GNU_world joins |
14:02:37 | | zhongfu quits [Ping timeout: 255 seconds] |
14:22:45 | | zhongfu (zhongfu) joins |
14:28:43 | | line quits [Ping timeout: 272 seconds] |
14:30:22 | | line joins |
15:05:52 | | JaffaCakes118 quits [Remote host closed the connection] |
15:33:53 | | wickerz quits [Quit: The Lounge - https://thelounge.chat] |
15:34:15 | | wickerz joins |
16:10:41 | | tzt quits [Ping timeout: 272 seconds] |
16:11:30 | | tzt (tzt) joins |
16:56:17 | | abirkill- (abirkill) joins |
16:58:07 | | abirkill quits [Ping timeout: 255 seconds] |
16:58:07 | | abirkill- is now known as abirkill |
17:01:03 | <pokechu22> | from #archivebot: 14:53 <youbanana> Have you guys gotten google podcasts yet? It'll be shutting down in 3 days and I couldn't find a full scrape of it in the viewer. |
17:06:48 | <pokechu22> | do we have anything going on for that? |
17:07:29 | <pokechu22> | deathwatch says date is unknown |
17:40:07 | <pokechu22> | https://podcasts.google.com/ says the date is April 2 though |
17:40:58 | <h2ibot> | Pokechu22 edited Deathwatch (+53, /* 2024 */ April 2 for Google Podcasts): https://wiki.archiveteam.org/?diff=51991&oldid=51956 |
17:43:25 | <c3manu> | !ig 4bu2dpgytytcjp7bnhjfbudc2 ^https?://www\.tametick\.com/ |
18:21:28 | | Doranwen (Doranwen) joins |
18:24:08 | | Unholy23613166180851599738 (Unholy2361) joins |
18:24:25 | | Unholy23613166180851599738 quits [Client Quit] |
18:24:46 | | Unholy23613166180851599738 (Unholy2361) joins |
18:25:35 | | Dango360 quits [Ping timeout: 272 seconds] |
18:27:26 | | icedice (icedice) joins |
18:33:49 | | Dango360 (Dango360) joins |
18:37:47 | | Dango360_ joins |
18:41:37 | | Dango360 quits [Ping timeout: 255 seconds] |
19:10:31 | | icedice quits [Client Quit] |
19:26:18 | | icedice (icedice) joins |
19:44:21 | | VickoSaviour quits [Client Quit] |
19:52:57 | | nertzy joins |
20:12:41 | <fireonlive> | -+rss- A. K. Dewdney has died: https://lfpress.remembering.ca/obituary/alexander-dewdney-1089463499 https://news.ycombinator.com/item?id=39886272 |
20:13:33 | <fireonlive> | Dan Lynch Has Died (SRI, Arpanet, Internet) https://www.internethalloffame.org/2021/04/19/dan-lynchs-love-brilliant-complexity-fuels-early-internet-development-growth/ https://news.ycombinator.com/item?id=39887275 |
20:14:22 | <h2ibot> | That lurker edited List of websites excluded from the Wayback Machine (+22, Added ntcore.com): https://wiki.archiveteam.org/?diff=51992&oldid=51990 |
20:16:22 | | grid joins |
20:20:09 | | Dango360_ quits [Client Quit] |
20:20:29 | | Dango360 (Dango360) joins |
20:28:00 | | jacksonchen666 quits [Ping timeout: 255 seconds] |
20:39:44 | | that_lurker scrolls up and wonders why lounge got minus karma and then found out why |
20:39:49 | <that_lurker> | https://lounge.kuhaon.fun/folder/36e26e279eeb5fd9/fuuuck-nicolas-cage.gif |
20:44:08 | | tt joins |
20:44:47 | | jacksonchen666 (jacksonchen666) joins |
20:45:08 | | tt quits [Client Quit] |
20:52:18 | | eightthree quits [Remote host closed the connection] |
20:54:07 | | eightthree joins |
20:56:20 | | eightthree quits [Remote host closed the connection] |
21:00:28 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=51993&oldid=51992 |
21:01:24 | | eightthree joins |
21:09:36 | | jacksonchen666 quits [Remote host closed the connection] |
21:10:08 | | jacksonchen666 (jacksonchen666) joins |
21:15:36 | | eightthree quits [Remote host closed the connection] |
21:16:49 | | eightthree joins |
21:21:03 | | JaffaCakes118 (JaffaCakes118) joins |
21:27:14 | | pixel leaves [Error from remote client] |
21:27:15 | | pixel (pixel) joins |
21:28:20 | | eightthree quits [Remote host closed the connection] |
21:29:26 | | eightthree joins |
21:33:41 | <nicolas17> | fireonlive: SRI is where Siri came from |
21:38:48 | | eightthree quits [Remote host closed the connection] |
21:50:57 | | eightthree joins |
21:51:48 | | eightthree quits [Remote host closed the connection] |
21:52:36 | | eightthree joins |
22:03:38 | | eightthree quits [Remote host closed the connection] |
22:04:49 | | eightthree joins |
22:09:33 | | eightthree quits [Remote host closed the connection] |
22:11:58 | | eightthree joins |
22:13:12 | | BlueMaxima joins |
22:21:25 | <fireonlive> | :o |
22:21:35 | <fireonlive> | no wonder it's shitty, apple didn't make it |
22:24:22 | | teacold66 joins |
22:24:29 | <teacold66> | Could someone archive https://forum.kaspersky.com/ with archivebot please (not much coverage after mid 2023) |
22:28:47 | | midou quits [Ping timeout: 272 seconds] |
22:30:54 | | teacold66 quits [Client Quit] |
22:36:15 | | grid quits [Client Quit] |
22:38:44 | | nertzy quits [Remote host closed the connection] |
22:38:49 | | midou joins |
22:40:58 | <@JAA> | thuban: https://gitea.arpa.li/JustAnotherArchivist/little-things/src/branch/master/extract-urls-for-archiveteam-projects |
22:44:13 | <@JAA> | Oof re Google Podcasts |
23:05:35 | | eightthree quits [Remote host closed the connection] |
23:06:56 | | eightthree joins |
23:09:27 | | Bleo182600 quits [Client Quit] |
23:09:46 | | Bleo182600 joins |
23:33:45 | | Arcorann (Arcorann) joins |