00:00:10 | | sec^nd quits [Remote host closed the connection] |
00:00:56 | | sec^nd (second) joins |
00:28:55 | | Umbire quits [Remote host closed the connection] |
00:29:24 | | Umbire (Umbire) joins |
00:34:10 | | xarph quits [Quit: ZNC 1.8.2+deb2ubuntu0.1 - https://znc.in] |
00:36:08 | | xarph joins |
02:48:14 | | lemuria (lemuria) joins |
02:51:12 | | archiveDrill quits [Quit: The Lounge - https://thelounge.chat] |
03:17:41 | <pabs> | https://discourse.ubuntu.com/t/introducing-debcrafters/63674 |
03:20:18 | <nicolas17> | pabs: sounds like an easier way to get paid to work on Ubuntu than Canonical's hiring process :D |
03:20:58 | <pabs> | sounds like you'd still have to go through that process to get paid? |
03:21:06 | <nicolas17> | ah I misread the "paid to contribute one day per week" part |
03:21:20 | <pabs> | just with the extra requirement of having to already be a volunteer Ubuntu contributor first |
03:21:21 | <nicolas17> | that's "apart from their regular job doing Ubuntu stuff" |
03:22:08 | <pabs> | did you see this? https://dustri.org/b/my-experience-with-canonicals-interview-process.html |
03:22:44 | <nicolas17> | yes |
03:22:47 | <nicolas17> | and others |
03:22:54 | <nicolas17> | I have yet to see anyone talk about a positive experience |
03:23:27 | <pabs> | I hear people already there enjoy the job |
03:26:26 | <nicolas17> | yeah you just have to subject yourself to that hazing process to maybe get in |
03:28:02 | <nicolas17> | it sounds like Mark Shuttleworth should be tied to the same rocket as Elon Musk and Mark Zuckerberg and aimed in the general direction of Mars |
03:29:00 | <BlankEclair> | he's named shuttleworth of all things |
03:31:34 | | HackMii quits [Ping timeout: 264 seconds] |
03:31:52 | <pabs> | TBH, if I could skip working on the Ubuntu Archive, and only do FOSS work upstream on Debian and other projects, the hazing might be tempting... |
03:32:05 | | HackMii (hacktheplanet) joins |
03:36:42 | <BlankEclair> | > So I exercised my GDPR rights, and asked to be communicated everything pertaining to my interviews. |
03:36:45 | <BlankEclair> | holy shit that's nice |
03:45:32 | <pabs> | https://www.nullpt.rs/reversing-botid |
03:52:14 | | Umbire quits [Ping timeout: 260 seconds] |
04:11:37 | | Guest58 joins |
04:12:53 | | Guest58 quits [Client Quit] |
04:13:51 | | HackMii quits [Remote host closed the connection] |
04:13:51 | | HackMii_ (hacktheplanet) joins |
04:40:48 | | Guest58 joins |
04:53:29 | | jinn6 quits [Ping timeout: 260 seconds] |
04:55:26 | | jinn6 joins |
04:58:49 | | Lord_Nightmare quits [Quit: ZNC - http://znc.in] |
05:02:11 | | Lord_Nightmare (Lord_Nightmare) joins |
05:16:06 | | archiveDrill joins |
05:25:36 | | Guest58 quits [Client Quit] |
05:44:08 | | Umbire (Umbire) joins |
05:46:14 | | Guest58 joins |
06:09:19 | | Umbire quits [Ping timeout: 260 seconds] |
06:18:14 | | Guest58 quits [Client Quit] |
06:19:33 | | Guest58 joins |
06:26:10 | | trix quits [Quit: o7 lain.ripe.net] |
06:30:39 | | trix (trix) joins |
06:43:20 | | Umbire (Umbire) joins |
07:05:13 | | Guest58 quits [Client Quit] |
07:05:54 | | Guest58 joins |
07:07:41 | | Umbire quits [Ping timeout: 276 seconds] |
07:10:34 | | Guest58 quits [Ping timeout: 260 seconds] |
08:04:30 | | Guest58 joins |
08:29:07 | | Guest58 quits [Client Quit] |
08:31:59 | | Guest58 joins |
09:34:39 | | ducky quits [Ping timeout: 260 seconds] |
09:35:01 | | ducky (ducky) joins |
09:40:45 | | Dada joins |
11:00:01 | | Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat] |
11:02:56 | | Bleo182600722719623455222 joins |
11:14:39 | | Guest58 quits [Client Quit] |
11:18:47 | | Guest58 joins |
12:42:26 | | yasomi quits [Ping timeout: 276 seconds] |
12:46:06 | | yasomi (yasomi) joins |
12:46:06 | | Medowar quits [Read error: Connection reset by peer] |
12:46:48 | | Medowar joins |
12:47:47 | | Medowar is now authenticated as Medowar |
12:51:50 | <@JAA> | I received a 'reaction' email from GMail for the first time. Thunderbird classified it as spam. |
12:54:00 | | katocala is now authenticated as katocala |
12:57:24 | <jinn6> | sounds about right |
12:57:35 | <jinn6> | also no topic set in here |
12:57:54 | | katia pulls hexa_'s tail |
12:58:14 | <jinn6> | or in #archiveteam-bs for that matter, lol |
13:05:23 | | Guest58 quits [Client Quit] |
13:06:42 | <nulldata> | https://blog.cloudflare.com/introducing-pay-per-crawl/ |
13:07:03 | | steering wonders which server(s) lost the topic |
13:07:24 | <katia> | aaaaaaaaaaaaaaaaaaaaaaaaa clownflare |
13:30:01 | <masterx244|m> | s/clownflare/buttflare/g |
13:31:45 | | Guest58 joins |
13:36:54 | | Dada quits [Remote host closed the connection] |
13:38:10 | <jinn6> | clownflare is great |
13:38:33 | <jinn6> | apparently I'm on vindobona.hackint.org and that doesn't have topic |
13:42:03 | | Guest58 quits [Client Quit] |
13:46:57 | | ducky quits [Remote host closed the connection] |
13:47:55 | | ducky (ducky) joins |
13:48:16 | | ducky quits [Read error: Connection reset by peer] |
13:50:39 | | ducky (ducky) joins |
13:52:01 | | ducky quits [Read error: Connection reset by peer] |
13:53:48 | | ducky (ducky) joins |
13:53:49 | | Guest58 joins |
13:56:20 | | Umbire (Umbire) joins |
14:04:48 | | Guest58 quits [Client Quit] |
14:19:15 | | HackMii_ quits [Remote host closed the connection] |
14:19:35 | | HackMii (hacktheplanet) joins |
14:42:04 | <IDK> | https://www.wired.com/story/cloudflare-blocks-ai-crawlers-default/ |
14:42:17 | <IDK> | The end for SPN? :( |
14:45:58 | <katia> | fck cld |
14:47:51 | <IDK> | --🤡🔥 |
14:48:23 | <IDK> | 🤡🔥-- |
14:48:24 | <eggdrop> | [karma] '🤡🔥' now has -1 karma! |
14:48:30 | <IDK> | there |
15:12:49 | | archiveDrill2 joins |
15:15:12 | | archiveDrill quits [Ping timeout: 276 seconds] |
15:15:12 | | archiveDrill2 is now known as archiveDrill |
15:17:56 | | @Fusl quits [Quit: K-Lined] |
15:18:13 | | Fusl (Fusl) joins |
15:18:13 | | @ChanServ sets mode: +o Fusl |
15:30:09 | <jinn6> | "You've run out of free articles." ah yes |
15:40:26 | <Umbire> | I mean the scraping has also been outright causing various sites to buckle from the traffic |
15:40:40 | <Umbire> | including a wiki I admin (which also offers downloads of its contents to boot) |
15:41:39 | <Umbire> | jinn6, https://archive.is/IT0Jj |
15:42:33 | | wessel15126 joins |
15:44:33 | <jinn6> | thx |
15:44:54 | | wessel15126 quits [Client Quit] |
15:46:20 | | wessel15126 joins |
15:47:01 | <jinn6> | you could argue that those sites were poorly designed to begin with, tbh, like, the expensive dynamically generated endpoints, like "show page edit history", or "show version diff" should be allowed for accepted users only, or such.....but yeah, the bots don't care |
15:47:58 | <Umbire> | yeah I don't think sites should like, explode because they were poorly designed |
15:48:11 | <Umbire> | and exploded sites tend to be a bit trickier to make design changes to |
15:48:15 | | wessel15126 quits [Client Quit] |
15:48:41 | | wessel1512 joins |
15:50:04 | <Umbire> | in the same vein one could also argue for much more sensible scraping (or just avoiding it altogether, in the case of sites offering their own archives and such) |
15:50:15 | <Umbire> | none of this said with any hostility, just to be clear |
15:50:30 | <Umbire> | tend to come off combative when I'm not aiming to |
15:51:20 | <jinn6> | something something robots.txt |
15:52:28 | <jinn6> | and because of a few (dozen) bad actors, now people end up blocking like, "every cloud hosting IP", and stuff |
15:53:21 | <Umbire> | nothing about it's ideal lol |
15:53:41 | <Umbire> | and I'm not a site designer so idk how feasible robots.txt is as an option for a given type of site |
15:53:44 | <Umbire> | in either direction |
15:59:33 | <jinn6> | quite feasible, but robots.txt is just a guideline that only well-behaved bots care for |
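To make the robots.txt point concrete, here is a minimal sketch of how a well-behaved bot might consult the file before fetching, using Python's standard `urllib.robotparser`. The rules, the `ExampleArchiveBot` user agent, and the `wiki.example` URLs are all made up for illustration; nothing here forces a badly behaved bot to run such a check, which is the limitation described above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a wiki: keep bots off the expensive
# dynamic endpoint (history, diffs) while leaving article pages open.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php
Allow: /wiki/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    bot = "ExampleArchiveBot"  # hypothetical user agent
    print(is_allowed(bot, "https://wiki.example/wiki/Main_Page"))
    print(is_allowed(bot, "https://wiki.example/index.php?title=Foo&action=history"))
```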
15:59:53 | <Umbire> | If it isn't already clear I'd rather bots used sensibly for archival purposes be not caught in the crossfire (also rather it be done somewhat sensibly but then I think most archival bots from teams that even halfway know what they're doing fall under that) |
16:00:57 | <Umbire> | and for further clarity I don't have anything to do with the wiki site's own hosting, I just admin the wiki itself |
16:03:01 | <jinn6> | yeah, most bots are fine, problem is the LLM-trainers just get like a gazillion IPs all hammering on stuff, so you cannot even block them with normal tools, since a given IP might only hit a few pages a minute, and when blocked, just switches to another IP...which is why I can't walk 10 steps without another captcha, anubis, turnstile, etc hitting me in the face, nowadays >:( |
16:03:37 | <Umbire> | yeah and I can't really blame most people for blanket blocking as much as I also think it fucking Sucks |
16:04:07 | <Umbire> | (even not factoring my own personal beefs with Cloudflare, which seems to be mandatory to chat in here :P) |
16:06:36 | <jinn6> | at least anubis by default lets stuff through that doesn't pretend to be mozilla-like (and as such can just be blocked based on user-agent if needed)...but most people just captcha/cloudflare... |
16:09:43 | <Umbire> | yeah another editor suggested anubis also, might bring it up again |
16:09:56 | <Umbire> | ideally while there's still a lull in the scraping hits for us |
16:17:16 | <jinn6> | it's not ideal, but at least it lets text-only browsers through |
16:17:54 | <Umbire> | mhmhmmm |
16:20:09 | <jinn6> | there's a bunch of other thingies by now too, but I have no idea what their defaults are like |
16:21:59 | <jinn6> | go-away, powxy, apparently even a css based one, I've heard that even just checking e.g. whether the requester sends a referer header is a surprisingly decent way to block bots, but I don't have any personal experience with all that jazz |
16:22:51 | | wessel1512 is now authenticated as wessel1512 |
16:23:21 | <Umbire> | I see, will look into |
16:23:36 | <Umbire> | thanks for the chat, gonna make lunch now |
16:26:41 | | anonymoususer852 quits [Ping timeout: 276 seconds] |
16:27:21 | <jinn6> | a funny idea I've thought about is using server-side imagemaps: every browser that is able to display images supports them, down to ncsa mosaic, and all it does is append the clicked coordinates to the url, but it could probably be used with an ai-poisoned-image as a crude DIY captcha |
16:27:53 | <jinn6> | (it wouldn't be accessible to blind people, and it wouldn't work in text browsers without images, though) |
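A rough sketch of the server side of that idea, assuming the standard `ismap` behaviour where a click on `<a href="/check"><img src="..." ismap></a>` makes the browser request `/check?x,y`. The `/check` path, the function names, and the target rectangle are hypothetical; a real check would also need a per-session target and some rate limiting.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def parse_ismap_click(url: str) -> Optional[Tuple[int, int]]:
    """Extract (x, y) from an ismap-style URL such as '/check?57,42'."""
    query = urlsplit(url).query
    x_str, sep, y_str = query.partition(",")
    # Accept only plain ASCII digits on both sides of a single comma.
    if not sep or not (x_str.isascii() and x_str.isdigit()):
        return None
    if not (y_str.isascii() and y_str.isdigit()):
        return None
    return int(x_str), int(y_str)


def click_in_target(url: str, target: Tuple[int, int, int, int] = (40, 20, 80, 60)) -> bool:
    """Return True if the click falls inside target = (left, top, right, bottom).

    The default rectangle is an arbitrary placeholder for wherever the
    'click here' object sits in the served image.
    """
    click = parse_ismap_click(url)
    if click is None:
        return False
    x, y = click
    left, top, right, bottom = target
    return left <= x < right and top <= y < bottom
```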
16:28:06 | | anonymoususer852 (anonymoususer852) joins |
16:48:54 | | grill (grill) joins |
17:22:45 | <nicolas17> | it's unfortunate that any whitelist to let us archivers through would be easy for LLM scrapers to abuse too |
17:25:00 | <Umbire> | yeah |
17:26:34 | | grill quits [Ping timeout: 260 seconds] |
17:28:26 | | grill (grill) joins |
17:32:13 | | Dada joins |
17:50:18 | | Umbire quits [Quit: [brb]] |
18:08:49 | | HP_Archivist quits [Quit: Leaving] |
18:13:14 | | grill quits [Ping timeout: 260 seconds] |
18:15:05 | | grill (grill) joins |
18:17:34 | | Umbire (Umbire) joins |
18:31:42 | | Umbire quits [Remote host closed the connection] |
18:32:11 | | Umbire (Umbire) joins |
18:55:40 | | pixel (pixel) joins |
19:00:44 | | HP_Archivist (HP_Archivist) joins |
19:08:11 | <yzqzss> | > apparently even a css based one |
19:08:11 | <yzqzss> | You mean https://github.com/yzqzss/csswaf ? :) |
19:09:37 | <yzqzss> | > using server-side imagemaps |
19:09:57 | <yzqzss> | Imagemap captchas are common on Tor |
19:20:34 | | Umbire quits [Remote host closed the connection] |
19:21:03 | | Umbire (Umbire) joins |
19:43:26 | <jinn6> | I've never seen one |
19:43:39 | <jinn6> | in fact, I've never seen anyone use server-side imagemaps anywhere, ever |
19:46:54 | | HP_Archivist quits [Client Quit] |
19:48:08 | <nicolas17> | I have seen it for a JS-free IE5-era image editor |
19:48:11 | <nicolas17> | you could like |
19:48:13 | <nicolas17> | crop images |
19:48:52 | <nicolas17> | by clicking a corner of the rectangle you want to crop, it would send the coordinates to the server with an <input type="image"> and return a new html page where you click the second corner |
19:49:33 | <jinn6> | that's cool |
19:50:11 | <nicolas17> | idr if it was JS-free |
19:50:27 | <jinn6> | only by very explicitly searching for them, have I found a handful of sites demoing them, but I've never seen one be actually used, and all the software for dealing with it seems to be so ancient that the web servers themselves don't even exist |
19:50:30 | <nicolas17> | maybe it was one of those "if IE then use document.all, if Netscape then use document.layers" |
19:58:20 | <steering> | < nicolas17> it's unfortunate that any whitelist to let us archivers through would be easy for LLM scrapers to abuse too |
19:58:55 | <steering> | I would love to see someone start an "IP safelist" that would just compile IPs from/of various reputable sources (archive.org, whatever IPs AT chose to share) |
19:59:34 | <nicolas17> | IPs would be the safest way yeah |
20:00:06 | <nicolas17> | I just mean we can't ask to pretty please whitelist the SPN user agent, AI scrapers would figure it out pretty quick |
20:00:13 | <steering> | yeah for sure |
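As a sketch of what consuming such an "IP safelist" could look like on the site side, the snippet below checks a client address against a list of networks with Python's standard `ipaddress` module. The networks shown are reserved documentation ranges used as placeholders, not real archive.org or ArchiveTeam addresses, and `is_safelisted` is a hypothetical helper name.

```python
import ipaddress

# Placeholder networks (RFC 5737 / RFC 3849 documentation ranges).
# A real safelist would be loaded from whatever the maintainers publish.
SAFELIST = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "2001:db8:1234::/48")
]


def is_safelisted(addr: str) -> bool:
    """Return True if addr parses as an IP address inside a safelisted network."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False  # not a valid IPv4/IPv6 address
    return any(ip.version == net.version and ip in net for net in SAFELIST)
```

Matching on source IP rather than user agent is the point made above: a user agent string can be copied by anyone, while a source address is much harder to borrow.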
20:00:39 | <nicolas17> | ffs |
20:00:46 | <nicolas17> | https://apps.microsoft.com/detail/9nf052js0rj4 it seems this app includes an entire copy of Okular and is then getting people angry at KDE because the so-called PDF editor doesn't actually edit |
20:01:07 | <nicolas17> | GPL violation too, of course |
20:19:14 | | grill quits [Ping timeout: 260 seconds] |
20:21:19 | <jinn6> | as long as they don't claim to be okular themselves, it should be somewhat fine-ish |
20:22:13 | <jinn6> | just like blender's had long-running trouble too, people just take it and rebrand slightly, and make money (or don't rebrand and just insert ads into the original) |
20:23:35 | <masterx244|m> | did the blender devs get that pest squashed somehow? |
20:23:46 | <nicolas17> | jinn6: well I don't see any signs of attribution, license, source code, etc |
20:23:57 | <jinn6> | yeahhhhhh |
20:24:10 | <nicolas17> | and they're emailing KDE asking for a refund |
20:24:19 | <nicolas17> | because KDE is what's mentioned in the About window |
20:25:31 | <jinn6> | that's sucky, yeah |
20:26:17 | <jinn6> | I think the "sayoapps" has 4 copies of the same-ish program even, lul |
20:26:24 | <nicolas17> | (also lol seems it's laughably easy to download an .appx from the microsoft store without buying the app, that's how I'm looking at the package contents) |
20:27:43 | <jinn6> | lol, is it? |
20:29:10 | <nicolas17> | I'm scraping https://store.rg-adguard.net/ to monitor when certain free apps get updated, a friend is doing similar monitoring by using the actual Microsoft Store APIs directly |
20:29:19 | <nicolas17> | and I didn't *expect* it to work for paid ones but it seems it does :| |
20:49:34 | | riteo quits [Ping timeout: 260 seconds] |
20:59:51 | | BornOn420 quits [Remote host closed the connection] |
21:00:28 | | BornOn420 (BornOn420) joins |
21:02:33 | | BornOn420 quits [Max SendQ exceeded] |
21:03:05 | | BornOn420 (BornOn420) joins |
21:05:53 | | BennyOtt_ joins |
21:07:04 | | BennyOtt quits [Ping timeout: 260 seconds] |
21:07:04 | | BennyOtt_ is now known as BennyOtt |
21:07:04 | | BennyOtt is now authenticated as BennyOtt |
21:15:36 | <@JAA> | jinn6: Yes, topics missing from some servers are a known network-wide issue. |
21:16:01 | <jinn6> | it shouldn't be hard to like, make a bot to re-set them, I'd think? |
21:17:57 | <@JAA> | Sure, but this has been happening in various constellations for some time, so it'd be better to fix the underlying issue first. |
21:50:00 | | wickedplayer494 quits [Remote host closed the connection] |
21:56:52 | | wickedplayer494 joins |
21:57:07 | | wickedplayer494 is now authenticated as wickedplayer494 |
22:21:43 | | Dada quits [Remote host closed the connection] |
22:22:14 | | Guest58 joins |
22:32:27 | | Barto quits [Quit: WeeChat 4.6.3] |
22:36:38 | | Shjosan quits [Quit: Am sleepy (-, – )…zzzZZZ] |
22:37:04 | | Barto (Barto) joins |
22:37:39 | | Shjosan (Shjosan) joins |
23:19:22 | | Guest58 quits [Client Quit] |
23:24:53 | | Guest58 joins |
23:31:09 | | Umbire quits [Ping timeout: 260 seconds] |
23:34:44 | | Umbire (Umbire) joins |
23:34:55 | | Umbire quits [Remote host closed the connection] |
23:35:23 | | Umbire (Umbire) joins |