00:03:58 | <h2ibot> | PaulWise edited Anubis (+173, forges using anubis): https://wiki.archiveteam.org/?diff=55934&oldid=55931 |
00:05:58 | <h2ibot> | BlankEclair edited Anubis (+40, /* Projects and websites known to deploy Anubis…): https://wiki.archiveteam.org/?diff=55935&oldid=55934 |
00:19:00 | <h2ibot> | PaulWise edited Anubis (+27, git.linuxtv.org): https://wiki.archiveteam.org/?diff=55936&oldid=55935 |
00:20:00 | <h2ibot> | PaulWise edited Anubis (-13, all of linuxtv.org): https://wiki.archiveteam.org/?diff=55937&oldid=55936 |
00:21:08 | <pabs> | seems like we are going to need per-domain or per-URL UAs in AB at some point |
00:21:37 | | DopefishJustin quits [Remote host closed the connection] |
01:14:09 | | Guest58 joins |
01:19:05 | | Guest58 quits [Remote host closed the connection] |
01:19:39 | | Guest58 joins |
01:23:40 | | Guest58 quits [Remote host closed the connection] |
01:24:01 | | etnguyen03 (etnguyen03) joins |
01:24:15 | | Guest58 joins |
01:48:02 | | ericgallager quits [Quit: This computer has gone to sleep] |
02:00:42 | | DopefishJustin joins |
02:00:42 | | DopefishJustin is now authenticated as DopefishJustin |
02:27:27 | | dabs quits [Read error: Connection reset by peer] |
02:32:27 | | flotwig quits [Read error: Connection reset by peer] |
02:33:38 | | flotwig joins |
02:53:47 | | notarobot1 joins |
03:05:14 | | notarobot1 quits [Ping timeout: 276 seconds] |
03:06:12 | | etnguyen03 quits [Remote host closed the connection] |
04:13:24 | | notarobot1 joins |
05:00:47 | | ericgallager joins |
05:02:14 | | benjins3__ quits [Ping timeout: 276 seconds] |
06:16:24 | | ducky_ (ducky) joins |
06:17:31 | | ducky__ (ducky) joins |
06:18:31 | | ducky quits [Ping timeout: 260 seconds] |
06:18:31 | | ducky__ is now known as ducky |
06:20:51 | | ducky_ quits [Ping timeout: 260 seconds] |
06:25:46 | <@arkiver> | i see there were some attempts to archive https://forums.soompi.com/ with AB, have those been successful? |
06:27:36 | <@arkiver> | i see news.goo.ne.jp is going well in AB, looks like it may finish? |
06:27:54 | <@arkiver> | project for goo.dictionary.ne.jp coming |
06:27:57 | <@arkiver> | err |
06:28:04 | <@arkiver> | the first parts switched |
06:28:04 | <nulldata> | arkiver - nope, Soompi forums use aggressive Buttflare |
06:28:13 | <@arkiver> | nulldata: right i see... |
06:28:15 | <@arkiver> | blegh |
06:33:41 | | ymgve quits [Ping timeout: 260 seconds] |
07:01:17 | | benjins3 joins |
07:23:50 | | Island quits [Read error: Connection reset by peer] |
07:34:21 | | JTL quits [Ping timeout: 260 seconds] |
07:36:07 | | JTL (JTL) joins |
07:40:48 | | ymgve joins |
08:29:39 | | Webuser986293 joins |
08:31:34 | | Dada joins |
08:33:00 | | Webuser986293 quits [Client Quit] |
08:38:46 | | beastbg8__ joins |
08:42:01 | | beastbg8_ quits [Ping timeout: 260 seconds] |
09:17:01 | | corentin quits [Ping timeout: 260 seconds] |
09:24:02 | | corentin joins |
11:00:05 | | Bleo182600722719623455 quits [Quit: The Lounge - https://thelounge.chat] |
11:02:46 | | Bleo182600722719623455 joins |
11:13:45 | | chrismeller8 (chrismeller) joins |
11:14:51 | | chrismeller quits [Ping timeout: 260 seconds] |
11:14:51 | | chrismeller8 is now known as chrismeller |
11:17:25 | | Wohlstand (Wohlstand) joins |
12:01:04 | | tertu2 (tertu) joins |
12:03:16 | | tertu quits [Ping timeout: 260 seconds] |
12:08:52 | <joepie91|m> | hey, did we get a capture of this stuff at some point? https://soc.megatokyo.moe/notice/AtZQhXj0TQ6he5GvMu |
14:17:04 | | Wohlstand quits [Client Quit] |
14:38:30 | | kedihacker help |
14:46:19 | <plcp> | hi, is the AB bot able to grab pdf files hosted "publicly" in a google drive? |
14:46:22 | <plcp> | for context, late 2023 the reference website on the history of french telcos died, and together with friends (+with the agreement of the original author) we brought it back at https://histelfrance.fr |
14:47:19 | <plcp> | it's a dumb copy of the previous website, which had webpages such as https://www.histelfrance.fr/page-5599392bc9f47.html that hosted historical PDF files in a google drive owned by the author, for example https://drive.google.com/file/d/1ZlSMlaAKJfv3Gxk19ls4r-3PhgLvTRuJ/view |
14:47:41 | <plcp> | the IA bot seems to not grab these files https://web.archive.org/web/20230823125801/https://drive.google.com/file/d/1ZlSMlaAKJfv3Gxk19ls4r-3PhgLvTRuJ/view |
14:48:36 | <plcp> | the original author now wants to reclaim the domain name to build something new / a continuation of his work |
14:49:15 | <plcp> | I'd like "just to be safe" to grab all these pdfs detached in google drive |
14:49:51 | <plcp> | do I need to figure something out by myself (like listing all the drive.google.com links and grabbing them) |
14:50:36 | <plcp> | or is there something less ad-hoc that exists out there? :) |
15:12:55 | | Wohlstand (Wohlstand) joins |
15:13:52 | <pabs> | https://wiki.archiveteam.org/index.php/Google_Drive |
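The ad-hoc route plcp describes (listing all the drive.google.com links) could be sketched roughly as below. This is a minimal sketch, not an ArchiveTeam tool: the function name `list_drive_file_links` and the regex are assumptions for illustration, and it only handles links of the `https://drive.google.com/file/d/<ID>/view` shape quoted above. It lists the links and does not download anything.

```python
import re

# Assumed pattern: matches links shaped like the one quoted in the log,
# https://drive.google.com/file/d/<ID>/view, and captures the <ID> part.
DRIVE_FILE_RE = re.compile(r"https?://drive\.google\.com/file/d/([A-Za-z0-9_-]+)")


def list_drive_file_links(html: str) -> list[str]:
    """Return the unique Drive file links found in a page, in first-seen order."""
    seen = set()
    links = []
    for match in DRIVE_FILE_RE.finditer(html):
        file_id = match.group(1)
        if file_id not in seen:
            seen.add(file_id)
            # Normalise every hit to the /view form so duplicates collapse.
            links.append(f"https://drive.google.com/file/d/{file_id}/view")
    return links


# Hypothetical page fragment reusing the file ID mentioned in the log.
sample_html = (
    '<a href="https://drive.google.com/file/d/1ZlSMlaAKJfv3Gxk19ls4r-3PhgLvTRuJ/view">PDF</a>'
    '<a href="https://drive.google.com/file/d/1ZlSMlaAKJfv3Gxk19ls4r-3PhgLvTRuJ/view?usp=sharing">PDF again</a>'
)
links = list_drive_file_links(sample_html)
```

Running this over each saved page of the site would give a deduplicated list of Drive links to feed into whatever grabbing method the wiki page recommends.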
15:25:46 | | Wohlstand quits [Client Quit] |
15:51:47 | <h2ibot> | Manu edited Mailman/2 (-75, /* Queued lists.wikimedia.org */): https://wiki.archiveteam.org/?diff=55938&oldid=55859 |
15:55:28 | <@arkiver> | so we have an interesting thing coming |
15:55:43 | <@arkiver> | i only recently became aware of it (maybe others knew it already) |
15:56:15 | <@arkiver> | Meta is going to remove old ads from their Facebook Ad Library https://www.axios.com/2025/05/20/meta-removing-expired-political-ads |
15:56:52 | <@arkiver> | we would be doing a bit of an emergency project to get these archived as fast as possible |
15:56:58 | <@arkiver> | does anyone have channel ideas? |
15:57:36 | <@arkiver> | this is about https://www.facebook.com/ads/library/ |
15:57:41 | <@arkiver> | (search for something :) ) |
15:57:44 | <@arkiver> | it's pretty cool actually |
15:57:47 | <@arkiver> | more tomorrow |
15:58:19 | | @arkiver is afk for sleep |
16:04:11 | | flotwig quits [Ping timeout: 260 seconds] |
16:18:46 | | croissant_ joins |
16:22:16 | | croissant quits [Ping timeout: 260 seconds] |
16:23:59 | | grill (grill) joins |
16:46:11 | | adryd0 quits [Ping timeout: 276 seconds] |
16:52:28 | | lennier2 joins |
16:53:25 | <nulldata> | "fads" |
16:55:56 | | lennier2__ quits [Ping timeout: 276 seconds] |
16:57:39 | <Vokun> | subtractlibrary |
17:13:28 | | lennier2_ joins |
17:16:44 | | lennier2 quits [Ping timeout: 276 seconds] |
17:18:00 | <h2ibot> | HadeanEon edited Deaths in 2025 (+819, BOT - Updating page: {{saved}} (130),…): https://wiki.archiveteam.org/?diff=55939&oldid=55932 |
17:18:01 | <h2ibot> | HadeanEon edited Deaths in 2025/list (+54, BOT - Updating list): https://wiki.archiveteam.org/?diff=55940&oldid=55933 |
17:39:46 | | flotwig joins |
17:42:25 | | dabs joins |
18:02:45 | <Dango360> | for anubis, we should probably ask the developer (https://github.com/Xe) to add a rule that would allow AB machine IPs to send requests without activating the challenge |
18:17:49 | <aninternettroll> | Isn't it easier to use a user agent that the default anubis config allows? |
18:19:07 | <pokechu22> | Yeah, I think we got the default archivebot user-agent allowed |
18:19:39 | <that_lurker> | Anubis is default allow anyway so if AB is blocked the user has done that for a "reason" |
18:20:04 | | jacksonchen666 is now authenticated as * |
18:20:04 | | jacksonchen666 is now known as RJHacker14970 |
18:20:04 | | RJHacker14970 quits [Killed (guybrush.hackint.org (Nickname regained by services))] |
18:20:08 | | jacksonchen666 (jacksonchen666) joins |
18:20:14 | | jacksonchen666_ (jacksonchen666) joins |
18:20:26 | | dabs quits [Ping timeout: 276 seconds] |
18:21:04 | | BornOn420 quits [Remote host closed the connection] |
18:21:39 | | BornOn420 (BornOn420) joins |
18:22:38 | <masterx244|m> | IP is at least something that the AI crawlers cannot fake unless a pipeline disappears and they get access to it. if the useragent gets out somehow the agent-based protection becomes useless without a change |
18:29:54 | <nulldata> | I dunno if publishing a list of pipeline IPs is desirable. |
18:32:09 | <masterx244|m> | one or two as a known whitelistable pipeline could work as a middle ground |
18:49:34 | <aninternettroll> | Is there any project being blocked by anubis now? |
19:01:31 | | szczot3k|t quits [Ping timeout: 260 seconds] |
19:20:14 | | Island joins |
19:27:43 | | Bleo182600722719623455 quits [Quit: Ping timeout (120 seconds)] |
19:27:57 | | Bleo182600722719623455 joins |
19:43:10 | | cuphead2527480 quits [Quit: Connection closed for inactivity] |
19:51:00 | | Wohlstand (Wohlstand) joins |
20:05:58 | | jacksonchen666 quits [Client Quit] |
20:05:59 | | jacksonchen666_ is now known as jacksonchen666 |
20:06:50 | | valdikss quits [Read error: Connection reset by peer] |
20:07:01 | | valdikss joins |
20:13:51 | | grill quits [Ping timeout: 260 seconds] |
20:23:20 | | PredatorIWD25 quits [Read error: Connection reset by peer] |
20:23:51 | | PredatorIWD25 joins |
20:24:30 | | MrMcNuggets quits [Read error: Connection reset by peer] |
20:25:16 | | MrMcNuggets (MrMcNuggets) joins |
20:25:36 | | @imer quits [Quit: Ping timeout (120 seconds)] |
20:26:01 | | imer (imer) joins |
20:26:01 | | @ChanServ sets mode: +o imer |
20:33:11 | | ^ quits [Read error: Connection reset by peer] |
20:33:32 | | ^ (^) joins |
20:33:40 | | beastbg8_ joins |
20:37:11 | | beastbg8__ quits [Ping timeout: 260 seconds] |
20:38:21 | | ell7 quits [Ping timeout: 260 seconds] |
20:42:01 | | ell7 (ell) joins |
20:48:58 | | BornOn420 quits [Ping timeout: 264 seconds] |
20:52:47 | | BornOn420 (BornOn420) joins |
20:55:56 | | Webuser818864 joins |
20:56:17 | | Webuser818864 quits [Client Quit] |
21:00:25 | | dabs joins |
21:39:41 | | tertu (tertu) joins |
21:41:21 | | tertu2 quits [Ping timeout: 260 seconds] |
22:04:52 | | etnguyen03 (etnguyen03) joins |
22:17:32 | | lennier2 joins |
22:20:26 | | lennier2_ quits [Ping timeout: 260 seconds] |
22:35:53 | | MrMcNuggets quits [Ping timeout: 276 seconds] |
22:38:41 | | Dada quits [Remote host closed the connection] |
22:56:53 | | cuphead2527480 (Cuphead2527480) joins |
23:13:11 | | Wohlstand quits [Quit: Wohlstand] |
23:13:59 | <h2ibot> | PaulWise edited Mailing Lists (+91, lists.launchpad.net is mhonarc): https://wiki.archiveteam.org/?diff=55941&oldid=55860 |
23:23:09 | | etnguyen03 quits [Client Quit] |
23:36:57 | <pabs> | aninternettroll: all the domains on https://wiki.archiveteam.org/index.php/Anubis |
23:37:45 | | etnguyen03 (etnguyen03) joins |
23:48:04 | <h2ibot> | PaulWise edited Mailing Lists (+126, add Simplelists): https://wiki.archiveteam.org/?diff=55942&oldid=55941 |
23:51:04 | <h2ibot> | PaulWise edited Mailing Lists (+33, merge Simplelists sections): https://wiki.archiveteam.org/?diff=55943&oldid=55942 |
23:59:05 | <h2ibot> | PaulWise edited Mailing Lists (+558, add Message-Id notes for pipermail, hyperkitty,…): https://wiki.archiveteam.org/?diff=55944&oldid=55943 |