00:07:08<BlankEclair>they really don't have any moderation for these?
00:07:22<BlankEclair>sounds like a bad idea when you're soliciting user input and displaying it out to the world
00:07:43<BlankEclair>next up: someone publishes a leak of uk government records through blocked.org.uk
00:07:59cuphead2527480 quits [Quit: Connection closed for inactivity]
00:30:24DogsRNice quits [Client Quit]
02:01:18etnguyen03 (etnguyen03) joins
02:10:51Webuser492993 joins
02:12:30Webuser492993 quits [Client Quit]
02:48:51SootBector quits [Remote host closed the connection]
02:59:41IDK (IDK) joins
03:06:31etnguyen03 quits [Remote host closed the connection]
03:22:59tzt quits [Ping timeout: 260 seconds]
03:23:45lflare quits [Quit: Bye]
03:24:20tzt (tzt) joins
03:24:31lflare (lflare) joins
04:10:31DogsRNice joins
04:20:12tmg1|michelson joins
05:01:07<ericgallager>I see WPlace is trending: https://bsky.app/profile/trending.bsky.app/feed/371544258
05:01:26<ericgallager>when I try to visit https://wplace.live/ though, it says it's down
05:01:37<ericgallager>anyone know if it's been archived at all?
05:03:42<pokechu22>We've got an archivebot job for various tiles on the site, but it has to run slowly and will take 2 weeks to finish
05:03:58<pokechu22>looks like that's still up, e.g. https://backend.wplace.live/files/s0/tiles/388/874.png
05:04:31<ericgallager>maybe it's just my browser, then...
05:05:20<pokechu22>the site's still up for me in firefox, yeah
05:06:48<ericgallager>ok so maybe it's a particular extension, uBlock Origin or something, then...
05:09:44IDK quits [Client Quit]
05:14:03LddPotato quits [Remote host closed the connection]
05:15:50LddPotato (LddPotato) joins
05:16:48IDK (IDK) joins
05:26:28notSokar joins
05:26:34Sokar quits [Ping timeout: 240 seconds]
05:43:16BornOn420 (BornOn420) joins
05:48:34NatTheCat quits [Ping timeout: 240 seconds]
05:54:33Island quits [Read error: Connection reset by peer]
05:59:29awauwa (awauwa) joins
06:11:50NatTheCat (NatTheCat) joins
06:31:39DogsRNice quits [Read error: Connection reset by peer]
06:35:46Webuser862649 joins
06:44:08Webuser862649 quits [Client Quit]
07:19:44IDK quits [Client Quit]
07:44:34APOLLO03 quits [Ping timeout: 240 seconds]
07:54:49notSokar quits [Ping timeout: 260 seconds]
07:57:51Sokar joins
08:15:52nine quits [Quit: See ya!]
08:16:05nine joins
08:16:06nine quits [Changing host]
08:16:06nine (nine) joins
08:36:14midou quits [Ping timeout: 240 seconds]
08:45:59midou joins
09:00:32IDK (IDK) joins
09:17:40cyanbox joins
09:20:37APOLLO03 joins
10:02:00igloo22225 quits [Quit: The Lounge - https://thelounge.chat]
10:02:27igloo22225 (igloo22225) joins
10:27:29<c3manu>"Eastman Kodak, the 133-year-old photography company, is warning investors that it might not survive much longer." - https://edition.cnn.com/2025/08/12/business/kodak-survival-warning
10:32:26<cyanbox>https://www.kodak.com/en/company/blog-post/statement-regarding-misleading-media-reports/
10:33:37<cyanbox>film is crazy popular rn, wouldn't make sense for them to be struggling that hard
10:36:30<h2ibot>Manu edited Discourse (+348, Add more active Discourses): https://wiki.archiveteam.org/?diff=56903&oldid=56863
10:45:23FiTheArchiver joins
11:00:03Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat]
11:02:48Bleo182600722719623455222 joins
11:03:37NotGLaDOS quits []
11:17:56FiTheArchiver1 joins
11:21:19FiTheArchiver quits [Ping timeout: 260 seconds]
11:31:54rohvani quits [Ping timeout: 240 seconds]
11:52:14FiTheArchiver1 quits [Read error: Connection reset by peer]
11:58:08BornOn420 quits [Read error: Connection reset by peer]
11:58:38BornOn420 (BornOn420) joins
12:02:28Wohlstand (Wohlstand) joins
12:19:18etnguyen03 (etnguyen03) joins
12:29:26notSokar joins
12:31:19Sokar quits [Ping timeout: 260 seconds]
12:32:07etnguyen03 quits [Client Quit]
12:32:49Barto quits [Quit: WeeChat 4.7.0]
12:40:15notSokar quits [Client Quit]
12:40:25Sokar joins
12:50:53Barto (Barto) joins
13:01:43etnguyen03 (etnguyen03) joins
13:03:49Barto quits [Client Quit]
13:15:22etnguyen03 quits [Client Quit]
13:27:54nine quits [Ping timeout: 240 seconds]
13:31:04sepro (sepro) joins
13:34:32nine joins
13:34:32nine quits [Changing host]
13:34:32nine (nine) joins
13:51:14midou quits [Ping timeout: 240 seconds]
14:00:30midou joins
14:02:36ATinySpaceMarine joins
14:08:32ATinySpaceMarine quits [Client Quit]
14:11:08ATinySpaceMarine joins
14:18:06ATSM joins
14:19:26ATSM quits [Client Quit]
14:20:59ATinySpaceMarine quits [Ping timeout: 260 seconds]
14:30:20shuuji3 quits [Quit: Ooops, wrong browser tab.]
14:44:24MrMcNuggets (MrMcNuggets) joins
14:50:03Dada joins
15:26:39<gamer191-1|m>Farm Transparency Project (who I mentioned ages ago, but I don’t think anything happened because their site uses Vimeo embeds so it couldn’t be downloaded with a simple AB job) is now publicly encouraging users to archive their website (and the videos on it) https://www.instagram.com/p/DNVGMMeyc5o/
15:35:48<gamer191-1|m>Context: Farm Transparency Project is an Australian website which publishes undercover footage showing (often illegal) animal cruelty and animal welfare violations at farms and slaughterhouses. The Australian federal court recently ruled that they were violating copyright laws by publishing undercover footage (because the slaughterhouse owns the copyright to undercover footage shot there)
15:40:08<gamer191-1|m>I DMed them yesterday suggesting that they should create a BitTorrent of the site, and they left me on “seen”. Not sure if we should try to contact them by email requesting a copy of their site, or if we should run an AB job (I don’t have the skills to run a job involving Vimeo)
15:59:37<@arkiver>gamer191-1|m: do you have their site?
15:59:48<@arkiver>we should definitely make a copy and put their youtube (if any) in #down-the-tube
16:03:41BornOn420 quits [Read error: Connection reset by peer]
16:04:11BornOn420 (BornOn420) joins
16:20:03<Vokun>https://www.farmtransparency.org
16:26:51<cruller>They have 1258 videos in their repository (https://www.farmtransparency.org/), but only 233 on their Vimeo channel (https://vimeo.com/farmtransparency) and 99 on their YouTube channel (https://www.youtube.com/c/farmtransparencyproject).
16:35:15<cruller>According to https://wiki.archiveteam.org/index.php/vimeo, downloading public videos on Vimeo requires a login. yt-dlp shows a similar message.
16:36:43<cruller>However, some cobalt instances generate MP4 links that don't require a login. AB should be able to grab them.
16:38:02<cruller>(I don't know the mechanism of link generation.)
16:42:09pabs quits [Ping timeout: 260 seconds]
16:43:16pabs (pabs) joins
16:48:32<cruller>IIRC, embedded private Vimeo videos can only be viewed from the embedding page. Browser developer tools and yt-dlp --referer https://www.farmtransparency.org/campaigns/eggs-exposed https://player.vimeo.com/video/1075508246 fetch hls, but it is unclear whether MP4 URLs exist.
16:49:53<justauser|m>Yeah, encouraging mirrors without providing a good way to do this is so... humanish? bureaucracish?
16:50:34ThreeHM quits [Ping timeout: 240 seconds]
16:52:40ThreeHM (ThreeHeadedMonkey) joins
16:53:38<cruller>I don't understand why they make so many Vimeo videos "private".
17:04:51cyanbox quits [Read error: Connection reset by peer]
17:05:42<cruller><cruller> "They have 1258 videos in their..." <- https://www.youtube.com/c/farmtransparencyproject
17:14:43MrMcNuggets quits [Client Quit]
17:24:37<h2ibot>HadeanEon edited Deaths in 2025 (+689, BOT - Updating page: {{saved}} (149),…): https://wiki.archiveteam.org/?diff=56904&oldid=56873
17:24:38<h2ibot>HadeanEon edited Deaths in 2025/list (+54, BOT - Updating list): https://wiki.archiveteam.org/?diff=56905&oldid=56874
17:34:33ducky quits [Remote host closed the connection]
17:36:45ducky (ducky) joins
17:38:33Webuser278280 joins
17:42:49Webuser278280 quits [Client Quit]
17:43:04ducky quits [Read error: Connection reset by peer]
17:45:11ducky (ducky) joins
17:47:42notSokar joins
17:49:49Sokar quits [Ping timeout: 260 seconds]
17:57:12ducky quits [Remote host closed the connection]
17:57:45ducky (ducky) joins
18:22:12awauwa quits [Quit: awauwa]
18:26:04dhinakg (dhinakg) joins
18:29:12ducky_ (ducky) joins
18:31:14ducky quits [Ping timeout: 260 seconds]
18:31:14ducky_ is now known as ducky
18:52:48Barto (Barto) joins
18:59:32DogsRNice joins
19:00:14TheEnbyperor_ quits [Ping timeout: 240 seconds]
19:00:24TheEnbyperor quits [Ping timeout: 260 seconds]
19:06:51HP_Archivist quits [Quit: Leaving]
19:08:09IDK quits [Quit: Connection closed for inactivity]
19:22:10ducky quits [Remote host closed the connection]
19:24:12ducky (ducky) joins
19:36:11TheEnbyperor (TheEnbyperor) joins
19:43:34TheEnbyperor quits [Ping timeout: 260 seconds]
19:58:44<Ryz>Heya folks, I have this subdomain https://audiothek.dasdeck.com/ - that I found, and I was about to archive it weeks ago, since it's from the other subdomains under https://dasdeck.com/ - but stopped because it seemed it was getting video files or audio files from somewhere, but the video files are converted into audio files? Can anyone figure out
19:58:44<Ryz>where these files came from? Sadly I can't really archive this (even if assuming there's no funky JS stuff blockading from archiving), just needing my curiosity satiated~
20:03:00<pokechu22>clicking one I see it loads https://audiothek.dasdeck.com/?url=https://rodlzdf-a.akamaihd.net/none/zdf/22/04/220415_1720_sendung_trs/5/220415_1720_sendung_trs_a3a4_808k_p11v17.mp4&title=%5Bzdf%2014.08.2025%5D%20bali%20(s24_e03)(deu-ad)
20:03:24<pokechu22>and it's listed at https://mediathekviewweb.de/api/query?query=%7B%22queries%22%3A%5B%5D%2C%22sortBy%22%3A%22timestamp%22%2C%22sortOrder%22%3A%22desc%22%2C%22future%22%3Afalse%2C%22size%22%3A30%2C%22offset%22%3A0%7D
20:04:02<pokechu22>uh, "Results 1 to 30 of 694178." (from "Treffer 1 bis 30 von insgesamt 694178.") though, which seems probably too big?
20:05:42ducky quits [Remote host closed the connection]
20:07:19ducky (ducky) joins
20:08:50<@JAA>Sounds like it's an alternative interface to the audio parts of the German public broadcasters' Mediatheken, i.e. https://www.ardaudiothek.de/ and whatever the ZDF equivalent is.
20:08:54<@JAA>That's bound to be big.
20:11:26TheEnbyperor joins
20:12:36<Ryz>Am a bit confused and tried poking around the files before or shortly after typing it up, and wasn't sure if this is legit or not, since I'm not strongly familiar diving into websites other than English in terms of finding goodies
20:14:52<@JAA>Oh yeah, it goes beyond that and takes anything from the Mediatheken and extracts just the audio track. So it's even bigger than just the specific audio releases...
20:15:17<pokechu22>Those URLs appeared in the browser console (F12, reload the page after opening it). We'd need to generate a list of URLs and then do an !ao < list; it's too scripty for !a
20:16:02<pokechu22>it also seems like the server itself is doing work when it's extracting audio - I don't think they'd be happy with us bulk-requesting that
20:16:22<@JAA>The Mediatheken basically contain every TV production by any of the public broadcasters in Germany. You can watch them for free for like a month (often geolocked to Germany).
20:16:41<@JAA>Yeah, I don't think we need to archive audio streams via a third-party service anyway.
20:17:38<pokechu22>!a https://elitemeetus.org/ -i blogs,badvideos -e Proactive
20:17:57<pokechu22>I guess I might as well generate a list of those API URLs though
20:19:22<pokechu22>oh, https://mediathekviewweb.de/api/query?query={%22queries%22%3A[]%2C%22sortBy%22%3A%22timestamp%22%2C%22sortOrder%22%3A%22desc%22%2C%22future%22%3Afalse%2C%22size%22%3A30%2C%22offset%22%3A3694170} doesn't work - elasticsearch only wants to expose the first 10000 results
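[editor's note] A minimal sketch of generating the list of API URLs mentioned above while staying inside the 10000-result window. The query fields are copied from the URL in the log; the function name, the `window` parameter, and the assumption that the server rejects any request where offset + size exceeds the window (Elasticsearch's default behaviour) are illustrative and not confirmed for this API.

```python
import json
from urllib.parse import quote

API = "https://mediathekviewweb.de/api/query"


def query_urls(size=30, window=10000):
    """Yield one query URL per page that fits inside the result window."""
    # Assumption: requests with offset + size > window are rejected,
    # so the last usable offset is window - size.
    for offset in range(0, window - size + 1, size):
        query = {
            "queries": [],
            "sortBy": "timestamp",
            "sortOrder": "desc",
            "future": False,
            "size": size,
            "offset": offset,
        }
        # Compact separators reproduce the JSON seen in the browser request.
        encoded = quote(json.dumps(query, separators=(",", ":")), safe="")
        yield f"{API}?query={encoded}"


urls = list(query_urls())
```

With the defaults this yields 333 URLs (offsets 0 through 9960), which covers only a small part of the 694178 reported results; reaching the rest would need a different approach, such as narrowing the query.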
20:21:57<@JAA>Yeah, the API would be reasonable.
20:22:10<@JAA>That's the official API, too, I think.
20:22:53<@JAA>Hmm, no
20:23:28notarobot17 quits [Quit: Ping timeout (120 seconds)]
20:23:41notarobot17 joins
20:25:45<@JAA>Mixing it up with something else, nevermind.
20:25:53TheEnbyperor_ (TheEnbyperor) joins
20:29:10<h2ibot>Debug32 edited List of lost online videos/list (+929): https://wiki.archiveteam.org/?diff=56906&oldid=53968
20:40:58Dada quits [Remote host closed the connection]
20:45:47APOLLO03 quits [Quit: .]
20:47:41cuphead2527480 (Cuphead2527480) joins
20:50:09cuphead2527480 is now known as CuppyMan
20:55:09notSokar quits [Quit: Leaving]
20:55:21Sokar joins
20:58:57APOLLO03 joins
21:21:19dabs joins
21:41:07abirkill quits [Quit: Let us prepare to grapple with the ineffable itself, and see if we may not eff it after all.]
21:58:12ericgallager quits [Quit: This computer has gone to sleep]
21:59:58dabs quits [Read error: Connection reset by peer]
22:16:52atphoenix__ (atphoenix) joins
22:18:54atphoenix_ quits [Ping timeout: 240 seconds]
23:12:35ericgallager joins
23:13:56<gamer191-1|m>koichi: The situation with Vimeo (I was discussing this with one of the yt-dlp developers, for unrelated reasons) is that you can no longer generate unauthenticated guest tokens because they’ve hardened their api. However, if you have a cached guest token (I have one, which I’m willing to share if needed) then you can continue using it. Also, Vimeo embeds can be downloaded without an account (subject to heavy rate-limiting), but
23:13:56<gamer191-1|m>the embed url usually isn’t guessable and often requires a referrer for a website it’s embedded on (https://www.farmtransparency.org, I guess)
23:15:54<pokechu22>I assume a recursive AB crawl of https://www.farmtransparency.org would generate embed URLs
23:17:00<pokechu22>hmm, it looks like https://archive.fart.website/archivebot/viewer/job/202103290041478ojfw had videos ignored?
23:21:43<pokechu22>those might have been done in https://archive.fart.website/archivebot/viewer/job/20210329010802akni3 instead
23:26:30<gamer191-1|m>“I assume a recursive AB crawl of https://www.farmtransparency.org would generate embed URLs”
23:26:30<gamer191-1|m>Yeah, although like I said that’s very heavily rate-limited, so we’d need to switch IP addresses every few videos (once an IP address is rate-limited, it will start getting Cloudflare turnstile captchas in Vimeo embeds, and idk how long that will last)
23:26:30<gamer191-1|m>Also idk if it would give us m3u8, dash or https links (I can’t check right now cause I’m on my phone) and it would require the curl_cffi dependency (I assume that’s installed on AB)
23:29:35<pokechu22>I was more thinking we ignore the embeds in archivebot, but use the recursive crawl to generate a list of URLs like https://player.vimeo.com/video/1103632636?h=2a9984e565 from https://www.farmtransparency.org/videos?id=l5b22836ku
23:30:20<pokechu22>the videos themselves could be downloaded outside of archivebot
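[editor's note] A rough sketch of the "generate a list of embed URLs" step, applied to HTML that has already been fetched from a video page. The regular expression is modelled on the single example URL in the log; the `embed_urls` helper and the assumption that the embed URL appears literally in the page source (for instance as an iframe `src`) are hypothetical.

```python
import re

# Matches URLs shaped like the example in the log:
# https://player.vimeo.com/video/<numeric id>, optionally followed by ?h=<hex hash>
EMBED_RE = re.compile(r"https://player\.vimeo\.com/video/\d+(?:\?h=[0-9a-f]+)?")


def embed_urls(html):
    """Return the unique embed URLs found in html, in first-seen order."""
    return list(dict.fromkeys(EMBED_RE.findall(html)))


# Hypothetical page fragment used only to illustrate the call.
sample = '<iframe src="https://player.vimeo.com/video/1103632636?h=2a9984e565"></iframe>'
found = embed_urls(sample)
```

The resulting list could then be handed to a separate downloader outside ArchiveBot, together with the embedding page as the referrer.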
23:31:55<gamer191-1|m>pokechu22: Hang on…can we just use the download button on the video pages
23:32:55<pokechu22>hmm, that seems to download it from vimeo actually
23:33:18<pokechu22>... but how does it choose which resolution to download?
23:33:36<pokechu22>just going to https://www.farmtransparency.org/videos?id=l5b22836ku&action=download fails
23:34:30<gamer191-1|m>I guess it uses a post request (I’m on my phone right now so I can’t check)
23:35:20<pokechu22>ah, it's a POST to that with a CSRF token (in firefox, navigate to https://www.farmtransparency.org/videos?id=l5b22836ku, then press alt, then select file -> work offline, then click one of the download links which should open https://www.farmtransparency.org/videos?id=l5b22836ku&action=download in a new tab, then go to that tab, then press alt, then uncheck file -> work
23:35:21<pokechu22>offline, then press f12 for dev tools, then refresh the page, then confirm that you're willing to refresh a POST request. it should then appear in devtools.)
23:35:40Wohlstand quits [Quit: Wohlstand]
23:38:25<pokechu22>hmm, the CSRF token doesn't change per pageload but does seem to be tied to a cookie of some sort
23:38:51<pokechu22>either way, probably best to just use an archivebot job to enumerate videos and other media without downloading it, and then do something with the videos separately afterwards
23:44:41<pokechu22>ugh, pagination requires loading https://www.farmtransparency.org/scripts/asset-display?p=3&asset_types=videos with the header X-Requested-With: XMLHttpRequest
23:48:34<gamer191-1|m>“either way, probably best to just use an archivebot job to enumerate videos and other media without downloading it, and then do something with the videos separately afterwards” Agreed!
23:48:34<gamer191-1|m>Should the AB job also enumerate their photos, documents, and campaign material? (I don’t know if any of those 3 categories are easy to archive)
23:49:22<@JAA>'Please mirror our stuff, but also, we'll make it as hard as we can.'
23:49:23<@JAA>Lovely
23:50:44<gamer191-1|m>Should we contact them?
23:50:57<pokechu22>seems like curl 'https://www.farmtransparency.org/scripts/asset-display?p=3&asset_types=videos' -H 'X-Requested-With: XMLHttpRequest' works so I can enumerate things that way
23:51:33<pokechu22>I don't think a recursive AB job would find everything on its own
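[editor's note] A small sketch of the enumeration approach described above: building the paginated asset-display requests with the `X-Requested-With: XMLHttpRequest` header that the curl command uses. The `page_request` helper and the page range are hypothetical; the log does not say how many pages exist or how the last page is signalled.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "https://www.farmtransparency.org/scripts/asset-display"


def page_request(page, asset_types="videos"):
    """Build a request for one page of the asset listing."""
    params = urlencode({"p": page, "asset_types": asset_types})
    # The endpoint reportedly needs this header to return listing data.
    return Request(f"{BASE}?{params}", headers={"X-Requested-With": "XMLHttpRequest"})


# Hypothetical range: the real page count is not known from the log.
reqs = [page_request(p) for p in range(1, 4)]
```

Each request can be passed to `urllib.request.urlopen` to fetch the page; a real run would also need a stop condition and a polite delay between requests.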
23:57:34andrew quits [Ping timeout: 240 seconds]