00:07:08 | <BlankEclair> | they really don't have any moderation for these? |
00:07:22 | <BlankEclair> | sounds like a bad idea when you're soliciting user input and displaying it out to the world |
00:07:43 | <BlankEclair> | next up: someone publishes a leak of uk government records through blocked.org.uk |
00:07:59 | | cuphead2527480 quits [Quit: Connection closed for inactivity] |
00:30:24 | | DogsRNice quits [Client Quit] |
02:01:18 | | etnguyen03 (etnguyen03) joins |
02:10:51 | | Webuser492993 joins |
02:12:30 | | Webuser492993 quits [Client Quit] |
02:48:51 | | SootBector quits [Remote host closed the connection] |
02:59:41 | | IDK (IDK) joins |
03:06:31 | | etnguyen03 quits [Remote host closed the connection] |
03:22:59 | | tzt quits [Ping timeout: 260 seconds] |
03:23:45 | | lflare quits [Quit: Bye] |
03:24:20 | | tzt (tzt) joins |
03:24:31 | | lflare (lflare) joins |
04:10:31 | | DogsRNice joins |
04:20:12 | | tmg1|michelson joins |
05:01:07 | <ericgallager> | I see WPlace is trending: https://bsky.app/profile/trending.bsky.app/feed/371544258 |
05:01:26 | <ericgallager> | when I try to visit https://wplace.live/ though, it says it's down |
05:01:37 | <ericgallager> | anyone know if it's been archived at all? |
05:03:42 | <pokechu22> | We've got an archivebot job for various tiles on the site, but it has to run slowly and will take 2 weeks to finish |
05:03:58 | <pokechu22> | looks like that's still up, e.g. https://backend.wplace.live/files/s0/tiles/388/874.png |
05:04:31 | <ericgallager> | maybe it's just my browser, then... |
05:05:20 | <pokechu22> | the site's still up for me in firefox, yeah |
05:06:48 | <ericgallager> | ok so maybe it's a particular extension, uBlock Origin or something, then... |
05:09:44 | | IDK quits [Client Quit] |
05:14:03 | | LddPotato quits [Remote host closed the connection] |
05:15:50 | | LddPotato (LddPotato) joins |
05:16:48 | | IDK (IDK) joins |
05:26:28 | | notSokar joins |
05:26:34 | | Sokar quits [Ping timeout: 240 seconds] |
05:43:16 | | BornOn420 (BornOn420) joins |
05:48:34 | | NatTheCat quits [Ping timeout: 240 seconds] |
05:54:33 | | Island quits [Read error: Connection reset by peer] |
05:59:29 | | awauwa (awauwa) joins |
06:11:50 | | NatTheCat (NatTheCat) joins |
06:31:39 | | DogsRNice quits [Read error: Connection reset by peer] |
06:35:46 | | Webuser862649 joins |
06:44:08 | | Webuser862649 quits [Client Quit] |
07:19:44 | | IDK quits [Client Quit] |
07:44:34 | | APOLLO03 quits [Ping timeout: 240 seconds] |
07:54:49 | | notSokar quits [Ping timeout: 260 seconds] |
07:57:51 | | Sokar joins |
08:15:52 | | nine quits [Quit: See ya!] |
08:16:05 | | nine joins |
08:16:06 | | nine is now authenticated as nine |
08:16:06 | | nine quits [Changing host] |
08:16:06 | | nine (nine) joins |
08:36:14 | | midou quits [Ping timeout: 240 seconds] |
08:40:15 | | hexagonwin is now authenticated as hexagonwin |
08:45:59 | | midou joins |
09:00:32 | | IDK (IDK) joins |
09:17:40 | | cyanbox joins |
09:20:37 | | APOLLO03 joins |
10:02:00 | | igloo22225 quits [Quit: The Lounge - https://thelounge.chat] |
10:02:27 | | igloo22225 (igloo22225) joins |
10:27:29 | <c3manu> | "Eastman Kodak, the 133-year-old photography company, is warning investors that it might not survive much longer." - https://edition.cnn.com/2025/08/12/business/kodak-survival-warning |
10:32:26 | <cyanbox> | https://www.kodak.com/en/company/blog-post/statement-regarding-misleading-media-reports/ |
10:33:37 | <cyanbox> | film is crazy popular rn, wouldn't make sense for them to be struggling that hard |
10:36:30 | <h2ibot> | Manu edited Discourse (+348, Add more active Discourses): https://wiki.archiveteam.org/?diff=56903&oldid=56863 |
10:45:23 | | FiTheArchiver joins |
11:00:03 | | Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat] |
11:02:48 | | Bleo182600722719623455222 joins |
11:03:37 | | NotGLaDOS quits [] |
11:17:56 | | FiTheArchiver1 joins |
11:21:19 | | FiTheArchiver quits [Ping timeout: 260 seconds] |
11:31:54 | | rohvani quits [Ping timeout: 240 seconds] |
11:52:14 | | FiTheArchiver1 quits [Read error: Connection reset by peer] |
11:58:08 | | BornOn420 quits [Read error: Connection reset by peer] |
11:58:38 | | BornOn420 (BornOn420) joins |
12:02:28 | | Wohlstand (Wohlstand) joins |
12:19:18 | | etnguyen03 (etnguyen03) joins |
12:29:26 | | notSokar joins |
12:31:19 | | Sokar quits [Ping timeout: 260 seconds] |
12:32:07 | | etnguyen03 quits [Client Quit] |
12:32:49 | | Barto quits [Quit: WeeChat 4.7.0] |
12:40:15 | | notSokar quits [Client Quit] |
12:40:25 | | Sokar joins |
12:50:53 | | Barto (Barto) joins |
13:01:43 | | etnguyen03 (etnguyen03) joins |
13:03:49 | | Barto quits [Client Quit] |
13:15:22 | | etnguyen03 quits [Client Quit] |
13:27:54 | | nine quits [Ping timeout: 240 seconds] |
13:31:04 | | sepro (sepro) joins |
13:34:32 | | nine joins |
13:34:32 | | nine is now authenticated as nine |
13:34:32 | | nine quits [Changing host] |
13:34:32 | | nine (nine) joins |
13:51:14 | | midou quits [Ping timeout: 240 seconds] |
14:00:30 | | midou joins |
14:02:36 | | ATinySpaceMarine joins |
14:08:32 | | ATinySpaceMarine quits [Client Quit] |
14:11:08 | | ATinySpaceMarine joins |
14:18:06 | | ATSM joins |
14:19:26 | | ATSM quits [Client Quit] |
14:20:59 | | ATinySpaceMarine quits [Ping timeout: 260 seconds] |
14:30:20 | | shuuji3 quits [Quit: Ooops, wrong browser tab.] |
14:44:24 | | MrMcNuggets (MrMcNuggets) joins |
14:50:03 | | Dada joins |
15:26:39 | <gamer191-1|m> | Farm Transparency Project (who I mentioned ages ago, but I don’t think anything happened because their site uses Vimeo embeds so it couldn’t be downloaded with a simple AB job) is now publicly encouraging users to archive their website (and the videos on it) https://www.instagram.com/p/DNVGMMeyc5o/ |
15:35:48 | <gamer191-1|m> | Context: Farm Transparency Project is an Australian website which publishes undercover footage showing (often illegal) animal cruelty and animal welfare violations at farms and slaughterhouses. The Australian federal court recently ruled that they were violating copyright laws by publishing undercover footage (because the slaughterhouse owns the copyright to undercover footage shot there) |
15:40:08 | <gamer191-1|m> | I DMed them yesterday suggesting that they should create a BitTorrent of the site, and they left me on “seen”. Not sure if we should try to contact them by email requesting a copy of their site, or if we should run an AB job (I don’t have the skills to run a job involving Vimeo) |
15:59:37 | <@arkiver> | gamer191-1|m: do you have their site? |
15:59:48 | <@arkiver> | we should definitely make a copy and put their youtube (if any) in #down-the-tube |
16:03:41 | | BornOn420 quits [Read error: Connection reset by peer] |
16:04:11 | | BornOn420 (BornOn420) joins |
16:20:03 | <Vokun> | https://www.farmtransparency.org |
16:26:51 | <cruller> | They have 1258 videos in their repository (https://www.farmtransparency.org/), but only 233 on their Vimeo channel (https://vimeo.com/farmtransparency) and 99 on their YouTube channel (https://vimeo.com/farmtransparency). |
16:35:15 | <cruller> | According to https://wiki.archiveteam.org/index.php/vimeo, downloading public videos on Vimeo requires a login. yt-dlp shows a similar message. |
16:36:43 | <cruller> | However, some cobalt instances generate MP4 links that don't require a login. AB should be able to grab them. |
16:38:02 | <cruller> | (I don't know the mechanism of link generation.) |
16:42:09 | | pabs quits [Ping timeout: 260 seconds] |
16:43:16 | | pabs (pabs) joins |
16:48:32 | <cruller> | IIRC, embedded private Vimeo videos can only be viewed from the embedding page. Browser developer tools and yt-dlp --referer https://www.farmtransparency.org/campaigns/eggs-exposed https://player.vimeo.com/video/1075508246 fetch hls, but it is unclear whether MP4 URLs exist. |
16:49:53 | <justauser|m> | Yeah, encouraging mirrors without providing a good way to do this is so... humanish? bureaucracish? |
16:50:34 | | ThreeHM quits [Ping timeout: 240 seconds] |
16:52:40 | | ThreeHM (ThreeHeadedMonkey) joins |
16:53:38 | <cruller> | I don't understand why they make so many Vimeo videos "private". |
17:04:51 | | cyanbox quits [Read error: Connection reset by peer] |
17:05:42 | <cruller> | <cruller> "They have 1258 videos in their..." <- https://www.youtube.com/c/farmtransparencyproject |
17:14:43 | | MrMcNuggets quits [Client Quit] |
17:24:37 | <h2ibot> | HadeanEon edited Deaths in 2025 (+689, BOT - Updating page: {{saved}} (149),…): https://wiki.archiveteam.org/?diff=56904&oldid=56873 |
17:24:38 | <h2ibot> | HadeanEon edited Deaths in 2025/list (+54, BOT - Updating list): https://wiki.archiveteam.org/?diff=56905&oldid=56874 |
17:34:33 | | ducky quits [Remote host closed the connection] |
17:36:45 | | ducky (ducky) joins |
17:38:33 | | Webuser278280 joins |
17:42:49 | | Webuser278280 quits [Client Quit] |
17:43:04 | | ducky quits [Read error: Connection reset by peer] |
17:45:11 | | ducky (ducky) joins |
17:47:42 | | notSokar joins |
17:49:49 | | Sokar quits [Ping timeout: 260 seconds] |
17:57:12 | | ducky quits [Remote host closed the connection] |
17:57:45 | | ducky (ducky) joins |
18:22:12 | | awauwa quits [Quit: awauwa] |
18:26:04 | | dhinakg (dhinakg) joins |
18:29:12 | | ducky_ (ducky) joins |
18:31:14 | | ducky quits [Ping timeout: 260 seconds] |
18:31:14 | | ducky_ is now known as ducky |
18:52:48 | | Barto (Barto) joins |
18:59:32 | | DogsRNice joins |
19:00:14 | | TheEnbyperor_ quits [Ping timeout: 240 seconds] |
19:00:24 | | TheEnbyperor quits [Ping timeout: 260 seconds] |
19:06:51 | | HP_Archivist quits [Quit: Leaving] |
19:08:09 | | IDK quits [Quit: Connection closed for inactivity] |
19:22:10 | | ducky quits [Remote host closed the connection] |
19:24:12 | | ducky (ducky) joins |
19:36:11 | | TheEnbyperor (TheEnbyperor) joins |
19:43:34 | | TheEnbyperor quits [Ping timeout: 260 seconds] |
19:58:44 | <Ryz> | Heya folks, I have this subdomain https://audiothek.dasdeck.com/ - that I found, and I was about to archive it weeks ago, since it's from the other subdomains under https://dasdeck.com/ - but stopped because it seemed it was getting video files or audio files from somewhere, but the video files are converted into audio files? Can anyone figure out |
19:58:44 | <Ryz> | where these files came from? Sadly I can't really archive this (even if assuming there's no funky JS stuff blockading from archiving), just needing my curiosity satiated~ |
20:03:00 | <pokechu22> | clicking one I see it loads https://audiothek.dasdeck.com/?url=https://rodlzdf-a.akamaihd.net/none/zdf/22/04/220415_1720_sendung_trs/5/220415_1720_sendung_trs_a3a4_808k_p11v17.mp4&title=%5Bzdf%2014.08.2025%5D%20bali%20(s24_e03)(deu-ad) |
20:03:24 | <pokechu22> | and it's listed at https://mediathekviewweb.de/api/query?query=%7B%22queries%22%3A%5B%5D%2C%22sortBy%22%3A%22timestamp%22%2C%22sortOrder%22%3A%22desc%22%2C%22future%22%3Afalse%2C%22size%22%3A30%2C%22offset%22%3A0%7D |
20:04:02 | <pokechu22> | uh, "Results 1 to 30 of 694178." (from "Treffer 1 bis 30 von insgesamt 694178.") though, which seems probably too big? |
20:05:42 | | ducky quits [Remote host closed the connection] |
20:07:19 | | ducky (ducky) joins |
20:08:50 | <@JAA> | Sounds like it's an alternative interface to the audio parts of the German public broadcasters' Mediatheken, i.e. https://www.ardaudiothek.de/ and whatever the ZDF equivalent is. |
20:08:54 | <@JAA> | That's bound to be big. |
20:11:26 | | TheEnbyperor joins |
20:12:36 | <Ryz> | Am a bit confused and tried poking around the files before or shortly after typing it up, and wasn't sure if this is legit or not, since I'm not strongly familiar diving into websites other than English in terms of finding goodies |
20:14:52 | <@JAA> | Oh yeah, it goes beyond that and takes anything from the Mediatheken and extracts just the audio track. So it's even bigger than just the specific audio releases... |
20:15:17 | <pokechu22> | Those URLs appeared in the browser console (F12, reload the page after opening it). We'd need to generate a list of URLs and then do an !ao < list; it's too scripty for !a |
20:16:02 | <pokechu22> | it also seems like the server itself is doing work when it's extracting audio - I don't think they'd be happy with us bulk-requesting that |
20:16:22 | <@JAA> | The Mediatheken basically contain every TV production by any of the public broadcasters in Germany. You can watch them for free for like a month (often geolocked to Germany). |
20:16:41 | <@JAA> | Yeah, I don't think we need to archive audio streams via a third-party service anyway. |
20:17:38 | <pokechu22> | !a https://elitemeetus.org/ -i blogs,badvideos -e Proactive |
20:17:57 | <pokechu22> | I guess I might as well generate a list of those API URLs though |
20:19:22 | <pokechu22> | oh, https://mediathekviewweb.de/api/query?query={%22queries%22%3A[]%2C%22sortBy%22%3A%22timestamp%22%2C%22sortOrder%22%3A%22desc%22%2C%22future%22%3Afalse%2C%22size%22%3A30%2C%22offset%22%3A3694170} doesn't work - elasticsearch only wants to expose the first 10000 results |
20:21:57 | <@JAA> | Yeah, the API would be reasonable. |
20:22:10 | <@JAA> | That's the official API, too, I think. |
20:22:53 | <@JAA> | Hmm, no |
20:23:28 | | notarobot17 quits [Quit: Ping timeout (120 seconds)] |
20:23:41 | | notarobot17 joins |
20:25:45 | <@JAA> | Mixing it up with something else, nevermind. |
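The API listing idea pokechu22 raises at 20:17:57, together with the 10000-result cap noted at 20:19:22, can be sketched as a URL generator. This is a minimal sketch, not a tested crawler: the query fields are copied from the URL quoted at 20:03:24, while `MAX_WINDOW`, `query_url`, and `all_query_urls` are hypothetical names, and the exact cap rule (offset plus size at most 10000) is an assumption based on the "first 10000 results" remark.

```python
import json
from urllib.parse import quote

API = "https://mediathekviewweb.de/api/query"
# Assumption: the server refuses anything past the first 10000 results,
# as reported in the log; the exact boundary rule is a guess.
MAX_WINDOW = 10_000


def query_url(offset, size=30):
    """Build one newest-first listing URL for the given offset."""
    if offset < 0 or offset + size > MAX_WINDOW:
        raise ValueError("offset is outside the assumed result window")
    query = {
        "queries": [],
        "sortBy": "timestamp",
        "sortOrder": "desc",
        "future": False,
        "size": size,
        "offset": offset,
    }
    # Compact separators reproduce the encoding seen in the logged URL.
    encoded = quote(json.dumps(query, separators=(",", ":")), safe="")
    return API + "?query=" + encoded


def all_query_urls(size=30):
    """List every page URL that fits inside the assumed window."""
    return [query_url(o, size) for o in range(0, MAX_WINDOW - size + 1, size)]
```

The resulting list could be written to a file and fed to `!ao < list`. It only covers the reachable window, not the full 694178 results.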
20:25:53 | | TheEnbyperor_ (TheEnbyperor) joins |
20:29:10 | <h2ibot> | Debug32 edited List of lost online videos/list (+929): https://wiki.archiveteam.org/?diff=56906&oldid=53968 |
20:40:58 | | Dada quits [Remote host closed the connection] |
20:45:47 | | APOLLO03 quits [Quit: .] |
20:47:41 | | cuphead2527480 (Cuphead2527480) joins |
20:50:09 | | cuphead2527480 is now known as CuppyMan |
20:55:09 | | notSokar quits [Quit: Leaving] |
20:55:21 | | Sokar joins |
20:58:57 | | APOLLO03 joins |
21:21:19 | | dabs joins |
21:41:07 | | abirkill quits [Quit: Let us prepare to grapple with the ineffable itself, and see if we may not eff it after all.] |
21:58:12 | | ericgallager quits [Quit: This computer has gone to sleep] |
21:59:58 | | dabs quits [Read error: Connection reset by peer] |
22:16:52 | | atphoenix__ (atphoenix) joins |
22:18:54 | | atphoenix_ quits [Ping timeout: 240 seconds] |
23:12:35 | | ericgallager joins |
23:13:56 | <gamer191-1|m> | koichi: The situation with Vimeo (I was discussing this with one of the yt-dlp developers, for unrelated reasons) is that you can no longer generate unauthenticated guest tokens because they’ve hardened their api. However, if you have a cached guest token (I have one, which I’m willing to share if needed) then you can continue using it. Also, Vimeo embeds can be downloaded without an account (subject to heavy rate-limiting), but |
23:13:56 | <gamer191-1|m> | the embed url usually isn’t guessable and often requires a referrer for a website it’s embedded on (https://www.farmtransparency.org, I guess) |
23:15:54 | <pokechu22> | I assume a recursive AB crawl of https://www.farmtransparency.org would generate embed URLs |
23:17:00 | <pokechu22> | hmm, it looks like https://archive.fart.website/archivebot/viewer/job/202103290041478ojfw had videos ignored? |
23:21:43 | <pokechu22> | those might have been done in https://archive.fart.website/archivebot/viewer/job/20210329010802akni3 instead |
23:26:30 | <gamer191-1|m> | “I assume a recursive AB crawl of https://www.farmtransparency.org would generate embed URLs” |
23:26:30 | <gamer191-1|m> | Yeah, although like I said that’s very heavily rate-limited, so we’d need to switch IP addresses every few videos (once an IP address is rate-limited, it will start getting Cloudflare turnstile captchas in Vimeo embeds, and idk how long that will last) |
23:26:30 | <gamer191-1|m> | Also idk if it would give us m3u8, dash or https links (I can’t check right now cause I’m on my phone) and it would require the curl_cffi dependency (I assume that’s installed on AB) |
23:29:35 | <pokechu22> | I was more thinking we ignore the embeds in archivebot, but use the recursive crawl to generate a list of URLs like https://player.vimeo.com/video/1103632636?h=2a9984e565 from https://www.farmtransparency.org/videos?id=l5b22836ku |
23:30:20 | <pokechu22> | the videos themselves could be downloaded outside of archivebot |
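Pulling the embed URLs out of crawled pages, as pokechu22 suggests at 23:29:35, might look like the following. This is a minimal sketch under assumptions: the pattern is inferred from the single example URL in the log (numeric ID, optional hexadecimal `h=` value), and `EMBED_RE` and `find_embeds` are hypothetical names.

```python
import re

# Assumption: embed URLs follow the one example in the log, i.e. a numeric
# video ID with an optional hexadecimal "h" parameter.
EMBED_RE = re.compile(r"https://player\.vimeo\.com/video/\d+(?:\?h=[0-9a-f]+)?")


def find_embeds(html):
    """Return the unique embed URLs found in html, in first-seen order."""
    # dict.fromkeys drops duplicates while keeping insertion order.
    return list(dict.fromkeys(EMBED_RE.findall(html)))
```

Any other query parameters after the `h=` value are dropped, so each video appears once in the list handed to whatever downloads the videos outside ArchiveBot.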
23:31:55 | <gamer191-1|m> | pokechu22: Hang on…can we just use the download button on the video pages |
23:32:55 | <pokechu22> | hmm, that seems to download it from vimeo actually |
23:33:18 | <pokechu22> | ... but how does it choose which resolution to download? |
23:33:36 | <pokechu22> | just going to https://www.farmtransparency.org/videos?id=l5b22836ku&action=download fails |
23:34:30 | <gamer191-1|m> | I guess it uses a post request (I’m on my phone right now so I can’t check) |
23:35:20 | <pokechu22> | ah, it's a POST to that with a CSRF token (in firefox, navigate to https://www.farmtransparency.org/videos?id=l5b22836ku, then press alt, then select file -> work offline, then click one of the download links which should open https://www.farmtransparency.org/videos?id=l5b22836ku&action=download in a new tab, then go to that tab, then press alt, then uncheck file -> work |
23:35:21 | <pokechu22> | offline, then press f12 for dev tools, then refresh the page, then confirm that you're willing to refresh a POST request. it should then appear in devtools.) |
23:35:40 | | Wohlstand quits [Quit: Wohlstand] |
23:38:25 | <pokechu22> | hmm, the CSRF token doesn't change per pageload but does seem to be tied to a cookie of some sort |
23:38:51 | <pokechu22> | either way, probably best to just use an archivebot job to enumerate videos and other media without downloading it, and then do something with the videos separately afterwards |
23:44:41 | <pokechu22> | ugh, pagination requires loading https://www.farmtransparency.org/scripts/asset-display?p=3&asset_types=videos with the header X-Requested-With: XMLHttpRequest |
23:48:34 | <gamer191-1|m> | “either way, probably best to just use an archivebot job to enumerate videos and other media without downloading it, and then do something with the videos separately afterwards” Agreed! |
23:48:34 | <gamer191-1|m> | Should the AB job also enumerate their photos, documents, and campaign material? (I don’t know if any of those 3 categories are easy to archive) |
23:49:22 | <@JAA> | 'Please mirror our stuff, but also, we'll make it as hard as we can.' |
23:49:23 | <@JAA> | Lovely |
23:50:44 | <gamer191-1|m> | Should we contact them? |
23:50:57 | <pokechu22> | seems like curl 'https://www.farmtransparency.org/scripts/asset-display?p=3&asset_types=videos' -H 'X-Requested-With: XMLHttpRequest' works so I can enumerate things that way |
23:51:33 | <pokechu22> | I don't think a recursive AB job would find everything on its own |
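The curl command from 23:50:57 translates to Python roughly as below. This is a minimal sketch: the endpoint, the `p` and `asset_types` parameters, and the header come from the log, while `page_url`, `fetch_page`, and the timeout are hypothetical. How many pages exist and what the response body contains are not stated in the log, so the sketch stops at fetching one page.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE = "https://www.farmtransparency.org/scripts/asset-display"
# The log reports that pagination only works with this header set.
HEADERS = {"X-Requested-With": "XMLHttpRequest"}


def page_url(page, asset_types="videos"):
    """Build the listing URL for one page of the given asset type."""
    return BASE + "?" + urlencode({"p": page, "asset_types": asset_types})


def fetch_page(page, asset_types="videos", timeout=30):
    """Fetch one listing page and return its body as text (network call)."""
    req = Request(page_url(page, asset_types), headers=HEADERS)
    with urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

A caller would loop over page numbers with a polite delay and stop when a page comes back empty. That stopping rule is itself an assumption about how the site signals the last page.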
23:57:34 | | andrew quits [Ping timeout: 240 seconds] |