00:20:20 | | wickedplayer494 is now authenticated as wickedplayer494 |
00:26:14 | | TheEnbyperor_ quits [Ping timeout: 260 seconds] |
00:26:14 | | TheEnbyperor quits [Ping timeout: 260 seconds] |
00:27:50 | | TheEnbyperor joins |
00:31:58 | | TheEnbyperor_ (TheEnbyperor) joins |
00:32:06 | | APOLLO03 quits [Client Quit] |
00:32:45 | | APOLLO03 joins |
00:34:37 | <h2ibot> | TriangleDemon edited Hatena (-60): https://wiki.archiveteam.org/?diff=56569&oldid=56560 |
00:57:09 | | qwertyasdfuiopghjkl21 joins |
00:57:33 | | qwertyasdfuiopghjkl21 quits [Max SendQ exceeded] |
00:58:16 | | qwertyasdfuiopghjkl21 joins |
00:58:40 | | qwertyasdfuiopghjkl21 quits [Max SendQ exceeded] |
00:59:32 | | qwertyasdfuiopghjkl21 joins |
00:59:56 | | qwertyasdfuiopghjkl21 quits [Max SendQ exceeded] |
01:00:25 | | qwertyasdfuiopghjkl21 joins |
01:00:49 | | qwertyasdfuiopghjkl21 quits [Max SendQ exceeded] |
01:01:15 | | qwertyasdfuiopghjkl21 joins |
01:01:39 | | qwertyasdfuiopghjkl21 quits [Max SendQ exceeded] |
01:01:49 | | qwertyasdfuiopghjkl2 quits [Ping timeout: 260 seconds] |
01:02:27 | | qwertyasdfuiopghjkl2 joins |
01:02:27 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:02:51 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
01:03:47 | | qwertyasdfuiopghjkl2 joins |
01:03:47 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:04:11 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
01:04:54 | | qwertyasdfuiopghjkl2 joins |
01:04:54 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:05:18 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
01:05:56 | | qwertyasdfuiopghjkl2 joins |
01:05:56 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:06:20 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
01:06:32 | | _wotd_ quits [Quit: connection go boom] |
01:06:37 | | wotd joins |
01:06:42 | | qwertyasdfuiopghjkl2 joins |
01:06:42 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:07:06 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
01:08:03 | | qwertyasdfuiopghjkl2 joins |
01:08:03 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:08:27 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
01:09:16 | | qwertyasdfuiopghjkl2 joins |
01:09:16 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:09:40 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
01:10:16 | | qwertyasdfuiopghjkl2 joins |
01:10:16 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:10:40 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
01:11:03 | | qwertyasdfuiopghjkl2 joins |
01:11:03 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:11:27 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
01:11:53 | | qwertyasdfuiopghjkl2 joins |
01:11:53 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:12:17 | | qwertyasdfuiopghjkl2 quits [Max SendQ exceeded] |
01:13:18 | | qwertyasdfuiopghjkl2 joins |
01:13:18 | | qwertyasdfuiopghjkl2 is now authenticated as qwertyasdfuiopghjkl2 |
01:24:34 | | etnguyen03 (etnguyen03) joins |
01:37:41 | | dabs quits [Read error: Connection reset by peer] |
01:49:26 | | nicolas17 quits [Quit: Konversation terminated!] |
01:52:14 | | nicolas17 joins |
02:05:38 | | nicolas17_ joins |
02:08:54 | | nicolas17 quits [Ping timeout: 260 seconds] |
02:13:24 | | Webuser475056 joins |
02:13:50 | | Webuser475056 quits [Client Quit] |
02:14:56 | <h2ibot> | PaulWise edited Anubis (+151, document mnbot/SPN status with Anubis): https://wiki.archiveteam.org/?diff=56570&oldid=56428 |
02:22:19 | | cuphead2527480 quits [Quit: Connection closed for inactivity] |
02:23:53 | | etnguyen03 quits [Client Quit] |
02:32:17 | | BornOn420 quits [Remote host closed the connection] |
02:32:17 | | nicolas17_ quits [Read error: Connection reset by peer] |
02:32:58 | | BornOn420 (BornOn420) joins |
02:33:17 | | nicolas17_ joins |
02:33:24 | | etnguyen03 (etnguyen03) joins |
02:35:44 | | Mateon1 quits [Ping timeout: 260 seconds] |
02:36:46 | | Mateon1 joins |
02:43:17 | | etnguyen03 quits [Remote host closed the connection] |
02:51:14 | | midou quits [Ping timeout: 240 seconds] |
03:01:26 | | midou joins |
03:02:50 | | nicolas17 joins |
03:04:19 | | beastbg8 quits [Read error: Connection reset by peer] |
03:06:04 | | nicolas17_ quits [Ping timeout: 260 seconds] |
03:17:46 | | beastbg8 (beastbg8) joins |
03:25:20 | | onetruth quits [Read error: Connection reset by peer] |
03:31:08 | <h2ibot> | TriangleDemon edited Hatena (+449): https://wiki.archiveteam.org/?diff=56572&oldid=56569 |
03:34:08 | <h2ibot> | TriangleDemon edited Colors! (+55): https://wiki.archiveteam.org/?diff=56573&oldid=56558 |
04:40:48 | | nicolas17_ joins |
04:44:39 | | nicolas17 quits [Ping timeout: 260 seconds] |
05:18:04 | <c3manu> | i don't know more context about archiving itch.io pages and have to leave soon, but apparently they started shadowbanning NSFW content: https://bsky.app/profile/thetransfemininereview.com/post/3luogyd7z4k2i |
05:22:15 | | healingherb joins |
05:22:49 | <healingherb> | Are we on track to archive all goo.gl URLs by August 25? |
05:28:28 | | healingherb is now authenticated as healingherb |
05:28:46 | | healingherb quits [Changing host] |
05:28:46 | | healingherb (healingherb) joins |
05:31:00 | | Guest58 quits [Quit: My Mac has gone to sleep. ZZZzzz…] |
05:38:21 | | LunarianBunny1147 (LunarianBunny1147) joins |
05:40:25 | | Webuser447569 joins |
05:41:51 | <Webuser447569> | Just as a heads up, itch.io is shadow banning being able to find adult games on its platform, so that may be worth looking into as many games ONLY exist on itch.io |
05:42:05 | <Webuser447569> | E.g. the tag `nsfw` used to return tens of thousands of results... now it only returns 4 |
05:43:04 | <Webuser447569> | the games still exist via direct links though |
05:44:39 | | nicolas17 joins |
05:45:41 | <Webuser447569> | i wouldn't even know how to start with archiving those games, but I'm happy to try help however i can lol |
05:45:41 | | nicolas17_ quits [Read error: Connection reset by peer] |
05:47:33 | <Webuser447569> | wow... even tags like erotic, adult, bdsm, etc have been nuked, from >10k each to <100 each |
05:49:21 | <Webuser447569> | huh... they also shadow banned most of the `trans` tag... fun |
05:55:02 | | igloo22225 (igloo22225) joins |
05:55:55 | | Webuser447569 quits [Client Quit] |
05:57:56 | | nicolas17_ joins |
05:59:04 | | nine quits [Quit: See ya!] |
05:59:17 | | nine joins |
05:59:18 | | nine is now authenticated as nine |
05:59:18 | | nine quits [Changing host] |
05:59:18 | | nine (nine) joins |
06:01:04 | | nicolas17 quits [Ping timeout: 260 seconds] |
06:03:25 | | abirkill- (abirkill) joins |
06:05:09 | | Irenes quits [Ping timeout: 260 seconds] |
06:05:09 | | abirkill quits [Ping timeout: 260 seconds] |
06:05:09 | | abirkill- is now known as abirkill |
06:05:29 | | Irenes (ireneista) joins |
06:10:36 | <pokechu22> | We did grab stuff with itch.io starting 2 years ago (but that job lasted nearly a year) |
06:11:22 | <pokechu22> | https://itch.io/sitemap.xml does exist though |
06:14:06 | <healingherb> | Webuser090327: I'm going to take a guess that the thing with the trans tag is probably a mistake |
06:18:40 | <healingherb> | the sole owner of itch has always been the same one person, Leaf Corcoran, and in April there was a game bundle raising money for a trans organization |
06:18:58 | <healingherb> | I doubt he's changed his mind in the past 3 months |
06:22:39 | <pokechu22> | https://itch.io/games/tag-nsfw seems nearly-empty now but https://itch.io/games/nsfw is 7049 results |
06:23:10 | <pokechu22> | ... but it was much higher before: https://web.archive.org/web/20250716040724/https://itch.io/games/nsfw |
06:27:39 | <healingherb> | How recent was this change? I wonder if it's all just some kind of mistake or glitch |
06:28:23 | <that_lurker> | payment providers do not like nsfw so they might have started pressuring itch like they did with Steam |
06:38:05 | <that_lurker> | https://itch.io/updates/update-on-nsfw-content |
06:38:17 | | that_lurker hates the power payment providers have on platforms |
06:43:48 | | nicolas17 joins |
06:47:09 | | nicolas17_ quits [Ping timeout: 260 seconds] |
06:55:28 | | awauwa (awauwa) joins |
07:01:06 | | nicolas17_ joins |
07:01:10 | | nicolas17 quits [Read error: Connection reset by peer] |
07:07:54 | | APOLLO03 quits [Ping timeout: 240 seconds] |
07:08:31 | <cruller> | Steam Banned and Removed Games List https://docs.google.com/spreadsheets/d/1aAbrEDNa2NmgntrKtij0RuxKneiFcTJisGyfMUv2nnM/edit |
07:08:31 | <cruller> | They also mention itch.io a little bit. |
07:10:16 | | nicolas17 joins |
07:13:59 | | nicolas17_ quits [Ping timeout: 260 seconds] |
07:15:36 | | nicolas17_ joins |
07:17:27 | <healingherb> | that_lurker: thank you for finding that blog post, that explains everything |
07:18:14 | | TheEnbyperor quits [Ping timeout: 240 seconds] |
07:19:49 | | nicolas17 quits [Ping timeout: 260 seconds] |
07:19:49 | | HugsNotDrugs quits [Ping timeout: 260 seconds] |
07:20:03 | | TheEnbyperor_ is now known as TheEnbyperor |
07:20:07 | | TheEnbyperor_ joins |
07:21:34 | | Snivy quits [Ping timeout: 260 seconds] |
07:22:17 | <healingherb> | I don't see a "trans" tag currently, but there is a "transgender" tag that has 1,192 results. Wayback Machine says a month ago it had 1,652. I guess the difference is 460 NSFW games? |
07:23:55 | | HugsNotDrugs joins |
07:25:06 | | Island quits [Read error: Connection reset by peer] |
07:36:56 | | nicolas17 joins |
07:40:14 | | nicolas17_ quits [Ping timeout: 260 seconds] |
07:41:32 | | Dada joins |
07:56:23 | | rohvani joins |
08:03:14 | | hexa_ quits [Quit: WeeChat 4.6.3] |
08:05:38 | | hexa_ (hexa-) joins |
08:15:35 | | Snivy (Snivy) joins |
08:30:47 | | nicolas17_ joins |
08:31:51 | | nicolas17 quits [Read error: Connection reset by peer] |
08:37:57 | | nicolas17 joins |
08:38:29 | | nicolas17_ quits [Read error: Connection reset by peer] |
08:41:29 | | flotwig quits [Read error: Connection reset by peer] |
08:42:22 | | flotwig joins |
08:54:54 | | pabs quits [Ping timeout: 260 seconds] |
08:57:19 | | nicolas17_ joins |
09:00:44 | | nicolas17 quits [Ping timeout: 260 seconds] |
09:20:15 | | healingherb quits [Quit: Ooops, wrong browser tab.] |
09:20:24 | | Guest58 joins |
09:26:45 | | nicolas17 joins |
09:29:54 | | nicolas17_ quits [Ping timeout: 260 seconds] |
09:31:45 | <@OrIdow6> | cruller: Holy cow, that is some spreadsheet |
09:32:02 | <@OrIdow6> | I guess the "delisted" ones are actionable by us? |
09:33:13 | <@OrIdow6> | Also a warning that that link is NSFW |
09:49:11 | | nicolas17_ joins |
09:49:44 | | nicolas17 quits [Ping timeout: 260 seconds] |
10:10:50 | | pabs (pabs) joins |
10:19:47 | | APOLLO03 joins |
10:20:04 | <@OrIdow6> | cruller: I've started a basic job to save the "available" and "delisted" URLs on that list |
10:34:24 | <that_lurker> | was Sex With Hitler delisted |
10:46:30 | <@OrIdow6> | that_lurker: Don't know, it's not on the spreadsheet |
11:00:02 | | Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat] |
11:02:50 | | Bleo182600722719623455222 joins |
11:04:24 | | anonymoususer852 quits [Ping timeout: 260 seconds] |
11:10:57 | <cruller> | that_lurker: If the content filter is off, the title appears in the search results. (Some of) the "delisted" titles on the spreadsheet don't appear. |
11:12:13 | | mgrytbak quits [Quit: Ping timeout (120 seconds)] |
11:12:22 | | mgrytbak joins |
11:14:53 | | anonymoususer852 (anonymoususer852) joins |
11:23:42 | <cruller> | <OrIdow6> "koichi: I've started a basic job..." <- Thank you. However, many of the URLs on steam_removed_games_spreadsheet.txt may be loginwalled. |
11:34:34 | | mgrytbak quits [Ping timeout: 240 seconds] |
11:42:13 | | mgrytbak joins |
11:44:34 | <cruller> | Archive.today somehow bypasses it without a login. -> https://archive.is/XMAMT (NSFW) |
11:52:45 | <cruller> | Oh, it has already been mentioned on #archivebot. |
12:09:28 | | BennyOtt_ joins |
12:10:54 | | BennyOtt quits [Ping timeout: 260 seconds] |
12:10:54 | | BennyOtt_ is now known as BennyOtt |
12:10:54 | | BennyOtt is now authenticated as BennyOtt |
12:17:51 | | Gadelhas562873784 joins |
12:18:58 | <hexagonwin_> | It seems like the russian computer forum oszone is going offline in september. i'm not russian or member there so not sure of the details, I just found out while lurking around. could this be queued to archivebot? http://forum.oszone.net/thread-356420.html |
12:24:21 | <hexagonwin_> | seems like someone managed to scrape the site and shared it through edk2.. not sure how complete that is http://forum.oszone.net/post-3039063-42.html |
12:27:24 | | justcool393 quits [Read error: Connection reset by peer] |
12:27:30 | | justcool393 (justcool393) joins |
12:55:25 | <gamer191-1|m> | My laptop just ran out of power and auto-hibernated. The warrior was running so when I boot it back up it will probably continue with a few failed network requests during the boot up. Is the warrior capable of handling that and repeating the requests, or should I force shut it down as soon as my laptop boots up? |
13:00:40 | <nstrom|m> | It should be good on its own to retry |
14:18:40 | <@arkiver> | i am a bit less available until monday/tuesday (depending on your timezone) |
14:21:39 | | lunik1 quits [Quit: :x] |
14:22:10 | | lunik1 joins |
14:32:02 | | PredatorIWD25 quits [Read error: Connection reset by peer] |
14:45:43 | | Guest58 quits [Quit: My Mac has gone to sleep. ZZZzzz…] |
14:50:33 | <@arkiver> | if there's a login wall that is not easily possible to circumvent without login, then archive.is may be the better options |
14:50:34 | <@arkiver> | option* |
14:51:16 | <@arkiver> | i would in general stay away from projects that require an account to archive |
14:52:11 | <@arkiver> | so, the "shadow ban" on itch.io has already happened? |
15:43:14 | | Wohlstand (Wohlstand) joins |
16:21:40 | | justauser|m1 joins |
16:21:44 | | Juest quits [Ping timeout: 260 seconds] |
16:35:06 | <pokechu22> | gamer191-1|m: it'll retry the requests, however the VM's clock doesn't know about host hibernation and will be incorrect, so it's best to gracefully stop it and restart it. |
16:50:24 | | Juest (Juest) joins |
16:58:13 | | dabs joins |
17:09:20 | <h2ibot> | Pokechu22 edited Deathwatch (+165, /* 2025 */ OSzone.net): https://wiki.archiveteam.org/?diff=56574&oldid=56542 |
17:09:31 | | Wohlstand quits [Remote host closed the connection] |
17:10:20 | <h2ibot> | Pokechu22 edited Deathwatch (+23, /* 2025 */): https://wiki.archiveteam.org/?diff=56575&oldid=56574 |
17:25:33 | | grill (grill) joins |
17:52:04 | <@OrIdow6> | So, uh, are we doing anything for Itch.io? Or is someone closely keeping track of the situation? |
17:53:44 | <pokechu22> | I saved their sitemap as it currently exists, which seems to include NSFW games still (at least I saw an entry in the sitemap that didn't appear in search). Actually saving all of the games seems more difficult |
17:54:08 | <pokechu22> | uh, though maybe we could send everything in the sitemap to #// ? I'm still not familiar with the project and the risks of doing something like that |
17:54:48 | <nicolas17_> | btw at current speed the ETA for goo-gl is August 28 and deadline is August 25, we need a little extra push |
17:57:53 | <justauser|m> | > uh, though maybe we could send everything in the sitemap to #// ? |
17:57:53 | <justauser|m> | Advised against I think. |
17:58:27 | <justauser|m> | Lists containing large numbers of URLs on one host are not appropriate here. This project runs at very high speed and can easily DDoS a server. If you would like a crawl of a single website, please request it in #archivebot instead. |
17:58:51 | <pokechu22> | Yeah, given that it took a year to run in archivebot last time, probably not going to work there either :/ |
17:59:20 | <pokechu22> | (though that included a lot of junk related to faceting; maybe an !ao < list of just the sitemap would work better) |
18:00:19 | | grill quits [Ping timeout: 260 seconds] |
18:02:02 | | grill (grill) joins |
18:02:25 | | nicolas17_ is now known as nicolas17 |
18:03:05 | <@OrIdow6> | Yeah that seems like a good action |
18:03:11 | <@OrIdow6> | And I'm glad you ran the sitemap as is |
18:03:32 | <@OrIdow6> | I guess I'm just not clear on what the future looks like |
18:04:03 | <@OrIdow6> | Whether these games will stay permanently noindexed but otherwise present, whether they'll get removed in practice because their creators can no longer receive payments etc |
18:04:25 | <@OrIdow6> | Or whether itch.io is in some sort of negotiations with Visa/Mastercard and in a week they're gonna get removed entirely |
18:05:06 | <pokechu22> | https://itch.io/updates/update-on-nsfw-content says they're reviewing them and some may be removed |
18:05:12 | <justauser|m> | pokechu22: I'm not sure I understand the proposal. |
18:05:54 | <justauser|m> | Sitemap only links to game homepages - not even to images, let alone games themselves, when available. |
18:06:41 | <pokechu22> | ArchiveBot parses HTML and extracts images from it |
18:06:57 | <pokechu22> | it won't download the games themselves though (it doesn't run JS, so it can't handle embedded games) |
18:07:35 | <pokechu22> | but we can get screenshot thumbnails at least (full-sized screenshots won't work) |
18:08:18 | <justauser|m> | Does it parse on !ao, though? |
18:08:25 | <pokechu22> | Yes |
18:08:26 | <justauser|m> | RTFD isn't clear. |
18:08:47 | <@OrIdow6> | pokechu22: Thanks for the link, it does sound like we should act fast |
18:09:48 | <justauser|m> | > so it can't handle embedded games |
18:09:48 | <justauser|m> | I'm speaking of downloadable files, not embedded games. |
18:10:30 | <pokechu22> | however, it seems like they use some kind of URL signing, so the only way to go from https://img.itch.zone/aW1hZ2UvMTU4LzU4My5wbmc=/347x500/wEIKhS.png to https://img.itch.zone/aW1hZ2UvMTU4LzU4My5wbmc=/original/N3awz4.png is to extract both from the HTML on https://malec2b.itch.io/flight-of-the-data-thief (which won't happen with an !ao < list). I guess we could do an !a < |
18:10:32 | <pokechu22> | list (which is not documented and has a bunch of limitations), but that does require knowing all the subdomains (which I guess we do with the sitemap) |
18:10:48 | <pokechu22> | I'm pretty sure even for free games you have to pass through the payment page, which isn't going to work |
18:12:00 | <pokechu22> | We'd need a dedicated project to do that |
18:16:46 | <pokechu22> | OK, yeah, even for name your price games, you need to POST https://malec2b.itch.io/flight-of-the-data-thief/download_url (no body/cookies required) to get a download link (and that changes with each POST). Archivebot doesn't do POST |
18:17:01 | <pokechu22> | (and we'd need to identify if a game is free or not) |
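
The POST step described above could be sketched in Python as below. This is only a hedged illustration: `build_download_request` is a hypothetical helper name, the empty body follows the "no body/cookies required" observation, and nothing here sends the request or assumes a response format, since the log does not describe one. Some games reportedly POST to a different URL, so this would not cover every case.

```python
import urllib.request


def build_download_request(game_url):
    """Build (but do not send) the POST request for a game's download link.

    Assumption from the log: POSTing to <game page>/download_url with an
    empty body and no cookies yields a fresh download link each time.
    """
    url = game_url.rstrip("/") + "/download_url"
    # data=b"" gives an empty request body; method="POST" makes the verb explicit.
    return urllib.request.Request(url, data=b"", method="POST")


if __name__ == "__main__":
    req = build_download_request("https://malec2b.itch.io/flight-of-the-data-thief")
    print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would actually perform the POST; that step is left out deliberately.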
18:25:00 | <@OrIdow6> | pokechu22: Thanks for the research |
18:25:08 | <@OrIdow6> | That is kinda a shame |
18:26:10 | <@OrIdow6> | I could look into writing a grab script today or tomorrow, I guess |
18:26:32 | <@OrIdow6> | Maybe just something barebones to get the stuff that needs JS logic/POST, and the rest we can do thru AB |
18:27:15 | <pokechu22> | https://bourrindesbois.itch.io/bx-the-rogue-like-chapter-1-road-trip-to-villebois-lavalette actually doesn't even have the purchase link, but instead does a POST to a different URL |
18:27:43 | <pokechu22> | but I don't think we can distinguish between name your price and pay directly |
18:29:38 | <pokechu22> | I'm going to do it as an !a < list seeded with the games sitemaps in reverse order (so newest games first); that seems like the easiest way to get everything we might want. I'll *not* include https://itch.io/ itself in that list. That should at least be a starting point... |
18:30:04 | <pokechu22> | oh, and I did previously check, and NSFW games have a JS overlay, but the rest of the content does exist behind it even without accepting the overlay, so we don't need to do anything special there |
18:30:47 | <@OrIdow6> | Ah, good |
18:30:48 | <@OrIdow6> | And thanks |
18:31:11 | <@OrIdow6> | Feeling tired right now for some reason |
18:31:39 | <pokechu22> | for reference: cat games*.xml | sed 's~<url><loc>~~' | sed 's~</loc></url>~~' | sed 's~<?xml version="1.0" encoding="UTF-8"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">~~' | sed "s~</urlset>~\n~" | tac | zstd -10 > itch.io_subdomain_games.txt.zst |
18:35:19 | <@OrIdow6> | Thanks for the shell pipeline, I'll probably use it :) |
18:35:43 | <pokechu22> | Yeah, there's probably a smarter way to convert a sitemap into a list of URLs, but I'm good at jank like that :) |
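
One possible alternative to the sed pipeline is to parse the sitemap XML directly. The sketch below assumes each `games*.xml` file is a standard `urlset` sitemap in the namespace shown in the pipeline; the function names are made up for illustration, and compression with zstd is left to the shell.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the urlset declaration quoted in the pipeline.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_bytes):
    """Return the <loc> values of one sitemap document, in document order."""
    root = ET.fromstring(xml_bytes)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def newest_first(sitemap_documents):
    """Concatenate several sitemaps and reverse the result.

    This roughly mirrors `cat games*.xml | ... | tac`, under the assumption
    that later sitemap entries are newer games.
    """
    urls = []
    for doc in sitemap_documents:
        urls.extend(sitemap_urls(doc))
    urls.reverse()
    return urls
```

Reading each file in binary mode (`open(path, "rb").read()`) and passing the bytes in keeps the XML encoding declaration intact.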
18:35:46 | <@OrIdow6> | Downloading the XML myself now, will probably try to check if there are as many games as the ID numbers imply, and try to do a size estimate |
18:36:07 | <@OrIdow6> | Size estimate of total download size |
19:03:36 | <h2ibot> | OrIdow6 created Itch.io (+353, Created page with "Game hosting/downloading…): https://wiki.archiveteam.org/?title=Itch.io |
19:05:17 | | awauwa quits [Quit: awauwa] |
19:05:58 | | userofweb joins |
19:14:05 | <pokechu22> | https://itch.io/docs/api/javascript uses https://leafo.itch.io/x-moon/data.json - I'll generate a list of those too. The tag info there will probably be helpful. |
19:15:08 | <pokechu22> | in particular, note that https://bourrindesbois.itch.io/bx-the-rogue-like-chapter-1-road-trip-to-villebois-lavalette/data.json doesn't have a price, while https://leafo.itch.io/x-moon/data.json has a price of $0.00 |
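
The price difference noted above could be turned into a rough classifier along these lines. This is a sketch under loud assumptions: it presumes `data.json` is a JSON object with an optional top-level `price` string such as "$0.00", which matches the two examples mentioned but is not a documented schema, and the category labels are invented here.

```python
import json
import re
from decimal import Decimal, InvalidOperation


def price_status(data_json_text):
    """Classify a data.json body by its (assumed) top-level "price" field."""
    data = json.loads(data_json_text)
    if not isinstance(data, dict) or "price" not in data:
        return "no-price-field"
    # Keep only digits and the decimal point, e.g. "$0.00" -> "0.00".
    digits = re.sub(r"[^0-9.]", "", str(data["price"]))
    try:
        amount = Decimal(digits)
    except InvalidOperation:
        return "unparsed-price"
    return "zero-price" if amount == 0 else "nonzero-price"
```

Whether "no-price-field" and "zero-price" correspond to different purchase flows on the site is exactly the open question in the discussion, so the labels should be treated as raw observations only.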
19:16:26 | <pokechu22> | !status |
19:17:49 | <pokechu22> | for reference again: zstdcat itch.io_subdomain_games.txt.zst | sed 's~$~/data.json~' | zstd -10 > itch.io_game_api_data.txt.zst |
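
The same URL rewrite can be expressed in Python, for anyone scripting the list instead of using sed. `data_json_url` is a hypothetical helper; unlike the sed expression it also strips a trailing slash first, which is a small deviation added here to avoid double slashes.

```python
def data_json_url(game_url):
    """Map a game page URL to its data.json URL, as the sed step does."""
    return game_url.rstrip("/") + "/data.json"


def data_json_urls(game_urls):
    """Apply the mapping to every non-empty line of a URL list."""
    return [data_json_url(u.strip()) for u in game_urls if u.strip()]
```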
19:18:02 | <Swryl> | What would be a good resource for me to send people in order to archive their games and pages? |
19:18:16 | <Swryl> | People are wondering. |
19:19:16 | <pokechu22> | Right now I think the best we have is https://web.archive.org/save/ - the archivebot jobs should cover most stuff that's easy to get for all currently-uploaded games, other than downloads |
19:20:04 | <Swryl> | Thank you. |
19:23:15 | | woodsman joins |
19:26:46 | | woodsman quits [Read error: Connection reset by peer] |
19:33:34 | | Irenes quits [Read error: Connection reset by peer] |
19:42:23 | <@OrIdow6> | Have taken a long winding road trying to figure out how to download html5 games from itch.io |
19:42:25 | <@OrIdow6> | Exorcism: |
19:42:27 | <@OrIdow6> | Whoops |
19:47:26 | <@OrIdow6> | Exorcism: So from the looks of it this is a fake shutdown announcement, and it's actually a user speculating that the site's administration is going to neglect it in the future? |
19:47:27 | | Webuser337104 joins |
19:47:31 | | Webuser337104 quits [Client Quit] |
19:48:19 | | Island joins |
19:48:34 | | grill quits [Ping timeout: 240 seconds] |
19:52:14 | | ducky quits [Ping timeout: 240 seconds] |
19:53:49 | <h2ibot> | TriangleDemon edited Colors! (+71, /* Site structure */): https://wiki.archiveteam.org/?diff=56577&oldid=56573 |
19:53:52 | <pokechu22> | OrIdow6: I'm going to actually start several archivebot jobs for the data.json URLs since those contain the information we would need to identify NSFW games and free games |
19:54:08 | <pokechu22> | zstdcat itch.io_game_api_data.txt.zst | split --number r/6 --numeric-suffixes=1 --filter='zstd -10 > $FILE' --additional-suffix=".txt.zst" - itch.io_game_api_data_part_ |
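
For reference, `split --number r/6` distributes lines round-robin across six outputs. A minimal Python sketch of that distribution is below; the function name is illustrative, and the zstd compression and file naming done by `--filter` and the suffix options are omitted.

```python
def split_round_robin(lines, parts):
    """Distribute lines across `parts` buckets in round-robin order.

    Line 0 goes to bucket 0, line 1 to bucket 1, and so on, wrapping around,
    which is the distribution `split --number r/N` produces.
    """
    buckets = [[] for _ in range(parts)]
    for index, line in enumerate(lines):
        buckets[index % parts].append(line)
    return buckets
```

Round-robin splitting keeps each part a mix of old and new entries, instead of giving one job only the newest games.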
19:55:04 | <@OrIdow6> | I am finding it hard to navigate thru Amino |
19:55:06 | <@OrIdow6> | pokechu22: Thx |
19:55:10 | <@OrIdow6> | Good idea |
19:55:53 | | Irenes (ireneista) joins |
20:01:20 | <@OrIdow6> | https://aminoapps.com/c/wings-of-amino/page/blog/psa-app-updates/r8Vr_rxseurrqVj7KMl0pENQ0QNgQxavNY - ' |
20:01:22 | <@OrIdow6> | This news comes from a game master named Fairyyoona who is not affiliated with Team Amino... All information that has been circulating around is speculation, which means there are no plans for Amino to be shut down at this time.... Fairyyoona appears to be taking out their grief on Amino by abusing their own privileges and abilities as a way to try and get back and, in their words, "sabotage".' |
20:03:51 | | userofweb quits [Client Quit] |
20:04:44 | <@OrIdow6> | !remindme 2h amino |
20:04:44 | <eggdrop> | [remind] ok, i'll remind you at 2025-07-24T22:04:44Z |
20:29:10 | | sec^nd quits [Ping timeout: 264 seconds] |
20:32:51 | | APOLLO03 quits [Quit: Leaving] |
20:34:03 | | sec^nd (second) joins |
20:34:58 | | Wohlstand (Wohlstand) joins |
20:35:32 | <Swryl> | Does https://web.archive.org/save/ grab the downloads for itch as well? |
20:39:54 | <Swryl> | Darn, it looks like it keeps hitting the age gate. |
20:43:27 | | etnguyen03 (etnguyen03) joins |
20:49:25 | <pokechu22> | It doesn't grab the downloads. It does hit the age gate, but the age gate is just a thing in front of the page, and the content behind it still is saved (while I don't think you can actually close it on web.archive.org, you can delete it with inspect element and the page is behind it) |
20:51:09 | | APOLLO03 joins |
20:53:13 | | ducky (ducky) joins |
20:56:10 | | arch quits [Remote host closed the connection] |
20:56:28 | | arch (arch) joins |
20:57:09 | | JayEmbee quits [Quit: WeeChat 4.1.1] |
21:02:50 | | userofweb joins |
21:03:31 | | DopefishJustin quits [Remote host closed the connection] |
21:05:36 | <Swryl> | got it. I'm helping people upload their works to the archive as files. |
21:05:49 | <Swryl> | Do you have any resources/good to know things to share? |
21:07:57 | <pokechu22> | I think there's metadata you can set to indicate that an archive.org item is 18+ but I'm not sure of the details (and I think that's generally more a concern for images/thumbnails) |
21:08:11 | <katia> | I think if you give us links we can archivebot them so they’ll be in the wayback machine - not necessarily an alternative - but also an option. |
21:08:21 | <pokechu22> | I don't remember what it is though |
21:08:52 | <pokechu22> | What I saw was that the download links weren't really compatible with archivebot since they're POST-based :/ |
21:16:14 | | DopefishJustin joins |
21:16:14 | | DopefishJustin is now authenticated as DopefishJustin |
21:22:41 | <katia> | Oh :/ |
21:29:45 | | etnguyen03 quits [Client Quit] |
21:30:05 | | etnguyen03 (etnguyen03) joins |
21:34:06 | | PredatorIWD25 joins |
21:39:51 | | etnguyen03 quits [Client Quit] |
21:40:11 | | etnguyen03 (etnguyen03) joins |
21:49:57 | | etnguyen03 quits [Client Quit] |
21:50:17 | | etnguyen03 (etnguyen03) joins |
22:00:03 | | etnguyen03 quits [Client Quit] |
22:00:23 | | etnguyen03 (etnguyen03) joins |
22:04:44 | <eggdrop> | [remind] OrIdow6: amino |
22:05:34 | | ThreeHM quits [Ping timeout: 240 seconds] |
22:07:43 | | ThreeHM (ThreeHeadedMonkey) joins |
22:10:09 | | etnguyen03 quits [Client Quit] |
22:10:29 | | etnguyen03 (etnguyen03) joins |
22:16:55 | | atphoenix_ (atphoenix) joins |
22:19:19 | | atphoenix__ quits [Ping timeout: 260 seconds] |
22:23:47 | | Wohlstand quits [Client Quit] |
22:38:42 | <BlankEclair> | re itch.io: according to a quote from <https://thetransfemininereview.com/2025/07/24/itch-io-nsfw-ban/>: "Part of this review will see some pages being permanently removed from itch.io" |
22:38:44 | <BlankEclair> | so that's fun |
22:44:24 | | Dada quits [Ping timeout: 260 seconds] |
22:45:45 | <pabs> | hexagonwin_: can you add that to Deathwatch on the wiki? will it be going read-only before that? |
22:46:22 | <pabs> | ah, nevermind, that was done already |
22:48:00 | <nicolas17> | speaking of which, we may want to document the itchio situation on the wiki... |
22:55:16 | <pabs> | arkiver: sounds like a DPoS is needed for archiwum.allegro.pl (and maybe allegro.pl) https://old.reddit.com/r/Archiveteam/comments/1m8eby6/allegropl_the_biggest_ecommerce_platform_in/ |
22:55:35 | <pabs> | -feed/#archiveteam-reddit- Allegro.pl, the biggest e-commerce platform in Poland, is purging its archive of offers operating since 2015 <https://old.reddit.com/r/Archiveteam/comments/1m8eby6/allegropl_the_biggest_ecommerce_platform_in/> |
22:55:36 | | userofweb quits [Client Quit] |
23:02:05 | <nicolas17> | a huge macOS InstallAssistant.pkg that I fed into archivebot a few days ago has now been released under 3 more URLs, I should have done my own warc instead of archivebot |
23:06:31 | <@arkiver> | pokechu22: it sounds like you have itch.io handled then? |
23:07:24 | <pokechu22> | For now, I think we're doing everything that's practical to do. Once the API stuff finishes downloading, that could be used to do other specialized projects |
23:07:47 | <pokechu22> | (which I probably won't be directly involved with, but it sounds like OrIdow6 was looking into) |
23:09:39 | <@arkiver> | pokechu22: did you already collect a list and put it on transfer.archivete.am ? |
23:10:56 | <pokechu22> | Yes, but they're all from the games sitemaps on https://itch.io/sitemap.xml so it wasn't much work on my part. https://transfer.archivete.am/2azX1/itch.io_subdomain_games.txt.zst is the list of games (reversed, so newest is at the start instead of the end) |
23:11:32 | <pokechu22> | that list is running in archivebot but it'll probably take weeks to finish |
23:12:04 | <pokechu22> | I also have 6 lists running containing URLs like https://nickp1008008.itch.io/fireman/data.json which can hopefully identify NSFW games and free games |
23:12:41 | <pokechu22> | (although on further thought, that only contains tags, and I think tagged as NSFW and marked as NSFW are slightly different?) |
23:14:14 | <@arkiver> | pokechu22: hmm yeah 1.5 million of these is maybe a little steep for AB |
23:15:22 | <pokechu22> | Note that they do have rate limiting (con=6, d=100-100 is too fast, but con=3, d=100-100 is fine). Since I split it into 6 lists the metadata jobs should be done in a reasonable amount of time |
23:16:16 | <pokechu22> | but we are going to need something more targeted to get at risk stuff |
23:16:44 | <pokechu22> | oh, and downloads require POST, so archivebot won't get them |
23:17:10 | <@arkiver> | pabs: thanks for the notice on that, sounds like they stop adding new content to the archive this month, so i think we'll start the project next month |
23:17:48 | <@arkiver> | deadline would be november when they start redirecting if possible |
23:18:12 | <@arkiver> | pokechu22: this may have been discussed before, but do we know anything of a deadline on itch.io ? |
23:18:49 | <pokechu22> | no, the only info we have is https://itch.io/updates/update-on-nsfw-content |
23:20:20 | <@arkiver> | the all-mighty payment processors :/ |
23:21:59 | <pabs> | payment processors-- |
23:21:59 | <eggdrop> | [karma] 'payment processors' now has -1 karma! |
23:22:31 | <@arkiver> | yeah! |
23:29:11 | | etnguyen03 quits [Client Quit] |
23:29:31 | | etnguyen03 (etnguyen03) joins |
23:32:20 | <@OrIdow6> | arkiver: The plan on itch.io is to write a little thing that just does the logic to download the games, figure out all the games that are at risk (hopefully via pokechu22's metadata AB job having finished by then), and then running that in the next few days |
23:32:30 | <@OrIdow6> | I should be able to write it, though nothing is operational yet |
23:32:38 | <@arkiver> | OrIdow6: sounds good :) |
23:34:46 | <@OrIdow6> | There's a chance it'll have gone down then of course |
23:35:01 | <@OrIdow6> | But hopefully they'll take more time to do their thing by then |
23:35:59 | <@arkiver> | OrIdow6: let me know if you need something from me! |
23:39:17 | | etnguyen03 quits [Client Quit] |
23:57:21 | | Guest58 joins |
23:58:52 | | JayEmbee (JayEmbee) joins |