00:12:18wyatt8740 joins
00:12:51wyatt8750 quits [Ping timeout: 265 seconds]
00:14:29<lennier1>With the recent news about Apple removing apps that don't get updated, how difficult would it be to archive https://apps.apple.com ?
00:15:45<lennier1>That would get a lot of metadata about the apps, but not the .ipa files themselves, though it would also be useful for identifying apps, seeing when they were last updated, whether they're free or paid, etc.
00:16:26<TheTechRobo>lennier1: The individual app pages at least don't seem to need JS
00:17:24<TheTechRobo>However, the review page only shows a few with no option that I can see to load more.
00:17:42<TheTechRobo>Not sure abt ratelimiting
00:24:41<lennier1>Yeah, I guess it just shows 10 reviews. They also have multiple regions. i.e. https://apps.apple.com/us/app/maximus-2/id1554432924 https://apps.apple.com/de/app/maximus-2/id1554432924
00:30:27<lennier1>You can get to a page with just the app ID. https://apps.apple.com/us/app/id447188370
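The URL shapes lennier1 pasted can be sketched in a few lines of Python. This is only an illustration: the helper names `app_url` and `app_id_from_url` are hypothetical, and the only patterns assumed are the ones seen in the channel (a region code, an optional slug, then `id` followed by digits). Nothing here is an official Apple API.

```python
import re

BASE = "https://apps.apple.com"  # host taken from the links in the log


def app_url(app_id: int, region: str = "us") -> str:
    """Build the slug-less page URL for an app ID in a given region."""
    return f"{BASE}/{region}/app/id{app_id}"


def app_id_from_url(url: str):
    """Pull the numeric app ID out of a page URL, or return None."""
    # Assumption: the ID is the run of digits after '/id', ending the path segment.
    match = re.search(r"/id(\d+)(?:[/?#]|$)", url)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    found = app_id_from_url("https://apps.apple.com/de/app/maximus-2/id1554432924")
    # Rebuild the same app's page for the two regions mentioned in the log.
    for region in ("us", "de"):
        print(app_url(found, region))
```

Keeping the ID separate from the region and slug means one discovered ID can be expanded into a page URL per region.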
00:31:53<lennier1>Is there a good way to discover the app IDs? They go up to at least the low billions, but I believe there are just about a couple of million apps.
00:32:07<TheTechRobo>search engines?
00:32:14<TheTechRobo>there's also probably a discovery page
00:34:30<lennier1>The main page has some links, and then each app has links to similar apps, apps by the same developer, etc. But I don't know if that would get to all the obscure ones.
00:49:39<thuban>ok, my brute force scan of glencoe.mheducation.com sites is running
00:50:31<thuban>i did it in the maximally lazy way, so it'll take about 30 days, but with shutdown on 30 june that should leave plenty of time for archiving (there appear to be few sites and they appear to be small)
00:52:04<thuban>the sites themselves use javascript, but should be crawlable without it
00:52:58Mateon1 quits [Ping timeout: 265 seconds]
00:53:17Mateon1 joins
00:54:11Megame quits [Client Quit]
00:56:48<thuban>(some teachers' resources are "password-protected"... using client-side javascript. everything seems to be reachable through the sitemaps)
00:59:37<thuban>the shutdown announcement is for "glencoe.mheducation.com and all of its associated sites"; i'm not sure whether this implies there's content other than the sites associated with the domain, but i haven't been able to find any
01:02:38dm4v quits [Client Quit]
01:02:46dm4v joins
01:02:49dm4v quits [Changing host]
01:02:49dm4v (dm4v) joins
01:09:03march_happy quits [Ping timeout: 265 seconds]
01:10:34march_happy (march_happy) joins
01:12:47DiscantX quits [Ping timeout: 265 seconds]
01:16:35HP_Archivist (HP_Archivist) joins
01:20:45nikow1 joins
01:20:54Mateon2 joins
01:21:38BlueMaxima_ joins
01:21:41Jake4 (Jake) joins
01:22:01BlueMaxima_ quits [Remote host closed the connection]
01:22:07IDK_ quits [Client Quit]
01:22:07BlueMaxima quits [Remote host closed the connection]
01:22:07nikow quits [Remote host closed the connection]
01:22:07adamus1red quits [Client Quit]
01:22:07le0n quits [Client Quit]
01:22:07Mateon1 quits [Remote host closed the connection]
01:22:07Jake quits [Client Quit]
01:22:07mikael quits [Client Quit]
01:22:07dm4v quits [Client Quit]
01:22:07Mateon2 is now known as Mateon1
01:22:08Jake4 is now known as Jake
01:22:10dm4v joins
01:22:11msrn_ joins
01:22:20dm4v quits [Changing host]
01:22:20dm4v (dm4v) joins
01:22:43le0n (le0n) joins
01:22:43adamus1red (adamus1red) joins
01:22:53IDK_ joins
01:23:07BlueMaxima_ joins
01:23:31BlueMaxima_ quits [Remote host closed the connection]
01:24:37BlueMaxima_ joins
01:25:01BlueMaxima_ quits [Remote host closed the connection]
01:26:07BlueMaxima_ joins
01:26:31BlueMaxima_ quits [Remote host closed the connection]
01:26:45BlueMaxima_ joins
01:44:48Arcorann quits [Ping timeout: 252 seconds]
01:55:15HP_Archivist quits [Client Quit]
01:59:31<h2ibot>Hoarderhank edited Alive... OR ARE THEY (+466, Added The Correspondent): https://wiki.archiveteam.org/?diff=48570&oldid=48444
02:36:48tzt quits [Remote host closed the connection]
02:37:12tzt (tzt) joins
03:04:34jacobk quits [Ping timeout: 265 seconds]
03:14:06DiscantX joins
03:19:55BlueMaxima_ quits [Client Quit]
03:22:13benjinsmith joins
03:24:44benjins quits [Ping timeout: 265 seconds]
03:28:13jacobk joins
03:38:39march_happy quits [Ping timeout: 252 seconds]
03:39:13march_happy (march_happy) joins
03:58:13march_happy quits [Ping timeout: 265 seconds]
03:59:02march_happy (march_happy) joins
04:06:46Arcorann (Arcorann) joins
04:22:33<Arcorann>Question: is there a project related to the Philippines' current situation?
04:28:00<Frogging101>What situation is that?
04:34:03kn100 quits [Quit: https://kn100.me :)]
04:34:26kn100 joins
04:45:07<Ryz>Think it might be election related
05:34:09jacobk quits [Ping timeout: 252 seconds]
05:58:17jacobk joins
06:04:44Atom-- joins
06:06:36Atom quits [Ping timeout: 252 seconds]
06:21:02Atom joins
06:23:18michaelblob quits [Read error: Connection reset by peer]
06:24:12Atom-- quits [Ping timeout: 252 seconds]
06:26:40michaelblob (michaelblob) joins
06:45:27DiscantX quits [Ping timeout: 265 seconds]
07:06:35Shjosan quits [Ping timeout: 265 seconds]
07:30:38systwi__ (systwi) joins
07:31:18systwi quits [Ping timeout: 252 seconds]
07:35:29<thuban>whoops! those 10-digit identifiers are isbn-10s, which means (among other things) that the final 'digit' can be an "x": http://glencoe.mheducation.com/sites/007895312x/ http://glencoe.mheducation.com/sites/007874637x/ http://glencoe.mheducation.com/sites/007873830x/
07:36:05<thuban>not a problem, just have to make sure i scan those as well
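For reference, the ISBN-10 check digit thuban mentions can be computed from the first nine digits, which is why the last character is sometimes an "x". The sketch below is not thuban's scanner; the function names are hypothetical, and it assumes every site identifier is a valid ISBN-10 with a lowercase "x" as in the three URLs above, which the log does not confirm.

```python
def isbn10_check_digit(prefix: str) -> str:
    """Return the ISBN-10 check character for a 9-digit prefix."""
    if len(prefix) != 9 or not (prefix.isascii() and prefix.isdigit()):
        raise ValueError("prefix must be exactly 9 ASCII digits")
    # Weights run 10, 9, ..., 2 over the nine digits; the check digit makes
    # the full weighted sum divisible by 11.
    total = sum((10 - i) * int(ch) for i, ch in enumerate(prefix))
    check = (11 - total % 11) % 11
    # A check value of 10 is written as 'x' (lowercase here to match the site URLs).
    return "x" if check == 10 else str(check)


def site_url(prefix: str) -> str:
    """Build a candidate site URL from a 9-digit prefix (URL shape taken from the log)."""
    return f"http://glencoe.mheducation.com/sites/{prefix}{isbn10_check_digit(prefix)}/"


if __name__ == "__main__":
    for prefix in ("007895312", "007874637", "007873830"):
        print(site_url(prefix))
```

If the assumption holds, only one final character is possible per nine-digit prefix, so a scan would not need to try all eleven endings for each prefix.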
08:30:20benjinsmith quits [Ping timeout: 265 seconds]
09:09:29shoghicp joins
09:09:29shoghicp quits [Changing host]
09:09:29shoghicp (shoghicp) joins
09:51:00shoghicp quits [Ping timeout: 252 seconds]
09:52:10shoghicp joins
09:52:11shoghicp quits [Changing host]
09:52:11shoghicp (shoghicp) joins
09:55:36qwertyasdfuiopghjkl joins
10:01:08benjins joins
10:19:48Megame (Megame) joins
11:11:25systwi__ is now known as systwi
11:59:23wyatt8750 joins
11:59:42wyatt8740 quits [Ping timeout: 252 seconds]
12:10:53eroc1990 quits [Client Quit]
12:12:04eroc1990 (eroc1990) joins
12:47:00march_happy quits [Ping timeout: 252 seconds]
12:47:13march_happy (march_happy) joins
13:22:44HP_Archivist (HP_Archivist) joins
13:26:23evan quits [Remote host closed the connection]
13:26:23jamesp quits [Remote host closed the connection]
13:26:24shreyasminocha quits [Remote host closed the connection]
13:27:14jamesp joins
13:27:14evan joins
13:27:14jamesp quits [Changing host]
13:27:14jamesp (jamesp) joins
13:27:39shreyasminocha (shreyasminocha) joins
13:53:54<Doranwen>glad you're saving the Glencoe stuff - I've grabbed *tons* of their downloadable content (workbook pdfs and such) over the years, as some of it is super valuable to teachers, even when using a different company's books to teach from
13:56:31<Doranwen>thuban: ah yes, this is what I mean by the super useful downloadable stuff - I've used these workbooks personally in tutoring students at one point: http://glencoe.mheducation.com/sites/007873830x/student_view0/student_workbooks.html
13:57:05<Doranwen>a few years back I went hunting for every case like that I could find but that was only from browsing through the links they provided, wonder if there are more that weren't linked to there somehow
14:17:14Arcorann quits [Ping timeout: 265 seconds]
14:23:10march_happy quits [Ping timeout: 265 seconds]
14:23:29march_happy (march_happy) joins
14:23:30Megame quits [Client Quit]
15:12:37Shjosan (Shjosan) joins
15:15:00HP_Archivist quits [Client Quit]
15:39:08<Ryz>Weird, https://gamefaqs.gamespot.com/ started rendering all the URLs as 404s, and then finally it gave 503s to indicate it was down... oo;
15:46:21AramZS joins
15:48:44<AramZS>Hey folks @Chronotope on Twitter here. Wanted to note that I proposed adding The Believer's website to the 'Alive... but are they!' watchlist. They appear to have been purchased for their SEO juice by a sex toy company
15:48:47<AramZS>See: https://twitter.com/ST_Collective_/status/1523756595317927936
15:49:43<AramZS>So while presumably they are well covered in the archives already, and the current owner is incentivized to keep them up, if there is a process for double checking their pages are indeed covered and archived it might be worthwhile to do.
15:55:22<Ryz>Hello AramZS, is this the one? https://believermag.com/
15:55:58wyatt8750 quits [Ping timeout: 265 seconds]
15:56:01<AramZS>Yup, that's the site
15:56:14wyatt8740 joins
15:57:49<Ryz>I threw the website into ArchiveBot, AramZS; just trying to figure out if the website has a Twitter account~
15:58:17<AramZS>https://twitter.com/believermag
15:58:22<AramZS>Is their Twitter I think
15:58:27<Ryz>Grand o:
15:58:32<Ryz>Running it through ArchiveBot too~
15:58:55<Ryz>AramZS, thanks for bringing it up to us, means a lot when websites like that could suddenly change how available their content is
15:59:12<Ryz>Please don't hesitate to come back and report websites, or even suggest websites that might be in danger
15:59:35<AramZS>Exactly and awesome. That's a relief. There is a lot of significant authorship who have contributed major essays over the years to that site and while it also has a print version I do believe there is significance to the online form and format and likely some online-only content. Will do!
16:01:20<Ryz>Gonna also archive the website from https://twitter.com/ST_Collective_ - being https://sextoycollective.com/ - just in case
16:24:11HP_Archivist (HP_Archivist) joins
16:50:40<tech234a>Might make sense to make sure iPod-related pages on Apple’s site are archived, Apple appears to have announced that iPod Touches have been discontinued https://www.apple.com/newsroom/2022/05/the-music-lives-on/
17:19:12HP_Archivist quits [Client Quit]
17:39:50sec^nd quits [Remote host closed the connection]
17:41:03sec^nd (second) joins
18:28:34Icyelut|3 quits [Ping timeout: 265 seconds]
19:20:43Craigle quits [Quit: The Lounge - https://thelounge.chat]
19:21:09Craigle (Craigle) joins
19:38:17wyatt8750 joins
19:38:18wyatt8740 quits [Ping timeout: 265 seconds]
19:57:59Void0 (Void0) joins
20:03:57<Void0>Hey Ryz
20:04:12<Ryz>Hello Void0, what forum website are you trying to archive?
20:04:18<Void0>blackpearl.biz
20:04:50<Void0>piracy discussion/sharing forum. shutting down at the end of the month. i tried to back up what i could but didn't have much progress
20:06:11<Ryz>Unfortunately I can't toss it into ArchiveBot since the bot can't go through the website on its own;
20:06:43<Ryz>I recall some people might've been able to archive invite-only forums here, but unsure if they're present right now
20:07:14<Void0>ah okay, thanks and no worries! i'll try and lurk in here to see if anyone responds.
20:07:52<Void0>would it work if i invited you?
20:11:38<Ryz>Invite me? Uhh, I don't really have tools to manually archive websites on my computer; I just usually use ArchiveBot for archiving websites and links in general
20:25:50<Void0>ooh okay! thanks. so ArchiveBot works for sites that are public?
20:53:01<programmerq>Void0: I wouldn't mind tinkering with trying to grab a backup. It wouldn't end up in web.archive.org if I just do the grab myself.
21:01:02Mateon1 quits [Remote host closed the connection]
21:01:16Mateon1 joins
21:32:55LeGoupil joins
21:34:54LeGoupil quits [Client Quit]
21:37:15LeGoupil joins
21:40:43LeGoupil quits [Client Quit]
21:52:53HP_Archivist (HP_Archivist) joins
22:08:32<@arkiver>Void0: feel free to PM me
22:12:29AramZS quits [Ping timeout: 265 seconds]
22:31:26BlueMaxima joins
23:15:39Void0 quits [Client Quit]
23:16:37HP_Archivist quits [Client Quit]
23:19:52HP_Archivist (HP_Archivist) joins
23:29:56phuzion quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
23:30:02phuzion (phuzion) joins
23:33:33march_happy quits [Ping timeout: 265 seconds]
23:34:17march_happy (march_happy) joins
23:42:21Arcorann (Arcorann) joins
23:46:28HP_Archivist quits [Client Quit]
23:50:00<TheTechRobo>Void0: If you're reading logs, feel free to PM me with details
23:51:57march_happy quits [Ping timeout: 252 seconds]
23:52:32jacobk quits [Ping timeout: 265 seconds]
23:52:48march_happy (march_happy) joins
23:57:22march_happy quits [Ping timeout: 265 seconds]
23:58:48march_happy (march_happy) joins