| 00:00:04 | | matoro quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
| 00:01:02 | | DopefishJustin quits [Remote host closed the connection] |
| 00:01:51 | | matoro joins |
| 00:02:50 | | midou joins |
| 00:17:53 | | etnguyen03 (etnguyen03) joins |
| 00:20:39 | <@JAA> | Uh, right |
| 00:26:38 | | DopefishJustin joins |
| 00:26:38 | | DopefishJustin is now authenticated as DopefishJustin |
| 00:39:07 | | chaoticbee (chaoticbee) joins |
| 01:05:32 | | cyanbox joins |
| 01:08:37 | | makeworld quits [Remote host closed the connection] |
| 01:16:07 | <h2ibot> | PaulWise edited ArchiveBot (-145, use variables to make the script shorter, more…): https://wiki.archiveteam.org/?diff=57793&oldid=57658 |
| 01:19:07 | <h2ibot> | PaulWise edited ArchiveBot (-2, use <pre>): https://wiki.archiveteam.org/?diff=57794&oldid=57793 |
| 01:34:41 | <pabs> | justauser|m: can you add that wiki to Deathwatch? |
| 02:14:52 | | ducky quits [Ping timeout: 260 seconds] |
| 02:15:52 | | ducky (ducky) joins |
| 02:50:00 | | andrewnyr quits [Quit: Ping timeout (120 seconds)] |
| 02:57:22 | | midou quits [Ping timeout: 256 seconds] |
| 03:02:22 | | pabs quits [Read error: Connection reset by peer] |
| 03:03:44 | | pabs (pabs) joins |
| 03:05:42 | | midou joins |
| 03:17:55 | | Wohlstand (Wohlstand) joins |
| 03:30:53 | | Island quits [Read error: Connection reset by peer] |
| 03:31:53 | <nicolas17> | agh |
| 03:32:18 | <nicolas17> | https://opensource.samsung.com/uploadSearch?searchValue=SCH-E329I this (and a few others) are supposed to be PDFs |
| 03:32:28 | <nicolas17> | instead they have a magic number of "<## NASCA DRM FILE - VER1.00 ##>" |
| 03:32:48 | <nicolas17> | whatever, let's archive them anyway for completeness and to preserve evidence of samsung's fuckup, right? |
| 03:33:06 | <nicolas17> | "error uploading SCH-E329I.pdf: Uploaded content is unacceptable. - error checking pdf file" |
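A minimal sketch of the check nicolas17 describes: look at the first bytes of each downloaded file and tell a real PDF from one carrying the DRM header. It assumes the quoted string sits at offset 0 of the file. `classify_header` and its labels are hypothetical names, and this is not how IA's own validation works.

```python
PDF_MAGIC = b"%PDF-"
# Header string as quoted in the log; assumed to start at byte 0.
NASCA_MAGIC = b"<## NASCA DRM FILE - VER1.00 ##>"


def classify_header(head: bytes) -> str:
    """Label a file by its leading bytes: 'pdf', 'nasca-drm', or 'unknown'."""
    if head.startswith(PDF_MAGIC):
        return "pdf"
    if head.startswith(NASCA_MAGIC):
        return "nasca-drm"
    return "unknown"


def classify_file(path: str) -> str:
    """Read just enough of the file to cover the longest magic string."""
    with open(path, "rb") as f:
        return classify_header(f.read(len(NASCA_MAGIC)))
```

Running `classify_file` over a download directory before uploading would flag which ".pdf" files are likely to be rejected.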
| 03:45:12 | | PredatorIWD256 joins |
| 03:45:50 | | Guest58 joins |
| 03:46:00 | | Guest58 quits [Client Quit] |
| 03:46:05 | | midou quits [Ping timeout: 272 seconds] |
| 03:47:21 | | PredatorIWD25 quits [Ping timeout: 272 seconds] |
| 03:47:21 | | PredatorIWD256 is now known as PredatorIWD25 |
| 03:54:29 | | midou joins |
| 04:03:15 | | etnguyen03 quits [Remote host closed the connection] |
| 04:22:50 | | Guest58 joins |
| 04:26:31 | | Shard79 quits [Quit: Ping timeout (120 seconds)] |
| 04:26:43 | | Shard79 (Shard) joins |
| 04:33:35 | | nothere quits [Ping timeout: 272 seconds] |
| 04:54:37 | | nothere_ joins |
| 06:00:36 | | Guest58 quits [Client Quit] |
| 06:01:27 | | Guest58 joins |
| 06:06:41 | | Guest58 quits [Ping timeout: 272 seconds] |
| 06:10:59 | | DogsRNice quits [Read error: Connection reset by peer] |
| 06:16:06 | | Guest58 joins |
| 06:35:11 | | midou quits [Ping timeout: 272 seconds] |
| 07:01:19 | <steering> | nicolas17: https://cpcex.sec.samsung.net/Windchill/ext/cpcex/common/gate/jsp/guideDrmEN.jsp#1 amazing |
| 07:01:19 | | Guest58 quits [Client Quit] |
| 07:04:22 | <steering> | some sort of fs filter or something to recognize that header, seems lovely |
| 07:04:43 | | midou joins |
| 07:04:47 | <nicolas17> | I assume IA is just seeing that it's not a valid PDF |
| 07:05:00 | <steering> | oh yeah I mean for samsung's use of it |
| 07:09:32 | | midou quits [Ping timeout: 256 seconds] |
| 07:25:39 | | midou joins |
| 07:39:00 | | midou quits [Ping timeout: 256 seconds] |
| 07:47:25 | | Guest58 joins |
| 07:49:05 | | midou joins |
| 07:52:35 | | twse346865 joins |
| 07:53:43 | <twse346865> | consider hiding the privacy and disclaimer links in the footer from the wiki, by editing the LocalSettings.php file. there's a MediaWiki extension called FooterManager to do this. |
| 07:54:53 | <twse346865> | in the LocalSettings.php file, there are lines for $wgFooterManagerLinks['privacy'] and $wgFooterManagerLinks['disclaimer'], which are set to true. please change them to false! |
| 07:58:54 | <nicolas17> | https://data.nicolas17.xyz/samsung-grab/ 13 files pending, don't let ungeskriptet do everything :p |
| 08:02:48 | | midou quits [Ping timeout: 256 seconds] |
| 08:16:58 | | skyrocket quits [Ping timeout: 256 seconds] |
| 08:18:07 | <twse346865> | InMotion Hosting did a tutorial on editing the footer links in MediaWiki, find it at inmotionhosting dot com/support/edu/mediawiki/edit-footer-mediawiki. |
| 08:21:03 | | twse346865 quits [Client Quit] |
| 08:21:58 | | skyrocket joins |
| 08:23:53 | | twse525053 joins |
| 08:24:18 | <twse525053> | in the Special:Version page from the wiki, I don't have any entry shown for FooterManager. |
| 08:28:52 | | cmlow quits [Ping timeout: 256 seconds] |
| 08:29:51 | | lennier2_ joins |
| 08:30:05 | <twse525053> | the team could have downloaded the FooterManager extension from MediaWiki and disabled the privacy policy and disclaimer links by setting the $wgFooterManagerLinks entries for ['privacy'] and ['disclaimer'] to false. |
| 08:30:44 | <twse525053> | and in the LocalSettings.php file from the MediaWiki installation being used, there is no FooterManager extension line! |
| 08:32:59 | | lennier2 quits [Ping timeout: 272 seconds] |
| 08:35:07 | <twse525053> | taking a look at Wayback Machine snapshots of Archiveteam:Privacy policy and Archiveteam:General disclaimer pages, all the snapshots have the 404 Not Found error code, and they are shown in orange. |
| 08:38:23 | <twiswist> | How do I view the history of an individual file in an Internet Archive upload? I remember there being a page (either like /details/ and /download/ but something else, or appended to the end of the download url) that showed you a log of operations that have been performed on the file, the most interesting of which is whether the file was generated by IA or originally uploaded by the uploader |
| 08:38:48 | <twiswist> | It's not discoverable anywhere (or I'm just overlooking it) but I swear I've stumbled across it before |
| 08:41:09 | <twiswist> | It's supposed to be in my browser history but isn't |
| 08:41:36 | <twse525053> | in the MediaWiki page for Extension:FooterManager, it says that the extension is no longer available for download and has been archived. |
| 08:48:46 | <twse525053> | InMotion Hosting provided the FooterManager extension for download in its support page in a comment from 2013. |
| 08:49:43 | <twse525053> | please add the URL bobsgame dot com (excluded on December 15th, 2012) to the wiki page: "List of websites excluded from the Wayback Machine/Former exclusions". the edits won't go through if I don't create an account! |
| 08:53:56 | | AK (AK) joins |
| 08:56:40 | <twse525053> | the footer links to privacy policy and disclaimer are MediaWiki:Privacy and MediaWiki:Disclaimer. please edit LocalSettings.php file from the MediaWiki installation being used by Archiveteam, to remove the links to privacy policy and disclaimer from the footer! |
| 08:59:43 | | AK quits [Client Quit] |
| 09:01:19 | <chrismrtn> | twiswist: Is the file named like itemIdentifierHere_files.xml what you are looking for? It doesn't show a history, but it does show if a file is original or a derivative (generated by IA) |
| 09:43:48 | <c3manu> | Rince: thanks, will do! :) i think i'd need a more or less complete checklist, even if it would be straightforward for some. |
| 09:44:07 | <c3manu> | masterx244|m: the latter :) |
| 09:48:28 | | twse525053 quits [Client Quit] |
| 10:04:14 | <masterx244|m> | twistwist: you mean the /history/ page showing the task logs? |
| 10:04:22 | <masterx244|m> | *twiswist |
| 10:05:17 | <masterx244|m> | c3manu that makes it much easier than fighting the F5 wars. |
| 10:05:17 | <masterx244|m> | also: waiting for guru3 to open to snatch up my usual DECT extension |
| 10:08:43 | | midou joins |
| 10:34:12 | | ducky quits [Ping timeout: 260 seconds] |
| 10:35:36 | | ducky (ducky) joins |
| 10:38:20 | | szczot3k|t (szczot3k) joins |
| 10:41:06 | | szczot3k|t quits [Client Quit] |
| 10:41:15 | | szczot3k|t (szczot3k) joins |
| 11:17:31 | <katia> | Yayyyy 39c3 |
| 11:17:34 | <katia> | Hype hype hype hype |
| 12:00:02 | | Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat] |
| 12:02:46 | | Bleo182600722719623455222 joins |
| 12:17:41 | <masterx244|m> | letzs hope that we manage the AT nerd meetup this time |
| 12:17:44 | <masterx244|m> | *lets |
| 12:27:51 | | etnguyen03 (etnguyen03) joins |
| 12:51:18 | | etnguyen03 quits [Client Quit] |
| 12:55:31 | | etnguyen03 (etnguyen03) joins |
| 13:07:44 | | ducky quits [Ping timeout: 260 seconds] |
| 13:42:13 | <katia> | Yaaaa |
| 13:46:29 | | chaoticbee quits [Ping timeout: 272 seconds] |
| 14:03:35 | | nine quits [Ping timeout: 272 seconds] |
| 14:07:08 | | nine joins |
| 14:07:08 | | nine is now authenticated as nine |
| 14:07:08 | | nine quits [Changing host] |
| 14:07:08 | | nine (nine) joins |
| 14:20:45 | | AK (AK) joins |
| 15:00:26 | | chaoticbee (chaoticbee) joins |
| 15:21:36 | | ducky (ducky) joins |
| 15:25:29 | | cyanbox quits [Read error: Connection reset by peer] |
| 15:26:33 | | nicolas17 quits [Ping timeout: 272 seconds] |
| 15:33:34 | <h2ibot> | Justauser edited Deathwatch (+343, /* 2026 */ computersciencewiki.org, wormbase.org): https://wiki.archiveteam.org/?diff=57795&oldid=57781 |
| 15:41:25 | | emanuele6 (emanuele6) joins |
| 15:46:44 | | Guest58_ joins |
| 15:46:44 | | Guest58 quits [Read error: Connection reset by peer] |
| 15:50:52 | <c3manu> | masterx244|m: you didn’t call dibs on yours? ;) |
| 15:54:36 | <c3manu> | katia: \o/ |
| 16:00:51 | <emanuele6> | katia: \o/ |
| 16:11:55 | | nicolas17 (nicolas17) joins |
| 16:12:48 | | Wohlstand quits [Quit: Wohlstand] |
| 16:40:52 | | Wohlstand (Wohlstand) joins |
| 17:05:24 | | etnguyen03 quits [Quit: Konversation terminated!] |
| 17:06:34 | | etnguyen03 (etnguyen03) joins |
| 17:09:02 | | Guest58_ quits [Client Quit] |
| 17:11:46 | | ThreeHM quits [Quit: WeeChat 4.7.1] |
| 17:12:35 | | ThreeHM (ThreeHeadedMonkey) joins |
| 17:15:43 | | Snivy quits [Quit: Ping timeout (120 seconds)] |
| 17:15:56 | | Snivy (Snivy) joins |
| 17:16:02 | | Snivy quits [Client Quit] |
| 17:16:53 | | Snivy (Snivy) joins |
| 17:26:19 | | etnguyen03 quits [Client Quit] |
| 17:44:39 | | etnguyen03 (etnguyen03) joins |
| 18:33:03 | | etnguyen03 quits [Client Quit] |
| 18:44:26 | | etnguyen03 (etnguyen03) joins |
| 18:46:40 | | Sluggs quits [Excess Flood] |
| 18:47:02 | <h2ibot> | Manu edited Distributed recursive crawls (+51, Candidates: Add ildb.nadir.org): https://wiki.archiveteam.org/?diff=57796&oldid=57786 |
| 18:48:48 | | skyrocket quits [Read error: Connection reset by peer] |
| 18:51:18 | | Sluggs (Sluggs) joins |
| 18:53:34 | <emanuele6> | 'twasn't me |
| 19:03:57 | | skyrocket joins |
| 19:06:19 | | pokechu22 quits [Read error: Connection reset by peer] |
| 19:08:59 | | pokechu22 (pokechu22) joins |
| 19:23:08 | | epoch joins |
| 19:23:18 | | epoch is now authenticated as epoch |
| 19:25:34 | <epoch> | https://hackaday.com/2025/11/07/oldversion-com-archive-facing-shutdown-due-to-financing-issues/ dunno if anyone has mentioned this yet or if anyone is interested |
| 19:35:53 | <nicolas17> | downloads themselves are POST |
| 19:37:01 | | emanuele6 is now known as Manu |
| 19:38:45 | <@JAA> | Do we know the total size (or even a rough estimate)? |
| 19:39:01 | <nicolas17> | we should probably contact them |
| 19:39:57 | <@JAA> | (The financial troubles have been known for at least a month, by the way.) |
| 19:47:18 | <nicolas17> | hm I think I can do some scraping and get the total size |
| 19:48:37 | <@JAA> | That'd be great. Even an extrapolated estimate is fine. Just to get a sense of the scale. |
| 19:49:34 | <nicolas17> | something to note |
| 19:49:36 | <nicolas17> | <span class="viewmore clickable" onclick="getpage('/windows/software/office/')"> |
| 19:49:53 | <nicolas17> | function getpage(page) { window.location = page; } |
| 19:49:54 | <nicolas17> | sir have you heard of normal links |
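Because these "links" live in `onclick` handlers rather than `href` attributes, a link extractor that only follows `<a href>` would miss them. A rough sketch of pulling the targets out with a regular expression, assuming every call has the single-quoted form shown above; `extract_getpage_links` is a hypothetical helper.

```python
import re
from urllib.parse import urljoin

# Matches getpage('...') calls such as the one in the onclick attribute above.
GETPAGE_RE = re.compile(r"getpage\(\s*'([^']+)'\s*\)")


def extract_getpage_links(html: str, base_url: str) -> list[str]:
    """Return absolute URLs for every getpage('...') target found in html."""
    return [urljoin(base_url, path) for path in GETPAGE_RE.findall(html)]


sample = '''<span class="viewmore clickable" onclick="getpage('/windows/software/office/')">'''
print(extract_getpage_links(sample, "http://www.oldversion.com/"))
# prints ['http://www.oldversion.com/windows/software/office/']
```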
| 19:50:33 | <@JAA> | I'd think it can't be huge. 30k versions, maybe tens of MB per version on average. That'd put it at the scale of a terabyte or a couple. |
| 19:51:41 | <masterx244|m> | but POST messes with wayback-ability depending on how the URLs work. in the worst case a static-item form needs to be derived from the downloaded WARCs |
| 19:52:07 | <nicolas17> | I have little hope for WBM replay due to that POST, yeah... |
| 19:53:17 | <masterx244|m> | thx for the reminder. had to check a set of currently running zip uploads if something failed |
| 19:53:30 | <@JAA> | I think it can work, although it'd be annoying to navigate. But let's not worry about that for now. |
| 19:55:22 | | Wohlstand quits [Quit: Wohlstand] |
| 19:58:24 | <nicolas17> | hm do I remember how to use bs4 |
| 19:59:08 | <@JAA> | grep :-) |
| 20:14:33 | | Manu is now known as emanuele6 |
| 20:18:00 | <nicolas17> | ok I have a list of all apps, they are indeed 1963 |
| 20:21:59 | | NeonGlitch (NeonGlitch) joins |
| 20:22:36 | | ljcool2006 quits [Quit: Leaving] |
| 20:26:55 | | Aoede_ quits [Read error: Connection reset by peer] |
| 20:29:06 | <@JAA> | Is the sitemap complete? |
| 20:29:22 | <nicolas17> | I didn't think to check if there even was a sitemap >_< |
| 20:29:29 | | Aoede (Aoede) joins |
| 20:29:40 | <@JAA> | Heh |
| 20:29:50 | <nicolas17> | I'm fetching every app to get the list of versions now, so far it looks like your 1TB estimate was pretty good |
| 20:30:15 | <nicolas17> | 25% fetched, 869 GB extrapolated |
| 20:31:57 | <nicolas17> | oh it will actually be smaller due to android being smaller files |
| 20:32:17 | <@JAA> | Ah |
| 20:32:31 | <katia> | nicolas17, you know there's a warc right |
| 20:32:32 | <nicolas17> | I went in order so I got all Windows first |
| 20:32:35 | <nicolas17> | katia: the what |
| 20:32:42 | <katia> | nothing |
| 20:32:48 | <@JAA> | Also, I've forgotten to check for sitemaps often enough. :-) |
| 20:33:14 | <nicolas17> | katia: WHERE |
| 20:33:43 | <@JAA> | There's an AB job (which obviously didn't grab any of the actual software). |
| 20:34:47 | <emanuele6> | it's not me |
| 20:35:09 | <nicolas17> | my script reached http://www.oldversion.com/android/com-flipkart-android/ and crashed T_T |
| 20:35:25 | <masterx244|m> | and that's usually the time when some WARC-parsing can be necessary to figure out the final stage of requests |
| 20:35:42 | <@JAA> | What's the last estimate? That'll be good enough. |
| 20:36:17 | <emanuele6> | I guess around 1123GB |
| 20:36:27 | <emanuele6> | we'll see how close I was |
| 20:37:14 | <nicolas17> | I fetched 1172 apps, 21909 app versions, 350 GB |
| 20:38:24 | <katia> | nicolas17, you don't get ratelimited? |
| 20:38:37 | <nicolas17> | now I'm getting slowdowns and 502s |
| 20:38:44 | <katia> | ah |
| 20:42:36 | <nicolas17> | yeah this looks like 500 gigs |
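The size estimates in this exchange are plain linear extrapolations from a partial crawl. A small sketch of that arithmetic, using the figures quoted above (1172 of 1963 apps fetched, 350 GB seen); `extrapolate_total` is a hypothetical name, and the method assumes the apps fetched so far are representative of the rest.

```python
def extrapolate_total(observed_size: float, items_seen: int, items_total: int) -> float:
    """Scale the size observed so far up to the full item count."""
    if items_seen <= 0:
        raise ValueError("need at least one fetched item to extrapolate")
    return observed_size * items_total / items_seen


# Figures from the log: 350 GB across 1172 of 1963 apps.
estimate_gb = extrapolate_total(350, 1172, 1963)
print(f"{estimate_gb:.0f} GB")
# prints 586 GB
```

The naive figure comes out above the "500 gigs" guess, which is consistent with the earlier remark that the Windows apps were fetched first and the remaining Android files are smaller. A per-platform extrapolation would reduce that bias.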
| 20:44:10 | <katia> | i'm going to guess 263278956.29 KB |
| 20:44:49 | <@JAA> | Because you have a copy already? |
| 20:44:56 | <katia> | no i parsed the warc |
| 20:45:00 | <@JAA> | Ah :-D |
| 20:45:47 | <katia> | https://vyxg5mxrl.i.katia.sh/2025-11-10-oldversion.com-warc-parse.py.txt |
| 20:46:47 | <nicolas17> | well how come I got 514322400 KB |
| 20:46:53 | | arch_ (arch) joins |
| 20:47:01 | | arch quits [Ping timeout: 272 seconds] |
| 20:47:04 | | arch_ is now known as arch |
| 20:47:09 | <katia> | dunno i can't read the code in your computer trivially at the moment |
| 20:49:50 | <nicolas17> | https://transfer.archivete.am/inline/Xc00w/oldversion.com-appversions.txt |
| 20:49:59 | <@JAA> | Well, 'hundreds of GB, probably under a TB' is fine enough for me. :-) |
| 20:51:29 | <katia> | ah i omitted android |
| 20:52:10 | <katia> | and mac i wrote macos i guess |
| 20:54:20 | <katia> | 517673105.54 KB |
| 21:03:28 | <twiswist> | chrismrtn: Yes, that contains what I was looking for (derivative vs original), thank you! |
| 21:14:01 | | etnguyen03 quits [Client Quit] |
| 21:14:06 | | nine quits [Quit: See ya!] |
| 21:14:20 | | nine joins |
| 21:14:20 | | nine is now authenticated as nine |
| 21:14:20 | | nine quits [Changing host] |
| 21:14:20 | | nine (nine) joins |
| 21:21:19 | | ducky quits [Read error: Connection reset by peer] |
| 21:32:06 | | ducky (ducky) joins |
| 22:01:43 | | etnguyen03 (etnguyen03) joins |
| 22:09:01 | <@arkiver> | not sure how the rate limiting of oldversion is |
| 22:09:08 | <@arkiver> | do we need a warrior project? or can AB handle it? |
| 22:09:24 | | @arkiver would be happy to set up a warrior project if that is what we need |
| 22:09:54 | <emanuele6> | no war |
| 22:10:03 | <@JAA> | I think it should be feasible without DPoS. |
| 22:10:11 | <@JAA> | AB can't do it though due to POST. |
| 22:10:37 | <@JAA> | I'll look at it more closely sometime this week. |
| 22:15:03 | | wickedplayer494 quits [Ping timeout: 272 seconds] |
| 22:16:50 | <@arkiver> | JAA: alright! |
| 22:16:55 | <@arkiver> | if needed though, i'm happy to make one :) |
| 22:17:10 | <@arkiver> | at under a TB though, it's definitely not needed for size, maybe for IPs |
| 22:33:22 | <nicolas17> | hmm replay might work |
| 22:34:05 | <nicolas17> | the website sends a POST, but all the data is in the URL (no body) and if you do a GET to the same URL it still works |
| 22:34:34 | <nicolas17> | what does WBM do if you send a POST to an archived URL? |
| 22:34:51 | <nicolas17> | error, or pretends it's a GET and returns the archived response? |
| 22:36:27 | <nicolas17> | looks like the latter! so this might Just Work for replay! |
| 22:42:31 | <nicolas17> | actually there are POST fields, but the server doesn't seem to care, the data in the URL is enough |
| 22:45:33 | <nicolas17> | URL data has the current timestamp, so it's likely it expires, but I don't know what the expiration is |
| 22:53:00 | <@arkiver> | nicolas17: should work then |
| 22:53:15 | <@arkiver> | i believe (didn't check now) that it just returns data to a POST as if it were a GET |
| 22:54:21 | <@arkiver> | another strategy i've used sometimes is that most POST requests don't check any "extra" parameters you add, so you can add identifiable information in there, so the POSTed URLs can still be found/looked up manually (or with customization in the Wayback Machine at some point) |
| 22:57:19 | <nicolas17> | oh that should work without needing extra params |
| 22:57:49 | <nicolas17> | the URL will be unique |
| 22:57:57 | <nicolas17> | what I was checking is if you can just follow links in a browser and get a working download, and I *think* you can |
| 22:58:38 | <@arkiver> | sounds good :) |
| 23:00:36 | | Guest58 joins |
| 23:00:59 | | nine quits [Client Quit] |
| 23:01:12 | | nine joins |
| 23:01:12 | | nine is now authenticated as nine |
| 23:01:12 | | nine quits [Changing host] |
| 23:01:12 | | nine (nine) joins |
| 23:02:18 | <nicolas17> | http://www.oldversion.com/windows/winrar-2-00 download button does a POST (with a timestamp and signature) to http://www.oldversion.com/windows/download/winrar-2-00, then that page does a POST to http://software.oldversion.com/download.php?f=<base64 data> which gets the actual file, the timestamp is in the base64 data too |
| 23:03:06 | <nicolas17> | that timestamp/signature seems to expire quickly, so those 3 should be done sequentially |
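To study how quickly the final URL expires, it helps to look at what the `f` parameter actually carries. A hedged sketch that extracts and base64-decodes it. The internal layout of the decoded bytes is not known from this log, so the helper returns them raw; `decode_f_param` is a hypothetical name and the example payload is made up.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def decode_f_param(url: str) -> bytes:
    """Return the raw bytes carried in the `f` query parameter of a download URL."""
    token = parse_qs(urlsplit(url).query)["f"][0]
    # parse_qs turns a literal '+' into a space; undo that before decoding.
    token = token.replace(" ", "+")
    # Restore any stripped '=' padding so b64decode accepts the token.
    token += "=" * (-len(token) % 4)
    return base64.b64decode(token)


# Made-up payload purely for illustration; the real format is unknown.
example_payload = b"winrar-2-00|1762815738"
example_url = (
    "http://software.oldversion.com/download.php?f="
    + base64.b64encode(example_payload).decode("ascii")
)
print(decode_f_param(example_url))
```

Comparing the decoded bytes from two requests made a few seconds apart would show which part is the timestamp and give a starting point for measuring the expiry window.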
| 23:10:47 | | wickedplayer494 joins |
| 23:10:57 | | wickedplayer494 is now authenticated as wickedplayer494 |