| 00:30:14 | | Arcorann (Arcorann) joins |
| 01:00:01 | | dm4v quits [Client Quit] |
| 01:06:18 | | dm4v joins |
| 01:06:20 | | dm4v is now authenticated as dm4v |
| 01:06:20 | | dm4v quits [Changing host] |
| 01:06:20 | | dm4v (dm4v) joins |
| 02:04:07 | | dm4v quits [Ping timeout: 252 seconds] |
| 02:04:16 | | dm4v_ joins |
| 02:04:42 | | dm4v_ is now known as dm4v |
| 02:04:42 | | dm4v is now authenticated as dm4v |
| 02:04:42 | | dm4v quits [Changing host] |
| 02:04:42 | | dm4v (dm4v) joins |
| 02:17:31 | | Atom-- joins |
| 02:19:41 | | Atom quits [Ping timeout: 265 seconds] |
| 03:14:01 | | fuzzy8021 quits [Read error: Connection reset by peer] |
| 03:15:02 | | fuzzy8021 (fuzzy8021) joins |
| 03:15:37 | | @dxrt quits [Ping timeout: 252 seconds] |
| 03:15:55 | | dxrt joins |
| 03:15:57 | | dxrt is now authenticated as dxrt |
| 03:15:57 | | dxrt quits [Changing host] |
| 03:15:57 | | dxrt (dxrt) joins |
| 03:15:57 | | @ChanServ sets mode: +o dxrt |
| 03:18:10 | | ThreeHM quits [Ping timeout: 265 seconds] |
| 03:19:58 | | ThreeHM (ThreeHeadedMonkey) joins |
| 03:24:30 | | shoghicp quits [Read error: Connection reset by peer] |
| 03:24:48 | | shoghicp (shoghicp) joins |
| 03:34:17 | | mutantmonkey quits [Remote host closed the connection] |
| 03:34:23 | | mutantmnky (mutantmonkey) joins |
| 03:44:48 | | sonick quits [Client Quit] |
| 03:48:17 | | fuzzy8021 quits [Read error: Connection reset by peer] |
| 03:49:24 | | fuzzy8021 (fuzzy8021) joins |
| 03:55:52 | | katocala quits [Ping timeout: 265 seconds] |
| 03:56:03 | | katocala joins |
| 04:00:43 | | katocala quits [Ping timeout: 252 seconds] |
| 04:01:00 | | katocala joins |
| 04:07:37 | | fuzzy8021 quits [Read error: Connection reset by peer] |
| 04:08:30 | | fuzzy8021 (fuzzy8021) joins |
| 04:10:04 | | katocala quits [Ping timeout: 252 seconds] |
| 04:10:14 | | katocala joins |
| 04:18:45 | | qw3rty_ joins |
| 04:22:27 | | qw3rty__ quits [Ping timeout: 265 seconds] |
| 04:25:43 | | fuzzy8021 quits [Read error: Connection reset by peer] |
| 04:27:40 | | fuzzy8021 (fuzzy8021) joins |
| 04:27:46 | | HP_Archivist quits [Ping timeout: 265 seconds] |
| 04:40:53 | | fuzzy8021 quits [Read error: Connection reset by peer] |
| 04:41:45 | | fuzzy8021 (fuzzy8021) joins |
| 04:53:23 | | katocala quits [Ping timeout: 265 seconds] |
| 04:53:45 | | katocala joins |
| 05:01:58 | | fuzzy8021 quits [Read error: Connection reset by peer] |
| 05:02:25 | | fuzzy8021 (fuzzy8021) joins |
| 05:04:44 | | Jonboy345 joins |
| 05:07:53 | | Jonboy3451 quits [Ping timeout: 265 seconds] |
| 05:12:21 | | qwertyasdfuiopghjkl quits [Client Quit] |
| 05:18:13 | | katocala is now authenticated as katocala |
| 05:19:58 | | russss quits [Ping timeout: 265 seconds] |
| 05:20:56 | | Ctrl-S quits [Ping timeout: 265 seconds] |
| 05:20:56 | | Dallas quits [Ping timeout: 265 seconds] |
| 05:20:56 | | mgrandi quits [Ping timeout: 265 seconds] |
| 05:21:02 | | Ctrl-S joins |
| 05:21:26 | | jrwr__ joins |
| 05:22:24 | | Dallas (Dallas) joins |
| 05:22:25 | | mgrandi (mgrandi) joins |
| 05:22:41 | | russss (russss) joins |
| 05:23:51 | | jrwr_ quits [Ping timeout: 622 seconds] |
| 05:23:51 | | jrwr__ is now known as jrwr_ |
| 06:19:59 | | ranr joins |
| 06:37:32 | | ranr quits [Remote host closed the connection] |
| 07:24:07 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 07:32:50 | | ghuntley joins |
| 07:33:23 | <ghuntley> | Hey folks, how can we kick off an official project to archive microsoft ch9 before it goes offline? |
| 07:33:37 | <ghuntley> | FYI: youtube-dl supports downloading the videos |
| 07:33:45 | <ghuntley> | How can I help? |
| 07:36:11 | <ghuntley> | https://www.zdnet.com/article/microsoft-is-folding-channel-9-into-its-learn-portal/ |
| 07:36:29 | <ghuntley> | > Most of the video content published on or after November 1, 2017, will be automatically migrated |
| 07:36:50 | <ghuntley> | Basically, content from before 2017 (all the valuable stuff with engineers and key compsci people) is at risk. |
| 07:38:31 | <ghuntley> | Content after 2017 is questionable quality when compared to the timeless stuff from 2009 (https://channel9.msdn.com/Shows/Going+Deep/Expert-to-Expert-Brian-Beckman-and-Erik-Meijer-Inside-the-NET-Reactive-Framework-Rx) |
| 07:43:38 | | fuzzy8021 quits [Read error: Connection reset by peer] |
| 07:46:01 | | fuzzy8021 (fuzzy8021) joins |
| 08:15:10 | <ghuntley> | In theory is the approach to fork https://github.com/ArchiveTeam/youtube-grab and specialise it? |
| 08:16:21 | | RJHacker96492 quits [Ping timeout: 258 seconds] |
| 08:30:12 | | nepeat joins |
| 08:31:09 | | nepeat is now known as RJHacker88623 |
| 08:57:17 | <AK> | Hi ghuntley, just woken up and seen your tweets. You're correct in that the most likely method is going to be a customised version of youtube-grab |
| 08:58:17 | <AK> | Just taking a look through the logs and I think JAA did an AB job of https://channel9.msdn.com/ back in 2020 |
| 09:04:22 | <ghuntley> | I’ve got 2 TB of space on my bare metal machine and have started a grab-site job. Any idea how big that previous job was? I don’t think 2 TB will be enough due to all the video content. |
| 09:06:08 | <AK> | Hmm, looks like there were two jobs: 775489.25 MiB and 4453392.02 MiB |
| 09:06:14 | <AK> | So pretty large haha |
| 09:06:39 | <AK> | I think this is going to be too big for an AB job again this time |
| 09:06:55 | <ghuntley> | AB? |
| 09:08:32 | <AK> | ArchiveBot, dedicated grab machines essentially: https://wiki.archiveteam.org/index.php/ArchiveBot |
| 09:08:52 | <ghuntley> | My Lua skills are non-existent (been a long time since my World of Warcraft days) but happy to help out where I can with promotion and whatever. |
| 09:09:03 | <AK> | Works great for smaller sites, or those without a deadline, but as it's a single machine (for each job) it'll probably take too long this time |
| 09:09:38 | <AK> | I'm too young to have ever learnt lua, I just try to help out with server capacity mainly |
| 09:10:11 | <ghuntley> | Times like this I’m sad I quit the Microsoft MVP program; would have been able to use $13,000USD in credits to help here. |
| 09:11:14 | <ghuntley> | Okay, so if AB isn’t going to help, that means we either need a machine with lots of space + grab-site, or do a warrior project? |
| 09:12:41 | <AK> | I think it's probably going to be a warrior project yeah |
| 09:16:20 | <h2ibot> | AK edited Deathwatch (+204, Add Channel9): https://wiki.archiveteam.org/?diff=47753&oldid=47686 |
| 09:52:37 | <ghuntley> | So we got circa 25 days to archive it all |
| 09:53:22 | | h3ndr1k quits [Quit: ] |
| 10:01:36 | | h3ndr1k (h3ndr1k) joins |
| 10:03:27 | | driib7 quits [Client Quit] |
| 10:03:46 | | driib7 (driib) joins |
| 11:06:02 | | mutantmnky quits [Remote host closed the connection] |
| 11:06:48 | | mutantmnky (mutantmonkey) joins |
| 11:07:10 | | sonick (sonick) joins |
| 12:19:30 | | Arcorann quits [Ping timeout: 265 seconds] |
| 13:31:41 | <@arkiver> | if those archivebot jobs were able to get the videos, it may be just enough to run it through archivebot again |
| 13:34:57 | <monika> | https://nopy.to/ which is a file host seemingly mostly used to host (pirated?) adult games went down 2 days ago due to payment processor issues |
| 13:35:05 | <monika> | https://old.reddit.com/r/Piracy/comments/qmgjgl/nopyto_is_going_death/ |
| 13:35:11 | <AK> | Alright, I'll start a run on ak-was-here and we can see how it goes |
| 13:39:11 | <@arkiver> | thanks AK |
| 13:45:00 | | Gereon62 quits [Read error: Connection reset by peer] |
| 13:45:12 | | Gereon62 (Gereon) joins |
| 13:46:01 | | HP_Archivist (HP_Archivist) joins |
| 13:59:49 | | wizards quits [Ping timeout: 258 seconds] |
| 14:01:34 | | wizards joins |
| 14:11:52 | <@arkiver> | the archivebot job seems to be getting mp4s |
| 14:15:59 | | wizards quits [Ping timeout: 265 seconds] |
| 14:17:27 | | sec^nd quits [Ping timeout: 258 seconds] |
| 14:17:46 | | wizards joins |
| 14:20:27 | | sec^nd (second) joins |
| 15:21:06 | | IDK quits [Quit: Connection closed for inactivity] |
| 15:40:05 | | tzt quits [Ping timeout: 265 seconds] |
| 16:38:08 | | TheTechRobo quits [Ping timeout: 258 seconds] |
| 16:50:06 | | TheTechRobo (TheTechRobo) joins |
| 17:18:27 | | lunik1 quits [Quit: :x] |
| 17:36:32 | | tzt (tzt) joins |
| 17:54:59 | | lunik1 joins |
| 18:01:26 | | qwertyasdfuiopghjkl joins |
| 18:10:24 | | sec^nd quits [Remote host closed the connection] |
| 18:10:50 | | sec^nd (second) joins |
| 18:48:41 | | qwertyasdfuiopghjkl93 joins |
| 18:50:18 | | qwertyasdfuiopghjkl quits [Ping timeout: 244 seconds] |
| 18:50:52 | | qwertyasdfuiopghjkl93 is now known as qwertyasdfuiopghjkl |
| 18:55:09 | | sonick quits [Client Quit] |
| 19:47:07 | | wizards quits [Ping timeout: 258 seconds] |
| 19:48:58 | | wizards joins |
| 19:56:19 | | wizards quits [Ping timeout: 258 seconds] |
| 19:58:10 | | wizards joins |
| 19:58:23 | <h2ibot> | Hyperrobbe edited List of websites excluded from the Wayback Machine (+32, www.systutorials.com – a Linux command…): https://wiki.archiveteam.org/?diff=47754&oldid=47731 |
| 20:00:09 | | TheTechRobo quits [Ping timeout: 258 seconds] |
| 20:00:23 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=47755&oldid=47754 |
| 20:05:29 | | TheTechRobo (TheTechRobo) joins |
| 20:07:07 | <ghuntley> | Morning folks. Okay, we need to kick off a job to archive a Bugzilla. Just got a tip-off that https://bugzilla.xamarin.com/ will be turned off in less than 30 hours. It contains the history of the development of Mono. |
| 20:07:53 | <ghuntley> | @arkiver: ^^ |
| 20:30:21 | | Atom__ joins |
| 20:33:53 | | Atom-- quits [Ping timeout: 258 seconds] |
| 20:50:04 | <ThreeHM> | Interesting site; went read only in 2019 and has been converted into static HTML. Old links to the original bugzilla pages are being redirected through JS in their 404 page (https://bugzilla.xamarin.com/404.html). There is a copy on github (https://github.com/xamarin/bugzilla-archives), but that seems to be missing attachments. |
| 20:58:07 | <@OrIdow6> | ThreeHM: I am fairly sure that the subdomain is just GH pages for that repo |
| 20:58:58 | <ghuntley> | “Just a heads up, https://github.com/xamarin/bugzilla-archives |
| 20:58:58 | <ghuntley> | The Xamarin Bugzilla archives (http://bugzilla.xamarin.com) are gonna be taken down soon, probably as early as Monday. Apparently, there are some tokens user accidentally posted within the archives from 2013, and security flagged the repo. Instead of clearing it, the Release Engineering team wants to delete it since it could have more tokens or other personal info.” |
| 21:00:40 | <ThreeHM> | OrIdow6: Yeah, seems to be. The DNS record points to xamarin.github.io. |
| 21:00:47 | <@OrIdow6> | ghuntley: You said "less than 30 hours", are you just inferring that from "as early as Monday"? And is the source of this text public? |
| 21:04:59 | <@OrIdow6> | A bit on attachments here https://github.com/xamarin/bugzilla-archives/issues/11 |
| 21:10:17 | | BlueMaxima joins |
| 21:11:10 | <ghuntley> | Source of the text is not public and I cannot reveal my sources apart from saying it’s from folks within Microsoft. |
| 21:11:53 | <ghuntley> | By “less than 30 hours”, I am inferring from “as early as Monday”, so it could be sooner. |
| 21:18:32 | | qwertyasdfuiopghjkl95 joins |
| 21:21:10 | | qwertyasdfuiopghjkl quits [Ping timeout: 244 seconds] |
| 21:30:34 | <ghuntley> | Can we kick off an AB job that does https://bugzilla.xamarin.com/ plus all links to https://xamarinbugzillaarchives.blob.core.windows.net |
| 21:32:18 | | sonick (sonick) joins |
| 21:37:14 | <@OrIdow6> | It sounds like there is not a way to avoid grabbing "tokens or other personal info" |
| 21:37:20 | <@JAA> | Running now |
| 21:39:34 | <@OrIdow6> | The attachments seem to happen in a JS redirect |
| 21:40:01 | <@OrIdow6> | But since this site is served from a public git repo, it should be easy to make a list |
| 21:41:03 | <@JAA> | Have an example with an attachment? |
| 21:42:10 | <@JAA> | Nevermind, found one: https://bugzilla.xamarin.com/55/55721/bug.html |
| 21:50:11 | <ThreeHM> | We might also want to grab the original bugzilla URLs (https://bugzilla.xamarin.com/show_bug.cgi?id=XYZ ; https://bugzilla.xamarin.com/show_activity.cgi?id=XYZ) for all bug IDs to keep external links intact. Those are also redirected through JS. |
| 21:52:23 | <@JAA> | Attachments are running now. For reference, this is how I generated the list: git grep -F '/attachment.cgi' | grep -Po 'href="\K[^"]+' | sed 's,&amp;,\&,g' | ~/little-things/uniqify | grep -Po '^https://bugzilla\.xamarin\.com/attachment\.cgi\?id=\K\d+&file=[^&]+$' | sed 's,&file=, ,' | awk '$1 < 10 { print "0/" $1 "/" $2; } $1 >= 10 { print substr($1, 1, 2) "/" $1 "/" $2; }' | sed |
| 21:52:29 | <@JAA> | 's,^,https://xamarinbugzillaarchives.blob.core.windows.net/attachments/,' |
| 21:57:43 | <@JAA> | ThreeHM: Yup, running as well now. |
| 22:24:48 | | qwertyasdfuiopghjkl joins |
| 22:26:47 | | qwertyasdfuiopghjkl95 quits [Ping timeout: 244 seconds] |
| 22:32:25 | <@JAA> | Also dumped all their GitHub repos as bundles because why not: https://archive.org/details/github.com_xamarin_bundles_20211106 (still uploading) |
| 22:59:23 | <@JAA> | Attachments should all be covered. Had to handle three separately because they fucked up the filename conversion (+ was converted to space), all others seem to have worked. |
| 23:02:43 | | AlsoHP_Archivist joins |
| 23:06:27 | | HP_Archivist quits [Ping timeout: 258 seconds] |
| 23:56:16 | <ghuntley> | Thanks so much |
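
For readers who want to follow JAA's 21:52 one-liner without a shell at hand, the mapping it performs can be sketched in Python roughly as below. This is an illustration and not the command that was actually run. It assumes the first `sed` stage turns the HTML entity `&amp;` back into `&`, and that `~/little-things/uniqify` drops duplicates while keeping first-seen order. The function names and the sample hrefs in the usage note are hypothetical. The base URL and the directory rule (IDs below 10 under `0/`, all others under their first two digits) come from the `awk` and final `sed` stages in the log.

```python
import re

# Base URL taken from the last sed stage of the logged pipeline.
ATTACHMENT_BASE = "https://xamarinbugzillaarchives.blob.core.windows.net/attachments/"

# Same pattern as the second grep: keep only id=<digits>&file=<name> links.
ATTACHMENT_HREF = re.compile(
    r"^https://bugzilla\.xamarin\.com/attachment\.cgi\?id=(\d+)&file=([^&]+)$"
)


def attachment_blob_url(href):
    """Map one already-decoded attachment.cgi href to its blob URL, or None."""
    match = ATTACHMENT_HREF.match(href)
    if match is None:
        return None
    attachment_id, filename = match.groups()
    # awk stage: IDs below 10 go under "0/", all others under their first two digits.
    prefix = "0" if int(attachment_id) < 10 else attachment_id[:2]
    return f"{ATTACHMENT_BASE}{prefix}/{attachment_id}/{filename}"


def attachment_blob_urls(raw_hrefs):
    """Decode entities, deduplicate in first-seen order, and map matching hrefs."""
    # Assumption: hrefs in the static HTML carry "&amp;" where the URL has "&".
    decoded = (h.replace("&amp;", "&") for h in raw_hrefs)
    unique = dict.fromkeys(decoded)  # order-preserving dedupe (assumed uniqify behaviour)
    urls = (attachment_blob_url(h) for h in unique)
    return [u for u in urls if u is not None]
```

With a made-up href such as `https://bugzilla.xamarin.com/attachment.cgi?id=12345&amp;file=log.txt`, `attachment_blob_urls` would return a single URL ending in `attachments/12/12345/log.txt`. The three filenames JAA mentions at 22:59, where `+` had been converted to a space, would still need handling by hand.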
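
As a rough check on the 09:04 to 09:06 exchange about disk space, the two earlier ArchiveBot job sizes AK quotes can be converted and compared against the 2 TB ghuntley mentions. The sketch below assumes that "2 TB" means decimal terabytes (10^12 bytes) and that a fresh grab would be about the size of the two old jobs combined. Neither assumption is confirmed in the log, and the variable names are illustrative.

```python
MIB = 1024 ** 2  # bytes per mebibyte
TB = 10 ** 12    # bytes per decimal terabyte (assumed meaning of "2 TB")

# Sizes of the two earlier ArchiveBot jobs, as quoted by AK, in MiB.
previous_jobs_mib = [775489.25, 4453392.02]


def mib_to_tb(mib):
    """Convert mebibytes to decimal terabytes."""
    return mib * MIB / TB


total_tb = sum(mib_to_tb(size) for size in previous_jobs_mib)
disk_tb = 2.0  # space ghuntley says is available

print(f"previous jobs: {total_tb:.2f} TB, available: {disk_tb:.2f} TB")
print("fits" if total_tb <= disk_tb else "does not fit")
```

Under those assumptions the two jobs add up to about 5.5 TB, well over twice the available space, which is consistent with AK's "pretty large" and ghuntley's doubt that 2 TB would be enough.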