| 00:04:36 | | etnguyen03 quits [Client Quit] |
| 00:11:23 | | arch quits [Remote host closed the connection] |
| 00:11:35 | | arch (arch) joins |
| 00:32:43 | | APOLLO03 joins |
| 00:46:08 | | notarobot17 quits [Ping timeout: 256 seconds] |
| 01:04:16 | | APOLLO03 quits [Ping timeout: 256 seconds] |
| 01:08:06 | | notarobot17 joins |
| 01:18:46 | | bug quits [Quit: Leaving] |
| 01:24:39 | | Wohlstand quits [Quit: Wohlstand] |
| 01:28:23 | | azalea_sh_ quits [Ping timeout: 272 seconds] |
| 01:28:36 | | azalea_sh_ (azalea_sh_) joins |
| 01:39:47 | | notarobot17 quits [Ping timeout: 272 seconds] |
| 01:49:46 | | AS54591 joins |
| 01:53:52 | | corentin0 joins |
| 01:54:28 | | corentin quits [Read error: Connection reset by peer] |
| 01:54:28 | | corentin0 is now known as corentin |
| 02:13:13 | | sec^nd quits [Remote host closed the connection] |
| 02:21:14 | | cyanbox joins |
| 02:25:49 | | etnguyen03 (etnguyen03) joins |
| 02:27:31 | | sec^nd (second) joins |
| 03:04:31 | | Webuser105657 joins |
| 03:06:06 | <Webuser105657> | Hi, I would like to request the archiving of an important trove of legal / legislative materials regarding the former British colony of Newfoundland, a significant portion of which seems to be not yet archived. |
| 03:08:06 | <pokechu22> | Webuser105657: We can probably do that (though saving full laws is often too big with how they present previous versions) |
| 03:09:00 | <pokechu22> | What's the site? |
| 03:09:34 | <Webuser105657> | The collection starts here: |
| 03:09:34 | <Webuser105657> | https://dai.mun.ca/digital/statutesnl/ |
| 03:09:34 | <Webuser105657> | Many of those links there are to pages in the subdomain collections.mun.ca, eventually leading to PDF files in the dai.mun.ca subdomain, which are important documents that I think should be protected by archiving. (Only a portion have been archived at all.) |
| 03:09:34 | <Webuser105657> | Other links go to webpages and then PDF files in the subdomain assemblynl.inmagic.com and in the resource www.assembly.nl.ca/legislation (on which some links/PDFs have previously been archived and some have not). |
| 03:10:04 | | Mateon1 quits [Ping timeout: 256 seconds] |
| 03:10:32 | | Mateon1 joins |
| 03:12:15 | | cultpony quits [Ping timeout: 272 seconds] |
| 03:13:39 | | cultpony (cultpony) joins |
| 03:14:10 | <pokechu22> | https://collections.mun.ca/ is ContentDM, which I have tooling to save (but looking at https://collections.mun.ca/sitemap.xml it's also *really* big - each entry listed there is ~50k URLs, so that's ~4 million pages, each of which has several other subpages) |
| 03:14:56 | <pokechu22> | I can make a list for archivebot, but it'll probably take several months to actually finish |
| 03:18:31 | <Webuser105657> | My personal concern is mostly about the legal and legislative materials that are linked from the start page (https://dai.mun.ca/digital/statutesnl/), which is probably only a tiny fraction of the overall DAI.mun.ca materials. |
| 03:18:49 | <Webuser105657> | Longer-term, I'm sure that scholars and researchers who focus on other areas of Newfoundland history and culture would probably appreciate having the entire site archived, just in case. But I definitely understand how that could take months to be archived. |
| 03:19:03 | | gosc joins |
| 03:23:04 | <pokechu22> | I'll see if I can modify my script to save only the statutesnl collection on https://collections.mun.ca/ |
| 03:24:48 | <Webuser105657> | Thank you. And thank you for what I'm sure is a tremendous amount of volunteer labour for all of the archiving efforts. |
| 03:25:36 | <pokechu22> | https://assemblynl.inmagic.com/Presto/content/AdvancedSearch.aspx?ctID=MDQ2Yzk1MjctMTgxNC00ZWRlLTk0NGUtMDg4NTc4MzgwMWVi looks like it's largely a POST-based search which can't be saved directly, but if they're linked from https://assembly.nl.ca/legislation/ or something like that then I can just save https://assembly.nl.ca/ |
| 03:26:57 | <pokechu22> | oh, or actually it seems like all of the 1969 statutes are links to pages in https://www.assembly.nl.ca/ArchivedStatutes/SN1969.pdf |
| 03:37:51 | <pokechu22> | I started an archivebot job for https://transfer.archivete.am/inline/oi6cG/www.assembly.nl.ca_ArchivedStatutes.txt - I *think* the only valid files are from 1833-1970 but it's easy enough to save the whole range |
| 03:40:17 | <pokechu22> | I've started a second archivebot job for all of https://assembly.nl.ca/ / https://www.assembly.nl.ca/; you can watch that at http://archivebot.com/?initialFilter=assembly.nl.ca |
| 03:44:55 | | etnguyen03 quits [Client Quit] |
| 03:50:10 | | etnguyen03 (etnguyen03) joins |
| 04:05:32 | | etnguyen03 quits [Remote host closed the connection] |
| 04:08:33 | <nulldata> | A broadcast quality version of the CBS CECOT segment https://www.thereset.news/p/breaking-heres-the-60-minutes-segment |
| 04:09:08 | <Webuser105657> | @pokechu22 - do you have a sense of when the .mun.ca archiving might show up on the archivebot site? (I imagine that the modifying of the script could take some time. I was just trying to figure out when I should come back to the archivebot progress page to start looking for that.) |
| 04:09:27 | <nicolas17> | nulldata: wonder if it's this same version https://archive.org/details/60minutes-cecotsegment |
| 04:11:05 | | DogsRNice quits [Read error: Connection reset by peer] |
| 04:11:48 | <nulldata> | Seems like it - though the archive.org version includes the teaser at the start too |
| 04:14:06 | | adryd019 quits [Ping timeout: 256 seconds] |
| 04:22:59 | <pokechu22> | Webuser105657: I don't know yet, but should know in ~15 minutes |
| 04:33:24 | <Webuser105657> | thank you @pokechu22 |
| 04:34:17 | | FoodNerd quits [Quit: Bye for now!] |
| 04:36:55 | <pokechu22> | Doesn't look too bad since https://collections.mun.ca/digital/collection/statutesnl/search is only 169 items (though each item is composed of multiple pages). I've got the script configured and am generating a URL list, but that might take around an hour. It will find links like https://dai.mun.ca/PDFs/statutesnl/StatutesofNewfoundland1833.pdf as well as allow navigation on that part of https://collections.mun.ca/ though |
| 04:37:29 | | FoodNerd joins |
| 04:38:40 | <pokechu22> | (there won't be anything on archivebot.com until I generate the list, as generating the list happens on my laptop) |
| 04:46:40 | | Webuser660499 joins |
| 04:47:21 | | Webuser660499 quits [Client Quit] |
| 04:51:03 | | Webuser465956 joins |
| 04:51:52 | | Webuser465956 quits [Client Quit] |
| 04:58:00 | | HP_Archivist quits [Quit: Leaving] |
| 05:17:50 | <pokechu22> | OK, it's going to be significantly longer than an hour it seems. There are 61716 pages under there (since each of the 169 items is a book with a ton of pages) |
| 05:24:59 | | nexussfan quits [Quit: Konversation terminated!] |
| 05:36:47 | <Webuser105657> | @pokechu22 ah, okay that makes sense. |
| 06:27:10 | | ArchivalEfforts quits [Quit: No Ping reply in 180 seconds.] |
| 06:28:19 | | ArchivalEfforts joins |
| 06:29:48 | | Aurora joins |
| 06:35:20 | <Aurora> | hi, i downloaded 11,299,548 videos and JSONs from the GIF sharing platform Tenor. i was told you guys would be interested in that. it should be every post with a legacy ID in the JSON (every post before September 2024 or so), with the exception of 11 broken links |
| 07:09:16 | | Dango3608 (Dango360) joins |
| 07:12:55 | | Dango360 quits [Ping timeout: 272 seconds] |
| 07:12:55 | | Dango3608 is now known as Dango360 |
| 08:03:05 | | Webuser599043 joins |
| 08:04:13 | | Webuser599043 quits [Client Quit] |
| 08:39:10 | | Hackerpcs quits [Quit: Hackerpcs] |
| 08:48:22 | | Shard79591 quits [Ping timeout: 256 seconds] |
| 08:55:18 | | Shard79591 (Shard) joins |