02:23:04 | <pabs> | did these get added? https://moddingfridays.bleu255.com/ https://things.bleu255.com/runyourown/ http://wiki.artserver.org/index.php/Main_Page http://www.gc-forever.com/wiki/index.php?title=Main_Page https://wiki.hyperbola.info/ https://nick-black.com/dankwiki/index.php/Hack_on |
02:23:33 | <pokechu22> | I don't think so, I've been a bit behind |
02:23:41 | <pokechu22> | I did do gc-forever probably half a year ago though |
02:23:58 | <pabs> | ok |
02:24:23 | <pokechu22> | I'll run the rest now though |
02:24:49 | <pabs> | thanks |
04:04:23 | <pokechu22> | https://wiki.hyperbola.info is DokuWiki, so Exorcism|m or someone else will need to dump it, as I don't have the tools for that set up. But it does link to https://wiki.parabola.nu/Main_Page in the footer, and I can save that |
04:33:14 | <DigitalDragon> | ^^ done: https://archive.org/details/wiki-wiki.hyperbola.info-20230530 |
07:48:19 | | Sir_Bedivere joins |
07:48:23 | | atphoenix quits [Remote host closed the connection] |
07:48:23 | | AnotherIki quits [Remote host closed the connection] |
07:48:23 | | Bedivere quits [Remote host closed the connection] |
07:48:30 | | AnotherIki joins |
07:48:38 | | atphoenix (atphoenix) joins |
10:09:02 | | TastyWiener95 quits [Quit: So long, farewell, auf wiedersehen, good night] |
10:09:56 | | TastyWiener95 (TastyWiener95) joins |
10:18:51 | | TastyWiener95 quits [Client Quit] |
11:31:47 | | TastyWiener95 (TastyWiener95) joins |
14:38:17 | | parfait (kdqep) joins |
14:42:07 | | hitgrr8 joins |
15:23:07 | | eroc19909 is now known as eroc1990 |
16:35:51 | | parfait quits [Ping timeout: 265 seconds] |
19:03:12 | | icedice (icedice) joins |
19:03:46 | <icedice> | Can someone scrape The Lost Media Wiki for Imgur links to archive? |
19:04:06 | <icedice> | The forums have been scraped already, but apparently most Imgur links are on the wiki itself |
19:04:56 | <icedice> | https://lostmediawiki.com/ |
19:05:32 | <pokechu22> | Looks like there's a 2018 dump at https://archive.org/download/wiki-lostmediawikicom |
19:06:59 | <pokechu22> | https://lostmediawiki.com/Special:MediaStatistics says 11GB of files (which, granted, isn't required for this purpose, but would still be good to save) |
19:09:57 | <icedice> | Worth archiving in full, I'd say |
19:11:21 | <icedice> | There's also The Cutting Room Floor which has some Imgur links and would be worth saving |
19:11:22 | <icedice> | https://tcrf.net/The_Cutting_Room_Floor |
19:11:37 | <pokechu22> | Yeah, that one was saved about a month ago I think? |
19:11:54 | <icedice> | Nice |
19:12:07 | <pokechu22> | and for it, https://tcrf.net/Special:MediaStatistics is 87.58 GB |
19:12:15 | <icedice> | Then it can just be scraped |
19:12:21 | <icedice> | from the WARC |
19:13:56 | <pokechu22> | wikiteam tools don't generate a WARC, but it should be possible to just download the history.xml for it and work with that (and fortunately the tools generate two 7z files, one with images and one without, specifically so that you don't need to download 87GB of images if you only want page text) |
19:15:10 | <pokechu22> | https://archive.org/details/wiki-tcrfnet-20230322 - I'll download and extract from this (wow, the compressed XML is still 500 MB, that's pretty big) |
19:16:02 | <masterX244> | running wikiteam tool atm on LMW |
19:18:58 | <icedice> | Nice |
19:19:00 | <icedice> | Thanks! |
19:28:57 | <pokechu22> | For reference, the command I use to extract links is `7z x -so tcrfnet-20230322-history.xml.7z '*.xml' | grep -Fhai -e 'mediafire' -e 'imgur' > tcrfnet-20230322-history.xml_temp.txt` followed by `grep -Pahoi '\S*imgur\S*' tcrfnet-20230322-history.xml_temp.txt | sort -u > tcrfnet-20230322-history.xml_imgur.txt` |
20:16:15 | | icedice quits [Ping timeout: 265 seconds] |
20:16:48 | | icedice (icedice) joins |
21:10:55 | | hitgrr8 quits [Client Quit] |
22:06:41 | | luckcolors_ quits [Quit: No Ping reply in 180 seconds.] |
22:08:15 | | luckcolors (luckcolors) joins |
22:09:20 | | Nemo_bis quits [Remote host closed the connection] |
22:09:27 | | Nemo_bis (Nemo_bis) joins |
22:15:44 | | TastyWiener95 quits [Client Quit] |
22:30:37 | | icedice quits [Ping timeout: 265 seconds] |
23:18:28 | | Bedivere joins |
23:20:14 | | Sir_Bedivere quits [Ping timeout: 252 seconds] |
23:29:13 | | TastyWiener95 (TastyWiener95) joins |
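
Note on the link-extraction command posted at 19:28:57: the second step (`grep -Pahoi '\S*imgur\S*' ... | sort -u`) can be approximated in Python for anyone without GNU grep at hand. This is only a rough sketch, not pokechu22's actual tooling: the function name `extract_imgur_tokens` and the sample lines are made up for illustration, and Python's `sorted` orders by code point, which may differ from a locale-aware `sort -u`.

```python
import re
from typing import Iterable, List

# Same pattern as the second grep in the log: any run of non-whitespace
# characters that contains "imgur", matched case-insensitively (-i).
IMGUR_TOKEN = re.compile(r"\S*imgur\S*", re.IGNORECASE)


def extract_imgur_tokens(lines: Iterable[str]) -> List[str]:
    """Return the unique imgur-containing tokens from lines, sorted.

    Hypothetical helper: mirrors `grep -o` (emit only the matched text)
    followed by `sort -u` (deduplicate and sort).
    """
    found = set()
    for line in lines:
        found.update(IMGUR_TOKEN.findall(line))
    return sorted(found)


# Made-up sample input standing in for lines of the history XML.
sample_lines = [
    "See https://i.imgur.com/abc123.png for the screenshot",
    "mirror: https://www.mediafire.com/file/xyz",
    "[https://imgur.com/a/AbCdE gallery] and https://i.imgur.com/abc123.png again",
    "no links here",
]

tokens = extract_imgur_tokens(sample_lines)
```

Because the pattern matches whole non-whitespace runs, surrounding wikitext such as a leading `[` stays attached to the token, so the resulting list would likely need a cleanup pass before the URLs are queued for archiving.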