00:15:37nerdguy1138 quits [Ping timeout: 258 seconds]
00:31:27nerdguy1138 (nerdguy1138) joins
00:34:09Sylirana quits [Ping timeout: 244 seconds]
00:34:58Sylirana (Sylirana) joins
00:40:15Mineroboter_ joins
00:41:00Mineroboter quits [Ping timeout: 250 seconds]
00:47:56Arcorann_ quits [Ping timeout: 250 seconds]
01:03:32dm4v quits [Ping timeout: 250 seconds]
01:05:07dm4v joins
01:05:09dm4v quits [Changing host]
01:05:09dm4v (dm4v) joins
01:52:57HP_Archivist quits [Client Quit]
02:26:18Zerote_ quits [Ping timeout: 250 seconds]
02:31:47Zerote joins
03:07:37Zerote_ joins
03:10:30Zerote quits [Ping timeout: 250 seconds]
03:25:29DopefishJustin quits [Remote host closed the connection]
03:33:14DopefishJustin joins
03:37:12DogsRNice quits [Read error: Connection reset by peer]
03:39:19qw3rty__ joins
03:41:11pcr leaves
03:41:13pcr joins
03:43:00qw3rty_ quits [Ping timeout: 258 seconds]
03:51:52webdownload quits [Remote host closed the connection]
04:08:19superkuh joins
04:21:48etnguyen03 quits [Client Quit]
04:43:57benjins quits [Ping timeout: 258 seconds]
05:18:45cmlow quits [Quit: Connection closed for inactivity]
05:45:38howardad quits [Ping timeout: 250 seconds]
05:56:49rbraun joins
06:20:41benjins joins
06:39:25howardad (howardad) joins
06:42:11VukkyWork (VukkyWork) joins
06:54:34MaxG joins
07:00:25VukkyWork quits [Remote host closed the connection]
07:28:17duce1337 (duce1337) joins
07:39:51Arcorann_ joins
07:53:52tzt quits [Changing host]
07:53:52tzt (tzt) joins
08:31:33BlueMaxima quits [Read error: Connection reset by peer]
08:59:20icedice quits [Ping timeout: 250 seconds]
09:33:44bobbyb quits [Remote host closed the connection]
09:33:56bobbyb joins
09:35:31lennier1 (lennier1) joins
10:02:09hilda quits [Read error: Connection reset by peer]
10:06:40hilda joins
10:28:49duce1337_ (duce1337) joins
10:28:49duce1337 quits [Read error: Connection reset by peer]
11:15:43Gereon quits [Ping timeout: 258 seconds]
11:16:30Gereon (Gereon) joins
11:52:38LeGoupil joins
11:54:15pcr leaves
11:54:17pcr joins
12:18:48ThreeHea1 (ThreeHeadedMonkey) joins
12:18:58ThreeHeadedMonkey quits [Ping timeout: 258 seconds]
12:19:32ThreeHea1 is now known as ThreeHeadedMonkey
12:27:26IKI joins
13:08:14benjinsmith joins
13:11:06benjins quits [Ping timeout: 250 seconds]
13:24:43benjinsmith is now known as benjins
14:03:51hilda quits [Client Quit]
14:14:01cmlow (cmlow) joins
14:17:25HackMii_ quits [Ping timeout: 258 seconds]
14:19:15HackMii_ (hacktheplanet) joins
14:21:53nuroten quits [Remote host closed the connection]
14:24:44duce1337_ quits [Read error: Connection reset by peer]
14:24:44duce1337 (duce1337) joins
14:25:39sonick quits [Quit: Connection closed for inactivity]
14:52:51nuroten joins
15:14:32<betamax>jodizzle: FYI I've just reprocessed the lists of party / candidate websites, to only have base URLs (and then removed duplicates). This will have removed any that are just subsections of a larger site. New lists linked on the wiki page.
15:15:18Arcorann_ quits [Ping timeout: 258 seconds]
15:15:27<betamax>The one danger with the new lists is that there could be sites like "about.me/<candidate>" that are now just "about.me" - I've already removed "about.me" and "youtube.com" from the lists, but there could be more.
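A minimal sketch of the reprocessing betamax describes above (reduce each URL to its base URL, drop duplicates, drop shared hosts). The function name `base_urls` and the `SHARED_HOSTS` set are hypothetical; only "about.me" and "youtube.com" come from the log, and the actual script used is not shown here.

```python
from urllib.parse import urlsplit

# Hosts shared by many unrelated users. Only these two are named in the log;
# any further entries would be an assumption.
SHARED_HOSTS = {"about.me", "youtube.com"}

def base_urls(urls):
    """Reduce URLs to scheme://host/ and drop duplicates and shared hosts."""
    seen = set()
    result = []
    for url in urls:
        parts = urlsplit(url.strip())
        if not parts.scheme or not parts.hostname:
            continue  # skip lines that are not absolute URLs
        host = parts.hostname  # already lowercased, port stripped
        # Treat www.example.org and example.org as the same site.
        bare = host[4:] if host.startswith("www.") else host
        if bare in SHARED_HOSTS or bare in seen:
            continue
        seen.add(bare)
        result.append(f"{parts.scheme}://{host}/")
    return result
```

Sites that live under a path on a shared host are dropped entirely here, which is the danger mentioned above: any shared host missing from `SHARED_HOSTS` would be collapsed to its front page instead.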
15:18:58<betamax>I've also just put the candidate web pages (not sites, the single pages) into AB as an '!ao <' job
16:24:38lennier1 quits [Client Quit]
16:33:17Sylirana quits [Read error: Connection reset by peer]
16:34:25Sylirana (Sylirana) joins
16:43:26endrift quits [Ping timeout: 250 seconds]
16:43:36endrift joins
16:47:49lennier1 (lennier1) joins
16:48:02lennier2 quits [Client Quit]
16:52:57DogsRNice (Webuser299) joins
17:49:19duce1337_ (duce1337) joins
17:49:19duce1337 quits [Read error: Connection reset by peer]
17:55:00rbraun quits [Client Quit]
17:59:30<Sanqui>comic genesis would be nice to archive
18:00:18<Sanqui>hmm
18:00:28<Sanqui>their search is broken, but I'm going to start a forums grab and we can derive domains from that
18:02:00Daloader joins
18:32:15yarrow leaves
18:42:22<AK>ori
18:42:25<AK>Well oops
18:44:42<@EggplantN>What you done now AK
18:44:58<AK>Attempted to launch origin to play some titanfall 2
18:45:37<@EggplantN>Fuck sake AK
18:47:37spirit joins
19:30:42Daloader quits [Ping timeout: 250 seconds]
19:39:00spirit quits [Client Quit]
19:41:48@EggplantN is now known as @EggplantBot
19:41:59@EggplantBot is now known as @EggplantN
19:51:41LeighR (LeighR) joins
21:07:23<betamax>JAA: my plan is to start feeding the candidate / party sites into AB via '!a <'. If I do 100 per job then there will be around 16 jobs total.
21:07:53<betamax>For the first one or two I may try with outlinks enabled, and can turn that off in future jobs if it proves to be an issue.
21:10:55<betamax>An alternative approach would be for me to archive them all manually and upload the WARCs to IA, like I did for the US 2018 midterms - https://archive.org/details/2018_us_midterm_campaign_site_archive
21:11:00<betamax>But I don't think there are really any benefits to that approach aside from not clogging up AB pipelines - the resulting WARCs are less complete (no outlinks) and can't go into the wayback
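The "100 per job" split mentioned above could look roughly like this. The helper name `batches` is hypothetical, and the figure of about 1,600 sites is only inferred from "around 16 jobs"; the log does not give an exact count.

```python
def batches(items, size=100):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# Example: about 1,600 sites (an inferred figure) at 100 per job.
sites = [f"https://site{i}.example/" for i in range(1600)]
jobs = batches(sites)
print(len(jobs))  # number of '!a <' lists to prepare
```

Each chunk would then be written to its own list file for review before being submitted.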
21:20:03duce1337_ quits [Read error: Connection reset by peer]
21:20:18duce1337 (duce1337) joins
21:23:13godane joins
21:43:11duce1337 quits [Client Quit]
21:43:27sonick (sonick) joins
21:44:10Wayward quits [Ping timeout: 250 seconds]
21:45:11Wayward (wayward) joins
21:48:34LeGoupil quits [Client Quit]
22:00:10LeighR quits [Client Quit]
22:01:43sec^nd quits [Remote host closed the connection]
22:02:07sec^nd (second) joins
22:16:53<@JAA>betamax: Yeah, let's try. !a < is restricted though as it has many pitfalls. Let me know when you have the lists ready, and I'll look over them and throw them in.
22:17:16<@JAA>Try to group them such that there's little chance of crosslinks as those mess with the recursion.
22:29:20webdownload joins
22:30:09<betamax>JAA: thanks. It's late here (got distracted - oops) so I'll make the lists tomorrow. By "crosslinks", you mean sites that refer to each other? I'm not sure if there's an easy / obvious way to do that...
22:31:31<webdownload>I pronounce www.ted.com to be fully archived at Heatengine.
22:36:53<@JAA>betamax: Yeah, that's what I mean. The problem is that if you !a < a list that has example.org and example.net, and then the former has a link to example.net/foo/ and gets retrieved before a page from example.net linking there, it won't recurse further from that page.
22:40:02<@JAA>And nope, there isn't an easy way to do this. You'd have to group the sites accordingly, e.g. build lists of candidates all from different parties or parts of the country, which while obviously not a guarantee would at least lower the risk considerably.
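One way to sketch the grouping JAA suggests: sort sites by party (or region) and deal them round-robin into the lists, so sites from the same party land in different lists where possible. The function name `spread_by_group` and the `(url, group)` input shape are assumptions, and as noted above this only lowers the risk of crosslinks, it does not rule them out.

```python
def spread_by_group(sites, n_lists):
    """Distribute (url, group) pairs so that same-group sites are spread out.

    Sorting by group makes same-group sites adjacent; dealing them
    round-robin then sends adjacent sites to different lists.
    """
    ordered = sorted(sites, key=lambda s: s[1])
    lists = [[] for _ in range(n_lists)]
    for i, (url, _group) in enumerate(ordered):
        lists[i % n_lists].append(url)
    return lists
```

A group with no more sites than there are lists ends up with at most one site per list; larger groups will necessarily share lists.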
22:40:56MaxG quits [Remote host closed the connection]
23:00:54BlueMaxima joins
23:27:14Arcorann_ joins
23:52:58sonick quits [Client Quit]