#webroasting log for 2023-09-15

00:11:20	<@JAA>	arkiver, rewby: Probably a typo on the second one, should be 'pagespersoorange_' not 'pagepersoorange_', right?
00:16:26		Jake quits [Client Quit]
00:18:55		Jake (Jake) joins
00:25:09		Jake quits [Client Quit]
00:25:39		nstrom|m joins
00:51:59		Jake (Jake) joins
01:11:30		Jake quits [Client Quit]
01:15:38		qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
01:18:32		Jake (Jake) joins
03:58:59		Peroniko quits [Client Quit]
08:51:41		Jake quits [Client Quit]
09:23:38		Jake (Jake) joins
10:10:41		plcp joins
10:18:15	<plcp>	115Go for the ~3100 orange sites here, and they definitely are power-law-ish, my top 20 is ~1Go each min, my top 100 is ~80Mo each min, and my top -1000 is 60Ko or less
10:18:59	<plcp>	have a friend that has done ~300Go of websites so far, and has similar results
10:19:47	<plcp>	(and we only noticed yesterday that "mainline" wget isn't that recommended to output warcs for web archival)
10:29:18	<plcp>	*222Go for 19.3k sites
10:30:19	<plcp>	and that includes the wget logs, that for some small websites, are sometimes larger than the website itself
10:31:03	<plcp>	(he's downloading at random, I've sorted my targets by the number of links / amount of text on the homepage)
12:26:56	<@rewby|backup>	JAA: I don't remember which I used but I just copy paste the tracker slug and the script works out the file and item prefixes. I do manually copy the item title prefix though
12:58:56		Peroniko joins
12:59:07		Peroniko is now authenticated as Peroniko
18:52:50		Jake quits [Client Quit]
18:58:36		Jake (Jake) joins
18:59:42		Jake quits [Client Quit]
19:21:23		Jake (Jake) joins
22:58:44		Jake quits [Client Quit]
23:00:31		Jake (Jake) joins
