00:03:22AlsoHP_Archivist joins
00:06:25HP_Archivist quits [Ping timeout: 258 seconds]
00:06:38HP_Archivist (HP_Archivist) joins
00:09:16AlsoHP_Archivist quits [Ping timeout: 250 seconds]
00:10:27<Ryz>!delay bfarujiu2orkbdx4l27roivbp 0 200
00:10:29<Ryz>Oops
00:12:27BlueMaxima quits [Read error: Connection reset by peer]
00:12:57fuzzy8021 quits [Killed (NickServ (GHOST command used by fuzzy802!~fuzzy8021@173-224-26-244.ptcnet.net))]
00:13:02fuzzy8021 (fuzzy8021) joins
00:13:11BlueMaxima joins
00:14:02@dxrt quits [Ping timeout: 250 seconds]
00:14:39ave quits [Remote host closed the connection]
00:14:39lun4 quits [Client Quit]
00:14:39linuxgemini quits [Remote host closed the connection]
00:14:55lun4 (lun4) joins
00:14:56ave (ave) joins
00:15:00linuxgemini (linuxgemini) joins
00:15:56dxrt joins
00:15:58dxrt quits [Changing host]
00:15:58dxrt (dxrt) joins
00:15:58@ChanServ sets mode: +o dxrt
00:31:50@OrIdow6 quits [Remote host closed the connection]
00:53:40CittyKat (CittyKat) joins
01:02:17CittyKat quits [Remote host closed the connection]
01:02:34dm4v quits [Ping timeout: 250 seconds]
01:04:30dm4v joins
01:04:33dm4v quits [Changing host]
01:04:33dm4v (dm4v) joins
01:17:55fungisimp joins
01:19:45fungisimp quits [Remote host closed the connection]
01:20:46OrIdow6 (OrIdow6) joins
01:20:46@ChanServ sets mode: +o OrIdow6
01:25:04Mineroboter joins
01:26:09Mineroboter_ quits [Ping timeout: 258 seconds]
02:30:32Matthww quits [Ping timeout: 250 seconds]
02:34:01Matthww joins
02:41:40sliccricc quits [Ping timeout: 258 seconds]
03:24:15Iki quits [Ping timeout: 244 seconds]
03:36:23DogsRNice quits [Read error: Connection reset by peer]
03:38:01Matthww quits [Ping timeout: 258 seconds]
03:40:04Matthww joins
03:45:37Matthww3 joins
03:47:40Matthww quits [Ping timeout: 250 seconds]
03:47:40Matthww3 is now known as Matthww
03:56:35qw3rty_ joins
04:00:15qw3rty quits [Ping timeout: 258 seconds]
04:05:17etnguyen03 quits [Client Quit]
04:43:09lennier1 quits [Read error: Connection reset by peer]
04:43:23lennier1 (lennier1) joins
04:51:48Wayward quits [Ping timeout: 250 seconds]
05:36:45sgettel joins
05:37:21sgettel quits [Remote host closed the connection]
05:58:17<@JAA>ArchiveBox released a 'good karma kit' Docker Compose thingy a few days ago which includes our warrior image: https://github.com/ArchiveBox/good-karma-kit
06:20:02<thuban>a nice idea; however, due to the human costs of electricity, it is important to consider the expected benefits of providing each service (see https://www.gwern.net/Charity)
07:58:26<purplebot>Bandcamp edited by JesseW (+185, link to archivebot job of artist_index) just now -- https://www.archiveteam.org/?diff=46563&oldid=46548
08:03:26<purplebot>Coronavirus edited by Gridkr (+478, /* Miscellaneous */) just now -- https://www.archiveteam.org/?diff=46564&oldid=46271
08:04:26<purplebot>Chromebot edited by Iki (+176, +info on Wayback Machine ingestion …) just now -- https://www.archiveteam.org/?diff=46565&oldid=42865
08:10:50spirit joins
08:32:03@arkiver quits [Quit: .]
08:32:16arkiver (arkiver) joins
08:32:16@ChanServ sets mode: +o arkiver
08:46:17BlueMaxima quits [Client Quit]
09:01:20Zopolis4 (Zopolis4) joins
09:49:10<LeighR>"Does target site appear to support and/or prefer IPv6" looks like something to add to whatever checklist you guys use when determining grabber running advice
09:50:21<LeighR>because that increased the density of IP-ban-safe grabbers on a cheap Hetzner VM quite a bit
09:50:36<LeighR>(at least on the yahooanswers project)
09:50:43<@HCross>it depends tho
09:50:50<@HCross>on how the site blocks, if they kill off a /64 at a time
09:50:53<@HCross>or a /128
09:51:39<LeighR>knock wood, it appears that Y!A does not block /64s (or I'm still under their threshold)
09:53:50shoghicp quits [Ping timeout: 250 seconds]
09:54:55<LeighR>I have 400 single-concurrency containers running on one VM spread across 10 /80s, but of course, at current rate-limiting, they're each doing an average of 6 jobs/hr
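A rough sketch of the arithmetic behind LeighR's setup and HCross's /64 versus /128 point, using Python's standard ipaddress module. The prefix 2001:db8:0:1::/64 is a hypothetical one from the IPv6 documentation range, and placing all ten /80s inside a single /64 is an assumption; the log does not say how the subnets were allocated.

```python
import ipaddress
from itertools import islice

# Hypothetical /64 from the documentation range; a real VM has its own prefix.
vm_prefix = ipaddress.ip_network("2001:db8:0:1::/64")

# A /64 holds 2**(80 - 64) = 65536 distinct /80 subnets; take the first 10.
subnets = list(islice(vm_prefix.subnets(new_prefix=80), 10))

# Figures quoted in the log.
containers = 400
jobs_per_container_per_hour = 6

containers_per_subnet = containers // len(subnets)
total_jobs_per_hour = containers * jobs_per_container_per_hour


def same_block(a, b, prefix_len):
    """Return True if addresses a and b fall in the same block of prefix_len bits."""
    net = ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(b) in net


# One address from each of the first two /80 subnets.
first = subnets[0][1]
second = subnets[1][1]

print(containers_per_subnet, "containers per /80")
print(total_jobs_per_hour, "jobs per hour in total")
print("blocked together by a /64 ban:", same_block(first, second, 64))
print("blocked together by a /128 ban:", same_block(first, second, 128))
```

Under these assumptions, a site that bans whole /64s would catch every container on the VM at once, while a site that bans single addresses (/128) would treat each one separately.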
09:57:28shoghicp (shoghicp) joins
10:20:54shoghicp quits [Ping timeout: 258 seconds]
10:24:51shoghicp (shoghicp) joins
10:44:57mrfooooo joins
11:04:20Hackerpcs quits [Quit: Hackerpcs]
11:17:19Matthww2 joins
11:18:20Matthww quits [Ping timeout: 250 seconds]
11:18:20Matthww2 is now known as Matthww
11:21:12Zopolis4 quits [Remote host closed the connection]
11:22:29spirit quits [Client Quit]
11:27:23Matthww4 joins
11:28:18Matthww quits [Ping timeout: 250 seconds]
11:28:18Matthww4 is now known as Matthww
11:34:50Hackerpcs (Hackerpcs) joins
11:36:27Matthww4 joins
11:38:42Matthww quits [Ping timeout: 250 seconds]
11:38:42Matthww4 is now known as Matthww
11:41:15VerifiedJ quits [Client Quit]
11:41:50Hackerpcs quits [Client Quit]
11:47:08Hackerpcs (Hackerpcs) joins
11:47:57VerifiedJ (VerifiedJ) joins
11:49:40Matthww3 joins
11:50:24Matthww quits [Ping timeout: 250 seconds]
11:50:24Matthww3 is now known as Matthww
11:55:02HP_Archivist quits [Read error: Connection reset by peer]
12:17:15Matthww3 joins
12:19:26Matthww quits [Ping timeout: 250 seconds]
12:19:26Matthww3 is now known as Matthww
13:20:29Sylirana quits [Ping timeout: 244 seconds]
13:20:56Sylirana (Sylirana) joins
13:26:36Iki joins
15:13:38Arcorann (Arcorann) joins
15:51:23Mineroboter quits [Client Quit]
15:53:42Mineroboter joins
16:02:37onetruth joins
16:16:38Arcorann quits [Ping timeout: 258 seconds]
16:20:06etnguyen03 (etnguyen03) joins
16:23:54Iki quits [Ping timeout: 244 seconds]
16:30:49Iki joins
17:02:22Wayward (wayward) joins
17:18:08ragu quits [Read error: Connection reset by peer]
17:34:27DogsRNice (Webuser299) joins
18:27:26<purplebot>ArchiveBot/Alternative media (political left)/list edited by Iki (+114, +urls. Some already-saved, some …) just now -- https://www.archiveteam.org/?diff=46566&oldid=38980
18:28:11<@JAA>Iki: FYI, these ArchiveBot/* pages aren't updated anymore and will be migrated to something else soon™.
18:49:59<mgrandi>Oh is that a handy manually curated list of what has been archived recently?
18:56:34<masterX244>had a few stupid bugs due to site differences in my TM-exchange discovery crawler; capturing the last part of the data now, and once I have it, it's sorting and cross-checking time
19:23:05superkuh quits [Quit: the neuronal action potential is an electrical manipulation of reversible abrupt phase changes in the lipid bilayer]
19:24:49hooway joins
20:09:41LeighR quits [Ping timeout: 244 seconds]
20:10:28Barto quits [Ping timeout: 250 seconds]
20:10:41Barto (Barto) joins
20:18:05@EggplantN quits [Quit: Ping timeout (120 seconds)]
20:18:23EggplantN joins
20:18:32EggplantN quits [Changing host]
20:18:32EggplantN (EggplantN) joins
20:18:32@ChanServ sets mode: +o EggplantN
20:21:10LeighR (LeighR) joins
21:14:13hooway quits [Client Quit]
22:01:39LeighR quits [Client Quit]
22:27:57user (user) joins
22:28:18<user>Hi, all. New to IRC, so tell me if I break protocol.
22:30:44<user>I wanted to ask why some users are able to contribute so much, while my contributions are few and far between. This has been my experience the last few days with archiving reddit and Yahoo! Answers. My internet connection is decent, yet I don't upload nearly as much as HCross or CCC3.
22:31:04<AK>Generally that will be related to the number of workers people are running
22:31:28<AK>Some of the big people run hundreds (Or thousands) of concurrent workers
22:31:35<AK>So we just get more assigned to us
22:31:49<AK>The tracker gives out tasks at random, so sometimes it's just being lucky or unlucky too
22:32:01<user>From different IPs, if I understand correctly?
22:32:25<user>As in, I can't compete with HCross or CCC3 if I use only one connection?
22:32:42<user>(I don't mean compete in a rivalry sense - just trying to contribute)
22:34:40<user>Also, is there a way to run two or more projects at the same time on one machine and IP? I run the Warrior via Docker and it lets me choose to work on one project at a time. How can I scrape reddit AND Yahoo! Answers?
22:34:57pcr leaves [Error from remote client]
22:36:04pcr joins
22:48:01Stilett0 quits [Ping timeout: 258 seconds]
22:50:32<Iki>user: You can run one project per warrior. However, I believe you can run multiple, separate warriors and run different projects on each
23:26:47HP_Archivist (HP_Archivist) joins
23:28:32<atphoenix>you can definitely run multiple warriors. Each warrior can only run one project at a time. If you use the Docker image, you can start several containers from it and have each container run a different project.
23:29:13<atphoenix>how many you can run depends on your available bandwidth and number of public IP addresses, and the characteristics of the projects you are trying to run.
23:32:26<atphoenix>some are IP constrained, so benefit from more unique IPs. Some projects are bandwidth heavy, so running too many workers at once will use up all your bandwidth and make each job you are running take longer.
23:32:26m0nika quits [Remote host closed the connection]
23:32:38<atphoenix>it's a balancing act overall
23:36:43m0nika (m0nika) joins
23:52:07<@JAA>Some projects use a lot of local disk for temporary storage, so you quickly run out of disk space if you run too many. Some require a lot of CPU. Some require a lot of RAM. Etc.
23:52:15<user>Thanks, atphoenix! I'm new to Docker and followed the instructions for running one warrior with the command:
23:52:20<user>sudo docker run -d --name archiveteam-warrior --label=com.centurylinklabs.watchtower.enable=true --restart=unless-stopped -p 8001:8001 atdr.meo.ws/archiveteam/warrior-dockerfile
23:52:40<@JAA>If you want to 'set it and forget it', it's probably best to just run one warrior at the default settings with the auto project.
23:52:41<user>To launch more warriors, do I repeat the same command with different --name arguments?
23:53:33<@JAA>If you don't mind juggling resources for each project, you likely want to run the project containers, not the warrior.
23:55:18<user>I can give at least a few hundred GB for the warriors and will be able to monitor the resources the warriors take up. I don't want to 'set it and forget it', at least not for now. I'm OK with checking on it a few times a day and tweaking things.
23:55:36<@JAA>:-)
23:55:38<user>Thanks for the advice, I'll read up on how to run separate containers for each project
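On the earlier question about launching more warriors with different --name arguments: a minimal sketch, assuming the command quoted above is reused as-is. Docker requires a unique name per container, and two containers cannot publish the same host port, so each copy would need its own host port as well. The helper warrior_command, the archiveteam-warrior-N naming scheme, and the port numbering are illustrative choices, not anything prescribed in the channel.

```python
import shlex

# Image name taken from the command quoted in the log.
IMAGE = "atdr.meo.ws/archiveteam/warrior-dockerfile"


def warrior_command(index, base_port=8001):
    """Build a docker run command line for the index-th warrior (hypothetical helper)."""
    name = f"archiveteam-warrior-{index}"  # container names must be unique
    host_port = base_port + index          # host ports must be unique too
    args = [
        "docker", "run", "-d",
        "--name", name,
        "--label=com.centurylinklabs.watchtower.enable=true",
        "--restart=unless-stopped",
        "-p", f"{host_port}:8001",         # container side stays 8001, as in the log
        IMAGE,
    ]
    return shlex.join(args)


# Print the commands for three warriors instead of running them.
for i in range(3):
    print(warrior_command(i))
```

Each warrior's web interface would then be reachable on its own host port (8001, 8002, 8003 in this sketch), where a different project can be selected for each one. Prefix the printed commands with sudo if your Docker setup needs it, as in the original command.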
23:59:18<user>By the way, is there hope to archive all of Yahoo! Answers?
23:59:24<user>before May 4th