00:58:18 | | sec^nd quits [Remote host closed the connection] |
00:58:45 | | sec^nd (second) joins |
01:59:19 | | monoxane (monoxane) joins |
02:21:43 | | ramsey (ramsey) joins |
02:23:36 | <ramsey> | When Warrior finishes a job and uploads it, where does it go and what happens to the content/data at that point? |
02:24:43 | | ramsey quits [Changing host] |
02:24:43 | | ramsey (ramsey) joins |
02:26:16 | <@JAA> | ramsey: It's uploaded to the target, which aggregates items from many items and uploads the resulting megawarc to the Internet Archive. The local copy on the worker is deleted. (There's an option in the non-warrior to disable the deletion, but that'll fill up a disk fast, so I doubt anyone uses it.) |
02:26:40 | <@JAA> | s/items/files/ |
02:26:45 | <ramsey> | What's the target? |
02:27:05 | <@JAA> | See also here: https://wiki.archiveteam.org/index.php/Dev/Infrastructure |
02:27:07 | <ramsey> | In other words, is the target at the Internet Archive, or is it run by the volunteer ArchiveTeam? |
02:27:10 | <ramsey> | thanks! |
02:27:28 | | Barto quits [Ping timeout: 260 seconds] |
02:27:49 | <@JAA> | We used to have a target physically at IA (but separate from their actual system network), but that hasn't been the case in years. |
02:29:14 | <ramsey> | Nice. This is fascinating! |
02:39:57 | | BornOn420 quits [Remote host closed the connection] |
02:40:37 | | BornOn420 (BornOn420) joins |
03:58:03 | | Craigle quits [Quit: The Lounge - https://thelounge.chat] |
03:58:41 | | @imer quits [Quit: Oh no] |
04:00:29 | | Craigle (Craigle) joins |
04:00:38 | | imer (imer) joins |
04:00:39 | | @ChanServ sets mode: +o imer |
04:05:11 | | Craigle quits [Client Quit] |
04:07:51 | | Craigle (Craigle) joins |
04:15:25 | | thenes quits [Remote host closed the connection] |
04:15:44 | | thenes (thenes) joins |
04:59:02 | | ahm258 joins |
06:09:25 | | rosty joins |
06:10:01 | <rosty> | does allocating less ram/cores reduce the speed at which it archives? |
06:23:25 | | fuzzy8021 (fuzzy80211) joins |
06:23:42 | | fuzzy80211 quits [Read error: Connection reset by peer] |
06:32:02 | <BornOn420> | rosty Only when it starts swapping |
07:12:06 | | ahm258 quits [Ping timeout: 250 seconds] |
07:18:10 | | monika quits [Ping timeout: 250 seconds] |
07:22:25 | | monika (boom) joins |
07:31:23 | | abirkill quits [Ping timeout: 260 seconds] |
07:36:36 | | abirkill (abirkill) joins |
07:40:47 | | ahm258 joins |
09:03:36 | | myself quits [Read error: Connection reset by peer] |
09:04:06 | | myself (myself) joins |
09:15:03 | | kryptonian joins |
09:16:24 | <kryptonian> | It seems that my warrior is not even connecting to tracker as it doesn't list any projects? |
09:29:58 | | jrgn joins |
09:32:56 | | jargon quits [Ping timeout: 250 seconds] |
09:37:08 | <kryptonian> | Like usual, it was DNS, carry on. |
09:38:22 | <@JAA> | DNS is hard++ |
09:38:22 | <eggdrop> | [karma] 'DNS is hard' now has 3 karma! |
11:22:06 | | @arkiver quits [Remote host closed the connection] |
11:23:20 | | arkiver (arkiver) joins |
11:23:20 | | @ChanServ sets mode: +o arkiver |
11:42:52 | | kap0t quits [Quit: nothing personal, really.] |
11:43:28 | | kap0t (kap0t) joins |
12:00:02 | | Bleo18260072271962345 quits [Quit: The Lounge - https://thelounge.chat] |
12:02:56 | | Bleo18260072271962345 joins |
12:05:04 | | mls (mls) joins |
12:10:01 | | @arkiver quits [Remote host closed the connection] |
12:11:00 | | arkiver (arkiver) joins |
12:11:00 | | @ChanServ sets mode: +o arkiver |
12:55:25 | | nomead joins |
13:31:09 | | Barto (Barto) joins |
14:55:32 | <myself> | In the Warrior web interface, in the bottom-right of each item/task/thread/thingy, it shows the project name and the elapsed time for that individual item. That's lovely. Would it be possible to also update that to show the total _size_ of the item, perhaps only when it reaches the upload phase and thus the size is finalized? There's otherwise no place to find that info that I've seen. |
15:11:21 | <kap0t> | Bottom right corner you can see size info |
15:11:27 | <kap0t> | I mean left corner |
15:11:36 | <kap0t> | But only total tho. |
15:17:20 | <myself> | Yeah, that's like total transfer since reboot. Basically I got curious "if I shut down the warrior now and abandon these stuck jobs, am I throwing away 200kB or 20GB or somewhere in between?", and realized I don't know how much data is sitting in the warrior right now. |
16:01:23 | | ahm258 quits [Quit: The Lounge - https://thelounge.chat] |
16:05:24 | | ahm258 joins |
16:19:24 | | breadbrix (breadbrix) joins |
16:33:17 | <TheTechRobo> | myself: The Warrior doesn't really know how big the files are AFAIK, so it's not a trivial change. |
16:33:36 | <TheTechRobo> | You can su into the container and du -sh, though |
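Editor's sketch of the `du -sh` idea above, for anyone who would rather script it. This is a rough equivalent only: it sums apparent file sizes, whereas `du` reports disk blocks used, so the two numbers can differ somewhat. The function names are made up for this example, and the directory to inspect is an assumption too, since the log does not say where the Warrior keeps in-progress item data inside the container.

```python
import os


def directory_size_bytes(root):
    """Sum the apparent sizes of all regular files below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a linked file is not counted through the link.
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable (jobs may still be writing).
                pass
    return total


def human_readable(num_bytes):
    """Format a byte count with binary units, similar in spirit to du -h."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"
```

Possible usage from inside the container, with the path left for the reader to fill in: `print(human_readable(directory_size_bytes("/path/to/data")))`.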
18:18:54 | | that_lurker quits [Remote host closed the connection] |
18:19:00 | | that_lurker (that_lurker) joins |
18:38:52 | | Rieer joins |
18:56:42 | <Rieer> | Hi, new to this, trying to run Warrior for the first time. I'm repeatedly seeing the following on the web interface, and I don't know what it means or how to troubleshoot it to fix it: |
18:56:42 | <Rieer> | Archiving item post:baraoo:999 |
18:56:42 | <Rieer> | Server returned bad response. Sleeping. |
18:56:42 | <Rieer> | Bad response on first URL. |
18:56:42 | <Rieer> | Aborting item post:baraoo:999. |
18:56:43 | <Rieer> | The "baraoo:999" bit varies, that's the item, but the rest just repeats again and again since "Starting Wget Download for Item". |
18:56:43 | <Rieer> | I don't even know if this is a problem on my end or what. Picking one of the links/items at random and going to there with my webbrowser has it resolve successfully, so I don't think it's that all the attempted items are unavailable. |
18:56:44 | <Rieer> | That's all when it's going through Mullvad VPN. Then I read that I shouldn't be using a VPN, so I excluded virtualbox.exe and virtualboxVM.exe from the VPN, and now I am getting a different error over and over: |
18:56:44 | <Rieer> | Failed to submit discovered URLs.temporary failure in name resolutionnil |
18:56:45 | <Rieer> | I don't even know where to begin with this. Sorry if this is the wrong channel. |
18:57:08 | | Sluggs quits [Ping timeout: 250 seconds] |
19:03:36 | | Sluggs joins |
19:37:43 | <opl> | Rieer: my guess is that the vpn is still capturing your DNS traffic from the virtual machine |
20:19:37 | <@JAA> | Yes, apparently Mullvad intercepts all DNS traffic. |
20:38:05 | <Rieer> | Ah, if Mullvad intercepts all DNS traffic even when using Split Tunnelling, that would be the problem |
20:39:30 | | rosty quits [Quit: Ooops, wrong browser tab.] |
20:40:21 | <@JAA> | It's documented, too: https://mullvad.net/en/blog/limitations-split-tunneling |
20:40:40 | <@JAA> | So yeah, no workers on machines that are running Mullvad. |
20:44:22 | <@JAA> | I've added a note to the wiki about that. |
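Editor's sketch for checking the symptom discussed above ("temporary failure in name resolution"): a minimal test of whether the machine or VM running the worker can resolve hostnames at all. It uses only the system resolver via `socket.getaddrinfo`. The function name and the example hostname are illustrative, not anything the Warrior itself uses. A successful lookup does not prove that DNS is not being intercepted; it only rules out the outright failure.

```python
import socket


def can_resolve(hostname):
    """Return True if the system resolver returns at least one address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror covers resolver failures such as "Temporary failure in
        # name resolution"; UnicodeError covers hostnames that are malformed.
        return False
```

Possible usage, run on the same machine or VM as the worker: `print(can_resolve("archive.org"))`.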
20:45:37 | | DogsRNice joins |
21:29:55 | <Rieer> | Thanks! I had no idea. I'll see if I can change providers before trying again |
21:31:22 | <@JAA> | I would advise against running a worker next to any VPN. It's too easy to get this wrong and pollute the archives. |
21:36:08 | | dendory joins |
21:42:25 | | dendory quits [Client Quit] |
21:45:49 | <TheTechRobo> | JAA: While we're on this topic, what is the official stance on self-hosted VPNs like wireguard? |
21:46:15 | <TheTechRobo> | (Assuming it doesn't intercept anything) |
21:59:50 | <@JAA> | TheTechRobo: My view: if you control both ends, and if all worker traffic to the internet (including DNS) exits at the same place, and if that other place's internet connection is clean, and if that other place is exclusively used by you, it should be fine (unless I forgot about another condition). |
22:00:23 | <@JAA> | But it's easy to get the configuration wrong, so I still wouldn't recommend it. |
22:15:29 | | nomead quits [Quit: Leaving] |