00:05:35JaffaCakes118 quits [Remote host closed the connection]
00:05:59JaffaCakes118 (JaffaCakes118) joins
01:06:55wickerz quits [Ping timeout: 272 seconds]
01:14:30wickerz joins
07:05:01Unholy23619 quits [Remote host closed the connection]
07:06:20Unholy23619 (Unholy2361) joins
07:21:26JC joins
07:25:39nulldata quits [Ping timeout: 272 seconds]
07:31:24nulldata (nulldata) joins
07:32:01JC quits [Ping timeout: 255 seconds]
09:00:01Bleo182600 quits [Client Quit]
09:01:25Bleo182600 joins
09:13:56qwertyasdfuiopghjkl quits [Quit: Connection closed]
09:15:42qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
09:30:20nulldata quits [Client Quit]
09:31:50nulldata (nulldata) joins
10:29:38imer quits [Quit: Oh no]
10:49:02imer (imer) joins
10:55:54JC joins
11:04:51<JC>I am a new user. I tried to set up the warrior client today but had several problems.
11:06:04<JC>I'm able to import the 3.2 client into VirtualBox, but not the 4.0. 4.0 has some issues for me. On VirtualBox 7.0.10: Result Code: E_INVALIDARG (0X80070057)
11:07:37<JC>Also tried to set up on ESXi 8.0, but the OVA file (archiveteam-warrior-v4.0-20230722.ova) had issues with importing: Line 61: Unsupported element 'StorageItem'
11:09:56<JC>Tried to set up a fresh virtual machine on ESXi using (ESXi 8, Linux, Other Linux 64-bit), 1 CPU, 1 GB RAM, and the following VMDK file as the disk (archiveteam-warrior-v4.0-20230722-160921.vmdk). This VM configuration was rejected (I am unable to find the actual error message...)
11:16:31<JC>Ended up with Docker... that works (though unfortunately it is running on weaker hardware). Just wanted to give you some feedback on these issues.
11:19:15<JC>Now I'm trying to figure out what the strategy is if I want to get some work done. Right now I'm running the default (2 concurrent items). Tried to turn it up to 6, but it seems that I just get a lot of "server error, sleeping". So I tried to run multiple Docker instances, each with its own project (that seems to go "OK"), but what is the strategy if I just want to "crank it up"
11:19:15<JC>and get some work done?
11:33:46<JC>I have 3 locations with 1 Gbit fiber, and was thinking about donating a "constant" 200 Mbit or so from each location.
11:40:01<JC>Is this at all realistic with all these "Server returned bad response. Sleeping." messages?
13:54:23<myself>JC: the import issues are peculiar, I've been able to pull both OVAs into VirtualBox with no trouble, that sounds like ESXi is doing something weird. If you get the actual error it's worth digging into.
13:55:16<myself>As for the "server returned bad response, sleeping", that's per-project. Some projects have a lot of bad items that were rejected once and they're being retried and that's normal, in other cases it could mean that you've cranked your concurrency much too high.
13:56:22<myself>In general, there's not much point to having a ton of bandwidth behind a single IP on a single project, because each server rate-limits us. The way to use a bunch of bandwidth is to run a bunch of warriors _each on a different project_ so they're all hitting different servers.
14:45:28<nstrom|m>running docker containers for multiple projects is definitely the way to go if you're comfortable w/ docker and have the system resources to do so. every project is different in how much concurrency it can handle, generally depending on how strict the site we're archiving is about rate limits and such
14:46:15<nstrom|m>for example roblox basically can't go higher than 1 concurrency, telegram is fine with 4-6 generally, youtube can go to 20 without issue
14:46:59<nstrom|m>(with warrior you're limited to 6 concurrency, with the individual project docker images you can go to up to 20 per container, which is great for projects that can handle that)
14:48:33<nstrom|m>the warrior appliance is great for beginners, for more advanced users once you've run that for a bit and understand the ins and outs, separate project specific docker images is the way to go. a little less "set and forget" but a lot more flexibility/scalability
14:50:33<nstrom|m>and our projects tend to ebb and flow in terms of how much activity each has. at the moment, telegram has tons of work to do, roblox does too but only allows 1 concurrency (so scaling there means having lots of IPs to throw at it and not really beefier connections/individual boxes). urls has work but that project is a little special
14:51:31<nstrom|m>youtube and mediafire can use lots of bandwidth but right now don't have lots of items avail and/or we're rate limiting them so they're mostly idle atm (but a couple weeks ago were flying)
14:53:09<nstrom|m>from time to time shorter term projects pop up that can take lots of bandwidth/connections but a lot of this is just a slow steady burn unless you have a ton of IPs to throw at something
14:53:36<nstrom|m>sorry for wall of text. just had a large coffee and feeling inspired lol
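The per-project limits described above can be sketched as a small helper. This is only an illustration, assuming the rough figures quoted in the chat (roblox 1, telegram up to 6, youtube up to 20, a warrior cap of 6 and a per-container cap of 20). The names `pick_concurrency`, `plan`, and `SUGGESTED` are hypothetical and not part of any ArchiveTeam tool, and the fallback of 2 for unlisted projects simply mirrors the default JC mentioned.

```python
# Caps mentioned in the discussion: the warrior appliance tops out at
# 6 concurrent items, the per-project Docker images at 20 per container.
WARRIOR_MAX = 6
PROJECT_IMAGE_MAX = 20

# Rough per-project ceilings quoted in the chat; illustrative, not official.
SUGGESTED = {"roblox": 1, "telegram": 6, "youtube": 20}

# Hypothetical fallback for projects not listed above (the default of 2
# concurrent items that JC mentioned).
DEFAULT_CEILING = 2


def pick_concurrency(project: str, requested: int, standalone_image: bool = True) -> int:
    """Clamp a requested concurrency to the runner cap and the project's ceiling."""
    runner_cap = PROJECT_IMAGE_MAX if standalone_image else WARRIOR_MAX
    project_cap = SUGGESTED.get(project, DEFAULT_CEILING)
    return max(1, min(requested, runner_cap, project_cap))


def plan(projects, requested, standalone_image: bool = True):
    """Return one concurrency setting per project, one container per project."""
    return {p: pick_concurrency(p, requested, standalone_image) for p in projects}


if __name__ == "__main__":
    for name, value in plan(["roblox", "telegram", "youtube"], 20).items():
        print(f"{name}: concurrency {value}")
```

The point of the sketch is that asking for more concurrency than a project tolerates gains nothing; scaling comes from running more projects side by side (or, for strict sites, more IPs), not from a bigger number on one container.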
15:39:04<JC>nstrom|m - thanks for the explanation. I read somewhere that I should not be behind a VPN; is that still the case? Otherwise, I am a paying member of a VPN service and could scale horizontally that way.
15:40:13<JC>nstrom|m - I was also the happy user of the new VirtualBox BSOD error when running bridged networking... that took some time before I read the RED writing on the download page "DONT INSTALL NEWEST VERSION !!!!" =)
15:40:53<JC>nstrom|m - IP-wise, I am a home user, so I don't have a range of IPs I can bind to, only 1 per connection, and then there is the VPN question...
15:41:15<JC>by connection I mean Internet connection ;)
15:43:32<JC>nstrom|m - regarding Docker, I won't say I'm comfortable with it... I "don't really know what I'm doing", but I got it working :) So in that sense I'm doing OK, but I am by no means an expert on the subject.
15:44:02<@JAA>https://wiki.archiveteam.org/index.php/ArchiveTeam_Warrior#Can_I_use_whatever_internet_access_for_the_Warrior?
15:44:51<JC>thanks JAA, yes, I can see: "No VPNs. Data integrity is a very high priority for the Archive Team so use of VPNs with the official crawler is discouraged. Servers may also be more likely to deploy a rate limit or serve a CAPTCHA page when using a VPN which is unhelpful to archive."
15:45:06<JC>Then I guess horizontal scaling that way is not possible.
15:45:41<@JAA>Being an expert with containers is definitely not necessary for running projects that way. Source: me :-)
15:45:53<JC>hehe ;)
15:46:07<JC>I went the easy way, and ran them on a QNAP at each location :)
15:46:11<JC>VERY easy
15:48:02<JC>It's a bit CPU-intensive on the "lower tier" models, but they are rarely CPU-loaded.
15:50:07<@JAA>CPU load depends entirely on the specific project.
15:53:11<JC>Would you say Telegram is in the high end? Otherwise I need to get my Docker containers moved to "proper" hardware.
15:55:44<@JAA>No idea, I don't really pay attention to it. You might want to ask in the project channel.
15:59:11<JC>check. sorry about that.
18:45:46datechnoman quits [Quit: Ping timeout (120 seconds)]
18:47:30datechnoman (datechnoman) joins
19:23:22yano quits [Quit: WeeChat, the better IRC client, https://weechat.org/]
19:26:36yano (yano) joins
21:03:29Unholy23619 quits [Client Quit]
21:04:07Unholy23619 (Unholy2361) joins
21:16:40JC quits [Remote host closed the connection]
21:38:07Unholy23619 quits [Ping timeout: 272 seconds]
22:48:20JaffaCakes118 quits [Remote host closed the connection]
22:48:43JaffaCakes118 (JaffaCakes118) joins