00:00:55<mgrandi>I see a login page still
00:01:13<Ryz>But are there links to account pages with their galleries?
00:01:31<thuban>mgrandi: can you rephrase? i don't understand your first message
00:01:48<@OrIdow6>https://i6.photobucket.com/albums/y210/Planet-man/NguyensLOTFpart8.jpg still works for me
00:02:21<Ryz>OrIdow6, but is the account accessible still? I can browse their galleries?
00:02:37<mgrandi>For a DPoS project, if your worker claims a job, you claim it, but then if your worker crashes or encounters an error, it just sits at "out" and someone has to manually requeue it in the interface
00:02:39<@OrIdow6>IIRC there is some project that reports back to the tracker (#// or something), don't know exactly what it reports as I'm not privy to tracker stuff
00:02:54<mgrandi>Maybe this has improved recently yeah
00:03:40<mgrandi>I'm envisioning something like RabbitMQ where it will deliver jobs, and if the client doesn't check in and say "I'm still here and working on it", it gets requeued and sent to someone else
00:03:59<@JAA>mgrandi: Jobs get recycled automatically on most projects these days.
00:04:05<@JAA>Well, not quite, but close enough.
00:04:15<mgrandi>However if it has an error, the client says "finished with error X" and those can get pushed to another queue or retried or whatever
00:04:36<@JAA>What happens is that once there is no queue anymore, items that have been out for a while get handed out again.
00:04:40<@OrIdow6>Ryz: Ah, I see
00:04:50<mgrandi>Ah
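The recycling behaviour JAA describes above (once the queue is empty, items that have been out for a while get handed out again) can be sketched roughly as follows. This is a toy model, not the tracker's actual code: the `ItemQueue` class, its method names, and the `stale_after` threshold are all assumptions made for illustration.

```python
import time
from collections import deque


class ItemQueue:
    """Toy tracker queue: hands out todo items first, then recycles stale ones."""

    def __init__(self, items, stale_after=3600.0):
        self.todo = deque(items)
        self.out = {}  # item -> time it was last handed out
        self.stale_after = stale_after  # hypothetical threshold, in seconds

    def claim(self, now=None):
        """Return the next item for a worker, or None if nothing is available."""
        now = time.time() if now is None else now
        if self.todo:
            item = self.todo.popleft()
        else:
            # The queue is empty: recycle the item that has been out the longest,
            # but only if it has been out for at least stale_after seconds.
            stale = [i for i, t in self.out.items() if now - t >= self.stale_after]
            if not stale:
                return None
            item = min(stale, key=self.out.get)
        self.out[item] = now
        return item

    def done(self, item):
        """Mark an item as finished so it is never handed out again."""
        self.out.pop(item, None)
```

Unlike the heartbeat scheme mgrandi has in mind, this model needs no check-ins from the client; an item simply becomes eligible again after enough time has passed and the queue has run dry.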
00:05:12<mgrandi>I still feel there is use in having an error reporting mechanism that can be shown somewhere to the admins to see certain jobs that always fail
00:05:21<thuban>i see, thx. for some reason i thought we'd had automatic retry for years, but maybe that was manual and the old wiki pages just didn't use much detail
00:05:38<@JAA>Yeah, until last year or so, it was all manual.
00:06:17<@JAA>mgrandi: We have that on URLTeam, and yes, it's very useful.
00:06:38<Ryz>OrIdow6, there's still user-made image hosting, like https://i238.photobucket.com/albums/ff257/szubaark/Picture1530_zpsc77001c5.jpg - coming from https://www.amibay.com/showthread.php?50472-Commodore-64-Original-Software-Titles - but again, the accounts and their galleries are not accessible publicly
00:06:40<mgrandi>Is that automatic or does it have to sorta be manually done on the project code
00:06:51<Ryz>So yeah, archiving Photobucket stuff just became a lot harder...
00:07:14<@JAA>URLTeam has a completely different code base. There's basically a giant try-except that sends the exception + traceback to the tracker.
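The "giant try-except" pattern JAA mentions could look something like the sketch below. URLTeam's real code is not shown in this discussion, so everything here is an assumption: `run_with_error_report` is a made-up name, and `report` is a hypothetical callback standing in for whatever actually sends the data to the tracker.

```python
import traceback


def run_with_error_report(job, item, report):
    """Run job(item); on any exception, pass an error report to `report`.

    Returns the job's result, or None if the job raised.
    """
    try:
        return job(item)
    except Exception as exc:
        # Collect the exception and its full traceback as text so that the
        # receiving side can display it without access to this process.
        report({
            "item": item,
            "error": repr(exc),
            "traceback": traceback.format_exc(),
        })
        return None
```

In a test, `report` can simply be the `append` method of a list; in a real project it would presumably make a request to the tracker.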
00:07:18<atphoenix>thuban, I think I wrote the about hard-stops on docker containers
00:07:26<atphoenix>wrote the note*
00:07:37<mgrandi>Is that not using seesaw? Or a newer version of it?
00:07:37<atphoenix>I was working on the domains project, which had huge items
00:08:16<atphoenix>the point was a clean-stop was the safest answer to avoid the project missing a site or other data
00:08:25<mgrandi>For big items, I've noticed that when it encounters an error, it gives up and just kinda deletes the files, these should still be uploaded I think
00:08:48<atphoenix>because with a dirty stop, the retry process might not kick into effect until it was too late, i.e. after the domain was killed
00:08:53<mgrandi>Cause even if it's accidental, process crash, OS crash, power outage, etc
00:09:26<@JAA>mgrandi: Yeah, it uses seesaw, but the error reporting is in the project code.
00:09:39<thuban>uh, meanwhile,
00:09:41<atphoenix>the potential loss of already saved info could also happen with small items, but obviously it is easier to retry those in a reasonable time
00:09:41<thuban>JAA or somebody with equivalent privileges: can we get a toclimit? https://en.wikipedia.org/wiki/Template_talk:TOC_limit#Steps_to_limit_the_TOC_in_your_mediawiki
00:09:41pcr leaves
00:09:52<thuban>i'd do it myself but i don't have permission to edit the css
00:09:57<@JAA>Uploading data from crashed items will lead to all kinds of issues.
00:10:43<@JAA>thuban: Don't think I can either, but I've wished for that before, yeah. jrwr?
00:10:49<mgrandi>I dunno, I noticed it during the bitbucket one where crashing during a huge project download was a big waste, feel like improving stuff around that would be good
00:11:17<@JAA>Yes, fixing the crashes is good. Uploading faulty data that then potentially crashes stuff on the targets etc. is not.
00:12:12<mgrandi>Yeah, error reporting potentially being built into seesaw and the tracker infra would help stop jobs dying halfway through probably
00:12:40<@OrIdow6>Warriors shutting down like that is still pretty rare
00:12:46<@OrIdow6>As opposed to crashes for other reasons
00:13:23<atphoenix>so anyhow, my notes are mostly aimed at ensuring clean docker worker shutdowns. I.e. don't hardkill them when a clean stop can be used instead.
00:15:31<@JAA>Yeah, clean stops should be preferred, but a hard stop isn't a huge issue if necessary.
00:15:55<atphoenix>I suppose there may be some cases where the data saved from a dirty-ended worker might also be of interest to keep. I think we checked a few such cases for Yahoo Groups workers. Maybe that was because of how Yahoo itself was changing things on the fly as we were saving stuff. And it was often erroring.
00:16:35<atphoenix>(yahoo was erroring)
00:18:55pcr joins
00:19:09<Ryz>Is there an ArchiveTeam project for Photobucket?
00:19:53<thuban>https://wiki.archiveteam.org/index.php/Photobucket looks like no
00:25:26<purplebot>ArchiveTeam Warrior edited by Switchnode (-6652, clean up vm/docker coexistence …) just now -- https://www.archiveteam.org/?diff=46504&oldid=46503
00:29:39<Ryz>!ignore djqaqsdtvjqpc4a4zjecp0izi ^https?://(www|m|music|au|ca|de|es|gaming|fr|ie|il|it|jp|mx|nl|pl|uk)\.youtube\.com/watch\?
00:29:39<Ryz>!ignore djqaqsdtvjqpc4a4zjecp0izi ^https?://youtube\.com/watch\?
00:29:39<Ryz>!ig djqaqsdtvjqpc4a4zjecp0izi ^https?://youtu\.be/
00:29:41<Ryz>Oops
00:30:18<thuban>ok, made the big edit. in the end i removed the loads of detail on managing docker from the intro section, because it seemed like a serious turnoff for the newbs at whom the warrior is primarily aimed
00:31:15<thuban>plus (a) we don't include loads of detail on managing virtualbox there and if you deliberately choose to use the cli option you are bright enough to google the docs, (b) redundant section with the projects page presents awkward sync issues (i did link to it in several places), and (c) collapsing sections was a nice thought but they don't work without js
00:31:25<thuban>feel free to yell at me, change it back, etc
00:40:58<thuban>https://mommacomms.tumblr.com/post/646772288587513856/nug-juggler-demishock-k-vichan-k-vichan-take i hear ffn (which has been rotting for many years) is undergoing some technical changes soon
00:41:30<thuban>last AT scrape in 2012, last "nearly complete" scrape in 2015--time for a revisit?
00:45:29@Fusl quits [Excess Flood]
00:45:49Fusl (Fusl) joins
00:45:49@ChanServ sets mode: +o Fusl
00:56:26<purplebot>ArchiveTeam Warrior edited by Switchnode (+8, use correct vm name and consistent …) 22 minutes ago -- https://www.archiveteam.org/?diff=46505&oldid=46504
01:02:44dm4v quits [Read error: Connection reset by peer]
01:03:34dm4v joins
01:03:36dm4v quits [Changing host]
01:03:36dm4v (dm4v) joins
01:10:26<purplebot>Deathwatch edited by JustAnotherArchivist (-20, /* 2019 */ Link to 99.se page) just now -- https://www.archiveteam.org/?diff=46506&oldid=46492
01:19:26<purplebot>99.se edited by JustAnotherArchivist (+43, Link to my forums archive, clarify …) just now -- https://www.archiveteam.org/?diff=46507&oldid=46182
01:43:12<@JAA>s-crypt: Yes, of course AB goes into the WBM. That's the point really. :-P See also https://wiki.archiveteam.org/index.php/ArchiveBot
01:45:25Mineroboter_ joins
01:45:25<s-crypt>Is that some special permission granted to archivebot or the archiveteam group? or is it all uploaded WARCs
01:47:06Mineroboter quits [Ping timeout: 250 seconds]
01:49:28<@JAA>Only WARCs uploaded by whitelisted accounts get ingested.
01:54:59<s-crypt>Thanks for the answers! :)
01:57:29<s-crypt>Quick side question. Does the Warrior have the capability to switch desired (Archiveteam's pick) projects without restarting and getting a new docker image?
02:00:27sliccricc_ quits [Remote host closed the connection]
02:12:53<Hyenadae>I've seen it switch as long as you have the "preferred project" task selected (in the VM version )
02:13:22<Hyenadae>You can also switch between two tasks I guess and back to the preferred/current project to get it restarted on the latest thing
02:41:23fuzzy802 joins
02:41:23fuzzy8021 quits [Killed (NickServ (GHOST command used by fuzzy802!~fuzzy8021@173-224-26-244.ptcnet.net))]
02:41:24fuzzy802 is now known as fuzzy8021
02:41:26fuzzy8021 quits [Changing host]
02:41:26fuzzy8021 (fuzzy8021) joins
02:41:30fuzzy8021 quits [Excess Flood]
02:41:48fuzzy8021 joins
02:41:48fuzzy8021 quits [Changing host]
02:41:48fuzzy8021 (fuzzy8021) joins
03:25:33DopefishJustin quits [Remote host closed the connection]
03:29:21DopefishJustin joins
03:30:03YazofArc quits [Remote host closed the connection]
03:36:58qw3rty__ joins
03:40:47qw3rty_ quits [Ping timeout: 258 seconds]
03:50:35DogsRNice quits [Read error: Connection reset by peer]
04:05:33etnguyen03 quits [Client Quit]
04:06:14Jonboy345 quits [Read error: Connection reset by peer]
04:06:32Jonboy345 joins
04:19:07pawbs joins
04:19:07<tech234a>Made a few more improvements to the Warrior page
04:19:25<purplebot>ArchiveTeam Warrior edited by Tech234a (+158, Shorten the infobox at the top …) just now -- https://www.archiveteam.org/?diff=46508&oldid=46505
04:20:01<tech234a>I combined the Docker setup commands into a one-liner which should hopefully simplify the instructions a little bit
04:20:50<pawbs>Has anybody gotten the Warrior to work well with BSD userland tools yet? I remember running into problems with that for #tumbledown
04:27:50<tech234a>Well there's always the VM or Docker... you might be able to manually run projects without either of those but that is more complicated and requires you to get the needed dependencies yourself
04:29:11<pawbs>Yeah, I’ve used the VM before, it’s just that the machine I have available for warrior-ing is on FreeBSD and also like a decade old. VMs make it unhappy :(
04:29:49<tech234a>Some instructions for running projects manually are in project READMEs; for example: https://github.com/ArchiveTeam/periscope-grab#readme
04:33:29pawbs|2 joins
04:33:39pawbs quits [Client Quit]
04:33:55pawbs|2 is now known as pawbs
04:35:45<pawbs>I’ll give that a shot then, hopefully I can convince it to work well
04:35:50<pawbs>Thanks!
04:45:20@Fusl quits [Excess Flood]
04:45:26<purplebot>ArchiveTeam Warrior edited by Tech234a (+0, Correct capitalization of VBoxManage) 23 minutes ago -- https://www.archiveteam.org/?diff=46509&oldid=46508
04:45:37Fusl (Fusl) joins
04:45:37@ChanServ sets mode: +o Fusl
05:02:38Jonboy3451 joins
05:05:53Jonboy345 quits [Ping timeout: 258 seconds]
05:33:18pawbs quits [Ping timeout: 250 seconds]
06:29:00dewdrop quits [Remote host closed the connection]
06:29:44britm0b joins
06:32:54britmob quits [Ping timeout: 258 seconds]
06:43:07dewdrop (dewdrop) joins
06:45:19@Fusl quits [Excess Flood]
06:45:39Fusl (Fusl) joins
06:45:39@ChanServ sets mode: +o Fusl
06:46:03hooway joins
07:29:22Wayward (wayward) joins
07:34:33s-crypt quits [Remote host closed the connection]
07:34:33flashfire42 quits [Remote host closed the connection]
07:34:33kiska quits [Remote host closed the connection]
07:37:10LeighR (LeighR) joins
07:50:03jonboy3452 joins
07:53:24Jonboy3451 quits [Ping timeout: 258 seconds]
08:01:36britmob25 quits [Quit: britmob25]
08:22:49Arcorann_ joins
08:42:19themadpro_ is now known as themadpro
08:43:48<themadpro>Regarding Yahoo Answers, someone raised an interesting point on the IA Discord server:
08:43:56<themadpro>> The versions in German, French, Spanish, Italian, etc. are also shutting down (see https://de.answers.yahoo.com/, https://fr.answers.yahoo.com/). These should be archived too, shouldn't they?
08:44:26<themadpro>Probably not as big as English obviously, but are we CURRENTLY grabbing from the other localization endpoints as well?
08:44:47<themadpro>If not, can we start crawling?
08:45:49@Fusl quits [Excess Flood]
08:46:10Fusl (Fusl) joins
08:46:10@ChanServ sets mode: +o Fusl
08:49:27<themadpro>Preliminary analysis suggests that we are? https://github.com/ArchiveTeam/yahooanswers-grab/search?q=answers.yahoo.com https://usercontent.irccloud-cdn.com/file/EZvDUAF1/preliminary%20analysis.png
08:56:10EggplantN joins
08:56:28EggplantN quits [Changing host]
08:56:29EggplantN (EggplantN) joins
08:56:29@ChanServ sets mode: +o EggplantN
08:57:02Arcorann_ quits [Ping timeout: 258 seconds]
09:05:22Arcorann_ joins
09:07:39BlueMaxima quits [Read error: Connection reset by peer]
09:16:29s-crypt (s-crypt) joins
09:16:29flashfire42 (flashfire42) joins
09:17:05kiska (kiska) joins
09:20:13LeighR quits [Ping timeout: 244 seconds]
09:35:58nathan quits [Ping timeout: 250 seconds]
09:36:43nathan joins
09:50:31<AK>#noanswers is the place for the project, but I believe the plan is to grab all languages if possible. Code is just being finished up before we start at the moment (That code is from the 2017 grab and I don't think the new version has been pushed yet)
09:56:27Arcorann_ quits [Ping timeout: 258 seconds]
10:12:06<themadpro>Guess I will ask there as well then
10:14:28katocala quits [Ping timeout: 258 seconds]
10:42:10Arcorann_ joins
10:46:09@Fusl quits [Excess Flood]
10:46:28Fusl (Fusl) joins
10:46:28@ChanServ sets mode: +o Fusl
10:48:45@Fusl quits [Excess Flood]
10:49:02Fusl (Fusl) joins
10:49:02@ChanServ sets mode: +o Fusl
11:35:35Hyenadae quits [Ping timeout: 244 seconds]
11:45:46@Fusl quits [Excess Flood]
11:46:04Fusl (Fusl) joins
11:46:04@ChanServ sets mode: +o Fusl
12:10:40yanome quits [Quit: The Lounge - https://thelounge.chat]
12:10:48yanome (yano) joins
12:24:12ATG64 joins
12:25:28ATG64 quits [Remote host closed the connection]
12:43:47Arcorann (Arcorann) joins
12:46:13@Fusl quits [Excess Flood]
12:46:16Arcorann_ quits [Ping timeout: 258 seconds]
12:46:33Fusl (Fusl) joins
12:46:33@ChanServ sets mode: +o Fusl
12:49:29katocala joins
12:54:52britmob25 joins
13:26:30rewby quits [Ping timeout: 250 seconds]
13:26:43rewby (rewby) joins
13:42:45Arcorann_ joins
13:45:18Arcorann quits [Ping timeout: 258 seconds]
13:46:22@Fusl quits [Excess Flood]
13:46:41Fusl (Fusl) joins
13:46:41@ChanServ sets mode: +o Fusl
13:47:25@Fusl quits [Client Quit]
13:47:32Fusl (Fusl) joins
13:47:32@ChanServ sets mode: +o Fusl
13:53:44katocala quits [Ping timeout: 258 seconds]
13:54:07katocala joins
14:02:18LeGoupil joins
14:38:30kyilani joins
14:43:51kyilani quits [Remote host closed the connection]
14:56:32Arcorann (Arcorann) joins
14:59:14Arcorann_ quits [Ping timeout: 250 seconds]
15:20:56themadpro quits [Read error: Connection reset by peer]
15:24:21@HCross quits [Read error: Connection reset by peer]
15:25:00themadpro (themadpro) joins
15:25:01HCross (HCross) joins
15:25:01@ChanServ sets mode: +o HCross
15:29:11etnguyen03 (etnguyen03) joins
15:41:50LeighR (LeighR) joins
15:58:03<thuban>tech234a: thanks! that was a good idea
16:01:41<thuban>couple of concerns: (1) did you intend to delete the 'using the web interface' section? & (2) it might not be obvious from the link in the setup section that the running-projects-with-docker page is about running _individual_ projects, not the warrior (in the sense that the warrior page uses it (we might need some new terminology))
16:03:15<thuban>i think i'll probably just add a line of explanation in re the latter; shouldn't be too much
16:09:06<thuban>(on linux vboxmanage and VBoxManage are both symlinked to the same bin, lol. i take it that's not the case for windows?)
16:25:32Arcorann quits [Ping timeout: 258 seconds]
16:39:35emerald (emerald) joins
16:46:37LeGoupil quits [Ping timeout: 258 seconds]
16:48:07Daloader_ joins
17:04:40LeGoupil joins
17:05:00brgtt joins
17:08:30brgtt leaves
17:10:23LeighR quits [Ping timeout: 244 seconds]
17:13:12onetruth joins
17:15:28brgtt joins
17:45:36forkwhilefork (forkwhilefork) joins
17:49:45<atphoenix>I think a table of "ways to assist with AT archiving projects" could contain 3 or so columns to compare the methods:
17:50:24<atphoenix>1.) VM Warrior 2.) Docker-warrior 3.) Docker-direct projects
17:52:18AlsoHP_Archivist joins
17:55:36HP_Archivist quits [Ping timeout: 250 seconds]
17:56:57brgtt quits [Client Quit]
18:02:23AlsoHP_Archivist quits [Read error: Connection reset by peer]
18:02:50AlsoHP_Archivist joins
18:11:16spirit joins
19:00:50LeGoupil quits [Client Quit]
19:02:20pcr leaves
19:03:01pcr joins
19:05:09brgtt joins
19:11:29brgtt quits [Client Quit]
19:21:02Jonboy3451 joins
19:24:33jonboy3452 quits [Ping timeout: 258 seconds]
19:38:21spirit quits [Client Quit]
19:44:20brgtt joins
20:01:36jonboy3452 joins
20:04:48Jonboy3451 quits [Ping timeout: 258 seconds]
20:10:36<tech234a>thuban: Thanks! I liked your edits too. As for (1) I deleted the 'using the web interface' section since it was unnecessary: when you open your browser to the control panel on a new Warrior, you are immediately taken to the screen to set your username, and once you save your username, you are immediately taken to the project list. I figure that we didn't need instructions for that. As for (2) if you are referring to the link labelled "here" then yeah, I think that might be a little unclear. (In general, labelling links "here" or "click here" is not a best practice.) Also something else that should be considered: while I agree the recommended method for shutting down the Warrior should be through the web interface, currently the Docker container will automatically restart after being shut down this way because of `--restart=unless-stopped`. Perhaps the Docker container could be updated to access the Docker socket from the host to stop itself when the shutdown button is used?
20:11:26<purplebot>ArchiveTeam Warrior edited by Tech234a (+6, Un-abbreviate --volume in Docker …) just now -- https://www.archiveteam.org/?diff=46510&oldid=46509
20:12:47billy549 quits [Remote host closed the connection]
20:13:58<thuban>tech234a: gotcha. i didn't know the docker container couldn't shut itself down that way--seems like a good feature to add
20:14:59<tech234a>Yeah, alternatively we could consider changing the restart policy, but I don't think there are any other ones that fit what we want
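To make the restart-policy point concrete, here is a small model of how Docker's documented restart policies treat a container that has just exited. It is only a sketch: `restarts_after_exit` is a made-up helper, it covers just the case where the Docker daemon keeps running, and it ignores retry limits. Which exit status the Warrior container returns after a web-interface shutdown is not stated in the discussion, so the `exit_code` argument is an assumption.

```python
def restarts_after_exit(policy, exit_code, stopped_by_user=False):
    """Return True if Docker would restart a container that just exited.

    Models only the behaviour while the daemon keeps running; daemon
    restarts and on-failure retry limits are deliberately left out.
    """
    if stopped_by_user:
        # An explicit `docker stop` is not undone by any policy while the
        # daemon keeps running.
        return False
    if policy == "no":
        return False
    if policy == "on-failure":
        # Only a non-zero exit status counts as a failure.
        return exit_code != 0
    if policy in ("always", "unless-stopped"):
        # A container that exits on its own is restarted regardless of status.
        return True
    raise ValueError(f"unknown restart policy: {policy}")
```

Under this model a container that shuts itself down is always brought back with `unless-stopped`, which matches what tech234a reports. Only `on-failure` would leave a cleanly exited container stopped, and that policy has its own trade-offs.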
20:20:43billy549 (Billy549) joins
20:22:22<tech234a>oh and as for VBoxManage: perhaps the lowercase version does work, but pretty much everywhere online uses that capitalization so I figure that is the standard way to run it
20:23:33<thuban>sounds good (i sure don't feel like digging out a windows box to test it!)
20:34:17pcr leaves
20:38:47pcr joins
20:58:02Daloader_ quits [Ping timeout: 250 seconds]
21:35:01<tech234a>We got called archive.org again (see last paragraph) https://www.reviewgeek.com/76740/yahoo-answers-no-more-the-qa-platform-shuts-down-may-4th/
22:08:35hooway quits [Client Quit]
22:22:07brgtt quits [Client Quit]
22:38:59brgtt joins
22:40:21brgtt quits [Client Quit]
23:21:13BlueMaxima joins
23:21:53notbasetwo (notbasetwo) joins