02:58:48DogsRNice quits [Read error: Connection reset by peer]
02:59:02DogsRNice joins
03:07:35DogsRNice quits [Read error: Connection reset by peer]
03:34:00michaelblob quits [Quit: yoop]
03:37:40michaelblob joins
04:24:38wotd joins
04:28:02__wotd__ quits [Ping timeout: 256 seconds]
05:27:55igloo222259 quits [Quit: The Lounge - https://thelounge.chat]
05:28:33igloo22225 (igloo22225) joins
05:32:18igloo22225 quits [Client Quit]
05:32:49igloo22225 (igloo22225) joins
07:01:03sg72 joins
07:01:36sg-72 quits [Ping timeout: 256 seconds]
11:34:44Matthww quits [Ping timeout: 256 seconds]
11:38:50Matthww joins
15:48:04DogsRNice joins
16:22:44Noble_Fish joins
16:29:41<Noble_Fish>Hello, can someone explain to me the process mentioned by the [saveweb/wikiteam3 project](https://github.com/saveweb/wikiteam3/?tab=readme-ov-file#for-webmaster):
16:29:41<Noble_Fish>> We archive every MediaWiki site yearly and upload to the Internet Archive.
16:29:41<Noble_Fish>What is the process like?
16:29:41<Noble_Fish>The wiki I am in is called [ModEnc](https://modenc.renegadeprojects.com/), but following the link at the bottom of `README.md` to subject:wikiteam3, I could only find a backup from 2024-01-21 when [searching in it](https://archive.org/search?query=subject%3Awikiteam3+ModEnc).
16:29:41<Noble_Fish>Has wikiteam been unable to update the backups over the past two years, due to insufficient resources or process failures? Or does 'yearly' here not mean backing up every MediaWiki site on a yearly cycle?
16:30:21<justauser>Without saying anything about STWP activities:
16:30:52<justauser>One year is the minimum period. We do a new archive earlier only for some significant reason.
16:31:06<Noble_Fish>okay
16:31:24<justauser>The system is automated but not fully automatic - each new dump is initiated by a human.
16:32:07<justauser>This happens either if a member was browsing, encountered a wiki and decided to check
16:32:30<justauser>or by random sampling from previously saved wikis (pabs should know more)
16:32:40<justauser>or by request from someone.
16:33:39<justauser>Want to have a new dump created?
16:34:48<Noble_Fish>I thought there would at least be some periodic backup plan, haha.
16:34:48<Noble_Fish>I guess the act of uploading backup files to IA also needs to be done manually?
16:35:35<justauser>If everything works well, single message to an adjacent channel (#wikibot) does all the magic, including the upload.
16:36:56<justauser>If not - something is borked, bot defenses try to stop us, etc - a human is summoned to do it step-by-step.
16:42:12<Noble_Fish>Thank you for your guidance. I will try it after learning more about it. :)
16:44:14<justauser>Actually, started a dump right now.
16:44:33<justauser>You can see its barf at https://wikibot.digitaldragon.dev/
16:49:39<Noble_Fish>Yes, I had already noticed. I was originally looking up relevant materials because I didn't understand the instructions, but then a complete demonstration immediately appeared in the #wikibot channel. It's really delightful! Thank you again!
16:54:04<justauser>The documentation about WikiBot is at https://wiki.archiveteam.org/index.php/Wikibot - not that you need it, unless you intend to stay here as a team member :).
16:55:49<justauser>The documentation about the software itself is embedded - run "wikiteam3dumpgenerator --help".
17:04:14<Noble_Fish>I have followed the prompts on the Wikibot Dashboard and read some of the WikiBot documentation. To be honest, though, there aren't many wikis I actively maintain or even just monitor. At most I might need to back up 1-3 specific wikis once per year, so I don't think I'd contribute much to the main work of this
17:04:14<Noble_Fish>project. It seems best to base my future use of wikibot on that example.
17:09:07<justauser>1. The WT3DG itself is used in a shell, not in IRC.
17:09:27<justauser>2. Perfectly sensible decision. Come back when you have something new, then.
17:10:12<justauser>3. A shortcut to check when was something last saved: https://archive.org/search?query=originalurl%3A%28*wiki.example.com*%29
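The "when was this wiki last saved" shortcut above can also be done programmatically against the Internet Archive's advancedsearch JSON endpoint. A minimal sketch, using the same `originalurl:(*...*)` query syntax as the search URL; the host name is a placeholder:

```python
# Build an archive.org advancedsearch URL that lists dumps whose
# originalurl matches a given wiki host, newest upload first.
from urllib.parse import urlencode

def last_dump_query(wiki_host: str) -> str:
    """Return an advancedsearch URL for dumps of the given wiki host."""
    params = {
        "q": f"originalurl:(*{wiki_host}*)",  # same query as the web search
        "fl[]": "identifier",                 # only return item identifiers
        "sort[]": "addeddate desc",           # newest upload first
        "rows": "5",
        "output": "json",
    }
    return "https://archive.org/advancedsearch.php?" + urlencode(params)

print(last_dump_query("wiki.example.com"))
```

Fetching that URL returns JSON whose `response.docs` entries name the matching IA items.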
17:13:29<Noble_Fish>It seems that only team members have permission to operate Wiki Bot, while others need to send requests to team members? I also want to back up cnc.fandom.com.
17:13:35<klea>I wonder about adding an extra option to the wikiteam tools that, after X failed attempts at getting a url, automatically marks it as failed and moves it to some other list.
17:13:41<justauser>Already running for you.
17:14:01<Noble_Fish>Thanks!
17:14:04<klea>s/url/image url/
17:14:07<justauser>klea: DWD has --ignore-errors, but WB doesn't expose it.
17:14:27<klea>Huh, so if I put --ignore-errors I don't have to keep removing URLs from the image list?
17:14:36<justauser>For DW only.
17:14:55<klea>Oh, it'd be neat if that was implemented in the MW one.
17:15:17<justauser>Page URLs are also fail-able when running --xml or --xmlapiexport, FWIW.
17:15:28<justauser>Only listing is on the critical path.
17:16:08<justauser>PR's welcome, I guess.
17:16:47<justauser>If you make one, keep in mind that for some cases, there is a ready behavior for error handling, but 5xx doesn't invoke it.
17:17:45<justauser>For listers: reducing the page size; for single-page export: trying fewer revisions, then skipping; for images: retrying once more, then skipping.
17:18:14<justauser>But currently, only specific errors provoke those handlers.
17:18:58<justauser>Lister wants a specific message, exporter trips on "HTTP 200 but invalid XML", imager is OK with 4xx but not with 5xx.
17:19:23<justauser>5xx seems to be generally intercepted by lower-level retry logic, which is stupid.
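The recovery behaviors described above (lister: shrink the page size; exporter: fewer revisions, then skip; imager: retry once, then skip) can be sketched as a single dispatch. This is illustrative pseudocode in Python, not actual wikiteam3 code; the point is that a 5xx would reach these component-specific handlers instead of being absorbed by low-level retry logic:

```python
# Illustrative sketch (not wikiteam3's real implementation) of routing a
# failed request to the recovery behavior of the component that made it.

def handle_error(component: str, status: int, state: dict) -> str:
    """Return the recovery action for a failed request in `component`."""
    if component == "lister":
        # Listers recover by asking for fewer titles per request.
        state["page_size"] = max(1, state.get("page_size", 500) // 2)
        return "retry-smaller"
    if component == "exporter":
        # Single-page export tries fewer revisions at once, then skips.
        if state.get("revision_limit", 0) > 1:
            state["revision_limit"] //= 2
            return "retry-fewer-revisions"
        return "skip-page"
    if component == "imager":
        # Images get one more retry, then are skipped.
        if not state.get("retried"):
            state["retried"] = True
            return "retry-once"
        return "skip-image"
    return "abort"

# A 5xx on an image URL: retry once, then skip.
s = {}
print(handle_error("imager", 503, s))  # retry-once
print(handle_error("imager", 503, s))  # skip-image
```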
17:22:27<klea>Somewhat confusing, but yay I guess.
17:25:15<justauser>I think I once tried to run things through two nc's and sed to force every answer to be HTTP 200.
17:25:21<justauser>Didn't work for some reason.
17:44:25Noble_Fish quits [Client Quit]
17:48:05<klea>Time to write a script to automate doing what I do in tmux, which is go to the panel down, make sed remove the line that made it fail, then go arrow up and enter on the other terminal :p
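The manual tmux dance above (drop the line that failed, rerun) is easy to script. A minimal sketch; the list file name and URLs are placeholders, not actual wikiteam3 paths:

```shell
# Remove a failing URL from an image list, then the dump can be rerun.
LIST="images.txt"                                          # hypothetical list file
FAILED_URL="https://wiki.example.com/images/broken.png"    # the URL that failed

# Demo list with one good and one failing entry.
printf '%s\n' \
  "https://wiki.example.com/images/ok.png" \
  "$FAILED_URL" > "$LIST"

# grep -vF drops the exact offending line and keeps everything else.
grep -vF "$FAILED_URL" "$LIST" > "$LIST.tmp" && mv "$LIST.tmp" "$LIST"

# Then rerun the dump in the other pane, e.g. with tmux send-keys.
```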
18:58:13Webuser086539 joins
19:06:02Webuser086539 quits [Client Quit]