00:24:27<nstrom|m>thought the site was staying up and just the API was going away, so not really anything to archive there besides documentation
00:31:18s-crypt quits [Quit: Ping timeout (120 seconds)]
00:31:23Ryz2 quits [Quit: Ping timeout (120 seconds)]
00:31:28s-crypt (s-crypt) joins
00:31:35Ryz2 (Ryz) joins
00:34:39lennier2_ joins
00:37:41lennier2 quits [Ping timeout: 272 seconds]
00:47:14Island joins
00:47:30benjins3 quits [Remote host closed the connection]
00:47:48benjins3 joins
00:50:49Notrealname1234 (Notrealname1234) joins
00:53:43<eggdrop>[remind] OrIdow6: add LJ (https://bsky.app/profile/rahaeli.bsky.social/post/3mbebi2xfxc25) to https://wiki.archiveteam.org/index.php/Shutdown_rumors,_hoaxes,_and_scares while at it
00:53:44<eggdrop>[remind] OrIdow6: add realkalos to hoaxes
00:55:29Notrealname1234 quits [Client Quit]
00:57:51Notrealname1234 (Notrealname1234) joins
00:58:01Notrealname1234 quits [Client Quit]
01:20:27<DogsRNice>the API going away probably isn't a good sign for the long-term health of the site
01:35:22<@JAA>That, and it might be easier to discover all contents through the API.
01:42:42sec^nd quits [Remote host closed the connection]
01:43:04sec^nd (second) joins
01:48:25<nicolas17>JAA: well that could be tricky
01:48:42<katia>it would take some tenor
01:48:52<nicolas17>if I understood correctly, the API will be shut down in June for existing API users, but it's already not possible to get a new API key
01:49:07<katia>find one somewhere
02:05:58KerwoodDerby6 joins
02:08:59<KerwoodDerby6>Is anyone here concerned with piano roll archiving?
02:09:24<nicolas17>like physical rolls?
02:09:56<KerwoodDerby6>Well, mostly their digitized form
02:10:05opl9 (opl) joins
02:12:20opl quits [Ping timeout: 256 seconds]
02:12:20opl9 is now known as opl
02:13:06<KerwoodDerby6>20 years ago I became part of a very small community which built their own digitization mechanisms to scan piano rolls and has since scanned over 8,000 rolls. I ceased scanning a couple of years ago, but I have info about the file formats we used and the means of converting them to MIDI format.
02:14:07<KerwoodDerby6>I think there might be about 20,000-30,000 rollscans in the world right now
02:15:12<KerwoodDerby6>The best archive of this to date is http://www.pianorollmusic.org/rolldatabase.php
02:15:33<KerwoodDerby6>which I think is suboptimal for the future
02:16:51<nicolas17>many items there have no links, how does that work?
02:17:22<nicolas17>does that mean someone has the roll but didn't scan it? or didn't upload the scan?
02:18:57<KerwoodDerby6>Annoyingly, the titles which have not entered the US public domain will not be linked. Also, titles which had no publication date when scanned cannot be assigned a public domain date
02:19:23<nicolas17>I see
02:20:02<KerwoodDerby6>That said, I have a back-channel to access the files as necessary. My query relates to any larger attempt to once-and-for-all get these rolls archived
02:21:12<nicolas17>we could archive this website and all files in it, yes
02:22:55<nicolas17>technical notes for the archivists reading: we can't just archivebot the front page and let it crawl because rolldatabase pagination uses form controls rather than links, but it's all GET, not POST
02:24:00<nicolas17>so I guess we can make a list of all the page numbers and sortby params, and feed that into !a<
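A minimal sketch of generating that seed list along the lines nicolas17 describes; the page count and the sortby values other than "catalog" (the only one confirmed by a URL later in this log) are placeholders to check against the live site:

```python
# Emit one GET URL per (showpage, sortby) combination so ArchiveBot's !a <
# can fetch every listing page directly instead of crawling the <form>.
# PAGE_COUNT and most SORT_KEYS are assumptions, not values from the site.
PAGE_COUNT = 30
SORT_KEYS = ["catalog", "title", "composer"]  # "catalog" appears in the log; the rest are guesses

base = "https://www.pianorollmusic.org/rolldatabase.php"
with open("rolldatabase-pages.txt", "w") as f:
    for sortby in SORT_KEYS:
        for page in range(1, PAGE_COUNT + 1):
            f.write(f"{base}?showpage={page}&sortby={sortby}\n")
```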
02:26:39<KerwoodDerby6>Of course, it gets more complicated -- these rolls have holes punched in them, and the holes have different meanings depending on where on the roll they get punched, but when they are reduced to MIDI that information is lost. There is an intermediate file format (".CIS") which captures the physical, longitudinal image of the roll, which also has
02:26:39<KerwoodDerby6>archival value, but does anyone want to take ownership of such a fiddly, historically-particular, ancient digital music storage medium?
02:28:07<KerwoodDerby6>There were rolls for ordinary home listening, and there were other roll formats which supported expressive musical performance for serious pieces
02:28:39<nicolas17>what do you mean by "take ownership"?
02:29:31<KerwoodDerby6>I guess I mean that a proper archive of piano rolls should probably be curated, and that's asking a lot more than simply copying files
02:29:57<nicolas17>is the IAMMP website still being "maintained"?
02:30:20<KerwoodDerby6>not really, the maintainer is an IT guy for the local public schools
02:31:29<KerwoodDerby6>I just thought I should bring it up here since piano rolls are mentioned in the file-formats part of the wiki
02:35:45<KerwoodDerby6>Well, at least this can be a start of discussion for now. I ran across archiveteam over a year ago and thought that someday I should ask them about piano rolls, which was today, so there's that.
02:35:46sknebel (sknebel) joins
02:42:15<@JAA>I think it's worth preserving the digital data pre-MIDI-conversion. I'm sure the Internet Archive wouldn't mind accepting such uploads, perhaps as one item per roll containing CIS, MIDI, and (if available) photos of the packaging etc. They'd almost certainly also take the rolls that aren't public domain yet (and just not make them publicly accessible if necessary).
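A minimal sketch of what such a per-roll item upload could look like with the internetarchive Python library (after `pip install internetarchive` and `ia configure`); the identifier scheme, collection, and metadata fields here are assumptions, not an agreed convention:

```python
from internetarchive import upload

# One item per roll, bundling the raw scan (.CIS), the derived MIDI, and any
# photos of the box or label. Identifier and metadata values are illustrative only.
roll_id = "pianoroll-example-12345"  # hypothetical identifier
files = ["12345.cis", "12345.mid", "12345-box.jpg"]
metadata = {
    "title": "Example piano roll 12345",
    "mediatype": "audio",
    "collection": "opensource_audio",  # whichever collection IA assigns
    "subject": "piano roll; CIS scan",
}
upload(roll_id, files=files, metadata=metadata)
```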
02:48:14<ats>It's a similar kind of deal to archiving computer magtapes/papertapes - it's worth keeping the raw format because there's extra information there, even if most people will use a derivative format.
02:49:01<ats>(Or, for a more dramatic example, RF captures of LaserDiscs, which are about 100 times bigger than the video file you end up with after decoding - there are plenty of those on IA...)
02:51:49<@JAA>Same with floppies and archiving the full magnetic scan with a KryoFlux or similar.
02:51:56<pokechu22>Hmm, for US copyright law, do piano rolls count as a recording, or a composition? (I guess, for that matter, how does sheet music count?) I know there are two types of music copyrights but haven't really looked into it beyond that
02:53:26<@JAA>That part sounds like a fun rabbit hole: https://en.wikipedia.org/wiki/White-Smith_Music_Publishing_Co._v._Apollo_Co.?useskin=vector
03:01:15nexussfan quits [Quit: Konversation terminated!]
03:03:03nexussfan (nexussfan) joins
03:38:11jinn6 quits [Ping timeout: 272 seconds]
03:42:18Hackerpcs quits [Quit: Hackerpcs]
03:45:30<@JAA>https://www.opendiary.com/ is back online, by the way. Still with the shutdown notice for 2026-01-31.
03:50:32jinn6 (jinn6) joins
04:03:30nine quits [Quit: See ya!]
04:03:43nine joins
04:03:43nine quits [Changing host]
04:03:43nine (nine) joins
04:06:49Island quits [Read error: Connection reset by peer]
04:07:26beardicus1 (beardicus) joins
04:09:51beardicus quits [Ping timeout: 272 seconds]
04:09:51beardicus1 is now known as beardicus
04:16:36<h2ibot>PaulWise edited Finding subdomains (+97, status page strategy): https://wiki.archiveteam.org/?diff=60109&oldid=58320
04:23:27Hackerpcs (Hackerpcs) joins
04:36:04DogsRNice quits [Read error: Connection reset by peer]
04:36:51khaoohs quits [Read error: Connection reset by peer]
05:13:40<pabs>KerwoodDerby6: I note that the piano rolls site has an open directory, so looks like we can save everything including non-PD stuff https://www.pianorollmusic.org/html
05:13:52<pabs>https://www.pianorollmusic.org/html/tsmythe/midifiles/NonPDfiles/
05:14:16beastbg8 quits [Read error: Connection reset by peer]
05:16:24<pabs>KerwoodDerby6: should I just start an ArchiveBot job for https://www.pianorollmusic.org/html? /cc nicolas17 JAA ats pokechu22
05:17:03beastbg8 (beastbg8) joins
05:17:35<nicolas17>that would get all the raw data yeah, but not the listings
05:18:11<nicolas17>to get the lists, as I said, we'd need to make a list of all the pages because AB can't crawl that
05:18:42<nicolas17>(and exclude /html to avoid duplication with your job now :p)
05:19:00<pabs>it will get / and then /rolldatabase.php, where are the listings?
05:20:00<pabs>another open dir is https://www.pianorollmusic.org/design/
05:20:29<pabs>so probably want to do the sitemap trick I guess
05:21:13<pabs>oh, I see https://www.pianorollmusic.org/rolldatabase.php?showpage=26&sortby=catalog
05:21:24<nicolas17>pabs: I mean rolldatabase.php, the pagination uses a <form> so AB won't crawl it
05:21:24<pabs>hmm I thought AB could find those
05:21:35<nicolas17>well
05:21:47<nicolas17>it's not an <a href="?page=2">Next</a> so I *assumed* it won't crawl
05:21:59<nicolas17>if AB is smart enough to parse forms then all the better
05:23:51<pokechu22>It parses <form action=/rolldatabase.php> to https://www.pianorollmusic.org/rolldatabase.php but it won't fill in the rest
05:26:19<pokechu22>https://transfer.archivete.am/inline/PdSFb/www.cagematch.net_seed_urls.txt might be suitable for inspiration (though that site was a bit more complicated)
05:26:22<pabs>ok, sitemap trick then
05:27:01<pabs>ah no, not needed since everything is at the top level
05:27:40<pokechu22>(and that site actually *does* use href for pagination, but not for selecting months/years; I listed all pages mainly so that it would get information about everything at the same depth)
05:28:21cyanbox_ quits [Read error: Connection reset by peer]
05:28:34<nicolas17>pabs: https://transfer.archivete.am/inline/BYwFH/pianoroll-rolldatabase.txt
05:32:24cyanbox joins
05:33:03Guest58 joins
05:33:58<pabs>nicolas17++
05:33:59<eggdrop>[karma] 'nicolas17' now has 24 karma!
05:34:06<pabs>will run https://transfer.archivete.am/VYr9M/www.pianorollmusic.org-open-dirs-and-rolldatabase.php-all-pages.txt
05:34:06<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/VYr9M/www.pianorollmusic.org-open-dirs-and-rolldatabase.php-all-pages.txt
05:36:19<nicolas17>I think everything else is reachable from /
05:36:28<nicolas17>so that's good
05:37:18<pabs>KerwoodDerby6: going to run https://www.mmdigest.com/ too
05:39:06<@arkiver>justauser: yes on blogs.sapo.pt. that will run a bit closer to the deadline, not yet now
05:39:15<pabs>I couldn't find any other open dirs on pianorollmusic.org btw
05:39:59<@arkiver>https://icosa.gallery/ is nice
05:41:03<@arkiver>so tenor shutdown is not official right?
05:41:08<@arkiver>oh
05:41:32<@arkiver>nvm
05:41:44<nicolas17>arkiver: API shutdown is official, "if they kill the API I bet they'll kill the entire website by next year" is speculation based on Google's track record
05:44:35<pabs>KerwoodDerby6: also https://www.spencerserolls.com/ https://jeffbourdier.github.io/
05:44:43<@arkiver>time for tenor
05:44:47<@arkiver>thanks nicolas17
05:44:51<@arkiver>any ideas for a channel name?
05:45:05<nicolas17>also using the API will be tricky
05:45:17<Hans5958>end of tenure idk
05:45:25<@arkiver>deadline june 30
05:45:33<nicolas17>afaik the API will work until June 30, but you can no longer get new API keys
05:45:45<@arkiver>would someone be able to add that to the deathwatch page?
05:45:55<nicolas17>so where will we find an API key we can abuse
05:46:16<@arkiver>i was thinking more about backing up the entirety of tenor with June 30 as deadline
05:46:25<nicolas17>via website?
05:46:32<@arkiver>yes and API as much as possible
05:46:44<@arkiver>need to look into how tenor is used in other services, if it uses API, etc.
05:46:50<@arkiver>we'd archive the public facing part
05:49:20<nicolas17>hmm... new game launched today https://hytale.com/, a friend was looking to archive every build to avoid what happened with minecraft, where an early build was lost media until someone randomly found it in an old PC backup
05:51:23<nicolas17>there's currently 2 builds but idk how often they'll be patching it
05:52:28<nicolas17>and it's already 1.5GiB x {v1,v2} x {windows,linux,mac} x {release,pre-release} = 18GiB... this could grow fast :p
05:54:23michaelblob quits [Quit: yoop]
05:55:05michaelblob joins
06:00:03<pabs>KerwoodDerby6: btw, the AB job for www.pianorollmusic.org is getting some 404s on .mid files, I think there are some typoed links. we will get the real files though
06:05:07sg72 quits [Ping timeout: 272 seconds]
06:07:15sg72 joins
06:08:50<h2ibot>PaulWise edited Category:Software archiving (+60, add Software Preservation Society): https://wiki.archiveteam.org/?diff=60110&oldid=58332
06:18:13nexussfan quits [Quit: Konversation terminated!]
06:53:07<pabs>JAA: canonical.com pinged me again, could you send them the relevant AB/etc IP addresses?
06:57:16<pabs>forwarded the mail to AT
06:58:18<@JAA>pabs: Oops, right, ack
07:03:23midou quits [Ping timeout: 272 seconds]
07:20:29Chris5010 quits [Ping timeout: 272 seconds]
07:35:34Chris5010 (Chris5010) joins
07:37:03midou joins
07:39:28mrminemeet joins
07:40:26mrminemeet_ quits [Ping timeout: 256 seconds]
07:42:01midou quits [Ping timeout: 272 seconds]
07:44:55benjins3 quits [Read error: Connection reset by peer]
07:58:16midou joins
08:13:02<h|ca2>nicolas17: can you deduplicate builds somehow?
08:37:32midou quits [Read error: Connection reset by peer]
08:47:14midou joins
08:51:16ducky_ (ducky) joins
08:54:06ducky quits [Ping timeout: 256 seconds]
08:54:06ducky_ is now known as ducky
09:00:21linuxgemini quits [Quit: Ping timeout (120 seconds)]
09:00:34linuxgemini (linuxgemini) joins
09:11:30Lord_Nightmare quits [Quit: ZNC - http://znc.in]
09:12:15Webuser899855 joins
09:12:28Webuser899855 quits [Client Quit]
09:26:24Shard1 quits [Ping timeout: 256 seconds]
09:30:15Webuser659585 joins
09:31:58Webuser659585 quits [Client Quit]
09:33:12Webuser000000001 joins
09:34:43Webuser000000001 quits [Client Quit]
09:34:54nathang2184 quits [Ping timeout: 256 seconds]
09:34:54Webuser288239 joins
09:35:23<Webuser288239>Hello. A famous Japanese Pokémon BBS will be shutting down tomorrow (2026/01/15 JST). Can anyone help?
09:35:23<Webuser288239>https://pokemonbbs.com/post/
09:36:12Webuser288239 leaves
09:44:23nathang2184 joins
09:55:21chunkynutz60 quits [Quit: The Lounge - https://thelounge.chat]
10:01:21midou quits [Ping timeout: 272 seconds]
10:02:01MrMcNuggets (MrMcNuggets) joins
10:19:35cyanbox quits [Read error: Connection reset by peer]
10:22:32cyanbox joins
10:24:11chunkynutz60 joins
10:32:32midou joins
10:39:30midou quits [Ping timeout: 256 seconds]
10:42:47midou joins
10:47:07Notrealname1234 (Notrealname1234) joins
10:47:34Notrealname1234 quits [Client Quit]
10:52:28Dada joins
11:06:00cyanbox_ joins
11:06:17Webuser304965 joins
11:09:07cyanbox quits [Ping timeout: 272 seconds]
11:15:31Webuser304965 quits [Client Quit]
12:00:01Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat]
12:00:30benjins3 joins
12:02:45Bleo182600722719623455222 joins
12:20:08<KerwoodDerby6>pabs: OK, thanks, that's a good start. I'm new at this, so I don't know where the bot is putting stuff, and so can't comment on the results so far.
12:21:48<KerwoodDerby6>pabs: MMDigest is a good resource with high SNR, in spite of the age of the information
12:30:54<KerwoodDerby6>pokechu22: As I understand it, piano rolls are categorized as "musical recording", and the copyright remains in force for 95 years in the US
12:41:06Ointment8862 (Ointment8862) joins
12:46:31midou_ joins
12:46:38midou quits [Read error: Connection reset by peer]
12:46:39midou_ is now known as midou
12:50:55<pabs>KerwoodDerby6: the piano rolls job finished, it will get uploaded to archive.org in a few days, and eventually get indexed into web.archive.org (they have a large backlog though)
12:51:43midou quits [Ping timeout: 272 seconds]
12:55:27KerwoodDerby6 quits [Quit: Ooops, wrong browser tab.]
12:55:31SootBector quits [Remote host closed the connection]
12:56:40SootBector (SootBector) joins
12:57:16midou joins
12:57:32KerwoodDerby joins
13:07:57<cruller>According to https://pokemonbbs.com/bbs.html, “I intend to keep the logs for at least several years.”
13:25:42<Dango360>wiki idea: cdx summaries of finished DPoS projects https://github.com/internetarchive/cdx-summary
13:29:24Webuser426147 joins
13:29:47Webuser426147 quits [Client Quit]
13:38:12<justauser>JAA: ran an AB from root, but this smells like an !a < and I don't even know what to use for starting points.
13:44:12<justauser>For Tenor, we can use #soprano or #bass.
13:49:18gosc joins
13:50:33<gosc>I've got an api I've been posting to, how do I go about preserving the api output? I've already done a dozen or so post requests to it, I feel like I can't save all of it myself efficiently
14:06:46<@JAA>justauser: What's that referring to?
14:07:38khaoohs joins
14:09:06<justauser>opendiary.com
14:09:47<justauser>Docs say there is a user search, but I can't find it. /circles/ may work...
14:10:39<@JAA>Ah, yeah. Profile pages also have a calendar thingy with <select>. That's as far as I've looked at it though.
14:37:22<klea>btw, how do i make a websocket connection to get all the tracker feed?
14:37:30<klea>it seems it's not possible to get a json formatted feed easily?
14:38:44<@imer>klea: you'd have to grab each projects feed separately if that's what you're asking
14:39:05<klea>and even then, i see the browser uses socket.io which seems to not have a static endpoint.
14:42:02<@imer>mh, yeah not sure about the internals there.
14:42:11<klea>thanks anyways :)
14:42:35<justauser>-dev?
14:45:11Dada quits [Remote host closed the connection]
14:45:23Dada joins
15:02:47BearFortress_ joins
15:06:24BearFortress quits [Ping timeout: 256 seconds]
15:14:16<h2ibot>Klea edited Phorge (+199, Add more phorge/phabricator instances.): https://wiki.archiveteam.org/?diff=60111&oldid=59991
15:15:17<h2ibot>Klea edited Phorge (+31, Add feedback.bistudio.com): https://wiki.archiveteam.org/?diff=60112&oldid=60111
15:17:17<h2ibot>Klea edited Phorge (+27, Add ticket.majava.org): https://wiki.archiveteam.org/?diff=60113&oldid=60112
15:19:17<h2ibot>Klea edited Phorge (+165, Add phabricator.testwiki.wiki,…): https://wiki.archiveteam.org/?diff=60114&oldid=60113
15:19:18<h2ibot>Klea edited Phorge (+56, Add phabricator.ushow.media, pha.tmcdx.com): https://wiki.archiveteam.org/?diff=60115&oldid=60114
15:22:17<h2ibot>Klea edited Discourse (+17, /* Active Discourses */ Fix edit link): https://wiki.archiveteam.org/?diff=60116&oldid=60077
15:23:18<h2ibot>Klea edited Discourse (-1, /* Active Discourses */ Make edit not look odd.): https://wiki.archiveteam.org/?diff=60117&oldid=60116
15:23:46<klea>i should've made it in another way wait a sec
15:24:18<h2ibot>Klea edited Discourse (+16, /* List of Archived Discourse Forums */ Fix…): https://wiki.archiveteam.org/?diff=60118&oldid=60117
15:24:19<h2ibot>Manu edited Discourse/active (+55, Add discourse.opencode.de): https://wiki.archiveteam.org/?diff=60119&oldid=60108
15:25:18<h2ibot>Manu edited Discourse/active (+0, Insert sort recent additions): https://wiki.archiveteam.org/?diff=60120&oldid=60119
15:26:18<h2ibot>Klea edited Discourse (+109, Make edit texts less odd): https://wiki.archiveteam.org/?diff=60121&oldid=60118
15:27:18<h2ibot>Klea edited URLTeam/Dead (+25, Don't include references section.): https://wiki.archiveteam.org/?diff=60122&oldid=59421
15:28:18<h2ibot>Klea edited URLTeam (+33, Readd references section now that template…): https://wiki.archiveteam.org/?diff=60123&oldid=60106
15:39:41panopticon quits [Quit: Bye for now!]
15:42:20<h2ibot>Klea created Phorge/uncategorized (+1405, Created page with "* [https://dev.gnupg.org/…): https://wiki.archiveteam.org/?title=Phorge/uncategorized
15:43:20<h2ibot>Klea edited Phorge (-661, Add subpage): https://wiki.archiveteam.org/?diff=60125&oldid=60115
15:49:27BornOn420 quits [Remote host closed the connection]
15:52:13BearFortress_ quits [Ping timeout: 272 seconds]
16:00:08<@arkiver>imer: could we perhaps have a target for maxmodels? it's starting within the hour i hope
16:00:13<@arkiver>i think it will not be very big
16:00:18<@imer>sure thing
16:00:26<@arkiver>(i know we have a ton of stuff running right now, many deadline)
16:00:28<@arkiver>for
16:00:31<@arkiver>archiveteam_maxmodels_
16:00:33<@arkiver>maxmodels_
16:00:37<@arkiver>Archive Team Maxmodels.pl:
16:03:05<@imer>arkiver: target's up and drone has been poked
16:04:12<@arkiver>imer: thanks a lot as always
16:07:17<klea>arkiver, what's the channel name?
16:07:31<@arkiver>no channel name
16:07:35<@arkiver>i think this will be a quick easy one
16:07:36<klea>oh
16:08:11<klea>inb4 arkiver changes default project to it :p
16:08:38<@arkiver>after the #kickthebucket deadline
16:09:02<@arkiver>but if they have no rate limiting (did they?) i think others will have no problem getting it done without default warrior project
16:09:45<@arkiver>hmm i vaguely remember there being rate limiting
16:09:47<@arkiver>we'll see i guess
16:10:51<Hans5958>we'll see (tm)
16:47:24midou quits [Read error: Connection reset by peer]
16:47:25midou joins
16:52:14midou quits [Read error: Connection reset by peer]
17:00:25BornOn420 (BornOn420) joins
17:19:33<h2ibot>Haiseiko edited Namuwiki (+962): https://wiki.archiveteam.org/?diff=60126&oldid=59037
17:20:33<h2ibot>Haiseiko created Dcwiki (+103, Created page with "it is the wiki of…): https://wiki.archiveteam.org/?title=Dcwiki
17:20:34<h2ibot>Haiseiko uploaded File:Board.namu.wiki.png (screenshot): https://wiki.archiveteam.org/?title=File%3ABoard.namu.wiki.png
17:20:35<h2ibot>Arkiver uploaded File:Maxmodels logo.jpeg: https://wiki.archiveteam.org/?title=File%3AMaxmodels%20logo.jpeg
17:20:53DogsRNice joins
17:22:23<justauser>Haiseiko, Dcwiki - ????
17:23:09lennier2 joins
17:24:43<@arkiver>the maxmodels project has started
17:25:06<@arkiver>for admins - the rate limit is controlled with a patternlimits on "^page:"
17:26:22lennier2_ quits [Ping timeout: 256 seconds]
17:26:35<@imer>omnomnom items :D
17:27:21<@imer>aaand blocked
17:27:41klea wonders why it got blocked ;P
17:28:00<@imer>I didn't ramp up that hard >.>
17:28:34<h2ibot>Klea edited Namuwiki (-136, Fix spelling somewhat.): https://wiki.archiveteam.org/?diff=60130&oldid=60126
17:29:17<@imer>seeing timeouts mixed with 200's - don't see timeouts locally though
17:29:35<h2ibot>Klea edited Namuwiki (+20, Wikis): https://wiki.archiveteam.org/?diff=60131&oldid=60130
17:29:38<@imer>arkiver: 7=404 https://www.maxmodels.pl/artykuly-start-w-maxmodels,c,6,strona,XXX.html 2026-01-14T17:29:05.009434346Z Server returned bad response. Sleeping 103 seconds. do you want me to dig for the item?
17:30:35<h2ibot>Klea edited Dcwiki (+50, Wikify somewhat, add stub.): https://wiki.archiveteam.org/?diff=60132&oldid=60127
17:30:38<@arkiver>imer: no, the item is clear from the URL
17:30:40<@arkiver>fixing tomorrow
17:30:41<klea>aa i should add an infobox project to dcwiki
17:30:42<@imer>fab
17:31:38<justauser>Already running in Wikibot, and I don't think it deserves more.
17:31:39<@imer>arkiver: another https://m.maxmodels.pl/propozycje-wspolpracy-jak-nie-dac-sie-oszukac,a,465.html?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?[... more repetitions ...]
17:31:53<@arkiver>filtering that
17:32:33@arkiver is afk for 9 hours
17:32:35<h2ibot>Hans5958 edited Main Page/Current Projects (+127, Add Maxmodels.pl): https://wiki.archiveteam.org/?diff=60133&oldid=60048
17:32:36<h2ibot>Hans5958 edited Main Page/Current Projects (+3, Wrong template use): https://wiki.archiveteam.org/?diff=60134&oldid=60133
17:33:21<@arkiver>imer: ah it's a loop
17:33:35<h2ibot>Hans5958 edited Main Page/Current Projects (+3, Modeling): https://wiki.archiveteam.org/?diff=60135&oldid=60134
17:34:03<fuzzy80211>imer what concurrency were you running?
17:34:36<@arkiver>imer: fixed in the code
17:34:38<@imer>fuzzy80211: 20ish, don't see timeouts anymore now - just slow 200's at a glance
17:34:48<@imer>thanks arkiver
17:35:11<@arkiver>imer: fuzzy80211: please force stop all current jobs, there may be a loop
17:35:17<fuzzy80211>k
17:35:19<@arkiver>imer: so not a long ban?
17:35:27<@imer>no doesn't look like an outright ban
17:35:54<fuzzy80211>netcup looks banned
17:35:59<fuzzy80211>2026-01-14T17:35:36.797030948Z 281=200 http://www.maxmodels.pl/modelka-blonddziewczyna/portret-zdjecie-9049629.html?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=
17:35:59<fuzzy80211>1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1?nmrd=1
17:36:02<fuzzy80211>yes on loop :)
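For context, the loop comes from the site tacking another literal ?nmrd=1 onto URLs that already carry one, so each hop grows the URL. A minimal sketch of the kind of outlink normalisation that stops it (a guess at the shape of the fix, not the actual change arkiver pushed):

```python
def normalize(url: str) -> str:
    """Collapse the runaway '?nmrd=1?nmrd=1...' suffix down to a single copy."""
    head, sep, _tail = url.partition("?nmrd=1")
    return head + "?nmrd=1" if sep else url

# Example (URL shortened from the ones pasted above):
print(normalize("https://m.maxmodels.pl/some-article,a,465.html?nmrd=1?nmrd=1?nmrd=1"))
# -> https://m.maxmodels.pl/some-article,a,465.html?nmrd=1
```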
17:37:04<fuzzy80211>arkiver imer may want to pause digitaldragon is returning items too
17:37:44<DigitalDragons>i will kill it
17:38:36<h2ibot>Hans5958 edited Main Page/Current Projects (-154, /* Recently finished projects */ Remove Adobe…): https://wiki.archiveteam.org/?diff=60136&oldid=60135
17:38:48<fuzzy80211>arkiver ping when ready to go again please. should be working :)
17:39:05<@arkiver>i think it's still running
17:39:08<@arkiver>unless imer paused it
17:39:14<@imer>I did not
17:39:55<@arkiver>fuzzy80211: should be running
17:40:18<fuzzy80211>k thanks
17:40:29<fuzzy80211>DigitalDragon ^
17:41:11<@imer>hetzner ip on conc 5 got banned (or whatever it is) too
17:41:31<fuzzy80211>was it banned before you started?
17:41:36<h2ibot>Klea edited Dcwiki (+144, Add infobox): https://wiki.archiveteam.org/?diff=60137&oldid=60132
17:41:44<fuzzy80211>netcup was blocked before i started
17:41:55<DigitalDragons>ty, recreated my containers
17:42:09<@imer>logs are gone unfortunately :(
17:42:19<fuzzy80211>k
17:42:33<@imer>ovh definitely worked before
17:42:36<h2ibot>Klea edited Dcwiki (-34, Forgot to remove the link from the page itself.): https://wiki.archiveteam.org/?diff=60138&oldid=60137
17:43:05<@imer>ovh started working again just now
17:43:28<szczot3k>yay for maxmodels
17:43:36<h2ibot>Klea uploaded File:Dc Logo.png: https://wiki.archiveteam.org/?title=File%3ADc%20Logo.png
17:44:45<szczot3k>Judging from the ab job for maxmodels - it looks like there's no (harsh) automatic ban, but needs a manual action from admins at onet.pl
17:45:16<szczot3k>Wonder what concurrency would work best
17:45:35<nicolas17>maxmodels discovery 16x
17:45:37<h2ibot>Klea uploaded File:Dc Homepage.png: https://wiki.archiveteam.org/?title=File%3ADc%20Homepage.png
17:45:38<h2ibot>Klea edited Dcwiki (+38, Add images): https://wiki.archiveteam.org/?diff=60141&oldid=60138
17:46:11<fuzzy80211>started a bunch at 2. whats our deadline?
17:46:19<szczot3k>fuzzy80211: 20th
17:46:32<fuzzy80211>k
17:46:40<fuzzy80211>szczot3k did you get banned?
17:47:01<szczot3k>fuzzy80211: yeah, 20 is quite over their threshold it seems
17:47:10<szczot3k>will run another set with lower
17:47:21<nicolas17>I started at concurrency 1
17:47:36<fuzzy80211>5 seems ok from a hivelocity dc
17:48:48<DigitalDragons>6 seems okay from Contabo (US)
17:49:13<szczot3k>20 seemed okay for like 6 minutes
17:50:07<@imer>my hetzner ip started working again as well, so likely block fuzzy80211
17:50:39midou joins
17:51:39<nicolas17>discovery 13x
17:54:38SDRedneck leaves
17:55:43midou quits [Ping timeout: 272 seconds]
18:05:58<nicolas17>9x, good it's slowing down
18:06:36<szczot3k>unbanned, let's go
18:07:25<szczot3k>It's 7 PM in Poland, hope that onet.pl's on-call admin isn't monitoring the maxmodels infra that much
18:08:25gosc quits [Quit: Leaving]
18:09:03<nicolas17>queue stabilized
18:09:20<nicolas17>I didn't realize my previous numbers were from a 30min average, which is too long for this
18:09:37<szczot3k>socket.gaierror: [Errno -3] Temporary failure in name resolution yay
18:10:05ATinySpaceMarine quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
18:10:06<szczot3k>Some of my nodes don't want to play ball today
18:10:54<masterx244|m>szczot3k: time to grab the wrench for a slap
18:10:56<fuzzy80211>eyeball test says its still growing
18:11:33Webuser14 joins
18:11:43<Webuser14>hi!
18:12:05<Webuser14>can I suggest something to be archived, forgive me if this is the wrong channel, I'm new
18:12:16<Webuser14>(both to IRC and ArchiveTeam)
18:12:27<fuzzy80211>Webuser14 heres a good place
18:12:30<nicolas17>fuzzy80211: any idea what's being prioritized?
18:12:57<nicolas17>I see stuff appearing in todo sometimes so I guess the tracker is moving some items from backfeed
18:13:03<fuzzy80211>no clue just watching when todo clears, todo:backfeed starts dropping then todo starts to fill again
18:13:03<Webuser14>I want to suggest tenor, might be useful in the future as I found out somewhere that tenor api is being removed
18:13:07nine quits [Quit: See ya!]
18:13:13<nicolas17>we know about tenor
18:13:17<Webuser14>oh
18:13:18<Webuser14>ok
18:13:21nine joins
18:13:21nine quits [Changing host]
18:13:21nine (nine) joins
18:13:34<fuzzy80211>thanks Webuser14 it is on the radar and a project is planned in the future
18:15:25<nicolas17>fuzzy80211: maxmodels queue has mostly stabilized but it seems discovered items come in bursts
18:15:59<nicolas17>kiska: can we have grafana for maxmodels?
18:16:04<szczot3k>nicolas17: nature of the site, some profiles are empty, some have thousands of photos
18:16:17<Webuser14>I would also want to contribute to archiving stuff in general, how do I do that? I have some disk space (like a few GB), 8 GB ram, and use bootcamp with windows 10 on an intel macbook air (yes I know, massively outdated, but it works for me at least). Can I install some program to contribute but also that doesn't like bloat my PC? Sorry if this is
18:16:17<Webuser14>a stupid question, I am new.
18:16:59<fuzzy80211>Webuser14 look at https://wiki.archiveteam.org/index.php/ArchiveTeam_Warrior
18:17:07<szczot3k>nicolas17: https://www.maxmodels.pl/dziekujemy-ze-byliscie-z-nami,a,2568.html claims 5.6 million photos
18:17:41<szczot3k>add to that different views (one photo can show in multiple places) + different types of profiles (model/photographer/visage/hairdresser, etc)
18:17:55<Webuser14>I need to go for some time, I will return back and try to install warrior, if it succeeds I will say
18:18:02<nicolas17>queue seems to be shrinking... and then suddenly thousands of items are added ^^ so yeah I can't give an ETA yet
18:18:06<Webuser14>also sorry for any spelling mistake I am not english
18:18:17Webuser14 quits [Client Quit]
18:18:17<fuzzy80211>no worries Webuser14 have a good day
18:18:40<szczot3k>root@a7f7a0f10c3e:/grab# timeout 5 bash -c '</dev/tcp/1.1.1.1/53' && echo OK || echo FAIL > FAIL
18:18:49<szczot3k>huh, docker has a broken internet connectivity
18:21:48midou joins
18:25:44<h2ibot>Nyakase created Tenor (+3221, Created page with "'''Tenor''' (originally…): https://wiki.archiveteam.org/?title=Tenor
18:26:33<nyakase>^ not the most comprehensible page right now but I had an API key lying around so sharing findings there, mostly in regard to media formats in the API
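For anyone poking at it, a hedged sketch of the v2 search call and the media_formats field nyakase is documenting; the endpoint name matches Tenor's public v2 docs, but the key is a placeholder and availability past the announced shutdown is anyone's guess:

```python
import json
import urllib.parse
import urllib.request

API_KEY = "YOUR_TENOR_V2_KEY"  # placeholder; new keys reportedly can no longer be issued

params = urllib.parse.urlencode({"q": "excited", "key": API_KEY, "limit": 5})
with urllib.request.urlopen(f"https://tenor.googleapis.com/v2/search?{params}") as resp:
    data = json.load(resp)

for result in data.get("results", []):
    # media_formats maps format names (gif, mediumgif, tinygif, mp4, webm, ...)
    # to objects carrying a direct "url" -- the per-format files worth saving.
    for fmt, info in result.get("media_formats", {}).items():
        print(result["id"], fmt, info["url"])
```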
18:28:42midou quits [Ping timeout: 256 seconds]
18:31:24<DigitalDragons>speaking of tenor, i propose #goner
18:35:37nepeat quits [Ping timeout: 272 seconds]
18:37:45Webuser14 joins
18:37:47<Webuser14>I'm back!
18:37:54<nicolas17>re: hytale, I think I'll archive the files that exist so far (24GB) and I'll keep an eye on future growth
18:37:55<Webuser14>Anyways I'm now installing warrior.
18:38:07<Webuser14>ok so I downloaded it
18:39:18<Webuser14>I need to install virtualbox now, I forgot I had it only installed on my PC (I can't use it right now because it's somewhere else) and laptop, no problem, I will just install it
18:41:20<Dango360>is there a channel for maxmodels?
18:41:43<nicolas17>Dango360: no, it's small enough that we use #archiveteam-bs
18:41:48<Dango360>ah okay
18:41:57midou joins
18:41:58<Dango360>recommended concurrency? i'm trying 6
18:42:14<fuzzy80211>somewhere in there seems fine
18:43:25<Webuser14>began installing... it says it finishes in several mins
18:43:54nepeat (nepeat) joins
18:44:12<Webuser14>but it finished just now?
18:45:48<Webuser14>Importing appliance...
18:45:56<Webuser14>warrior installed!
18:48:35<Webuser14>currently waiting for docker to be ready
18:49:08<justauser>-> #warrior
18:49:31Webuser14 quits [Client Quit]
18:50:20<nicolas17>maxmodels queue growing at 1000/min (avg over the last 15 min)
18:50:48midou quits [Ping timeout: 256 seconds]
18:53:54<IDK>wait, is there a limit? just started on 100
18:54:14<IDK>ah fuck there definitely is
18:54:47<IDK>ban seems to be =0s
18:59:33<szczot3k>IDK: there is, 20 is too much
18:59:44<szczot3k>6 is fine
18:59:49<szczot3k>Might try to push it a little
19:01:12<nicolas17>I'm doing 5 at home and 5 at digitalocean
19:01:22<nicolas17>maybe I should raise it a bit more
19:03:31<IDK>btw, i think a great channel name for this would be #minmodels!
19:14:25<IDK>all workers banned on 6
19:16:36<fuzzy80211>IDK believe there is a cool off time before it will let you run again
19:18:11<IDK>eh, already nuked all my vms
19:18:23<IDK>fuzzy80211: how long would the cool off time be?
19:18:52<fuzzy80211>szczot3k do you remember?
19:19:21<fuzzy80211>i have learned my lesson and always start around 5 any more
19:20:56<szczot3k>IDK, fuzzy80211, I got banned for around 30 minutes at 20
19:21:19FiTheArchiver joins
19:21:30<szczot3k>But I'd completely stopped the dockers after ~15 minutes, in case there's some aggressive bucket banning
19:22:44<IDK>bucket banning would be fun, since my 10 vms are now all in the same /23 range
19:23:12FiTheArchiver quits [Client Quit]
19:23:24<szczot3k>nah, I mean adding more points to hosts that try to send requests during a ban
19:23:52<IDK>ah I see
19:24:49<szczot3k>Not sure what's the mechanism there though
19:25:18<szczot3k>If we get the work done within 12 hours, the onet.pl sysadmins might not even see that we scraped everything /s
19:25:44AlsoHP_Archivist joins
19:29:20HP_Archivist quits [Ping timeout: 256 seconds]
19:29:59<IDK>or send the list of IP address to the IC3 :-) /hj
19:30:05<IDK>(hopefully/half joking)
19:31:34<fuzzy80211>szczot3k should i push harder? only running 2 concurrency for most of my ips
19:31:57<nicolas17>what's the deadline?
19:32:07<fuzzy80211>20th
19:32:27<szczot3k>fuzzy80211: I was mostly joking, deadline is 6 days from now, so I believe even with the current work done/s we'll be safe
19:32:42<fuzzy80211>k
19:32:49<nicolas17>I *think* we're fine but it's still hard to tell
19:33:19<h2ibot>Nyakase edited Tenor (+63, forgot gif_transparent media type): https://wiki.archiveteam.org/?diff=60143&oldid=60142
19:34:04<IDK>oh.. just realized maxmodels is a website for actual models
19:34:10<IDK>i thought we are grabbing 3d models
19:34:21midou joins
19:34:35<nicolas17>don't grab the models without consent
19:37:25<IDK>restraining orders aint fun :-)
19:41:38Rejoin_HP_Archivist joins
19:44:13BearFortress joins
19:45:46AlsoHP_Archivist quits [Ping timeout: 256 seconds]
19:46:20leo60228 quits [Read error: Connection reset by peer]
19:46:24leo60228 (leo60228) joins
19:52:18<klea>i think having a channel like #dpos for dpos stuff that's not specific to warrior might be neat (it might also be neat to have it on irclogs.archivete.am, since it would only be for giving concurrencies, which aren't private info?)
19:52:35<nicolas17>too many channels /o\
19:58:29kansei- (kansei) joins
19:59:05leo60228 quits [Remote host closed the connection]
19:59:29leo60228 (leo60228) joins
19:59:51kansei quits [Ping timeout: 272 seconds]
20:00:43<nicolas17>JAA: webuser in #down-the-tube continues adding crap to the queue and ignoring us
20:01:45midou quits [Ping timeout: 272 seconds]
20:08:20ATinySpaceMarine joins
20:13:29<nicolas17>maxmodels queue is now net-shrinking
20:14:48<nicolas17>on average we're adding 90 items for every 100 items completed, still fluctuating but it hasn't gone above 100 for a while now
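The arithmetic behind that kind of ETA, as a rough sketch; the ratio is the one from this conversation, while the todo size and completion rate below are made-up placeholders rather than tracker numbers:

```python
# If r new items are discovered per item completed (r < 1), each item in the
# current todo eventually accounts for 1/(1-r) items of total remaining work.
r = 0.9                      # ~90 added per 100 completed, per the log
todo_now = 500_000           # placeholder, not a tracker export
completed_per_hour = 90_000  # placeholder, not a tracker export

total_remaining = todo_now / (1 - r)
print(f"ETA ~ {total_remaining / completed_per_hour:.0f} hours")
```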
20:30:37midou joins
20:39:36midou quits [Ping timeout: 256 seconds]
20:46:59Lord_Nightmare (Lord_Nightmare) joins
21:02:21midou joins
21:08:58<IDK>5 concurrency per IP seems to hold up well for maxmodels, hopefully I don't get banned overnight 🙏
21:10:21<klea>good night IDK
21:10:33<klea>someone should make an account called good :p
21:11:19good joins
21:11:29<good>who mentioned me?
21:12:07who joins
21:12:12<szczot3k>they did not
21:12:18<who>No I didn't!
21:12:37good quits [Client Quit]
21:12:38who quits [Client Quit]
21:14:35midou quits [Ping timeout: 272 seconds]
21:21:44<fuzzy80211>overall todo for maxmodels has gone down 91k in the last hour
21:23:05<fuzzy80211>assuming discovery and completion rate hold we have 62 hours left
21:23:13midou joins
21:24:08Yakov8 (Yakov) joins
21:25:59Yakov quits [Ping timeout: 272 seconds]
21:25:59Yakov8 is now known as Yakov
21:26:32<datechnoman>Just added a bunch of workers in to help :)
21:36:34n9nes joins
21:40:58<IDK>all my european workers just got banned on 5 cons after around 20 minutes
21:41:16<IDK>my los angeles workers are still going strong after 2 hours
21:48:09midou quits [Ping timeout: 272 seconds]
22:01:27<fuzzy80211>got one box that is getting off and on banning
22:18:49freetheplanet joins
22:28:56freetheplanet quits [Client Quit]
22:41:08n9nes quits [Client Quit]
22:41:29n9nes joins
22:48:17midou joins
22:50:54<Guest>#dpos++
22:50:55<eggdrop>[karma] '#dpos' now has 1 karma!
22:52:36<Guest>rather than trying to scan a channel for concurrencies
22:56:24klea leaves [~nyaa]
22:56:24klea (jmjl) joins
22:56:48<nicolas17>maxmodels ETA 40h
22:57:32<nicolas17>Guest: in a few recent projects I tried to keep the wiki page updated with the status, recommended concurrency, etc
22:57:42<nstrom|m>joining the party for maxmodels, guessing I want conc 4 or less?
22:57:44<nicolas17>we don't even *have* a wiki page for maxmodels :|
22:58:15<Guest>yeah i noticed this ^^
22:58:25<Guest>nstrom|m: im using conc 5
22:58:42<nstrom|m>IDK above said he got banned at 5
22:58:59<Guest>and it looks like i just got banned now too 😂
22:59:31<Guest>to be fair i was running multiple (3) at conc 5
22:59:38<nicolas17>...that's conc 15 lol
23:00:00<nicolas17>I'm doing 7 without ban, but I probably have higher latency which causes less requests/sec in practice
23:00:10<Guest>well it was only for like 10 minutes @ 3x..
23:00:26<Guest>i guess still conc 15
23:00:37<nicolas17>anyway
23:00:44<nicolas17>ETA Jan 16 12:00
23:01:06<nicolas17>so we'll be fine
23:01:17<@Fusl>got banned at 10 concurrent lol
23:03:05n9nes quits [Client Quit]
23:03:47n9nes joins
23:04:23<h2ibot>KleaBot made 1 bot changes: https://wiki.archiveteam.org/index.php?title=Special:Contributions/KleaBot&offset=20260114230419&limit=1
23:04:43<nicolas17>!botsnack
23:05:11<klea>JAA needs to give more people the h2i endpoint so we can make the bot have a snack.
23:05:32<@JAA>Hmm, it's not supposed to collapse single edits.
23:06:48<klea>oh lol, i agree with JAA
23:06:48<@JAA>Oh
23:07:03<klea>or maybe yes, not sure.
23:07:10<@JAA>It's because it also made two edits in the User namespace in the same minute.
23:07:23<nicolas17>fun edge case :D
23:07:34<Guest>IDK: are you still banned?
23:07:43<@JAA>That also means the offset/limit value might be wrong.
23:08:19<klea>oh lovely.
23:08:44<nicolas17>do you keep only main namespace?
23:08:55<klea>yes, JAA only keeps the main namespace.
23:09:07<klea>wait, no iirc there's some namespace exclusions.
23:09:25<@JAA>No, I'm excluding User and User_talk.
23:15:32<@JAA>namespace=2&wpfilters[]=nsInvert&wpfilters[]=associated seems to do the trick. Very ugly.
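A small sketch of building that contributions URL programmatically with the parameters JAA mentions; the wiki host and bot name are simply the ones appearing earlier in this log:

```python
from urllib.parse import urlencode

# namespace=2 selects User; 'associated' pulls in User_talk; 'nsInvert' then
# flips the selection, leaving "everything except User and User_talk".
params = [
    ("title", "Special:Contributions/KleaBot"),
    ("namespace", "2"),
    ("wpfilters[]", "nsInvert"),
    ("wpfilters[]", "associated"),
    ("limit", "1"),
]
print("https://wiki.archiveteam.org/index.php?" + urlencode(params))
```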
23:15:58<klea>JAA: why not make the special:contributions somehow show all changes that are also human changes?
23:16:05<klea>s/human/user:/
23:16:37<@JAA>Because non-bot User edits aren't announced here either.
23:17:05<klea>it'd also potentially solve the issue of my bot doing the nasty thing of editing both User and non-User pages, in a way that the url it produces shows User: edits.
23:17:28<klea>because i believe offset and limit are the only params making those not appear :p
23:18:05<@JAA>Yes, that's what I'm trying to fix, because the generated URL would currently be wrong depending on the order of User and non-User edits.
23:18:51<klea>that's why i suggested taking the easy route: if there are both User and non-User edits, and more than one non-User edit, make the url even if it shows the User edits too.
23:19:18<klea>probably change the text to say <n> bot changes (and <n> on user pages) maybe, but i guess better not.
23:23:19Island joins