00:09:30<@JAA>pokechu22: What hacks do you need on s3-bucket-list for those? They should work fine over HTTPS (as invalid certs are already ignored).
01:00:05eggdrop quits [Ping timeout: 272 seconds]
01:11:03SootBector quits [Remote host closed the connection]
01:11:21eggdrop (eggdrop) joins
01:12:11SootBector (SootBector) joins
01:14:40Woodie quits [Ping timeout: 260 seconds]
01:22:23Woodie joins
01:27:16Woodie quits [Ping timeout: 260 seconds]
01:56:37Woodie (Woodie) joins
02:02:48sec^nd quits [Ping timeout: 256 seconds]
02:07:35sec^nd (second) joins
02:21:08Suika_ joins
02:22:10Suika quits [Ping timeout: 256 seconds]
02:44:44Woodie quits [Ping timeout: 260 seconds]
03:02:07Woodie (Woodie) joins
03:47:18<pokechu22>JAA: I think I'm specifically using an older hacked-up version; probably I should just update to the newer one
03:52:21beardicus quits [Ping timeout: 272 seconds]
03:54:52<@JAA>pokechu22: Ah. It does fail on HTTP, FWIW.
04:21:10midou quits [Ping timeout: 256 seconds]
04:31:32midou joins
04:35:50Island quits [Read error: Connection reset by peer]
04:53:06DogsRNice quits [Read error: Connection reset by peer]
05:18:32Jens quits []
05:19:03Jens (JensRex) joins
05:59:10eythian quits [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.]
06:00:40eythian joins
06:01:22nexussfan quits [Quit: Konversation terminated!]
06:20:29AlsoHP_Archivist joins
06:24:08HP_Archivist quits [Ping timeout: 256 seconds]
06:55:24HP_Archivist (HP_Archivist) joins
06:59:11AlsoHP_Archivist quits [Ping timeout: 272 seconds]
08:05:43<triplecamera|m>I tried to install grab-site with uv. It failed.
08:05:48<triplecamera|m>The cause was that the latest google-re2 no longer supports Python 3.8
08:05:49<triplecamera|m>The dependency list was not frozen, so the latest google-re2 was used
08:05:50<triplecamera|m>https://github.com/ArchiveTeam/grab-site/issues/245
08:26:10<triplecamera|m>OK. After applying <https://github.com/ArchiveTeam/grab-site/pull/248>, grab-site can be successfully installed and run
08:28:25<triplecamera|m>I'm really worried that grab-site is lacking maintainers, especially code maintainers
08:28:33<triplecamera|m>The last update on code (not README) was 1.5 years ago
08:43:48<ivan>I stopped using it in favor of SingleFile in batch mode and stuff, but it doesn't make WARCs
08:44:10<ivan>is there anything like grab-site now besides heritrix and wget-lua
08:44:26<@arkiver>does this mean grab-site is not usable anymore nowadays?
08:44:29<@arkiver>that should be fixed then
08:51:50<triplecamera|m>arkiver: Yes. There are many stashed issues and PRs
08:52:27<@arkiver>did grab-site use wpull?
08:53:52<triplecamera|m>It uses a fork of wpull, <https://github.com/ArchiveTeam/ludios_wpull>, as stated in the README
08:55:19<triplecamera|m>In my humble opinion, this is a bit confusing, because I don't know what the difference is
09:16:13<ivan>https://github.com/ArchiveTeam/ludios_wpull/commits/master/?after=c3e7be68c7acf2fddb8d6bec72e352551c12f38f+104 ludios_wpull ripped out some stuff and went with html5-parser
09:19:07<ivan>maybe it should become wpull unless there are objections from chfoo or JAA to those decisions or more-recent commits (I have totally forgotten whether there was an issue with the choice of parser or phantomjs removal.)
10:07:15LddPotato (LddPotato) joins
10:14:12MrMcNuggets quits [Ping timeout: 256 seconds]
10:32:06MrMcNuggets (MrMcNuggets) joins
10:57:47Dada joins
11:02:00igloo222259 quits [Quit: The Lounge - https://thelounge.chat]
11:02:32igloo222259 joins
11:15:51ducky_ (ducky) joins
11:17:40ducky quits [Ping timeout: 256 seconds]
11:18:09<c3manu>pabs: you queued https://felipec.substack.com/ last September saying it was abandoned. looks like two more posts appeared in October :) just fyi, in case that info is useful to you
11:19:34<c3manu>klea: i’d be fine with a weekly or daily run of that sorting script. the hard part is gonna be finding a good time for that
11:20:45ducky_ quits [Ping timeout: 272 seconds]
11:28:54MrMcNuggets quits [Client Quit]
11:32:51ducky (ducky) joins
11:42:06mrminemeet_ joins
11:43:10mrminemeet quits [Ping timeout: 256 seconds]
11:57:15ducky_ (ducky) joins
11:57:20ducky quits [Ping timeout: 256 seconds]
11:57:52ducky_ is now known as ducky
12:00:02Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat]
12:02:40BornOn420 quits [Remote host closed the connection]
12:02:45Bleo182600722719623455222 joins
12:15:51Shjosan quits [Ping timeout: 272 seconds]
12:18:39v01d joins
12:36:35Shjosan (Shjosan) joins
12:48:04lennier2 joins
12:48:09lennier2_ quits [Ping timeout: 272 seconds]
13:02:30Shard quits [Quit: Im doing something rq. Il brb]
13:06:00Sluggs quits [Excess Flood]
13:06:05Shard (Shard) joins
13:08:34Sluggs (Sluggs) joins
13:12:02SootBector quits [Ping timeout: 256 seconds]
13:13:55phuzion quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
13:14:51SootBector (SootBector) joins
13:34:16panopticon quits [Quit: Bye for now!]
13:37:46FiTheArchiver joins
13:38:20FiTheArchiver quits [Remote host closed the connection]
13:41:05panopticon (panopticon) joins
13:53:43<justauser>Open Diary is back. Sort of, from the cleanest proxy I could find - not from my real connection and not from Tor.
13:54:27<justauser>Anybody still feeling like saving it?
14:01:39<justauser>Started an AB job, but it's unlikely to discover much without hints.
14:05:58corentin quits [Ping timeout: 256 seconds]
14:08:45corentin joins
14:09:46panopticon quits [Client Quit]
14:22:41panopticon (panopticon) joins
14:28:09dracohiro joins
14:28:23dracohiro quits [Client Quit]
14:54:38<@arkiver>fixed a problem in warrior-dockerfile that prevented urls-grab from running on warrior
14:54:57<@arkiver>it also always had nodejs installed, even though that was only required for youtube-grab; nodejs is now only installed when youtube-grab is run
15:24:59MrMcNuggets (MrMcNuggets) joins
15:29:57BennyOtt quits [Quit: ZNC 1.10.1 - https://znc.in]
15:31:28BennyOtt (BennyOtt) joins
15:35:11BennyOtt quits [Remote host closed the connection]
15:39:30BennyOtt (BennyOtt) joins
15:48:19<@arkiver>yay first warriors showed up :)
15:48:33<@arkiver>also using tini now in warrior-dockerfile in an attempt to prevent zombies
15:57:06Webuser399212 joins
15:57:21Webuser399212 quits [Client Quit]
16:00:01BornOn420 (BornOn420) joins
16:10:03BennyOtt_ joins
16:11:27BennyOtt quits [Ping timeout: 272 seconds]
16:11:39BennyOtt_ is now known as BennyOtt
16:23:49beardicus (beardicus) joins
17:23:12tuna (tuna) joins
17:28:14phuzion (phuzion) joins
17:28:54phuzion quits [Client Quit]
17:29:30phuzion (phuzion) joins
17:31:55nine quits [Quit: See ya!]
17:32:08nine joins
17:32:08nine quits [Changing host]
17:32:08nine (nine) joins
17:43:17v01d quits [Ping timeout: 272 seconds]
17:51:07Hackerpcs_1 (Hackerpcs) joins
17:53:46Hackerpcs quits [Ping timeout: 256 seconds]
18:20:48Deewiant quits [Remote host closed the connection]
18:21:51Deewiant (Deewiant) joins
18:23:11lennier2 quits [Ping timeout: 272 seconds]
18:32:34MrMcNuggets quits [Quit: WeeChat 4.3.2]
19:03:25Webuser036219 joins
19:03:26Webuser036219 quits [Client Quit]
19:10:48fuzzy80211 quits [Read error: Connection reset by peer]
19:11:07chunkynutz60 quits [Read error: Connection reset by peer]
19:11:16chunkynutz60 joins
19:11:37fuzzy80211 (fuzzy80211) joins
19:12:15fuzzy80211 quits [Excess Flood]
19:12:37fuzzy80211 (fuzzy80211) joins
19:18:37gosc joins
19:19:16<gosc>got another list of a game that needs to be incremented lol, this one isn't too high priority but it isn't a good sign that the game's servers have died before
19:19:27<gosc>will send in a bit
19:23:56<gosc>here https://transfer.archivete.am/XWvUk/cardcaptor_sakura_info.txt
19:23:56<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/XWvUk/cardcaptor_sakura_info.txt
19:24:23<gosc>there's a bunch of files there which exist for each version of the game, which aren't a lot but I don't have the time to cycle through them myself
19:25:02<gosc>it's like the sims game again actually, there is a txt file with json data containing the filenames of assetbundles
19:25:09<gosc>which exist for each version of the game
19:25:30<gosc>the game is Cardcaptor Sakura: Memory Key
19:31:46<gosc>just send me a Tell message if anyone picks this up since I have to go now
19:32:44<pokechu22>gosc: that one uses 4.0.0 rather than a single number - do you have a list of all versions?
19:33:37<gosc>I don't, but I checked the google play and iOS versions, starts in alpha (0.something) and ends with 4.0.3, but the latest uses 4.0.0
19:33:47<gosc>the lowest I got was 3.0.0
19:33:59<gosc>yes I did check on both /Android/ and /iOS/
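A minimal sketch of how the version incrementing described above could be set up, assuming three-part version strings between 0.0.0 and 4.9.9 (the log only confirms that 3.0.0, 4.0.0 and 4.0.3 exist). The URL template and the `filelist.txt` name are hypothetical placeholders; the real paths are in gosc's info file and are not reproduced here.

```python
from itertools import product

# Hypothetical URL layout; the real one is described in gosc's info file.
URL_TEMPLATE = "https://example.invalid/{platform}/{version}/filelist.txt"


def candidate_versions(majors=range(0, 5), minors=range(0, 10), patches=range(0, 10)):
    """Return every major.minor.patch string in the given ranges, in order."""
    return [f"{a}.{b}.{c}" for a, b, c in product(majors, minors, patches)]


def candidate_urls(versions, platforms=("Android", "iOS")):
    """Expand each version into one URL per platform directory."""
    return [
        URL_TEMPLATE.format(platform=platform, version=version)
        for platform in platforms
        for version in versions
    ]


if __name__ == "__main__":
    for url in candidate_urls(candidate_versions()):
        print(url)
```

Most generated URLs would presumably not exist, so the resulting list is only useful as input to something that tolerates 404s.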
19:37:43<gosc>also I see someone ran my scholastic homebase list? that wasn't the final url list
19:55:19gosc quits [Client Quit]
20:31:32ducky_ (ducky) joins
20:33:00ducky quits [Ping timeout: 256 seconds]
20:33:00ducky_ is now known as ducky
20:49:48Webuser069071 joins
20:49:53Webuser069071 quits [Client Quit]
21:01:57DogsRNice joins
21:51:05<h2ibot>Cooljeanius edited Archiveteam:Copyrights (+8, /* You and ArchiveTeam Wiki content */ use URL…): https://wiki.archiveteam.org/?diff=60090&oldid=60064
23:11:42Dada quits [Remote host closed the connection]
23:12:33nexussfan (nexussfan) joins
23:35:50<klea>btw, does anybody know of some OCR tool that will not struggle with iranian text?
23:36:27<klea>this stupid page i found seems to only give me links to shitter, not to the telegram groups, and i'd like to pass those through #telegrab. https://iran.liveuamap.com/en/2026/11-january-19-brigadier-general-javad-keshavarz-was-killed
23:38:21<ericgallager>The Persian script is basically the Arabic script with a few extra letters: https://en.wikipedia.org/wiki/Persian_alphabet
23:38:36<klea>thanks
23:41:04<klea>i've found <https://olocr.com/ocr/persian> useful.
23:41:28<ericgallager>if you're using tesseract, you may need to install the Persian language data package as well; in MacPorts it's tesseract-fas
23:41:55<ericgallager>("fas" being short for "Farsi" I suppose)
23:42:01<nexussfan>Yes
23:42:15<klea>im not automating it :p, im manually taking screenshots with firefox, and then putting them through the ocr, copying things i think may give me more links, and giving them to my bot to make qubert queue.
23:44:54<klea>thanks ericgallager
23:44:56<klea>ericgallager++
23:44:56<eggdrop>[karma] 'ericgallager' now has 1 karma!
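For anyone who does want to script the tesseract route mentioned above, here is a rough sketch. It assumes the `tesseract` binary and its Persian (`fas`) language data are installed; the function names and the `screenshot.png` file name are made up for illustration.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="fas"):
    """Build the argv list that OCRs one image and writes the text to stdout."""
    # "stdout" as the output base tells tesseract to print instead of writing a file.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_image(image_path, lang="fas"):
    """Run tesseract on one image and return the recognised text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not installed or not on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file name; replace with a real screenshot.
    print(ocr_image("screenshot.png"))
```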
23:50:34<@JAA>ISO 639++
23:50:34<eggdrop>[karma] 'ISO 639' now has 1 karma!
23:54:35<nstrom|m>huh you don't see the 3 letter variants that often in the wild
23:54:40<nstrom|m>usually it's just 639-1
23:55:01<nexussfan>I usually see `fa` on farsi-related stuff
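The Persian codes mentioned above can be summarised in a small lookup. This is only an illustrative sketch: the dictionary and the `to_tesseract_lang` helper are made-up names, and the only tesseract-related assumption is that its Persian language data is named `fas`, as stated in the log.

```python
# Persian language codes across the ISO 639 parts.
PERSIAN_CODES = {
    "ISO 639-1": "fa",     # two-letter code
    "ISO 639-2/T": "fas",  # three-letter terminological code
    "ISO 639-2/B": "per",  # three-letter bibliographic code
    "ISO 639-3": "fas",    # three-letter code (macrolanguage)
}


def to_tesseract_lang(code):
    """Map any of the Persian ISO 639 codes to the three-letter name 'fas'."""
    if code.lower() in set(PERSIAN_CODES.values()):
        return "fas"
    raise ValueError(f"not a known Persian language code: {code!r}")


if __name__ == "__main__":
    for standard, code in PERSIAN_CODES.items():
        print(f"{standard}: {code} -> {to_tesseract_lang(code)}")
```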