| 00:09:30 | <@JAA> | pokechu22: What hacks do you need on s3-bucket-list for those? They should work fine over HTTPS (as invalid certs are already ignored). |
| 01:00:05 | | eggdrop quits [Ping timeout: 272 seconds] |
| 01:11:03 | | SootBector quits [Remote host closed the connection] |
| 01:11:21 | | eggdrop (eggdrop) joins |
| 01:12:11 | | SootBector (SootBector) joins |
| 01:14:40 | | Woodie quits [Ping timeout: 260 seconds] |
| 01:22:23 | | Woodie joins |
| 01:22:23 | | Woodie is now authenticated as Woodie |
| 01:27:16 | | Woodie quits [Ping timeout: 260 seconds] |
| 01:56:37 | | Woodie (Woodie) joins |
| 02:02:48 | | sec^nd quits [Ping timeout: 256 seconds] |
| 02:07:35 | | sec^nd (second) joins |
| 02:21:08 | | Suika_ joins |
| 02:22:10 | | Suika quits [Ping timeout: 256 seconds] |
| 02:44:44 | | Woodie quits [Ping timeout: 260 seconds] |
| 03:02:07 | | Woodie (Woodie) joins |
| 03:47:18 | <pokechu22> | JAA: I think I'm specifically using an older hacked-up version; probably I should just update to the newer one |
| 03:52:21 | | beardicus quits [Ping timeout: 272 seconds] |
| 03:54:52 | <@JAA> | pokechu22: Ah. It does fail on HTTP, FWIW. |
| 04:21:10 | | midou quits [Ping timeout: 256 seconds] |
| 04:31:32 | | midou joins |
| 04:35:50 | | Island quits [Read error: Connection reset by peer] |
| 04:53:06 | | DogsRNice quits [Read error: Connection reset by peer] |
| 05:18:32 | | Jens quits [] |
| 05:19:03 | | Jens (JensRex) joins |
| 05:59:10 | | eythian quits [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.] |
| 06:00:40 | | eythian joins |
| 06:01:22 | | nexussfan quits [Quit: Konversation terminated!] |
| 06:20:29 | | AlsoHP_Archivist joins |
| 06:24:08 | | HP_Archivist quits [Ping timeout: 256 seconds] |
| 06:55:24 | | HP_Archivist (HP_Archivist) joins |
| 06:59:11 | | AlsoHP_Archivist quits [Ping timeout: 272 seconds] |
| 08:05:43 | <triplecamera|m> | I tried to install grab-site with uv. It failed. |
| 08:05:48 | <triplecamera|m> | The cause was that the latest google-re2 no longer supports Python 3.8 |
| 08:05:49 | <triplecamera|m> | The dependency list was not frozen, so the latest google-re2 was used |
| 08:05:50 | <triplecamera|m> | https://github.com/ArchiveTeam/grab-site/issues/245 |
| 08:26:10 | <triplecamera|m> | OK. After applying <https://github.com/ArchiveTeam/grab-site/pull/248>, grab-site can be successfully installed and run |
| 08:28:25 | <triplecamera|m> | I'm really worried that grab-site is lacking maintainers, especially code maintainers |
| 08:28:33 | <triplecamera|m> | The last update on code (not README) was 1.5 years ago |
| 08:43:48 | <ivan> | I stopped using it in favor of SingleFile in batch mode and stuff, but it doesn't make WARCs |
| 08:44:10 | <ivan> | is there anything like grab-site now besides heritrix and wget-lua |
| 08:44:26 | <@arkiver> | does this mean grab-site is not usable anymore nowadays? |
| 08:44:29 | <@arkiver> | that should be fixed then |
| 08:51:50 | <triplecamera|m> | arkiver: Yes. There are many stashed issues and PRs |
| 08:52:27 | <@arkiver> | did grab-site use wpull? |
| 08:53:52 | <triplecamera|m> | It uses a fork of wpull, <https://github.com/ArchiveTeam/ludios_wpull>, as stated in the README |
| 08:55:19 | <triplecamera|m> | In my humble opinion, this is a bit confusing, because I don't know what the difference is |
| 09:16:13 | <ivan> | https://github.com/ArchiveTeam/ludios_wpull/commits/master/?after=c3e7be68c7acf2fddb8d6bec72e352551c12f38f+104 ludios_wpull ripped out some stuff and went with html5-parser |
| 09:19:07 | <ivan> | maybe it should become wpull unless there are objections from chfoo or JAA to those decisions or more-recent commits (I have totally forgotten whether there was an issue with the choice of parser or phantomjs removal.) |
| 10:07:15 | | LddPotato (LddPotato) joins |
| 10:14:12 | | MrMcNuggets quits [Ping timeout: 256 seconds] |
| 10:32:06 | | MrMcNuggets (MrMcNuggets) joins |
| 10:57:47 | | Dada joins |
| 11:02:00 | | igloo222259 quits [Quit: The Lounge - https://thelounge.chat] |
| 11:02:32 | | igloo222259 joins |
| 11:15:51 | | ducky_ (ducky) joins |
| 11:17:40 | | ducky quits [Ping timeout: 256 seconds] |
| 11:18:09 | <c3manu> | pabs: you queued https://felipec.substack.com/ last September saying it was abandoned. looks like two more posts appeared in October :) just fyi, in case that info is useful to you |
| 11:19:34 | <c3manu> | klea: i’d be fine with a weekly or daily run of that sorting script. the hard part is gonna be finding a good time for that |
| 11:20:45 | | ducky_ quits [Ping timeout: 272 seconds] |
| 11:28:54 | | MrMcNuggets quits [Client Quit] |
| 11:32:51 | | ducky (ducky) joins |
| 11:42:06 | | mrminemeet_ joins |
| 11:43:10 | | mrminemeet quits [Ping timeout: 256 seconds] |
| 11:57:15 | | ducky_ (ducky) joins |
| 11:57:20 | | ducky quits [Ping timeout: 256 seconds] |
| 11:57:52 | | ducky_ is now known as ducky |
| 12:00:02 | | Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat] |
| 12:02:40 | | BornOn420 quits [Remote host closed the connection] |
| 12:02:45 | | Bleo182600722719623455222 joins |
| 12:15:51 | | Shjosan quits [Ping timeout: 272 seconds] |
| 12:18:39 | | v01d joins |
| 12:36:35 | | Shjosan (Shjosan) joins |
| 12:48:04 | | lennier2 joins |
| 12:48:09 | | lennier2_ quits [Ping timeout: 272 seconds] |
| 13:02:30 | | Shard quits [Quit: Im doing something rq. Il brb] |
| 13:06:00 | | Sluggs quits [Excess Flood] |
| 13:06:05 | | Shard (Shard) joins |
| 13:08:34 | | Sluggs (Sluggs) joins |
| 13:12:02 | | SootBector quits [Ping timeout: 256 seconds] |
| 13:13:55 | | phuzion quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
| 13:14:51 | | SootBector (SootBector) joins |
| 13:34:16 | | panopticon quits [Quit: Bye for now!] |
| 13:37:46 | | FiTheArchiver joins |
| 13:38:20 | | FiTheArchiver quits [Remote host closed the connection] |
| 13:41:05 | | panopticon (panopticon) joins |
| 13:53:43 | <justauser> | Open Diary is back. Sort of, from the cleanest proxy I could find - not from my real connection and not from Tor. |
| 13:54:27 | <justauser> | Anybody still feeling like saving it? |
| 14:01:39 | <justauser> | Started an AB job, but it's unlikely to discover much without hints. |
| 14:05:58 | | corentin quits [Ping timeout: 256 seconds] |
| 14:08:45 | | corentin joins |
| 14:09:46 | | panopticon quits [Client Quit] |
| 14:22:41 | | panopticon (panopticon) joins |
| 14:28:09 | | dracohiro joins |
| 14:28:23 | | dracohiro quits [Client Quit] |
| 14:54:38 | <@arkiver> | fixed a problem in warrior-dockerfile that prevented urls-grab from running on warrior |
| 14:54:57 | <@arkiver> | it also always had nodejs installed, although that was only required for youtube-grab; nodejs is now installed only when youtube-grab is run |
| 15:24:59 | | MrMcNuggets (MrMcNuggets) joins |
| 15:29:57 | | BennyOtt quits [Quit: ZNC 1.10.1 - https://znc.in] |
| 15:31:28 | | BennyOtt (BennyOtt) joins |
| 15:35:11 | | BennyOtt quits [Remote host closed the connection] |
| 15:39:30 | | BennyOtt (BennyOtt) joins |
| 15:48:19 | <@arkiver> | yay first warriors showed up :) |
| 15:48:33 | <@arkiver> | also using tini now in warrior-dockerfile in an attempt to prevent zombies |
| 15:57:06 | | Webuser399212 joins |
| 15:57:21 | | Webuser399212 quits [Client Quit] |
| 16:00:01 | | BornOn420 (BornOn420) joins |
| 16:10:03 | | BennyOtt_ joins |
| 16:11:27 | | BennyOtt quits [Ping timeout: 272 seconds] |
| 16:11:39 | | BennyOtt_ is now known as BennyOtt |
| 16:12:11 | | BennyOtt is now authenticated as BennyOtt |
| 16:23:49 | | beardicus (beardicus) joins |
| 17:23:12 | | tuna (tuna) joins |
| 17:28:14 | | phuzion (phuzion) joins |
| 17:28:54 | | phuzion quits [Client Quit] |
| 17:29:30 | | phuzion (phuzion) joins |
| 17:31:55 | | nine quits [Quit: See ya!] |
| 17:32:08 | | nine joins |
| 17:32:08 | | nine is now authenticated as nine |
| 17:32:08 | | nine quits [Changing host] |
| 17:32:08 | | nine (nine) joins |
| 17:43:17 | | v01d quits [Ping timeout: 272 seconds] |
| 17:51:07 | | Hackerpcs_1 (Hackerpcs) joins |
| 17:53:46 | | Hackerpcs quits [Ping timeout: 256 seconds] |
| 18:20:48 | | Deewiant quits [Remote host closed the connection] |
| 18:21:51 | | Deewiant (Deewiant) joins |
| 18:23:11 | | lennier2 quits [Ping timeout: 272 seconds] |
| 18:32:34 | | MrMcNuggets quits [Quit: WeeChat 4.3.2] |
| 19:03:25 | | Webuser036219 joins |
| 19:03:26 | | Webuser036219 quits [Client Quit] |
| 19:10:48 | | fuzzy80211 quits [Read error: Connection reset by peer] |
| 19:11:07 | | chunkynutz60 quits [Read error: Connection reset by peer] |
| 19:11:16 | | chunkynutz60 joins |
| 19:11:37 | | fuzzy80211 (fuzzy80211) joins |
| 19:12:15 | | fuzzy80211 quits [Excess Flood] |
| 19:12:37 | | fuzzy80211 (fuzzy80211) joins |
| 19:18:37 | | gosc joins |
| 19:19:16 | <gosc> | got another list of a game that needs to be incremented lol, this one isn't too high priority but it isn't a good sign that the game's servers have died before |
| 19:19:27 | <gosc> | will send in a bit |
| 19:23:56 | <gosc> | here https://transfer.archivete.am/XWvUk/cardcaptor_sakura_info.txt |
| 19:23:56 | <eggdrop> | inline (for browser viewing): https://transfer.archivete.am/inline/XWvUk/cardcaptor_sakura_info.txt |
| 19:24:23 | <gosc> | there's a bunch of files there which exist for each version of the game, which aren't a lot but I don't have the time to cycle through them myself |
| 19:25:02 | <gosc> | it's like the sims game again actually, there is a txt file with json data containing the filenames of assetbundles |
| 19:25:09 | <gosc> | which exist for each version of the game |
| 19:25:30 | <gosc> | the game is Cardcaptor Sakura: Memory Key |
| 19:31:46 | <gosc> | just send me a Tell message if anyone picks this up since I have to go now |
| 19:32:44 | <pokechu22> | gosc: that one uses 4.0.0 rather than a single number - do you have a list of all versions? |
| 19:33:37 | <gosc> | I don't, but I checked the google play and iOS versions, starts in alpha (0.something) and ends with 4.0.3, but the latest uses 4.0.0 |
| 19:33:47 | <gosc> | the lowest I got was 3.0.0 |
| 19:33:59 | <gosc> | yes I did check on both /Android/ and /iOS/ |
| 19:37:43 | <gosc> | also I see someone ran my scholastic homebase list? that wasn't the final url list |
| 19:55:19 | | gosc quits [Client Quit] |
| 20:31:32 | | ducky_ (ducky) joins |
| 20:33:00 | | ducky quits [Ping timeout: 256 seconds] |
| 20:33:00 | | ducky_ is now known as ducky |
| 20:49:48 | | Webuser069071 joins |
| 20:49:53 | | Webuser069071 quits [Client Quit] |
| 21:01:57 | | DogsRNice joins |
| 21:51:05 | <h2ibot> | Cooljeanius edited Archiveteam:Copyrights (+8, /* You and ArchiveTeam Wiki content */ use URL…): https://wiki.archiveteam.org/?diff=60090&oldid=60064 |
| 23:11:42 | | Dada quits [Remote host closed the connection] |
| 23:12:33 | | nexussfan (nexussfan) joins |
| 23:35:50 | <klea> | btw, does anybody know of some OCR tool that will not struggle with iranian text? |
| 23:36:27 | <klea> | this stupid page i found seems to only give me links to shitter, not to the telegram groups, and i'd like to pass those through #telegrab. https://iran.liveuamap.com/en/2026/11-january-19-brigadier-general-javad-keshavarz-was-killed |
| 23:38:21 | <ericgallager> | The Persian alphabet is basically the Arabic one with a few extra letters: https://en.wikipedia.org/wiki/Persian_alphabet |
| 23:38:36 | <klea> | thanks |
| 23:41:04 | <klea> | i've found <https://olocr.com/ocr/persian> useful. |
| 23:41:28 | <ericgallager> | if you're using tesseract, you may need to install the Persian language data package as well; in MacPorts it's tesseract-fas |
| 23:41:55 | <ericgallager> | ("fas" being short for "Farsi" I suppose) |
| 23:42:01 | <nexussfan> | Yes |
| 23:42:15 | <klea> | im not automating it :p, im manually taking screenshots with firefox, and then putting them through the ocr, copying things i think may give me more links, and giving them to my bot to make qubert queue. |
| 23:44:54 | <klea> | thanks ericgallager |
| 23:44:56 | <klea> | ericgallager++ |
| 23:44:56 | <eggdrop> | [karma] 'ericgallager' now has 1 karma! |
| 23:50:34 | <@JAA> | ISO 639++ |
| 23:50:34 | <eggdrop> | [karma] 'ISO 639' now has 1 karma! |
| 23:54:35 | <nstrom|m> | huh you don't see the 3 letter variants that often in the wild |
| 23:54:40 | <nstrom|m> | usually it's just 639-1 |
| 23:55:01 | <nexussfan> | I usually see `fa` on farsi-related stuff |
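
For reference, the two-letter vs. three-letter codes discussed above (`fa` vs. `fas`) can be bridged with a small lookup. This is only a minimal sketch: the table `ISO_639_1_TO_3` and the function `to_three_letter` are hypothetical names made up for illustration, and the table covers just a few languages rather than the full standard. It assumes the OCR language data is named by the three-letter code, as with the `tesseract-fas` package mentioned above.

```python
# Hypothetical, deliberately tiny lookup table (not the full ISO 639 registry):
# ISO 639-1 two-letter code -> three-letter code used for OCR language data.
ISO_639_1_TO_3 = {
    "fa": "fas",  # Persian (Farsi)
    "ar": "ara",  # Arabic
    "en": "eng",  # English
}


def to_three_letter(code: str) -> str:
    """Return the three-letter code for a two- or three-letter language code.

    Three-letter input is passed through unchanged (only normalised to
    lowercase); unknown two-letter input raises ValueError.
    """
    normalised = code.strip().lower()
    if len(normalised) == 3:
        return normalised
    if normalised in ISO_639_1_TO_3:
        return ISO_639_1_TO_3[normalised]
    raise ValueError(f"no three-letter code known for {code!r}")


if __name__ == "__main__":
    print(to_three_letter("fa"))
```

The result could then be passed to an OCR tool that selects language data by three-letter code, for example as the value of tesseract's `-l` option.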