00:12:56khaoohs_ quits [Read error: Connection reset by peer]
00:13:15khaoohs_ joins
00:18:01<pabs>klea: re HN, I think we want both the website (since WBM lookups will mostly be that) and the API (for completeness)
00:18:23<pabs>klea: don't use ws://archivebot.com:4568/ please, rationale on https://wiki.archiveteam.org/index.php/ArchiveBot/Monitoring
00:18:44nomadgeek quits [Client Quit]
00:22:06<pabs>Guest: HN is mostly a good source of outlinks, not in trouble AFAIK
00:24:03Dango3602 (Dango360) joins
00:24:30Dango360 quits [Read error: Connection reset by peer]
00:24:30Dango3602 is now known as Dango360
00:29:37Guest58 quits [Client Quit]
00:30:15Guest58 joins
00:37:13kansei quits [Quit: ZNC 1.10.1 - https://znc.in]
00:38:26etnguyen03 quits [Client Quit]
00:44:21kansei (kansei) joins
00:51:13cyanbox_ joins
00:52:05<pabs>not sure how real this is but GrapheneOS seems to be targeted by France https://mamot.fr/@LaQuadrature/115581775965025042 https://news.ycombinator.com/item?id=46035977
00:52:49<nicolas17>the GrapheneOS maintainer has a persecution complex
00:53:16DogsRNice_ joins
00:53:32<pabs>yeah
00:54:17cyanbox quits [Ping timeout: 272 seconds]
00:56:19DogsRNice__ joins
00:56:49DogsRNice quits [Ping timeout: 272 seconds]
00:59:21DogsRNice_ quits [Ping timeout: 272 seconds]
01:02:27<pabs>saving their static site anyway, since that's what they suggest is at risk
01:10:50Guest58 quits [Client Quit]
01:12:24etnguyen03 (etnguyen03) joins
01:17:24Guest58 joins
01:28:48xkey quits [Quit: WeeChat 4.7.1]
01:29:04xkey (xkey) joins
01:55:06thalia (thalia) joins
02:05:00etnguyen03 quits [Client Quit]
02:12:11Guest58 quits [Ping timeout: 272 seconds]
02:13:42Guest58 joins
02:27:12<dendory>Re: GrapheneOS: "The moment the police's IT specialists tried to exploit it, the device mysteriously reset itself." Basically, as soon as they tried to hack it, the phone reset itself, i.e. it did exactly what anything that respects privacy should do. That article seems to be a pretty big endorsement of GrapheneOS
02:27:12<dendory>:P
02:30:35TastyWiener95 (TastyWiener95) joins
02:40:21Guest58 quits [Client Quit]
02:40:34wickedplayer494 quits [Ping timeout: 256 seconds]
02:41:01etnguyen03 (etnguyen03) joins
02:41:28wickedplayer494 joins
03:00:24klg quits [Ping timeout: 256 seconds]
03:12:35TunaLobster quits [Quit: So long and thanks for all the fish]
03:16:28TunaLobster joins
03:22:49Guest58 joins
03:27:06Wohlstand quits [Quit: Wohlstand]
03:27:59Guest58 quits [Client Quit]
03:29:04Guest58 joins
03:29:34etnguyen03 quits [Remote host closed the connection]
03:30:57Wohlstand (Wohlstand) joins
03:31:26Guest58 quits [Client Quit]
04:02:10Sokar quits [Ping timeout: 256 seconds]
04:10:44klg (klg) joins
04:13:53Wohlstand quits [Client Quit]
04:44:36sec^nd quits [Remote host closed the connection]
04:44:57sec^nd (second) joins
05:18:20ThetaDev quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
05:19:08ThetaDev joins
05:37:32ducky quits [Ping timeout: 260 seconds]
05:39:09ducky (ducky) joins
05:39:34Guest58 joins
05:56:02Sokar joins
06:45:38<klea>pabs: thanks
06:46:02<klea>pabs: then should i generate another url list for pokechu22 for the website?
06:46:11<klea>i guess that probably DOES have a ratelimit
06:47:50<pokechu22>klea: it doesn't really matter, since archivebot can't go faster than con=6, d=0 per job. It does seem like US/canada pipelines are a lot faster than the ones in Europe though so I'll stick to those
06:51:56<pabs>re website, probably not since #// covers that
06:52:32Hackerpcs quits [Quit: Hackerpcs]
07:01:19<klea>oh ok
07:01:23<klea>so the api was just uncovered
07:01:39<klea>pabs: could #// cover new API entries?
07:01:47<klea>i shouldn't be pinging you so much :(
07:02:08<pabs>nothing links to the API entries, so unlikely
07:09:32<cruller><cruller> "Fortunately, the Google search..." <- For now, I've compiled the 33,176 known URLs on kinet-tv.ne.jp (scheduled to close on 11/30) into https://transfer.archivete.am/wtfmn/kinet-tv.ne.jp_urls.txt. This list contains a lot of broken/dead links.
07:09:32<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/wtfmn/kinet-tv.ne.jp_urls.txt.
07:13:39nine quits [Ping timeout: 272 seconds]
07:15:00<cruller>If needed, I can exclude potentially broken URLs, but false positives will inevitably occur.
07:18:43DogsRNice__ quits [Read error: Connection reset by peer]
07:25:19<cruller>TIL Bing sometimes secretly hides some search results.
07:30:21nahkis joins
07:38:29michaelblob quits [Quit: yoop]
07:39:06michaelblob joins
07:39:13nahkis quits [Client Quit]
08:11:55raccoon1 quits [Ping timeout: 272 seconds]
08:17:57raccoon1 (raccoon) joins
08:53:26^ quits [Ping timeout: 256 seconds]
08:53:27^ (^) joins
09:03:42choochaa quits [Remote host closed the connection]
09:04:02choochaa (choochaa) joins
09:36:26HackMii quits [Remote host closed the connection]
09:36:46HackMii (hacktheplanet) joins
09:43:14nine joins
09:43:15nine quits [Changing host]
09:43:15nine (nine) joins
10:01:20nine quits [Client Quit]
10:23:34VerifiedJ quits [Quit: The Lounge - https://thelounge.chat]
10:24:04VerifiedJ (VerifiedJ) joins
10:33:07NF885 (NF885) joins
10:33:35NF885 quits [Client Quit]
10:35:03nicolas17 quits [Ping timeout: 272 seconds]
10:43:12nine joins
10:43:13nine quits [Changing host]
10:43:13nine (nine) joins
11:06:58Webuser085523 joins
11:08:28Webuser085523 quits [Client Quit]
11:29:42ThetaDev quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
11:30:33ThetaDev joins
11:41:51Guest58 quits [Quit: My Mac has gone to sleep. ZZZzzz…]
12:00:03Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat]
12:02:49Bleo182600722719623455222 joins
12:27:05cyanbox_ quits [Read error: Connection reset by peer]
12:51:33Wohlstand (Wohlstand) joins
13:45:24<hexagonwin|m>in wget-lua, how can i load and parse html? pagination buttons are <a> links with href='javascript:;' and onclick='goCommentPage(1)', so i'm trying to get the page number from onclick and convert that to a URL like jjang0u.com/comment/list/pes/15986868/1
13:47:01<hexagonwin|m>slowly learning how to script, this is what i've written so far https://termbin.com/3k20
14:25:36<@arkiver>hexagonwin|m: you can open and read the file at the `file` parameter to that function
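A minimal sketch of the extraction step discussed above, written in Python purely as an illustration; it is not wget-lua code and not the script from the termbin link. It assumes the page HTML has already been read into a string (for example, from the file that arkiver mentions). The function name `comment_page_urls`, the `https://` scheme, and the reading of `pes` and `15986868` as a board name and post ID are assumptions inferred from the single example URL in the question.

```python
import re

# Matches the page number inside onclick handlers such as goCommentPage(3).
ONCLICK_PAGE = re.compile(r"goCommentPage\(\s*(\d+)\s*\)")


def comment_page_urls(html, board, post_id):
    """Return one comment-list URL per distinct page number found in html."""
    pages = sorted({int(n) for n in ONCLICK_PAGE.findall(html)})
    # Assumed URL shape, copied from the one example given in the question.
    return [
        f"https://jjang0u.com/comment/list/{board}/{post_id}/{page}"
        for page in pages
    ]


# Hypothetical pagination markup in the style described in the question.
sample = (
    "<a href='javascript:;' onclick='goCommentPage(1)'>1</a>"
    "<a href='javascript:;' onclick='goCommentPage(2)'>2</a>"
    "<a href='javascript:;' onclick='goCommentPage(2)'>next</a>"
)
urls = comment_page_urls(sample, "pes", "15986868")
```

The same idea should carry over to a Lua pattern match over the file contents, since no real HTML parsing is needed to pull a number out of an onclick attribute.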
14:28:43Island quits [Read error: Connection reset by peer]
14:36:06<@arkiver>cruller: nice! do you have an idea how complete that list may be?
14:36:22<@arkiver>datechnoman: would you be able to search your stash of URLs for anything kinet-tv.ne.jp ?
14:39:30<@arkiver>i was talking with hexagonwin|m privately
14:39:35<@arkiver>but should move that to a channel
14:39:53<@arkiver>anyone ( hexagonwin|m ? ) have a channel name idea for jjang0u.com ? :)
14:40:19<justauser|m>junk0u
14:40:34<justauser|m>jjang0nfile
14:44:54<justauser|m>g0ullash
14:46:36<hexagonwin|m>maybe jjang0ff
14:47:35hexagonwin quits [Quit: still connected thru matrix]
14:47:43<justauser|m>Or even jjang0n, which is just a vertical flip away.
14:48:13<hexagonwin|m>sounds good
14:48:51<@arkiver>hexagonwin|m: which shall we take? :P
14:49:37<hexagonwin|m>arkiver i'm out of ideas lol
14:50:59<@arkiver>i mean for the decision
14:51:05<@arkiver>but let's do #jjang0n
14:51:28<hexagonwin|m>alright
14:52:08^ quits [Ping timeout: 256 seconds]
14:53:04^ (^) joins
14:54:40<katia>Thibaultmol, https://archive.fart.website/archivebot/viewer/job/20240917081938dyqni
14:55:44<katia>this is the archive of direct stl/pdf links for all of printables i did about a year ago
14:56:16<Thibaultmol>all of it, nice! good to know it's being backed up thx
14:56:19<Thibaultmol>👍️
14:56:32<katia>the list of all links was https://transfer.archivete.am/zHl4y/files.printables.com (warning, big file)
14:56:32<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/zHl4y/files.printables.com
14:56:43<katia>but since then, i haven't archived anything new
14:57:24<katia>i've been meaning to make another few hundred thousand requests to the printables API to get a new list of direct URLs to archive but alas
15:00:46Shard795 (Shard) joins
15:01:46Shard79 quits [Ping timeout: 256 seconds]
15:01:47Shard795 is now known as Shard79
15:06:45^ quits [Ping timeout: 272 seconds]
15:07:08^ (^) joins
15:09:10nicolas17 (nicolas17) joins
15:18:01<cruller>arkiver: Roughly speaking, I think it's almost complete. At least, it should include all URLs within the Google index. Bing might yield a few more small subdirectories.
15:18:08<cruller>First, I searched for site:kinet-tv.ne.jp, and then repeated the search by excluding the subdirectory that was most frequent in the previous search results.
15:19:54Shard79 quits [Ping timeout: 256 seconds]
15:21:11<cruller>I stopped searching on Google when the number of results fell below 300, and on Bing when no new subdirectories were found (although there were still over 600 results at that point).
15:21:56<cruller>Note 1: No new URLs were obtained from "Open Directory Project data," "MediaWiki wikis," Twitter, Yandex, Google Scholar, Google Books, or Hatena bookmark.
15:22:05Wohlstand quits [Client Quit]
15:22:18<cruller>Note 2: The final search query on Bing is "site:kinet-tv.ne.jp -site:kinet-tv.ne.jp/~tai -site:kinet-tv.ne.jp/~katagiri -site:kinet-tv.ne.jp/~tam-y -site:kinet-tv.ne.jp/~nisimura -site:kinet-tv.ne.jp/~sakura_2 -site:kinet-tv.ne.jp/~precious".
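The exclusion loop cruller describes could be sketched roughly as follows. This is an illustration under assumptions, not the procedure actually used: `search` is a hypothetical callable that takes a query string and returns a list of result URLs (no real search engine API is implied), and `enumerate_site`, `first_subdirectory`, `min_results`, and `max_rounds` are made-up names. The stop threshold of 300 mirrors the Google cutoff mentioned above.

```python
from collections import Counter
from urllib.parse import urlsplit


def first_subdirectory(url):
    """Return the first path segment if the URL lies inside a subdirectory."""
    parts = urlsplit(url).path.split("/")
    # "/~tai/index.html" -> ["", "~tai", "index.html"]; "/index.html" has no subdirectory.
    if len(parts) >= 3 and parts[1]:
        return parts[1]
    return None


def enumerate_site(search, domain, min_results=300, max_rounds=50):
    """Repeat a site: search, each round excluding the most frequent subdirectory."""
    excluded = []
    seen = set()
    for _ in range(max_rounds):
        terms = [f"site:{domain}"] + [f"-site:{domain}/{d}" for d in excluded]
        results = search(" ".join(terms))
        seen.update(results)
        if len(results) < min_results:
            break  # few enough results that further exclusion is not needed
        counts = Counter(
            d for d in map(first_subdirectory, results) if d and d not in excluded
        )
        if not counts:
            break  # nothing left to exclude
        excluded.append(counts.most_common(1)[0][0])
    return seen, excluded


# Stand-in search over a tiny made-up index, only to exercise the loop.
INDEX = [
    "http://example.jp/~a/1.html",
    "http://example.jp/~a/2.html",
    "http://example.jp/~a/3.html",
    "http://example.jp/~b/1.html",
    "http://example.jp/~b/2.html",
    "http://example.jp/index.html",
]


def fake_search(query):
    banned = [t[len("-site:"):] for t in query.split() if t.startswith("-site:")]
    return [
        u for u in INDEX
        if not any(u.split("://", 1)[1].startswith(b + "/") for b in banned)
    ]


found, excluded = enumerate_site(fake_search, "example.jp", min_results=2)
```

Passing the search function in as a parameter keeps the loop independent of any particular engine, which matters here because the Google and Bing runs used different stopping rules.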
15:30:11^ quits [Ping timeout: 272 seconds]
15:31:36<h2ibot>Justauser edited Site exploration (+194, /* Wayback Machine */ added JAA's script): https://wiki.archiveteam.org/?diff=57864&oldid=57585
15:33:39^ (^) joins
15:44:30^ quits [Read error: Connection reset by peer]
15:45:14^ (^) joins
15:54:18archiveDrill quits [Quit: The Lounge - https://thelounge.chat]
15:58:54archiveDrill joins
15:59:23<@arkiver>cruller: it seems that kinet-tv.ne.jp is small then
16:07:14<cruller>Yes, that makes sense since it was a service provided only to Kyoto residents until 2007.
16:30:52ducky quits [Ping timeout: 260 seconds]
16:33:04ducky (ducky) joins
16:58:56hexagonwin (hexagonwin) joins
17:06:20ducky quits [Ping timeout: 260 seconds]
17:12:19ducky (ducky) joins