00:08:36skankhunt42 (skankhunt42) joins
00:11:25Webuser692424 quits [Client Quit]
00:19:40iseaup quits [Remote host closed the connection]
00:48:45iseaup (iseaup) joins
01:04:00Shard1115828 (Shard) joins
01:04:36Shard111582 quits [Ping timeout: 268 seconds]
01:04:37Shard1115828 is now known as Shard111582
01:13:48<pabs>HN thread about IA, AT is mentioned https://news.ycombinator.com/item?id=47464818
01:28:18<h2ibot>PaulWise edited Obstacles (+203, multi-protocol passive fingerprinting): https://wiki.archiveteam.org/?diff=60776&oldid=60601
01:31:19<pabs>someone in the thread mentioned mTLS as a possible option for allowing archivists but blocking AI scrapers, seems like the best option to me, apart from IP address allowlists
01:35:21azalea_sh_ quits [Ping timeout: 268 seconds]
01:35:34azalea_sh_ (azalea_sh_) joins
01:35:36polypept1 (polypeptide) joins
01:38:50polypeptide quits [Ping timeout: 240 seconds]
01:39:59retrograde quits [Remote host closed the connection]
01:40:47retrograde (retrograde) joins
01:48:32Webuser623642 joins
01:48:42Webuser623642 quits [Client Quit]
01:57:10<nicolas17>pabs: another option would be https://developers.cloudflare.com/bots/reference/bot-verification/web-bot-auth/
01:59:07<nicolas17>I don't think we can use that for DPoS though, how would we distribute keys and prevent their misuse outside archival?
01:59:16<nicolas17>would work for AB
01:59:41<klea>Maybe time to convince Jason to talk as IA to CF to get them to allow it for AB, for sites where they have that box ticked to go to WBM, and otherwise have easier access.
02:00:30<nicolas17>we'd have to actually implement it in AB first :P
02:00:46<klea>I wonder if you got enough signatures, if it would mean AB would need to rotate signature keys
02:00:49<klea>oh yeah.
02:18:47<pabs>huh, archive.today can't always bypass Cloudflare: https://archive.md/U8kxE
02:22:50hamouda joins
02:29:38dabs quits [Quit: Leaving]
02:35:15DogsRNice joins
02:40:48hamouda quits [Client Quit]
02:41:27<h2ibot>John5433 edited 4chan (+4799, /* Closed Archives */ added image): https://wiki.archiveteam.org/?diff=60777&oldid=60759
02:41:28<h2ibot>John5433 edited Talk:4chan (+599, /* 4rchive.org */ new section): https://wiki.archiveteam.org/?diff=60778&oldid=55421
02:41:29<h2ibot>John5433 uploaded File:Archive-palanq-win.png: https://wiki.archiveteam.org/?title=File%3AArchive-palanq-win.png
02:41:30<h2ibot>John5433 uploaded File:Installgentoo.png: https://wiki.archiveteam.org/?title=File%3AInstallgentoo.png
02:41:31<h2ibot>John5433 uploaded File:4rchive.png: https://wiki.archiveteam.org/?title=File%3A4rchive.png
02:41:32<h2ibot>John5433 uploaded File:Ayasequart.png: https://wiki.archiveteam.org/?title=File%3AAyasequart.png
02:48:17<pabs>a replacement for Myrent: https://minerva-archive.org/
02:48:40<pabs>allegedly. doesn't have much yet
03:02:05retrograde quits [Remote host closed the connection]
03:02:29retrograde (retrograde) joins
03:05:30iseaup quits [Ping timeout: 240 seconds]
03:05:50polypept1 quits [Ping timeout: 240 seconds]
03:06:56polypeptide (polypeptide) joins
03:12:39DrowsyCrow quits [Quit: Ooops, wrong browser tab.]
03:36:24Lord_Nightmare quits [Quit: ZNC - http://znc.in]
03:39:47Lord_Nightmare (Lord_Nightmare) joins
03:45:12PredatorIWD48 joins
03:47:19PredatorIWD4 quits [Ping timeout: 268 seconds]
03:47:19PredatorIWD48 is now known as PredatorIWD4
04:02:17etnguyen03 quits [Remote host closed the connection]
04:14:37nexussfan quits [Quit: Konversation terminated!]
04:17:11DogsRNice quits [Read error: Connection reset by peer]
04:36:13n9nes quits [Remote host closed the connection]
04:36:44Webuser898935 joins
04:36:54Webuser898935 quits [Client Quit]
04:37:18n9nes joins
04:55:18benjins3 joins
04:57:37benjins3_ quits [Ping timeout: 268 seconds]
05:04:29n9nes quits [Ping timeout: 268 seconds]
05:04:50n9nes joins
05:34:42ducky quits [Ping timeout: 268 seconds]
05:58:17Island_ joins
05:59:59grill quits [Ping timeout: 268 seconds]
06:01:28grill (grill) joins
06:06:26<systwi_>Doesn't have anything yet from what I've seen. :-<
06:06:37<systwi_>Hopefully they do add real links one day.
07:09:42ducky (ducky) joins
07:15:15Webuser350808 joins
07:16:52Webuser350808 quits [Client Quit]
07:25:00ducky quits [Ping timeout: 268 seconds]
07:25:10ducky (ducky) joins
07:32:31Island quits [Read error: Connection reset by peer]
07:32:31Island_ quits [Read error: Connection reset by peer]
07:41:11<h2ibot>Manu edited Mailing Lists (+29, Sympa: Add lists.gruene-rlp.de): https://wiki.archiveteam.org/?diff=60783&oldid=60520
07:45:21ramsey quits [Ping timeout: 633 seconds]
07:46:24ramsey (ramsey) joins
07:46:41th3ph3d quits [Read error: Connection reset by peer]
07:46:44th3ph3d joins
07:47:42jonty quits [Read error: Connection reset by peer]
07:47:47jonty (jonty) joins
07:47:49IDK quits [Ping timeout: 633 seconds]
07:49:27JSharp quits [Read error: Connection reset by peer]
07:49:31JSharp (JSharp) joins
07:50:08IDK (IDK) joins
08:09:04Webuser925734 joins
08:09:07Webuser925734 quits [Client Quit]
08:18:02ericgallager quits [Ping timeout: 268 seconds]
08:34:04Webuser304702 joins
08:34:19Webuser304702 quits [Client Quit]
09:01:49twiswist_ quits [Read error: Connection reset by peer]
09:01:49ducky quits [Ping timeout: 268 seconds]
09:02:00twiswist (twiswist) joins
09:02:26ducky (ducky) joins
09:07:22<h2ibot>Exorcism edited Open Diary (+2): https://wiki.archiveteam.org/?diff=60784&oldid=60391
09:19:24<h2ibot>Exorcism uploaded File:Medica-screenshot.png: https://wiki.archiveteam.org/?title=File%3AMedica-screenshot.png
09:23:24<h2ibot>Exorcism created Medica Bibliothèque Numérique (+1008, Created page with "{{Infobox project | title =…): https://wiki.archiveteam.org/?oldid=60786
09:26:57<triplecamera|m>TheTechRobo: Thank you, I will have a try
09:28:25<h2ibot>Exorcism edited Main Page/Current Projects (-188): https://wiki.archiveteam.org/?diff=60787&oldid=60568
09:30:50<triplecamera|m>justauser: I can use grab-site for now, but I'm worried that grab-site (and wpull) lack maintenance. On the contrary, wget-lua is still actively maintained.
09:36:26<h2ibot>Bzc6p edited ArchiveTeam Warrior (+63, /* Installing and running with Docker */…): https://wiki.archiveteam.org/?diff=60788&oldid=59360
09:42:27<h2ibot>Bzc6p edited Talk:ArchiveTeam Warrior (+767, /* "Otherwise, the Docker-specific images are…): https://wiki.archiveteam.org/?diff=60789&oldid=60740
10:00:29<h2ibot>Manu edited Discourse/archived (+148, Queued forum.piratskastranka.si): https://wiki.archiveteam.org/?diff=60790&oldid=60754
10:02:11<gamer191-1|m>Regarding archive.today, if we just wanted to archive it as a link shortener (and archive a thumbnail image for each url) we could archive 20 URLs at a time by searching for common domains. I don’t know how rate-limited the search is though
10:02:30<h2ibot>Manu edited Discourse/archived (+110, Queued forum.contextualelectronics.com): https://wiki.archiveteam.org/?diff=60791&oldid=60790
10:06:10<gamer191-1|m>Wait actually we can do much better by saving their Google custom searches
10:11:58<gamer191-1|m>OMG, I found a rate-limit free archive.md api:
10:13:27<gamer191-1|m>https://archive.md/cse.js?id=XXXXX
10:13:27<gamer191-1|m>Where XXXXX is 5 alphanumeric characters
10:13:33<gamer191-1|m>Someone send this to urlteam
10:15:23<gamer191-1|m>Actually I will (I got over excited and forgot they obviously have a public channel)
10:21:46Webuser172360 joins
10:22:12Webuser172360 quits [Client Quit]
10:37:34Shard111582 quits [Read error: Connection reset by peer]
10:37:46Shard111582 (Shard) joins
11:00:04Bleo18260072271962345522201 quits [Quit: The Lounge - https://thelounge.chat]
11:02:44Bleo18260072271962345522201 joins
11:38:35klea points out that cse.js just gives out apparently prefilled stuff.
11:39:32<klea>oh I'm stupid.
11:44:57etnguyen03 (etnguyen03) joins
11:46:44<gamer191-1|m>No it doesn’t (eg: https://archive.md/cse.js?id=T3Hc7)
11:47:57<gamer191-1|m>Well I don’t know what prefilled means
11:48:06<gamer191-1|m>But it definitely gives useful information
11:50:43<klea>oh yeah, i tested with common things which weren't valid references.
11:56:47<gamer191-1|m>Actually they originally used 4 character IDs until they ran out (thanks URLTeam wiki). So any 4 alphanumeric characters will be a valid id
11:57:54<klea>→ #urlteam
12:07:26hackbug quits [Ping timeout: 268 seconds]
12:11:54<Yakov>it seems to give the full archived url and an exact timestamp in the form of returning html in a js function
12:12:19<klea>yep
12:12:30<klea>I just tested with something that wasn't valid at first.
12:14:24<Yakov>another url it contains is the thumb png, which is ratelimited. this is a pretty interesting endpoint
12:16:11<Yakov>thumb seems to be the screenshot preview of the capture e.g.: https://archive.ph/wdq4u/70f04774d40bd8b94aa69b1c058b908a619c7888/thumb.png
12:17:35<Yakov>ArchiveBot 429d at the thumb for .md but not for .ph, i would assume that it is under the same ratelimit restrictions though and .md had some previous load from AB
12:23:30SootBector quits [Ping timeout: 240 seconds]
12:26:30SootBector (SootBector) joins
12:27:02Arcorann (Arcorann) joins
12:29:01Arcorann_ quits [Ping timeout: 268 seconds]
12:34:39skankhunt42 quits [Ping timeout: 268 seconds]
12:38:44<Yakov>i experimented with queuing 500 sequential ids starting from 5 chars with charset [a-zA-Z0-9]: https://transfer.archivete.am/inline/aI6Eg/archiveph-csejs-first-500-sequential.txt
12:41:00<Yakov>seems to be very impressive results (https://img.yakov.cloud/rqBHc.png, job id: 1ihfmd8cwv8rmjimyc8bh1f1y)
12:42:17<Yakov>I wonder if this is worth scraping. also don't know how long this will be unrestricted for.
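[note] A minimal sketch of how a batch of sequential cse.js URLs like the one above could be generated. It only builds the URL list and makes no requests. The character ordering (a-z, then A-Z, then 0-9) and the names sequential_ids and cse_urls are assumptions for illustration; the log does not say which ordering or tooling was actually used.

```python
from itertools import islice, product
import string

# Assumed ordering of the [a-zA-Z0-9] charset; the real ordering used
# for the queued job is not stated in the log.
CHARSET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def sequential_ids(length=5, charset=CHARSET):
    """Yield every ID of the given length in charset order (hypothetical helper)."""
    for combo in product(charset, repeat=length):
        yield "".join(combo)


def cse_urls(count, length=5, host="archive.ph"):
    """Build the first `count` cse.js URLs for IDs of the given length."""
    ids = islice(sequential_ids(length), count)
    return [f"https://{host}/cse.js?id={i}" for i in ids]


# The first 500 five-character IDs, matching the batch size mentioned above.
urls = cse_urls(500)
print(urls[0])
print(len(urls))
```

The resulting list can be written to a text file, one URL per line, and queued however is convenient. Passing length=4 would cover the older four-character ID space instead.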
12:47:15<gamer191-1|m>Yakov: thumbnails are supposed to be accessed from a 14 character subdomain beginning with a d (eg https://d1234567890123.archive.md/THUMBNAILURL). Does that bypass the rate limiting?
12:48:13<Yakov>where do you see 14 character subdomains (and beginning with a d) being used?
12:49:15<Yakov>also i didn't get ratelimited doing 500 consecutively, not sure if this has to do with me trying it on archive.ph instead of .md or i just didn't hit the limit yet for that pipeline
12:49:30<Yakov>even on the thumb url*
12:51:10<gamer191-1|m>Yakov: That’s what the website uses (all this stuff is loaded when you search for something other than a domain name in the search box on archive.today). I didn’t check the JavaScript, I just experimented to see which domains do and don’t resolve
12:53:30<Yakov>Yeah, interesting. I don't know then.
12:53:51<Yakov>I'll try queuing the next 500 and we'll see if any ratelimits kick in for thumbs
12:54:56<gamer191-1|m>Btw, if you use 4 characters then it will always find something (because archive.today exhausted the 4 character IDs before switching to 5 characters)
12:55:36Webuser685136 joins
12:56:00FiTheArchiver joins
12:58:18<gamer191-1|m>Wait, I’m wrong, it does load the images from the main server
12:58:45<gamer191-1|m>I got confused because the image-based captures load from that subdomain
12:58:46<gamer191-1|m>Sorry
12:58:56<gamer191-1|m>But you can use those subdomains though
12:59:32<Yakov>Once again, another success on the next 500 IDs (job id: bebx4hdjcsixcwdpk5s89p6zf), no ratelimits on thumbs either. Very strange, not sure why the ratelimit happened on the first try on that single cse.js job for archive.md
12:59:39<gamer191-1|m>But yeah, the actual site uses the main domain for the thumbnails and the d subdomains for the image based captures
12:59:54<gamer191-1|m>Nice!
13:02:41FiTheArchiver quits [Client Quit]
13:04:01Webuser837525 joins
13:04:06Webuser837525 quits [Client Quit]
13:20:54Arcorann quits [Ping timeout: 268 seconds]
13:29:19oxtyped quits [Read error: Connection reset by peer]
13:34:31Wohlstand1 (Wohlstand) joins
13:36:54Wohlstand1 is now known as Wohlstand
13:39:08oxtyped joins
13:45:06nexussfan (nexussfan) joins
13:56:50Webuser685136 quits [Client Quit]
14:17:52SootBector quits [Remote host closed the connection]
14:19:12SootBector (SootBector) joins
14:38:21Webuser776797 joins
14:39:34Webuser776797 quits [Client Quit]
15:00:53Webuser554683 joins
15:01:26Webuser554683 quits [Client Quit]
15:16:09SootBector quits [Remote host closed the connection]
15:17:19SootBector (SootBector) joins
15:20:48hackbug (hackbug) joins
15:23:52Cornelius705 quits [Quit: Cornelius705]
15:24:52Cornelius705 (Cornelius) joins
15:26:16hackbug quits [Remote host closed the connection]
15:37:06SootBector quits [Remote host closed the connection]
15:38:15SootBector (SootBector) joins
15:39:41hackbug (hackbug) joins
15:54:30SootBector quits [Ping timeout: 240 seconds]
15:56:23SootBector (SootBector) joins
16:11:38@imer quits [Quit: Oh no]
16:24:40ducky quits [Remote host closed the connection]
16:34:09Webuser480342 joins
16:34:21Webuser480342 quits [Client Quit]
16:35:22<h2ibot>Exorcism uploaded File:Pinger-screenshot.png: https://wiki.archiveteam.org/?title=File%3APinger-screenshot.png
16:36:23<h2ibot>Exorcism edited Pinger (+57): https://wiki.archiveteam.org/?diff=60793&oldid=49285
16:39:44imer (imer) joins
16:39:44@ChanServ sets mode: +o imer
16:42:48ducky (ducky) joins
16:54:41DogsRNice joins
16:56:55Cornelius705 quits [Client Quit]
16:57:50Cornelius705 (Cornelius) joins
17:05:08ducky quits [Remote host closed the connection]
17:08:34ducky (ducky) joins
17:11:01ducky quits [Remote host closed the connection]
17:14:42ducky (ducky) joins
17:32:30<h2ibot>DoomTay edited Web Roasting/ISP Hosting (-7): https://wiki.archiveteam.org/?diff=60794&oldid=60586
17:52:00Cornelius705 quits [Client Quit]
18:10:02ducky quits [Ping timeout: 268 seconds]
18:10:32dabs joins
18:11:30dabs quits [Remote host closed the connection]
18:11:42dabs joins
18:20:16ducky (ducky) joins
18:45:48azalea_sh_ quits [Ping timeout: 268 seconds]
18:46:01azalea_sh_ (azalea_sh_) joins
18:58:41root joins
18:59:12Webuser366519 joins
19:00:26Webuser366519 is now known as Hyperion-Op
19:06:47Hyperion-Op is now known as Hyperion-SysOps
19:07:07Hyperion-SysOps quits [Client Quit]
19:08:49root quits [Client Quit]
19:14:54Cornelius705 (Cornelius) joins
19:21:30Island joins
19:24:07lennier2 quits [Ping timeout: 268 seconds]
19:26:16lennier2 joins
19:27:15Hyperion-SysOp (Hyperion-SysOp) joins
19:35:37APOLLO03 quits [Quit: .]
19:46:22Cornelius705 quits [Client Quit]
19:47:16Cornelius705 (Cornelius) joins