| 00:08:36 | | skankhunt42 (skankhunt42) joins |
| 00:11:25 | | Webuser692424 quits [Client Quit] |
| 00:19:40 | | iseaup quits [Remote host closed the connection] |
| 00:48:45 | | iseaup (iseaup) joins |
| 01:04:00 | | Shard1115828 (Shard) joins |
| 01:04:36 | | Shard111582 quits [Ping timeout: 268 seconds] |
| 01:04:37 | | Shard1115828 is now known as Shard111582 |
| 01:13:48 | <pabs> | HN thread about IA, AT is mentioned https://news.ycombinator.com/item?id=47464818 |
| 01:28:18 | <h2ibot> | PaulWise edited Obstacles (+203, multi-protocol passive fingerprinting): https://wiki.archiveteam.org/?diff=60776&oldid=60601 |
| 01:31:19 | <pabs> | someone in the thread mentioned mTLS as a possible option for allowing archivists but blocking AI scrapers, seems like the best option to me, apart from IP address allowlists |
| 01:35:21 | | azalea_sh_ quits [Ping timeout: 268 seconds] |
| 01:35:34 | | azalea_sh_ (azalea_sh_) joins |
| 01:35:36 | | polypept1 (polypeptide) joins |
| 01:38:50 | | polypeptide quits [Ping timeout: 240 seconds] |
| 01:39:59 | | retrograde quits [Remote host closed the connection] |
| 01:40:47 | | retrograde (retrograde) joins |
| 01:48:32 | | Webuser623642 joins |
| 01:48:42 | | Webuser623642 quits [Client Quit] |
| 01:57:10 | <nicolas17> | pabs: another option would be https://developers.cloudflare.com/bots/reference/bot-verification/web-bot-auth/ |
| 01:59:07 | <nicolas17> | I don't think we can use that for DPoS though, how would we distribute keys and prevent their misuse outside archival? |
| 01:59:16 | <nicolas17> | would work for AB |
| 01:59:41 | <klea> | Maybe time to convince Jason to talk as IA to CF to get them to allow it for AB, for sites where they have that box ticked to go to WBM, and otherwise have easier access. |
| 02:00:30 | <nicolas17> | we'd have to actually implement it in AB first :P |
| 02:00:46 | <klea> | I wonder if you got enough signatures, if it would mean AB would need to rotate signature keys |
| 02:00:49 | <klea> | oh yeah. |
| 02:18:47 | <pabs> | huh, archive.today can't always bypass Cloudflare: https://archive.md/U8kxE |
| 02:22:50 | | hamouda joins |
| 02:29:38 | | dabs quits [Quit: Leaving] |
| 02:35:15 | | DogsRNice joins |
| 02:40:48 | | hamouda quits [Client Quit] |
| 02:41:27 | <h2ibot> | John5433 edited 4chan (+4799, /* Closed Archives */ added image): https://wiki.archiveteam.org/?diff=60777&oldid=60759 |
| 02:41:28 | <h2ibot> | John5433 edited Talk:4chan (+599, /* 4rchive.org */ new section): https://wiki.archiveteam.org/?diff=60778&oldid=55421 |
| 02:41:29 | <h2ibot> | John5433 uploaded File:Archive-palanq-win.png: https://wiki.archiveteam.org/?title=File%3AArchive-palanq-win.png |
| 02:41:30 | <h2ibot> | John5433 uploaded File:Installgentoo.png: https://wiki.archiveteam.org/?title=File%3AInstallgentoo.png |
| 02:41:31 | <h2ibot> | John5433 uploaded File:4rchive.png: https://wiki.archiveteam.org/?title=File%3A4rchive.png |
| 02:41:32 | <h2ibot> | John5433 uploaded File:Ayasequart.png: https://wiki.archiveteam.org/?title=File%3AAyasequart.png |
| 02:48:17 | <pabs> | a replacement for Myrent: https://minerva-archive.org/ |
| 02:48:40 | <pabs> | allegedly. doesn't have much yet |
| 03:02:05 | | retrograde quits [Remote host closed the connection] |
| 03:02:29 | | retrograde (retrograde) joins |
| 03:05:30 | | iseaup quits [Ping timeout: 240 seconds] |
| 03:05:50 | | polypept1 quits [Ping timeout: 240 seconds] |
| 03:06:56 | | polypeptide (polypeptide) joins |
| 03:12:39 | | DrowsyCrow quits [Quit: Ooops, wrong browser tab.] |
| 03:36:24 | | Lord_Nightmare quits [Quit: ZNC - http://znc.in] |
| 03:39:47 | | Lord_Nightmare (Lord_Nightmare) joins |
| 03:45:12 | | PredatorIWD48 joins |
| 03:47:19 | | PredatorIWD4 quits [Ping timeout: 268 seconds] |
| 03:47:19 | | PredatorIWD48 is now known as PredatorIWD4 |
| 04:02:17 | | etnguyen03 quits [Remote host closed the connection] |
| 04:14:37 | | nexussfan quits [Quit: Konversation terminated!] |
| 04:17:11 | | DogsRNice quits [Read error: Connection reset by peer] |
| 04:36:13 | | n9nes quits [Remote host closed the connection] |
| 04:36:44 | | Webuser898935 joins |
| 04:36:54 | | Webuser898935 quits [Client Quit] |
| 04:37:18 | | n9nes joins |
| 04:55:18 | | benjins3 joins |
| 04:57:37 | | benjins3_ quits [Ping timeout: 268 seconds] |
| 05:04:29 | | n9nes quits [Ping timeout: 268 seconds] |
| 05:04:50 | | n9nes joins |
| 05:34:42 | | ducky quits [Ping timeout: 268 seconds] |
| 05:58:17 | | Island_ joins |
| 05:59:59 | | grill quits [Ping timeout: 268 seconds] |
| 06:01:28 | | grill (grill) joins |
| 06:06:26 | <systwi_> | Doesn't have anything yet from what I've seen. :-< |
| 06:06:37 | <systwi_> | Hopefully they do add real links one day. |
| 07:09:42 | | ducky (ducky) joins |
| 07:15:15 | | Webuser350808 joins |
| 07:16:52 | | Webuser350808 quits [Client Quit] |
| 07:25:00 | | ducky quits [Ping timeout: 268 seconds] |
| 07:25:10 | | ducky (ducky) joins |
| 07:32:31 | | Island quits [Read error: Connection reset by peer] |
| 07:32:31 | | Island_ quits [Read error: Connection reset by peer] |
| 07:41:11 | <h2ibot> | Manu edited Mailing Lists (+29, Sympa: Add lists.gruene-rlp.de): https://wiki.archiveteam.org/?diff=60783&oldid=60520 |
| 07:45:21 | | ramsey quits [Ping timeout: 633 seconds] |
| 07:46:24 | | ramsey (ramsey) joins |
| 07:46:41 | | th3ph3d quits [Read error: Connection reset by peer] |
| 07:46:44 | | th3ph3d joins |
| 07:47:42 | | jonty quits [Read error: Connection reset by peer] |
| 07:47:47 | | jonty (jonty) joins |
| 07:47:49 | | IDK quits [Ping timeout: 633 seconds] |
| 07:49:27 | | JSharp quits [Read error: Connection reset by peer] |
| 07:49:31 | | JSharp (JSharp) joins |
| 07:50:08 | | IDK (IDK) joins |
| 08:09:04 | | Webuser925734 joins |
| 08:09:07 | | Webuser925734 quits [Client Quit] |
| 08:18:02 | | ericgallager quits [Ping timeout: 268 seconds] |
| 08:34:04 | | Webuser304702 joins |
| 08:34:19 | | Webuser304702 quits [Client Quit] |
| 09:01:49 | | twiswist_ quits [Read error: Connection reset by peer] |
| 09:01:49 | | ducky quits [Ping timeout: 268 seconds] |
| 09:02:00 | | twiswist (twiswist) joins |
| 09:02:26 | | ducky (ducky) joins |
| 09:07:22 | <h2ibot> | Exorcism edited Open Diary (+2): https://wiki.archiveteam.org/?diff=60784&oldid=60391 |
| 09:19:24 | <h2ibot> | Exorcism uploaded File:Medica-screenshot.png: https://wiki.archiveteam.org/?title=File%3AMedica-screenshot.png |
| 09:23:24 | <h2ibot> | Exorcism created Medica Bibliothèque Numérique (+1008, Created page with "{{Infobox project | title =…): https://wiki.archiveteam.org/?oldid=60786 |
| 09:26:57 | <triplecamera|m> | TheTechRobo: Thank you, I will have a try |
| 09:28:25 | <h2ibot> | Exorcism edited Main Page/Current Projects (-188): https://wiki.archiveteam.org/?diff=60787&oldid=60568 |
| 09:30:50 | <triplecamera|m> | justauser: I can use grab-site for now, but I'm worried that grab-site (and wpull) lacks maintenance. In contrast, wget-lua is still actively maintained. |
| 09:36:26 | <h2ibot> | Bzc6p edited ArchiveTeam Warrior (+63, /* Installing and running with Docker */…): https://wiki.archiveteam.org/?diff=60788&oldid=59360 |
| 09:42:27 | <h2ibot> | Bzc6p edited Talk:ArchiveTeam Warrior (+767, /* "Otherwise, the Docker-specific images are…): https://wiki.archiveteam.org/?diff=60789&oldid=60740 |
| 10:00:29 | <h2ibot> | Manu edited Discourse/archived (+148, Queued forum.piratskastranka.si): https://wiki.archiveteam.org/?diff=60790&oldid=60754 |
| 10:02:11 | <gamer191-1|m> | Regarding archive.today, if we just wanted to archive it as a link shortener (and archive a thumbnail image for each url) we could archive 20 URLs at a time by searching for common domains. I don’t know how rate-limited the search is though |
| 10:02:30 | <h2ibot> | Manu edited Discourse/archived (+110, Queued forum.contextualelectronics.com): https://wiki.archiveteam.org/?diff=60791&oldid=60790 |
| 10:06:10 | <gamer191-1|m> | Wait actually we can do much better by saving their Google custom searches |
| 10:11:58 | <gamer191-1|m> | OMG, I found a rate-limit free archive.md api: |
| 10:13:27 | <gamer191-1|m> | https://archive.md/cse.js?id=XXXXX |
| 10:13:27 | <gamer191-1|m> | Where XXXXX is 5 alphanumeric characters |
| 10:13:33 | <gamer191-1|m> | Someone send this to urlteam |
| 10:15:23 | <gamer191-1|m> | Actually I will (I got overexcited and forgot they obviously have a public channel) |
| 10:21:46 | | Webuser172360 joins |
| 10:22:12 | | Webuser172360 quits [Client Quit] |
| 10:37:34 | | Shard111582 quits [Read error: Connection reset by peer] |
| 10:37:46 | | Shard111582 (Shard) joins |
| 11:00:04 | | Bleo18260072271962345522201 quits [Quit: The Lounge - https://thelounge.chat] |
| 11:02:44 | | Bleo18260072271962345522201 joins |
| 11:38:35 | | klea points out that cse.js just gives out apparently prefilled stuff. |
| 11:39:32 | <klea> | oh I'm stupid. |
| 11:44:57 | | etnguyen03 (etnguyen03) joins |
| 11:46:44 | <gamer191-1|m> | No it doesn’t (eg: https://archive.md/cse.js?id=T3Hc7) |
| 11:47:57 | <gamer191-1|m> | Well I don’t know what prefilled means |
| 11:48:06 | <gamer191-1|m> | But it definitely gives useful information |
| 11:50:43 | <klea> | oh yeah, i tested with common things which weren't valid references. |
| 11:56:47 | <gamer191-1|m> | Actually they originally used 4-character IDs until they ran out (thanks URLTeam wiki). So any 4 alphanumeric characters will be a valid ID |
| 11:57:54 | <klea> | → #urlteam |
| 12:07:26 | | hackbug quits [Ping timeout: 268 seconds] |
| 12:11:54 | <Yakov> | it seems to give the full archived url and an exact timestamp in the form of returning html in a js function |
| 12:12:19 | <klea> | yep |
| 12:12:30 | <klea> | I just tested with something that wasn't valid at first. |
| 12:14:24 | <Yakov> | another url it contains is the thumb png, which is ratelimited. this is a pretty interesting endpoint |
| 12:16:11 | <Yakov> | thumb seems to be the screenshot preview of the capture e.g.: https://archive.ph/wdq4u/70f04774d40bd8b94aa69b1c058b908a619c7888/thumb.png |
| 12:17:35 | <Yakov> | ArchiveBot 429d at the thumb for .md but not for .ph, i would assume that it is under the same ratelimit restrictions though and .md had some previous load from AB |
| 12:23:30 | | SootBector quits [Ping timeout: 240 seconds] |
| 12:26:30 | | SootBector (SootBector) joins |
| 12:27:02 | | Arcorann (Arcorann) joins |
| 12:29:01 | | Arcorann_ quits [Ping timeout: 268 seconds] |
| 12:34:39 | | skankhunt42 quits [Ping timeout: 268 seconds] |
| 12:38:44 | <Yakov> | i experimented with queuing 500 sequential ids starting from 5 chars with charset [a-zA-Z0-9]: https://transfer.archivete.am/inline/aI6Eg/archiveph-csejs-first-500-sequential.txt |
| 12:41:00 | <Yakov> | seems to be very impressive results (https://img.yakov.cloud/rqBHc.png, job id: 1ihfmd8cwv8rmjimyc8bh1f1y) |
| 12:42:17 | <Yakov> | I wonder if this is worth scraping. also don't know how long this will be unrestricted for. |
| 12:47:15 | <gamer191-1|m> | Yakov: thumbnails are supposed to be accessed from a 14 character subdomain beginning with a d (eg https://d1234567890123.archive.md/THUMBNAILURL). Does that bypass the rate limiting? |
| 12:48:13 | <Yakov> | where do you see 14 character subdomains (and beginning with a d) being used? |
| 12:49:15 | <Yakov> | also i didnt get ratelimited doing 500 consecutively, not sure if this has to do with me trying it on archive.ph instead of .md or i just didn't hit the limit yet for that pipeline |
| 12:49:30 | <Yakov> | even on the thumb url* |
| 12:51:10 | <gamer191-1|m> | Yakov: That’s what the website uses (all this stuff is loaded when you search for something other than a domain name in the search box on archive.today). I didn’t check the JavaScript, I just experimented to see which domains do and don’t resolve |
| 12:53:30 | <Yakov> | Yeah, interesting. I don't know then. |
| 12:53:51 | <Yakov> | I'll try queuing the next 500 and we'll see if any ratelimits kick in for thumbs |
| 12:54:56 | <gamer191-1|m> | Btw, if you use 4 characters then it will always find something (because archive.today exhausted the 4 character IDs before switching to 5 characters) |
| 12:55:36 | | Webuser685136 joins |
| 12:56:00 | | FiTheArchiver joins |
| 12:58:18 | <gamer191-1|m> | Wait, I’m wrong, it does load the images from the main server |
| 12:58:45 | <gamer191-1|m> | I got confused because the image-based captures load from that subdomain |
| 12:58:46 | <gamer191-1|m> | Sorry |
| 12:58:56 | <gamer191-1|m> | You can use those subdomains though |
| 12:59:32 | <Yakov> | Once again, another success on the next 500 IDs (job id: bebx4hdjcsixcwdpk5s89p6zf) no ratelimits on thumbs either. Very strange, not sure why it happened the first try on that single cse.js job for archive.md |
| 12:59:39 | <gamer191-1|m> | But yeah, the actual site uses the main domain for the thumbnails and the d subdomains for the image based captures |
| 12:59:54 | <gamer191-1|m> | Nice! |
| 13:02:41 | | FiTheArchiver quits [Client Quit] |
| 13:04:01 | | Webuser837525 joins |
| 13:04:06 | | Webuser837525 quits [Client Quit] |
| 13:20:54 | | Arcorann quits [Ping timeout: 268 seconds] |
| 13:29:19 | | oxtyped quits [Read error: Connection reset by peer] |
| 13:34:31 | | Wohlstand1 (Wohlstand) joins |
| 13:36:54 | | Wohlstand1 is now known as Wohlstand |
| 13:39:08 | | oxtyped joins |
| 13:45:06 | | nexussfan (nexussfan) joins |
| 13:56:50 | | Webuser685136 quits [Client Quit] |
| 14:17:52 | | SootBector quits [Remote host closed the connection] |
| 14:19:12 | | SootBector (SootBector) joins |
| 14:38:21 | | Webuser776797 joins |
| 14:39:34 | | Webuser776797 quits [Client Quit] |
| 15:00:53 | | Webuser554683 joins |
| 15:01:26 | | Webuser554683 quits [Client Quit] |
| 15:16:09 | | SootBector quits [Remote host closed the connection] |
| 15:17:19 | | SootBector (SootBector) joins |
| 15:20:48 | | hackbug (hackbug) joins |
| 15:23:52 | | Cornelius705 quits [Quit: Cornelius705] |
| 15:24:52 | | Cornelius705 (Cornelius) joins |
| 15:26:16 | | hackbug quits [Remote host closed the connection] |
| 15:37:06 | | SootBector quits [Remote host closed the connection] |
| 15:38:15 | | SootBector (SootBector) joins |
| 15:39:41 | | hackbug (hackbug) joins |
| 15:54:30 | | SootBector quits [Ping timeout: 240 seconds] |
| 15:56:23 | | SootBector (SootBector) joins |
| 16:11:38 | | @imer quits [Quit: Oh no] |
| 16:24:40 | | ducky quits [Remote host closed the connection] |
| 16:34:09 | | Webuser480342 joins |
| 16:34:21 | | Webuser480342 quits [Client Quit] |
| 16:35:22 | <h2ibot> | Exorcism uploaded File:Pinger-screenshot.png: https://wiki.archiveteam.org/?title=File%3APinger-screenshot.png |
| 16:36:23 | <h2ibot> | Exorcism edited Pinger (+57): https://wiki.archiveteam.org/?diff=60793&oldid=49285 |
| 16:39:44 | | imer (imer) joins |
| 16:39:44 | | @ChanServ sets mode: +o imer |
| 16:42:48 | | ducky (ducky) joins |
| 16:54:41 | | DogsRNice joins |
| 16:56:55 | | Cornelius705 quits [Client Quit] |
| 16:57:50 | | Cornelius705 (Cornelius) joins |
| 17:05:08 | | ducky quits [Remote host closed the connection] |
| 17:08:34 | | ducky (ducky) joins |
| 17:11:01 | | ducky quits [Remote host closed the connection] |
| 17:14:42 | | ducky (ducky) joins |
| 17:32:30 | <h2ibot> | DoomTay edited Web Roasting/ISP Hosting (-7): https://wiki.archiveteam.org/?diff=60794&oldid=60586 |
| 17:52:00 | | Cornelius705 quits [Client Quit] |
| 18:10:02 | | ducky quits [Ping timeout: 268 seconds] |
| 18:10:32 | | dabs joins |
| 18:11:30 | | dabs quits [Remote host closed the connection] |
| 18:11:42 | | dabs joins |
| 18:20:16 | | ducky (ducky) joins |
| 18:45:48 | | azalea_sh_ quits [Ping timeout: 268 seconds] |
| 18:46:01 | | azalea_sh_ (azalea_sh_) joins |
| 18:58:41 | | root joins |
| 18:59:12 | | Webuser366519 joins |
| 19:00:26 | | Webuser366519 is now known as Hyperion-Op |
| 19:06:38 | | Hyperion-Op is now authenticated as Hyperion-Op |
| 19:06:47 | | Hyperion-Op is now known as Hyperion-SysOps |
| 19:07:07 | | Hyperion-SysOps quits [Client Quit] |
| 19:08:49 | | root quits [Client Quit] |
| 19:14:54 | | Cornelius705 (Cornelius) joins |
| 19:21:30 | | Island joins |
| 19:24:07 | | lennier2 quits [Ping timeout: 268 seconds] |
| 19:26:16 | | lennier2 joins |
| 19:27:15 | | Hyperion-SysOp (Hyperion-SysOp) joins |
| 19:35:37 | | APOLLO03 quits [Quit: .] |
| 19:46:22 | | Cornelius705 quits [Client Quit] |
| 19:47:16 | | Cornelius705 (Cornelius) joins |
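
A rough sketch of the ID enumeration discussed at 12:38:44 (500 sequential 5-character IDs over the charset [a-zA-Z0-9], turned into cse.js URLs). The ordering of the charset (lowercase, then uppercase, then digits), the function names, and the choice of host are assumptions for illustration; the log does not say how the actual list was generated. The code only builds URL strings and makes no requests.

```python
from itertools import islice, product
import string

# Assumed ordering of [a-zA-Z0-9]; the real job may have used a different one.
CHARSET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def sequential_ids(length, charset=CHARSET):
    """Yield every ID of the given length, last character varying fastest."""
    for combo in product(charset, repeat=length):
        yield "".join(combo)


def cse_urls(host, length, start, count):
    """Build `count` cse.js URLs beginning at offset `start` in the ID sequence."""
    ids = islice(sequential_ids(length), start, start + count)
    return [f"https://{host}/cse.js?id={i}" for i in ids]


if __name__ == "__main__":
    # First batch of 500, then the next 500, as in the two jobs mentioned above.
    first_batch = cse_urls("archive.ph", 5, 0, 500)
    second_batch = cse_urls("archive.ph", 5, 500, 500)
    print(first_batch[0], first_batch[-1], len(second_batch))
```

Because the generator is lazy, later batches can be produced by changing `start` without materialising the whole ID space. For 4-character IDs, which per the log were exhausted before the switch to 5 characters, pass `length=4`.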