| 00:05:28 | | Sluggs quits [Ping timeout: 265 seconds] |
| 00:05:54 | | Sluggs joins |
| 00:12:04 | | Sluggs quits [Ping timeout: 252 seconds] |
| 00:12:54 | | Sluggs joins |
| 00:23:26 | | Sluggs quits [Ping timeout: 252 seconds] |
| 00:23:47 | | Sluggs joins |
| 00:37:25 | | @dxrt quits [Quit: ZNC - http://znc.sourceforge.net] |
| 00:44:55 | | dxrt joins |
| 00:44:57 | | dxrt is now authenticated as dxrt |
| 00:44:57 | | dxrt quits [Changing host] |
| 00:44:57 | | dxrt (dxrt) joins |
| 00:44:57 | | @ChanServ sets mode: +o dxrt |
| 01:22:06 | | fl0w_ joins |
| 01:24:14 | <datechnoman> | What are your thoughts, arkiver, on the above comments? We never discussed load being an issue causing URLs/page requisites to be dropped. We could look at running a little slower, or just pushing all found URLs to a separate tracker/queue to be moved into the backfeed slowly until the issue is resolved? |
| 01:24:43 | <datechnoman> | Note that I don't have any understanding of the inner workings of the tracker backend, so that solution may not be workable at all |
| 01:25:42 | | fl0w quits [Ping timeout: 265 seconds] |
| 01:26:38 | <@JAA> | datechnoman: It's not related to load as I understood it. |
| 01:28:17 | <datechnoman> | So it could be just a case of running through items that are in the tracker queue and stashing discovered urls to be processed by the bloom filter elsewhere? (If tenable) |
| 01:28:59 | <datechnoman> | Remove the automated function temporarily |
| 01:39:21 | <@JAA> | I guess that should be possible in theory. But the bloom filters are massive. As in 'hundreds of GB of RAM' massive, I believe. So it's not exactly easy to do. |
| 01:40:02 | <datechnoman> | sorry i mean store the urls elsewhere until the bloom filter issue is properly solved |
| 01:40:10 | <datechnoman> | Dont want to move the bloom filters nooooooooooo |
| 01:40:28 | <@JAA> | Yeah, but then we can only run through the existing backlog, nothing more. |
| 01:40:56 | <@JAA> | And there isn't that much of that, is there? |
| 01:41:20 | <datechnoman> | Correct. Better than nothing. The #// backlog is nearly 300 million items |
| 01:41:24 | <datechnoman> | So pretty large |
| 01:41:27 | <TheTechRobo> | Maybe we could queue URLs both to the backfeed AND a separate location? Then after everything is fixed, we run the separate location through the backfeed again, which will catch duplicates. Or would that be too resource-intensive? |
| 01:41:50 | <datechnoman> | That aint a bad idea unless like you said compute is a killer later down the track |
| 01:42:35 | <datechnoman> | 9.84M out + 291.80M to do |
| 01:43:37 | <datechnoman> | It's also the new month, when we re-archive a metric tonne of website sitemaps |
| 01:48:32 | <@JAA> | Not a clue what order of magnitude of data we're talking there. |
| 01:48:46 | <@JAA> | Re keeping a copy of the backfeed stream |
| 01:50:57 | <datechnoman> | If it's zstd-compressed it will be much smaller, but it will require some storage. I guess a real ballpark figure is that the URLTeam project exports 8 million URLs into a compressed file of approximately 400MB. We would need a few TB of local storage at a bare minimum |
| 01:51:49 | <datechnoman> | Not sure how tenable that would be over time though as it will keep growing, unless we can slowly process them and verify they are going through the bloom filter and being queued |
| 01:52:20 | | Mateon2 joins |
| 01:52:21 | <@JAA> | That's not the figure I mean. I have no idea what the rate of URLs thrown at #// for example is or what the dupe rate is. |
| 01:52:51 | <@JAA> | I suppose this temporary dump could be deduped with another separate bloom filter, but that's just asking for trouble. :-) |
| 01:53:47 | <datechnoman> | Haha yeah. Well it goes hectic when we run the sitemaps at the start of the month. I typically see 700,000 URLs per minute being discovered, which I assume are sent to the bloom filter for processing |
| 01:53:52 | <@JAA> | And that'd basically be what the backfeed server does, anyway, which is broken, so at that point we're reinventing the wheel rather than fixing the bug. |
| 01:54:00 | | Mateon1 quits [Ping timeout: 252 seconds] |
| 01:54:00 | | Mateon2 is now known as Mateon1 |
| 01:54:09 | <datechnoman> | Can we squash the bug? :P easy fix |
| 01:54:29 | <TheTechRobo> | "Why are programmers paid so much? They just have to fix bugs and add features, that's easy!" :-) |
| 01:54:46 | <datechnoman> | https://media.tenor.com/ptLJfHc0PV4AAAAC/bug-bash-dt-bug-bash.gif |
| 01:54:58 | <@JAA> | Alternatively: 'Nothing works, why am I even paying you?' |
| 01:55:10 | <@JAA> | (And after it's fixed: 'Everything works, why am I even paying you?') |
| 01:55:16 | <datechnoman> | Let's get ChatGPT to fix the tracker |
| 01:56:14 | <datechnoman> | well the bloom filter issue |
| 02:01:32 | | leo60228 quits [Quit: ZNC 1.8.2 - https://znc.in] |
| 02:01:55 | | leo60228 (leo60228) joins |
| 02:55:22 | | Mateon2 joins |
| 02:57:04 | | Mateon1 quits [Ping timeout: 252 seconds] |
| 02:57:04 | | Mateon2 is now known as Mateon1 |
| 02:58:59 | | @rewby quits [Ping timeout: 265 seconds] |
| 03:01:13 | | rewby (rewby) joins |
| 03:01:13 | | @ChanServ sets mode: +o rewby |
| 03:34:28 | | Stiletto joins |
| 03:44:48 | | katocala quits [Remote host closed the connection] |
| 03:54:51 | | katocala joins |
| 03:55:10 | | katocala is now authenticated as katocala |
| 04:42:07 | | thuban quits [Read error: Connection reset by peer] |
| 04:42:25 | | thuban joins |
| 04:43:58 | | sonick quits [Client Quit] |
| 05:25:12 | <h2ibot> | Bear edited List of websites excluded from the Wayback Machine (+761, The first known .plus domain to be excluded is…): https://wiki.archiveteam.org/?diff=49506&oldid=49504 |
| 05:25:13 | <h2ibot> | Bear created The Chive (+446, Yes, you heard it right. That's "chive", not…): https://wiki.archiveteam.org/?title=The%20Chive |
| 05:25:14 | <h2ibot> | Bear created Wayback Machine exclusions (+64, A shorter and more memorable variant for the…): https://wiki.archiveteam.org/?title=Wayback%20Machine%20exclusions |
| 05:25:15 | <h2ibot> | Bear edited V Live (+49, category): https://wiki.archiveteam.org/?diff=49509&oldid=49422 |
| 05:25:16 | <h2ibot> | Bear edited 4chan (+48, As far as I can remember, 4chan is excluded…): https://wiki.archiveteam.org/?diff=49510&oldid=49392 |
| 05:25:17 | <h2ibot> | Bear created Mortis (+731, Created page with "{{Infobox project | title =…): https://wiki.archiveteam.org/?title=Mortis |
| 05:25:18 | <h2ibot> | CreaZyp154 edited List of websites excluded from the Wayback Machine/Partial exclusions (-88, Removed a link because it is available on the…): https://wiki.archiveteam.org/?diff=49512&oldid=49315 |
| 05:42:50 | | lun4 quits [Client Quit] |
| 05:42:51 | | lun42 (lun4) joins |
| 05:42:56 | | fl0w_ quits [Remote host closed the connection] |
| 05:42:59 | | fl0w joins |
| 06:00:18 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=49513&oldid=49506 |
| 06:07:48 | | eroc1990 quits [Remote host closed the connection] |
| 06:08:03 | | lennier1 quits [Client Quit] |
| 06:08:11 | | eroc1990 (eroc1990) joins |
| 06:08:23 | | lennier1 (lennier1) joins |
| 06:19:32 | | Arcorann (Arcorann) joins |
| 06:24:44 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 07:02:06 | | LegitSi quits [Ping timeout: 265 seconds] |
| 08:14:53 | | hitgrr8 joins |
| 08:25:58 | | Sluggs quits [Ping timeout: 252 seconds] |
| 08:28:55 | | Sluggs joins |
| 08:33:18 | | Sluggs quits [Ping timeout: 252 seconds] |
| 08:34:12 | | Sluggs joins |
| 08:35:26 | <@JAA> | Oh yeah, we didn't do anything significant about keybase.pub, did we? |
| 08:38:26 | | Sluggs quits [Ping timeout: 252 seconds] |
| 08:42:41 | | Sluggs joins |
| 08:47:58 | | Sluggs quits [Ping timeout: 252 seconds] |
| 08:48:26 | | Sluggs joins |
| 08:58:06 | | Sluggs quits [Ping timeout: 265 seconds] |
| 08:58:28 | | Sluggs joins |
| 09:11:23 | | jspiros_ quits [Client Quit] |
| 09:11:24 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
| 09:11:24 | | fuzzy8021 quits [Read error: Connection reset by peer] |
| 09:11:35 | | fuzzy8021 (fuzzy8021) joins |
| 09:11:39 | | jspiros (jspiros) joins |
| 09:15:05 | | leo60228- (leo60228) joins |
| 09:15:24 | | leo60228 quits [Client Quit] |
| 09:15:24 | | fangfufu quits [Client Quit] |
| 09:15:29 | | fangfufu joins |
| 09:17:26 | | Arcorann quits [Ping timeout: 253 seconds] |
| 09:17:30 | | Arcorann (Arcorann) joins |
| 09:45:05 | | michaelblob quits [Read error: Connection reset by peer] |
| 10:04:21 | | qwertyasdfuiopghjkl joins |
| 10:05:01 | | umgr036 quits [Remote host closed the connection] |
| 10:05:15 | | umgr036 joins |
| 11:05:02 | | LeGoupil joins |
| 11:31:52 | | eroc1990 quits [Ping timeout: 252 seconds] |
| 11:31:52 | | LeGoupil quits [Ping timeout: 252 seconds] |
| 11:52:06 | | umgr036 quits [Remote host closed the connection] |
| 11:52:55 | | umgr036 joins |
| 11:57:10 | | Sluggs quits [Ping timeout: 252 seconds] |
| 11:57:36 | | Sluggs joins |
| 12:00:38 | | eroc1990 (eroc1990) joins |
| 12:21:41 | | hogchips (shoghicp) joins |
| 12:28:32 | | LeGoupil joins |
| 12:35:13 | | michaelblob (michaelblob) joins |
| 12:46:11 | | drin joins |
| 12:46:18 | | geezabiscuit quits [Ping timeout: 252 seconds] |
| 12:46:53 | | drin is now known as geezabiscuit |
| 13:00:14 | | Arcorann quits [Ping timeout: 252 seconds] |
| 14:52:48 | | knecht420 quits [Ping timeout: 252 seconds] |
| 15:15:35 | | knecht420 (knecht420) joins |
| 15:53:13 | | hitgrr8 quits [Client Quit] |
| 16:38:40 | | LeGoupil quits [Client Quit] |
| 16:49:46 | | hitgrr8 joins |
| 17:03:45 | | gazorpazorp quits [Quit: Leaving] |
| 17:26:55 | <Jake> | :( Don't believe so |
| 17:29:48 | | Ketchup901 quits [Client Quit] |
| 17:31:10 | | Ketchup901 (Ketchup901) joins |
| 18:19:29 | | HP_Archivist (HP_Archivist) joins |
| 18:28:44 | <@JAA> | cm: The idea comes up every now and then, and it's a good one, but it sadly can't work. Non-repudiation just wasn't a design goal. TLS works by first doing a key exchange with asymmetric algorithms, and then the agreed-on key is used for symmetric encryption of the payload. So if you keep the client-side internal state (pre-master secret etc.), you can prove that a specific key was used in the TLS |
| 18:28:50 | <@JAA> | connection to a specific server. But that's where it stops. You can freely manipulate the payload since it's encrypted symmetrically. |
| 18:29:27 | <@JAA> | There were attempts to make this work. TLS Sign is one of those. They didn't get anywhere. |
| 18:31:13 | <@JAA> | (And before someone brings up AES-GCM et al.: no, the payload still isn't authenticated there, only the sequence number, protocol version, packet type, and packet length. See RFC 5246 section 6.2.3.3 for example.) |
| 18:34:08 | <cm> | great thanks for the thorough explanation |
| 18:35:14 | <@JAA> | (Which probably means you can't manipulate the total length of the payload, I guess. Better than nothing, but not that useful overall.) |
| 18:55:04 | <cm> | hm yeah that's something |
| 19:02:13 | <@JAA> | Actually, no, still useless, because AES-GCM is still a symmetric algorithm. The authentication tag is not a signature. |
| 19:02:25 | | nothere quits [Quit: Leaving] |
| 19:02:56 | | systwi_ joins |
| 19:05:13 | | systwi_ quits [Client Quit] |
| 19:05:47 | | superkuh_ quits [Remote host closed the connection] |
| 19:06:04 | | superkuh_ joins |
| 19:16:20 | | nothere joins |
| 19:45:32 | | LegitSi joins |
| 20:07:37 | | lennier1 quits [Client Quit] |
| 20:08:20 | | lennier1 (lennier1) joins |
| 20:10:01 | | CreaZyp154 joins |
| 20:27:52 | <CreaZyp154> | Yesterday someone sent an open directory containing .IPAs and APKs, turns out they have a lot of interesting stuff and the link submitted was for http://s1.bitdl.ir/ but I checked out and there is http://s2.bitdl.ir/, http://s3.bitdl.ir/, etc up to at least 28 i'll check for more (s4, s7, s17 s21-s26 timeout; s6, s8, s16, s19, s27, s30 error; no |
| 20:27:52 | <CreaZyp154> | s0, s20, s29 (DNS_PROBE_FINISHED_NXDOMAIN) ; other works), they all have open directories so idk if it is worth putting in #archivebot |
| 20:32:22 | <@JAA> | Yeah, that site's been circulating in /r/opendirectories for years. |
| 20:33:18 | <CreaZyp154> | oh... didn't know that |
| 20:34:00 | <@JAA> | And until a couple years ago, it was known as bitdownload.ir. |
| 20:34:19 | <anarcat> | holy crap, wth |
| 20:34:24 | <@JAA> | It's full of *totally* legitimate stuff. |
| 20:35:00 | <@JAA> | Servers went up to at least s33 at one point, although quite a few are dead, yeah. |
| 20:36:02 | <CreaZyp154> | yeah, after s32 there are no more servers (there's also video.bitdl.ir but I think that's it (thanks subdomain finder hehe)) |
| 20:36:37 | <CreaZyp154> | s33 is timing out and above that it's all DNS_PROBE_FINISHED_NXDOMAIN |
| 20:58:00 | | HP_Archivist quits [Client Quit] |
| 21:01:22 | | CreaZyp154 quits [Remote host closed the connection] |
| 21:04:59 | | sec^nd quits [Ping timeout: 276 seconds] |
| 21:06:52 | | HP_Archivist (HP_Archivist) joins |
| 21:08:06 | | Ketchup901 quits [Client Quit] |
| 21:09:40 | | Ketchup901 (Ketchup901) joins |
| 21:10:42 | | sec^nd (second) joins |
| 21:14:23 | | HP_Archivist quits [Client Quit] |
| 21:15:09 | | michaelblob_ (michaelblob) joins |
| 21:15:17 | | umgr036 quits [Remote host closed the connection] |
| 21:15:17 | | michaelblob quits [Remote host closed the connection] |
| 21:15:25 | | umgr036 joins |
| 21:25:45 | <kpcyrd> | sooo, I'm trying to archive InRelease files (what apt-get works with) from various high-profile repositories. Since this data is signed, I developed a p2p network to collect and exchange these files. Channel is ##apt-swarm-p2p on hackint. |
| 21:27:26 | | user_ joins |
| 21:27:51 | | umgr036 quits [Remote host closed the connection] |
| 21:27:57 | <kpcyrd> | the code I have so far is on github: https://github.com/kpcyrd/apt-swarm |
| 21:42:42 | | user_ quits [Remote host closed the connection] |
| 21:42:55 | | user_ joins |
| 22:00:19 | <h2ibot> | JAABot edited CurrentWarriorProject (-2): https://wiki.archiveteam.org/?diff=49514&oldid=49450 |
| 22:09:21 | <h2ibot> | Ravishshah edited ArchiveBot/Educational institutions/list (+74, /* Unsorted */): https://wiki.archiveteam.org/?diff=49515&oldid=48889 |
| 22:20:52 | | treora quits [Ping timeout: 252 seconds] |
| 22:21:49 | | treora joins |
| 22:25:23 | <h2ibot> | JustAnotherArchivist edited Deathwatch (+249, /* 2023 */ Add WirelessAdvisor.com): https://wiki.archiveteam.org/?diff=49516&oldid=49500 |
| 22:33:57 | | BlueMaxima joins |
| 22:41:21 | | hitgrr8 quits [Client Quit] |
| 23:39:34 | | Atom-- joins |
| 23:43:55 | | Atom quits [Ping timeout: 252 seconds] |
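
JAA's point at 19:02:13 (a symmetric authentication tag is not a signature) can be illustrated with a rough sketch. This is not TLS itself: HMAC-SHA256 from Python's standard library stands in for the record protection, and `session_key`, `make_tag`, `verify` and both payloads are hypothetical names and values invented for the illustration. The only thing it aims to show is that whoever holds the shared key can produce a valid tag for any payload, so a saved tag proves nothing to a third party.

```python
import hashlib
import hmac
import os


def make_tag(key: bytes, payload: bytes) -> bytes:
    """Compute a symmetric authentication tag (HMAC-SHA256) over the payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify(key: bytes, payload: bytes, tag: bytes) -> bool:
    """Check a tag in constant time; this needs the same key that created it."""
    return hmac.compare_digest(make_tag(key, payload), tag)


# Assumption: after the key exchange, client and server hold the same secret key.
session_key = os.urandom(32)

# What the server actually sent, with the tag the server computed.
server_payload = b"HTTP/1.1 200 OK\r\n\r\noriginal page"
server_tag = make_tag(session_key, server_payload)

# The client also knows session_key, so it can tag a payload the server never sent.
forged_payload = b"HTTP/1.1 200 OK\r\n\r\nedited page"
forged_tag = make_tag(session_key, forged_payload)

genuine_ok = verify(session_key, server_payload, server_tag)
forged_ok = verify(session_key, forged_payload, forged_tag)
mismatch_ok = verify(session_key, forged_payload, server_tag)

print("genuine payload verifies:", genuine_ok)
print("forged payload verifies: ", forged_ok)
print("forged payload with the server's tag verifies:", mismatch_ok)
```

The forged payload verifies just as well as the genuine one, because verification only shows that the tag came from someone holding the key, and both ends hold it. A signature made with a private key that only the server has would not have this property, which is the non-repudiation that TLS payloads lack.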