| 00:00:19 | <@JAA> | If their servers stop sucking, yeah. #noanswers for that project. |
| 00:01:37 | <user> | Thanks. (Is it considered bad IRC etiquette to reply with just "Thanks" and similar phrases?) |
| 00:01:56 | <@JAA> | No, that's fine. :-) |
| 00:02:12 | | BlueMaxima joins |
| 00:02:19 | <@JAA> | I'd suggest picking a more discernable nickname though. |
| 00:04:40 | <@OrIdow6> | So it doesn't look like Docket has announced the new deletion date yet? |
| 00:04:44 | <@OrIdow6> | *Docker |
| 00:05:53 | <user> | JAA, I tried registering with Nickserv, but for some reason I'm still seen as 'user' here. I'll fix it later. |
| 00:06:09 | <@JAA> | user: /nick NewNickname |
| 00:06:19 | | user is now known as gazorpazorp |
| 00:06:31 | <gazorpazorp> | is that too long? |
| 00:07:43 | <@JAA> | OrIdow6: Nope, doesn't look like it. |
| 00:07:58 | <@JAA> | gazorpazorp: No, that's fine. :-) |
| 00:32:56 | <@OrIdow6> | Has anyone looked into JFrog or Aimix-Z? |
| 00:41:45 | | tzt_ joins |
| 00:43:02 | | tzt quits [Ping timeout: 250 seconds] |
| 00:43:08 | | webdownload joins |
| 00:45:09 | | webdownload quits [Remote host closed the connection] |
| 00:45:26 | <@OrIdow6> | Both look like they may need warrior projects (though I haven't yet determined how many forums Aimix-Z hosts) |
| 00:46:04 | <@OrIdow6> | JFrog is basically a big repository for software builds and is likely many TB at least |
| 01:02:52 | | dm4v quits [Read error: Connection reset by peer] |
| 01:04:31 | | dm4v joins |
| 01:04:33 | | dm4v is now authenticated as dm4v |
| 01:04:33 | | dm4v quits [Changing host] |
| 01:04:33 | | dm4v (dm4v) joins |
| 01:08:49 | | gazorpazorp quits [Client Quit] |
| 01:13:30 | | gazorpazorp (gazorpazorp) joins |
| 01:17:14 | <lennier1> | A trend I have mixed feelings about, but relevant for archiving--news sites considering removal of old stories that may have unfair consequences for the subjects covered, such as someone arrested for a crime but never charged: https://www.kentucky.com/opinion/editorials/article250720789.html |
| 01:22:15 | | Iki quits [Remote host closed the connection] |
| 01:22:17 | <gazorpazorp> | What about "the right to be forgotten" and laws like that? I'm in favor of such stories being archived and publicly available, but am genuinely wondering if that's a real problem |
| 01:22:32 | <gazorpazorp> | legally speaking |
| 01:23:10 | | Mineroboter_ joins |
| 01:25:04 | | Mineroboter quits [Ping timeout: 250 seconds] |
| 01:25:41 | | Sylirana quits [Remote host closed the connection] |
| 01:27:46 | <lennier1> | In this case, it's a US site, so there's no legal requirement for them to do anything. But I see the point that someone can be arrested for a minor crime, and even if they're not convicted, it's still keeping them from getting jobs a decade later because it shows up in search engines. |
| 01:29:21 | <gazorpazorp> | IANAL, but even a US site has to conform to GDPR, so maybe it has to conform to other EU laws if it wants to operate in the EU? Like how some sites just block EU IPs. |
| 01:29:30 | | Sylirana (Sylirana) joins |
| 01:31:57 | <gazorpazorp> | Are any of the current archives accessible via a distributed system? Or are they only on centralized sites like the Wayback Machine? Because if it's centralized, it could be subject to requests for removal from the respective government |
| 01:32:45 | <gazorpazorp> | So perhaps if we want to archive such news articles, it should be done in a censorship-resistant way |
| 01:33:01 | <gazorpazorp> | (I'm new to the whole scene, so forgive anything ignorant I write) |
| 01:33:27 | <G4te_Keep3r> | an evolving news collection torrent? |
| 01:36:34 | <gazorpazorp> | Are torrents flexible enough to support different overlapping subsets of information? For example, you have news categorized by country and by topic (two dimensions). If someone wants to download news for country X on topic Y, will they be able to easily download only those news, or will they have to add many torrents or select specific files from a torrent with lots of files? |
| 01:43:36 | <G4te_Keep3r> | touche...great for distributed files not so much for common everyday person to grab specific thing/location...and would probably be in warc format so would need whatever is used to read those |
| 01:46:19 | <gazorpazorp> | I don't know much about IPFS, but that seems like an alternative for file sharing/storage worth investigating |
| 02:29:12 | | G4te_Keep3r quits [Ping timeout: 258 seconds] |
| 02:30:08 | | G4te_Keep3r joins |
| 02:50:29 | <ave> | gazorpazorp, I believe most archives go to IA, usually in wayback machine. Requests for removals are understandable and are probably for the best, though I can't see US govt sending one. Definitely can see news conglomerates sending DMCA strikes tho. |
| 02:50:53 | <ave> | Most torrent clients let you see a file/folder list before you download and let you pick which ones you want. |
| 02:54:11 | <ave> | re IPFS, while it's distributed, the data is only guaranteed to stay up as long as one node is pinning it, or at least that's my understanding. Similar to torrents in that regard, I'd say. *sigh* As much as I hate cryptocurrencies, there's filecoin which tries to address that afaik. |
| 02:54:23 | <ave> | IA was granted a ton of filecoins by filecoin foundation this month. https://blog.archive.org/2021/04/01/filecoin-foundation-grants-50000-fil-to-the-internet-archive/ |
| 02:56:54 | <@OrIdow6> | A US news site removing material under order of a foreign country enforcing a law that violates the First Amendment would not be looked well upon |
| 02:58:51 | <ave> | I assume they could do what google does when you use the right to be forgotten thing in EU: Hide the content for certain regions. (https://www.google.com/webmasters/tools/legal-removal-request?complaint_type=rtbf&hl=en&rd=1) |
| 03:06:01 | <gazorpazorp> | I'm glad I was wrong about compliance with EU laws for foreign sites |
| 03:06:53 | <gazorpazorp> | But the problem with censorship still stands - as long as data is kept in a central location, it's vulnerable to any number of bad (wrt freedom of speech/archiving) actors |
| 03:08:17 | <gazorpazorp> | Filecoin seems interesting. I also suspected a cryptocurrency would have to get mixed up in a distributed archiving/file sharing network, to incentivize people to keep the data |
| 03:26:00 | <@JAA> | Poking a bit at the JFrog platforms... |
| 03:28:08 | <@OrIdow6> | Aimix-Z is my own first priority among the two, both because of expected capacity and because of potential value, so may see about writing a script for that soon |
| 03:28:29 | <@OrIdow6> | Though need to do much more exploration first |
| 03:28:40 | <@OrIdow6> | Thanks for looking at JFrog |
| 03:29:01 | <@JAA> | Bintray has no sitemap or obvious option of enumeration. The API has ridiculous rate limits and is useless. The search seems to return partial matches and requires at least two letters. Bruteforcing aa through zz should work for discovery. There are four tabs for each search term, and they often have many thousands of pages. So even that enumeration will be millions of requests. |
| 03:29:34 | <@JAA> | I haven't fully understood the relation between 'packages', 'files', and 'repositories' yet. |
| 03:30:29 | <@JAA> | GoCenter seems to have vanished already. Both gocenter.io and search.gocenter.io redirect to the shutdown announcement. |
| 03:32:27 | <@JAA> | As I understand it though, GoCenter was essentially a caching platform similar to the main Go proxy. I'm not sure there was anything original on it, but perhaps it had some things that are no longer on GitHub et al. Alas, it appears to be too late. |
| 03:34:17 | <@JAA> | ChartCenter apparently similar but for Helm. |
| 03:35:05 | <@JAA> | There is still something at https://repo.chartcenter.io/, will try to get a size estimate. |
| 03:35:57 | <@JAA> | Actually yeah, the shutdown notice says that the GoCenter and ChartCenter websites were taken down end of February, but 'client requests will still work' until 1 May. |
| 03:37:36 | <@JAA> | JCenter will stay online until February 2022 for package downloads. However, discovery might no longer be possible after 1 May. |
| 03:44:38 | <@JAA> | ChartCenter size according to a 1 % sample HEAD of the URLs returned in https://repo.chartcenter.io/: 1.6 GB...? I have no idea whether that's correct. |
| 03:46:11 | <@JAA> | I'll throw that into AB anyway. |
| 03:51:23 | | etnguyen03 quits [Client Quit] |
| 03:53:19 | | DogsRNice quits [Read error: Connection reset by peer] |
| 03:55:47 | | qw3rty__ joins |
| 03:57:53 | <@OrIdow6> | Looks like there are something like 16k "rooms" (i.e. forums, as far as I can tell) in Aimix-Z in the IA CDX |
| 03:59:17 | | qw3rty_ quits [Ping timeout: 258 seconds] |
| 04:03:36 | <@OrIdow6> | Roughly accords with the "sites" count on this page http://www.aimix-z.com/view.html, if that's what it's referring to |
| 04:03:42 | <atphoenix> | gazorpazorp: you can generally run 1 instance of URLteam in parallel with anything else. It has minimal system and network requirements, and is IP-limited. |
| 04:20:41 | <@JAA> | I've sent an email to JFrog about GoCenter as there is no public index of the modules. |
| 04:25:11 | <@JAA> | JCenter has an index at https://bintray.com/bintray/jcenter but the pagination seems to max out at an offset of 10k, so page 1250 of allegedly 54k. It stops at Schema* though (sorted alphabetically), so that doesn't seem quite right. |
| 04:47:13 | | Stiletto joins |
| 05:12:22 | | tzt_ is now known as tzt |
| 05:19:54 | <@JAA> | No idea what this is exactly and how it relates to the shutdown: http://oss.jfrog.org/artifactory/jcenter-cache/ |
| 05:35:49 | | Arcorann (Arcorann) joins |
| 05:42:38 | <tech234a> | “Founder of Adobe and developer of PDFs dies at age 81” https://apnews.com/article/business-john-warnock-san-francisco-b77f216f52d736a6b5a383a429208f51 |
| 07:12:50 | | rsn quits [Remote host closed the connection] |
| 07:13:12 | | rsn joins |
| 07:21:16 | | G4te_Keep3r quits [Ping timeout: 250 seconds] |
| 07:21:55 | | G4te_Keep3r joins |
| 07:55:05 | | LeighR (LeighR) joins |
| 07:57:12 | | wickedplayer494 quits [Remote host closed the connection] |
| 08:04:49 | | wickedplayer494 joins |
| 08:05:24 | | wickedplayer494 is now authenticated as wickedplayer494 |
| 08:33:50 | | BlueMaxima quits [Client Quit] |
| 08:48:17 | | hooway joins |
| 09:55:02 | | mutantmonkey quits [Remote host closed the connection] |
| 09:55:17 | | mutantmonkey (mutantmonkey) joins |
| 10:02:29 | | Dj-Wawa quits [Quit: Dj-Wawa] |
| 10:02:53 | | Dj-Wawa joins |
| 10:02:53 | | Dj-Wawa is now authenticated as Dj-Wawa |
| 10:30:12 | | Terbium quits [Quit: http://quassel-irc.org - Chat comfortably. Anywhere.] |
| 10:30:35 | | Terbium joins |
| 10:47:55 | | Matthww quits [Ping timeout: 258 seconds] |
| 10:57:08 | | Matthww joins |
| 10:58:42 | | spirit joins |
| 11:03:20 | | chriscoffee quits [Quit: Connection closed for inactivity] |
| 12:47:41 | | Matthww quits [Client Quit] |
| 12:51:40 | | spirit2 joins |
| 12:51:52 | | Matthww joins |
| 12:55:11 | | spirit quits [Ping timeout: 258 seconds] |
| 14:37:51 | | emernic joins |
| 14:43:59 | <emernic> | Basic question from a newbie: would it be useful for me to spin up several workers on Google Cloud K8s and just leave the project set to "auto" all the time? or is that kind of "raw firepower" rarely the bottleneck? |
| 14:44:31 | <emernic> | (or maybe there's some other reason that wouldn't work, e.g. I don't know how GKE manages outgoing IPs and whether they could just be banned quickly) |
| 14:47:09 | <emernic> | (I mean the warrior images, sorry if that was ambiguous) |
| 15:03:01 | <LeighR> | Unless you have a lot of varied outbound IP addresses, not really. |
| 15:03:24 | <LeighR> | and you would need to figure out how to make your pods use them |
| 15:04:42 | <LeighR> | perhaps run warriors as a StatefulSet, with one per node, on a cluster that you already have up doing other stuff |
| 15:05:12 | <LeighR> | Diverse IPs are the main bottleneck for most projects, from what I've seen |
| 15:05:32 | <LeighR> | most of us have more "firepower" than the projects can use |
| 15:05:59 | <emernic> | That makes sense, thanks for the info! |
| 15:11:52 | <LeighR> | based on how Azure Kubernetes Service worked before I went on maternity leave, GKE might also route all cluster outbound traffic over a single load balancer public IP |
| 15:12:03 | <LeighR> | which would be very counterproductive for AT |
| 15:12:58 | <LeighR> | do some testing with plain Ubuntu/your favorite distro and "curl http://ip4.me/api/" as well as "curl http://ip6.me/api/" |
| 15:30:17 | <emernic> | Yeah, that makes sense. I know it's not just 1 stable, predictable IP for all outbound GKE traffic by default (since I wanted this a while ago and it was a huge pain to setup), but I think it's one per node by default (which, like you said, wouldn't make sense unless those nodes were also being used for something else). I'll keep digging for ways |
| 15:30:18 | <emernic> | to get a bunch of "good" outgoing IPs on Google Cloud. |
| 15:31:55 | <LeighR> | if there's a way you could get a random one each time, that would be amazing |
| 15:58:14 | | Arcorann quits [Ping timeout: 250 seconds] |
| 16:22:33 | | emernic quits [Remote host closed the connection] |
| 16:51:24 | | DogsRNice (Webuser299) joins |
| 17:41:12 | | etnguyen03 (etnguyen03) joins |
| 19:47:30 | | Jonboy3451 quits [Read error: Connection reset by peer] |
| 19:53:28 | | Jonboy345 joins |
| 22:09:43 | | LeighR quits [Client Quit] |
| 22:11:11 | | @EggplantN quits [Client Quit] |
| 22:11:55 | | EggplantN joins |
| 22:41:07 | | EggplantN is now authenticated as EggplantN |
| 22:41:07 | | EggplantN quits [Changing host] |
| 22:41:07 | | EggplantN (EggplantN) joins |
| 22:41:07 | | @ChanServ sets mode: +o EggplantN |
| 23:02:33 | | BlueMaxima joins |
| 23:06:31 | | hooway quits [Client Quit] |
| 23:07:52 | | Viniter7 (Viniter) joins |
| 23:08:46 | | Sanqui quits [Remote host closed the connection] |
| 23:08:59 | | Sanqui joins |
| 23:09:01 | | Sanqui is now authenticated as Sanqui |
| 23:09:50 | | Viniter quits [Ping timeout: 250 seconds] |
| 23:09:50 | | Viniter7 is now known as Viniter |
| 23:34:10 | | Arcorann (Arcorann) joins |
| 23:44:16 | | Stilett0 joins |
| 23:48:00 | | Stiletto quits [Ping timeout: 258 seconds] |
| 23:53:23 | | aphitex22 joins |
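
A rough sketch of the Bintray search enumeration JAA describes at 03:29:01: brute-force every two-letter term from aa through zz and multiply by the four result tabs. The function names and the `pages_per_tab` figure are assumptions made for illustration only; the log says just that tabs "often have many thousands of pages", and nothing here touches the real Bintray site or API.

```python
from itertools import product
from string import ascii_lowercase

TABS_PER_TERM = 4  # from the log: four result tabs per search term


def two_letter_terms():
    """Yield every two-letter lowercase search term, 'aa' through 'zz'."""
    for first, second in product(ascii_lowercase, repeat=2):
        yield first + second


def estimated_requests(pages_per_tab):
    """Rough request count if every tab of every term had pages_per_tab pages."""
    term_count = sum(1 for _ in two_letter_terms())
    return term_count * TABS_PER_TERM * pages_per_tab


if __name__ == "__main__":
    terms = list(two_letter_terms())
    print(len(terms), terms[0], terms[-1])  # 676 aa zz
    # pages_per_tab=1000 is an assumed figure, not a measurement.
    print(estimated_requests(1000))  # 2704000
```

Even with a conservative assumed 1,000 pages per tab, the estimate lands in the millions of requests, which matches the order of magnitude given in the log.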