00:37:30onetruth quits [Remote host closed the connection]
00:37:42onetruth joins
00:41:20Chris50105 (Chris5010) joins
00:41:21HP_Archivist quits [Remote host closed the connection]
00:41:21onetruth quits [Remote host closed the connection]
00:41:21Chris5010 quits [Client Quit]
00:41:22chrismeller quits [Remote host closed the connection]
00:41:22driib quits [Client Quit]
00:41:22Chris50105 is now known as Chris5010
00:41:31onetruth joins
00:41:36HP_Archivist (HP_Archivist) joins
00:41:38rellem (chrismeller) joins
00:42:01rellem quits [Remote host closed the connection]
00:42:09driib (driib) joins
00:43:08rellem (chrismeller) joins
00:43:31rellem quits [Remote host closed the connection]
00:44:38rellem (chrismeller) joins
00:45:01rellem quits [Remote host closed the connection]
00:45:58march_happy quits [Ping timeout: 265 seconds]
00:46:08rellem (chrismeller) joins
00:46:30march_happy (march_happy) joins
00:46:31rellem quits [Remote host closed the connection]
00:47:38rellem (chrismeller) joins
00:48:01rellem quits [Remote host closed the connection]
00:49:08rellem (chrismeller) joins
00:49:31rellem quits [Remote host closed the connection]
00:50:38rellem (chrismeller) joins
00:51:01rellem quits [Remote host closed the connection]
00:52:08rellem (chrismeller) joins
00:52:31rellem quits [Remote host closed the connection]
00:53:38rellem (chrismeller) joins
00:54:01rellem quits [Remote host closed the connection]
00:55:08rellem (chrismeller) joins
00:55:31rellem quits [Remote host closed the connection]
00:56:38rellem (chrismeller) joins
00:57:01rellem quits [Remote host closed the connection]
00:57:12march_happy quits [Ping timeout: 245 seconds]
00:57:20march_happy (march_happy) joins
00:58:08rellem (chrismeller) joins
00:58:31rellem quits [Remote host closed the connection]
00:58:54rellem (chrismeller) joins
01:01:22bonga quits [Ping timeout: 245 seconds]
01:02:46dm4v_ joins
01:02:48bonga joins
01:02:52HP_Archivist quits [Client Quit]
01:02:53dm4v quits [Ping timeout: 265 seconds]
01:02:58dm4v_ is now known as dm4v
01:03:01dm4v quits [Changing host]
01:03:01dm4v (dm4v) joins
01:03:02eroc1990 quits [Ping timeout: 245 seconds]
01:03:28eroc1990 (eroc1990) joins
01:07:04eroc1990 quits [Remote host closed the connection]
01:36:20Arcorann (Arcorann) joins
01:45:50Megame (Megame) joins
02:01:54Discant joins
02:22:02yay joins
02:22:47DogsRNice (Webuser299) joins
02:23:48<h2ibot>Arkiver uploaded File:Scratch Logo.png: https://wiki.archiveteam.org/?title=File%3AScratch%20Logo.png
02:59:53<h2ibot>Wickedplayer494 uploaded File:Scratch1.4.png: https://wiki.archiveteam.org/?title=File%3AScratch1.4.png
03:02:54<h2ibot>Wickedplayer494 edited Scratch (+120, Imagery and navbox): https://wiki.archiveteam.org/?diff=48671&oldid=48651
03:15:14<pabs>might be worth archiving this webOS archive? https://www.webosarchive.com/ https://news.ycombinator.com/item?id=31607318
03:15:52<pabs>it's spread over several domains and there are GitHub repos
04:38:09DogsRNice quits [Read error: Connection reset by peer]
04:43:52march_happy quits [Ping timeout: 245 seconds]
04:44:32march_happy (march_happy) joins
05:12:00Jake quits [Client Quit]
05:36:01drexler joins
05:36:25<drexler>So, is it possible to get a database index of OpenVerse?
05:36:47<drexler>https://rom1504.github.io/clip-retrieval/
05:36:57<drexler>I want to put together a public domain version of LAION 400m
05:37:02<drexler>And that would save a lot of time
06:44:45march_happy quits [Remote host closed the connection]
06:53:40Jake (Jake) joins
06:53:41Jake quits [Client Quit]
06:53:55Jake (Jake) joins
06:54:17Jake quits [Client Quit]
06:54:46Jake (Jake) joins
06:55:56Megame quits [Client Quit]
07:16:47eroc1990 (eroc1990) joins
07:41:07eroc1990 quits [Client Quit]
07:46:56eroc1990 (eroc1990) joins
07:54:14jtagcat6 quits [Quit: Bye!]
07:54:30jtagcat6 (jtagcat) joins
08:05:57kiska quits [Quit: Ping timeout (120 seconds)]
08:05:57s-crypt quits [Quit: Ping timeout (120 seconds)]
08:05:57Ryz2 quits [Quit: Ping timeout (120 seconds)]
08:06:58kiska (kiska) joins
08:07:07s-crypt (s-crypt) joins
08:07:08Ryz2 (Ryz) joins
08:08:42ArchivalEfforts_ quits [Ping timeout: 265 seconds]
08:09:13ArchivalEfforts joins
08:31:54NotEggplant quits [Read error: Connection reset by peer]
08:32:14NotEggplant joins
09:38:27march_happy (march_happy) joins
10:42:04spirit quits [Quit: Leaving]
11:22:23march_happy quits [Remote host closed the connection]
11:23:33march_happy (march_happy) joins
11:33:19qwertyasdfuiopghjkl joins
11:33:52Discant quits [Ping timeout: 245 seconds]
11:39:20Discant joins
12:17:12Discant quits [Ping timeout: 245 seconds]
12:53:53Larsenv quits [Quit: ZNC 1.8.2+deb2build5 - https://znc.in]
12:56:18rellem quits [Remote host closed the connection]
12:56:18drexler quits [Remote host closed the connection]
12:56:48drexler joins
12:57:08rellem (chrismeller) joins
12:57:31rellem quits [Remote host closed the connection]
12:58:38rellem (chrismeller) joins
12:59:01rellem quits [Remote host closed the connection]
13:00:08rellem (chrismeller) joins
13:00:12BlueMaxima quits [Client Quit]
13:00:31rellem quits [Remote host closed the connection]
13:01:38rellem (chrismeller) joins
13:02:01rellem quits [Remote host closed the connection]
13:02:23rellem (chrismeller) joins
13:03:54HP_Archivist (HP_Archivist) joins
13:42:18tea joins
13:46:04rellem quits [Ping timeout: 265 seconds]
13:57:37Arcorann quits [Ping timeout: 245 seconds]
15:15:00march_happy quits [Ping timeout: 265 seconds]
15:16:19march_happy (march_happy) joins
15:53:32Larsenv (Larsenv) joins
15:59:57march_happy quits [Ping timeout: 265 seconds]
16:10:32Sluggs quits [Ping timeout: 245 seconds]
16:11:02Sluggs joins
16:13:03march_happy (march_happy) joins
17:00:59<drexler>SketchTheCow, speaking of AI art, you around?
17:13:25march_happy quits [Ping timeout: 265 seconds]
17:13:47march_happy (march_happy) joins
17:24:59mgwatts (mgwatts) joins
17:40:00march_happy quits [Ping timeout: 265 seconds]
18:16:36HP_Archivist quits [Ping timeout: 265 seconds]
18:23:49<thuban>i waited too long to archive something i knew might be time-sensitive. now it's gone and i have only myself to blame ._.
18:23:59tea quits [Ping timeout: 265 seconds]
18:24:13<@JAA>[x] I'm in this picture, and I don't like it.
18:27:40<Ryz>More archiving looty <#>;
18:33:31yay quits [Ping timeout: 265 seconds]
19:29:24yay joins
19:49:56HP_Archivist (HP_Archivist) joins
19:50:13HP_Archivist quits [Remote host closed the connection]
19:50:40HP_Archivist (HP_Archivist) joins
19:52:38<drexler>thuban, many such cases lol
19:54:02<thuban>sad!
20:00:45Nulo quits [Remote host closed the connection]
21:07:50<@arkiver>thuban: don't beat yourself up too much about it!
21:08:26<@arkiver>it happens unfortunately (I missed blogos.com), let's move on, plenty of other archival jobs to run
21:19:21lennier1 quits [Quit: Going offline, see ya! (www.adiirc.com)]
21:20:47lennier1 (lennier1) joins
21:24:07mgwatts quits [Remote host closed the connection]
21:53:41<Ryz>More loot to find and archiveeeeeeee
22:11:31<drexler>I'm currently annoyed because WordPress doesn't publish the OpenVerse database index
22:11:49<drexler>Which, if I had it and scraped it, would basically be my public domain LAION 400m right there
22:12:14<drexler>And I find this annoying because it's like...it's a Creative Commons backed project, Wikimedia publishes one, why doesn't OpenVerse lol
22:12:27sepro quits [Client Quit]
22:13:50<drexler>It's especially annoying because the biggest publisher/host of public domain imagery right now is...Flickr
22:13:59<drexler>And we all know Flickr isn't in great financial shape
22:14:41<drexler>If they go without making arrangements for stewardship of the data, we could easily lose one of the most valuable media collections in existence.
22:17:04<drexler>It's triply annoying because deep learning search methods were recently invented that are easy to implement and let us search these kinds of collections much more powerfully than we could before, so it's not even like the old "eh, who would ever look through it?" argument holds water anymore.
22:22:14<drexler>Like, you can search by image and text using open source models to find things that aren't in a caption for the image at all.
22:22:36<drexler>And with some preprocessing it's lightweight enough to run on CPU
22:23:50<drexler>https://rom1504.github.io/clip-retrieval/ implements this
22:28:55sepro (sepro) joins
22:35:29<drexler>Then if you look at the OpenVerse API https://api.openverse.engineering/v1/#operation/image_search it says in the docs that they are trying to structurally prevent you from downloading the whole database
22:36:06<@arkiver>also ".engineering"
22:36:35<drexler>What? Why? *This is public domain data*, why is WordPress being allowed to build a moat around this, why is Creative Commons allowing it? It's not like Creative Commons needs WordPress to provide search from a technical perspective, you can see the link I just gave you gives you an index over 5 *billion* images.
22:37:06<drexler>In short, this is chickenshit, and I'm pissed.
22:40:09<drexler>arkiver, I didn't even notice that lol
22:43:55<drexler>By contrast, turning this stuff into an ML dataset is probably one of the easier ways to make sure it stays available to the public. Since ML guys will train on the whole 400 million images, raise holy hell if you try to silently wall off access to the full thing, stick the entire dataset up on academictorrents, etc
22:45:17<drexler>It's honestly just a better strategy than giving the whole thing to WordPress and having them pinky promise they won't follow their structural incentives to build a moat.
22:48:37march_happy (march_happy) joins
22:58:27sepro quits [Ping timeout: 245 seconds]
23:01:27Nulo joins
23:30:51BlueMaxima joins
23:33:02wickedplayer494 quits [Ping timeout: 245 seconds]
23:33:24wickedplayer494 joins
23:58:07sepro (sepro) joins