| 00:01:11 | | VerifiedJ quits [Quit: The Lounge - https://thelounge.chat] |
| 00:01:21 | | gazorpazorp quits [Client Quit] |
| 00:01:37 | | gazorpazorp (gazorpazorp) joins |
| 00:06:54 | | VerifiedJ (VerifiedJ) joins |
| 00:20:15 | <@OrIdow6> | Does anyone have examples of projects from before ~late 2019 that have data in the WBM but aren't viewable there? If I make a viewer for Google Drive, would like parts to be reusable |
| 00:20:27 | <@OrIdow6> | Date because that's about when I joined |
| 00:35:36 | <h2ibot> | JustAnotherArchivist edited Deathwatch (+167, /* 2022 */ Add Meta, fix order): https://wiki.archiveteam.org/?diff=47997&oldid=47949 |
| 00:40:29 | | FourFire quits [Remote host closed the connection] |
| 00:44:21 | | LegitSi quits [Remote host closed the connection] |
| 01:01:37 | | dm4v quits [Read error: Connection reset by peer] |
| 01:02:06 | | dm4v joins |
| 01:02:08 | | dm4v is now authenticated as dm4v |
| 01:02:08 | | dm4v quits [Changing host] |
| 01:02:08 | | dm4v (dm4v) joins |
| 01:19:42 | <katocala> | !a https://heimatkunde.boell.de/ --explain "think tank, Heinrich Boll Foundation (kakapo)" --igset blogs,badvideos --useragent firefox --pipeline jap-kakapo |
| 01:20:06 | <katocala> | blah |
| 01:26:42 | | qwertyasdfuiopghjkl quits [Client Quit] |
| 01:39:33 | | etnguyen03 (etnguyen03) joins |
| 02:02:43 | | dm4v_ joins |
| 02:03:51 | | dm4v quits [Ping timeout: 265 seconds] |
| 02:03:51 | | dm4v_ is now known as dm4v |
| 02:03:51 | | dm4v is now authenticated as dm4v |
| 02:03:52 | | dm4v quits [Changing host] |
| 02:03:52 | | dm4v (dm4v) joins |
| 02:06:12 | | etnguyen03 quits [Ping timeout: 244 seconds] |
| 02:35:05 | | etnguyen03 (etnguyen03) joins |
| 03:07:46 | | AlsoHP_Archivist joins |
| 03:11:35 | | HP_Archivist quits [Ping timeout: 258 seconds] |
| 03:12:43 | | Unverified joins |
| 03:15:26 | | epe07 joins |
| 03:32:29 | | Stiletto quits [Ping timeout: 244 seconds] |
| 03:35:05 | | HP_Archivist (HP_Archivist) joins |
| 03:35:27 | | AlsoHP_Archivist quits [Client Quit] |
| 03:35:53 | | Stiletto joins |
| 03:40:42 | | battlekeeper joins |
| 03:41:10 | | pawbs|2 quits [Quit: My ZNC server died. Probably updating my kernel…] |
| 03:44:40 | | AlsoHP_Archivist joins |
| 03:48:10 | <h2ibot> | Tech234a edited YouTube/Technical details (+388, Add additional domains): https://wiki.archiveteam.org/?diff=47998&oldid=47996 |
| 03:51:45 | | battlekeeper leaves |
| 03:57:11 | <h2ibot> | JustAnotherArchivist edited YouTube/Technical details (+780, /* Videos */ Add details on previously working…): https://wiki.archiveteam.org/?diff=47999&oldid=47998 |
| 04:00:24 | | BlueMaxima joins |
| 04:03:13 | <h2ibot> | JustAnotherArchivist edited YouTube/Technical details (+1635, /* Playlists */): https://wiki.archiveteam.org/?diff=48000&oldid=47999 |
| 04:13:14 | <h2ibot> | Tech234a edited YouTube/Technical details (+1719, Add some information about channel URLs): https://wiki.archiveteam.org/?diff=48001&oldid=48000 |
| 04:14:21 | | etnguyen03 quits [Client Quit] |
| 04:14:31 | | etnguyen03 (etnguyen03) joins |
| 04:24:26 | | nostalgebraist joins |
| 04:29:41 | | amazo joins |
| 04:29:50 | <amazo> | are the trackers down? |
| 04:30:17 | <h2ibot> | Tech234a edited YouTube/Technical details (+56, /* Playlists */ Add a little more detail about…): https://wiki.archiveteam.org/?diff=48002&oldid=48001 |
| 04:34:55 | <tech234a> | amazo: yes except for URLTeam |
| 04:38:06 | | amazo quits [Remote host closed the connection] |
| 04:47:49 | | qw3rty__ joins |
| 04:51:32 | | qw3rty_ quits [Ping timeout: 244 seconds] |
| 05:00:00 | | treora quits [Quit: blub blub.] |
| 05:01:17 | | treora joins |
| 05:12:21 | | AlsoHP_Archivist quits [Ping timeout: 265 seconds] |
| 05:18:05 | | etnguyen03 quits [Ping timeout: 258 seconds] |
| 05:21:55 | | HP_Archivist quits [Ping timeout: 258 seconds] |
| 05:32:34 | | etnguyen03 (etnguyen03) joins |
| 05:35:45 | | jamesp quits [Client Quit] |
| 05:43:46 | | Nay quits [Ping timeout: 265 seconds] |
| 06:01:06 | | etnguyen03 quits [Client Quit] |
| 06:30:55 | | Arcorann_ quits [Ping timeout: 258 seconds] |
| 07:45:35 | | nepeat quits [Read error: Connection reset by peer] |
| 07:46:39 | | nepeat (nepeat) joins |
| 08:03:31 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 09:00:17 | | britmob25636477 quits [Quit: britmob25636477] |
| 09:36:06 | | Arcorann_ joins |
| 09:56:10 | | Hosseinifard (Hosseinifard) joins |
| 09:57:13 | | Hosseinifard quits [Client Quit] |
| 11:45:15 | | TheTechRobo quits [Ping timeout: 258 seconds] |
| 11:46:51 | | enowaldo joins |
| 11:46:57 | <enowaldo> | Legal/forensics Q for 3rd party re: problematic d/l archive. |
| 11:48:14 | | TheTechRobo joins |
| 12:21:33 | | enowaldo quits [Ping timeout: 265 seconds] |
| 12:38:37 | | enowaldo joins |
| 13:00:49 | | enowaldo quits [Ping timeout: 244 seconds] |
| 13:02:17 | | britmob25636477 joins |
| 13:02:38 | | enowaldo joins |
| 13:10:59 | | qwertyasdfuiopghjkl joins |
| 13:15:17 | | enowaldo quits [Ping timeout: 244 seconds] |
| 13:21:17 | | enowaldo joins |
| 13:26:08 | | enowaldo quits [Ping timeout: 244 seconds] |
| 13:29:54 | | Myself quits [Ping timeout: 258 seconds] |
| 13:31:22 | | enowaldo joins |
| 13:36:28 | | enowaldo quits [Ping timeout: 265 seconds] |
| 13:37:38 | | Myself (myself) joins |
| 13:40:38 | | Arcorann_ quits [Ping timeout: 258 seconds] |
| 13:41:30 | | enowaldo joins |
| 13:44:08 | | vukky (Vukky) joins |
| 13:46:23 | | enowaldo quits [Ping timeout: 258 seconds] |
| 13:51:37 | | enowaldo joins |
| 13:56:21 | | enowaldo quits [Ping timeout: 258 seconds] |
| 14:01:44 | | enowaldo joins |
| 14:06:26 | | enowaldo quits [Ping timeout: 244 seconds] |
| 14:11:52 | | enowaldo joins |
| 14:14:49 | | AlsoHP_Archivist joins |
| 14:16:46 | | enowaldo quits [Ping timeout: 244 seconds] |
| 14:21:07 | | Daloader joins |
| 14:21:58 | | enowaldo joins |
| 14:26:44 | | enowaldo quits [Ping timeout: 265 seconds] |
| 14:32:10 | | enowaldo joins |
| 14:36:59 | | enowaldo quits [Ping timeout: 258 seconds] |
| 14:42:13 | | enowaldo joins |
| 14:47:02 | | enowaldo quits [Ping timeout: 265 seconds] |
| 14:47:16 | | HP_Archivist (HP_Archivist) joins |
| 14:52:21 | | enowaldo joins |
| 14:57:04 | | enowaldo quits [Ping timeout: 244 seconds] |
| 15:02:28 | | enowaldo joins |
| 15:07:16 | | enowaldo quits [Ping timeout: 258 seconds] |
| 15:12:37 | | enowaldo joins |
| 15:17:13 | | enowaldo quits [Ping timeout: 244 seconds] |
| 15:18:28 | | enowaldo joins |
| 15:29:00 | <Nulo> | What is a good tool to download *many* small files in a small amount of time? |
| 15:29:38 | | qwertyasdfuiopghjkl quits [Ping timeout: 244 seconds] |
| 15:35:06 | <Zeklyn> | wget |
| 15:37:06 | <@JAA> | How many in how little time? |
| 15:38:02 | <rewby> | My usual go-to is curl + gnu parallel |
| 15:39:22 | <Nulo> | JAA, 548 thousand, I guess it's not that much |
| 15:39:42 | <Nulo> | From my testing just now parallel gets CPU bound |
| 15:40:06 | <Nulo> | I was trying aria2 but for some reason it's pretty slow (it tries every few seconds instead of constantly downloading) |
| 15:40:14 | <rewby> | Maybe shard the list into like 16 files and run 16 wget -i instances? |
| 15:40:56 | <rewby> | Although I'm curious as to what you're downloading that's that many files |
| 15:41:18 | <Nulo> | Scraping a subtitle website (subdivx.com) |
| 15:41:25 | <Nulo> | I'll try that, thanks |
| 15:41:39 | <Nulo> | The other problem with curl/wget + parallel is that it creates connections for no reason |
| 15:41:55 | <rewby> | Yeah. I think the wget -i approach should keep the connection open |
| 15:42:14 | <rewby> | And ideally there'll always be one downloading while the others do whatever administration they do |
| 15:42:32 | | AlsoHP_Archivist quits [Client Quit] |
| 15:44:03 | | AlsoHP_Archivist joins |
| 15:44:29 | | HP_Archivist quits [Client Quit] |
| 15:44:32 | | AlsoHP_Archivist quits [Remote host closed the connection] |
| 15:44:50 | | HP_Archivist (HP_Archivist) joins |
| 15:49:24 | <enowaldo> | Legal/forensics question for 3rd party re: problematic d/l archive. Any docs or resources available? |
| 15:57:55 | <@OrIdow6> | enowaldo: If this is a question not directed at a specific person, I have no idea what you're talking about |
| 16:03:32 | <@OrIdow6> | Nulo: I don't know if you got told, but ArchiveTeam did start its own download of the site when you talked about it |
| 16:03:57 | <@OrIdow6> | Which looks like it is still running in ArchiveBot |
| 16:04:04 | <Nulo> | OrIdow6, I haven't got told, I couldn't find any info on the wiki so I took the matter into my own hands :P |
| 16:04:47 | <Nulo> | Is there a way to see the progress somewhere? |
| 16:06:56 | <@OrIdow6> | http://dashboard.at.ninjawedding.org/3 |
| 16:09:28 | <Nulo> | Okay, thanks. I will continue with my archive just to be sure. |
| 16:09:44 | <Nulo> | The site announced later on that they would stay and not close, but it's not clear so I would archive just to be sure |
| 16:17:56 | | Megame (Megame) joins |
| 16:27:23 | | enowaldo quits [Ping timeout: 258 seconds] |
| 16:32:52 | | enowaldo joins |
| 16:37:43 | | enowaldo quits [Ping timeout: 265 seconds] |
| 16:42:59 | | enowaldo joins |
| 16:47:38 | | enowaldo quits [Ping timeout: 244 seconds] |
| 16:53:05 | | enowaldo joins |
| 16:56:57 | | Nay (JeDa) joins |
| 16:58:01 | | enowaldo quits [Ping timeout: 265 seconds] |
| 17:03:12 | | enowaldo joins |
| 17:07:47 | | enowaldo quits [Ping timeout: 244 seconds] |
| 17:20:33 | | godane2 quits [Client Quit] |
| 17:20:45 | | godane (godane) joins |
| 17:38:06 | | ragu_ joins |
| 17:40:24 | | ragu__ joins |
| 17:41:45 | | ragu quits [Ping timeout: 258 seconds] |
| 17:41:48 | | ragu__ quits [Read error: Connection reset by peer] |
| 17:43:40 | | ragu_ quits [Ping timeout: 258 seconds] |
| 17:44:43 | | ragu__ joins |
| 17:58:56 | | ragu__ quits [Ping timeout: 244 seconds] |
| 18:00:57 | | lunik1 quits [Quit: :x] |
| 18:11:28 | | enowaldo joins |
| 18:12:28 | <enowaldo> | A friend is looking for help for someone in hot water over a CP-tainted archive, political leak. Any organisations or references to suggest? |
| 18:12:57 | <enowaldo> | s/help/legal help/ |
| 18:15:57 | <enowaldo> | My understanding is that the content was planted or incidental. |
| 18:16:25 | <Unverified> | iirc apple is using a CSAM hash list, not sure if those would be public but they're hashes to known CP archives that you could scan against to remove such content |
| 18:17:07 | <enowaldo> | Unverified: Right. They're sort of beyond the preventive stage, though that's a good suggestion. |
| 18:17:56 | <Unverified> | that's the best thing I can think of that'll help a little at least lol |
| 18:18:35 | <enowaldo> | Unverified: Right. I was hoping there might be some resource or discussion of how to avoid, or respond if the situation arises. |
| 18:19:22 | <enowaldo> | I've not found anything on AT website/wiki or a few other discussions (e.g., /r/datahoarder). |
| 18:19:41 | <enowaldo> | And I've passed on the usual organisations --- EFF/ACLU. |
| 18:19:50 | | Nay quits [Client Quit] |
| 18:20:07 | <@JAA> | I haven't heard of there having been such issues within AT before. I'm sure IA has dealt with it before though, so maybe shooting them an email might be an idea. |
| 18:21:11 | <enowaldo> | @JAA Also on my list, haven't reached out yet. Thanks. |
| 18:22:31 | | Nay (JeDa) joins |
| 18:25:09 | <@JAA> | By the way, I believe those CSAM hashes aren't public. Or at least I couldn't find them when I looked into it a while ago. The details of the fuzzy hashing methods are also not very public. |
| 18:30:45 | | lunik1 joins |
| 18:31:39 | | lunik1 quits [Client Quit] |
| 18:36:43 | | lunik1 joins |
| 18:42:47 | <enowaldo> | JAA: "Free to qualified organisations" apparently: https://www.microsoft.com/en-us/photodna "Must be a qualified organization subject to approval by third-party vetting service." https://www.microsoft.com/en-us/PhotoDNA/CloudService |
| 18:43:39 | <@JAA> | Sounds about right. |
| 18:47:27 | | Iki quits [Read error: Connection reset by peer] |
| 18:52:44 | <Unverified> | so it's quite unlikely that any small archive projects would be accepted then |
| 18:52:45 | <Unverified> | rip |
| 18:53:41 | <enowaldo> | Unverified: Though AT / IA might stand an in. Stand up an org specifically aimed at validating archives, maybe. Asking is free :) |
| 18:53:47 | <russss> | enowaldo: they don't send you the hashes though. You hash the content and send the hash to them, and then there's some fuzzy-matching going on server-side. |
| 18:54:03 | <enowaldo> | russss: Right. That would be sufficient IMO. |
| 18:54:23 | <Unverified> | probably some image upload API that gives a response |
| 18:55:10 | <russss> | you don't have to upload the image, because that in itself could give rise to liability (company I work for uses PhotoDNA) |
| 18:56:24 | <enowaldo> | russss: I'm guessing some processor that ingests images locally, computes a set of hashes based on transforms, and sends those. Apple and MSFT have some whitepapers / docs. |
| 18:57:07 | <russss> | yeah there is some white paper on PhotoDNA somewhere. I think it's a relatively unsophisticated perceptual hashing system by today's standards |
| 18:59:18 | <russss> | if you did have access to the list of hashes it would probably be quite trivial to generate images which happen to match them. |
| 19:00:23 | <@JAA> | And to manipulate matching images so they no longer match them. |
| 19:16:28 | <Unverified> | alternatively, use blind luck and hope some AI can detect something like that but I doubt there's anything like that online |
| 19:17:08 | <Unverified> | making your own would probably get you in trouble as well so I wouldn't think of it as a smart move lol |
| 20:14:06 | | Iki joins |
| 20:52:13 | | vukky quits [Client Quit] |
| 21:07:12 | | enowaldo quits [Client Quit] |
| 21:25:21 | | Minkafighter quits [Quit: The Lounge - https://thelounge.chat] |
| 21:27:03 | | Minkafighter joins |
| 21:32:13 | <Ryz> | Is there a way to archive https://ufile.io/84mirinu ? It looks like it may be behind a reCAPTCHA - came from https://boards.4channel.org/v/thread/580002471 |
| 21:47:35 | | monoxane4 quits [Quit: Ping timeout (120 seconds)] |
| 21:47:49 | | monoxane4 (monoxane) joins |
| 21:47:53 | | BlueMaxima joins |
| 22:27:37 | | Arcorann_ joins |
| 23:14:29 | | qwertyasdfuiopghjkl joins |
| 23:56:10 | | Earendil (Cobalt17) joins |
| 23:58:05 | | Earendil quits [Client Quit] |
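
Note on the 15:40:14 suggestion (shard the list into about 16 files and run one `wget -i` per shard): a minimal sketch of that idea, not a tested recipe for the site discussed. The file names `urls.txt`, `shards` and `downloads`, and all function names, are hypothetical. It assumes `wget` is on the PATH and that each `wget -i` instance reuses its connection across the URLs in its shard, as rewby expects at 15:41:55.

```python
import subprocess
from pathlib import Path


def shard_urls(urls, shard_count):
    """Split urls into shard_count lists, round-robin, so shards stay balanced."""
    shards = [[] for _ in range(shard_count)]
    for index, url in enumerate(urls):
        shards[index % shard_count].append(url)
    return shards


def write_shards(shards, out_dir):
    """Write each non-empty shard to its own text file, one URL per line."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, shard in enumerate(shards):
        if not shard:
            continue  # fewer URLs than shards: skip the empty ones
        path = out_dir / f"shard-{number:02d}.txt"
        path.write_text("\n".join(shard) + "\n")
        paths.append(path)
    return paths


def run_wget_instances(shard_paths, download_dir):
    """Start one `wget -i` per shard and wait for all of them to finish."""
    Path(download_dir).mkdir(parents=True, exist_ok=True)
    procs = [
        subprocess.Popen(
            ["wget", "--no-verbose", "-i", str(path), "-P", str(download_dir)]
        )
        for path in shard_paths
    ]
    # Exit codes, in the same order as shard_paths.
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Hypothetical input: one URL per line in urls.txt.
    urls = Path("urls.txt").read_text().split()
    paths = write_shards(shard_urls(urls, 16), "shards")
    print(run_wget_instances(paths, "downloads"))
```

Round-robin assignment keeps the shards the same size to within one URL, so the instances should finish at roughly the same time. All instances write into the same directory here, so URLs that end in the same file name would need separate directories or renaming.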