01:10:46qw3rty__ quits [Read error: Connection reset by peer]
02:42:55<@hook54321>it's the coke URLs that return status 500 for some reason
02:43:04Ryz quits [Remote host closed the connection]
02:43:57<@hook54321>not sure whether to mark it as unavailable or no redirect
02:44:15Ryz (Ryz) joins
14:15:30atphoenix_ (atphoenix) joins
14:17:41atphoenix quits [Ping timeout: 258 seconds]
15:18:05kiska quits [Remote host closed the connection]
15:18:05flashfire42 quits [Remote host closed the connection]
16:40:59acm joins
17:25:12acm quits [Remote host closed the connection]
17:53:24kiskaWeebChat quits [Ping timeout: 250 seconds]
18:55:02Daloader_ joins
20:22:54Daloader_ quits [Ping timeout: 250 seconds]
20:27:24jtagcat quits [Quit: Bye!]
20:53:15jtagcat (jtagcat) joins
21:03:03<aarchi>On the topic of today's discussion in #noanswers about archiving responses as-is, without processing: hook54321 mentioned to me that saving WARCs of URLTeam data rather than just mappings had been considered before I joined.
21:04:44<aarchi>I think that would be very beneficial for data integrity. For example, when it was discovered that go-hawaii-edu had issues, the responses could be re-parsed to see which shortcodes need to be redone.
21:05:57<aarchi>Plus, the WARCs could be ingested by IA, so people not aware of URLTeam (or my URLHero link resolver, once I get that up) can still resolve dead redirects.
21:08:35<aarchi>The only downsides I can see are the increased storage size and it being a new format. Archive Team is already familiar with large storage requirements, so that should be old hat. If we kept releasing the vertical pipe-separated (|) mappings in addition to the WARCs, then clients that only want that data don't need to change their parsers.
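A rough sketch of the re-parsing idea described above, assuming the raw HTTP response bytes for each shortcode have already been extracted from storage (reading actual WARC files would need a WARC library, which is not shown). The function names `parse_redirect` and `reparse`, and the rule that any response other than a 3xx with a Location header needs redoing, are illustrative assumptions, not how the URLTeam tracker or client works.

```python
def parse_redirect(raw):
    """Return (status, location) from raw HTTP response bytes.

    location is None when the response has no Location header.
    """
    # Keep only the status line and headers; ignore any body.
    head = raw.split(b"\r\n\r\n", 1)[0]
    lines = head.split(b"\r\n")
    # Status line looks like: HTTP/1.1 301 Moved Permanently
    status = int(lines[0].split()[1])
    location = None
    for line in lines[1:]:
        name, sep, value = line.partition(b":")
        # Header names are case-insensitive.
        if sep and name.strip().lower() == b"location":
            location = value.strip().decode("utf-8", "replace")
            break
    return status, location


def reparse(stored):
    """Re-derive mappings from stored responses (hypothetical helper).

    stored maps shortcode -> raw response bytes.
    Returns (mappings, redo): pipe-separated "shortcode|target" lines,
    and the shortcodes whose responses were not usable redirects.
    """
    mappings, redo = [], []
    for code, raw in sorted(stored.items()):
        status, location = parse_redirect(raw)
        if 300 <= status < 400 and location:
            mappings.append(f"{code}|{location}")
        else:
            # Assumption: anything else (e.g. a 500) should be fetched again.
            redo.append(code)
    return mappings, redo
```

Because the stored responses are untouched, a parser bug found later can be fixed and `reparse` run again, while clients that only want the pipe-separated lines keep consuming the same format.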
21:09:45flashfire42 (flashfire42) joins
21:10:47kiska (kiska) joins
21:21:04<@JAA>I believe the desire for WARC on this project is as old as the project itself, basically.
21:21:27<@JAA>But it'd need a bunch of dev work on tracker and client.
21:29:04<@hook54321>aarchi: it probably won't happen anytime soon; from what I know, it's been talked about for over 5 years.
21:30:46<@hook54321>there was some talk at one point about potentially converting existing ones to WARC, but that's probably a bad idea, and not everything needed to do that is saved.
21:31:32<@JAA>s/probably/definitely/
21:31:43<aarchi>Yeah don’t convert old ones
21:31:54<@JAA>Unless 'converting' means retrieving the same URLs again as WARCs.
21:32:55<@hook54321>"<luckcolor> we must fake warc records for the dead ones"
21:33:56<aarchi>That would obfuscate any earlier processing errors in the client or tracker
21:34:42<aarchi>That wouldn’t provide the benefit I proposed, in the case of Hawaiʻi
21:34:52<@JAA>Yeah, hell no.