00:05:03etnguyen03 quits [Client Quit]
00:08:08etnguyen03 (etnguyen03) joins
00:08:19Unholy2361924645 quits [Ping timeout: 255 seconds]
00:13:39pedantic-darwin joins
00:27:18etnguyen03 quits [Client Quit]
00:27:59etnguyen03 (etnguyen03) joins
00:37:46etnguyen03 quits [Client Quit]
00:38:27etnguyen03 (etnguyen03) joins
01:00:32etnguyen03 quits [Client Quit]
01:01:02<fireonlive>"Stack Overflow Upset Over Users Deleting Answers After OpenAI Partnership" https://build5nines.com/stack-overflow-upset-over-users-deleting-answers-after-openai-partnership/ https://news.ycombinator.com/item?id=40302792
01:01:07<fireonlive>-> #stackunderflow
01:16:57treora quits [Remote host closed the connection]
01:16:58treora joins
01:17:12treora quits [Remote host closed the connection]
01:17:15treora joins
01:28:28kiryu joins
01:28:28kiryu quits [Changing host]
01:28:28kiryu (kiryu) joins
01:42:07etnguyen03 (etnguyen03) joins
01:54:24treora quits [Remote host closed the connection]
01:54:26treora joins
02:10:05<h2ibot>Usernam edited List of websites excluded from the Wayback Machine (+31): https://wiki.archiveteam.org/?diff=52219&oldid=52216
02:23:25<pabs>btw, why was WARC implemented instead of say adding decrypted packet content to network packet capture formats like pcap?
02:31:22<@OrIdow6>pabs: I've never done anything low-level with TCP or TLS but my impression is that that would be massively more complex to read
02:32:14<@OrIdow6>For very marginal benefit
02:33:09<@OrIdow6>It's not very hard to write a WARC parser by hand
02:40:23<nulldata>!con 1upc5677t2hep02hvkfxmtkvc 9
02:44:30<fireonlive>:3
02:45:33<@JAA>It's not very hard to write a WARC parser by hand that works for the WARCs produced by most tools.
02:45:48<@JAA>It's quite another thing to write a parser that parses all valid WARCs correctly.
02:48:34<fireonlive>and those quirky invalid ones..
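[editor's note] To illustrate the point made above about hand-written parsers, here is a minimal sketch of the "easy" kind: it only handles uncompressed records with a simple header block followed by Content-Length bytes of payload. The function name read_warc_records and the sample record are made up for illustration. It deliberately ignores per-record gzip, folded header lines, repeated header fields and malformed input, which is the gap JAA describes between parsing most WARCs and parsing all valid ones.

```python
import io


def read_warc_records(stream):
    """Yield (version, headers, payload) for each record in an
    uncompressed WARC byte stream. Hypothetical helper, not a full parser."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of input
        if line in (b"\r\n", b"\n"):
            continue  # blank lines separating records
        version = line.strip().decode("ascii")
        if not version.startswith("WARC/"):
            raise ValueError(f"expected a WARC version line, got {version!r}")

        # Header block: "Name: value" lines up to the first empty line.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8").partition(":")
            # Simplification: a repeated field overwrites the earlier one.
            headers[name.strip().lower()] = value.strip()

        # The payload is exactly Content-Length bytes long.
        length = int(headers["content-length"])
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError("truncated record")
        yield version, headers, payload


# Made-up sample: two identical 'resource' records back to back.
sample = (
    b"WARC/1.0\r\n"
    b"WARC-Type: resource\r\n"
    b"WARC-Target-URI: http://example.com/\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
    b"\r\n\r\n"
)
records = list(read_warc_records(io.BytesIO(sample * 2)))
for version, headers, payload in records:
    print(version, headers["warc-type"], payload)
```

Most real-world WARCs are gzip-compressed record by record, so a sketch like this would need at least a decompression layer in front of it before it could read them.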
03:00:14<h2ibot>JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=52220&oldid=52219
03:54:51<nicolas17>OrIdow6: I think what pabs meant wasn't having to deal with TCP or TLS
03:54:59<nicolas17>but "HTTP request" as a pcap packet type
04:13:12etnguyen03 quits [Client Quit]
04:17:36etnguyen03 (etnguyen03) joins
04:24:17<@OrIdow6>Huh, not familiar with PCAP either, didn't know that was possible
04:26:10<@OrIdow6>The short answer for why WARC is the way it is is that newline-delimited headers and body is a fairly old and well-established format, for instance https://en.wikipedia.org/wiki/Mbox and HTTP look the same
04:26:14<@OrIdow6>Look similar
04:27:35<@OrIdow6>WARC's predecessor, ARC, is older than JSON, let alone HAR http://fileformats.archiveteam.org/wiki/ARC_(Internet_Archive)
04:29:03<@OrIdow6>(Though it does look funky)
04:30:00etnguyen03 quits [Client Quit]
04:30:56<nicolas17>.pcap and .pcapng files *usually* store Ethernet frames
04:31:23<nicolas17>but besides that popular use, it can also have any of these https://www.tcpdump.org/linktypes.html
04:35:43<nicolas17>pabs: btw TLS keys can be embedded in pcap files to make them decryptable
04:36:58<nicolas17>unfortunately (unlike storing decrypted content) that means they remain non-compressible
04:37:54<thuban>nicolas17: did you ever write that pcap-to-warc tool you were thinking about?
04:38:04<nicolas17>no :/
04:38:58<nicolas17>I guess it would be a wireshark/tshark plugin
04:39:16<nicolas17>since I'm not gonna write TLS decryption myself
04:39:26<@OrIdow6>Incidentally on Rust WARC writers, I did write something like that a while ago, it used the "correct" method of producing them (just dumping the TCP/TLS stream to a file) but I didn't test it nearly enough
04:40:03<thuban>(istr Sanqui also did some stuff with pcap for discard2)
04:40:12<nicolas17>Wireshark can already save HTTP bodies
04:41:43<nicolas17>like, if you captured traffic while an application downloaded a file, Wireshark can then extract that file from the packet capture, undoing chunking and compression
04:43:33<nicolas17>if *that* is an acceptable feature for Wireshark to have, I don't anticipate opposition to a WARC exporter :P
04:53:09Island quits [Read error: Connection reset by peer]
04:58:35eroc1990 quits [Client Quit]
05:07:04nicolas17 quits [Ping timeout: 265 seconds]
05:16:44eroc1990 (eroc1990) joins
06:06:16<pabs>OrIdow6: I was thinking standard pcap, but yeah I guess excluding the lower layers is one reason
06:42:55BlueMaxima quits [Read error: Connection reset by peer]
06:52:28jacksonchen666 (jacksonchen666) joins
07:00:16pixel leaves
07:00:16pixel (pixel) joins
07:05:43xkey is now known as x
07:06:13Unholy2361924645 (Unholy2361) joins
07:10:29trenton13333 joins
07:14:56wyatt8750 quits [Client Quit]
07:15:12wyatt8740 joins
07:16:37x is now known as xkey
07:21:14wyatt8750 joins
07:21:45wyatt8740 quits [Read error: Connection reset by peer]
08:10:34<flashfire42|m>Air Vanuatu might go under
08:10:41<flashfire42|m>Maybe grab the site
08:12:36<thuban>cloudflared, with js pow
08:13:08<flashfire42|m>Well fuck ok
08:19:40trenton13333 quits [Client Quit]
08:41:09xkey is now known as lennart
08:41:17lennart is now known as xkey
09:00:05Bleo1826007227 quits [Client Quit]
09:01:23Bleo18260072271 joins
09:04:48<@OrIdow6>Thinking about it, pcap -> warc does sound quite nice
09:05:07<@OrIdow6>Genuine Chrome sessions/whatever
09:05:26<@OrIdow6>Which is, I assume, the reason people have thought of it in the past
09:23:43jacksonchen666 quits [Client Quit]
09:47:23pseudorizer quits [Quit: ZNC 1.9.0 - https://znc.in]
09:50:53pseudorizer (pseudorizer) joins
10:09:34<h2ibot>Manu edited Mailman/2 (+63, /* http://dc.ketelhot.de/pipermail lost */): https://wiki.archiveteam.org/?diff=52221&oldid=52178
10:14:34<h2ibot>Manu edited Mailman/2 (+96, /* http://france.debian.net/pipermail/ lost */): https://wiki.archiveteam.org/?diff=52222&oldid=52221
10:17:23f_ (funderscore) joins
10:25:36<h2ibot>Manu edited Mailman/2 (+89, /* http://gnu.org.ve/pipermail/ lost */): https://wiki.archiveteam.org/?diff=52223&oldid=52222
10:27:36<h2ibot>Manu edited Mailman/2 (+64, /* http://hjemli.net/pipermail lost */): https://wiki.archiveteam.org/?diff=52224&oldid=52223
10:32:37<h2ibot>Manu edited Mailman/2 (+29, /* http://icculus.org/pipermail/ never seen */): https://wiki.archiveteam.org/?diff=52225&oldid=52224
10:40:46Wohlstand quits [Client Quit]
10:48:38f_ quits [Remote host closed the connection]
10:49:38f_ (funderscore) joins
10:52:41<h2ibot>Manu edited Mailman/2 (+55, /* http://icedtea.classpath.org/pipermail moved…): https://wiki.archiveteam.org/?diff=52226&oldid=52225
11:50:52<h2ibot>Manu edited Mailman/2 (+127, /* http://intrepid.danplanet.com/pipermail/ and…): https://wiki.archiveteam.org/?diff=52227&oldid=52226
11:53:23etnguyen03 (etnguyen03) joins
12:07:42<c3manu>JAA: I was planning on archiving https://tube.network.europa.eu/ which is shutting down on May 18th (see deathwatch). the contents hosted there are apparently mirrored from/to youtube.
12:08:08<c3manu>there are 202 videos in total, ranging from 1:30 min animations to some 2h or even 4h interviews
12:09:41<c3manu>what do you say: still archive it? archive the youtube channels instead (they have older videos that aren't on the peertube instance)? or do the latter and fetch the instance while ignoring the video files?
12:09:58<c3manu>i think i'd do the latter, but i'd like to hear your opinion on it :)
12:10:38<c3manu>(and anyone else’s, too btw :))
12:19:01etnguyen03 quits [Client Quit]
12:36:11xkey just saw this on Mastodon:
12:36:16<xkey>https://open-archive.org/jobs
12:36:24<xkey>if anyone's looking for a job
12:36:42grid joins
12:36:51<xkey>source https://infosec.exchange/@cooperq/112406644788675194
12:38:13etnguyen03 (etnguyen03) joins
12:48:36etnguyen03 quits [Client Quit]
13:24:07etnguyen03 (etnguyen03) joins
13:27:19ThetaDev quits [Ping timeout: 265 seconds]
13:29:37ThetaDev joins
13:35:50loug quits [Client Quit]
13:36:19loug joins
13:42:29kiryu quits [Remote host closed the connection]
13:43:52kiryu (kiryu) joins
13:50:52nicolas17 joins
13:53:10Notrealname1234 (Notrealname1234) joins
14:02:52<eightthree>!help
14:03:13<eightthree>!con help
14:07:29etnguyen03 quits [Client Quit]
14:08:13Notrealname1234 quits [Client Quit]
14:08:35f_ is now known as funderscore
14:09:04funderscore is now known as f_
14:14:04etnguyen03 (etnguyen03) joins
14:14:50<eightthree>> @OpenArchive@mstdn.social, a radical archiving organization that is empowering human rights defenders and people in war zones to preserve video evidence
14:14:52<eightthree>worded like that, you might end up on a hitlist, of a country at war or organized crime or organized crime paid by a country at war...
14:26:18monika5 (boom) joins
14:26:36monika5 quits [Client Quit]
14:26:45<xkey>true
14:26:58<xkey>happened to friends previously working at OCCRP.org
14:33:13<@JAA>c3manu: https://tube.network.europa.eu/ sounds small enough that we could grab a copy with videos then.
14:36:17<c3manu>JAA: okay, thanks :)
14:40:25Earendil7 quits [Ping timeout: 255 seconds]
14:42:52Earendil7 (Earendil7) joins
14:49:03etnguyen03 quits [Client Quit]
14:51:52Jens quits [Client Quit]
14:52:08Jens (JensRex) joins
14:56:36grid quits [Client Quit]
14:58:00thuban quits [Quit: bbl, system upgrade]
15:03:24Notrealname1234 (Notrealname1234) joins
15:23:34Notrealname1234 quits [Client Quit]
15:24:28SootBector quits [Remote host closed the connection]
15:25:46SootBector (SootBector) joins
15:50:45kiryu quits [Remote host closed the connection]
15:51:24etnguyen03 (etnguyen03) joins
16:24:59etnguyen03 quits [Client Quit]
16:45:49thuban (thuban) joins
16:49:03etnguyen03 (etnguyen03) joins
17:40:25Unholy2361924645 quits [Ping timeout: 255 seconds]
17:58:09<h2ibot>Exorcism edited Discourse (+53, /* Active Discourses */): https://wiki.archiveteam.org/?diff=52228&oldid=52167
18:02:01Earendil7 quits [Ping timeout: 255 seconds]
18:04:15Earendil7 (Earendil7) joins
18:12:43Island joins
18:33:36somebody joins
18:42:33tapos quits [Client Quit]
19:14:52loug quits [Client Quit]
19:15:10loug joins
19:21:07etnguyen03 quits [Client Quit]
19:53:12etnguyen03 (etnguyen03) joins
20:05:07etnguyen03 quits [Client Quit]
20:08:48etnguyen03 (etnguyen03) joins
20:23:46knecht4 quits [Ping timeout: 255 seconds]
20:30:14etnguyen03 quits [Client Quit]
20:30:58Earendil7 quits [Ping timeout: 255 seconds]
20:31:50Earendil7 (Earendil7) joins
20:33:24JaffaCakes118 (JaffaCakes118) joins
20:37:16Earendil7 quits [Ping timeout: 255 seconds]
20:59:49f_ quits [Ping timeout: 250 seconds]
21:12:12AlsoHP_Archivist quits [Read error: Connection reset by peer]
21:18:30somebody quits [Client Quit]
21:39:48icedice joins
21:39:49<eggdrop>[tell] icedice: [2024-05-07T20:24:57Z] <thuban> i went through the scanlation discord scrape that Vokun did; have requested it in #//, submitted relevant urls to projects, and checked for custom blogspots (there were none)
21:40:40<icedice>Thanks thuban!
21:40:48BlueMaxima joins
21:51:47Fijxu|m joins
22:08:38Harzilein joins
22:10:52le0n quits [Ping timeout: 255 seconds]
22:18:03PredatorIWD quits [Read error: Connection reset by peer]
22:21:07PredatorIWD joins
22:32:24JaffaCakes118 quits [Remote host closed the connection]
22:32:48JaffaCakes118 (JaffaCakes118) joins
22:49:43systwi_ quits [Quit: systwi_]
22:49:43nothere quits [Quit: Leaving]
22:50:10systwi_ joins
22:52:32le0n (le0n) joins
22:54:58systwi_ quits [Ping timeout: 255 seconds]
23:04:24<joepie91|m>random archival-relevant find: https://www.youtube.com/watch?v=oSsZJS26D4E (a version of the soundtrack that few remaining copies exist of afaik)
23:04:52<joepie91|m>not sure what the best way is currently to get a copy of this archived
23:16:44BlueMaxima quits [Read error: Connection reset by peer]
23:22:19<that_lurker>arkiver: Would that fit #down-the-tube? ^
23:22:39nothere joins
23:47:13Island_ joins
23:48:45Island_ quits [Read error: Connection reset by peer]
23:51:18Island quits [Ping timeout: 265 seconds]