00:05:03 | | etnguyen03 quits [Client Quit] |
00:08:08 | | etnguyen03 (etnguyen03) joins |
00:08:19 | | Unholy2361924645 quits [Ping timeout: 255 seconds] |
00:13:39 | | pedantic-darwin joins |
00:27:18 | | etnguyen03 quits [Client Quit] |
00:27:59 | | etnguyen03 (etnguyen03) joins |
00:37:46 | | etnguyen03 quits [Client Quit] |
00:38:27 | | etnguyen03 (etnguyen03) joins |
01:00:32 | | etnguyen03 quits [Client Quit] |
01:01:02 | <fireonlive> | "Stack Overflow Upset Over Users Deleting Answers After OpenAI Partnership" https://build5nines.com/stack-overflow-upset-over-users-deleting-answers-after-openai-partnership/ https://news.ycombinator.com/item?id=40302792 |
01:01:07 | <fireonlive> | -> #stackunderflow |
01:16:57 | | treora quits [Remote host closed the connection] |
01:16:58 | | treora joins |
01:17:12 | | treora quits [Remote host closed the connection] |
01:17:15 | | treora joins |
01:28:28 | | kiryu joins |
01:28:28 | | kiryu is now authenticated as kiryu |
01:28:28 | | kiryu quits [Changing host] |
01:28:28 | | kiryu (kiryu) joins |
01:42:07 | | etnguyen03 (etnguyen03) joins |
01:54:24 | | treora quits [Remote host closed the connection] |
01:54:26 | | treora joins |
02:10:05 | <h2ibot> | Usernam edited List of websites excluded from the Wayback Machine (+31): https://wiki.archiveteam.org/?diff=52219&oldid=52216 |
02:23:25 | <pabs> | btw, why was WARC implemented instead of say adding decrypted packet content to network packet capture formats like pcap? |
02:31:22 | <@OrIdow6> | pabs: I've never done anything low-level with TCP or TLS but my impression is that that would be massively more complex to read |
02:32:14 | <@OrIdow6> | For very marginal benefit |
02:33:09 | <@OrIdow6> | It's not very hard to write a WARC parser by hand |
02:40:23 | <nulldata> | !con 1upc5677t2hep02hvkfxmtkvc 9 |
02:44:30 | <fireonlive> | :3 |
02:45:33 | <@JAA> | It's not very hard to write a WARC parser by hand that works for the WARCs produced by most tools. |
02:45:48 | <@JAA> | It's quite another thing to write a parser that parses all valid WARCs correctly. |
02:48:34 | <fireonlive> | and those quirky invalid ones.. |
03:00:14 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=52220&oldid=52219 |
03:54:51 | <nicolas17> | OrIdow6: I think what pabs meant wasn't having to deal with TCP or TLS |
03:54:59 | <nicolas17> | but "HTTP request" as a pcap packet type |
04:13:12 | | etnguyen03 quits [Client Quit] |
04:17:36 | | etnguyen03 (etnguyen03) joins |
04:24:17 | <@OrIdow6> | Huh, not familiar with PCAP either, didn't know that was possible |
04:26:10 | <@OrIdow6> | The short answer for why WARC is the way it is is that newline-delineated headers and body is a fairly old and well-established format, for instance https://en.wikipedia.org/wiki/Mbox and HTTP look the same |
04:26:14 | <@OrIdow6> | Look similar |
04:27:35 | <@OrIdow6> | WARC's predecessor, ARC, is older than JSON, let alone HAR http://fileformats.archiveteam.org/wiki/ARC_(Internet_Archive) |
04:29:03 | <@OrIdow6> | (Though it does look funky) |
04:30:00 | | etnguyen03 quits [Client Quit] |
04:30:56 | <nicolas17> | .pcap and .pcapng files *usually* store Ethernet frames |
04:31:23 | <nicolas17> | but besides that popular use, it can also have any of these https://www.tcpdump.org/linktypes.html |
04:35:43 | <nicolas17> | pabs: btw TLS keys can be embedded in pcap files to make them decryptable |
04:36:58 | <nicolas17> | unfortunately (unlike storing decrypted content) that means they remain non-compressible |
04:37:54 | <thuban> | nicolas17: did you ever write that pcap-to-warc tool you were thinking about? |
04:38:04 | <nicolas17> | no :/ |
04:38:58 | <nicolas17> | I guess it would be a wireshark/tshark plugin |
04:39:16 | <nicolas17> | since I'm not gonna write TLS decryption myself |
04:39:26 | <@OrIdow6> | Incidentally on Rust WARC writers, I did write something like that a while ago, it used the "correct" method of producing them (just dumping the TCP/TLS stream to a file) but I didn't test it nearly enough |
04:40:03 | <thuban> | (istr Sanqui also did some stuff with pcap for discard2) |
04:40:12 | <nicolas17> | Wireshark can already save HTTP bodies |
04:41:43 | <nicolas17> | like, if you captured traffic while an application downloaded a file, Wireshark can then extract that file from the packet capture, undoing chunking and compression |
04:43:33 | <nicolas17> | if *that* is an acceptable feature for Wireshark to have, I don't anticipate opposition to a WARC exporter :P |
04:53:09 | | Island quits [Read error: Connection reset by peer] |
04:58:35 | | eroc1990 quits [Client Quit] |
05:07:04 | | nicolas17 quits [Ping timeout: 265 seconds] |
05:16:44 | | eroc1990 (eroc1990) joins |
06:06:16 | <pabs> | OrIdow6: I was thinking standard pcap, but yeah I guess excluding the lower layers is one reason |
06:42:55 | | BlueMaxima quits [Read error: Connection reset by peer] |
06:52:28 | | jacksonchen666 (jacksonchen666) joins |
07:00:16 | | pixel leaves |
07:00:16 | | pixel (pixel) joins |
07:05:43 | | xkey is now known as x |
07:06:13 | | Unholy2361924645 (Unholy2361) joins |
07:10:29 | | trenton13333 joins |
07:14:56 | | wyatt8750 quits [Client Quit] |
07:15:12 | | wyatt8740 joins |
07:16:37 | | x is now known as xkey |
07:21:14 | | wyatt8750 joins |
07:21:45 | | wyatt8740 quits [Read error: Connection reset by peer] |
08:10:34 | <flashfire42|m> | Air Vanuatu might go under |
08:10:41 | <flashfire42|m> | Maybe grab the site |
08:12:36 | <thuban> | cloudflared, with js pow |
08:13:08 | <flashfire42|m> | Well fuck ok |
08:19:40 | | trenton13333 quits [Client Quit] |
08:41:09 | | xkey is now known as lennart |
08:41:17 | | lennart is now known as xkey |
09:00:05 | | Bleo1826007227 quits [Client Quit] |
09:01:23 | | Bleo18260072271 joins |
09:04:48 | <@OrIdow6> | Thinking about it, pcap -> warc does sound quite nice |
09:05:07 | <@OrIdow6> | Genuine Chrome sessions/whatever |
09:05:26 | <@OrIdow6> | Which is, I assume, the reason people have thought of it in the past |
09:23:43 | | jacksonchen666 quits [Client Quit] |
09:47:23 | | pseudorizer quits [Quit: ZNC 1.9.0 - https://znc.in] |
09:50:53 | | pseudorizer (pseudorizer) joins |
10:09:34 | <h2ibot> | Manu edited Mailman/2 (+63, /* http://dc.ketelhot.de/pipermail lost */): https://wiki.archiveteam.org/?diff=52221&oldid=52178 |
10:14:34 | <h2ibot> | Manu edited Mailman/2 (+96, /* http://france.debian.net/pipermail/ lost */): https://wiki.archiveteam.org/?diff=52222&oldid=52221 |
10:17:23 | | f_ (funderscore) joins |
10:25:36 | <h2ibot> | Manu edited Mailman/2 (+89, /* http://gnu.org.ve/pipermail/ lost */): https://wiki.archiveteam.org/?diff=52223&oldid=52222 |
10:27:36 | <h2ibot> | Manu edited Mailman/2 (+64, /* http://hjemli.net/pipermail lost */): https://wiki.archiveteam.org/?diff=52224&oldid=52223 |
10:32:37 | <h2ibot> | Manu edited Mailman/2 (+29, /* http://icculus.org/pipermail/ never seen */): https://wiki.archiveteam.org/?diff=52225&oldid=52224 |
10:40:46 | | Wohlstand quits [Client Quit] |
10:48:38 | | f_ quits [Remote host closed the connection] |
10:49:38 | | f_ (funderscore) joins |
10:52:41 | <h2ibot> | Manu edited Mailman/2 (+55, /* http://icedtea.classpath.org/pipermail moved…): https://wiki.archiveteam.org/?diff=52226&oldid=52225 |
11:50:52 | <h2ibot> | Manu edited Mailman/2 (+127, /* http://intrepid.danplanet.com/pipermail/ and…): https://wiki.archiveteam.org/?diff=52227&oldid=52226 |
11:53:23 | | etnguyen03 (etnguyen03) joins |
12:07:42 | <c3manu> | JAA: I was planning on archiving https://tube.network.europa.eu/ which is shutting down on May 18th (see deathwatch). the contents hosted there are apparently mirrored from/to youtube. |
12:08:08 | <c3manu> | there are 202 videos in total, ranging from 1:30 min animations to some 2h or even 4h interviews |
12:09:41 | <c3manu> | what do you say: still archive it? archive the youtube channels instead (they have older videos that aren't on the peertube instance)? or do the latter and fetch the instance while ignoring the video files? |
12:09:58 | <c3manu> | i think i'd do the latter, but i'd like to hear your opinion on it :) |
12:10:38 | <c3manu> | (and anyone else’s, too btw :)) |
12:19:01 | | etnguyen03 quits [Client Quit] |
12:36:11 | | xkey just saw this on Mastodon: |
12:36:16 | <xkey> | https://open-archive.org/jobs |
12:36:24 | <xkey> | if anyone's looking for a job |
12:36:42 | | grid joins |
12:36:51 | <xkey> | source https://infosec.exchange/@cooperq/112406644788675194 |
12:38:13 | | etnguyen03 (etnguyen03) joins |
12:48:36 | | etnguyen03 quits [Client Quit] |
13:24:07 | | etnguyen03 (etnguyen03) joins |
13:27:19 | | ThetaDev quits [Ping timeout: 265 seconds] |
13:29:37 | | ThetaDev joins |
13:35:50 | | loug quits [Client Quit] |
13:36:19 | | loug joins |
13:42:29 | | kiryu quits [Remote host closed the connection] |
13:43:52 | | kiryu (kiryu) joins |
13:50:52 | | nicolas17 joins |
13:53:10 | | Notrealname1234 (Notrealname1234) joins |
14:02:52 | <eightthree> | !help |
14:03:13 | <eightthree> | !con help |
14:07:29 | | etnguyen03 quits [Client Quit] |
14:08:13 | | Notrealname1234 quits [Client Quit] |
14:08:35 | | f_ is now known as funderscore |
14:09:04 | | funderscore is now known as f_ |
14:14:04 | | etnguyen03 (etnguyen03) joins |
14:14:50 | <eightthree> | > @OpenArchive@mstdn.social, a radical archiving organization that is empowering human rights defenders and people in war zones to preserve video evidence |
14:14:52 | <eightthree> | worded like that, you might end up on a hitlist, of a country at war or organized crime or organized crime paid by a country at war... |
14:26:18 | | monika5 (boom) joins |
14:26:36 | | monika5 quits [Client Quit] |
14:26:45 | <xkey> | true |
14:26:58 | <xkey> | happened to friends previously working at OCCRP.org |
14:33:13 | <@JAA> | c3manu: https://tube.network.europa.eu/ sounds small enough that we could grab a copy with videos then. |
14:36:17 | <c3manu> | JAA: okay, thanks :) |
14:40:25 | | Earendil7 quits [Ping timeout: 255 seconds] |
14:42:52 | | Earendil7 (Earendil7) joins |
14:49:03 | | etnguyen03 quits [Client Quit] |
14:51:52 | | Jens quits [Client Quit] |
14:52:08 | | Jens (JensRex) joins |
14:56:36 | | grid quits [Client Quit] |
14:58:00 | | thuban quits [Quit: bbl, system upgrade] |
15:03:24 | | Notrealname1234 (Notrealname1234) joins |
15:23:34 | | Notrealname1234 quits [Client Quit] |
15:24:28 | | SootBector quits [Remote host closed the connection] |
15:25:46 | | SootBector (SootBector) joins |
15:50:45 | | kiryu quits [Remote host closed the connection] |
15:51:24 | | etnguyen03 (etnguyen03) joins |
16:24:59 | | etnguyen03 quits [Client Quit] |
16:45:49 | | thuban (thuban) joins |
16:49:03 | | etnguyen03 (etnguyen03) joins |
17:40:25 | | Unholy2361924645 quits [Ping timeout: 255 seconds] |
17:58:09 | <h2ibot> | Exorcism edited Discourse (+53, /* Active Discourses */): https://wiki.archiveteam.org/?diff=52228&oldid=52167 |
18:02:01 | | Earendil7 quits [Ping timeout: 255 seconds] |
18:04:15 | | Earendil7 (Earendil7) joins |
18:12:43 | | Island joins |
18:33:36 | | somebody joins |
18:42:33 | | tapos quits [Client Quit] |
19:14:52 | | loug quits [Client Quit] |
19:15:10 | | loug joins |
19:21:07 | | etnguyen03 quits [Client Quit] |
19:53:12 | | etnguyen03 (etnguyen03) joins |
20:05:07 | | etnguyen03 quits [Client Quit] |
20:08:48 | | etnguyen03 (etnguyen03) joins |
20:23:46 | | knecht4 quits [Ping timeout: 255 seconds] |
20:30:14 | | etnguyen03 quits [Client Quit] |
20:30:58 | | Earendil7 quits [Ping timeout: 255 seconds] |
20:31:50 | | Earendil7 (Earendil7) joins |
20:33:24 | | JaffaCakes118 (JaffaCakes118) joins |
20:37:16 | | Earendil7 quits [Ping timeout: 255 seconds] |
20:59:49 | | f_ quits [Ping timeout: 250 seconds] |
21:12:12 | | AlsoHP_Archivist quits [Read error: Connection reset by peer] |
21:18:30 | | somebody quits [Client Quit] |
21:39:48 | | icedice joins |
21:39:49 | <eggdrop> | [tell] icedice: [2024-05-07T20:24:57Z] <thuban> i went through the scanlation discord scrape that Vokun did; have requested it in #//, submitted relevant urls to projects, and checked for custom blogspots (there were none) |
21:40:40 | <icedice> | Thanks thuban! |
21:40:48 | | BlueMaxima joins |
21:51:47 | | Fijxu|m joins |
22:08:38 | | Harzilein joins |
22:10:52 | | le0n quits [Ping timeout: 255 seconds] |
22:18:03 | | PredatorIWD quits [Read error: Connection reset by peer] |
22:21:07 | | PredatorIWD joins |
22:32:24 | | JaffaCakes118 quits [Remote host closed the connection] |
22:32:48 | | JaffaCakes118 (JaffaCakes118) joins |
22:49:43 | | systwi_ quits [Quit: systwi_] |
22:49:43 | | nothere quits [Quit: Leaving] |
22:50:10 | | systwi_ joins |
22:52:32 | | le0n (le0n) joins |
22:54:58 | | systwi_ quits [Ping timeout: 255 seconds] |
23:04:24 | <joepie91|m> | random archival-relevant find: https://www.youtube.com/watch?v=oSsZJS26D4E (a version of the soundtrack that few remaining copies exist of afaik) |
23:04:52 | <joepie91|m> | not sure what the best way is currently to get a copy of this archived |
23:16:44 | | BlueMaxima quits [Read error: Connection reset by peer] |
23:22:19 | <that_lurker> | arkiver: Would that fit #down-the-tube? ^ |
23:22:39 | | nothere joins |
23:47:13 | | Island_ joins |
23:48:45 | | Island_ quits [Read error: Connection reset by peer] |
23:51:18 | | Island quits [Ping timeout: 265 seconds] |