| 00:22:39 | | Jake quits [Quit: Leaving for a bit!] |
| 00:23:03 | | Jake (Jake) joins |
| 00:23:39 | | Jake quits [Client Quit] |
| 00:23:57 | | Jake (Jake) joins |
| 00:32:10 | | Jake quits [Client Quit] |
| 00:32:38 | | tzt quits [Ping timeout: 265 seconds] |
| 00:37:29 | | Jake (Jake) joins |
| 00:38:16 | | Jake quits [Client Quit] |
| 00:43:28 | | Jake (Jake) joins |
| 00:44:08 | | Jake quits [Read error: Connection reset by peer] |
| 00:44:20 | | Jake (Jake) joins |
| 00:44:51 | | Jake quits [Remote host closed the connection] |
| 00:45:03 | | Jake (Jake) joins |
| 00:45:34 | | Jake quits [Read error: Connection reset by peer] |
| 00:45:46 | | Jake (Jake) joins |
| 00:46:18 | | Jake quits [Remote host closed the connection] |
| 00:46:30 | | Jake (Jake) joins |
| 00:47:01 | | Jake quits [Remote host closed the connection] |
| 00:47:14 | | Jake (Jake) joins |
| 00:47:44 | | Jake quits [Remote host closed the connection] |
| 00:47:57 | | Jake (Jake) joins |
| 00:48:28 | | Jake quits [Remote host closed the connection] |
| 00:48:40 | | Jake (Jake) joins |
| 00:49:11 | | Jake quits [Remote host closed the connection] |
| 00:49:24 | | Jake (Jake) joins |
| 00:49:54 | | Jake quits [Remote host closed the connection] |
| 00:50:07 | | Jake (Jake) joins |
| 00:50:37 | | Jake quits [Remote host closed the connection] |
| 00:50:49 | | Jake (Jake) joins |
| 00:51:22 | | Jake quits [Remote host closed the connection] |
| 00:51:35 | | Jake (Jake) joins |
| 00:52:06 | | Jake quits [Remote host closed the connection] |
| 00:52:18 | | Jake (Jake) joins |
| 00:52:50 | | Jake quits [Remote host closed the connection] |
| 00:53:03 | | Jake (Jake) joins |
| 00:53:34 | | Jake quits [Remote host closed the connection] |
| 00:53:47 | | Jake (Jake) joins |
| 00:54:20 | | Jake quits [Remote host closed the connection] |
| 00:54:32 | | Jake (Jake) joins |
| 00:55:07 | | Jake quits [Remote host closed the connection] |
| 00:55:18 | | Jake (Jake) joins |
| 00:55:52 | | Jake quits [Remote host closed the connection] |
| 01:00:17 | | Hackerpcs quits [Quit: Hackerpcs] |
| 01:02:51 | | Hackerpcs (Hackerpcs) joins |
| 01:14:59 | | Jake (Jake) joins |
| 01:17:40 | <Jake> | (sorry for the disconnects y'all) |
| 02:00:02 | | TheTechRobo is now known as x_AmongUsFan69_x |
| 02:00:09 | | x_AmongUsFan69_x is now known as TheTechRobo |
| 02:02:00 | <h2ibot> | Mgoshawk edited FanFiction.Net (+190): https://wiki.archiveteam.org/?diff=48810&oldid=47849 |
| 02:02:01 | <h2ibot> | Paulmorriss edited Flickr (+1371, /* 2020 pricing and campaign */): https://wiki.archiveteam.org/?diff=48811&oldid=47723 |
| 02:02:02 | <h2ibot> | Entartet edited List of websites excluded from the Wayback Machine (+54, Added farfrommoscow.com and picrew.me.): https://wiki.archiveteam.org/?diff=48812&oldid=48803 |
| 02:02:03 | <h2ibot> | Magmaus3 edited URLTeam (+231, /* Alive */ Add cutt.ly): https://wiki.archiveteam.org/?diff=48813&oldid=48748 |
| 02:03:00 | <h2ibot> | ElijahPepe created LGTM.com (+803, Created page with "{{Infobox project | title =…): https://wiki.archiveteam.org/?title=LGTM.com |
| 02:03:01 | <h2ibot> | DJDunsie edited Deathwatch (+227, /* 2022 */ Lexico): https://wiki.archiveteam.org/?diff=48815&oldid=48808 |
| 02:33:46 | | JackThompson quits [Ping timeout: 240 seconds] |
| 03:08:03 | | t3 joins |
| 03:11:10 | | t3 quits [Remote host closed the connection] |
| 03:32:22 | | qwertyasdfuiopghjkl joins |
| 03:52:28 | | Craigle quits [Quit: The Lounge - https://thelounge.chat] |
| 03:53:00 | | Craigle (Craigle) joins |
| 03:58:43 | | Nemo_bis joins |
| 04:15:20 | | dunger quits [Ping timeout: 265 seconds] |
| 04:15:41 | | dunger (dunger) joins |
| 04:29:22 | | DogsRNice quits [Read error: Connection reset by peer] |
| 04:30:19 | <pabs> | some gaming company acquisitions: https://www.gamingonlinux.com/2022/08/embracer-group-to-swallow-up-tripwire-tuxedo-labs-the-lord-of-the-rings/ |
| 04:33:16 | | HackMii_ quits [Ping timeout: 240 seconds] |
| 04:33:20 | | sec^nd quits [Remote host closed the connection] |
| 04:34:02 | | sec^nd (second) joins |
| 04:35:51 | | HackMii_ (hacktheplanet) joins |
| 05:04:16 | | systwi quits [Ping timeout: 265 seconds] |
| 06:24:14 | | Doranwen quits [Client Quit] |
| 07:01:09 | | systwi (systwi) joins |
| 07:14:54 | | dm4v_ joins |
| 07:17:16 | | dm4v quits [Ping timeout: 240 seconds] |
| 07:17:16 | | dm4v_ is now known as dm4v |
| 07:39:46 | | Arcorann quits [Ping timeout: 240 seconds] |
| 08:00:44 | | Doranwen (Doranwen) joins |
| 08:26:53 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 08:54:45 | | tech_exorcist (tech_exorcist) joins |
| 08:57:55 | | dm4v_ joins |
| 08:58:41 | | dm4v quits [Ping timeout: 265 seconds] |
| 08:58:41 | | dm4v_ is now known as dm4v |
| 08:58:51 | <Maakuth|m> | what timezone is arkiver on? should I try to reach them at US evening hours? |
| 09:10:13 | | sec^nd quits [Remote host closed the connection] |
| 09:11:26 | | sec^nd (second) joins |
| 10:53:50 | | Minkafighter quits [Quit: The Lounge - https://thelounge.chat] |
| 10:54:31 | | Minkafighter joins |
| 11:56:31 | | drexler quits [Remote host closed the connection] |
| 11:56:48 | | drexler joins |
| 12:20:12 | | tech_exorcist quits [Remote host closed the connection] |
| 12:20:17 | | tech_exorcist_ (tech_exorcist) joins |
| 12:44:58 | | Arcorann (Arcorann) joins |
| 13:06:04 | | tech_exorcist_ quits [Remote host closed the connection] |
| 13:06:49 | | tech_exorcist_ (tech_exorcist) joins |
| 13:33:09 | <TheTechRobo> | arkiver: ^ |
| 13:36:34 | <Maakuth|m> | I'm UTC+03:00 myself |
| 13:36:54 | <@arkiver> | just leave me a message |
| 14:24:46 | | Arcorann quits [Ping timeout: 240 seconds] |
| 17:00:18 | | tech_exorcist_ leaves |
| 17:39:58 | <systwi_> | Is it safe to simply `cat example.com-00000.warc example.com-00001.warc > example.com.warc`? Do I need to take note of the original input WARC filesizes in case I want to split them up again? |
| 17:41:05 | <@JAA> | Assuming the two input files are valid WARCs, yes, that is safe. |
| 17:41:06 | <systwi_> | The AT wiki mentions [megawarc](https://github.com/alard/megawarc) but I'm not sure if it's still needed. |
| 17:41:49 | <TheTechRobo> | Megawarc is also broken for me, at least for warc.gz. |
| 17:43:18 | <systwi_> | Thanks for the info. I suppose it's probably best to store the original filesizes anyway; that' |
| 17:43:21 | <systwi_> | :-/ |
| 17:43:49 | <systwi_> | Thanks for the info. I suppose it's probably best to store the original filesizes anyway; that's what, ~2 KB? |
| 17:45:59 | <@JAA> | megawarc is useful when you need to merge a larger number of WARCs, I guess. It keeps track of the original files and in theory allows extracting to that again (I think). The version on the AT org should work fine as that's used for all projects. It also does error checks and puts the broken files into a tar. |
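A minimal sketch of the "store the original filesizes" idea discussed above: concatenate the inputs byte-for-byte (the same thing `cat` does) and write a small JSON manifest of names and sizes so the combined file can be cut back apart later. The function names and the manifest layout are hypothetical and are not megawarc's actual format or code.

```python
import json
import shutil
from pathlib import Path


def concat_with_manifest(inputs, output, manifest):
    """Concatenate files byte-for-byte and record each input's name and size.

    The manifest layout (a JSON list of {"name", "size"}) is an assumption
    made for this sketch, not an established format.
    """
    entries = []
    with open(output, "wb") as out:
        for path in map(Path, inputs):
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)  # streams, so large WARCs are fine
            # Only the base name is kept; inputs sharing a name would collide.
            entries.append({"name": path.name, "size": path.stat().st_size})
    Path(manifest).write_text(json.dumps(entries, indent=2))
    return entries


def split_from_manifest(combined, manifest, out_dir, chunk_size=1024 * 1024):
    """Recreate the original files from the combined file using the sizes."""
    entries = json.loads(Path(manifest).read_text())
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(combined, "rb") as src:
        for entry in entries:
            target = out_dir / entry["name"]
            remaining = entry["size"]
            with open(target, "wb") as dst:
                while remaining > 0:
                    chunk = src.read(min(chunk_size, remaining))
                    if not chunk:
                        raise ValueError("combined file is shorter than the manifest says")
                    dst.write(chunk)
                    remaining -= len(chunk)
            written.append(target)
    return written
```

For the example in the question this would be called as `concat_with_manifest(["example.com-00000.warc", "example.com-00001.warc"], "example.com.warc", "example.com.json")`. The sketch does not check that the inputs are valid WARCs; it only copies bytes.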
| 17:47:09 | <TheTechRobo> | JAA: It's possible it's broken for me because of my Python version. I filed an issue a few months ago: https://github.com/ArchiveTeam/megawarc/issues/5 |
| 17:47:59 | <TheTechRobo> | Interesting, "move to Python 3" is an Issue. I can't remember if I tried Python 2 or not. |
| 17:48:25 | <@JAA> | TheTechRobo: That doesn't sound right, and I'm pretty sure the targets run Py 3. |
| 17:48:27 | <TheTechRobo> | Yep, it's set to use python 2. |
| 17:48:32 | <TheTechRobo> | JAA: Weird. |
| 17:48:37 | <TheTechRobo> | https://github.com/ArchiveTeam/megawarc/issues/3 is an issue. |
| 17:48:56 | <@JAA> | Issue 5 would indicate that you're giving it something that's neither a .gz nor a .zst file. |
| 17:49:08 | <@JAA> | But also, you should only give it WARC files, not a .warc.os.cdx.gz file. |
| 17:49:35 | <TheTechRobo> | Isn't the point of Megawarc that it converts a directory tree into a WARC, a tar, and a metadata file? |
| 17:50:05 | <@JAA> | converts a collection of WARCs into*, yes |
| 17:50:19 | <@JAA> | It doesn't handle other file formats, nor does it check the file format. |
| 17:50:21 | <TheTechRobo> | From the README: |
| 17:50:21 | <TheTechRobo> | FILE.warc.gz is the concatenated .warc.gz |
| 17:50:21 | <TheTechRobo> | FILE.tar contains any non-warc files from the .tar |
| 17:50:21 | <TheTechRobo> | FILE.json.gz contains metadata |
| 17:50:57 | <@JAA> | I'm pretty sure it never checks whether the file is actually a WARC, only whether it decompresses correctly. |
| 17:51:19 | <TheTechRobo> | Well, it crashes when `Checking 1652740309829c5a3e1fc0bf20-1_1652740337.315711/funeralhome-1934cdbeadc09ac1a98713bb2b1d8ca41f8f2ec1-20220516-223149.warc.gz`, which should be a valid WARC. |
| 17:52:12 | <TheTechRobo> | https://transfer.archivete.am/SWZGI/failed_on.warc.gz is an uploaded version of that WARC. |
| 17:52:57 | <@JAA> | Ok, correction, the packer does indeed still use Python 2.7. Eww... |
| 17:54:52 | <@JAA> | And true, it should append other files to the tar. test_gz would fail, but that only runs for .warc.gz and .warc.zst (the latter with further filename pattern restrictions for $reasons). |
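To illustrate the distinction drawn above between "decompresses correctly" and "is actually a WARC", here is a rough sketch of a decompression-only check using Python 3's standard `gzip` module. It is not megawarc's `test_gz`; the function name is hypothetical, and it only covers `.gz` input, not `.zst`.

```python
import gzip
import zlib


def gzip_decompresses_cleanly(path, chunk_size=1024 * 1024):
    """Return True if every gzip member in `path` decompresses without error.

    This says nothing about whether the decompressed bytes are WARC records:
    any well-formed gzip file passes.
    """
    try:
        with gzip.open(path, "rb") as f:
            # Read to the end; the gzip module walks through concatenated
            # members, which is how .warc.gz files are usually laid out.
            while f.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers gzip.BadGzipFile (bad header or CRC mismatch),
        # EOFError covers truncated input, zlib.error covers corrupt data.
        return False
    return True
```

A check like this would accept a gzip-compressed non-WARC file such as a `.cdx.gz` if it were run on one, which is why the file type still has to be filtered separately.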
| 17:56:28 | <@JAA> | I'd help with debugging, but I banished Python 2 from my systems a long while ago. |
| 17:56:59 | <TheTechRobo> | Understandable. |
| 17:57:09 | <TheTechRobo> | I did that, until I ran into legacy software with no modern alternative. :/ |
| 17:57:20 | <TheTechRobo> | Such as megawarc. |
| 17:58:47 | | tech_exorcist (tech_exorcist) joins |
| 18:03:11 | | tech_exorcist quits [Client Quit] |
| 18:07:50 | | tech_exorcist (tech_exorcist) joins |
| 18:28:06 | <systwi_> | Thanks for the info! |
| 18:37:14 | | Nulo quits [Ping timeout: 265 seconds] |
| 18:46:56 | | Nulo joins |
| 18:52:29 | | balrog quits [Quit: Bye] |
| 19:07:29 | | balrog (balrog) joins |
| 19:39:44 | | tech_exorcist quits [Remote host closed the connection] |
| 19:46:13 | | michaelblob_ (michaelblob) joins |
| 19:48:46 | | michaelblob quits [Ping timeout: 240 seconds] |
| 21:36:09 | | tzt (tzt) joins |
| 22:14:33 | | sec^nd quits [Remote host closed the connection] |
| 22:15:11 | | sec^nd (second) joins |
| 22:30:29 | | sec^nd quits [Remote host closed the connection] |
| 22:31:05 | | sec^nd (second) joins |
| 23:07:05 | | BlueMaxima joins |
| 23:50:16 | | sec^nd quits [Ping timeout: 240 seconds] |
| 23:56:07 | | sec^nd (second) joins |