00:22:39Jake quits [Quit: Leaving for a bit!]
00:23:03Jake (Jake) joins
00:23:39Jake quits [Client Quit]
00:23:57Jake (Jake) joins
00:32:10Jake quits [Client Quit]
00:32:38tzt quits [Ping timeout: 265 seconds]
00:37:29Jake (Jake) joins
00:38:16Jake quits [Client Quit]
00:43:28Jake (Jake) joins
00:44:08Jake quits [Read error: Connection reset by peer]
00:44:20Jake (Jake) joins
00:44:51Jake quits [Remote host closed the connection]
00:45:03Jake (Jake) joins
00:45:34Jake quits [Read error: Connection reset by peer]
00:45:46Jake (Jake) joins
00:46:18Jake quits [Remote host closed the connection]
00:46:30Jake (Jake) joins
00:47:01Jake quits [Remote host closed the connection]
00:47:14Jake (Jake) joins
00:47:44Jake quits [Remote host closed the connection]
00:47:57Jake (Jake) joins
00:48:28Jake quits [Remote host closed the connection]
00:48:40Jake (Jake) joins
00:49:11Jake quits [Remote host closed the connection]
00:49:24Jake (Jake) joins
00:49:54Jake quits [Remote host closed the connection]
00:50:07Jake (Jake) joins
00:50:37Jake quits [Remote host closed the connection]
00:50:49Jake (Jake) joins
00:51:22Jake quits [Remote host closed the connection]
00:51:35Jake (Jake) joins
00:52:06Jake quits [Remote host closed the connection]
00:52:18Jake (Jake) joins
00:52:50Jake quits [Remote host closed the connection]
00:53:03Jake (Jake) joins
00:53:34Jake quits [Remote host closed the connection]
00:53:47Jake (Jake) joins
00:54:20Jake quits [Remote host closed the connection]
00:54:32Jake (Jake) joins
00:55:07Jake quits [Remote host closed the connection]
00:55:18Jake (Jake) joins
00:55:52Jake quits [Remote host closed the connection]
01:00:17Hackerpcs quits [Quit: Hackerpcs]
01:02:51Hackerpcs (Hackerpcs) joins
01:14:59Jake (Jake) joins
01:17:40<Jake>(sorry for the disconnects y'all)
02:00:02TheTechRobo is now known as x_AmongUsFan69_x
02:00:09x_AmongUsFan69_x is now known as TheTechRobo
02:02:00<h2ibot>Mgoshawk edited FanFiction.Net (+190): https://wiki.archiveteam.org/?diff=48810&oldid=47849
02:02:01<h2ibot>Paulmorriss edited Flickr (+1371, /* 2020 pricing and campaign */): https://wiki.archiveteam.org/?diff=48811&oldid=47723
02:02:02<h2ibot>Entartet edited List of websites excluded from the Wayback Machine (+54, Added farfrommoscow.com and picrew.me.): https://wiki.archiveteam.org/?diff=48812&oldid=48803
02:02:03<h2ibot>Magmaus3 edited URLTeam (+231, /* Alive */ Add cutt.ly): https://wiki.archiveteam.org/?diff=48813&oldid=48748
02:03:00<h2ibot>ElijahPepe created LGTM.com (+803, Created page with "{{Infobox project | title =…): https://wiki.archiveteam.org/?title=LGTM.com
02:03:01<h2ibot>DJDunsie edited Deathwatch (+227, /* 2022 */ Lexico): https://wiki.archiveteam.org/?diff=48815&oldid=48808
02:33:46JackThompson quits [Ping timeout: 240 seconds]
03:08:03t3 joins
03:11:10t3 quits [Remote host closed the connection]
03:32:22qwertyasdfuiopghjkl joins
03:52:28Craigle quits [Quit: The Lounge - https://thelounge.chat]
03:53:00Craigle (Craigle) joins
03:58:43Nemo_bis joins
04:15:20dunger quits [Ping timeout: 265 seconds]
04:15:41dunger (dunger) joins
04:29:22DogsRNice quits [Read error: Connection reset by peer]
04:30:19<pabs>some gaming company acquisitions: https://www.gamingonlinux.com/2022/08/embracer-group-to-swallow-up-tripwire-tuxedo-labs-the-lord-of-the-rings/
04:33:16HackMii_ quits [Ping timeout: 240 seconds]
04:33:20sec^nd quits [Remote host closed the connection]
04:34:02sec^nd (second) joins
04:35:51HackMii_ (hacktheplanet) joins
05:04:16systwi quits [Ping timeout: 265 seconds]
06:24:14Doranwen quits [Client Quit]
07:01:09systwi (systwi) joins
07:14:54dm4v_ joins
07:17:16dm4v quits [Ping timeout: 240 seconds]
07:17:16dm4v_ is now known as dm4v
07:39:46Arcorann quits [Ping timeout: 240 seconds]
08:00:44Doranwen (Doranwen) joins
08:26:53BlueMaxima quits [Read error: Connection reset by peer]
08:54:45tech_exorcist (tech_exorcist) joins
08:57:55dm4v_ joins
08:58:41dm4v quits [Ping timeout: 265 seconds]
08:58:41dm4v_ is now known as dm4v
08:58:51<Maakuth|m>what timezone is arkiver on? should I try to reach them at US evening hours?
09:10:13sec^nd quits [Remote host closed the connection]
09:11:26sec^nd (second) joins
10:53:50Minkafighter quits [Quit: The Lounge - https://thelounge.chat]
10:54:31Minkafighter joins
11:56:31drexler quits [Remote host closed the connection]
11:56:48drexler joins
12:20:12tech_exorcist quits [Remote host closed the connection]
12:20:17tech_exorcist_ (tech_exorcist) joins
12:44:58Arcorann (Arcorann) joins
13:06:04tech_exorcist_ quits [Remote host closed the connection]
13:06:49tech_exorcist_ (tech_exorcist) joins
13:33:09<TheTechRobo>arkiver: ^
13:36:34<Maakuth|m>I'm UTC+03:00 myself
13:36:54<@arkiver>just leave me a message
14:24:46Arcorann quits [Ping timeout: 240 seconds]
17:00:18tech_exorcist_ leaves
17:39:58<systwi_>Is it safe to simply `cat example.com-00000.warc example.com-00001.warc > example.com.warc`? Do I need to take note of the original input WARC filesizes in case I want to split them up again?
17:41:05<@JAA>Assuming the two input files are valid WARCs, yes, that is safe.
17:41:06<systwi_>The AT wiki mentions [megawarc](https://github.com/alard/megawarc) but I'm not sure if it's still needed.
17:41:49<TheTechRobo>Megawarc is also broken for me, at least for warc.gz.
17:43:18<systwi_>Thanks for the info. I suppose it's probably best to store the original filesizes anyway; that'
17:43:21<systwi_>:-/
17:43:49<systwi_>Thanks for the info. I suppose it's probably best to store the original filesizes anyway; that's what, ~2 KB?
17:45:59<@JAA>megawarc is useful when you need to merge a larger number of WARCs, I guess. It keeps track of the original files and in theory allows extracting to that again (I think). The version on the AT org should work fine as that's used for all projects. It also does error checks and puts the broken files into a tar.
17:47:09<TheTechRobo>JAA: It's possible it's broken for me because of my Python version. I filed an issue a few months ago: https://github.com/ArchiveTeam/megawarc/issues/5
17:47:59<TheTechRobo>Interesting, "move to Python 3" is an Issue. I can't remember if I tried Python 2 or not.
17:48:25<@JAA>TheTechRobo: That doesn't sound right, and I'm pretty sure the targets run Py 3.
17:48:27<TheTechRobo>Yep, it's set to use python 2.
17:48:32<TheTechRobo>JAA: Weird.
17:48:37<TheTechRobo>https://github.com/ArchiveTeam/megawarc/issues/3 is an issue.
17:48:56<@JAA>Issue 5 would indicate that you're giving it something that's neither a .gz nor a .zst file.
17:49:08<@JAA>But also, you should only give it WARC files, not a .warc.os.cdx.gz file.
17:49:35<TheTechRobo>Isn't the point of Megawarc that it converts a directory tree into a WARC, a tar, and a metadata file?
17:50:05<@JAA>converts a collection of WARCs into*, yes
17:50:19<@JAA>It doesn't handle other file formats, nor does it check the file format.
17:50:21<TheTechRobo>From the README:
17:50:21<TheTechRobo> FILE.warc.gz is the concatenated .warc.gz
17:50:21<TheTechRobo> FILE.tar contains any non-warc files from the .tar
17:50:21<TheTechRobo> FILE.json.gz contains metadata
17:50:57<@JAA>I'm pretty sure it never checks whether the file is actually a WARC, only whether it decompresses correctly.
17:51:19<TheTechRobo>Well, it crashes when `Checking 1652740309829c5a3e1fc0bf20-1_1652740337.315711/funeralhome-1934cdbeadc09ac1a98713bb2b1d8ca41f8f2ec1-20220516-223149.warc.gz`, which should be a valid WARC.
17:52:12<TheTechRobo>https://transfer.archivete.am/SWZGI/failed_on.warc.gz is an uploaded version of that WARC.
17:52:57<@JAA>Ok, correction, the packer does indeed still use Python 2.7. Eww...
17:54:52<@JAA>And true, it should append other files to the tar. test_gz would fail, but that only runs for .warc.gz and .warc.zst (the latter with further filename pattern restrictions for $reasons).
17:56:28<@JAA>I'd help with debugging, but I banished Python 2 from my systems a long while ago.
17:56:59<TheTechRobo>Understandable.
17:57:09<TheTechRobo>I did that, until I ran into legacy software with no modern alternative. :/
17:57:20<TheTechRobo>Such as megawarc.
17:58:47tech_exorcist (tech_exorcist) joins
18:03:11tech_exorcist quits [Client Quit]
18:07:50tech_exorcist (tech_exorcist) joins
18:28:06<systwi_>Thanks for the info!
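A minimal sketch of the idea discussed above: concatenate the WARCs byte-for-byte (what `cat` does) while recording each input's name and size, so the combined file can be split again later. The function names and the JSON manifest layout are hypothetical, for illustration only; this is not megawarc's format and it does not validate that the inputs are WARCs.

```python
import json
import os
import shutil


def concat_with_manifest(inputs, output, manifest):
    """Concatenate files byte-for-byte and record each input's name and size."""
    entries = []
    with open(output, "wb") as out:
        for path in inputs:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
            entries.append(
                {"name": os.path.basename(path), "size": os.path.getsize(path)}
            )
    # Hypothetical manifest layout: a JSON list of {"name", "size"} in order.
    with open(manifest, "w") as fh:
        json.dump(entries, fh)
    return entries


def split_from_manifest(combined, manifest, dest_dir):
    """Recreate the original files from the combined file using recorded sizes."""
    with open(manifest) as fh:
        entries = json.load(fh)
    paths = []
    with open(combined, "rb") as src:
        for entry in entries:
            target = os.path.join(dest_dir, entry["name"])
            remaining = entry["size"]
            with open(target, "wb") as out:
                while remaining > 0:
                    chunk = src.read(min(remaining, 1 << 20))
                    if not chunk:
                        raise ValueError("combined file is shorter than the manifest")
                    out.write(chunk)
                    remaining -= len(chunk)
            paths.append(target)
    return paths
```

For example, `concat_with_manifest(["example.com-00000.warc", "example.com-00001.warc"], "example.com.warc", "example.com.json")` would write the merged file plus a small manifest, and `split_from_manifest` reverses it into a directory of your choice.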
18:37:14Nulo quits [Ping timeout: 265 seconds]
18:46:56Nulo joins
18:52:29balrog quits [Quit: Bye]
19:07:29balrog (balrog) joins
19:39:44tech_exorcist quits [Remote host closed the connection]
19:46:13michaelblob_ (michaelblob) joins
19:48:46michaelblob quits [Ping timeout: 240 seconds]
21:36:09tzt (tzt) joins
22:14:33sec^nd quits [Remote host closed the connection]
22:15:11sec^nd (second) joins
22:30:29sec^nd quits [Remote host closed the connection]
22:31:05sec^nd (second) joins
23:07:05BlueMaxima joins
23:50:16sec^nd quits [Ping timeout: 240 seconds]
23:56:07sec^nd (second) joins