00:01:59dm4v quits [Read error: Connection reset by peer]
00:02:10dm4v joins
00:02:13dm4v quits [Changing host]
00:02:13dm4v (dm4v) joins
00:24:36Mayk78 quits [Ping timeout: 258 seconds]
00:26:23Megame (Megame) joins
00:31:00BlueMaxima joins
00:32:44Arcorann_ joins
00:33:39BlueMaxima_ joins
00:36:05Iki quits [Read error: Connection reset by peer]
00:37:38BlueMaxima quits [Ping timeout: 258 seconds]
01:02:33dm4v_ joins
01:02:36dm4v quits [Read error: Connection reset by peer]
01:02:45dm4v_ is now known as dm4v
01:02:47dm4v quits [Changing host]
01:02:47dm4v (dm4v) joins
01:45:17<@JAA>If anyone has an account at the Supercell forums, please get in touch with me. There is a lot of content that is apparently available only to registered users (but without further restrictions like manual approval or whatever). I'd like to archive that as well, although it probably won't go into the WBM.
02:40:18HP_Archivist quits [Ping timeout: 258 seconds]
03:19:27qw3rty_ joins
03:20:00DogsRNice quits [Read error: Connection reset by peer]
03:21:23atomicthumbs quits [Quit: No Ping reply in 180 seconds.]
03:21:25atomicthumbs joins
03:23:14qw3rty__ quits [Ping timeout: 258 seconds]
04:23:30Ajay_m quits [Ping timeout: 250 seconds]
04:25:16Iki joins
04:29:26Ajay_m joins
05:56:57supah__ quits [Ping timeout: 258 seconds]
06:17:01HP_Archivist (HP_Archivist) joins
06:27:52RJHacker48047 quits [Read error: Connection reset by peer]
06:28:51nepeat joins
06:29:48nepeat is now known as RJHacker97494
06:32:44spirit joins
06:35:51Mayk78 joins
06:40:39Mayk78 quits [Read error: Connection reset by peer]
07:05:01Hackerpcs quits [Client Quit]
07:06:32Hackerpcs (Hackerpcs) joins
07:10:00somerando3 joins
07:12:54Hackerpcs quits [Client Quit]
07:14:04Hackerpcs (Hackerpcs) joins
07:16:02<somerando3>@thuban on the rthk podcasts: a better source for metadata than podchaser is the wayback machine copies of the RSS; it has more links than the podchaser dump I posted earlier.
07:16:46<somerando3>I think I saved away the RSS files somewhere, but I can't find them now. They should be pretty easy to get.
07:19:40<somerando3>I threw all the links into jdownloader for open line open view and backchat from the RSS and my podchaser dumps, and have already downloaded it all, though I'm not sure what to do with it. I suspect there are still gaps for open line open view, but I've yet to check, and I don't think they're too big.
07:23:11<somerando3>I was able to download stuff back to Jan 2014. It seems like they implemented their 1000 link limit only a couple years ago, so once you get there the RSS will span back to 2014.
07:23:17<somerando3>There's RSS from before 2014, but it looks like they reorganized things and the links no longer work as-is.
07:24:34<somerando3>The files may still be out there somewhere on a different server, but none of my guesses yielded anything.
07:35:47<thuban>my one reservation with getting metadata directly from the rss feed (i've been getting it from the rthk page for each episode) is that sometimes the descriptions are truncated--will have to poke around and see whether that's a problem in this case / whether i can get it from other sources (wbm copies, podcast scrapers)
07:35:57<thuban>got a link for some of the pre-2014 urls?
08:01:16BlueMaxima_ quits [Client Quit]
08:52:37Mayk joins
09:22:22<themadpro>https://atdash.meo.ws/ requires a log in now?
09:22:33<themadpro>Uh... how do I sign up?
09:32:57<h3ndr1k>I guess it's just for core team members (sorry if you are one :) ). It got overloaded during one of the last projects.
09:45:25HP_Archivist quits [Ping timeout: 258 seconds]
10:08:50<@OrIdow6>I've been told it's broken
11:41:10Iki quits [Read error: Connection reset by peer]
11:46:44EdSavoie_srv joins
12:15:25supah__ joins
12:22:45<pabs>"Now more than ever we need surveillance camera man. ... Unfortunately youtube has repeatedly deleted his videos over the years and we are currently in such a period" https://news.ycombinator.com/item?id=27904820
12:23:50<pabs>aw, none are available on youtube, only on vimeo
12:24:57<EdSavoie_srv>his channel is up though, could they be unlisted?
12:25:09<pabs>perhaps
12:25:25<pabs>I did read about them making lots of videos unlisted recently
12:26:47<EdSavoie_srv>something something "a fate worse than death" er, deletion
12:44:40ddd joins
12:47:53Mayk quits [Ping timeout: 258 seconds]
13:29:14ddd quits [Remote host closed the connection]
13:45:10Megame quits [Client Quit]
13:55:20<somerando3>thuban: https://web.archive.org/web/20130822103939/https://podcast.rthk.hk/podcast/radio1_openline_openview.xml, http://podcast.rthk.org.hk/podcast/media/radio1_openline_openview/radio1_openline_openview_2013072217_1.mp3
13:59:15<somerando3>Wouldn't the metadata from something like podchaser be derived from the rthk RSS, so it'd suffer the same truncation issues? Going to the article pages in the wayback machine seems like a good idea, if they exist.
14:10:44balrog quits [Quit: Bye]
14:17:53<somerando3>oh and fyi I have 125GB of open line open view and 43GB of backchat mp3s/m4as. I don't have the most upload bandwidth/data cap, and there may be (not large) gaps. If it's needed I can upload somewhere. The RTHK archive server is not the fastest to download from.
14:19:55balrog (balrog) joins
14:58:31lunik1 quits [Client Quit]
15:03:03lunik1 joins
15:03:09Mayk78 joins
15:18:55Arcorann_ quits [Ping timeout: 258 seconds]
15:28:47Eighty quits [Quit: leaving]
15:30:14Eighty (Eighty) joins
15:49:53<@JAA>Hah, we tried to archive SketchFab's URL shortener through URLTeam a long while ago. They weren't very happy at the time.
16:25:29ragu joins
16:38:04ragu quits [Client Quit]
16:40:56Wingy9 (Wingy) joins
16:42:29Wingy quits [Ping timeout: 258 seconds]
16:42:29Wingy9 is now known as Wingy
16:43:31wizards quits [Client Quit]
16:46:17wizards joins
17:06:10HP_Archivist (HP_Archivist) joins
17:06:46supah_ joins
17:06:46<nyany>JAA: no, they were not...
17:09:11<nyany>also, OrIdow6 sure is broken
17:10:01ragu joins
17:10:28supah__ quits [Ping timeout: 258 seconds]
17:18:30Jens quits [Killed (NickServ (GHOST command used by jens_!~jens@hackint/user/JENS))]
17:18:45JensRex (JensRex) joins
17:29:08abcde79 joins
17:30:16abcde21 joins
17:30:22ragu quits [Read error: Connection reset by peer]
17:33:02abcde quits [Ping timeout: 244 seconds]
17:33:33abcde79 quits [Ping timeout: 244 seconds]
17:38:19Iki joins
17:39:06supah_ quits [Ping timeout: 250 seconds]
18:29:58DogsRNice (Webuser299) joins
18:50:29ragu joins
19:04:09C4K3 quits [Remote host closed the connection]
19:30:46HP_Archivist quits [Ping timeout: 258 seconds]
20:16:03<billy549>how would one accurately make WARC(s) of an entire site assuming it needs a login cookie?
20:16:13<billy549>since i posted in last channel before, ty JAA
20:16:30<ivan>https://github.com/ArchiveTeam/grab-site#website-requiring-login--cookies
20:16:46<ivan>accuracy, though... YMMV
20:25:16supah_ joins
20:26:05<billy549>ty; what's the best way to open a WARC, fwiw?
20:28:38<Jake>I find that https://replayweb.page/ works great, but there's quite a lot of tools: https://wiki.archiveteam.org/index.php/The_WARC_Ecosystem
20:31:42<@JAA>pywb works pretty well for local playback.
20:34:54<h2ibot>AK edited Hong Kong media (+62, Jobs in progress and done): https://wiki.archiveteam.org/?diff=47000&oldid=46993
20:40:10Pingerfowder quits [Quit: ZNC - https://znc.in]
20:40:59Pingerfowder (Pingerfowder) joins
20:57:44HP_Archivist (HP_Archivist) joins
21:01:29systwi quits [Read error: Connection reset by peer]
21:01:32fuzzy8021 quits [Read error: Connection reset by peer]
21:01:40HP_Archivist quits [Read error: Connection reset by peer]
21:02:08HP_Archivist (HP_Archivist) joins
21:02:12systwi (systwi) joins
21:02:22fuzzy8021 (fuzzy8021) joins
21:02:39ave quits [Quit: Ping timeout (120 seconds)]
21:02:59ave (ave) joins
21:03:52@dxrt quits [Quit: ZNC - http://znc.sourceforge.net]
21:04:10dxrt joins
21:04:12dxrt quits [Changing host]
21:04:12dxrt (dxrt) joins
21:04:12@ChanServ sets mode: +o dxrt
21:06:23<thuban>somerando3: tried some guesses of my own, no luck either :(
21:06:32<thuban>as for the truncation issue, probably, yes, but it's worth checking out. unfortunately i don't think we're going to have much luck with the episode pages--rthk's current setup seems to take them down as soon as they scroll off the 1000-ep backlog, wbm coverage is super spotty (~50 results for this podcast), and pre-2017 they may not have existed at all
21:08:48<thuban>i have also downloaded stuff and will get it on ia eventually, but i can ping you if i find any issues
21:14:17<balrog>did viewing archived tweets on Wayback break?
21:18:29<Jake>I believe some old tweets on twitter had bad captures?
21:30:08HP_Archivist quits [Read error: Connection reset by peer]
21:30:55HP_Archivist (HP_Archivist) joins
21:31:16lun4 quits [Client Quit]
21:31:16ave quits [Client Quit]
21:31:30lun4 (lun4) joins
21:32:33@dxrt quits [Client Quit]
21:32:35flashmeow quits [Quit: ZNC 1.8.2 - https://znc.in]
21:32:56flashmeow (flashmeow) joins
21:33:52ave (ave) joins
21:34:12HackMii quits [Ping timeout: 258 seconds]
21:34:58dxrt joins
21:35:00dxrt quits [Changing host]
21:35:00dxrt (dxrt) joins
21:35:00@ChanServ sets mode: +o dxrt
21:35:51HackMii (hacktheplanet) joins
21:47:37Ajay_m quits [Ping timeout: 258 seconds]
22:11:01Megame (Megame) joins
22:19:13spirit quits [Client Quit]
22:26:24phiresky quits [Ping timeout: 250 seconds]
22:28:15SCSi quits [Ping timeout: 258 seconds]
22:28:36phiresky joins
22:32:47Ajay_m joins
23:35:20qw3rty_ quits [Ping timeout: 258 seconds]
23:37:49lennier1 quits [Quit: Going offline, see ya! (www.adiirc.com)]
23:41:39lennier1 (lennier1) joins
23:44:57supah_ quits [Read error: Connection reset by peer]
23:55:23Arcorann_ joins
23:59:44qw3rty_ joins