00:09:00IKI joins
00:34:56HP_Archivist (HP_Archivist) joins
00:35:41mazet joins
00:44:16EColi joins
00:53:23BlueMaxima joins
01:02:39dm4v_ joins
01:04:08dm4v quits [Ping timeout: 258 seconds]
01:04:08dm4v_ is now known as dm4v
01:04:08dm4v quits [Changing host]
01:04:08dm4v (dm4v) joins
01:20:26mutantmonkey quits [Remote host closed the connection]
01:20:44mutantmonkey (mutantmonkey) joins
01:32:23Mineroboter joins
01:34:18Mineroboter_ quits [Ping timeout: 250 seconds]
01:55:01IKI quits [Ping timeout: 244 seconds]
02:06:41IKI joins
02:23:27JensRex quits [Remote host closed the connection]
02:25:40Jens (JensRex) joins
02:34:17IKI quits [Ping timeout: 244 seconds]
02:54:30IKI joins
03:45:42EColi quits [Remote host closed the connection]
03:48:10qw3rty_ joins
03:52:02qw3rty__ quits [Ping timeout: 258 seconds]
03:52:23<Ryz>Heya folks, I need help archiving the actual file downloads of user created levels from a video game called 'Blasterball 3' by WildTangent Games - https://games.wildtangent.com/blasterball3/ - here's the list of 260 user created levels ( https://transfer.notkiska.pw/Om5Ov/games.wildtangent.com-blasterball-3-level-entry-info ) - the latest being
03:52:26<Ryz>http://games.wildtangent.com/blasterball3/Community/level_details.aspx?LevelId=6583 on 2010 March
03:52:31<Ryz>The problem is that because of JS, AB wasn't able to archive the download files
03:52:38<Ryz>Here's the actual file download for example of http://games.wildtangent.com/blasterball3/Community/level_details.aspx?LevelId=6583 - http://games.wildtangent.com/blasterball3/Levels/_Levels/atomicbom_SupermanTruth.bb3
03:53:44godane1 joins
03:56:26godane quits [Ping timeout: 250 seconds]
03:59:01IKI quits [Ping timeout: 244 seconds]
03:59:54DogsRNice quits [Read error: Connection reset by peer]
04:02:52mutantmonkey quits [Remote host closed the connection]
04:03:11mutantmonkey (mutantmonkey) joins
04:05:42<thuban>and you just need the urls?
04:05:58<Ryz>Yeah~
04:06:34<Ryz>Continuing to go through the backlog can be a mess ><;
04:06:43<thuban>ok, lemme take a look
04:07:25etnguyen03 quits [Client Quit]
04:08:56<thuban>there's a 302 which points to the actual file. iirc ab follows 302s; do we want both or just the latter?
04:10:56<Ryz>Oh? What's the link?
04:11:05<thuban>i think both (it's easier anyway)
04:11:09<thuban>https://transfer.notkiska.pw/B3bOh/games.wildtangent.com-blasterball-3-level-download
04:11:39<thuban>it's just 'download' instead of 'details'
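A minimal sketch of the rewrite thuban describes here, swapping 'details' for 'download' in each level page URL. The helper name `to_download_url` and the resulting `level_download.aspx` path are assumptions inferred from that description, not confirmed against the site.

```shell
# Hypothetical helper: turn a level details URL into the matching download URL
# by replacing "level_details" with "level_download" (assumed path name).
to_download_url() {
  printf '%s\n' "$1" | sed 's/level_details/level_download/'
}

# Example with the details URL quoted earlier in the log.
to_download_url 'http://games.wildtangent.com/blasterball3/Community/level_details.aspx?LevelId=6583'
```

Run over a whole list, the same `sed` expression could be applied to the file of details URLs directly instead of one URL at a time.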
04:12:26<Ryz>Thank you so much, I really appreciate your help thuban
04:12:36<thuban>you're welcome!
04:13:03<Ryz>And continuing on me going through the backlog; ever so upset that some of the links I saw the last time I checked just die off... :C
04:22:03<Ryz>Oh, thuban, apparently through AB, it just grabs the redirects, but doesn't go through the redirects because AB applies --no-parent - so doesn't matter if I do "!a <" or "!ao <", welp; that I didn't know... ><;
04:22:15<thuban>oh, right
04:22:22<thuban>i'll just get the redirect info then
04:22:59<Ryz>That would be wonderful if you manage to do it~
04:31:06<thuban>Ryz: https://transfer.notkiska.pw/4zHSi/games.wildtangent.com-blasterball-3-level-file
04:32:38<thuban>btw you mentioned 260 levels earlier, but there are only 250 in the list you gave me; was that just a typo?
04:33:38<Ryz>Yeah, that was a typo, it's 250 levels
04:33:55<Ryz>Each pagination is 20 levels; the last pagination is fortunately 10 levels
04:37:05<Ryz>And yeah, it managed to grab all the levels; thank you again thuban
04:37:30<thuban>np :)
04:39:22<thuban>(`cat input | xargs -L1 curl -Is | grep -oP '^Location: \K.*$' | tee output`--but beware of dos line endings)
04:40:34<@JAA>tr -d '\r' is your friend.
04:42:20<thuban>i forgot to include it and had to dos2unix :<
04:42:59<@JAA>Yeah, I frequently forget it as well and then see the nasty ^M in less.
04:43:00<thuban>curl won't accept urls with trailing \r, but when it prints out headers (such as the location header...) it prints them with trailing \r, because that's in the http spec apparently
04:43:24<thuban>i spent a good five minutes on that one once
04:43:41<@JAA>Yeah, curl -I just dumps the raw headers to stdout, including CRLF line endings. And grep simply treats the CR as part of the line.
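The CRLF issue discussed above can be sketched as a small filter: strip the carriage returns first, then pull out the Location value. The helper name `extract_location` and the sample headers are illustrative assumptions; `sed` is swapped in for `grep -oP` only to avoid depending on PCRE support.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and print the
# value of each Location header, with the CR of the CRLF line ending removed.
extract_location() {
  tr -d '\r' | sed -n 's/^[Ll]ocation: *//p'
}

# Demo on canned headers (made-up example.com URL), so no network is needed.
printf 'HTTP/1.1 302 Found\r\nLocation: http://example.com/a.bb3\r\nContent-Length: 0\r\n\r\n' \
  | extract_location
```

With a real URL list, the filter would sit where the `grep` was, for example `xargs -L1 curl -Is < input | extract_location > output`, assuming `input` holds one URL per line with Unix line endings.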
04:49:31<Ryz>Websites with Adobe Flash are going to be an extra pain in the ass...
05:28:09Arcorann (Arcorann) joins
05:31:59hooway joins
06:24:56superkuh__ joins
06:25:30Wayward- quits [Ping timeout: 250 seconds]
06:26:46Wayward (wayward) joins
06:27:40superkuh_ quits [Ping timeout: 258 seconds]
06:34:22space quits [Read error: Connection reset by peer]
06:36:37space (space) joins
06:59:09benjins quits [Read error: Connection reset by peer]
07:03:10<@JAA>Looks like clay.io, a free HTML5 game platform, disappeared sometime in late 2020.
07:10:34Arcorann quits [Ping timeout: 250 seconds]
07:16:23Sylirana quits [Ping timeout: 244 seconds]
07:16:50Sylirana (Sylirana) joins
07:19:14wessel1512 quits [Ping timeout: 250 seconds]
07:20:34wessel1512 joins
07:22:38Arcorann (Arcorann) joins
07:25:43<masterX244>discovery crawl failed... c# tool assumed that the result count is accurate. It wasn't ==>crash... got to hack together a workaround and rerun the thing
07:26:08<masterX244>(stupid dopostback-asp pagination)
08:00:50LeGoupil joins
08:14:14Arcorann quits [Remote host closed the connection]
08:14:33Arcorann (Arcorann) joins
08:39:34Zopolis435 quits [Ping timeout: 244 seconds]
08:40:56BlueMaxima quits [Read error: Connection reset by peer]
08:49:53Arcorann quits [Ping timeout: 258 seconds]
08:54:07Webuser466 joins
08:56:09Mineroboter quits [Client Quit]
08:58:21Mineroboter joins
09:06:23Webuser466 quits [Remote host closed the connection]
09:44:24lunik1 quits [Ping timeout: 250 seconds]
09:47:33lunik1 joins
10:17:38Webuser466 joins
10:31:04benjins joins
10:47:54Arcorann (Arcorann) joins
10:56:46Barto quits [Ping timeout: 258 seconds]
11:00:36grawity quits [Ping timeout: 258 seconds]
11:01:44SvenGarlic joins
11:34:56Barto (Barto) joins
11:36:35pcr leaves
11:46:10@Fusl quits [Excess Flood]
11:46:27Fusl (Fusl) joins
11:46:27@ChanServ sets mode: +o Fusl
11:53:35pcr joins
11:58:39Webuser466 quits [Remote host closed the connection]
12:01:20SvenGarlic leaves
12:01:53SvenGarlic41 joins
12:02:07SvenGarlic41 is now known as SvenGarlic
12:29:46etnguyen03 (etnguyen03) joins
13:01:48grawity (grawity) joins
13:09:54sliccricc (sliccricc) joins
13:39:49aleph quits [Client Quit]
13:41:13billy549 quits [Quit: ZNC - https://znc.in]
13:42:08aleph joins
14:53:11Somebody2 (Somebody2) joins
14:53:26<Somebody2>Hi, I'm (finally) back!
14:53:45<Somebody2>Can someone approve the outstanding new pages on the wiki?
14:56:44t32 joins
14:59:44xit quits [Quit: The Lounge - https://thelounge.chat]
15:02:29xit joins
15:12:39<@EggplantN>Somebody2 someone will shortly :)
15:12:51<Somebody2>Thanks!
15:29:21Iki joins
15:33:03katocala quits [Remote host closed the connection]
15:40:04katocala joins
15:49:26<purplebot>Reddit edited by Iki (+151, +<s>URLTeam</s> URLs connection) just now -- https://www.archiveteam.org/?diff=46543&oldid=46513
15:49:26<purplebot>Yahoo! Answers edited by Iki (+219, Answering some frequent questions. …) just now -- https://www.archiveteam.org/?diff=46544&oldid=46531
15:49:26<purplebot>URLs edited by Iki (+117, +specific sources) just now -- https://www.archiveteam.org/?diff=46545&oldid=46210
15:49:26<purplebot>TikTok edited by Iki (+158, /* Archival Locations */ +deindex …) just now -- https://www.archiveteam.org/?diff=46546&oldid=45504
15:49:26<purplebot>Category:YouTube edited by Nintendofan885 (+20, +[[Category:Google]]) just now -- https://www.archiveteam.org/?diff=46547&oldid=44609
15:50:26<purplebot>Bandcamp created by JesseW (+1617, more details on URLs) just now -- https://www.archiveteam.org/?diff=46548&oldid=0
15:51:40<@arkiver>Somebody2: those yours? ^
15:51:44<@arkiver>just approved
15:54:03billy549 (Billy549) joins
16:01:54Arcorann quits [Ping timeout: 258 seconds]
16:34:33LeighR (LeighR) joins
16:40:49<Somebody2>yep
16:42:22LeGoupil quits [Client Quit]
16:53:12lennier1 quits [Client Quit]
16:54:46lennier1 (lennier1) joins
17:15:22<Larsenv>It would be nice if someone could grab the videos of http://www.inchwormanimation.com/
17:17:57<Ryz>Larsenv, make it more like my message: explain where the stuff is and what the situation is~ Take a look at my earlier messages for example~
17:18:47<Larsenv>it's a DSiWare program similar to Flipnote Studio. the MP4s of the videos don't seem to be grabbed by ArchiveBot when I threw it in there
17:20:13<thuban>yeah, they're embedded by javascript
17:20:20<thuban>i'll have a look
17:24:56<@EggplantN>FLIPNOTE
17:24:58<@EggplantN>OMG
17:25:01<@EggplantN>i feel old now Larsenv
17:25:19<Larsenv>hehe
17:38:12<thuban>Larsenv: will going through the "studios" get me all of the movies?
17:38:19<Larsenv>ye
17:38:24<thuban>k cool
17:44:51<thuban>do you want any data or metadata besides the actual video urls?
17:46:28<thuban>(the mp4s are named by uuid; you could match each to its id/name/user with the stuff archivebot _did_ download but it would be a little bit of a pain)
17:48:31<Larsenv>thuban: sure
17:48:35<Larsenv>the thumbnails would be cool
17:50:13<thuban>oh, the full-size ones? gotcha
17:51:55<thuban>do you want metadata or no
17:57:09spirit quits [Client Quit]
18:37:50systwi quits [Ping timeout: 250 seconds]
18:38:30systwi (systwi) joins
18:40:47Jens quits [Killed (NickServ (GHOST command used by jens_!~jens@hackint/user/JENS))]
18:40:58LeonardoSaponara (LeonardoSaponara) joins
18:41:02JensRex (JensRex) joins
18:41:19<thuban>Larsenv: https://transfer.notkiska.pw/emcRQ/inchworm_urls.txt
18:41:37<thuban>https://transfer.notkiska.pw/Hg60s/inchworm_metadata_byuser.json / https://transfer.notkiska.pw/JvBas/inchworm_metadata_byuuid.json
18:45:56<thuban>(there's a request to something called "makeMP4.php" that afaict does nothing except confirm that the file exists--i tried several movies at random without having hit it and they all looked fine--but lmk if any of those fail)
18:55:21<masterX244>Backup of TM Exchange running... Wrote a tool to enumerate all track URLs since they use crappy ASP pagination ==> https://github.com/masterX244/TMExchange-Enumerator
19:11:58<masterX244>Result of the run will be uploaded straight to archive.org as usual
19:14:12t32 quits [Remote host closed the connection]
19:16:06Iki quits [Ping timeout: 244 seconds]
19:22:07<Larsenv>thuban: thank you! running now :)
19:23:20<thuban>you're welcome!
19:48:28aleph quits [Ping timeout: 250 seconds]
19:48:56pcr leaves
19:48:58pcr joins
19:49:16pcr leaves
19:49:17pcr joins
20:09:28<@arkiver>masterX244: are these simple unauthenticated requests?
20:09:39<@arkiver>if yes, are you getting them in WARCs?
20:17:12godane1 quits [Ping timeout: 258 seconds]
20:24:05EColi joins
20:30:49EColi quits [Remote host closed the connection]
20:46:11<Ryz>Heya folks, I found what seems to be an alternative way to search videos on YouTube, being http://idiotbox.codemadness.org/ - I'm not too sure if there's a use for it
20:47:42<Ryz>The filter options in the dropdown menu don't seem to work
20:49:00<Ryz>When searching - for example http://idiotbox.codemadness.org/?q=match-3+games&o=relevance - each entry is a YouTube link but something like https://www.youtube.com/embed/zDBVG_9GKkA - http://idiotbox.codemadness.org/?q=Best%20Indie%20Games&o=relevance (which is unfortunately not going into the user but another search query), and
20:49:02<Ryz>https://www.youtube.com/feeds/videos.xml?channel_id=UCCd3jyJmOFzkJEMj5Bp89rw
20:49:10<Ryz>Has anyone seen this last link before?
21:04:12<tech234a>Google will be discontinuing some Feedburner features in July 2021, including browser-friendly viewing, email subscriptions, and password protection. https://support.google.com/feedburner/answer/10483501
21:11:04SvenGarlic quits [Remote host closed the connection]
21:22:47superkuh__ is now known as superkuh
21:23:10LeighR quits [Remote host closed the connection]
21:26:30<masterX244>tm-exchange is simple unauthenticated
21:26:37<masterX244>only issue was viewstate crap for pagination
21:27:15<masterX244>already crawling on my grab-site
21:30:48<masterX244>the pagination itself is not really wayback-friendly
21:32:24<masterX244>(no URL parameter at all, might do a rerun of my scraper with a WARC proxy)
21:33:42<masterX244>arkiver: ping
21:34:02<masterX244>https://nplusc.de/grabsite/
21:50:25aleph joins
22:00:03mutantmonkey quits [Remote host closed the connection]
22:00:18mutantmonkey (mutantmonkey) joins
22:02:16aleph quits [Client Quit]
22:03:25aleph joins
22:04:29<masterX244>200k of the track detail pages captured already. Running the urllist with recursion depth 2 to catch linked replays and tracks
22:22:39hooway quits [Read error: Connection reset by peer]
22:22:47Iki joins
22:43:42hilda joins
22:45:06Iki quits [Remote host closed the connection]
23:00:52aleph quits [Ping timeout: 250 seconds]
23:25:40<@OrIdow6>Ban in #archiveteam: spammer in #kickthebucket and the Yahoo Answers channel
23:26:35<@OrIdow6>The first time I've kicked or banned someone (besides myself) from something in about 7 years
23:26:52Webuser631 joins
23:36:41Lord_Nightmare quits [Quit: ZNC - http://znc.in]
23:44:50Lord_Nightmare (Lord_Nightmare) joins
23:50:27Webuser631 quits [Ping timeout: 244 seconds]