00:00:18<h2ibot>JAABot edited CurrentWarriorProject (-2): https://wiki.archiveteam.org/?diff=50433&oldid=50306
00:11:39nicolas17 quits [Client Quit]
00:25:52pabs quits [Quit: Don't rest until all the world is paved in moss and greenery.]
00:32:03pabs (pabs) joins
00:32:32DogsRNice joins
00:38:33nicolas17 joins
00:45:32<flashfire42>fireonlive you still wanted me to hit the polycom domains hard?
00:45:53<fireonlive>pls :)
00:49:39<flashfire42>fireonlive looks like some of them 403 with archivebot. If you wanna babysit the dashboard I can throw them in as fast as possible or I can put them on the backburner til I have time to better monitor it all
00:54:18<fireonlive>have the x over atm sadly so it’ll have to burner at the back
00:54:20<fireonlive>but thanks :)
01:11:54Megame quits [Read error: Connection reset by peer]
01:35:35<h2ibot>Ufarwisan edited Discord (+660, Rename /* Software */ to /* Self-archival */): https://wiki.archiveteam.org/?diff=50434&oldid=48752
02:00:38<h2ibot>TheTechRobo edited Discord (+11, /* Self-archival */ We should probably make the…): https://wiki.archiveteam.org/?diff=50435&oldid=50434
02:05:39<h2ibot>TheTechRobo edited Discord (+28, /* Self-archival */ More details about tools): https://wiki.archiveteam.org/?diff=50436&oldid=50435
02:06:39<h2ibot>TheTechRobo edited Discord (+125, Link my URL extractor): https://wiki.archiveteam.org/?diff=50437&oldid=50436
02:07:14etnguyen03 quits [Ping timeout: 252 seconds]
02:08:03balrog quits [Quit: Bye]
02:09:02balrog (balrog) joins
02:11:41<h2ibot>TheTechRobo edited Vanillo (+117, Appears to be back up, with content dating back…): https://wiki.archiveteam.org/?diff=50438&oldid=41059
02:15:41<h2ibot>TheTechRobo edited Wysp (-1, It is now offline): https://wiki.archiveteam.org/?diff=50439&oldid=50417
02:19:07<nicolas17>thuban: looks like I have to archive all 4 video qualities for the DASH .mpd to work
02:25:20<thuban>interesting
02:26:02<nicolas17>at least ffmpeg/mpv/etc try to read the first segment of *every available alt quality* before they even start playing
02:26:59<nicolas17>if the low quality segment 1 returns 404 then it says the video is corrupted and dies, even if you told it to play 1080p
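A rough sketch of the point nicolas17 makes above: since players probe the first segment of every quality, an archive list needs the first segment of each Representation, not just the one you intend to play. It assumes the manifest uses a `SegmentTemplate` with plain `$RepresentationID$` and `$Number$` placeholders, which the log does not confirm for the RTVE manifests. The sample MPD, the ids `v360`/`v1080`, and the function name `first_segments` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard DASH MPD namespace.
NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}


def first_segments(mpd_xml):
    """Return (representation id, first media segment path) for every quality."""
    root = ET.fromstring(mpd_xml)
    found = []
    for aset in root.iterfind(".//mpd:AdaptationSet", NS):
        # A SegmentTemplate may be shared by all Representations in the set.
        shared = aset.find("mpd:SegmentTemplate", NS)
        for rep in aset.iterfind("mpd:Representation", NS):
            tmpl = rep.find("mpd:SegmentTemplate", NS)
            if tmpl is None:
                tmpl = shared
            if tmpl is None or tmpl.get("media") is None:
                continue  # other addressing modes are not handled in this sketch
            number = tmpl.get("startNumber", "1")
            path = tmpl.get("media")
            path = path.replace("$RepresentationID$", rep.get("id", ""))
            path = path.replace("$Number$", number)
            found.append((rep.get("id"), path))
    return found


# Hypothetical manifest, not taken from the site discussed above.
SAMPLE = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet mimeType="video/mp4">
      <SegmentTemplate media="$RepresentationID$/seg-$Number$.m4s" startNumber="1"/>
      <Representation id="v360" height="360"/>
      <Representation id="v1080" height="1080"/>
    </AdaptationSet>
  </Period>
</MPD>"""

for rep_id, path in first_segments(SAMPLE):
    print(rep_id, path)
```

Paths are relative to the manifest URL (or a `BaseURL` element, if present), so they would still need to be resolved before going into a URL list.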
02:36:20etnguyen03 (etnguyen03) joins
02:48:08<nicolas17>unfortunately archiving all qualities means 10GB per episode ugh
02:48:54systwi_ joins
02:52:17<nicolas17>do I archivebot?
02:53:02<flashfire42>what are you wanting to archivebot?
02:54:07<nicolas17>flashfire42: https://www.rtve.es/play/videos/grand-prix/ spanish TV game show
02:55:58<systwi_>nicolas17: ArchiveBot can't save full television programmes (typically), if that is what you were hoping for.
02:56:25<nicolas17>systwi_: what exactly do you mean by "can't"? file size limit?
02:57:06<systwi_>ArchiveBot's purpose is to save web pages and eventually make them available in https://web.archive.org/
02:57:11<systwi_>https://wiki.archiveteam.org/index.php/ArchiveBot
02:58:00<nicolas17>earlier I asked "video is in DASH format, should I remux it to .mp4 and upload it as an item, or archive the .mpd and video segments in a WARC, or give archivebot a URL list and let it do that for me?" and thuban said "1 and 3 imho"
02:58:09<systwi_>But if the URL to which you had linked were to be saved with ArchiveBot, it would try its best to save any web pages it can find.
02:59:39<systwi_>All three sound good, but I think thuban has a good point, so I second it.
02:59:46<systwi_>1 & 3.
03:00:05<nicolas17>https://transfer.archivete.am/inline/8x4IQ/6939444.txt this is what I planned to give to archivebot, not the web player :)
03:02:34<systwi_>Looks good to me. Thank you for the list. I'll save it with ArchiveBot for you.
03:03:21<nicolas17>note that with multiple video qualities it adds up to 10GB
03:04:32<systwi_>~10GB shouldn't be too problematic.
03:04:34<nicolas17>someone uploaded most or all of the old seasons (1996-2007) to YouTube, probably from personal VHS
03:05:20<systwi_>Going the extra mile is nice. :-)
03:06:09<nicolas17>in fact, I searched for it on youtube to show someone, and that's where I discovered they were about to reboot it this year
03:08:50etnguyen03 quits [Ping timeout: 252 seconds]
03:11:26etnguyen03 (etnguyen03) joins
03:11:42DogsRNice quits [Remote host closed the connection]
03:11:47DogsRNice joins
03:27:58etnguyen03 quits [Client Quit]
03:28:08etnguyen03 (etnguyen03) joins
03:28:40<nicolas17>systwi_: also, the web player loads 6939444_drm.mpd and gets a FairPlay or Widevine license to decrypt it
03:29:28<nicolas17>I asked a friend if he knew how to break widevine nowadays, and then I realized I could just ... remove the "_drm" part of the URL >.>
03:30:18<systwi_>Haha, they store a decrypted version too? Lovely. :-P
03:32:07<nicolas17>I *hope* their paid content for RTVE Play+ subscribers is protected better than that
03:40:31railen63 joins
03:43:22railen69 quits [Ping timeout: 258 seconds]
03:45:45dumbgoy quits [Ping timeout: 265 seconds]
04:00:56sfsdfdsfsdfd joins
04:11:11etnguyen03 quits [Client Quit]
04:23:24BlueMaxima quits [Read error: Connection reset by peer]
04:23:26railen64 joins
04:24:46railen63 quits [Ping timeout: 258 seconds]
04:32:25DogsRNice quits [Read error: Connection reset by peer]
04:59:10railen64 quits [Remote host closed the connection]
05:02:46railen63 joins
05:04:58AmAnd0A quits [Ping timeout: 265 seconds]
05:05:08AmAnd0A joins
05:05:14gfhh joins
05:06:28AmAnd0A quits [Read error: Connection reset by peer]
05:06:45AmAnd0A joins
05:18:38fafa joins
05:19:32fafa quits [Remote host closed the connection]
05:30:22railen69 joins
05:33:58railen63 quits [Ping timeout: 265 seconds]
05:47:27<fireonlive>i hope it isn't
05:47:28<fireonlive>:D
05:47:34<fireonlive>🏴‍☠️
05:57:39<fireonlive>https://old.reddit.com/r/DataHoarder/comments/15k2fa4/what_data_do_you_think_is_at_risk_of_being/jv3sk36/
05:57:42<fireonlive>if only….
05:57:51<fireonlive>someone should mention AT there too :)
06:05:23AmAnd0A quits [Ping timeout: 265 seconds]
06:09:21<@OrIdow6>Hm
06:10:36<@OrIdow6>I'm kinda surprised that AI types don't toss around AT data as much as they seem to, like, pushshift
06:10:54<@OrIdow6>If that happens could put us at risk of being more aggressively blocked
06:26:37<fireonlive>maybe warcs are too difficult for them lol
06:33:26pokechu22 quits [Ping timeout: 252 seconds]
06:46:47nicolas17 quits [Client Quit]
06:49:49railen69 quits [Remote host closed the connection]
06:49:52railen69 joins
06:54:37sfsdfdsfsdfd quits [Remote host closed the connection]
06:54:38qwertyasdfuiopghjkl quits [Remote host closed the connection]
06:59:37qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
07:00:07nfriedly quits [Remote host closed the connection]
07:05:02Unholy236131661 quits [Remote host closed the connection]
07:06:35Unholy236131661 (Unholy2361) joins
07:08:05bilboed quits [Ping timeout: 252 seconds]
07:18:35<thuban>JAA, did you ever hear back from uktrainsim?
07:51:33Arcorann (Arcorann) joins
08:25:47Artem4ikBaik joins
08:32:10railen69 quits [Remote host closed the connection]
08:32:40railen63 joins
08:44:43<Artem4ikBaik>close
08:44:45Artem4ikBaik leaves
08:55:59<h2ibot>Exorcism uploaded File:Isitnormal-logo.png: https://wiki.archiveteam.org/?title=File%3AIsitnormal-logo.png
08:56:00<h2ibot>Exorcism uploaded File:Isitnormal-screenshot.png: https://wiki.archiveteam.org/?title=File%3AIsitnormal-screenshot.png
08:56:59<h2ibot>Exorcism edited Is It Normal? (+65): https://wiki.archiveteam.org/?diff=50442&oldid=50429
09:04:26Naruyoko quits [Remote host closed the connection]
09:04:48Naruyoko joins
09:27:12AmAnd0A joins
09:51:42ehmry quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
09:52:15ehmry joins
09:52:51nfriedly joins
10:00:01railen63 quits [Remote host closed the connection]
10:00:19railen63 joins
10:00:58hackbug quits [Read error: Connection reset by peer]
10:01:43hackbug (hackbug) joins
10:13:13<h2ibot>Exorcism edited Enjin (+37): https://wiki.archiveteam.org/?diff=50443&oldid=49748
10:14:55project10 joins
12:09:23pos12 joins
12:11:55<pos12>Canadian file host filegenie.com will shut down for undisclosed reasons on August 31; most of the links in its sitemap, including FAQ, are dead.
12:13:09T31M quits [Quit: ZNC - https://znc.in]
12:13:27T31M joins
12:15:13<pos12>Filegenie's file URL format is http://wl.filegenie.com/~<username>/<filename> . Websites that contain still-active wl.filegenie.com links should be archived too.
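One possible way to find still-active links in saved page text, given the URL format pos12 describes: a regular expression for `wl.filegenie.com/~<username>/<filename>`. This is only a sketch. The function name, the sample text, and the usernames `alice` and `bob` are invented, and the pattern assumes usernames contain no slashes.

```python
import re

# Matches http(s)://wl.filegenie.com/~<username>/<filename>, stopping at
# whitespace, quotes, or angle brackets so HTML attributes are not swallowed.
FILEGENIE_LINK = re.compile(
    r"https?://wl\.filegenie\.com/~[^/\s\"'<>]+/[^\s\"'<>]+"
)


def extract_filegenie_links(text):
    """Return unique wl.filegenie.com file links in order of first appearance."""
    seen = []
    for match in FILEGENIE_LINK.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


# Hypothetical page content for illustration.
SAMPLE_PAGE = (
    'Get it <a href="http://wl.filegenie.com/~alice/notes.pdf">here</a> '
    "or http://wl.filegenie.com/~bob/a.zip "
    "and again http://wl.filegenie.com/~alice/notes.pdf"
)

for link in extract_filegenie_links(SAMPLE_PAGE):
    print(link)
```

The resulting list could then be deduplicated across pages and handed to ArchiveBot as a URL list.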
12:15:25pos12 quits [Remote host closed the connection]
12:23:45<imer>that sounds difficult to do a comprehensive grab on :|
12:28:44<pabs>needs some search engine queries I guess
12:29:22<pabs>oh, no directory listings :(
12:30:17<pabs>flashfire42 seems to be on it already
12:33:25lflare quits [Quit: Bye]
12:35:06<flashfire42>That's everything from bing anyway
12:37:24etnguyen03 (etnguyen03) joins
12:42:16lflare (lflare) joins
12:42:38<pabs>does bing have a results limit like google/ddg do?
12:42:56<flashfire42>Yes
12:43:40<flashfire42>Alright I am looping on myself, I am going back to bed
12:44:03<pabs>ah, did you try the adding keywords trick from https://wiki.archiveteam.org/index.php/Site_exploration ?
12:44:56<flashfire42>No, because my usual checked-URLs trick doesn't work on those PDFs: it tries to download them straight away instead of opening them in a web browser
12:49:15<pabs>Google/DDG don't find many URLs
13:03:26wyatt8750 quits [Remote host closed the connection]
13:07:59wyatt8740 joins
13:14:34katocala quits [Read error: Connection reset by peer]
14:01:41Arcorann quits [Ping timeout: 258 seconds]
14:19:21W7RFa6AbNFz quits [Read error: Connection reset by peer]
14:19:35W7RFa6AbNFz joins
14:24:29katocala joins
14:27:58nostalgebraist joins
14:43:26that_lurker quits [Client Quit]
14:44:27nostalgebraist quits [Client Quit]
14:46:17that_lurker (that_lurker) joins
14:53:01acocast joins
15:02:18kiryu joins
15:09:08acocast quits [Ping timeout: 265 seconds]
15:17:36nicolas17 joins
15:42:14<h2ibot>TheTechRobo edited The WARC Ecosystem (+713, Add section for people who just want to view…): https://wiki.archiveteam.org/?diff=50444&oldid=50100
15:45:14<h2ibot>Farrukhali6177 edited CNET Forums (+33, /* Shutdown notice */): https://wiki.archiveteam.org/?diff=50445&oldid=48231
15:45:15<h2ibot>Ersatzteilehome edited Discourse (+64): https://wiki.archiveteam.org/?diff=50446&oldid=50234
15:45:16<h2ibot>Ufarwisan edited Discord (+9): https://wiki.archiveteam.org/?diff=50447&oldid=50437
15:45:17<h2ibot>Exorcism edited Deathwatch (+96): https://wiki.archiveteam.org/?diff=50448&oldid=50318
15:52:15<h2ibot>TheTechRobo edited Discord (+178, /* Self-archival */ Add source code licences): https://wiki.archiveteam.org/?diff=50449&oldid=50447
15:57:48<@arkiver>i love all the changes to the wiki lately :)
16:04:02dumbgoy joins
16:20:50nicolas17 quits [Ping timeout: 252 seconds]
16:23:55kiryu quits [Remote host closed the connection]
16:25:36kiryu joins
16:39:16<jacksonchen666>hi, i intend to shutdown my warrior for a system upgrade. however, it seems like it's stuck doing nothing useful (server returned bad response & nearly 16 elapsed job). could i force stop the warrior right now?
16:39:41<jacksonchen666>*nearly 16 hours elapsed job
16:40:26<kiryu>jacksonchen666: It's fine
16:42:00<kiryu>I think you have already got banned and the failed items in the warrior project should return to the backfeed
16:48:55<jacksonchen666>doesn
16:48:57<jacksonchen666>doesn
16:49:00<jacksonchen666>oops again
16:49:50<jacksonchen666>seems like my warrior is still trying for some reason, switched it to another project manually
16:54:20Island_ quits [Read error: Connection reset by peer]
17:08:39Island joins
17:12:35DogsRNice joins
17:13:31<h2ibot>TheTechRobo edited Twitch.tv (+642, #burnthetwitch: Add directory structure and caveat): https://wiki.archiveteam.org/?diff=50450&oldid=50418
17:37:39g2147 joins
18:38:27emberquill08 quits [Quit: The Lounge - https://thelounge.chat]
18:39:13emberquill08 (emberquill) joins
18:41:36g2147 quits [Client Quit]
18:48:38tertu (tertu) joins
18:59:59decky_e quits [Remote host closed the connection]
19:00:21decky_e joins
19:04:36pokechu22 (pokechu22) joins
19:15:32lflare quits [Client Quit]
19:15:32that_lurker quits [Client Quit]
19:15:37that_lurker7 (that_lurker) joins
19:15:49lflare (lflare) joins
19:21:01tertu2 (tertu) joins
19:21:10lflare quits [Killed (nuke.hackint.org (Nickname regained by services))]
19:21:12lflare (lflare) joins
19:21:26katocala quits [Remote host closed the connection]
19:21:28tertu quits [Client Quit]
19:21:28qwertyasdfuiopghjkl quits [Remote host closed the connection]
19:21:28project10 quits [Remote host closed the connection]
19:21:33katocala joins
20:09:35medecau (medecau) joins
20:14:19that_lurker7 is now known as that_lurker
20:16:49project10 joins
20:46:56<@JAA>thuban: I didn't even remember sending that email, but no, I didn't.
20:47:21<thuban>ouch
20:50:26sec^nd quits [Ping timeout: 245 seconds]
20:52:21<pokechu22>Any ideas for what to do with a site like http://www.ericbrasseur.org/? It does a JS challenge of some sort that sets a cookie, and then redirects to a different page. But the challenge seems to fail randomly sometimes too. It seems like useful content at least
21:00:25ThetaDev quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
21:00:32ThetaDev joins
21:01:29PredatorIWD quits [Remote host closed the connection]
21:02:08PredatorIWD joins
21:35:10Jake quits [Ping timeout: 258 seconds]
21:48:52sec^nd (second) joins
21:53:49useretail joins
21:56:20etnguyen03 quits [Ping timeout: 252 seconds]
22:20:17AmAnd0A quits [Ping timeout: 265 seconds]
22:20:42AmAnd0A joins
22:22:34sfsdfdsfsdfd joins
22:50:34BlueMaxima joins
23:14:43etnguyen03 (etnguyen03) joins
23:17:13etnguyen03 quits [Client Quit]
23:17:23etnguyen03 (etnguyen03) joins
23:19:16Megame (Megame) joins
23:32:35AmAnd0A quits [Ping timeout: 252 seconds]
23:33:19AmAnd0A joins
23:39:10<pabs>pokechu22: IIRC JAA had a way to archive stuff that needs a cookie
23:39:29<pabs>JAA: did you end up getting the opensource.com cookie-requiring stuff btw?
23:47:02<flashfire42>any requests for archivebot focus today or just me going on with my ISP hosting stuff?
23:48:15<pokechu22>I'm doing some greek university stuff (for a school that I think was merged into a different one in 2019) but it's not super high priority
23:48:23nicolas17 joins
23:49:13<flashfire42>I can add site:teithessaly.gr to my tabs
23:50:06<pokechu22>Don't worry about it - the stuff I did was the only relevant cached stuff (the other domains are live)
23:50:18<flashfire42>Ah ok
23:50:20<flashfire42>all good
23:50:58<pokechu22>there's also some jank with teithessaly.gr and teilar.gr being the same site (I've already handled teilar.gr for the most part, currently checking subdomains)
23:56:22<nicolas17>JAA: it seems s3://origin.ka.cdn/ is entirely inaccessible now?
23:56:46<nicolas17>its CDNs too