00:28:21qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
00:36:04Earendil7 (Earendil7) joins
01:05:07Wohlstand (Wohlstand) joins
01:20:23bf_ quits [Ping timeout: 272 seconds]
01:25:40<pabs>https://slate.com/business/2024/02/messenger-gawker-vice-media-layoffs-sites-deleted-why.html
01:38:05Mateon2 joins
01:40:05Mateon1 quits [Ping timeout: 255 seconds]
01:40:05Mateon2 is now known as Mateon1
01:43:41cascode quits [Ping timeout: 255 seconds]
01:47:34cascode joins
01:47:51<fireonlive>oh, on one of the hacker news threads regarding someone wanting to backup a forum someone tried to start up an argument how it was wrong to backup a site or whatever lol, asking if the OP had permission/etc (and we see results like gawker..)
01:48:00<fireonlive>btw, wtf is https://gawker.com/ now anyways
01:48:02<fireonlive>o_O
01:51:25<nulldata>fireonlive - It's a website on the internet.
01:51:31<fireonlive>true
01:51:33<fireonlive>:p
01:54:24<@JAA>No, it's a URL.
01:54:42<nicolas17>map, territory
02:00:23<pabs>ugh, the article recommends Webrecorder
02:03:18<fireonlive>:(
02:03:28fireonlive sends an angry JAA to slate HQ
02:23:43@Sanqui quits [Ping timeout: 272 seconds]
02:39:55cascode quits [Read error: Connection reset by peer]
02:40:11cascode joins
03:16:50lennier2_ quits [Ping timeout: 255 seconds]
03:21:42lennier2_ joins
03:38:04<@OrIdow6>It mentions everything
03:38:15<@OrIdow6>It mentions us, the IA, webrecorder, and archive.is/whatever
03:38:35<@OrIdow6>Everything but Google Web Cache and archive.eu
03:47:51<@JAA>Looks like there are about 165 domains on our WBM exclusion list that are no longer excluded! I'll update it soon.
03:59:50<fireonlive>ooh yay :D
04:03:04<@JAA>Actually, the count might be lower because I checked www and non-www. Will fix that as well while at it (exclusions are usually for entire domains, not just the www subdomain).
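Editor's note: the double counting described above (the same domain checked once with and once without the www prefix) can be avoided by normalising hostnames before counting. This is a minimal sketch under assumptions, not the script JAA used. The helper names `domain_key` and `unique_domains` and the sample hostnames are hypothetical, and it assumes stripping a leading "www." is enough to identify a domain.

```python
def domain_key(host: str) -> str:
    """Normalise a hostname so www and non-www variants compare equal."""
    host = host.strip().lower().rstrip(".")  # ignore case and a trailing root dot
    if host.startswith("www."):
        host = host[len("www."):]
    return host


def unique_domains(hosts):
    """Return the sorted set of normalised domains found in `hosts`."""
    return sorted({domain_key(h) for h in hosts})


if __name__ == "__main__":
    # Hypothetical sample input: four checked hostnames, two distinct domains.
    checked = ["www.example.com", "example.com", "Example.org", "www.example.org."]
    print(len(checked), "hostnames checked")
    print(len(unique_domains(checked)), "distinct domains:", unique_domains(checked))
```

Counting the normalised set rather than the raw list gives the per-domain figure, which matches how exclusions are usually applied.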
04:05:12<fireonlive>is one of them syd? :p
04:35:41<@JAA>Oh
04:35:51<@JAA>Looks like the WBM actually returns inconsistent data. :-|
04:36:58<@JAA>There are two endpoints that return whether or not a URL is excluded. The one I was looking at is apparently irrelevant but sometimes returns that an excluded domain isn't excluded.
04:37:14BlueMaxima quits [Read error: Connection reset by peer]
04:37:49<@JAA>I wonder whether these indicate different kinds of exclusions.
04:39:00<@JAA>Nevermind then, I guess...
04:43:43<fireonlive>:| sigh
05:01:42<pabs>hmm, I don't see any AT mention OrIdow6 ?
05:03:33<@JAA>> There’s the Archive Team, a volunteer group that [...]
05:07:16<pabs>ah. I always call AT ArchiveTeam not Archive Team :)
05:11:19<@JAA>Yeah, Archiveteam, ArchiveTeam, and Archive Team are all used on the wiki.
05:11:42<@JAA>I also tend to use the middle one.
05:23:24<fireonlive>from https://archive.eu/:
05:23:24<fireonlive>>The European Web Archive has been discontinued in Q4 2022.
05:23:24<fireonlive>>You can access historical websites via intelx.io. You will be redirected in 5 seconds.
05:23:26<fireonlive>oof.
05:23:53<@JAA>Yeah :-/
05:24:03<fireonlive>>Intelligence X is an independent European technology company founded in 2018 by Peter Kleissner. The company is based in Prague, Czech Republic. Its mission is to develop and maintain the search engine and data archive.
05:24:04<fireonlive>ew
05:24:08<fireonlive>sad :/
05:25:31<fireonlive>hm, they do tor/i2p as well though - interesting at the very least i guess
05:43:38Island quits [Read error: Connection reset by peer]
06:01:59pabs quits [Ping timeout: 255 seconds]
06:06:07pabs (pabs) joins
06:19:51wyatt8740 quits [Remote host closed the connection]
06:22:45wyatt8740 joins
06:23:50Ketchup901 quits [Remote host closed the connection]
06:23:58Ketchup901 (Ketchup901) joins
06:27:33wyatt8740 quits [Ping timeout: 272 seconds]
06:27:53wyatt8740 joins
06:31:44Ketchup901 quits [Remote host closed the connection]
06:32:06Ketchup901 (Ketchup901) joins
06:33:09Ketchup901 quits [Remote host closed the connection]
06:33:20Ketchup901 (Ketchup901) joins
06:41:01Ketchup901 quits [Remote host closed the connection]
06:41:23Ketchup901 (Ketchup901) joins
07:22:55bf_ joins
07:23:03bf_ quits [Remote host closed the connection]
07:35:19Ruthalas59 quits [Ping timeout: 272 seconds]
07:43:46Arcorann (Arcorann) joins
07:46:43cascode quits [Ping timeout: 272 seconds]
07:48:41cascode joins
07:50:30Carnildo quits [Remote host closed the connection]
07:51:20Doranwen quits [Ping timeout: 255 seconds]
08:00:44Ruthalas59 (Ruthalas) joins
08:11:07Carnildo joins
08:11:21Doranwen (Doranwen) joins
08:15:56cascode quits [Read error: Connection reset by peer]
08:16:18cascode joins
08:28:54sec^nd quits [Ping timeout: 255 seconds]
08:37:22sec^nd (second) joins
08:38:17sec^nd quits [Remote host closed the connection]
08:38:35sec^nd (second) joins
08:42:25sec^nd quits [Remote host closed the connection]
08:42:59sec^nd (second) joins
08:50:49qwertyasdfuiopghjkl quits [Remote host closed the connection]
09:06:20ace24x joins
09:10:44ace24x quits [Remote host closed the connection]
09:11:01ace24x joins
09:11:16ace24x quits [Remote host closed the connection]
09:11:16yonerboner quits [Read error: Connection reset by peer]
09:13:04yonerboner joins
09:15:00ace24x joins
09:20:02bf_ joins
09:35:17qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
10:00:02Bleo18260 quits [Client Quit]
10:01:14Bleo18260 joins
10:24:16<@OrIdow6>Yeah, archive.eu is kind of an old reference at this point :P
11:02:06<xkey>hej, may I ask what the state of archival of vice.com is or where I can check on that?
11:05:36akrillic_ joins
12:38:53<imer>xkey: iirc there was a complete archival last year and newer articles have already been archived incrementally, there seems to be another full archivebot job running for vice.com too
12:39:34<xkey>imer: thanks! for the last part, are archivebot jobs public to look at?
12:39:50<xkey>meaning the state they are in
12:40:43<imer>xkey: yes http://archivebot.com/3 if you search for vice.com, there's also the non /3 ui, but i find this one easier to look at
12:41:48<xkey>imer: yea thanks a lot!
12:51:59Arcorann quits [Ping timeout: 272 seconds]
13:10:50ace24x quits [Client Quit]
13:12:49blue_0000ff quits [Read error: Connection reset by peer]
13:13:10blue_0000ff joins
13:26:31Maika (Maika) joins
13:28:37<bf_>oh boy
13:28:45<bf_>softwareheritage indexes their content blobs by sha1
13:28:49<bf_>but they use the git blob sha1
13:28:57<bf_>and I think git blob sha1 is different than sha1sum file
13:29:05<bf_>I hope I am wrong
13:29:12<bf_>https://archive.softwareheritage.org/browse/content/sha1_git:94a9ed024d3859793618152ea559a168bbcbb5e2/raw/
13:30:07<bf_>"You see a difference because git hash-object doesn't just take a hash of the bytes in the file - it prepends the string "blob " followed by the file size and a NUL to the file's contents before hashing."
13:30:08<bf_>good lord
13:31:14<bf_>ok I'm an idiot :) their id's use git blob hashes but the api accepts query for sha1 and git_sha1 types. puh
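Editor's note: the difference bf_ describes can be reproduced with the standard library alone. This is a minimal sketch of the two hashes, following the quoted explanation: the git blob hash is the SHA-1 of the string "blob ", the content length in bytes, a NUL byte, and then the content itself. The function names are illustrative and are not part of any Software Heritage API.

```python
import hashlib


def plain_sha1(data: bytes) -> str:
    """SHA-1 of the raw bytes, i.e. what `sha1sum file` prints."""
    return hashlib.sha1(data).hexdigest()


def git_blob_sha1(data: bytes) -> str:
    """SHA-1 as computed by `git hash-object`: header plus content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    content = b"hello\n"
    print("sha1     :", plain_sha1(content))
    print("sha1_git :", git_blob_sha1(content))
```

The two digests differ for the same content, so a lookup has to state which kind of hash it is passing.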
14:19:29Wohlstand quits [Client Quit]
15:37:45<AK>https://transfer.archivete.am/ "Upload up to 500 MB", Is this limit correct? 🤔 I'm sure we've had larger files uploaded there
15:39:22<nulldata>Can someone throw https://www.supermassivegames.com/ into AB? Incoming layoffs. https://twitter.com/SuperMGames/status/1762125258159431750
15:39:23<eggdrop>nitter: https://farside.link/nitter/SuperMGames/status/1762125258159431750
15:45:09<TheTechRobo>AK: I don't think any of the limits on that page are correct
15:45:20<TheTechRobo>IIRC they're to discourage random people from using it as file storage
16:22:07HP_Archivist quits [Client Quit]
17:07:04etnguyen03 (etnguyen03) joins
17:47:40akrillic_ quits [Remote host closed the connection]
18:01:33<@JAA>Correct
18:02:31<@JAA>nulldata: Ack
18:13:41Mateon1 quits [Ping timeout: 255 seconds]
18:38:14<pokechu22>from #archivebot, haven't had a chance to look into this yet:
18:38:16<pokechu22>12:46 <moto> MotorTrend+ may shut down soon https://variety.com/2024/digital/news/motortrend-plus-shutting-down-discovery-plus-1235921047/
18:38:18<pokechu22>12:46 <moto> Please archive https://www.motortrend.com/plus
18:38:20<pokechu22>12:46 <moto> https://help.motortrendondemand.com/hc/en-us
18:38:22<pokechu22>12:46 <moto> Due to region locking, site only allowed American IP addresses
18:39:41magmaus3 quits [Ping timeout: 272 seconds]
18:40:19magmaus32 (magmaus3) joins
19:47:21Island joins
19:57:11Earendil7 quits [Ping timeout: 255 seconds]
19:57:59Earendil7 (Earendil7) joins
20:10:41etnguyen03 quits [Ping timeout: 255 seconds]
20:18:59<Darken>Could someone archive https://tjharman.com/ with archivebot please (reason being there's not much coverage on the site, lots of links from there archived but there's blog posts dating back to the 2000s that have not been archived on there)
20:55:12elid joins
20:57:49elid quits [Remote host closed the connection]
21:21:11cascode quits [Ping timeout: 272 seconds]
21:24:21cascode joins
21:25:10Maika quits [Client Quit]
21:28:03BlueMaxima joins
21:31:50cascode quits [Read error: Connection reset by peer]
21:32:13cascode joins
22:00:52ThetaDev quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
22:01:05ThetaDev joins
22:44:47<pabs>from #archivebot:
22:44:48<pabs><vesz> Could you please archive following Japanese sites. Sorry for long long list and flooding.
22:44:49<pabs><vesz> Mitsuki Imamura's Official Fan Club - Shutdown by March 31, 2024 https://imamuramitsuki-fc.jp/
22:44:49<pabs><vesz> Rilakkuma Official Fan Club shutdown by February 29, 2024 https://rilakkuma-tomonokai.jp/
22:44:49<pabs><vesz> Tokyo Metropolitan Industrial Technology Research Institute's Techno Knowledge Freeway (TKF) - Shutdown by March 31, 2024 https://www.iri-tokyo.jp/site/tkf/
22:44:52<pabs><vesz> Tavigator - travel agency - Shutdown by March 29, 2024 https://www.tavigator.co.jp/
22:44:54<pabs><vesz> DOD Store (Old site) - Shutdown by March 29, 2024 https://ec.dod.camp/
22:44:56<pabs><vesz> Daiichi Sankyo RD Novare Co. Ltd. - Shutdown due to merger, March 31, 2024 https://www.daiichisankyo-rdn.co.jp/
22:44:59<pabs><vesz> Futari Game Plus - Game show hosted by two voice actors - Website shutdown by June 28, 2024. https://nicochannel.jp/futarigameplus/
22:45:02<pabs><vesz> Orix Corp's Online lending service - website shutdown by March 31, 2024 https://biz.orix.co.jp/online-lending/
22:45:05<pabs><vesz> Sunaogatari - Food community shutdown by February 29, 2024 https://sunao-mom.coorum.jp/
22:45:07<pabs><vesz> Hiki Sta - Hikikomori-themed community shutdown by March 22, 2024 https://hkst.gr.jp/
22:45:09<pabs><vesz> Glenfield - Fashion retail shop shutdown by February 29, 2024 https://glenfield.co.jp/
22:45:11<pabs><vesz> NTT Docomo Business DX Store - Shutdown by February 28, 2024 but delayed to May 2024. https://biz-dxstore.docomo.ne.jp/
22:45:14<pabs><vesz> Itonowa - Handmade shop shutdown by April 25, 2024 https://itonowa-lifestyle.jp/
22:45:16<pabs><vesz> Marinero - Official fanclub of Bardral Urayasu futsal club shutdown by April 30, 2024 https://bardral-marinero.net/
22:45:21<pabs><vesz> Cho Sento Puroresu FMW - pro wrestling promotion website shutdown by February 29, 2024 https://fmw.jp/
22:45:24<pabs><vesz> Awesome Store - Bankruptcy filed in January and all shops closed by February 2024. https://awesome-store.jp/
22:45:27<pabs><pabs> vesz: I suggest adding them here, so we archive them just before their shutdown dates instead of now https://wiki.archiveteam.org/index.php/Deathwatch
22:45:30<pabs><vesz> Ikuple - child's toy subscription service, shutdown by March 15, 2024 https://ikuple.com/
22:45:32<pabs><vesz> Leap Arrows Ltd Developer Blog - Shutdown by March 31, 2024 https://developer.leap-arrows.jp/
22:45:35<pabs><vesz> Sapporo Apollo Blog - Shutdown by March 2024 http://s-apollo.jugem.jp/
22:45:37<pabs><vesz> Nonbiri Tankentai 3 - Shutdown announced on May 2022 but still active http://tanken183.da-te.jp/
22:45:40<pabs><vesz> Japanese Black Army - Shutdown announced http://blog.livedoor.jp/hisui666/
22:45:42<pabs><vesz> Sitemap: http://blog.livedoor.jp/hisui666/sitemap.xml
22:45:44<pabs><vesz> Yukicolo (Note) - Shutdown by March 2024 https://note.com/yukicolo/
22:45:46<pabs><vesz> Life of eSports Games - Shutdown by March 31, 2024 https://vistainfinity.sakura.ne.jp/
22:45:48<pabs><vesz> Twitter de Game Haishin - Shutdown announced on July 2022 but still alive https://meeyamow.com/
22:45:53<pabs><vesz> Hilltop (Yamanoue) Hotel - Closed for long renovation starting February 2, 2024 https://www.yamanoue-hotel.co.jp/
22:45:56<pabs><vesz> Cambridge Research Institute - merger announced will be shutdown by July 2024 https://cambridge-research.jp/
22:45:59<pabs><vesz> Fujiya Fukushima - merger announced will be shutdown by June 1, 2024 https://www.fujiya-fukushima.co.jp/
22:46:02<pabs><vesz> Sumilena - merger announced will be shutdown by April 1, 2024 https://www.sumilena.co.jp/
22:46:04<pabs><vesz> Monobit Engine - merger announced will be shutdown by April 1, 2024 https://www.monobitengine.com/
22:46:07<pabs><vesz> Habitus Care - merger announced will be shutdown by April 1, 2024 https://www.habituscare.co.jp/
22:46:10<pabs><vesz> Souzou - merger announced will be shutdown by April 1, 2024 https://souzoh.com/
22:48:15<nicolas17>could have pastebin'd that even if the original poster didn't -.-
22:48:16etnguyen03 (etnguyen03) joins
22:48:52<steering>i know your client probably didnt show you but that took you over a minute to post btw
22:49:44<@JAA>Much easier to search for if it isn't behind a link that will probably be dead in a few years.
22:56:56<pabs>yeah, I prefer to have it in the logs than a pastebin
22:57:07<pabs>I expected it would take a while to paste
23:09:29etnguyen03 quits [Ping timeout: 272 seconds]
23:34:15marktheworst joins
23:35:07<marktheworst>I wanted to ask about a site that seems to be excluded from the Wayback Machine but not explicitly
23:35:24<marktheworst>Does it go on the List of websites excluded from the Wayback Machine wiki page?
23:40:07<pokechu22>What do you mean by that?
23:43:49<marktheworst>If you look it up on the Wayback Machine, it says "This page is unavailable for archiving. The server returned code: because server does not respond"
23:44:03<marktheworst>If you try to archive a new page, it won't show up
23:44:30<marktheworst>and will show that error message
23:44:39<marktheworst>no archives exist, seemingly all previous archives were nuked
23:44:45<marktheworst>The domain in question is divested.dev
23:46:16<marktheworst>this is different from a "Sorry. This URL has been excluded from the Wayback Machine." but the effect is still the same
23:50:44<h2ibot>JustAnotherArchivist edited List of websites excluded from the Wayback Machine/Former exclusions (+2941): https://wiki.archiveteam.org/?diff=51782&oldid=51336
23:50:45<h2ibot>JustAnotherArchivist edited List of websites excluded from the Wayback Machine (-1139, Rechecked the entire list. Moved the unblocked…): https://wiki.archiveteam.org/?diff=51783&oldid=51778
23:51:12<@JAA>Unblocked highlights: members.tripod.com, mitglied.lycos.de, snopes.com
23:51:36<@JAA>I think geocities.com was already known?
23:52:04<@JAA>marktheworst: Yes, that's the errors we're recording on the page I happen to have just beem going through. :-)
23:52:07<@JAA>been*
23:53:41etnguyen03 (etnguyen03) joins
23:54:14<@JAA>Additions always welcome! :-)
23:54:30<@JAA>Well, not really, because it sucks when a list needs to be listed there, but...
23:54:40<marktheworst>so errors like divested.dev should be recorded too?
23:56:16<@JAA>Oh, wait, I misread this entirely. Oops.
23:57:06<marktheworst>Haha no worries
23:57:18<@JAA>Are you sure there were previous snapshots?
23:58:04<@JAA>I see that a snapshot of the homepage was made an hour ago (by you?), but it isn't indexed yet. SPN returns that 'delay in registering this snapshot' message.
23:58:07<fireonlive>:( syd is still on there
23:58:36<fireonlive>some nice removals though :)
23:58:54<marktheworst>JAA: yeah I tried to make that snapshot, does it usually take so long to index?
23:59:00<pokechu22>The "This page is unavailable for archiving" message is generally when it runs into an error checking if the page is currently online and capable of being saved (and I'm guessing with the amount of load currently also affecting web.archive.org/save that check is failing randomly)
23:59:07<@JAA>Not usually, no, but it happens fairly regularly.
23:59:27<@JAA>Sometimes, it resolves itself within hours. Sometimes, days.
23:59:44<pokechu22>sites that are excluded from being saved get a message saying that (though note that sites being excluded from being saved and being excluded from being viewed are different)
23:59:44<marktheworst>JAA: I can't say for sure but there are plenty of snapshots on archive.ph, I would be surprised if divested somehow never got crawled by WB