00:28:21 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
00:36:04 | | Earendil7 (Earendil7) joins |
01:05:07 | | Wohlstand (Wohlstand) joins |
01:20:23 | | bf_ quits [Ping timeout: 272 seconds] |
01:25:40 | <pabs> | https://slate.com/business/2024/02/messenger-gawker-vice-media-layoffs-sites-deleted-why.html |
01:38:05 | | Mateon2 joins |
01:40:05 | | Mateon1 quits [Ping timeout: 255 seconds] |
01:40:05 | | Mateon2 is now known as Mateon1 |
01:43:41 | | cascode quits [Ping timeout: 255 seconds] |
01:47:34 | | cascode joins |
01:47:51 | <fireonlive> | oh, on one of the hacker news threads regarding someone wanting to backup a forum someone tried to start up an argument how it was wrong to backup a site or whatever lol, asking if the OP had permission/etc (and we see results like gawker..) |
01:48:00 | <fireonlive> | btw, wtf is https://gawker.com/ now anyways |
01:48:02 | <fireonlive> | o_O |
01:51:25 | <nulldata> | fireonlive - It's a website on the internet. |
01:51:31 | <fireonlive> | true |
01:51:33 | <fireonlive> | :p |
01:54:24 | <@JAA> | No, it's a URL. |
01:54:42 | <nicolas17> | map, territory |
02:00:23 | <pabs> | ugh, the article recommends Webrecorder |
02:03:18 | <fireonlive> | :( |
02:03:28 | | fireonlive sends an angry JAA to slate HQ |
02:23:43 | | @Sanqui quits [Ping timeout: 272 seconds] |
02:39:55 | | cascode quits [Read error: Connection reset by peer] |
02:40:11 | | cascode joins |
03:16:50 | | lennier2_ quits [Ping timeout: 255 seconds] |
03:21:42 | | lennier2_ joins |
03:38:04 | <@OrIdow6> | It mentions everything |
03:38:15 | <@OrIdow6> | It mentions us, the IA, webrecorder, and archive.is/whatever |
03:38:35 | <@OrIdow6> | Everything but Google Web Cache and archive.eu |
03:47:51 | <@JAA> | Looks like there are about 165 domains on our WBM exclusion list that are no longer excluded! I'll update it soon. |
03:59:50 | <fireonlive> | ooh yay :D |
04:03:04 | <@JAA> | Actually, the count might be lower because I checked www and non-www. Will fix that as well while at it (exclusions are usually for entire domains, not just the www subdomain). |
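The double counting JAA mentions (checking both www and non-www hosts) can be avoided by collapsing each host to a single key before counting. This is a minimal sketch, not the script actually used for the wiki list; `strip_www` and `dedupe_hosts` are hypothetical helper names, and it assumes that only a leading `www.` label needs to be ignored.

```python
def strip_www(host: str) -> str:
    """Normalize a hostname and drop one leading 'www.' label (assumed rule)."""
    host = host.strip().lower().rstrip(".")
    return host[4:] if host.startswith("www.") else host


def dedupe_hosts(hosts):
    """Return the sorted set of hosts with www/non-www variants merged."""
    return sorted({strip_www(h) for h in hosts})


if __name__ == "__main__":
    sample = ["www.example.com", "example.com", "Example.org"]
    print(len(dedupe_hosts(sample)), dedupe_hosts(sample))
```

Counting the deduplicated list rather than the raw list gives one entry per domain, which matches the note that exclusions usually apply to entire domains.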
04:05:12 | <fireonlive> | is one of them syd? :p |
04:35:41 | <@JAA> | Oh |
04:35:51 | <@JAA> | Looks like the WBM actually returns inconsistent data. :-| |
04:36:58 | <@JAA> | There are two endpoints that return whether or not a URL is excluded. The one I was looking at is apparently irrelevant but sometimes returns that an excluded domain isn't excluded. |
04:37:14 | | BlueMaxima quits [Read error: Connection reset by peer] |
04:37:49 | <@JAA> | I wonder whether these indicate different kinds of exclusions. |
04:39:00 | <@JAA> | Nevermind then, I guess... |
04:43:43 | <fireonlive> | :| sigh |
05:01:42 | <pabs> | hmm, I don't see any AT mention OrIdow6 ? |
05:03:33 | <@JAA> | > There’s the Archive Team, a volunteer group that [...] |
05:07:16 | <pabs> | ah. I always call AT ArchiveTeam not Archive Team :) |
05:11:19 | <@JAA> | Yeah, Archiveteam, ArchiveTeam, and Archive Team are all used on the wiki. |
05:11:42 | <@JAA> | I also tend to use the middle one. |
05:23:24 | <fireonlive> | from https://archive.eu/: |
05:23:24 | <fireonlive> | >The European Web Archive has been discontinued in Q4 2022. |
05:23:24 | <fireonlive> | >You can access historical websites via intelx.io. You will be redirected in 5 seconds. |
05:23:26 | <fireonlive> | oof. |
05:23:53 | <@JAA> | Yeah :-/ |
05:24:03 | <fireonlive> | >Intelligence X is an independent European technology company founded in 2018 by Peter Kleissner. The company is based in Prague, Czech Republic. Its mission is to develop and maintain the search engine and data archive. |
05:24:04 | <fireonlive> | ew |
05:24:08 | <fireonlive> | sad :/ |
05:25:31 | <fireonlive> | hm, they do tor/i2p as well though - interesting at the very least i guess |
05:43:38 | | Island quits [Read error: Connection reset by peer] |
06:01:59 | | pabs quits [Ping timeout: 255 seconds] |
06:06:07 | | pabs (pabs) joins |
06:19:51 | | wyatt8740 quits [Remote host closed the connection] |
06:22:45 | | wyatt8740 joins |
06:23:50 | | Ketchup901 quits [Remote host closed the connection] |
06:23:58 | | Ketchup901 (Ketchup901) joins |
06:27:33 | | wyatt8740 quits [Ping timeout: 272 seconds] |
06:27:53 | | wyatt8740 joins |
06:31:44 | | Ketchup901 quits [Remote host closed the connection] |
06:32:06 | | Ketchup901 (Ketchup901) joins |
06:33:09 | | Ketchup901 quits [Remote host closed the connection] |
06:33:20 | | Ketchup901 (Ketchup901) joins |
06:41:01 | | Ketchup901 quits [Remote host closed the connection] |
06:41:23 | | Ketchup901 (Ketchup901) joins |
07:22:55 | | bf_ joins |
07:23:03 | | bf_ quits [Remote host closed the connection] |
07:35:19 | | Ruthalas59 quits [Ping timeout: 272 seconds] |
07:43:46 | | Arcorann (Arcorann) joins |
07:46:43 | | cascode quits [Ping timeout: 272 seconds] |
07:48:41 | | cascode joins |
07:50:30 | | Carnildo quits [Remote host closed the connection] |
07:51:20 | | Doranwen quits [Ping timeout: 255 seconds] |
08:00:44 | | Ruthalas59 (Ruthalas) joins |
08:11:07 | | Carnildo joins |
08:11:21 | | Doranwen (Doranwen) joins |
08:15:56 | | cascode quits [Read error: Connection reset by peer] |
08:16:18 | | cascode joins |
08:28:54 | | sec^nd quits [Ping timeout: 255 seconds] |
08:37:22 | | sec^nd (second) joins |
08:38:17 | | sec^nd quits [Remote host closed the connection] |
08:38:35 | | sec^nd (second) joins |
08:42:25 | | sec^nd quits [Remote host closed the connection] |
08:42:59 | | sec^nd (second) joins |
08:50:49 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
09:06:20 | | ace24x joins |
09:10:44 | | ace24x quits [Remote host closed the connection] |
09:11:01 | | ace24x joins |
09:11:16 | | ace24x quits [Remote host closed the connection] |
09:11:16 | | yonerboner quits [Read error: Connection reset by peer] |
09:13:04 | | yonerboner joins |
09:15:00 | | ace24x joins |
09:20:02 | | bf_ joins |
09:21:51 | | ace24x is now authenticated as ace24x |
09:35:17 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
10:00:02 | | Bleo18260 quits [Client Quit] |
10:01:14 | | Bleo18260 joins |
10:24:16 | <@OrIdow6> | Yeah, archive.eu is kind of an old reference at this point :P |
11:02:06 | <xkey> | hej, may I ask what the state of archival of vice.com is or where I can check on that? |
11:05:36 | | akrillic_ joins |
12:38:53 | <imer> | xkey: iirc there was a complete archival last year and newer articles have already been archived incrementally, there seems to be another full archivebot job running for vice.com too |
12:39:34 | <xkey> | imer: thanks! for the last part, are archivebot jobs public to look at? |
12:39:50 | <xkey> | meaning the state they are in |
12:40:43 | <imer> | xkey: yes http://archivebot.com/3 if you search for vice.com, there's also the non /3 ui, but i find this one easier to look at |
12:41:48 | <xkey> | imer: yea thanks a lot! |
12:51:59 | | Arcorann quits [Ping timeout: 272 seconds] |
13:10:50 | | ace24x quits [Client Quit] |
13:12:49 | | blue_0000ff quits [Read error: Connection reset by peer] |
13:13:10 | | blue_0000ff joins |
13:26:31 | | Maika (Maika) joins |
13:28:37 | <bf_> | oh boy |
13:28:45 | <bf_> | softwareheritage indexes their content blobs by sha1 |
13:28:49 | <bf_> | but they use the git blob sha1 |
13:28:57 | <bf_> | and I think git blob sha1 is different than sha1sum file |
13:29:05 | <bf_> | I hope I am wrong |
13:29:12 | <bf_> | https://archive.softwareheritage.org/browse/content/sha1_git:94a9ed024d3859793618152ea559a168bbcbb5e2/raw/ |
13:30:07 | <bf_> | "You see a difference because git hash-object doesn't just take a hash of the bytes in the file - it prepends the string "blob " followed by the file size and a NUL to the file's contents before hashing." |
13:30:08 | <bf_> | good lord |
13:31:14 | <bf_> | ok I'm an idiot :) their id's use git blob hashes but the api accepts query for sha1 and git_sha1 types. puh |
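The difference bf_ describes between a git blob SHA-1 and a plain `sha1sum` can be reproduced with the standard library. This is a small sketch based only on the quoted explanation (git hashes the string "blob ", the size in bytes, a NUL, then the contents); `git_blob_sha1` and `plain_sha1` are illustrative names, and nothing here calls the Software Heritage API.

```python
import hashlib


def plain_sha1(data: bytes) -> str:
    """SHA-1 of the raw bytes, as printed by sha1sum."""
    return hashlib.sha1(data).hexdigest()


def git_blob_sha1(data: bytes) -> str:
    """SHA-1 as computed by git hash-object for a blob.

    Git prepends the header b"blob <size>\\0" to the contents before hashing.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    content = b""  # an empty file is the easiest case to cross-check
    print("sha1     :", plain_sha1(content))
    print("sha1_git :", git_blob_sha1(content))
```

For an empty file the plain SHA-1 is da39a3ee5e6b4b0d3255bfef95601890afd80709 and the git blob SHA-1 is e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, so the two identifiers cannot be swapped when looking up content.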
14:19:29 | | Wohlstand quits [Client Quit] |
15:37:45 | <AK> | https://transfer.archivete.am/ "Upload up to 500 MB", Is this limit correct? 🤔 I'm sure we've had larger files uploaded there |
15:39:22 | <nulldata> | Can someone throw https://www.supermassivegames.com/ into AB? Incoming layoffs. https://twitter.com/SuperMGames/status/1762125258159431750 |
15:39:23 | <eggdrop> | nitter: https://farside.link/nitter/SuperMGames/status/1762125258159431750 |
15:45:09 | <TheTechRobo> | AK: I don't think any of the limits on that page are correct |
15:45:20 | <TheTechRobo> | IIRC they're to discourage random people from using it as file storage |
16:22:07 | | HP_Archivist quits [Client Quit] |
17:07:04 | | etnguyen03 (etnguyen03) joins |
17:47:40 | | akrillic_ quits [Remote host closed the connection] |
18:01:33 | <@JAA> | Correct |
18:02:31 | <@JAA> | nulldata: Ack |
18:13:41 | | Mateon1 quits [Ping timeout: 255 seconds] |
18:38:14 | <pokechu22> | from #archivebot, haven't had a chance to look into this yet: |
18:38:16 | <pokechu22> | 12:46 <moto> MotorTrend+ may shut down soon https://variety.com/2024/digital/news/motortrend-plus-shutting-down-discovery-plus-1235921047/ |
18:38:18 | <pokechu22> | 12:46 <moto> Please archive https://www.motortrend.com/plus |
18:38:20 | <pokechu22> | 12:46 <moto> https://help.motortrendondemand.com/hc/en-us |
18:38:22 | <pokechu22> | 12:46 <moto> Due to region locking, site only allowed American IP addresses |
18:39:41 | | magmaus3 quits [Ping timeout: 272 seconds] |
18:40:19 | | magmaus32 (magmaus3) joins |
19:47:21 | | Island joins |
19:57:11 | | Earendil7 quits [Ping timeout: 255 seconds] |
19:57:59 | | Earendil7 (Earendil7) joins |
20:10:41 | | etnguyen03 quits [Ping timeout: 255 seconds] |
20:18:59 | <Darken> | Could someone archive https://tjharman.com/ with archivebot please (reason being there's not much coverage on the site, lots of links from there archived but there's blog posts dating back to the 2000s that have not been archived on there) |
20:55:12 | | elid joins |
20:57:49 | | elid quits [Remote host closed the connection] |
21:21:11 | | cascode quits [Ping timeout: 272 seconds] |
21:24:21 | | cascode joins |
21:25:10 | | Maika quits [Client Quit] |
21:28:03 | | BlueMaxima joins |
21:31:50 | | cascode quits [Read error: Connection reset by peer] |
21:32:13 | | cascode joins |
22:00:52 | | ThetaDev quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
22:01:05 | | ThetaDev joins |
22:44:47 | <pabs> | from #archivebot: |
22:44:48 | <pabs> | <vesz> Could you please archive following Japanese sites. Sorry for long long list and flooding. |
22:44:49 | <pabs> | <vesz> Mitsuki Imamura's Official Fan Club - Shutdown by March 31, 2024 https://imamuramitsuki-fc.jp/ |
22:44:49 | <pabs> | <vesz> Rilakkuma Official Fan Club shutdown by February 29, 2024 https://rilakkuma-tomonokai.jp/ |
22:44:49 | <pabs> | <vesz> Tokyo Metropolitan Industrial Technology Research Institute's Techno Knowledge Freeway (TKF) - Shutdown by March 31, 2024 https://www.iri-tokyo.jp/site/tkf/ |
22:44:52 | <pabs> | <vesz> Tavigator - travel agency - Shutdown by March 29, 2024 https://www.tavigator.co.jp/ |
22:44:54 | <pabs> | <vesz> DOD Store (Old site) - Shutdown by March 29, 2024 https://ec.dod.camp/ |
22:44:56 | <pabs> | <vesz> Daiichi Sankyo RD Novare Co. Ltd. - Shutdown due to merger, March 31, 2024 https://www.daiichisankyo-rdn.co.jp/ |
22:44:59 | <pabs> | <vesz> Futari Game Plus - Game show hosted by two voice actors - Website shutdown by June 28, 2024. https://nicochannel.jp/futarigameplus/ |
22:45:02 | <pabs> | <vesz> Orix Corp's Online lending service - website shutdown by March 31, 2024 https://biz.orix.co.jp/online-lending/ |
22:45:05 | <pabs> | <vesz> Sunaogatari - Food community shutdown by February 29, 2024 https://sunao-mom.coorum.jp/ |
22:45:07 | <pabs> | <vesz> Hiki Sta - Hikikomori-themed community shutdown by March 22, 2024 https://hkst.gr.jp/ |
22:45:09 | <pabs> | <vesz> Glenfield - Fashion retail shop shutdown by February 29, 2024 https://glenfield.co.jp/ |
22:45:11 | <pabs> | <vesz> NTT Docomo Business DX Store - Shutdown by February 28, 2024 but delayed to May 2024. https://biz-dxstore.docomo.ne.jp/ |
22:45:14 | <pabs> | <vesz> Itonowa - Handmade shop shutdown by April 25, 2024 https://itonowa-lifestyle.jp/ |
22:45:16 | <pabs> | <vesz> Marinero - Official fanclub of Bardral Urayasu futsal club shutdown by April 30, 2024 https://bardral-marinero.net/ |
22:45:21 | <pabs> | <vesz> Cho Sento Puroresu FMW - pro wrestling promotion website shutdown by February 29, 2024 https://fmw.jp/ |
22:45:24 | <pabs> | <vesz> Awesome Store - Bankruptcy filed in January and all shops closed by February 2024. https://awesome-store.jp/ |
22:45:27 | <pabs> | <pabs> vesz: I suggest adding them here, so we archive them just before their shutdown dates instead of now https://wiki.archiveteam.org/index.php/Deathwatch |
22:45:30 | <pabs> | <vesz> Ikuple - child's toy subscription service, shutdown by March 15, 2024 https://ikuple.com/ |
22:45:32 | <pabs> | <vesz> Leap Arrows Ltd Developer Blog - Shutdown by March 31, 2024 https://developer.leap-arrows.jp/ |
22:45:35 | <pabs> | <vesz> Sapporo Apollo Blog - Shutdown by March 2024 http://s-apollo.jugem.jp/ |
22:45:37 | <pabs> | <vesz> Nonbiri Tankentai 3 - Shutdown announced on May 2022 but still active http://tanken183.da-te.jp/ |
22:45:40 | <pabs> | <vesz> Japanese Black Army - Shutdown announced http://blog.livedoor.jp/hisui666/ |
22:45:42 | <pabs> | <vesz> Sitemap: http://blog.livedoor.jp/hisui666/sitemap.xml |
22:45:44 | <pabs> | <vesz> Yukicolo (Note) - Shutdown by March 2024 https://note.com/yukicolo/ |
22:45:46 | <pabs> | <vesz> Life of eSports Games - Shutdown by March 31, 2024 https://vistainfinity.sakura.ne.jp/ |
22:45:48 | <pabs> | <vesz> Twitter de Game Haishin - Shutdown announced on July 2022 but still alive https://meeyamow.com/ |
22:45:53 | <pabs> | <vesz> Hilltop (Yamanoue) Hotel - Closed for long renovation starting February 2, 2024 https://www.yamanoue-hotel.co.jp/ |
22:45:56 | <pabs> | <vesz> Cambridge Research Institute - merger announced will be shutdown by July 2024 https://cambridge-research.jp/ |
22:45:59 | <pabs> | <vesz> Fujiya Fukushima - merger announced will be shutdown by June 1, 2024 https://www.fujiya-fukushima.co.jp/ |
22:46:02 | <pabs> | <vesz> Sumilena - merger announced will be shutdown by April 1, 2024 https://www.sumilena.co.jp/ |
22:46:04 | <pabs> | <vesz> Monobit Engine - merger announced will be shutdown by April 1, 2024 https://www.monobitengine.com/ |
22:46:07 | <pabs> | <vesz> Habitus Care - merger announced will be shutdown by April 1, 2024 https://www.habituscare.co.jp/ |
22:46:10 | <pabs> | <vesz> Souzou - merger announced will be shutdown by April 1, 2024 https://souzoh.com/ |
22:48:15 | <nicolas17> | could have pastebin'd that even if the original poster didn't -.- |
22:48:16 | | etnguyen03 (etnguyen03) joins |
22:48:52 | <steering> | i know your client probably didnt show you but that took you over a minute to post btw |
22:49:44 | <@JAA> | Much easier to search for if it isn't behind a link that will probably be dead in a few years. |
22:56:56 | <pabs> | yeah, I prefer to have it in the logs than a pastebin |
22:57:07 | <pabs> | I expected it would take a while to paste |
23:09:29 | | etnguyen03 quits [Ping timeout: 272 seconds] |
23:34:15 | | marktheworst joins |
23:35:07 | <marktheworst> | I wanted to ask about a site that seems to be excluded from the Wayback Machine but not explicitly |
23:35:24 | <marktheworst> | Does it go on the List of websites excluded from the Wayback Machine wiki page? |
23:40:07 | <pokechu22> | What do you mean by that? |
23:43:49 | <marktheworst> | If you look it up on the Wayback Machine, it says "This page is unavailable for archiving. The server returned code: because server does not respond" |
23:44:03 | <marktheworst> | If you try to archive a new page, it won't show up |
23:44:30 | <marktheworst> | and will show that error message |
23:44:39 | <marktheworst> | no archives exist, seemingly all previous archives were nuked |
23:44:45 | <marktheworst> | The domain in question is divested.dev |
23:46:16 | <marktheworst> | this is different from a "Sorry. This URL has been excluded from the Wayback Machine." but the effect is still the same |
23:50:44 | <h2ibot> | JustAnotherArchivist edited List of websites excluded from the Wayback Machine/Former exclusions (+2941): https://wiki.archiveteam.org/?diff=51782&oldid=51336 |
23:50:45 | <h2ibot> | JustAnotherArchivist edited List of websites excluded from the Wayback Machine (-1139, Rechecked the entire list. Moved the unblocked…): https://wiki.archiveteam.org/?diff=51783&oldid=51778 |
23:51:12 | <@JAA> | Unblocked highlights: members.tripod.com, mitglied.lycos.de, snopes.com |
23:51:36 | <@JAA> | I think geocities.com was already known? |
23:52:04 | <@JAA> | marktheworst: Yes, that's the errors we're recording on the page I happen to have just beem going through. :-) |
23:52:07 | <@JAA> | been* |
23:53:41 | | etnguyen03 (etnguyen03) joins |
23:54:14 | <@JAA> | Additions always welcome! :-) |
23:54:30 | <@JAA> | Well, not really, because it sucks when a list needs to be listed there, but... |
23:54:40 | <marktheworst> | so errors like divested.dev should be recorded too? |
23:56:16 | <@JAA> | Oh, wait, I misread this entirely. Oops. |
23:57:06 | <marktheworst> | Haha no worries |
23:57:18 | <@JAA> | Are you sure there were previous snapshots? |
23:58:04 | <@JAA> | I see that a snapshot of the homepage was made an hour ago (by you?), but it isn't indexed yet. SPN returns that 'delay in registering this snapshot' message. |
23:58:07 | <fireonlive> | :( syd is still on there |
23:58:36 | <fireonlive> | some nice removals though :) |
23:58:54 | <marktheworst> | JAA: yeah I tried to make that snapshot, does it usually take so long to index? |
23:59:00 | <pokechu22> | The "This page is unavailable for archiving" message is generally when it runs into an error checking if the page is currently online and capable of being saved (and I'm guessing with the amount of load currently also affecting web.archive.org/save that check is failing randomly) |
23:59:07 | <@JAA> | Not usually, no, but it happens fairly regularly. |
23:59:27 | <@JAA> | Sometimes, it resolves itself within hours. Sometimes, days. |
23:59:44 | <pokechu22> | sites that are excluded from being saved get a message saying that (though note that sites being excluded from being saved and being excluded from being viewed are different) |
23:59:44 | <marktheworst> | JAA: I can't say for sure but there are plenty of snapshots on archive.ph, I would be surprised if divested somehow never got crawled by WB |