| 00:01:48 | <@JAA> | Spinrilla's CDN is still working, but the website is now redirecting to a Google search for 'animal adoption near me'. Classy. |
| 00:02:47 | <@JAA> | API's still up. |
| 00:05:31 | <@JAA> | Haha qwarc goes brrrrr |
| 00:07:45 | <Jake> | Getting CF errors now. |
| 00:08:39 | <@JAA> | Yup, same :-( |
| 00:09:57 | <@JAA> | Oh, I guess I can run the collected user avatars through AB. |
| 00:11:40 | <@JAA> | api.spinrilla.com is now NXDOMAIN. |
| 00:11:51 | <nicolas17> | D: |
| 00:12:37 | <@JAA> | They're pretty quick. |
| 00:13:13 | <nicolas17> | was that the deadline? |
| 00:13:23 | <@JAA> | 00:00 |
| 00:13:45 | <@JAA> | Oh well, we did what we could. Metadata on up to 450k tracks is missing (though a good chunk will be in mixtape metadata or 404s). |
| 00:15:54 | <@JAA> | I got the first Google redirect at 00:00:28.479Z. |
| 00:20:06 | | Ruthalas5 quits [Client Quit] |
| 00:20:27 | | Ruthalas5 (Ruthalas) joins |
| 00:32:05 | | vitzli quits [Client Quit] |
| 00:43:08 | | TastyWiener95 (TastyWiener95) joins |
| 00:58:13 | | rubberduck quits [Remote host closed the connection] |
| 01:16:29 | | Guest50 quits [Client Quit] |
| 01:30:47 | | dumbgoy_ quits [Read error: Connection reset by peer] |
| 01:31:51 | | dumbgoy joins |
| 01:36:26 | <pabs> | JAA: re opensource.com, the AB continues. the site is much bigger than I thought, not sure when it will finish, might be worth you doing the download stuff earlier rather than later just in case |
| 01:58:01 | <@JAA> | pabs: Hmm, data is in the upload queue, and I can't easily pull it from there to somewhere where I can process it at the moment. :-/ |
| 02:04:19 | | Letur quits [Ping timeout: 265 seconds] |
| 02:05:46 | | tzt quits [Ping timeout: 265 seconds] |
| 02:08:35 | | tzt (tzt) joins |
| 02:09:13 | <nicolas17> | what project can my digitalocean bandwidth help with? |
| 02:12:29 | | rubberduck joins |
| 02:22:02 | <datechnoman> | #urlteam #// and #telegrab nicolas17 |
| 02:22:12 | <datechnoman> | #shreddit should be back up and running shortly but not sure when |
| 02:22:27 | <pabs> | JAA: maybe just do the downloads/ and book sections early and leave the AB WARC processing for later? |
| 02:26:25 | <@JAA> | pabs: Guess so. Will do later. |
| 02:26:40 | <pabs> | cool, thanks! |
| 02:43:17 | | Letur joins |
| 03:00:40 | | dumbgoy_ joins |
| 03:03:50 | | dumbgoy quits [Ping timeout: 252 seconds] |
| 03:09:02 | <@arkiver> | JAA: a note that is always great to read :) |
| 03:42:40 | | Guest50 joins |
| 04:07:04 | | Guest50 quits [Client Quit] |
| 04:37:43 | | marxist_redneck79 joins |
| 04:38:21 | | marxist_redneck79 quits [Remote host closed the connection] |
| 04:42:34 | | marxist_redneck joins |
| 04:59:55 | | redbr joins |
| 05:02:49 | | marxist_redneck quits [Ping timeout: 252 seconds] |
| 05:04:41 | | redbr quits [Client Quit] |
| 05:04:52 | | redbr joins |
| 05:06:58 | | redbr quits [Client Quit] |
| 05:08:06 | | marxist_redneck joins |
| 05:19:32 | | Rotietip joins |
| 05:24:12 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 05:24:16 | <Rotietip> | Hello, a few years ago I downloaded several files from a site in WARC format and uploaded them to https://archive.org/details/gratislibros-books How do I make them visible from the Wayback Machine? |
| 05:24:16 | <Rotietip> | Because when checking http://web.archive.org/web/*/http://escolar1.com.ar/libros* most of the archives inside the WARC are still not indexed. |
| 05:26:38 | <nicolas17> | Rotietip: I think only WARCs uploaded by "trusted" users (such as IA's own crawler or archiveteam projects) appear on the wayback machine |
| 05:28:07 | | marxist_redneck quits [Ping timeout: 252 seconds] |
| 05:39:46 | <Rotietip> | What about https://gist.github.com/Asparagirl/6206247 ? Is it still valid? If so, does it mean that I would have to delete the item and upload it again with this script? |
| 05:43:36 | <Jake> | No, not really. |
| 05:47:22 | | Rotietip quits [Ping timeout: 252 seconds] |
| 06:07:56 | | Island quits [Read error: Connection reset by peer] |
| 06:10:27 | | lexikiq quits [Client Quit] |
| 06:10:55 | | marxist_redneck joins |
| 06:18:10 | | Rotietip joins |
| 06:25:04 | | spirit joins |
| 06:41:48 | <Rotietip> | According to https://wiki.archiveteam.org/index.php/Frequently_Asked_Questions#halp_pls_halp I must have a "whitelisted" account in Internet Archive for the content of a WARC to be indexed in Wayback Machine. Now, where should I request that (the FAQ doesn't make it clear) and how long does it take to get it approved? |
| 06:45:40 | | marxist_redneck quits [Ping timeout: 252 seconds] |
| 06:51:06 | | nic (nic) joins |
| 06:52:56 | | s-crypt|m joins |
| 06:54:21 | | Arcorann (Arcorann) joins |
| 06:58:09 | | Aoede quits [Quit: ZNC - https://znc.in] |
| 06:59:03 | | Aoede (Aoede) joins |
| 07:00:02 | | nfriedly quits [Remote host closed the connection] |
| 07:08:24 | | pabs quits [Ping timeout: 252 seconds] |
| 07:09:04 | | pabs (pabs) joins |
| 07:10:03 | | xkey quits [Quit: xkey] |
| 07:13:41 | | xkey (xkey) joins |
| 07:19:13 | | pabs quits [Ping timeout: 252 seconds] |
| 07:19:52 | | icedice (icedice) joins |
| 07:38:55 | | pabs (pabs) joins |
| 08:34:49 | | ymgve_ joins |
| 08:37:52 | | ymgve quits [Ping timeout: 252 seconds] |
| 08:40:26 | | lennier1 quits [Ping timeout: 252 seconds] |
| 08:41:12 | | lennier1 (lennier1) joins |
| 09:05:03 | | BlueMaxima quits [Client Quit] |
| 09:14:44 | | nfriedly joins |
| 09:30:40 | | TastyWiener95 quits [Ping timeout: 252 seconds] |
| 10:16:08 | | dumbgoy_ quits [Ping timeout: 252 seconds] |
| 11:03:27 | | imer joins |
| 11:26:32 | | systwi_ joins |
| 11:44:24 | | Rotietip quits [Client Quit] |
| 11:53:13 | | sonick (sonick) joins |
| 12:00:54 | | dumbgoy_ joins |
| 12:11:06 | | Chris5010 (Chris5010) joins |
| 12:13:28 | | Nulo quits [Ping timeout: 252 seconds] |
| 12:26:05 | | ehmry joins |
| 12:34:52 | | Rotietip joins |
| 12:51:01 | | Rotietip quits [Ping timeout: 265 seconds] |
| 13:28:38 | | Arcorann quits [Ping timeout: 252 seconds] |
| 13:48:05 | | Guest50 joins |
| 14:24:53 | | HP_Archivist (HP_Archivist) joins |
| 14:36:00 | | Island joins |
| 14:36:05 | | HP_Archivist quits [Client Quit] |
| 14:38:23 | | marxist_redneck joins |
| 14:38:52 | | Guest50 quits [Client Quit] |
| 14:40:58 | | Guest50 joins |
| 14:47:05 | | Guest50 quits [Client Quit] |
| 14:51:29 | | Guest50 joins |
| 14:55:03 | | imer quits [Remote host closed the connection] |
| 14:55:18 | | imer joins |
| 15:04:43 | | hitgrr8 joins |
| 15:05:11 | | imer is now authenticated as imer |
| 15:07:41 | | imer quits [Changing host] |
| 15:07:41 | | imer (imer) joins |
| 15:08:52 | | imer quits [Client Quit] |
| 15:09:11 | | imer (imer) joins |
| 15:25:09 | | redbr joins |
| 15:25:12 | | marxist_redneck quits [Ping timeout: 265 seconds] |
| 15:32:12 | | Guest50 quits [Client Quit] |
| 15:58:31 | | zhongfu quits [Client Quit] |
| 16:02:59 | | zhongfu (zhongfu) joins |
| 16:03:21 | | zhongfu quits [Client Quit] |
| 16:03:26 | | Nulo joins |
| 16:13:54 | | nostalgebraist joins |
| 16:14:26 | | andrew quits [Quit: ] |
| 16:14:44 | | andrew (andrew) joins |
| 16:15:52 | | zhongfu (zhongfu) joins |
| 16:18:47 | | redbr quits [Remote host closed the connection] |
| 16:36:59 | <@arkiver> | I want to set something straight - for quite some time people seem to have been under the impression that accounts on IA need to be 'whitelisted' to get their WARCs into the Wayback Machine. This is not true. The rules for getting a WARC indexed into the Wayback Machine are: 1. the WARC should be in a mediatype=web item, and 2. this item should be in a web collection. |
| 16:51:49 | | andrew quits [Client Quit] |
| 16:52:06 | | andrew (andrew) joins |
| 16:55:15 | | Guest50 joins |
| 16:59:29 | | NF885 joins |
| 17:00:48 | <NF885> | Inactive Twitter accounts are being purged, according to https://twitter.com/elonmusk/status/1655608985058267139?t=xOEoj0itED8FYRM6fiASpw&s=19 |
| 17:04:24 | | NF885 quits [Remote host closed the connection] |
| 17:08:02 | <@arkiver> | right now? |
| 17:09:02 | <andrew> | arkiver: seems like it, the tweet is written in present tense |
| 17:10:06 | <@arkiver> | yeah |
| 17:10:08 | <@arkiver> | fuck musk |
| 17:10:44 | <Exorcism|m> | ban musk pls |
| 17:16:44 | <nicolas17> | send him to mars |
| 17:17:23 | <@arkiver> | if anyone knows of an affected account, please let me know |
| 17:19:15 | <Exorcism|m> | nicolas17: LMAO |
| 17:23:07 | <andrew> | arkiver: seems a bit difficult to figure out who is affected until the account disappears :( |
| 17:23:45 | <@arkiver> | andrew: yeah, if anyone does know of one or more accounts after they have been deleted, let me know |
| 17:23:54 | <@arkiver> | the more the better. but indeed difficult |
| 17:23:59 | <h2ibot> | JustAnotherArchivist edited Imgur (+45, Add IA collection): https://wiki.archiveteam.org/?diff=49751&oldid=49714 |
| 17:29:56 | | Guest50_ joins |
| 17:32:15 | | Guest50_ quits [Client Quit] |
| 17:32:28 | | Guest50 quits [Ping timeout: 252 seconds] |
| 17:57:04 | <h2ibot> | 0KepOnline edited Spore (+53, Added /atom/news): https://wiki.archiveteam.org/?diff=49752&oldid=49743 |
| 17:57:32 | | elon joins |
| 17:58:27 | | TastyWiener95 (TastyWiener95) joins |
| 18:00:50 | | TastyWiener95 quits [Client Quit] |
| 18:01:16 | | TastyWiener95 (TastyWiener95) joins |
| 18:01:27 | | TastyWiener95 quits [Client Quit] |
| 18:02:52 | | TastyWiener95 (TastyWiener95) joins |
| 18:05:05 | | Guest50 joins |
| 18:07:53 | | BearFortress_ quits [Client Quit] |
| 18:17:50 | | Jake quits [Quit: Leaving for a bit!] |
| 18:18:05 | | Jake (Jake) joins |
| 18:18:44 | | BearFortress joins |
| 18:23:58 | | Guest50 quits [Client Quit] |
| 18:40:01 | | BigBrain (bigbrain) joins |
| 18:40:38 | | ZizzyDizzyMC joins |
| 18:42:37 | | Guest50 joins |
| 18:45:10 | | sunny_starscout joins |
| 18:55:59 | | soundguy7440 joins |
| 18:59:15 | <h2ibot> | Entartet edited Twitter (+189, /* Vital Signs */ On 8 May 2023, Elon Musk…): https://wiki.archiveteam.org/?diff=49753&oldid=49715 |
| 18:59:18 | | Guest50 quits [Client Quit] |
| 19:03:49 | | Guest50 joins |
| 20:06:24 | <mgrandi> | We have no idea what that means because musk doesn't even know probably |
| 20:06:57 | <mgrandi> | It could be anything from 0 post accounts that haven't logged in in 10 years to NPR (presumably not logged in in a few weeks) |
| 20:08:01 | <@JAA> | NPR probably does log in regularly precisely to get around the inactivity policy (which says that a login is sufficient, no need for posting). |
| 20:08:44 | <mgrandi> | Yeah, they did email NPR saying "lol should we get rid of your account" so it's at least threatened that they will start reassigning usernames |
| 20:09:29 | <mgrandi> | But I feel that there is at least a grace period because Twitter has never had an option to reassign a username, even after it's banned, so they have to program <whatever> first |
| 20:09:31 | <@JAA> | I hope they replied with a single 💩 emoji like Twitter's press email address does. |
| 20:12:22 | | hitgrr8 quits [Client Quit] |
| 20:15:53 | | sonick quits [Client Quit] |
| 20:17:23 | <madpro|m> | Oh and of course, I guess anyone is welcome to use JAA's snscrape on accounts at risk |
| 20:17:28 | <madpro|m> | https://github.com/JustAnotherArchivist/snscrape |
| 20:17:56 | <madpro|m> | > snscrape --jsonl twitter-user elonmusk > muskytweets.json |
| 20:18:32 | <@JAA> | Yeah, except it's broken currently because Twitter changed shit. |
| 20:19:26 | <nicolas17> | madpro|m: just have to pay $40k for access to the API! |
| 20:19:40 | <madpro|m> | Ah! It broke, yeah |
| 20:20:46 | <madpro|m> | Funny how they aligned the API change and "date TBA" purges like this on purpose, |
| 20:20:48 | <madpro|m> | seeing as Musk never struck me as the kind of guy to be averse specifically to archivists. |
| 20:22:08 | <@JAA> | He does care for having a dataset that other 'AI' companies 'can't' train on. |
| 20:34:16 | | elegan joins |
| 20:35:01 | | TastyWiener95 quits [Ping timeout: 265 seconds] |
| 20:52:56 | | htesttwi joins |
| 20:56:04 | <icee> | I'm new to archiveteam.. I get a whole lot of errors like @ERROR: max connections (-1) reached -- try again later |
| 20:56:07 | <icee> | rsync error: error starting client-server protocol (code 5) at main.c(1863) [sender=3.2.7] |
| 20:56:11 | <icee> | And workers getting stuck for a while |
| 20:56:20 | <icee> | is this normal? Is rsync capacity constrained? |
| 20:57:01 | <nicolas17> | yeah rsync servers got full, it will work eventually as they manage to upload to archive.org |
| 21:05:12 | <htesttwi> | Hello, I'm just visiting this channel because I'm trying to archive all tweets from a deceased user and stumbled upon this on the archiveteam.org wiki. I'm still trying to find my way but I should be fine. |
| 21:12:33 | | leo60228 quits [Quit: ZNC 1.8.2 - https://znc.in] |
| 21:12:55 | | leo60228 (leo60228) joins |
| 21:21:25 | <mgrandi> | @htesttwi: the snscrape project is what you are looking for; however, it's currently broken due to Twitter changes |
| 21:23:23 | | soundguy7440 quits [Remote host closed the connection] |
| 21:24:24 | <@JAA> | The twitter-profile scraper still works (sometimes), but it is limited to about 3200 results, so if the account has more tweets than that (including replies and retweets), you can't get everything with it. |
| 21:28:37 | | imer4 (imer) joins |
| 21:30:04 | | imer quits [Ping timeout: 252 seconds] |
| 21:30:04 | | imer4 is now known as imer |
| 21:40:16 | | elegan quits [Ping timeout: 265 seconds] |
| 21:56:32 | | nostalgebraist quits [Client Quit] |
| 22:04:52 | <lennier1> | Were you saying there was a way to make Twitter search work without an account, JAA? I know how to get lists of user tweet URLs when logged in, but that's really slow and doesn't scale well. |
| 22:05:10 | <@JAA> | lennier1: Yes, it's being worked on. |
| 23:04:40 | | Guest50 quits [Client Quit] |
| 23:05:54 | | Guest50 joins |
| 23:10:34 | | jamesp (jamesp) joins |
| 23:14:22 | | umgr036 joins |
| 23:15:16 | | umgr036 quits [Remote host closed the connection] |
| 23:15:29 | | umgr036 joins |
| 23:18:34 | | BlueMaxima joins |
| 23:27:00 | | tartarus joins |
| 23:29:12 | | sonick (sonick) joins |
| 23:35:18 | | whoami quits [Ping timeout: 265 seconds] |
| 23:46:56 | <mgrandi> | @JAA: apparently the handles will be freed up but the accounts will be "archived" https://twitter.com/elonmusk/status/1655720120440823809?s=20 |
| 23:47:08 | <mgrandi> | Wonder if it's something like Tumblr where they just rename the account |
| 23:58:05 | | whoami (whoami) joins |