00:01:48<@JAA>Spinrilla's CDN is still working, but the website is now redirecting to a Google search for 'animal adoption near me'. Classy.
00:02:47<@JAA>API's still up.
00:05:31<@JAA>Haha qwarc goes brrrrr
00:07:45<Jake>Getting CF errors now.
00:08:39<@JAA>Yup, same :-(
00:09:57<@JAA>Oh, I guess I can run the collected user avatars through AB.
00:11:40<@JAA>api.spinrilla.com is now NXDOMAIN.
00:11:51<nicolas17>D:
00:12:37<@JAA>They're pretty quick.
00:13:13<nicolas17>was that the deadline?
00:13:23<@JAA>00:00
00:13:45<@JAA>Oh well, we did what we could. Metadata on up to 450k tracks is missing (though a good chunk will be in mixtape metadata or 404s).
00:15:54<@JAA>I got the first Google redirect at 00:00:28.479Z.
00:20:06Ruthalas5 quits [Client Quit]
00:20:27Ruthalas5 (Ruthalas) joins
00:32:05vitzli quits [Client Quit]
00:43:08TastyWiener95 (TastyWiener95) joins
00:58:13rubberduck quits [Remote host closed the connection]
01:16:29Guest50 quits [Client Quit]
01:30:47dumbgoy_ quits [Read error: Connection reset by peer]
01:31:51dumbgoy joins
01:36:26<pabs>JAA: re opensource.com, the AB continues. the site is much bigger than I thought, not sure when it will finish, might be worth you doing the download stuff earlier rather than later just in case
01:58:01<@JAA>pabs: Hmm, data is in the upload queue, and I can't easily pull it from there to somewhere where I can process it at the moment. :-/
02:04:19Letur quits [Ping timeout: 265 seconds]
02:05:46tzt quits [Ping timeout: 265 seconds]
02:08:35tzt (tzt) joins
02:09:13<nicolas17>what project can my digitalocean bandwidth help with?
02:12:29rubberduck joins
02:22:02<datechnoman>#urlteam #// and #telegrab nicolas17
02:22:12<datechnoman>#shreddit should be back up and running shortly but not sure when
02:22:27<pabs>JAA: maybe just do the downloads/ and book sections early and leave the AB WARC processing for later?
02:26:25<@JAA>pabs: Guess so. Will do later.
02:26:40<pabs>cool, thanks!
02:43:17Letur joins
03:00:40dumbgoy_ joins
03:03:50dumbgoy quits [Ping timeout: 252 seconds]
03:09:02<@arkiver>JAA: a note that is always great to read :)
03:42:40Guest50 joins
04:07:04Guest50 quits [Client Quit]
04:37:43marxist_redneck79 joins
04:38:21marxist_redneck79 quits [Remote host closed the connection]
04:42:34marxist_redneck joins
04:59:55redbr joins
05:02:49marxist_redneck quits [Ping timeout: 252 seconds]
05:04:41redbr quits [Client Quit]
05:04:52redbr joins
05:06:58redbr quits [Client Quit]
05:08:06marxist_redneck joins
05:19:32Rotietip joins
05:24:12qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
05:24:16<Rotietip>Hello, a few years ago I downloaded several files from a site in WARC format and uploaded them to https://archive.org/details/gratislibros-books How do I make them visible from the Wayback Machine?
05:24:16<Rotietip>Because when checking http://web.archive.org/web/*/http://escolar1.com.ar/libros* most of the archives inside the WARC are still not indexed.
05:26:38<nicolas17>Rotietip: I think only WARCs uploaded by "trusted" users (such as IA's own crawler or archiveteam projects) appear on the wayback machine
05:28:07marxist_redneck quits [Ping timeout: 252 seconds]
05:39:46<Rotietip>What about https://gist.github.com/Asparagirl/6206247 ? Is it still valid? If so, does it mean that I would have to delete the item and upload it again with this script?
05:43:36<Jake>No, not really.
05:47:22Rotietip quits [Ping timeout: 252 seconds]
06:07:56Island quits [Read error: Connection reset by peer]
06:10:27lexikiq quits [Client Quit]
06:10:55marxist_redneck joins
06:18:10Rotietip joins
06:25:04spirit joins
06:41:48<Rotietip>According to https://wiki.archiveteam.org/index.php/Frequently_Asked_Questions#halp_pls_halp I must have a "whitelisted" account in Internet Archive for the content of a WARC to be indexed in Wayback Machine. Now, where should I request that (the FAQ doesn't make it clear) and how long does it take to get it approved?
06:45:40marxist_redneck quits [Ping timeout: 252 seconds]
06:51:06nic (nic) joins
06:52:56s-crypt|m joins
06:54:21Arcorann (Arcorann) joins
06:58:09Aoede quits [Quit: ZNC - https://znc.in]
06:59:03Aoede (Aoede) joins
07:00:02nfriedly quits [Remote host closed the connection]
07:08:24pabs quits [Ping timeout: 252 seconds]
07:09:04pabs (pabs) joins
07:10:03xkey quits [Quit: xkey]
07:13:41xkey (xkey) joins
07:19:13pabs quits [Ping timeout: 252 seconds]
07:19:52icedice (icedice) joins
07:38:55pabs (pabs) joins
08:34:49ymgve_ joins
08:37:52ymgve quits [Ping timeout: 252 seconds]
08:40:26lennier1 quits [Ping timeout: 252 seconds]
08:41:12lennier1 (lennier1) joins
09:05:03BlueMaxima quits [Client Quit]
09:14:44nfriedly joins
09:30:40TastyWiener95 quits [Ping timeout: 252 seconds]
10:16:08dumbgoy_ quits [Ping timeout: 252 seconds]
11:03:27imer joins
11:26:32systwi_ joins
11:44:24Rotietip quits [Client Quit]
11:53:13sonick (sonick) joins
12:00:54dumbgoy_ joins
12:11:06Chris5010 (Chris5010) joins
12:13:28Nulo quits [Ping timeout: 252 seconds]
12:26:05ehmry joins
12:34:52Rotietip joins
12:51:01Rotietip quits [Ping timeout: 265 seconds]
13:28:38Arcorann quits [Ping timeout: 252 seconds]
13:48:05Guest50 joins
14:24:53HP_Archivist (HP_Archivist) joins
14:36:00Island joins
14:36:05HP_Archivist quits [Client Quit]
14:38:23marxist_redneck joins
14:38:52Guest50 quits [Client Quit]
14:40:58Guest50 joins
14:47:05Guest50 quits [Client Quit]
14:51:29Guest50 joins
14:55:03imer quits [Remote host closed the connection]
14:55:18imer joins
15:04:43hitgrr8 joins
15:07:41imer quits [Changing host]
15:07:41imer (imer) joins
15:08:52imer quits [Client Quit]
15:09:11imer (imer) joins
15:25:09redbr joins
15:25:12marxist_redneck quits [Ping timeout: 265 seconds]
15:32:12Guest50 quits [Client Quit]
15:58:31zhongfu quits [Client Quit]
16:02:59zhongfu (zhongfu) joins
16:03:21zhongfu quits [Client Quit]
16:03:26Nulo joins
16:13:54nostalgebraist joins
16:14:26andrew quits [Quit: ]
16:14:44andrew (andrew) joins
16:15:52zhongfu (zhongfu) joins
16:18:47redbr quits [Remote host closed the connection]
16:36:59<@arkiver>I want to set something straight - for quite some time people seem to have been under the impression that accounts on IA need to be 'whitelisted' to get their WARCs into the Wayback Machine. This is not true. The rules for getting a WARC indexed into the Wayback Machine are: 1. the WARC should be in a mediatype=web item, and 2. this item should be in a web collection.
16:51:49andrew quits [Client Quit]
16:52:06andrew (andrew) joins
16:55:15Guest50 joins
16:59:29NF885 joins
17:00:48<NF885>inactive Twitter accounts are being purged according to https://twitter.com/elonmusk/status/1655608985058267139?t=xOEoj0itED8FYRM6fiASpw&s=19
17:04:24NF885 quits [Remote host closed the connection]
17:08:02<@arkiver>right now?
17:09:02<andrew>arkiver: seems like it, the tweet is written in present tense
17:10:06<@arkiver>yeah
17:10:08<@arkiver>fuck musk
17:10:44<Exorcism|m>ban musk pls
17:16:44<nicolas17>send him to mars
17:17:23<@arkiver>if anyone knows of an affected account, please let me know
17:19:15<Exorcism|m>nicolas17: LMAO
17:23:07<andrew>arkiver: seems a bit difficult to figure out who is affected until the account disappears :(
17:23:45<@arkiver>andrew: yeah, if anyone does know of one or more accounts after they have been deleted, let me know
17:23:54<@arkiver>the more the better. but indeed difficult
17:23:59<h2ibot>JustAnotherArchivist edited Imgur (+45, Add IA collection): https://wiki.archiveteam.org/?diff=49751&oldid=49714
17:29:56Guest50_ joins
17:32:15Guest50_ quits [Client Quit]
17:32:28Guest50 quits [Ping timeout: 252 seconds]
17:57:04<h2ibot>0KepOnline edited Spore (+53, Added /atom/news): https://wiki.archiveteam.org/?diff=49752&oldid=49743
17:57:32elon joins
17:58:27TastyWiener95 (TastyWiener95) joins
18:00:50TastyWiener95 quits [Client Quit]
18:01:16TastyWiener95 (TastyWiener95) joins
18:01:27TastyWiener95 quits [Client Quit]
18:02:52TastyWiener95 (TastyWiener95) joins
18:05:05Guest50 joins
18:07:53BearFortress_ quits [Client Quit]
18:17:50Jake quits [Quit: Leaving for a bit!]
18:18:05Jake (Jake) joins
18:18:44BearFortress joins
18:23:58Guest50 quits [Client Quit]
18:40:01BigBrain (bigbrain) joins
18:40:38ZizzyDizzyMC joins
18:42:37Guest50 joins
18:45:10sunny_starscout joins
18:55:59soundguy7440 joins
18:59:15<h2ibot>Entartet edited Twitter (+189, /* Vital Signs */ On 8 May 2023, Elon Musk…): https://wiki.archiveteam.org/?diff=49753&oldid=49715
18:59:18Guest50 quits [Client Quit]
19:03:49Guest50 joins
20:06:24<mgrandi>We have no idea what that means because musk doesn't even know probably
20:06:57<mgrandi>It could be anything from 0 post accounts that haven't logged in in 10 years to NPR (presumably not logged in in a few weeks)
20:08:01<@JAA>NPR probably does log in regularly precisely to get around the inactivity policy (which says that a login is sufficient, no need for posting).
20:08:44<mgrandi>Yeah, they did email NPR saying "lol should we get rid of your account" so it's at least threatened that they will start reassigning usernames
20:09:29<mgrandi>But I feel that there is at least a grace period because twitter has never had an option to reassign a username, even after it's banned, so they have to program <whatever> first
20:09:31<@JAA>I hope they replied with a single 💩 emoji like Twitter's press email address does.
20:12:22hitgrr8 quits [Client Quit]
20:15:53sonick quits [Client Quit]
20:17:23<madpro|m>Oh and of course, I guess anyone is welcome to use JAA's snscrape on accounts at risk
20:17:28<madpro|m>https://github.com/JustAnotherArchivist/snscrape
20:17:56<madpro|m>> snscrape --jsonl twitter-user elonmusk > muskytweets.json
20:18:32<@JAA>Yeah, except it's broken currently because Twitter changed shit.
20:19:26<nicolas17>madpro|m: just have to pay $40k for access to the API!
20:19:40<madpro|m>Ah! It broke, yeah
20:20:46<madpro|m>Funny how they aligned the API change and "date TBA" purges like this on purpose,
20:20:48<madpro|m> seeing as Musk never struck me as the kind of guy to be averse specifically to archivists.
20:22:08<@JAA>He does care for having a dataset that other 'AI' companies 'can't' train on.
20:34:16elegan joins
20:35:01TastyWiener95 quits [Ping timeout: 265 seconds]
20:52:56htesttwi joins
20:56:04<icee>I'm new to archiveteam.. I get a whole lot of errors like @ERROR: max connections (-1) reached -- try again later
20:56:07<icee>rsync error: error starting client-server protocol (code 5) at main.c(1863) [sender=3.2.7]
20:56:11<icee>And workers getting stuck for a while
20:56:20<icee>is this normal? Is rsync capacity constrained?
20:57:01<nicolas17>yeah rsync servers got full, it will work eventually as they manage to upload to archive.org
21:05:12<htesttwi>Hello, I'm just visiting this channel because I'm trying to archive all tweets from a deceased user and stumbled upon this on the archiveteam.org wiki. I'm still trying to find my way but I should be fine.
21:12:33leo60228 quits [Quit: ZNC 1.8.2 - https://znc.in]
21:12:55leo60228 (leo60228) joins
21:21:25<mgrandi>@htesttwi: the snscrape project is what you are looking for; however, it's currently broken due to twitter changes
21:23:23soundguy7440 quits [Remote host closed the connection]
21:24:24<@JAA>The twitter-profile scraper still works (sometimes), but it is limited to about 3200 results, so if the account has more tweets than that (including replies and retweets), you can't get everything with it.
21:28:37imer4 (imer) joins
21:30:04imer quits [Ping timeout: 252 seconds]
21:30:04imer4 is now known as imer
21:40:16elegan quits [Ping timeout: 265 seconds]
21:56:32nostalgebraist quits [Client Quit]
22:04:52<lennier1>Were you saying there was a way to make Twitter search work without an account JAA? I know how to get lists of user tweet URLs when logged in, but that's really slow and doesn't scale well.
22:05:10<@JAA>lennier1: Yes, it's being worked on.
23:04:40Guest50 quits [Client Quit]
23:05:54Guest50 joins
23:10:34jamesp (jamesp) joins
23:14:22umgr036 joins
23:15:16umgr036 quits [Remote host closed the connection]
23:15:29umgr036 joins
23:18:34BlueMaxima joins
23:27:00tartarus joins
23:29:12sonick (sonick) joins
23:35:18whoami quits [Ping timeout: 265 seconds]
23:46:56<mgrandi>@JAA: apparently the handles will be freed up but the accounts will be "archived" https://twitter.com/elonmusk/status/1655720120440823809?s=20
23:47:08<mgrandi>Wonder if it's something like Tumblr where they just rename the account
23:58:05whoami (whoami) joins