00:17:39qwertyasdfuiopghjkl quits [Remote host closed the connection]
00:18:28jacobk quits [Ping timeout: 268 seconds]
00:19:31upintheairsheep joins
00:24:09upintheairsheep quits [Remote host closed the connection]
00:24:32tomorrowInstallment joins
00:24:47upintheairsheep joins
00:25:47<upintheairsheep>lennier1 Thank you, writing a scraper would be super easy thanks to your help.
00:26:48<tomorrowInstallment>hello hello... at the risk of being told off (I'm probably the tenth person to ask this today): how are we looking in terms of saving twitter?
00:28:27<TheTechRobo>tomorrowInstallment: Archiving EVERYTHING would likely take too long to be feasible, although there might be news in that regard.
00:28:38<andrew>tomorrowInstallment: surprisingly, you are not the tenth person to ask this. as far as I know, AT has not started a proper Twitter archival project. I have personally been working on a Twitter scraper, and I think I may have a workable prototype
00:29:01<andrew>but I'm going to need a lot more resources than I have right now to actually make it happen
00:29:04<TheTechRobo>andrew: WARC?
00:29:30<andrew>I don't think WARC makes sense for this. we don't have enough space to store full web pages
00:29:47<andrew>and the webpages themselves are JavaScript rendered
00:29:55<andrew>so the least bad solution appears to be saving API JSON
00:30:19Atom__ joins
00:31:10<tomorrowInstallment>I'd love to hack together something that warriors can get to work on.
00:31:10upintheairsheep quits [Remote host closed the connection]
00:31:10Atom-- quits [Read error: Connection reset by peer]
00:31:23lennier2 joins
00:31:25upintheairsheep joins
00:31:53<andrew>based on my calculations and a Pushshift analysis of Snowflake IDs, if you manage to juggle enough guest tokens, it is definitely feasible to archive around 99% of all tweets made since 2017 within the next few months
00:32:42lennier2_ joins
00:33:18tomorrowRemoval joins
00:33:49<tomorrowRemoval>Hello, I was tomorrowInstallment (seems like my internet has decided to die)
00:34:02<andrew>to repeat, based on my calculations and a Pushshift analysis of Snowflake IDs, if you manage to juggle enough guest tokens, it is definitely feasible to archive around 99% of all tweets made since 2017 within the next few months
00:34:29<andrew>it's probably made easier by the fact that it's going to be difficult for Twitter to deploy new anti-abuse mechanisms due to Elon's management
00:34:30<@arkiver>tomorrowRemoval: have you come here to remove tomorrowInstallment from the chat :P
00:34:53<TheTechRobo><andrew> so the least bad solution appears to be saving API JSON
00:34:56<TheTechRobo>You can do that with WARC :-)
00:35:03<upintheairsheep>lennier1 Hello, can you decrypt this samsung smart tv firmware, and upload the decrypted files to the internetarchive and post the link for me to try to reverse engineer the store itself? https://wiki.samygo.tv/index.php?title=Extracting_the_ES-series_firmware
00:35:11<tomorrowRemoval>I think twitter still has a few months of life left in it
00:35:25<TheTechRobo>WARC doesn't necessarily mean full webpages, it's just an archival format
00:35:43<tomorrowRemoval>But I've also heard that multiple critical infra teams at twitter have completely resigned
00:35:49<tomorrowRemoval>Oh, and the world cup is on next Monday, and it's going to be extreme traffic o'clock
00:35:54tomorrowInstallment quits [Ping timeout: 265 seconds]
00:35:54lennier1 quits [Ping timeout: 265 seconds]
00:36:04lennier2_ is now known as lennier1
00:36:06<tomorrowRemoval>see, the removal worked!
00:36:14<@arkiver>tomorrowRemoval: congrats ;)
00:36:23<upintheairsheep>Firmware: https://www.samsung.com/uk/support/model/UE55ES8000UXXU/ Script to use: https://github.com/george-hopkins/samygo-patcher/blob/master/samygo_patcher.py
00:36:46<upintheairsheep>The github version is more up to date, but still requires python 2.7
00:37:06<@arkiver>python2.7 is definitely dead by now
00:37:21lennier2 quits [Ping timeout: 265 seconds]
00:37:30<upintheairsheep>I could do this myself but i can't use python 2.7 with cryptography frameworks due to software limitations on iOS.
00:37:44<upintheairsheep>It's our only option.
00:37:55<andrew>so, here's the idea, offered free of charge, and without warranty: you have warriors mint guest tokens from Twitter from the large pool of IPs available and submit them to a central server. each guest token can make 180 status lookup requests per 15 minutes, and each token lasts 10800 seconds or so (I have not verified this myself). the central server checks out guest tokens to warriors, who use them to scrape tweets until
00:37:56<andrew>their rate limit is exhausted, then check them back in to the central server, which waits until the limit has reset
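The checkout/checkin scheme described above can be sketched as follows. The 180-requests-per-15-minutes limit and ~10800-second token lifetime are the (unverified) figures from the chat; all names here (`GuestToken`, `TokenPool`) are illustrative, not from any real codebase:

```python
import time
from dataclasses import dataclass

# Figures quoted in the chat (unverified there): 180 status lookups per
# 15-minute window per guest token, ~10800-second token lifetime.
RATE_LIMIT = 180
WINDOW = 15 * 60
TOKEN_LIFETIME = 10800

@dataclass
class GuestToken:
    value: str
    remaining: int = RATE_LIMIT   # requests left in the current window
    window_reset: float = 0.0     # when the rate-limit window resets
    born: float = 0.0             # when the token was minted

class TokenPool:
    """Central-server side: check guest tokens out to warriors and back in."""

    def __init__(self, clock=time.time):
        self.clock = clock        # injectable for testing
        self.tokens = []

    def add(self, value):
        self.tokens.append(GuestToken(value, born=self.clock()))

    def checkout(self):
        """Return a usable token, or None if all are exhausted or expired."""
        now = self.clock()
        for tok in self.tokens:
            if now - tok.born > TOKEN_LIFETIME:
                continue  # token expired; a fresh one must be minted
            if tok.remaining == 0:
                if now < tok.window_reset:
                    continue  # still inside the rate-limit window
                tok.remaining = RATE_LIMIT  # window has reset
            return tok
        return None

    def checkin(self, tok, requests_used):
        tok.remaining = max(0, tok.remaining - requests_used)
        if tok.remaining == 0:
            tok.window_reset = self.clock() + WINDOW
```

In a real deployment the warrior would also report how many requests it actually spent, and the server would mint replacement tokens as old ones age out.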
00:37:56<tomorrowRemoval>also I think twitter stated, somewhere back in 2015, that they receive about 6k tweets a second
00:38:49<andrew>see here for the Pushshift analysis of snowflakes: https://docs.google.com/document/d/1xVrPoNutyqTdQ04DXBEZW4ZW4A5RAQW2he7qIpTmG-M/edit
00:38:59<tomorrowRemoval>ooooh
00:39:20<andrew>I propose you hand out jobs for e.g. scraping all tweets with sequence ID 0 within, say, a minute, or hour
00:39:37<andrew>then later on, you move on to higher sequence IDs that yield progressively fewer tweets to complete the archive
00:39:54<tomorrowRemoval>right, low hanging fruit first, always
00:40:17<andrew>to determine the machine IDs that are generating tweets, you can use the search API or sample the time space randomly until you are reasonably confident of the result
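The sequence-ID jobs and machine-ID sampling described above both rest on the public Snowflake layout: a 41-bit millisecond timestamp since Twitter's custom epoch, a 10-bit machine ID, and a 12-bit per-machine sequence number. A minimal sketch of both directions:

```python
TWITTER_EPOCH_MS = 1288834974657  # Twitter's custom epoch (2010-11-04 UTC)

def decode_snowflake(tweet_id):
    """Split a tweet ID into (unix_ms, machine_id, sequence) -- useful for
    sampling observed tweets to learn which machine IDs are active."""
    return (
        (tweet_id >> 22) + TWITTER_EPOCH_MS,  # 41-bit ms timestamp
        (tweet_id >> 12) & 0x3FF,             # 10-bit machine ID
        tweet_id & 0xFFF,                     # 12-bit sequence number
    )

def candidate_ids(start_ms, end_ms, machine_ids, sequence=0):
    """Enumerate every possible tweet ID with a fixed sequence number over a
    wall-clock range (Unix ms) -- one 'job' in the proposed scheme."""
    for ms in range(start_ms, end_ms):
        offset = ms - TWITTER_EPOCH_MS
        for machine in machine_ids:
            yield (offset << 22) | (machine << 12) | sequence
```

For scale: a one-minute job at sequence 0 with, say, 30 active machine IDs is 60,000 ms x 30 = 1.8M candidate IDs, which would then be batched into 100-ID status-lookup requests.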
00:40:50<andrew>oh, and I should probably mention that for every request, you can request up to 100 tweets. so that 180 requests per 15 minutes is more like up to 18,000 tweets per 15 minutes
00:41:37upintheairsheep leaves
00:42:06<andrew>the fun thing is that guest tokens need not be used on the same IP they were generated on
00:42:07<tomorrowRemoval>here's the scuttlebutt, btw:
00:42:09<tomorrowRemoval>> Twitter currently has no engineers remaining on the team(s) that maintained their monorepo, build system, caching team, search team, timeline team, DNS, DHCP, NTP and egress proxy teams
00:42:31<@arkiver>tomorrowRemoval: source of that?
00:42:49<andrew>you can literally mint guest tokens over their Tor Onion hidden service and use them on clearnet, and the onion service's rate limit for minting guest tokens appears to be much higher than for a normal IP address
00:43:47<andrew>alternatively, you can use services like Luminati or Stormproxies to mint guest tokens, which you then hand to warriors or something
00:44:15<andrew>it's unclear whether there is a per IP rate limit, the only rate limit I can see is per guest token
00:45:09<andrew>alright, there's essentially the results of my prototyping from the past few days, here's hoping someone at AT turns it into reality so I don't have to :)
00:45:13<tomorrowRemoval>All I can say is that it's from someone who I most certainly trust and knows their way around the tech circle :/
00:45:24<tomorrowRemoval>Sorry, it's probably not the answer you're looking for!
00:45:28<andrew>"sources familiar with the matter"
00:45:40<@arkiver>tomorrowRemoval: i'll consider it useless information then
00:45:55<tomorrowRemoval>Take it with a horse-licking-block grain of salt, indeed
00:46:18<tomorrowRemoval>(i did say scuttlebutt!)
00:46:22<andrew>I don't doubt that Elon's mismanagement of Twitter makes it very much endangered
00:47:02<andrew>to shrink the scope of this project to something more manageable, I suggest starting at December 2017 - that's when the LoC stopped ingesting the Firehose
00:47:42<andrew>each tweet is around 600 bytes when compressed, so ~100 TiB per year of tweets you want to scrape
00:48:13<andrew>if you don't care about pictures, it's a very manageable amount of data
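The storage estimate above checks out against the ~6k-tweets-per-second figure quoted earlier in the chat; both inputs are estimates from the conversation, not measured values:

```python
# Both figures come from the chat: ~6k tweets/second (Twitter's own
# circa-2015 number) and ~600 bytes per tweet compressed (andrew's estimate).
TWEETS_PER_SECOND = 6_000
BYTES_PER_TWEET = 600
SECONDS_PER_YEAR = 365 * 24 * 3600

bytes_per_year = TWEETS_PER_SECOND * BYTES_PER_TWEET * SECONDS_PER_YEAR
tib_per_year = bytes_per_year / 2**40
print(f"~{tib_per_year:.0f} TiB per year of tweets")  # roughly 100 TiB/year
```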
00:48:48<tomorrowRemoval>no way we can do pictures too
00:48:50<tomorrowRemoval>honestly
00:49:06<andrew>if you restrict pictures to only tweets with at least 100 likes or retweets, it's probably manageable
00:51:57<andrew>btw, if any AT folks want access to my prototype Rust code, I am happy to share
00:52:10<tomorrowRemoval>omg another rustacean!
00:52:50<andrew>🦀🦀🦀
00:52:51dasineura (dasineura) joins
00:55:42wywin joins
00:58:22wywin leaves
01:02:40qwertyasdfuiopghjkl joins
01:02:43<tomorrowRemoval>arkiver: would this be any good? https://twitter.com/alexeheath/status/1593399683086327808 - it's not as detailed as the scuttlebutt...
01:04:23<tomorrowRemoval>i've just realised how ironic it is the news is being shared through twitter
01:05:07<joepie91|m>well, it went out in style
01:05:46<joepie91|m>the news of its demise being shared on twitter right now and predicted by dril 5 years prior
01:06:01<joepie91|m>truly a twitter ending
01:06:11<@arkiver>5 years prior a prediction of elon musk taking over?
01:06:20<joepie91|m>arkiver: not quite; https://twitter.com/dril/status/900592164589248513
01:06:45<joepie91|m>(there's always a dril tweet)
01:07:20<@arkiver>he may just miss 2022
01:08:12<joepie91|m>idk, Twitter is estimated to have lost 88% of its entire employee headcount by now, the offices are locked, the world cup is approaching
01:08:44<joepie91|m>seems quite possible to make 2022 still
01:08:58<andrew>insert this is fine gif here
01:11:56<@JAA>The website of the original 'this is fine' comic is down. This is fine.
01:12:04<@arkiver>ohno
01:12:25<JTL>fitting
01:17:03<tomorrowRemoval>you know how when you're watching a disaster compilation and you know something terrible is about to happen but you just can't look away
01:18:24<tomorrowRemoval>i'm gonna catch some Zs, but I'm 100% interested in working on something tomorrow
01:20:32<schwarzkatz|m>arkiver, have you seen what I wrote earlier about uploadir?
01:20:51<@arkiver>no
01:20:55<@arkiver>checking
01:23:30<@arkiver>schwarzkatz|m: no response yet?
01:23:35<@arkiver>feel free to PM me the information
01:24:03<schwarzkatz|m>Sadly no
01:41:19pk joins
01:41:34pk quits [Remote host closed the connection]
01:55:44Lord_Nightmare quits [Quit: ZNC - http://znc.in]
02:01:34Lord_Nightmare (Lord_Nightmare) joins
02:07:26<TheTechRobo>arkiver: Found a source: https://www.reddit.com/r/DataHoarder/comments/yy7tig/backup_twitter_now_multiple_critical_infra_teams/
02:15:28qwertyasdfuiopghjkl quits [Ping timeout: 265 seconds]
02:18:04blackle joins
02:22:40qwertyasdfuiopghjkl joins
02:24:38zaza joins
02:24:54<zaza>hi
02:25:20<dasineura>https://twitter.com/PopBase/status/1593427523206934529
02:25:24<dasineura>never seen anything like this
02:27:34Anthony joins
02:28:01zaza quits [Remote host closed the connection]
02:33:26michaelblob (michaelblob) joins
02:35:20jman005 joins
02:36:58anononymous_penguin joins
02:43:07anelki (anelki) joins
02:46:52mut4ntm0nkey quits [Remote host closed the connection]
02:48:29mut4ntm0nkey (mutantmonkey) joins
02:53:39lennier1 quits [Client Quit]
02:53:56lennier1 (lennier1) joins
02:56:26cascode joins
02:56:55<lennier1>Can a site the size of Twitter run on autopilot? I guess we're about to find out.
03:00:22schr0z1ng3r joins
03:00:41<tech234a>I speculate that it will probably have some outages/reliability issues but I doubt it will disappear completely
03:01:03<tech234a>there are still some people working at the company for now
03:02:01Hackerpcs quits [Client Quit]
03:03:53Hackerpcs (Hackerpcs) joins
03:04:59<@JAA>In the same way that a driverless train will keep running, I guess.
03:09:36Entropy joins
03:09:59Entropy quits [Remote host closed the connection]
03:18:50Earl joins
03:27:45<Earl>what’s the status with twitter?
03:30:32<Frogging101>The market for internet-ruining is about to get shaken up.
03:30:36<@JAA>https://transfer.archivete.am/inline/S38yt/fire.gif
03:31:23misbeseem joins
03:32:43<Arcorann>What was the channel for Twitter archiving discussion again
03:34:27<Earl>Is there one? This page just led me here https://wiki.archiveteam.org/index.php/Twitter
03:35:48<@JAA>There isn't one.
03:36:32sonick quits [Client Quit]
03:37:08<@JAA>We had one back on EFnet for a while when they were contemplating nuking inactive accounts, but that never happened, so the channel wasn't recreated after we moved here.
03:40:21surebet joins
03:42:38Earl quits [Remote host closed the connection]
03:43:26surebet quits [Remote host closed the connection]
03:47:04misbeseem quits [Remote host closed the connection]
04:17:20<mind_combatant>what's the easiest way to queue up around 1500 twitter URLs to be archived and end up on the wayback machine?
04:17:52<mind_combatant>preferably skipping any that are already saved there
04:21:08<@JAA>Define 'twitter URLs'? Tweets, users, something else?
04:22:55Earl joins
04:23:00Earl quits [Remote host closed the connection]
04:29:12<mind_combatant>specifically tweets, all in the form of "https://twitter.com/i/web/status/<id>"
04:34:02Iki1 joins
04:38:15Iki quits [Ping timeout: 276 seconds]
04:38:43nematode joins
04:39:55Lord_Nightmare quits [Client Quit]
04:40:08tomorrowRemoval quits [Client Quit]
04:40:08qwertyasdfuiopghjkl quits [Client Quit]
04:40:08anononymous_penguin quits [Client Quit]
04:40:08blackle quits [Client Quit]
04:40:08Anthony quits [Client Quit]
04:40:08cascode quits [Client Quit]
04:40:08schr0z1ng3r quits [Client Quit]
04:40:08jman005 quits [Client Quit]
04:40:17qwertyasdfuiopghjkl joins
04:40:19Lord_Nightmare (Lord_Nightmare) joins
04:43:45nick joins
04:44:04nick quits [Remote host closed the connection]
04:44:14nick123456 joins
04:46:35Anthony joins
04:46:36<nick123456>just wondering if twitter will be a warrior project, seeing as it seems to be unstable and dying
04:47:05cascode joins
04:47:37<@JAA>mind_combatant: You can send me a list, and I'll run it through the machinery. Should eventually show up in the WBM.
04:48:29<@JAA>Or if those tweets are all from one account (or a small selection of them), we could just run that through socialbot.
04:50:31<Anthony>Google+ is now trending on Twitter.
04:51:49jacobk joins
04:54:19<mind_combatant><JAA> "mind_combatant (Archie): You can..." <- i assume you mean in a PM, and as a .txt file?
04:54:45<mind_combatant>oh, whoops, forgot, i shouldn't do Matrix-style replies in here
04:55:00<@JAA>mind_combatant: Yeah (or here if you don't mind it being public). You can upload it to https://transfer.archivete.am/
05:01:44maybe joins
05:02:22nick123456 quits [Remote host closed the connection]
05:05:48maybe quits [Remote host closed the connection]
05:11:54lun4 (lun4) joins
05:12:34ivan (ivan) joins
05:31:07Island joins
05:55:11tyoma joins
05:55:13megaminxwin joins
05:58:03<megaminxwin>question, because im not really sure about the best way to go about this: im wanting to archive all the tweets + images/videos of various people i follow, ive worked out how to get the json file of the users in question via snscrape + twarc, but im not sure how to get the files
05:58:45<megaminxwin>im assuming parsing the json file, but im not sure how well that would work or if theres a better method
05:59:03<megaminxwin>plus of course this doesnt work for private accounts i follow
05:59:15<megaminxwin>any suggestions? thanks
06:01:20jacobk quits [Ping timeout: 268 seconds]
06:11:01tyoma quits [Remote host closed the connection]
06:11:18BlueMaxima quits [Read error: Connection reset by peer]
06:12:39Dudebloke joins
06:15:50jacobk joins
06:19:12mut4ntm0nkey quits [Ping timeout: 255 seconds]
06:25:03Ketchup901 quits [Ping timeout: 255 seconds]
06:25:40Ketchup901 (Ketchup901) joins
06:27:43Dudebloke quits [Remote host closed the connection]
06:32:33mut4ntm0nkey (mutantmonkey) joins
06:32:45<lennier1>megaminxwin: You might check out the Twitter Media Downloader Chrome extension. Be sure to click "No Media" to also get text-only tweets. It does work with private accounts you follow to some extent (might not get really old tweets because the Twitter API doesn't return them so there's really no way to get them unless you already know the link). Would admittedly be annoying if you follow a lot of accounts. https://chrome.google.com/webstore/detail/twitter-media-downloader/cblpjenafgeohmnjknfhpdbdljfkndig
06:38:30jacobk_ joins
06:38:51jacobk quits [Client Quit]
06:38:51Lord_Nightmare quits [Client Quit]
06:38:51qwertyasdfuiopghjkl quits [Client Quit]
06:38:51megaminxwin quits [Client Quit]
06:38:51cascode quits [Client Quit]
06:38:51Anthony quits [Client Quit]
06:38:53Lord_Nightmare2 (Lord_Nightmare) joins
06:39:00cascode joins
06:39:24Lord_Nightmare2 is now known as Lord_Nightmare
06:50:54Nick joins
06:51:46Nick is now known as NickNick
06:53:19<NickNick>So this week I've been working on exporting my own data from Twitter, and I just thought to come by here to see if anyone's attempting to take on that behemoth?
07:02:49NickNick quits [Remote host closed the connection]
07:04:01Nick joins
07:04:43Nick is now known as NickNick
07:07:19pabs quits [Ping timeout: 268 seconds]
07:08:28sec^nd quits [Remote host closed the connection]
07:09:24sec^nd (second) joins
07:09:50Island quits [Read error: Connection reset by peer]
07:13:11b joins
07:13:34pabs (pabs) joins
07:16:11sonick (sonick) joins
07:20:38b quits [Remote host closed the connection]
07:20:38NickNick quits [Remote host closed the connection]
07:20:38cascode quits [Remote host closed the connection]
07:23:58atphoenix quits [Ping timeout: 268 seconds]
07:25:30atphoenix (atphoenix) joins
07:27:56Anthony joins
07:35:27Anthony quits [Remote host closed the connection]
07:38:19Arachnophine3 (Arachnophine) joins
07:40:03<sonick>Does anyone know why the ArchiveBot job for the GeoLog project (https://geolog.mydns.jp/) has stopped?
07:41:48Arachnophine3 quits [Changing host]
07:41:48Arachnophine3 (Arachnophine) joins
07:42:14<sonick>Job id: 7dv1ztme3pksk96o7m168n1l3
07:44:08<ivan>that seems like a question for #archivebot
07:46:26<sonick>ivan thanks!
08:00:10<IDK>JAA: https://www.businessinsider.com/twitter-offices-shutting-down-after-elon-musk-ended-remote-work-2022-11?r=US&IR=T
08:00:41<IDK>https://usercontent.irccloud-cdn.com/file/MlvUuEZa/image.png
08:01:34<@JAA>Why are you pinging me about this?
08:01:49<IDK>wrong channel
08:03:55<IDK>but yeah, users are speculating the site will shut down in the near future over the mass employee exit
08:12:54icryclanteat joins
08:14:24sec^nd quits [Ping timeout: 255 seconds]
08:20:21sec^nd (second) joins
08:22:43megaminxwin joins
08:24:34sepro0 (sepro) joins
08:25:01sepro quits [Ping timeout: 268 seconds]
08:25:01sepro0 is now known as sepro
08:27:36<megaminxwin>lennier1: okay well i found the firefox version and am using that rn; so theres no real way to get tweets past the 3200 api limit? i thought the snscraper could go past that
08:29:04<theblazehen|m>megaminxwin: snscrape to get the actual tweets, then python script to iterate over that data grabbing the actual images?
08:29:13<lennier1>For private accounts, yes. It's not a problem with public accounts.
08:37:10qwertyasdfuiopghjkl joins
08:38:19<megaminxwin>...im currently using that addon now, and its at over 4200 tweets so far
08:38:43<megaminxwin>theblazehen|m: thats what i was thinking, trouble is im really quite bad at python scripting
08:39:14sepro quits [Ping timeout: 265 seconds]
08:39:26<megaminxwin>i can convert the data to a json file with twarc and that does have links to the media, and then i imagine i can use some combination of jq and curl, but god knows how
08:39:33<megaminxwin>and yeah that doesnt work with private accounts
08:39:59sepro (sepro) joins
08:40:47<megaminxwin>okay weve hit 4500, so either the 3200 tweet api limit is no more (considering everything else i wouldnt be surprised if that just fell) or this addon is doing something. strange
08:44:00namwen joins
08:51:38<lennier1>The addon definitely does search for public accounts, if that's what you're trying.
08:52:12Ketchup901 quits [Ping timeout: 255 seconds]
08:55:00Ketchup901 (Ketchup901) joins
08:57:54<theblazehen|m>https://gist.github.com/theblazehen/6077c25577bf3579c44b9eff26c4901a Not fully tested
09:00:23<megaminxwin>will try and report back, cheers
09:01:07<theblazehen|m>File created with snscrape --jsonl --progress twitter-user Foone > foone_tweets.jsonl
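A minimal sketch of the "iterate over the JSONL and grab the media" idea, independent of theblazehen's gist (which isn't reproduced here). Note the JSON field names (`media`, `fullUrl`, `variants`, `url`) are assumptions based on snscrape's Photo/Video item types and may differ between snscrape versions:

```python
import json

def media_urls(jsonl_path):
    """Yield direct media URLs from an `snscrape --jsonl` dump.
    NOTE: field names are assumed from snscrape's Photo/Video types
    and may vary by version -- inspect one line of your dump first."""
    with open(jsonl_path, encoding="utf-8") as fh:
        for line in fh:
            tweet = json.loads(line)
            for item in tweet.get("media") or []:
                if item.get("fullUrl"):          # photos
                    yield item["fullUrl"]
                for variant in item.get("variants") or []:  # video renditions
                    if variant.get("url"):
                        yield variant["url"]
```

The resulting URLs can then be fed to curl/wget (or `requests`) one by one, which is roughly the jq-and-curl combination megaminxwin was after.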
09:02:39<megaminxwin>thatll explain why that wasnt working on the file i had
09:03:12<megaminxwin>keeps the filename as foone_tweets.jsonl because i cant be bothered editing the python script
09:03:20<megaminxwin>im a *professional* lazy
09:03:28<theblazehen|m>Hah! Relatable
09:06:02<IDK>Sorry for being off topic, but which API can I use to search older twitter posts? I can only call GET /2/tweets/search/recent
09:06:34<IDK>The GUI interface does not work as well
09:08:05<megaminxwin>alright well snscrape got 8430 tweets, not the full 18.1k apparently on the account, i assume theres a reason but lets just test this first
09:08:43<IDK>megaminxwin: SNscrape doesnt seem to get the retweets
09:08:50<megaminxwin>how rude of it
09:09:08<IDK>and the tweet count seems to include the retweets
09:10:27<megaminxwin>alright the script doesnt work, sometimes it says 'NoneType' object has no attribute 'groups', and other times it goes "no such file or directory"
09:10:51<theblazehen|m>Ah, you need to `mkdir media
09:10:57<megaminxwin>ah, cheers
09:11:08<theblazehen|m>Have you got the latest revision? I fixed it shortly after my initial upload
09:11:14<theblazehen|m>That fixes most of the groups issue
09:12:31<megaminxwin>theeere we go
09:17:34<megaminxwin>im very intrigued in seeing what the addon is doing... hmm
09:19:00<megaminxwin>well, while this is happening
09:19:05<megaminxwin>gets out my ds
09:19:13<megaminxwin>see you in six months
09:26:12<IDK>Just curious, which addon are you guys using
09:27:05dasineura quits [Read error: Connection reset by peer]
09:27:10<IDK>nevermind
09:29:52Ketchup901 quits [Remote host closed the connection]
09:30:05Ketchup901 (Ketchup901) joins
10:14:26<jacobk_>above script modified by me for downloading videos also: https://bpa.st/V47A
10:14:34jacobk_ is now known as jacobk
10:15:00<jacobk>not sure how to cleanly get file extension though
10:21:59<jacobk>It does seem like snscrape might be missing some things. @copilotcase scrapes 0 tweets even though they have 2.
10:23:09megaminxwin quits [Ping timeout: 265 seconds]
10:23:59<jacobk>get "gifs" also: https://bpa.st/GMBA
10:24:27<jacobk>Check for failed media fetches because there could be other types too.
10:25:11<ivan>jacobk: this is expected behavior because Twitter search is broken
10:25:30<ivan>https://twitter.com/search?q=from%3Acopilotcase&src=spelling_expansion_revert_click&f=live
10:26:06<jacobk>understandable
10:26:58<ivan>to clarify, it's broken for particular users in unpredictable ways depending on gaps in tweet history, whether they've ever privated, and other unknown factors
10:28:08<jacobk>Yeah, I figured it was something like that; just wanted to make sure it was known.
10:40:29ggggg joins
10:49:24<jacobk>Maybe this will be useful to somebody, if you happen to use Akregator to subscribe to Twitter users, so you can get a list of all users you are subscribed to and then (try to) download all of their tweets: https://bpa.st/7BZQ
10:49:33<jacobk>(I'm going to sleep now)
10:49:53<jacobk>(Hopefully my hard drive isn't completely full in the morning :P)
10:50:50<schwarzkatz|m>Good night o/
10:57:10ggggg quits [Remote host closed the connection]
10:59:22atphoenix quits [Remote host closed the connection]
10:59:22qwertyasdfuiopghjkl quits [Remote host closed the connection]
10:59:22icryclanteat quits [Remote host closed the connection]
10:59:22namwen quits [Remote host closed the connection]
10:59:26atphoenix_ (atphoenix) joins
11:02:28qwertyasdfuiopghjkl joins
11:17:47ggggg joins
11:17:59ggggg quits [Remote host closed the connection]
11:20:50inconsistentUsername joins
11:29:23<betamax_>Is there a recommended set of options to add to wget so that when given a URL to a tweet it grabs all the necessary pre-requisites?
11:29:28betamax_ is now known as betamax
11:31:05Pichu0102 joins
11:31:58<inconsistentUsername>Good morning, I was bumming around here yesterday trying to see how I can help with archiving twitter.
11:32:09<inconsistentUsername>Anything I missed while I was away? :)
11:54:01inconsistentUsername quits [Ping timeout: 265 seconds]
12:02:13mut4ntm0nkey quits [Remote host closed the connection]
12:03:35mut4ntm0nkey (mutantmonkey) joins
12:06:22inconsistentUsername joins
12:59:16inconsistentUsername quits [Ping timeout: 257 seconds]
13:02:58namwen joins
13:08:56inconsistentUsername joins
13:10:31eroc1990 quits [Client Quit]
13:13:39<TheTechRobo>IDK: For older tweets, use snscrape.
13:13:46<TheTechRobo>snscrape CAN include retweets. 1sec
13:14:54<TheTechRobo>If you're scraping a user and are fine with a 3200 tweet limit, use the `twitter-profile` scraper (rather than `twitter-user`) to include retweets.
13:15:48<TheTechRobo>If you're using search OR the 3200 tweet limit doesn't work for you, you can include retweets from the past 7 days (non-retweets will not be affected, though, unless Twitter's search does something weird) with `include:nativeretweets` as a search operator.
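The two approaches above as concrete invocations (`exampleuser` is a placeholder account; scraper names `twitter-profile` and `twitter-search` are the ones named in the chat):

```shell
# ~3200 most recent tweets of a user, retweets included:
snscrape --jsonl twitter-profile exampleuser > exampleuser.jsonl

# Search-based scrape; include:nativeretweets only reaches back ~7 days:
snscrape --jsonl twitter-search 'from:exampleuser include:nativeretweets' > recent.jsonl
```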
13:23:20inconsistentUsername quits [Remote host closed the connection]
13:23:25inconsistentUsername joins
13:30:45eroc1990 (eroc1990) joins
13:32:47qwertyasdfuiopghjkl quits [Client Quit]
13:34:28pie_ quits []
13:34:39pie_ joins
13:41:01qwertyasdfuiopghjkl joins
13:44:15Arcorann quits [Ping timeout: 276 seconds]
13:46:06qw3rty joins
13:46:06qw3rty__ quits [Read error: Connection reset by peer]
13:46:49qw3rty quits [Read error: Connection reset by peer]
13:47:17qw3rty joins
13:48:09qw3rty_ joins
13:48:09qw3rty quits [Read error: Connection reset by peer]
13:48:33qw3rty_ quits [Read error: Connection reset by peer]
13:48:45qw3rty_ joins
13:49:09qw3rty_ quits [Read error: Connection reset by peer]
13:49:46qw3rty joins
13:50:18qw3rty quits [Read error: Connection reset by peer]
13:51:00qw3rty joins
13:51:48qw3rty quits [Read error: Connection reset by peer]
13:53:14qw3rty joins
13:54:11qw3rty quits [Read error: Connection reset by peer]
13:54:33qw3rty joins
13:56:15qw3rty quits [Read error: Connection reset by peer]
13:56:25qw3rty_ joins
13:57:08qw3rty_ quits [Read error: Connection reset by peer]
13:58:30qw3rty joins
13:59:05qw3rty quits [Read error: Connection reset by peer]
13:59:40qw3rty joins
14:00:26qw3rty_ joins
14:00:26qw3rty quits [Read error: Connection reset by peer]
14:00:50qw3rty_ quits [Read error: Connection reset by peer]
14:08:48lunik17 joins
14:10:39inconsistentUsername quits [Remote host closed the connection]
14:13:13tech_exorcist (tech_exorcist) joins
14:33:24inconsistentUsername joins
14:38:20LeGoupil joins
14:56:40eroc1990 quits [Client Quit]
14:56:40LeGoupil quits [Remote host closed the connection]
14:56:48LeGoupil joins
14:56:58eroc1990 (eroc1990) joins
15:01:20LeGoupil quits [Remote host closed the connection]
15:01:20qwertyasdfuiopghjkl quits [Client Quit]
15:01:20namwen quits [Client Quit]
15:01:20inconsistentUsername quits [Client Quit]
15:01:33LeGoupil joins
15:03:31eroc1990 quits [Client Quit]
15:11:30qwertyasdfuiopghjkl joins
15:24:01tech_exorcist quits [Remote host closed the connection]
15:25:03tech_exorcist (tech_exorcist) joins
15:26:28spirit joins
15:32:50Island joins
15:35:50inconsistentUsername joins
15:40:24eroc1990 (eroc1990) joins
15:52:53<fishingforsoup>How familiar...
15:52:55<fishingforsoup>Could you upload Ain't My Fault's beta? I have it if you wish. Just send me an email!
15:53:01<fishingforsoup>Wrong paste.
15:53:09<fishingforsoup>https://spacehey.com/
15:58:15holbrooke joins
16:14:03inconsistentUsername quits [Ping timeout: 265 seconds]
16:18:20inconsistentUsername joins
16:26:28Jason80 joins
16:39:35Jason80 quits [Remote host closed the connection]
16:45:58tech_exorcist quits [Remote host closed the connection]
16:46:58tech_exorcist (tech_exorcist) joins
17:01:08HP_Archivist (HP_Archivist) joins
17:04:48inconsistentUsername quits [Ping timeout: 265 seconds]
17:09:23HP_Archivist quits [Client Quit]
17:11:19<h2ibot>Jarshua edited Twitter (+249): https://wiki.archiveteam.org/?diff=49161&oldid=49157
17:11:20<h2ibot>Squidboy edited ArchiveBot/Antarctica (+104, +Queen Maud Land): https://wiki.archiveteam.org/?diff=49162&oldid=40412
17:20:26upintheairsheep joins
17:21:55cascode joins
17:22:10<upintheairsheep>lennier1 Hello, to focus on the samsung store project, please decrypt the firmware with this python 2.7 script, I haven't found a way to get the python 2.7 encryption modules working on my devices. https://hackint.logs.kiska.pw/archiveteam-bs/20221118
17:31:34<upintheairsheep>And off-topic, there is a website called https://decrypt.day/ which mirrors the app store on OneDrive, and downloads are gated by hCaptcha.
17:36:14<upintheairsheep>the browse page of the app store looks different, and I too am getting 404 errors.
17:38:03<upintheairsheep>https://archive.ph/Pptae
17:38:21<upintheairsheep>The links are still working to this day on archive.ph
17:51:12qwertyasdfuiopghjkl quits [Ping timeout: 265 seconds]
17:55:19<upintheairsheep>lennier1 I'm calling you for this project, before it gets forgotten
18:02:42<lennier1>I don't really have time at the moment. Right now, I'm literally at work. :)
18:03:01<@arkiver>upintheairsheep: don't spam people
18:03:13<@arkiver>you can leave a message, mention someone, but just wait for a reply
18:03:27<@arkiver>if there is no reply after a long time (say day or two), feel free to ping again
18:03:32<upintheairsheep>ok
18:03:44<upintheairsheep>Me too.
18:03:48upintheairsheep leaves
18:13:52qw3rty joins
18:37:49tomorrowRemoval joins
18:44:45LeGoupil quits [Client Quit]
18:49:38wyatt8750 joins
18:51:03wyatt8740 quits [Ping timeout: 276 seconds]
18:54:50tech_exorcist quits [Remote host closed the connection]
18:56:35tech_exorcist (tech_exorcist) joins
18:59:25cascode quits [Remote host closed the connection]
19:00:37<h2ibot>JAABot edited CurrentWarriorProject (-4): https://wiki.archiveteam.org/?diff=49163&oldid=49134
19:00:39tech_exorcist quits [Remote host closed the connection]
19:01:08tech_exorcist (tech_exorcist) joins
19:03:18HackMii quits [Ping timeout: 255 seconds]
19:04:35HackMii (hacktheplanet) joins
19:09:42tech_exorcist quits [Read error: Connection reset by peer]
19:10:10tech_exorcist (tech_exorcist) joins
19:21:19TheTechRobo quits [Client Quit]
19:21:41TheTechRobo (TheTechRobo) joins
19:22:20TheTechRobo quits [Client Quit]
19:22:41TheTechRobo (TheTechRobo) joins
19:28:06tech_exorcist quits [Read error: Connection reset by peer]
19:28:41TheTechRobo quits [Remote host closed the connection]
19:29:02TheTechRobo (TheTechRobo) joins
19:29:06tech_exorcist (tech_exorcist) joins
19:29:07TheTechRobo quits [Remote host closed the connection]
19:29:31TheTechRobo (TheTechRobo) joins
19:30:25TheTechRobo quits [Client Quit]
19:30:48TheTechRobo (TheTechRobo) joins
19:55:30HackMii quits [Ping timeout: 255 seconds]
20:02:36HackMii (hacktheplanet) joins
20:19:21HackMii quits [Ping timeout: 255 seconds]
20:22:16HackMii (hacktheplanet) joins
20:23:35cascode joins
20:53:23wyatt8740 joins
20:53:36eroc1990 quits [Client Quit]
20:55:18<@JAA>So my Twitter US election candidate rescrape found about 170k fewer tweets, but 221k tweets that weren't in the first scrape. So roughly 391k older tweets vanished for one reason or another, I guess.
20:56:01upintheairsheep joins
20:56:19wyatt8750 quits [Ping timeout: 265 seconds]
20:56:21eroc1990 (eroc1990) joins
20:59:04<upintheairsheep>Even though Roblox archival is not needed right now, a yt-dlp developer has made a pull request to support it, but it is currently a draft.
20:59:05<upintheairsheep>https://github.com/yt-dlp/yt-dlp/pull/5178
21:00:50<upintheairsheep>I'm a python newbie, but I tried to add support for comments extraction. https://github.com/upintheairsheep/ytdl-sheep/blob/main/yt_dlp/extractor/roblox.py
21:02:09<upintheairsheep>However, it is untested due to my software limitations and does not support looping after the first 10
21:09:10upintheairsheep leaves
21:10:19wyatt8750 joins
21:11:47wyatt8740 quits [Ping timeout: 265 seconds]
21:19:20eroc1990 quits [Client Quit]
21:20:50eroc1990 (eroc1990) joins
21:21:56wyatt8750 quits [Ping timeout: 265 seconds]
21:22:30wyatt8740 joins
21:37:14lennier1 quits [Client Quit]
21:38:49lennier1 (lennier1) joins
22:28:49<tomorrowRemoval>Oh hey, the warrior auto-select has moved to reddit.
22:28:53<tomorrowRemoval>Are we done with telegram?
22:58:12tomorrowRemoval quits [Client Quit]
22:58:12cascode quits [Client Quit]
23:02:06BlueMaxima joins
23:03:09HackMii quits [Ping timeout: 255 seconds]
23:03:59tech_exorcist quits [Client Quit]
23:06:55HackMii (hacktheplanet) joins
23:07:18lennier1 quits [Client Quit]
23:07:38lennier1 (lennier1) joins
23:21:39jacobk quits [Ping timeout: 268 seconds]
23:23:37HP_Archivist (HP_Archivist) joins
23:25:40eroc1990 quits [Client Quit]
23:31:06spirit quits [Client Quit]
23:32:18eroc1990 (eroc1990) joins
23:41:27HP_Archivist quits [Client Quit]
23:43:16XanaAdmin joins