| 00:03:33 | | Dada quits [Remote host closed the connection] |
| 00:09:04 | | hexa- quits [Quit: Disconnected] |
| 00:09:40 | | hexa- (hexa-) joins |
| 00:52:35 | | nexussfan quits [Read error: Connection reset by peer] |
| 00:53:28 | | nexussfan (nexussfan) joins |
| 01:01:28 | | nexussfan quits [Remote host closed the connection] |
| 01:25:09 | <klea> | btw, could someone find and give a link to the wiki page that says channel names must contain humour or something akin to that. |
| 01:33:55 | <TheTechRobo> | klea: https://wiki.archiveteam.org/index.php/Dev/New_Project#IRC_Channel |
| 01:34:01 | <klea> | TheTechRobo: thanks |
| 01:35:23 | <Guest> | #logitheck |
| 01:41:46 | | midou quits [Ping timeout: 256 seconds] |
| 01:41:58 | | midou joins |
| 01:46:47 | | midou quits [Ping timeout: 272 seconds] |
| 02:06:48 | | xkey quits [Quit: WeeChat 4.8.0] |
| 02:06:58 | | xkey (xkey) joins |
| 02:22:14 | <Doranwen> | I might be able to figure this out, but I'll also probably spend half the evening bashing my head against the regex needed, lol. I'm working on a private project derived from the LJ project (but it's not relating to LJ directly, hence not putting this in #recordedjournal) where we're taking already-downloaded LJs and trying to scrape the *non*-watermarked graphics from Photobucket using gallery-dl. |
| 02:22:20 | <Doranwen> | First step is to extract the links from the html files. |
| 02:23:56 | <Doranwen> | I compared two different posts and found what's the same. The challenge is that it has to match only the img links that are Photobucket and not anything else - but then back up to get the entire Photobucket link from the https:// all the way to the " at the end of the url. |
| 02:24:41 | <Doranwen> | Examples: src="https://i856.photobucket.com/albums/ab126/audiopineapple4/icontests/merlinmoodstill/lemortedarthur0836.jpg" |
| 02:24:49 | <Doranwen> | and src="https://i180.photobucket.com/albums/x53/littlemissbookworm/Inception%2020in20/Inception20in20SignUps.png" |
| 02:25:23 | <Doranwen> | The bits between the " are what I need to pull out, obviously. |
| 02:25:55 | <Doranwen> | How do I write a command to search the html file for anything like that and pull just those links out? |
| 02:28:02 | <Doranwen> | I'm assuming I need to use `sed`. |
| 02:29:25 | <Doranwen> | From what I can tell the bit between the https:// and photobucket is always i followed by 3 different digits (which digits they are varies). |
| 02:30:17 | <Doranwen> | So that looks like 13 characters preceding photobucket. |
| 02:30:28 | <Doranwen> | The extra challenge is that there are *also* versions of the url like this in the file: <a href="https://www.livejournal.com/away?to=http%3A%2F%2Fi856.photobucket.com%2Falbums%2Fab126%2Faudiopineapple4%2Ficontests%2Fmerlinmoodstill%2Fexcalibur1606.jpg" |
| 02:30:32 | <Doranwen> | We don't want those, lol. |
| 02:30:34 | <nicolas17> | grep -o 'https://i[0-9]*\.photobucket[^"]*' might work without needing sed, but that's not checking for the src= part, if it matters |
| 02:30:38 | <Doranwen> | So it has to be one that has a src part. |
| 02:30:44 | <Doranwen> | Lol what I was just saying. |
| 02:30:49 | <Doranwen> | Yeah, the src= matters. |
| 02:30:54 | <nicolas17> | well it won't match on that one |
| 02:31:19 | <nicolas17> | because it's looking for :// and won't match %3A%2F%2F |
| 02:31:22 | <klea> | why not grep -o 'src="https://i[0-9]*\.photobucket[^"]*' and later sed? |
| 02:31:30 | <Doranwen> | OH, you looked for that, that's good, that would work. |
| 02:32:11 | <Doranwen> | I don't care how it gets solved as long as I can plug it into the script. It's going to be extracting each url from a single post, grabbing the pics from that post and putting them in a folder with the post name, then moving on to the next post to do the same. |
| 02:32:21 | <Doranwen> | I'm *pretty* sure I can manage the loops to do the rest. |
| 02:32:32 | <Doranwen> | But I realized first thing was the url extraction and my brain fried looking at it. |
| 02:32:44 | | jinn6 quits [Quit: WeeChat 4.8.1] |
| 02:32:45 | | jinn6 joins |
| 02:32:46 | <klea> | Doranwen: do you have awk? |
| 02:32:48 | <Doranwen> | I do. |
| 02:32:55 | <klea> | you could use that maybe? but I'm not sure how |
| 02:33:10 | <Doranwen> | Yeah, I know awk's powerful but I've only used it when someone's handed me a command with it already made up, lol. |
| 02:33:24 | | jinn6 quits [Client Quit] |
| 02:33:37 | <klea> | grep -oP 'src="\K[^"]+' file apparently should work? |
| 02:33:48 | | jinn6 joins |
| 02:33:51 | <Doranwen> | Let me try that. :) |
| 02:33:59 | <klea> | but that'd match all src urls |
| 02:34:06 | <klea> | not only for photobucket |
| 02:34:19 | | midou joins |
| 02:34:23 | <klea> | and for the annoying livejournal.com away ones it's a bit annoying |
| 02:34:27 | <klea> | since you have to do url decoding |
| 02:34:31 | <klea> | and idk something that can do that |
| 02:35:13 | <Doranwen> | Yeah, I have to match only the LJ ones. |
| 02:35:22 | <Doranwen> | So maybe I extract all urls, then pull only the Photobucket ones out? |
| 02:35:31 | <Doranwen> | I can encode that into the script if need be. |
| 02:35:59 | <klea> | imho if you just extract all urls more extraction can be done later down the road |
| 02:36:17 | <Doranwen> | True. Though the LJ ones really are quite useless, lol. |
| 02:36:59 | | jinn6 quits [Client Quit] |
| 02:37:17 | <Doranwen> | I'd have to do two sets of extraction then. Or, rather, one extraction - but then dump the full set of links in one spot to save them for later, and extract just the photobucket ones for use with gallery-dl. |
| 02:37:29 | <klea> | oh the post has both the direct link and an indirect link via the url dereffer? |
| 02:37:48 | | jinn6 joins |
| 02:38:52 | <Doranwen> | Yes, exactly. |
| 02:39:21 | | midou quits [Ping timeout: 272 seconds] |
| 02:39:37 | | jinn6 quits [Client Quit] |
| 02:39:40 | | jinn6 (jinn6) joins |
| 02:40:07 | <klea> | oh yeah then you do only need the urls for #photosucket |
| 02:41:55 | <Doranwen> | Haha, there's a channel for it, of course there is. XD |
| 02:42:37 | <Doranwen> | Yeah, some people *may* have hosted images on another site - so I think I will want to store the links for later, just in case we find some - but the main purpose of this is photobucket. |
| 02:42:45 | <Doranwen> | Well, thank you all, I have what I need to get going. |
| 02:42:52 | <Doranwen> | I think I actually can code this, which I'm really excited about. |
| 02:43:04 | <Doranwen> | My bash skills have been improving thanks to all of you and the projects I've been working on. |
| 02:43:12 | <klea> | grep -oP 'src="\K[^"]+' would actually give you all urls that sit in a src attribute (most likely because they belong to an <img>) |
| 02:43:59 | <Doranwen> | Yeah, I think I'll do that and dump them to a txt file stored with the post. Can't hurt to have it. |
| 02:44:51 | | midou joins |
| 03:00:16 | | Kotomind joins |
| 03:05:17 | <Doranwen> | Within a for loop that loops through each html in a directory, how do I grab the number between the - and the . at the end of filenames like 20in20inception-765.html and merlinmoodstill-778.html ? |
| 03:05:43 | <Doranwen> | There's always that hyphen and the .html and the loop is doing `for f in *.html; do` |
| 03:05:55 | <Doranwen> | But some numbers are 4 or 5 digits. |
| 03:06:00 | <Doranwen> | So it can't go by number of characters. |
| 03:11:57 | | PredatorIWD256 joins |
| 03:15:16 | | PredatorIWD25 quits [Ping timeout: 256 seconds] |
| 03:15:16 | | PredatorIWD256 is now known as PredatorIWD25 |
| 03:18:16 | | Yakov quits [Quit: The Lounge - https://thelounge.chat] |
| 03:18:36 | | Yakov joins |
| 03:19:54 | | sec^nd quits [Remote host closed the connection] |
| 03:20:20 | | sec^nd (second) joins |
| 03:20:59 | | Yakov quits [Client Quit] |
| 03:21:18 | | Yakov (Yakov) joins |
| 03:22:08 | | DogsRNice_ quits [Read error: Connection reset by peer] |
| 03:23:03 | | Kotomind quits [Ping timeout: 272 seconds] |
| 03:28:26 | | vietthan0 joins |
| 03:28:46 | | mrminemeet_ joins |
| 03:30:01 | | mrminemeet quits [Ping timeout: 272 seconds] |
| 03:30:03 | | tzt quits [Quit: tzt] |
| 03:30:30 | | tzt (tzt) joins |
| 03:33:27 | | HackMii quits [Remote host closed the connection] |
| 03:34:11 | | HackMii (hacktheplanet) joins |
| 03:56:59 | <nukke> | is there a non-social media app that can upload video right after recording "securely"? |
| 03:57:48 | <nukke> | I remember during the george floyd protests there was one promoted or at least mentioned by the EFF called "record-a-cop" or similar, that would upload and also would lock the phone |
| 03:59:01 | <nukke> | oh, it might be Citizen |
| 03:59:54 | <nukke> | nevermind, not it. |
| 04:23:38 | | midou quits [Read error: Connection reset by peer] |
| 04:33:34 | | midou joins |
| 04:36:21 | | nexussfan (nexussfan) joins |
| 04:43:06 | | nexussfan quits [Ping timeout: 256 seconds] |
| 04:54:15 | <@JAA> | nukke: Are you thinking of ACLU Mobile Justice? Predates Floyd though. And it was shut down a year ago. |
| 04:58:12 | <nukke> | noooooooooooooo ;_; |
| 04:58:38 | <nukke> | JAA: I meant that it gained prominence during the 2020 protests, not that it was released then. |
| 04:58:47 | <@JAA> | Ah, yeah |
| 05:02:54 | <nukke> | I mention it because I'm in the state where an incident happened today and friends and family members will be attending demonstrations |
| 05:03:24 | <nukke> | so I want something normie-friendly that's also not traditional social media that can get wiped easily |
| 05:05:29 | | vietthan0 quits [Client Quit] |
| 05:12:46 | <chrismrtn> | nukke: Probably not what you'd want in this case, but I remembered hearing about https://github.com/scriptjunkie/private-record-live (it's a fairly simplistic self-hosted webpage that records with WebRTC -- it's basically got a record/stop button and some quality settings) |
| 05:19:13 | <nukke> | chrismrtn: ohh, that's interesting! my setup is just automatic upload to a file server which then gets replicated to 3 different places, but that requires either the video to be manually stopped, or for it to reach the recording limit |
| 05:19:51 | <nukke> | although now that I think about it, since this is basically a webapp, I'm guessing grapheneos will kill/freeze the browser tab if the screen gets locked |
| 05:20:31 | <nukke> | mobile justice basically never stopped recording or streaming |
| 05:21:44 | <nukke> | there has to be a camera app that supports a remote endpoint 🤔 |
| 05:21:54 | <nukke> | custom remote endpoint* |
| 06:10:10 | | ArchivalEfforts quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
| 06:10:20 | | ArchivalEfforts joins |
| 06:45:43 | | nine quits [Ping timeout: 272 seconds] |
| 06:48:15 | | nine joins |
| 08:42:28 | | SootBector quits [Remote host closed the connection] |
| 08:43:39 | | SootBector (SootBector) joins |
| 09:28:44 | | Dango360 quits [Quit: The Lounge - https://thelounge.chat] |
| 09:30:36 | | Dango360 (Dango360) joins |
| 09:34:49 | | h|ca2 quits [Ping timeout: 272 seconds] |
| 09:35:31 | | h|ca2 (h) joins |
| 09:50:48 | | midou quits [Ping timeout: 256 seconds] |
| 10:00:58 | | midou joins |
| 10:49:15 | | SootBector quits [Remote host closed the connection] |
| 10:50:23 | | SootBector (SootBector) joins |
| 10:53:55 | | Dada joins |
| 11:10:08 | | archiveDrill quits [Ping timeout: 256 seconds] |
| 11:26:30 | | iPwnedYourIOTSmartdog3 joins |
| 11:28:11 | | iPwnedYourIOTSmartdog quits [Ping timeout: 272 seconds] |
| 11:28:12 | | iPwnedYourIOTSmartdog3 is now known as iPwnedYourIOTSmartdog |
| 12:00:05 | | Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat] |
| 12:02:49 | | Bleo182600722719623455222 joins |
| 12:15:43 | | CraftByte quits [Quit: Hasta la vista] |
| 12:15:57 | | CraftByte (DragonSec|CraftByte) joins |
| 12:33:25 | | nine quits [Ping timeout: 272 seconds] |
| 12:40:17 | | nine joins |
| 12:57:08 | | mrminemeet joins |
| 12:58:45 | | mrminemeet_ quits [Ping timeout: 272 seconds] |
| 13:19:52 | <justauser> | Guardian project had something IIRC. |
| 13:20:48 | <justauser> | https://guardianproject.info/apps/org.witness.proofmode/ |
| 13:36:54 | | benjins3 quits [Ping timeout: 256 seconds] |
| 14:29:19 | | pabs quits [Ping timeout: 272 seconds] |
| 14:32:23 | | pabs (pabs) joins |
| 14:48:58 | | benjins3 joins |
| 15:02:53 | | that_lurker quits [Ping timeout: 272 seconds] |
| 15:03:25 | | Wake joins |
| 15:07:31 | | that_lurker (that_lurker) joins |
| 16:12:04 | | SootBector quits [Ping timeout: 256 seconds] |
| 16:13:14 | | SootBector (SootBector) joins |
| 16:16:34 | | programmerq (programmerq) joins |
| 16:52:52 | | HackMii quits [Ping timeout: 256 seconds] |
| 17:03:55 | | HackMii (hacktheplanet) joins |
| 17:04:18 | | TastyWiener95 quits [Ping timeout: 256 seconds] |
| 17:07:32 | | TastyWiener95 (TastyWiener95) joins |
| 17:11:35 | | grill (grill) joins |
| 17:38:12 | | HackMii quits [Ping timeout: 256 seconds] |
| 18:06:31 | | HackMii (hacktheplanet) joins |
| 18:17:57 | | grill quits [Ping timeout: 272 seconds] |
| 18:28:30 | | grill (grill) joins |
| 18:33:40 | | DogsRNice joins |
| 19:56:00 | | grill quits [Ping timeout: 256 seconds] |
| 20:21:28 | <nukke> | they're reporting up to 300 ICE agents in my small town lol |
| 20:26:16 | <nicolas17> | isn't this what 2A is for |
| 20:38:32 | <Dango360> | moving from -dev: |
| 20:38:33 | <Dango360> | ; |
| 21:44:48 | | nine quits [Ping timeout: 256 seconds] |
| 21:47:57 | | nine joins |
| 21:47:58 | | nine is now authenticated as nine |
| 21:47:58 | | nine quits [Changing host] |
| 21:47:58 | | nine (nine) joins |
| 22:03:31 | | nine quits [Client Quit] |
| 22:03:43 | | nine joins |
| 22:03:44 | | nine is now authenticated as nine |
| 22:03:44 | | nine quits [Changing host] |
| 22:03:44 | | nine (nine) joins |
| 22:05:06 | | SootBector quits [Ping timeout: 256 seconds] |
| 22:05:06 | | sec^nd quits [Ping timeout: 256 seconds] |
| 22:07:32 | | SootBector (SootBector) joins |
| 22:09:57 | | sec^nd (second) joins |
| 22:15:10 | | useretail joins |
| 22:31:55 | | ThetaDev quits [Ping timeout: 272 seconds] |
| 22:34:58 | | ThetaDev joins |
| 22:41:12 | <steering> | nukke++ |
| 22:41:14 | <eggdrop> | [karma] 'nukke' now has 6 karma! |
| 22:44:50 | | flotwig quits [Quit: ZNC - http://znc.in] |
| 22:45:55 | | flotwig joins |
| 22:46:09 | | flotwig quits [Client Quit] |
| 22:47:29 | | flotwig joins |
| 22:48:16 | | ThetaDev quits [Ping timeout: 256 seconds] |
| 22:49:24 | | flotwig quits [Client Quit] |
| 22:49:56 | | flotwig joins |
| 22:53:19 | | ThetaDev joins |
| 23:16:55 | | atphoenix__ (atphoenix) joins |
| 23:20:03 | | atphoenix_ quits [Ping timeout: 272 seconds] |
| 23:25:40 | | nexussfan (nexussfan) joins |
| 23:52:59 | | mrminemeet quits [Ping timeout: 272 seconds] |
| 23:53:21 | | mrminemeet joins |
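
For the two shell questions raised in the log (pulling only the direct Photobucket `src` links out of a saved post, and getting the post number out of filenames like `merlinmoodstill-778.html`), a minimal bash sketch might look like the following. The function names `extract_photobucket_urls` and `post_number` are hypothetical and not from the conversation. The sketch assumes GNU grep built with PCRE support (`-P`), assumes the number always follows the last hyphen in the filename, and accepts `http://` as well as `https://` as a guess about older posts.

```bash
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not from the chat.

# Print every direct Photobucket image URL found in a src="..." attribute.
# \K drops the leading src=" from the reported match, so only the URL is
# printed. The livejournal.com/away links are skipped because they sit in
# href attributes and are percent-encoded (%3A%2F%2F instead of ://).
# Assumption: http:// is accepted in addition to https://.
extract_photobucket_urls() {
    grep -oP 'src="\Khttps?://i[0-9]+\.photobucket\.com/[^"]+' "$1"
}

# Print the number between the last hyphen and the .html extension.
# Uses parameter expansion only, so the digit count does not matter.
post_number() {
    local name=${1##*/}          # strip any leading directory
    name=${name%.html}           # drop the .html extension
    printf '%s\n' "${name##*-}"  # keep what follows the last hyphen
}
```

Inside the existing `for f in *.html; do` loop, `n=$(post_number "$f")` would give the post number and `extract_photobucket_urls "$f"` would print one URL per line, ready to be redirected to a text file. Note that grep exits with status 1 when a post contains no matching links, which matters if the script runs under `set -e`.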