| 00:00:00 | | Jake quits [Quit: Leaving for a bit!] |
| 00:01:40 | <qwertyasdfuiopghjkl> | JAA: Looks like you can just fill in whatever in the email/country selection thing on pages like https://opensource.com/downloads/linux-metacharacters-cheat-sheet and it sets a cookie with the name STYXKEY_Drupal_visitor_gatedemail and value osdc-gated-content that makes pages show the actual download link without needing a login. Guessing it's just to get people to sign up for the newsletter. |
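
A minimal sketch of the cookie trick described above, assuming the gate only checks that the cookie is present. The cookie name and value are the ones quoted in the message; the helper names `build_gated_request` and `fetch_gated_page` are hypothetical, and whether the site still honors the cookie has not been verified.

```python
import urllib.request

# Cookie name and value as reported in the message above.
GATE_COOKIE = "STYXKEY_Drupal_visitor_gatedemail=osdc-gated-content"


def build_gated_request(url: str) -> urllib.request.Request:
    """Return a GET request that carries the gate cookie (hypothetical helper)."""
    return urllib.request.Request(url, headers={"Cookie": GATE_COOKIE})


def fetch_gated_page(url: str, timeout: float = 30.0) -> str:
    """Fetch the page with the cookie set and return its HTML as text."""
    with urllib.request.urlopen(build_gated_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

Calling `fetch_gated_page("https://opensource.com/downloads/linux-metacharacters-cheat-sheet")` should then return the page variant that includes the download link, if the observation above holds.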
| 00:03:50 | | Jake (Jake) joins |
| 00:07:04 | | sonick (sonick) joins |
| 00:23:35 | | nicolas17 quits [Client Quit] |
| 00:27:47 | | graham (graham) joins |
| 00:42:31 | | programmerq quits [Ping timeout: 252 seconds] |
| 00:44:03 | | programmerq (programmerq) joins |
| 00:56:36 | | benjinsm is now known as benjins |
| 00:56:37 | | benjins is now authenticated as benjins |
| 02:17:27 | | tzt quits [Remote host closed the connection] |
| 02:17:50 | | tzt (tzt) joins |
| 02:23:06 | | Guest50 quits [Client Quit] |
| 02:24:48 | | Guest50 joins |
| 02:52:36 | | Guest50 quits [Client Quit] |
| 02:54:02 | | dumbgoy__ joins |
| 02:56:43 | | dumbgoy_ quits [Ping timeout: 252 seconds] |
| 03:49:07 | | TheTechRobo quits [Remote host closed the connection] |
| 03:49:47 | <Hans5958|m> | What is a good way to scrape links from a Google search? |
| 03:49:47 | | TheTechRobo (TheTechRobo) joins |
| 03:58:18 | | TheTechRobo quits [Read error: Connection reset by peer] |
| 03:58:25 | | AlsoTheTechRobo (TheTechRobo) joins |
| 04:01:07 | | AlsoTheTechRobo quits [Remote host closed the connection] |
| 04:01:56 | | AlsoTheTechRobo (TheTechRobo) joins |
| 04:05:20 | <lennier1> | "According to Twitter's policy, users should log in to their account at least once every 30 days to avoid permanent removal due to prolonged inactivity." I had not heard of this policy before. |
| 04:05:25 | | eythian quits [Quit: http://quassel-irc.org - Chat comfortabel. Waar dan ook.] |
| 04:05:55 | | eythian joins |
| 04:22:19 | <datechnoman> | 30 days is far from "prolonged inactivity" lol? Maybe like 12 months or more..... |
| 04:26:08 | | Guest50 joins |
| 04:36:39 | | AlsoTheTechRobo quits [Remote host closed the connection] |
| 04:37:15 | | AlsoTheTechRobo (TheTechRobo) joins |
| 04:40:58 | | Sluggs quits [Excess Flood] |
| 04:44:04 | | Sluggs joins |
| 04:44:13 | <Hans5958|m> | Even Google has two years |
| 04:45:26 | | sepro quits [Ping timeout: 252 seconds] |
| 04:53:23 | | sepro (sepro) joins |
| 04:55:42 | | Nulo quits [Ping timeout: 252 seconds] |
| 05:15:58 | | Nulo joins |
| 05:19:00 | | qwertyasdfuiopghjkl quits [Quit: qwertyasdfuiopghjkl] |
| 05:26:46 | | Jonboy3451 joins |
| 05:29:48 | | Jonboy345 quits [Ping timeout: 252 seconds] |
| 05:38:50 | | BlueMaxima quits [Client Quit] |
| 06:15:47 | | Island quits [Read error: Connection reset by peer] |
| 06:16:44 | | Guest50 quits [Ping timeout: 265 seconds] |
| 06:18:03 | | hitgrr8 joins |
| 06:54:00 | <pabs> | rewby|backup: was a date mentioned? |
| 06:54:23 | <@rewby|backup> | No |
| 06:54:54 | <@rewby|backup> | Preemptive more than anything |
| 06:55:07 | <@rewby|backup> | If a site loses all staff it doesn't go down immediately |
| 06:55:10 | <pabs> | would AB be the thing to use to save it? |
| 06:55:16 | <@rewby|backup> | But also probably won't live long |
| 06:55:37 | <@rewby|backup> | Unsure, I'm not an expert on that |
| 06:56:52 | <pabs> | I'll ask a RedHat person if they are able to make downloads public |
| 07:01:30 | <pabs> | you might want to ask through your channels too |
| 07:01:53 | <@rewby|backup> | I don't have any channels |
| 07:06:00 | <pabs> | where did you read about the staff firing? |
| 07:07:02 | <@rewby|backup> | pabs: Friend who has friends there told me |
| 07:08:35 | <pabs> | emailing their site address says: Thank you for your contributions. The Opensource.com community publication, including this email listserv, is no longer supported by Red Hat. |
| 07:10:52 | <pabs> | JAA: perhaps we should kick off an AB job and then later if they make the downloads public, do just those? |
| 07:12:41 | | pabs started with a snscrape of the twitter account for now |
| 07:18:59 | | lexikiq quits [Client Quit] |
| 07:50:32 | | Arcorann (Arcorann) joins |
| 07:55:01 | | sec^nd quits [Ping timeout: 245 seconds] |
| 08:01:32 | | sec^nd (second) joins |
| 08:15:09 | <@OrIdow6> | Wasn't there some kerfuffle a few years ago about Twitter deleting old accounts? |
| 08:15:38 | <@OrIdow6> | People complained, they paused it, and that was the last I heard of it |
| 08:35:22 | <AK> | Think it was around people who had passed away iirc |
| 08:36:00 | <AK> | Families+Friends wanted their accounts to remain as a preserved memory |
| 09:15:34 | | Ivan226 quits [Ping timeout: 265 seconds] |
| 09:56:10 | | pie_ quits [Ping timeout: 265 seconds] |
| 10:04:16 | | pie_ joins |
| 10:04:37 | | Ruthalas5 quits [Ping timeout: 252 seconds] |
| 10:06:02 | | Ruthalas5 (Ruthalas) joins |
| 11:57:22 | | Barto quits [Ping timeout: 252 seconds] |
| 12:10:03 | | Icyelut (Icyelut) joins |
| 12:11:06 | | fred44 joins |
| 12:11:18 | | Icyelut|2 quits [Ping timeout: 252 seconds] |
| 12:13:25 | | fred44 quits [Remote host closed the connection] |
| 12:19:22 | | benjins quits [Ping timeout: 252 seconds] |
| 12:35:39 | | icedice (icedice) joins |
| 12:38:36 | | tjwds quits [Quit: Ping timeout (120 seconds)] |
| 12:38:42 | | tjwds joins |
| 12:47:50 | | therubberduckie quits [Remote host closed the connection] |
| 12:53:50 | | TastyWiener95 quits [Ping timeout: 252 seconds] |
| 13:11:48 | | HP_Archivist quits [Ping timeout: 252 seconds] |
| 13:23:57 | | benjins joins |
| 13:35:59 | | CaldeiraG (CaldeiraG) joins |
| 13:37:20 | | Barto (Barto) joins |
| 13:42:55 | | rubberduck joins |
| 14:03:38 | | rubberduck quits [Ping timeout: 265 seconds] |
| 14:09:00 | | Billy549 quits [Ping timeout: 252 seconds] |
| 14:10:24 | | Arcorann quits [Ping timeout: 265 seconds] |
| 14:10:45 | | pabs quits [Read error: Connection reset by peer] |
| 14:11:56 | | pabs (pabs) joins |
| 14:16:01 | <icedice> | Sanqui: How is it going with https://pokecommunity.com/, by the way? |
| 14:16:15 | <icedice> | Any ETA for when that archival job might begin? |
| 14:22:47 | | Billy549 (Billy549) joins |
| 14:28:46 | | CaldeiraG quits [Ping timeout: 265 seconds] |
| 14:41:58 | | Island joins |
| 14:42:10 | | umgr036 joins |
| 14:43:07 | | umgr036 quits [Remote host closed the connection] |
| 14:43:21 | | umgr036 joins |
| 14:53:55 | <@Sanqui> | icedice: unfortunately there's a cloudflare wall so it can't be done with archivebot |
| 15:08:16 | | rubberduck joins |
| 15:12:15 | | nostalgebraist joins |
| 15:15:38 | | umgr036 quits [Remote host closed the connection] |
| 15:15:54 | | umgr036 joins |
| 15:34:37 | | AlsoTheTechRobo quits [Remote host closed the connection] |
| 15:35:15 | | AlsoTheTechRobo (TheTechRobo) joins |
| 15:37:44 | | Ivan226 joins |
| 15:40:53 | | Guest50 joins |
| 15:51:50 | | umgr036 quits [Remote host closed the connection] |
| 15:52:05 | | umgr036 joins |
| 16:01:12 | | nicolas17 joins |
| 16:04:07 | | retromouse (retromouse) joins |
| 16:07:01 | | Nulo quits [Read error: Connection reset by peer] |
| 16:07:07 | | Nulo joins |
| 16:12:53 | <@JAA> | pabs: I don't have time to watch it, but yes, an AB job now is a good idea either way. |
| 16:20:09 | | AlsoTheTechRobo quits [Remote host closed the connection] |
| 16:20:49 | | AlsoTheTechRobo (TheTechRobo) joins |
| 16:23:51 | | threedeeitguy_ joins |
| 16:27:14 | | threedeeitguy quits [Ping timeout: 252 seconds] |
| 16:39:31 | | Guest50 quits [Ping timeout: 252 seconds] |
| 16:57:25 | | Hackerpcs quits [Quit: Hackerpcs] |
| 17:00:39 | | Hackerpcs (Hackerpcs) joins |
| 17:07:02 | | Guest50 joins |
| 17:12:26 | | threedeeitguy joins |
| 17:14:32 | | threedeeitguy_ quits [Ping timeout: 252 seconds] |
| 17:21:19 | | Guest50 quits [Read error: Connection reset by peer] |
| 17:22:45 | <icedice> | Saqui: Well shit. Is there any way to do it? |
| 17:23:08 | <icedice> | Or are we stuck with just scraping Imgur URLs from it? |
| 17:23:18 | <icedice> | Or is that even possible |
| 17:23:37 | | Webuser710 joins |
| 17:23:42 | <icedice> | If nothing else I could try asking the webmaster for assistance |
| 17:24:06 | <icedice> | Assuming he doesn't just yell at me and ban me for even suggesting it |
| 17:24:51 | <icedice> | * Sanqui |
| 17:24:57 | <pokechu22> | There's a note at the top of https://www.pokecommunity.com/forumdisplay.php?fn=scarlet-violet mentioning imgur, so they at least know about it, and probably would be happy to help if they can |
| 17:25:03 | <icedice> | (I spelled your nick wrong) |
| 17:25:10 | <icedice> | Yeah |
| 17:25:30 | <@Sanqui> | Yes if somebody could get a list of imgur urls either by scraping through other means than archivebot or from the forum administrators that would be ideal. |
| 17:25:33 | <pokechu22> | I don't see a search feature (it may only be available when signed in), but if there is one you can try searching imgur and seeing what links show up |
| 17:25:49 | <pokechu22> | though depending on the forum software that might be incomplete |
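
A rough sketch of the "get a list of imgur urls" step discussed above, assuming the forum pages or search results have already been saved as HTML by some other means. The regular expression and the function name `extract_imgur_urls` are illustrative assumptions, not an existing ArchiveTeam tool, and the pattern may miss unusual link forms.

```python
import re

# Matches imgur.com and its subdomains (e.g. i.imgur.com); an assumed pattern.
IMGUR_URL_RE = re.compile(
    r"https?://(?:[a-z0-9-]+\.)?imgur\.com/[A-Za-z0-9_\-./]+",
    re.IGNORECASE,
)


def extract_imgur_urls(html: str) -> list[str]:
    """Return unique imgur URLs found in the text, in order of first appearance."""
    seen = set()
    urls = []
    for match in IMGUR_URL_RE.finditer(html):
        url = match.group(0).rstrip(".,")  # drop trailing sentence punctuation
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Running this over each saved page and merging the results would give a deduplicated list that could be handed to whoever runs the imgur job.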
| 17:25:53 | <icedice> | Of the three major Pokémon forums, they're the one that is the most chill by far |
| 17:26:05 | <icedice> | Has a ROM hacking section |
| 17:26:22 | <icedice> | Used to allow manga scans until 2009 when VIZ started publishing again |
| 17:27:07 | <icedice> | Is there some way for the admin to add an exception for ArchiveBot in CloudFlare? |
| 17:27:16 | <icedice> | Like an allowed IP or user-agent or something |
| 17:27:52 | <pokechu22> | Oh, yeah, another large forum: https://hypixel.net/forums/ which also seems to be imgur-heavy (but also 4 million threads and 33 million posts). It does have a search: https://hypixel.net/search/11815554/?q=imgur&o=date - and from https://hypixel.net/search/11815554/?page=5&q=imgur&o=date there's a "view older results" link so you can keep going back, but I haven't seen how far back it actually goes |
| 17:30:16 | <myself> | icedice: CF theoretically has a concept of "friendly bots" but I don't know if anyone's ever pursued getting AB listed as such. |
| 17:33:47 | <icedice> | I remember that a manga ripper had a CloudFlare bypass thing |
| 17:33:54 | <icedice> | Something with cookies iirc |
| 17:35:27 | <pokechu22> | On further thought I can probably hack together a bookmarklet to do the hypixel forums |
| 17:36:17 | <pokechu22> | CloudFlare tends to give you a cookie that allows longer access once you load the page in a browser (and possibly complete a captcha) but that cookie is time-limited. It's not really suitable for archivebot (since you can't set custom cookies on it) but it is something I've done for a few wikis |
| 17:52:05 | <icedice> | <pokechu22> On further thought I can probably hack together a bookmarklet to do the hypixel forums |
| 17:52:16 | <icedice> | Could you make one for The PokéCommunity as well? |
| 17:52:52 | <icedice> | Assuming the admin doesn't agree to help |
| 17:53:06 | <pokechu22> | If there's a search page, maybe, but it really depends on how the search page behaves |
| 17:54:11 | | Barto quits [Ping timeout: 265 seconds] |
| 17:54:49 | <pokechu22> | Some forums show the whole post if it's a search result, and others only give a snippet and a link. Hypixel is the latter so I'd need to extract content from each post separately (I'm planning on using archivebot for that, which would be an issue for PokéCommunity) |
| 18:00:41 | <icedice> | The PokéCommunity uses XenForo in case that tells you anything |
| 18:03:38 | | HP_Archivist (HP_Archivist) joins |
| 18:07:30 | <pokechu22> | ah, found it: https://www.pokecommunity.com/search.php requires being signed in |
| 18:08:59 | | theavery joins |
| 18:10:03 | <pokechu22> | yeah, I don't think that's going to work in the same way :/ |
| 18:11:22 | | icedice quits [Ping timeout: 252 seconds] |
| 18:13:56 | | umgr036 quits [Remote host closed the connection] |
| 18:14:10 | | umgr036 joins |
| 18:17:13 | | icedice (icedice) joins |
| 18:26:57 | | theavery quits [Remote host closed the connection] |
| 18:59:08 | | Guest50 joins |
| 19:00:08 | | whoami quits [Ping timeout: 252 seconds] |
| 19:05:43 | | Ivan226 quits [Ping timeout: 265 seconds] |
| 19:10:12 | | Guest50_ joins |
| 19:12:14 | | Guest50 quits [Ping timeout: 252 seconds] |
| 19:24:02 | | Megame (Megame) joins |
| 19:24:05 | | Guest50_ quits [Ping timeout: 265 seconds] |
| 19:41:23 | | sonick quits [Client Quit] |
| 19:56:07 | | BigBoris57 joins |
| 19:56:50 | | Barto (Barto) joins |
| 19:59:22 | | BigBoris quits [Ping timeout: 265 seconds] |
| 20:08:06 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 20:30:04 | | hitgrr8 quits [Client Quit] |
| 20:48:10 | | Webuser710 quits [Remote host closed the connection] |
| 21:11:55 | | Billy549 quits [Client Quit] |
| 21:30:02 | | TastyWiener95 (TastyWiener95) joins |
| 21:32:49 | | Billy549 (Billy549) joins |
| 21:33:36 | | Webuser124 joins |
| 21:34:49 | <@rewby> | Now that things are working correctly, we've slammed into the 10 limit immediately |
| 21:38:07 | | AlsoTheTechRobo quits [Remote host closed the connection] |
| 21:38:50 | | AlsoTheTechRobo (TheTechRobo) joins |
| 21:44:29 | | lexikiq joins |
| 21:53:55 | | dumbgoy__ quits [Ping timeout: 265 seconds] |
| 22:06:09 | | Guest50 joins |
| 22:06:12 | <datechnoman> | rewby What is the 10 limit you set out of curiosity? |
| 22:06:58 | <Billy549> | Hey, something a friend forked has just got DMCA'd but is still up on GitHub - what's the best way to immediately archive it? |
| 22:07:11 | <Billy549> | https://github.com/shchmue/Lockpick_RCM for reference |
| 22:07:23 | <@rewby> | datechnoman: I've implemented some code to limit our max upload concurrency |
| 22:07:51 | <datechnoman> | Ahhh upload concurrency. Nice |
| 22:08:06 | <datechnoman> | Thanks :) |
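
For illustration only, one common way to cap upload concurrency is a semaphore. This is a sketch under assumptions, not rewby's actual code: the limit of 10 comes from the messages above, while `upload_all` and the `upload_one` callback are hypothetical names.

```python
import asyncio

MAX_CONCURRENT_UPLOADS = 10  # the "10 limit" mentioned above


async def upload_all(items, upload_one, limit=MAX_CONCURRENT_UPLOADS):
    """Run upload_one(item) for every item, with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item):
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await upload_one(item)

    # gather() preserves the input order in the returned list.
    return await asyncio.gather(*(guarded(item) for item in items))
```

With this shape, queued uploads start as soon as a slot frees up, so a steady backlog keeps the system pinned at the limit, which matches the "slammed into the 10 limit" observation.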
| 22:21:51 | | dumbgoy__ joins |
| 22:35:00 | | BigBoris57 quits [Ping timeout: 265 seconds] |
| 22:39:23 | <JTL> | Billy549: Not seeing any DMCA registered with GitHub for anything related to that, so... ? |
| 22:39:56 | <Billy549> | JTL: they've been privately sent the DMCA request |
| 22:39:59 | <@JAA> | Billy549: #gitgud is our GitHub project. |
| 22:40:13 | <Billy549> | "ah it's going to be processed after 1 business day / grace period for counter notice or making changes" |
| 22:40:18 | <Billy549> | @JAA noted |
| 22:40:19 | | dumbgoy__ quits [Ping timeout: 252 seconds] |
| 22:40:19 | <JTL> | Billy549: ahh |
| 22:40:56 | <@JAA> | So best to request it there with note of urgency. I'll take care of it now. |
| 22:41:04 | <Billy549> | Thank you ^^ |
| 22:42:42 | | dumbgoy__ joins |
| 22:46:26 | | Webuser124 quits [Remote host closed the connection] |
| 22:50:44 | | benjins is now authenticated as benjins |
| 22:51:18 | | Megame quits [Client Quit] |
| 22:53:05 | | dumbgoy joins |
| 22:53:22 | | dumbgoy__ quits [Ping timeout: 265 seconds] |
| 22:58:27 | <@OrIdow6> | It appears that the newworld and playlostark forums have indeed frozen |
| 22:58:38 | <@OrIdow6> | Does Discourse work well in AB? |
| 23:00:06 | | retromouse quits [Read error: Connection reset by peer] |
| 23:00:43 | | retromouse (retromouse) joins |
| 23:01:17 | | sonick (sonick) joins |
| 23:02:37 | <pokechu22> | Yes, ish - there isn't an ignoreset for it IIRC but it generally works OK |
| 23:02:56 | <@OrIdow6> | Nothing in the forums ignoreset? |
| 23:04:01 | <pokechu22> | I usually still apply it, but I think nothing in it specifically targets discourse - see https://github.com/ArchiveTeam/ArchiveBot/issues/317 |
| 23:06:11 | <@OrIdow6> | Ah thanks |
| 23:06:26 | <@OrIdow6> | Good that we have people keeping track of this stuff |
| 23:07:46 | <@JAA> | Discourse works reasonably well for archival, but playback in the WBM is often broken unless you disable JS. |
| 23:09:56 | <@OrIdow6> | A shame but better than many other sites |
| 23:12:11 | | icedice quits [Client Quit] |
| 23:13:37 | | Ruthalas5 quits [Client Quit] |
| 23:13:58 | | Ruthalas5 (Ruthalas) joins |
| 23:14:32 | | icedice (icedice) joins |
| 23:17:39 | <h2ibot> | OrIdow6 edited Discourse (+300, Archiving with AB & New World/Lost Ark forums): https://wiki.archiveteam.org/?diff=49737&oldid=49670 |
| 23:29:49 | | Guest50 quits [Ping timeout: 252 seconds] |
| 23:49:04 | | retromouse quits [Ping timeout: 252 seconds] |
| 23:49:58 | | Guest50 joins |
| 23:55:34 | | Arcorann (Arcorann) joins |