00:00:53 | <project10> | https://old.reddit.com/r/povertyfinancecanada/comments/11ytjf4/rogers_data_overage_bill_will_make_you_homeless/ continues to this day |
00:06:02 | | ell quits [Client Quit] |
00:07:01 | | ell (ell) joins |
00:09:59 | | ell quits [Client Quit] |
00:10:08 | | ell (ell) joins |
00:21:10 | | ell quits [Client Quit] |
00:23:29 | | ell (ell) joins |
00:26:46 | | icedice quits [Client Quit] |
00:39:10 | <fireonlive> | oof |
00:57:02 | | onetruth joins |
01:00:35 | | us3rrr quits [Ping timeout: 252 seconds] |
01:19:16 | | etnguyen03 quits [Ping timeout: 265 seconds] |
01:21:10 | | HP_Archivist (HP_Archivist) joins |
01:24:47 | | etnguyen03 (etnguyen03) joins |
01:30:38 | | Dango360 (Dango360) joins |
01:48:47 | <anarcat> | some epic crawl, journalmetro.com (d9mh44xbsx92ie1iwf88mk2pn) - i didn't expect it to be so big (and still growing) |
01:49:13 | <anarcat> | things seem to be running smoothly though, and might finish in time to keep that thing in IA before the damn thing falls apart like everything else |
01:50:40 | <@JAA> | (immature giggles from the back row) |
01:52:07 | <@JAA> | They didn't say anything about how long the site would stay up, did they? |
01:54:41 | <anarcat> | i haven't followed closely |
01:59:52 | | Hackerpcs quits [Quit: Hackerpcs] |
02:30:02 | | Hackerpcs (Hackerpcs) joins |
02:43:01 | | parfait_ joins |
02:44:32 | | etnguyen03 quits [Ping timeout: 252 seconds] |
02:47:14 | | parfait quits [Ping timeout: 265 seconds] |
03:08:32 | | etnguyen03 (etnguyen03) joins |
03:22:18 | <Ryz> | Heya folks, besides the default Warrior project selection, any other Warrior projects that might need attention? |
03:23:26 | <flashfire42> | atm telegram and reddit are the 2 with items but they are right now clogged by targets. if you wanna try your luck at Zowa then we could test if it's a ban or if the items that it's trying to push out are indeed bad at this point Ryz |
03:31:50 | <nicolas17> | yeah everything seems stalled atm |
03:36:13 | <Ryz> | Was pondering on Imgur but hmm o.o; |
03:36:33 | <Ryz> | Wouldn't mind running more of the bruteforcer if it needs attention |
03:37:55 | <nicolas17> | I have 135 million IDs from the bruteforcer that I still didn't submit into the queue and probably never will |
03:38:00 | <nicolas17> | imgur got too large |
03:38:54 | <Ryz> | Oof, too much data? :c |
03:42:00 | <nicolas17> | we archived 654TB |
03:43:15 | <nicolas17> | <JAA> The problem is the data size. We already went well past the initial estimate we gave IA. |
03:43:17 | <nicolas17> | <nicolas17> we're at 650TiB |
03:43:18 | <nicolas17> | <JAA> Yes, which is more than double what we told IA. |
03:44:35 | <Ryz> | ...Oo; |
03:44:39 | <Ryz> | Aaaaah <#>; |
03:44:48 | <nicolas17> | <JAA> I feel like the best option going forward that we have is keeping this running continuously MediaFire-style so that we can queue lists of images collected from other crawls |
03:44:49 | <nicolas17> | <JAA> But I don't see archiving all of Imgur happening anytime soon. Well, not until they're shutting down or doing a severe policy change like deleting images after X days or whatever. |
03:46:28 | <Ryz> | Hmm, would it be best to just run the bruteforcer on the remote chance that Imgur may actually shut down or the severe policy change? Collect more of the stuff |
03:46:44 | <@JAA> | → #imgone |
03:49:06 | | dumbgoy quits [Ping timeout: 265 seconds] |
03:53:47 | | dumbgoy joins |
03:58:17 | | dumbgoy quits [Ping timeout: 265 seconds] |
04:20:14 | | Aoede_ quits [Ping timeout: 252 seconds] |
04:23:28 | <flashfire42> | ok I have to ask. What the fuck is actually connecting to the rsync servers if nobody is actually seeming to connect. If we are all complaining what the fuck is the clog? |
04:25:11 | | etnguyen03 quits [Ping timeout: 252 seconds] |
04:27:25 | | etnguyen03 (etnguyen03) joins |
04:27:32 | <nicolas17> | flashfire42: if you get -1 it means disks are full |
04:27:46 | <nicolas17> | and the server is set to maximum 0 concurrent connections and nobody is connecting |
04:28:20 | <flashfire42> | So the bottleneck is moving that data to temp storage? or did we already fill that |
04:29:26 | <@JAA> | Yes, that is the bottleneck. No, it isn't full, but its capacity is reduced compared to at the beginning. |
04:29:52 | | hitgrr8 joins |
04:30:19 | <nicolas17> | I have seen continuous rsync errors for hours so it looks stuck full rather than "slow to free up" |
04:30:46 | <nicolas17> | unless it's so slow that the hysteresis is making it look stuck |
04:34:32 | | monika quits [Ping timeout: 252 seconds] |
04:47:52 | <nicolas17> | JAA: should we pause (or greatly rate-limit) projects while targets are full? |
04:48:37 | <nicolas17> | especially telegram where people would get reclaims of items that took too long *because* they're stuck uploading |
04:50:30 | | etnguyen03 quits [Client Quit] |
05:00:02 | | Island quits [Read error: Connection reset by peer] |
05:06:04 | | Aoede (Aoede) joins |
05:07:12 | <flashfire42> | poor optane9 rewby |
05:21:08 | | monika (boom) joins |
05:25:06 | | BlueMaxima quits [Read error: Connection reset by peer] |
05:30:22 | | shinji257 quits [Client Quit] |
06:38:41 | | decky_e joins |
06:40:41 | | decky_e_ quits [Ping timeout: 265 seconds] |
06:53:08 | | treora quits [Ping timeout: 252 seconds] |
06:53:23 | | treora joins |
07:03:11 | | decky joins |
07:05:20 | | decky_e quits [Ping timeout: 265 seconds] |
07:06:15 | | Unholy2361316618085159 quits [Remote host closed the connection] |
07:06:36 | | Unholy2361316618085159 (Unholy2361) joins |
07:07:40 | | sen__time joins |
07:07:50 | | sen__time quits [Remote host closed the connection] |
07:12:23 | | decky quits [Ping timeout: 252 seconds] |
07:17:35 | | Arcorann (Arcorann) joins |
07:27:35 | | decky_e joins |
08:36:25 | | iCaotix quits [Client Quit] |
08:37:37 | | iCaotix joins |
08:43:11 | | iCaotix quits [Client Quit] |
08:43:23 | | iCaotix joins |
08:52:38 | | iCaotix quits [Client Quit] |
08:52:56 | | iCaotix joins |
08:55:04 | | JohnnyJ quits [Client Quit] |
09:20:35 | | JohnnyJ joins |
09:52:01 | | parfait_ quits [Read error: Connection reset by peer] |
10:00:01 | | railen63 quits [Remote host closed the connection] |
10:00:18 | | railen63 joins |
10:29:06 | | treora quits [Remote host closed the connection] |
10:29:07 | | treora joins |
10:30:10 | | treora quits [Remote host closed the connection] |
10:30:12 | | treora joins |
12:31:23 | | HP_Archivist quits [Ping timeout: 252 seconds] |
13:06:57 | | etnguyen03 (etnguyen03) joins |
13:14:17 | | shinji257 (shinji257) joins |
13:28:38 | | nic (nic) joins |
13:31:02 | | nic9 quits [Ping timeout: 265 seconds] |
13:44:10 | | HP_Archivist (HP_Archivist) joins |
14:04:36 | | railen64 joins |
14:07:46 | | AmAnd0A quits [Ping timeout: 265 seconds] |
14:07:50 | | AmAnd0A joins |
14:08:15 | | railen63 quits [Ping timeout: 265 seconds] |
14:28:15 | <@JAA> | So someone mentioned archiving Doomworld yesterday. Since it's Invision and I only had to replace three lines in my Canucks forums script, I gave it a quick try. Turns out that site is very broken. Quite a lot of topics return 500s: https://www.doomworld.com/forum/topic/721-x/ |
14:30:14 | <@JAA> | There hasn't been any official announcement in five years, and the sole admin I could see is rarely active. So it could use an archival. |
14:31:17 | | AmAnd0A quits [Read error: Connection reset by peer] |
14:32:10 | | AmAnd0A joins |
14:34:31 | | Island joins |
14:44:30 | <that_lurker> | Cisco appears to have bought Splunk |
14:44:53 | <that_lurker> | https://www.splunk.com/en_us/blog/leadership/splunk-and-cisco-unite-to-accelerate-digital-resilience-as-one-of-the-leading-global-software-companies.html |
14:51:45 | | HP_Archivist quits [Ping timeout: 265 seconds] |
14:53:51 | <fireonlive> | that_lurker: im not sure they could make it any more expensive but i’m sure they’re going to try |
14:54:48 | <that_lurker> | "Somebody: Splunk has exorbitant prices and locked-in enterprise customers! |
14:54:48 | <that_lurker> | Cisco: Oh these guys are just like us. Better buy them up. We know this business." |
14:55:28 | <that_lurker> | That and many more fun takes are on the HN https://news.ycombinator.com/item?id=37596497 |
14:56:11 | <fireonlive> | :3 |
15:01:27 | | onetruth quits [Client Quit] |
15:05:36 | <that_lurker> | Would maybe be a good idea to grab the splunk documentation site https://docs.splunk.com/Documentation |
15:12:32 | | Arcorann quits [Ping timeout: 265 seconds] |
15:14:29 | <rktk> | https://www.fanforum.com/ is anything from this site archived? there seems to be a LOT of older content there |
15:14:35 | <rktk> | https://archive.org/search?query=originalurl%3A%28*www.fanforum.com*%29 nothing on archive |
15:14:38 | <rktk> | at least, not as a warc |
15:14:46 | <rktk> | im sure it's in web.archive |
15:15:30 | <pabs> | rktk: no results in the AB job viewer https://archive.fart.website/archivebot/viewer/?q=fanforum.com |
15:15:43 | <rktk> | possibly worthy as a new project? |
15:15:56 | <rktk> | I notice the site loads verrrrry slow. takes a while depending on how old the INDEX of a subforum is |
15:16:04 | <rktk> | talking minutes, not seconds |
15:16:32 | <pabs> | seems like it would be impossible to archive - would kill the site? |
15:16:34 | <@JAA> | That IA search is only really useful for wiki dumps. |
15:16:45 | <pabs> | oh, the front page eventually loaded |
15:16:47 | <@JAA> | Most other items don't have an 'originalurl' metadata field. |
15:19:09 | <rktk> | pabs exactly what I'm talking about |
15:19:15 | <rktk> | Jaa ah only for wiki sites |
15:19:23 | <rktk> | didn't know that was a metadata entry |
15:19:46 | <rktk> | pabs just trying to suss out thoughts on this but yeah it seems the site is basically in some kind of maintenance mode or like, skeleton life support... |
15:19:54 | <rktk> | and yet people still actively post |
15:20:39 | | just1602 joins |
15:21:43 | <@JAA> | Oh wow, that's huge. |
15:21:54 | <pabs> | Threads: 495,274 | Posts: 107,495,875 | Members: 406,871 | Currently Active Users: 2969 (36 members and 2933 guests) |
15:22:11 | <pabs> | vBulletin |
15:22:37 | <pabs> | probably too big for ArchiveBot? |
15:22:38 | <@JAA> | Topic IDs are around 63 million, so enumerating that is out of the question. |
15:22:56 | <just1602> | https://github.com/EFForg/apkeep <= I thought this could be helpful for the archive team if people want to download APKs to archive them |
15:23:30 | <anarcat> | holy crap |
15:24:38 | <@JAA> | Neat, thanks! |
15:26:04 | <@JAA> | pabs: Yeah, with all the extra links everywhere, probably too big. And also, the slow responses would run into the timeout all the time I bet. |
15:28:04 | <pabs> | 2933 guests, hmm I wonder if they are getting hit hard by spidering |
15:42:55 | | shinji257 quits [Client Quit] |
15:46:38 | | etnguyen03 quits [Ping timeout: 252 seconds] |
16:03:11 | <@JAA> | Oh: https://www.fanforum.com/f443/possible-board-closure-discussion-please-read-respond-63272309/ |
16:03:29 | | dumbgoy joins |
16:03:45 | <@JAA> | Just that individual subforum I think, but yeah. |
16:03:55 | <@JAA> | > Fan Forum requires that we average at least 12 posts per day, with lower numbers than that leading to warnings and then possible closure of the board. |
16:06:28 | <fireonlive> | huh. |
16:08:07 | <DigitalDragons> | interesting |
16:22:46 | | railen64 quits [Remote host closed the connection] |
16:24:12 | | Lord_Nightmare quits [Quit: ZNC - http://znc.in] |
16:26:09 | | railen63 joins |
16:27:35 | | katocala quits [Remote host closed the connection] |
16:27:47 | | Lord_Nightmare (Lord_Nightmare) joins |
16:42:01 | | shinji257 (shinji257) joins |
16:42:47 | | shinji257 quits [Client Quit] |
16:43:42 | | shinji257 (shinji257) joins |
16:57:11 | | parfait (kdqep) joins |
17:01:26 | | railen63 quits [Remote host closed the connection] |
17:02:14 | | railen63 joins |
17:14:07 | | parfait_ joins |
17:18:12 | | parfait quits [Ping timeout: 265 seconds] |
17:27:52 | | kdqep__ joins |
17:31:44 | | parfait_ quits [Ping timeout: 265 seconds] |
17:33:34 | | parfait_ joins |
17:37:32 | | kdqep__ quits [Ping timeout: 265 seconds] |
17:59:03 | <nicolas17> | that_lurker: I saw several people wondering if the $28B Cisco paid Splunk was to acquire them or just renewing their license for the year |
18:01:57 | | DogsRNice joins |
18:04:46 | | Wohlstand (Wohlstand) joins |
18:24:51 | | katocala joins |
18:59:35 | | ymgve quits [Quit: Leaving] |
19:07:11 | | katocala is now authenticated as katocala |
19:11:30 | | ymgve joins |
19:21:15 | | nick joins |
19:21:39 | | nick quits [Remote host closed the connection] |
20:34:51 | | shinji257 quits [Client Quit] |
20:50:36 | | Billy549 quits [Remote host closed the connection] |
21:05:22 | | shinji257 (shinji257) joins |
21:14:11 | | hitgrr8 quits [Client Quit] |
21:32:52 | | kdqep__ joins |
21:36:47 | | parfait_ quits [Ping timeout: 265 seconds] |
21:59:38 | | JohnnyJ quits [Client Quit] |
21:59:54 | | JohnnyJ joins |
22:23:29 | | etnguyen03 (etnguyen03) joins |
23:04:11 | | JayEmbee quits [Client Quit] |
23:04:16 | | etnguyen03 quits [Ping timeout: 265 seconds] |
23:08:08 | | AmAnd0A quits [Ping timeout: 265 seconds] |
23:08:41 | | AmAnd0A joins |
23:31:31 | | parfait_ joins |
23:32:32 | | Arcorann (Arcorann) joins |
23:32:46 | | BlueMaxima joins |
23:35:39 | | parfait (kdqep) joins |
23:35:41 | | kdqep__ quits [Ping timeout: 265 seconds] |
23:38:00 | | benjins is now authenticated as benjins |
23:39:04 | | parfait_ quits [Ping timeout: 265 seconds] |
23:45:09 | | etnguyen03 (etnguyen03) joins |
23:51:15 | | AmAnd0A quits [Read error: Connection reset by peer] |
23:51:31 | | AmAnd0A joins |
23:58:20 | | AnotherIki quits [Ping timeout: 252 seconds] |