00:00:53<project10>https://old.reddit.com/r/povertyfinancecanada/comments/11ytjf4/rogers_data_overage_bill_will_make_you_homeless/ continues to this day
00:06:02ell quits [Client Quit]
00:07:01ell (ell) joins
00:09:59ell quits [Client Quit]
00:10:08ell (ell) joins
00:21:10ell quits [Client Quit]
00:23:29ell (ell) joins
00:26:46icedice quits [Client Quit]
00:39:10<fireonlive>oof
00:57:02onetruth joins
01:00:35us3rrr quits [Ping timeout: 252 seconds]
01:19:16etnguyen03 quits [Ping timeout: 265 seconds]
01:21:10HP_Archivist (HP_Archivist) joins
01:24:47etnguyen03 (etnguyen03) joins
01:30:38Dango360 (Dango360) joins
01:48:47<anarcat>some epic crawl, journalmetro.com (d9mh44xbsx92ie1iwf88mk2pn) - i didn't expect it to be so big (and still growing)
01:49:13<anarcat>things seem to be running smoothly though, and might finish in time to keep that thing in IA before the damn thing falls apart like everything else
01:50:40<@JAA>(immature giggles from the back row)
01:52:07<@JAA>They didn't say anything about how long the site would stay up, did they?
01:54:41<anarcat>i haven't followed closely
01:59:52Hackerpcs quits [Quit: Hackerpcs]
02:30:02Hackerpcs (Hackerpcs) joins
02:43:01parfait_ joins
02:44:32etnguyen03 quits [Ping timeout: 252 seconds]
02:47:14parfait quits [Ping timeout: 265 seconds]
03:08:32etnguyen03 (etnguyen03) joins
03:22:18<Ryz>Heya folks, besides the default Warrior project selection, any other Warrior projects that might need attention?
03:23:26<flashfire42>atm telegram and reddit are the 2 with items but they are right now clogged by targets. if you wanna try your luck at Zowa then we could test if it's a ban or if the items that it's trying to push out are indeed bad at this point Ryz
03:31:50<nicolas17>yeah everything seems stalled atm
03:36:13<Ryz>Was pondering on Imgur but hmm o.o;
03:36:33<Ryz>Wouldn't mind running more of the bruteforcer if it needs attention
03:37:55<nicolas17>I have 135 million IDs from the bruteforcer that I still didn't submit into the queue and probably never will
03:38:00<nicolas17>imgur got too large
03:38:54<Ryz>Oof, too much data? :c
03:42:00<nicolas17>we archived 654TB
03:43:15<nicolas17><JAA> The problem is the data size. We already went well past the initial estimate we gave IA.
03:43:17<nicolas17><nicolas17> we're at 650TiB
03:43:18<nicolas17><JAA> Yes, which is more than double what we told IA.
03:44:35<Ryz>...Oo;
03:44:39<Ryz>Aaaaah <#>;
03:44:48<nicolas17><JAA> I feel like the best option going forward that we have is keeping this running continuously MediaFire-style so that we can queue lists of images collected from other crawls
03:44:49<nicolas17><JAA> But I don't see archiving all of Imgur happening anytime soon. Well, not until they're shutting down or doing a severe policy change like deleting images after X days or whatever.
03:46:28<Ryz>Hmm, would it be best to just run the bruteforcer on the remote chance that Imgur may actually shut down or the severe policy change? Collect more of the stuff
03:46:44<@JAA>→ #imgone
03:49:06dumbgoy quits [Ping timeout: 265 seconds]
03:53:47dumbgoy joins
03:58:17dumbgoy quits [Ping timeout: 265 seconds]
04:20:14Aoede_ quits [Ping timeout: 252 seconds]
04:23:28<flashfire42>ok I have to ask. What the fuck is actually connecting to the rsync servers if nobody is actually seeming to connect. If we are all complaining what the fuck is the clog?
04:25:11etnguyen03 quits [Ping timeout: 252 seconds]
04:27:25etnguyen03 (etnguyen03) joins
04:27:32<nicolas17>flashfire42: if you get -1 it means disks are full
04:27:46<nicolas17>and the server is set to maximum 0 concurrent connections and nobody is connecting
04:28:20<flashfire42>So the bottleneck is moving that data to temp storage? or did we already fill that
04:29:26<@JAA>Yes, that is the bottleneck. No, it isn't full, but its capacity is reduced compared to at the beginning.
04:29:52hitgrr8 joins
04:30:19<nicolas17>I have seen continuous rsync errors for hours so it looks stuck full rather than "slow to free up"
04:30:46<nicolas17>unless it's so slow that the hysteresis is making it look stuck
04:34:32monika quits [Ping timeout: 252 seconds]
04:47:52<nicolas17>JAA: should we pause (or greatly rate-limit) projects while targets are full?
04:48:37<nicolas17>especially telegram where people would get reclaims of items that took too long *because* they're stuck uploading
04:50:30etnguyen03 quits [Client Quit]
05:00:02Island quits [Read error: Connection reset by peer]
05:06:04Aoede (Aoede) joins
05:07:12<flashfire42>poor optane9 rewby
05:21:08monika (boom) joins
05:25:06BlueMaxima quits [Read error: Connection reset by peer]
05:30:22shinji257 quits [Client Quit]
06:38:41decky_e joins
06:40:41decky_e_ quits [Ping timeout: 265 seconds]
06:53:08treora quits [Ping timeout: 252 seconds]
06:53:23treora joins
07:03:11decky joins
07:05:20decky_e quits [Ping timeout: 265 seconds]
07:06:15Unholy2361316618085159 quits [Remote host closed the connection]
07:06:36Unholy2361316618085159 (Unholy2361) joins
07:07:40sen__time joins
07:07:50sen__time quits [Remote host closed the connection]
07:12:23decky quits [Ping timeout: 252 seconds]
07:17:35Arcorann (Arcorann) joins
07:27:35decky_e joins
08:36:25iCaotix quits [Client Quit]
08:37:37iCaotix joins
08:43:11iCaotix quits [Client Quit]
08:43:23iCaotix joins
08:52:38iCaotix quits [Client Quit]
08:52:56iCaotix joins
08:55:04JohnnyJ quits [Client Quit]
09:20:35JohnnyJ joins
09:52:01parfait_ quits [Read error: Connection reset by peer]
10:00:01railen63 quits [Remote host closed the connection]
10:00:18railen63 joins
10:29:06treora quits [Remote host closed the connection]
10:29:07treora joins
10:30:10treora quits [Remote host closed the connection]
10:30:12treora joins
12:31:23HP_Archivist quits [Ping timeout: 252 seconds]
13:06:57etnguyen03 (etnguyen03) joins
13:14:17shinji257 (shinji257) joins
13:28:38nic (nic) joins
13:31:02nic9 quits [Ping timeout: 265 seconds]
13:44:10HP_Archivist (HP_Archivist) joins
14:04:36railen64 joins
14:07:46AmAnd0A quits [Ping timeout: 265 seconds]
14:07:50AmAnd0A joins
14:08:15railen63 quits [Ping timeout: 265 seconds]
14:28:15<@JAA>So someone mentioned archiving Doomworld yesterday. Since it's Invision and I only had to replace three lines in my Canucks forums script, I gave it a quick try. Turns out that site is very broken. Quite a lot of topics return 500s: https://www.doomworld.com/forum/topic/721-x/
14:30:14<@JAA>There hasn't been any official announcement in five years, and the sole admin I could see is rarely active. So it could use an archival.
14:31:17AmAnd0A quits [Read error: Connection reset by peer]
14:32:10AmAnd0A joins
14:34:31Island joins
14:44:30<that_lurker>Cisco appears to have bought Splunk
14:44:53<that_lurker>https://www.splunk.com/en_us/blog/leadership/splunk-and-cisco-unite-to-accelerate-digital-resilience-as-one-of-the-leading-global-software-companies.html
14:51:45HP_Archivist quits [Ping timeout: 265 seconds]
14:53:51<fireonlive>that_lurker: im not sure they could make it any more expensive but i’m sure they’re going to try
14:54:48<that_lurker>"Somebody: Splunk has exorbitant prices and locked-in enterprise customers!
14:54:48<that_lurker>Cisco: Oh these guys are just like us. Better buy them up. We know this business."
14:55:28<that_lurker>That and many more fun takes are on the HN https://news.ycombinator.com/item?id=37596497
14:56:11<fireonlive>:3
15:01:27onetruth quits [Client Quit]
15:05:36<that_lurker>Would maybe be a good idea to grab the splunk documentation site https://docs.splunk.com/Documentation
15:12:32Arcorann quits [Ping timeout: 265 seconds]
15:14:29<rktk>https://www.fanforum.com/ is anything from this site archived? there seems to be a LOT of older content there
15:14:35<rktk>https://archive.org/search?query=originalurl%3A%28*www.fanforum.com*%29 nothing on archive
15:14:38<rktk>at least, not as a warc
15:14:46<rktk>im sure it's in web.archive
15:15:30<pabs>rktk: no results in the AB job viewer https://archive.fart.website/archivebot/viewer/?q=fanforum.com
15:15:43<rktk>possibly worthy as a new project?
15:15:56<rktk>I notice the site loads verrrrry slow. takes a while depending on how old the INDEX of a subforum is
15:16:04<rktk>talking minutes, not seconds
15:16:32<pabs>seems like it would be impossible to archive - would kill the site?
15:16:34<@JAA>That IA search is only really useful for wiki dumps.
15:16:45<pabs>oh, the front page eventually loaded
15:16:47<@JAA>Most other items don't have an 'originalurl' metadata field.
15:19:09<rktk>pabs exactly what I'm talking about
15:19:15<rktk>Jaa ah only for wiki sites
15:19:23<rktk>didn't know that was a metadata entry
15:19:46<rktk>pabs just trying to suss out thoughts on this but yeah it seems the site is basically in some kind of maintenance mode or like, skeleton life support...
15:19:54<rktk>and yet people still actively post
15:20:39just1602 joins
15:21:43<@JAA>Oh wow, that's huge.
15:21:54<pabs>Threads: 495,274 | Posts: 107,495,875 | Members: 406,871 | Currently Active Users: 2969 (36 members and 2933 guests)
15:21:55<pabs>
15:22:11<pabs>vBulletin
15:22:37<pabs>probably too big for ArchiveBot?
15:22:38<@JAA>Topic IDs are around 63 million, so enumerating that is out of the question.
15:22:56<just1602>https://github.com/EFForg/apkeep <= I thought this could be helpful for the archive team if people want to download APKs to archive them
15:23:30<anarcat>holy crap
15:24:38<@JAA>Neat, thanks!
15:26:04<@JAA>pabs: Yeah, with all the extra links everywhere, probably too big. And also, the slow responses would run into the timeout all the time I bet.
15:28:04<pabs>2933 guests, hmm I wonder if they are getting hit hard by spidering
15:42:55shinji257 quits [Client Quit]
15:46:38etnguyen03 quits [Ping timeout: 252 seconds]
16:03:11<@JAA>Oh: https://www.fanforum.com/f443/possible-board-closure-discussion-please-read-respond-63272309/
16:03:29dumbgoy joins
16:03:45<@JAA>Just that individual subforum I think, but yeah.
16:03:55<@JAA>> Fan Forum requires that we average at least 12 posts per day, with lower numbers than that leading to warnings and then possible closure of the board.
16:06:28<fireonlive>huh.
16:08:07<DigitalDragons>interesting
16:22:46railen64 quits [Remote host closed the connection]
16:24:12Lord_Nightmare quits [Quit: ZNC - http://znc.in]
16:26:09railen63 joins
16:27:35katocala quits [Remote host closed the connection]
16:27:47Lord_Nightmare (Lord_Nightmare) joins
16:42:01shinji257 (shinji257) joins
16:42:47shinji257 quits [Client Quit]
16:43:42shinji257 (shinji257) joins
16:57:11parfait (kdqep) joins
17:01:26railen63 quits [Remote host closed the connection]
17:02:14railen63 joins
17:14:07parfait_ joins
17:18:12parfait quits [Ping timeout: 265 seconds]
17:27:52kdqep__ joins
17:31:44parfait_ quits [Ping timeout: 265 seconds]
17:33:34parfait_ joins
17:37:32kdqep__ quits [Ping timeout: 265 seconds]
17:59:03<nicolas17>that_lurker: I saw several people wondering if the $28B Cisco paid Splunk was to acquire them or just renewing their license for the year
18:01:57DogsRNice joins
18:04:46Wohlstand (Wohlstand) joins
18:24:51katocala joins
18:59:35ymgve quits [Quit: Leaving]
19:11:30ymgve joins
19:21:15nick joins
19:21:39nick quits [Remote host closed the connection]
20:34:51shinji257 quits [Client Quit]
20:50:36Billy549 quits [Remote host closed the connection]
21:05:22shinji257 (shinji257) joins
21:14:11hitgrr8 quits [Client Quit]
21:32:52kdqep__ joins
21:36:47parfait_ quits [Ping timeout: 265 seconds]
21:59:38JohnnyJ quits [Client Quit]
21:59:54JohnnyJ joins
22:23:29etnguyen03 (etnguyen03) joins
23:04:11JayEmbee quits [Client Quit]
23:04:16etnguyen03 quits [Ping timeout: 265 seconds]
23:08:08AmAnd0A quits [Ping timeout: 265 seconds]
23:08:41AmAnd0A joins
23:31:31parfait_ joins
23:32:32Arcorann (Arcorann) joins
23:32:46BlueMaxima joins
23:35:39parfait (kdqep) joins
23:35:41kdqep__ quits [Ping timeout: 265 seconds]
23:39:04parfait_ quits [Ping timeout: 265 seconds]
23:45:09etnguyen03 (etnguyen03) joins
23:51:15AmAnd0A quits [Read error: Connection reset by peer]
23:51:31AmAnd0A joins
23:58:20AnotherIki quits [Ping timeout: 252 seconds]