| 00:20:11 | <LegitSi> | Hey JAA, I don't mean to be too pushy but it's been two days. How's it going with getting StripGenerator in WBM? |
| 00:20:31 | <pokechu22> | Hmm, I'd expect it to be in there by now |
| 00:21:47 | <pokechu22> | I see CDX files for everything related to it on https://archive.org/download/archiveteam_archivebot_go_20230226085206_c2cccc34 which should mean it's finished processing (though it hasn't finished processing that item as a whole; there's a few left near the bottom (starting at www.bytereef.org-inf-20230226-043107-cjsdi-meta.warc.gz)) |
| 00:21:57 | <pokechu22> | Is there a page you expect to exist that doesn't? |
| 00:27:11 | <pokechu22> | hmm, actually, nothing on that seems to be on web.archive.org yet (atari-investisseurs.fr doesn't have an archivebot capture for instance), so maybe it only shows up there once it finishes indexing/generating cdx files for everything? |
| 00:28:37 | <LegitSi> | Got it. If it's that close to completion, I'll be monitoring for it to be done. |
| 00:29:55 | <pokechu22> | I estimate it'll be done within 6 hours, but that's just extrapolating based on how much of it seems to be finished and how long that took |
| 00:30:37 | <LegitSi> | Understood. Thanks :) |
| 00:32:36 | <@JAA> | Yeah, there's a delay between CDX derivation in the item and it actually being added to the WBM index. It can take a couple days if the WBM index is lagging behind, and there were some issues with SPN the other day, so that might still be the case. |
| 00:40:24 | | Hackerpcs quits [Quit: Hackerpcs] |
| 00:43:15 | | Hackerpcs (Hackerpcs) joins |
| 00:48:40 | | hitgrr8 quits [Client Quit] |
| 01:03:03 | | mick joins |
| 01:03:24 | | mick quits [Remote host closed the connection] |
| 01:32:54 | | useretail__ joins |
| 01:35:02 | | useretail_ quits [Ping timeout: 265 seconds] |
| 01:38:45 | | useretail_ joins |
| 01:41:16 | | useretail__ quits [Ping timeout: 252 seconds] |
| 01:51:54 | | yawkat quits [Ping timeout: 252 seconds] |
| 02:16:31 | | luna quits [Client Quit] |
| 03:02:18 | | lennier1 quits [Ping timeout: 252 seconds] |
| 03:03:58 | | fishingforsoup joins |
| 03:05:25 | | fishingforsoup_ quits [Ping timeout: 252 seconds] |
| 03:10:59 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 03:15:56 | | fishingforsoup_ joins |
| 03:17:42 | | fishingforsoup quits [Ping timeout: 252 seconds] |
| 03:46:01 | | fishingforsoup__ joins |
| 03:49:53 | | fishingforsoup_ quits [Ping timeout: 265 seconds] |
| 03:58:27 | | lennier1 (lennier1) joins |
| 04:54:38 | | LegitSi quits [Client Quit] |
| 04:55:29 | | LegitSi joins |
| 05:02:51 | | luna joins |
| 05:25:35 | | LegitSi quits [Ping timeout: 265 seconds] |
| 05:27:52 | | user_ quits [Ping timeout: 252 seconds] |
| 05:51:56 | | yawkat (yawkat) joins |
| 06:20:31 | | Arcorann (Arcorann) joins |
| 06:51:52 | | Island_ quits [Read error: Connection reset by peer] |
| 07:30:19 | | hitgrr8 joins |
| 07:33:29 | <pabs> | seems w3c is doing a website redesign https://beta.w3.org/ https://news.ycombinator.com/item?id=34961568 |
| 07:34:11 | <pabs> | https://beta.w3.org/news/2023/w3c-welcomes-feedback-on-the-beta-of-its-new-website/ |
| 08:03:43 | | Larsenv quits [Quit: ZNC 1.8.2+deb2build5 - https://znc.in] |
| 08:24:15 | | Larsenv (Larsenv) joins |
| 09:53:15 | | datechnoman quits [Quit: The Lounge - https://thelounge.chat] |
| 09:53:55 | | datechnoman (datechnoman) joins |
| 10:35:19 | | lennier1 quits [Ping timeout: 252 seconds] |
| 10:36:23 | | lennier1 (lennier1) joins |
| 11:01:40 | | Larsenv_ (Larsenv) joins |
| 11:01:41 | | Larsenv quits [Client Quit] |
| 11:01:41 | | qwertyasdfuiopghjkl quits [Client Quit] |
| 11:01:49 | | qwertyasdfuiopghjkl joins |
| 11:43:39 | | sonick (sonick) joins |
| 12:01:28 | | dan_a quits [Quit: leaving] |
| 12:01:56 | | dan_a (dan_a) joins |
| 12:43:06 | | Arcorann quits [Ping timeout: 252 seconds] |
| 13:16:06 | | eroc1990 quits [Ping timeout: 252 seconds] |
| 13:22:23 | | VerifiedJ quits [Quit: The Lounge - https://thelounge.chat] |
| 13:22:43 | | VerifiedJ (VerifiedJ) joins |
| 13:23:27 | | eroc1990 (eroc1990) joins |
| 13:27:04 | | eroc1990 quits [Client Quit] |
| 13:27:58 | | eroc1990 (eroc1990) joins |
| 14:08:10 | | luna quits [Ping timeout: 252 seconds] |
| 14:13:54 | | luna joins |
| 14:13:56 | | luna quits [Remote host closed the connection] |
| 14:23:54 | | fl0w_ joins |
| 14:27:58 | | fl0w quits [Ping timeout: 252 seconds] |
| 14:32:14 | | eroc19903 (eroc1990) joins |
| 14:33:28 | | eroc1990 quits [Ping timeout: 252 seconds] |
| 14:49:31 | | fishingforsoup__ quits [Read error: Connection reset by peer] |
| 14:50:07 | | fishingforsoup__ joins |
| 14:51:14 | | igloo22225 quits [Quit: Ping timeout (120 seconds)] |
| 14:51:23 | | igloo22225 (igloo22225) joins |
| 14:52:10 | | njha1 quits [Ping timeout: 252 seconds] |
| 14:52:38 | | jspiros_ (jspiros) joins |
| 14:52:43 | | jspiros quits [Ping timeout: 252 seconds] |
| 14:53:16 | | fangfufu quits [Ping timeout: 252 seconds] |
| 14:53:33 | | fishingforsoup__ quits [Read error: Connection reset by peer] |
| 14:53:41 | | njha1 joins |
| 14:53:58 | | fuzzy8021 quits [Read error: Connection reset by peer] |
| 14:54:24 | | fangfufu joins |
| 14:54:26 | | fangfufu is now authenticated as fangfufu |
| 14:54:43 | | fishingforsoup__ joins |
| 14:55:16 | | eroc19903 quits [Client Quit] |
| 14:55:26 | | nepeat_ (nepeat) joins |
| 14:55:28 | | nepeat quits [Ping timeout: 252 seconds] |
| 14:55:40 | | eroc1990 (eroc1990) joins |
| 14:55:53 | | fuzzy8021 (fuzzy8021) joins |
| 16:27:33 | | fishingforsoup__ quits [Read error: Connection reset by peer] |
| 16:28:06 | | fuzzy8021 quits [Read error: Connection reset by peer] |
| 16:28:12 | | nepeat_ quits [Client Quit] |
| 16:28:38 | | lun4 quits [Quit: Ping timeout (120 seconds)] |
| 16:28:44 | | ave quits [Quit: Ping timeout (120 seconds)] |
| 16:33:38 | | fuzzy8021 (fuzzy8021) joins |
| 16:34:28 | | katocala quits [Ping timeout: 252 seconds] |
| 16:35:06 | | katocala joins |
| 16:35:07 | | Island joins |
| 16:36:02 | | katocala is now authenticated as katocala |
| 16:36:36 | | fishingforsoup joins |
| 16:39:16 | | ave (ave) joins |
| 16:39:23 | | nepeat (nepeat) joins |
| 16:39:23 | | lun4 (lun4) joins |
| 16:56:26 | | Island quits [Remote host closed the connection] |
| 16:56:47 | | Island joins |
| 17:02:19 | | Ketchup901 (Ketchup901) joins |
| 17:12:33 | | upintheairsheep joins |
| 17:15:24 | <upintheairsheep> | I have a strange special case over here, I would like to unpack every RAR file from here http://s1.bitdl.ir/Software/HandHeld/Iphone/ and upload them to the Internet Archive, I've got a solution for the IPA files, but the RAR files have multiple parts, making them really hard to use, also please note that I do not have a proper computer yet, so I |
| 17:15:24 | <upintheairsheep> | can't do this manually, and my phone runs iOS making it really hard to keep background processes open. |
| 17:16:55 | <upintheairsheep> | These RAR files contain old and delisted iOS apps from 2007-2012, and each RAR is 100-500 MB in size. I personally know how rare IPA files are compared to their APK counterparts. |
| 17:22:25 | <upintheairsheep> | lennier1 might be interested in this. |
| 17:33:58 | | upintheairsheep quits [Ping timeout: 265 seconds] |
| 18:19:54 | <pokechu22> | 7-zip can probably do it from the command line, so if you had a proper computer it'd be easy enough to automate |
| 18:21:34 | | CreaZyp154 joins |
| 18:22:20 | | jacksonchen666 (jacksonchen666) joins |
| 18:24:23 | <@JAA> | unar as well. |
| 18:35:50 | | treora quits [Ping timeout: 252 seconds] |
| 18:37:18 | | treora joins |
| 18:37:31 | <CreaZyp154> | vlive.tv is now excluded from the Wayback machine yay... |
| 18:38:47 | <CreaZyp154> | seriously why do they do that, what do they benefit from it ?! |
| 18:40:48 | <@arkiver> | still here https://archive.org/details/archiveteam_vlive |
| 18:40:55 | <@arkiver> | as WARCs though |
| 18:41:41 | <CreaZyp154> | yeah I mean at least it's saved... |
| 18:45:08 | <h2ibot> | JustAnotherArchivist edited List of websites excluded from the Wayback Machine (+33): https://wiki.archiveteam.org/?diff=49503&oldid=49400 |
| 18:45:31 | <CreaZyp154> | What's the progress on fixing #// and other broken projects ? |
| 19:00:11 | <h2ibot> | JAABot edited List of websites excluded from the Wayback Machine (+0): https://wiki.archiveteam.org/?diff=49504&oldid=49503 |
| 19:07:47 | <@arkiver> | CreaZyp154: nothing unfortunately |
| 19:30:22 | <sepro> | Is it safe to assume that the exclusion for vlive.tv was requested by vlive/the parent company? I don't see another reason why the site should be excluded. |
| 19:30:42 | <CreaZyp154> | What happened for it to take so long to fix? Is there like a huge bug or something? |
| 19:38:49 | | lennier1 quits [Client Quit] |
| 19:42:45 | | Gereon6200 (Gereon) joins |
| 19:50:31 | | umgr036 joins |
| 19:51:34 | | fishingforsoup_ joins |
| 19:51:44 | | LegitSi joins |
| 19:55:02 | | fishingforsoup quits [Ping timeout: 252 seconds] |
| 19:58:32 | | jacksonchen666 quits [Client Quit] |
| 20:00:20 | | fishingforsoup_ quits [Remote host closed the connection] |
| 20:00:32 | | fishingforsoup_ joins |
| 20:04:22 | <TheTechRobo> | CreaZyp154: As far as I remember, there's a data-loss bug with the backfeed (which those projects rely on), which allows a project to queue items for itself and other projects |
| 20:05:26 | <CreaZyp154> | oh ok, pretty bad then, losing data is never ok... |
| 20:35:29 | | fishingforsoup_ is now authenticated as fishingforsoup |
| 20:37:40 | | lennier1 (lennier1) joins |
| 20:54:17 | | fl0w joins |
| 20:56:49 | | fl0w_ quits [Ping timeout: 252 seconds] |
| 21:23:03 | | CreaZyp154 quits [Remote host closed the connection] |
| 21:28:21 | <@OrIdow6> | Losing data is sometimes "ok" and often necessary |
| 21:35:39 | <@arkiver> | while we were catching up with the backlog (meaning many more URLs being processed than normal), I observed a few times that no items were queued for tens of minutes. those items all didn't make it to backfeed |
| 21:39:44 | <@arkiver> | (they didn't make it to the bloom filter to be precise) |
| 21:44:21 | <@JAA> | Which at least means that when we come across them again, we'll archive them at that time. Would be way worse if they had been added to the filter but not queued. |
| 21:45:14 | <@arkiver> | i'm mostly worried about page requisites not being archived |
| 21:45:19 | <@arkiver> | images etc. |
| 21:45:39 | <@arkiver> | and about the telegram project, when only part of the posts of a channel are queued to the project |
| 21:47:14 | <@JAA> | Right |
| 21:53:02 | <datechnoman> | The risk would be too high for telegram but it could be tolerated with #// i guess |
| 21:53:06 | <datechnoman> | There would be some loss though |
| 21:53:32 | | @arkiver doesn't like loss |
| 21:53:39 | <@arkiver> | but I understand the arguments here |
| 21:54:09 | <@arkiver> | initially I was hoping that backfeed would be fixed some time soon. it has not however. at some point the decision to stop #// should be revisited perhaps |
| 21:54:13 | <datechnoman> | Oh its totally fine. We like to cover off as much as we can not "half ass" archive bits of things around the plae |
| 21:54:16 | <datechnoman> | place* |
| 21:54:16 | <@arkiver> | (which we're doing now - discussion on this is good) |
| 21:54:51 | <datechnoman> | Personally (not that I have any authority) I would keep telegram halted |
| 21:55:00 | <TheTechRobo> | Do you have any idea of what the issue is? Because then my take would be different than if we don't know that. |
| 21:55:36 | <TheTechRobo> | And yeah, I agree with datechnoman for the record. Though maybe we could scrap backfeed for a bit and upload page requisites somewhere separate and do them later? |
| 21:55:53 | <TheTechRobo> | Or if the issue is only at high load, then putting a ratelimit would be better than a standstill... |
| 22:10:49 | | hitgrr8 quits [Client Quit] |
| 22:14:39 | | BlueMaxima joins |
| 22:41:42 | | TransonicGravity joins |
| 22:55:15 | | TransonicGravity quits [Remote host closed the connection] |
| 22:55:17 | | TransonicGravity joins |
| 23:25:32 | | TransonicGravity quits [Client Quit] |
| 23:46:39 | | h3ndr1k quits [Quit: ] |
| 23:47:02 | | h3ndr1k (h3ndr1k) joins |
| 23:47:38 | | useretail_ quits [Remote host closed the connection] |
| 23:50:18 | | useretail joins |
| 23:59:40 | | Sluggs quits [Ping timeout: 265 seconds] |
| 23:59:57 | | Sluggs joins |
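
The point made around 21:39 to 21:44 (items lost before reaching the bloom filter can still be archived when they are seen again, while items recorded in the filter but never queued would be lost for good) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual backfeed implementation: `ToyBackfeed`, its `submit` method, the two `drop_*` flags, and the example URL are all hypothetical, and a plain Python `set` stands in for the bloom filter.

```python
from collections import deque


class ToyBackfeed:
    """Hypothetical dedup-then-queue pipeline; NOT the real backfeed code.

    A plain set stands in for the bloom filter (assumption for illustration).
    """

    def __init__(self):
        self.seen = set()     # stand-in for the bloom filter
        self.queue = deque()  # items handed to the project

    def submit(self, url, drop_before_filter=False, drop_after_filter=False):
        """Submit a URL, optionally simulating a loss at one of two points."""
        if url in self.seen:
            return "duplicate"  # already recorded, so never queued again
        if drop_before_filter:
            return "lost"       # neither recorded nor queued
        self.seen.add(url)
        if drop_after_filter:
            return "lost"       # recorded but never queued
        self.queue.append(url)
        return "queued"


def demo():
    url = "https://example.org/image.png"  # hypothetical page requisite

    # Case A: lost before the filter; rediscovering the URL queues it.
    a = ToyBackfeed()
    case_a = (a.submit(url, drop_before_filter=True), a.submit(url))

    # Case B: lost after the filter; rediscovering the URL is rejected.
    b = ToyBackfeed()
    case_b = (b.submit(url, drop_after_filter=True), b.submit(url))

    return case_a, case_b


if __name__ == "__main__":
    print(demo())
```

In this model the first case recovers on a later sighting, while the second never does. The remaining concern raised in the discussion still applies to the first case: a lost item is only recovered if something happens to discover it again.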