| 00:05:27 | | TheTechRobo is now known as OriginalUsername |
| 00:05:34 | | OriginalUsername is now known as TheTechRobo |
| 00:08:38 | <pabs> | aww, one site I wanted to crawl has some links only in comments :( |
| 00:11:49 | | eggdrop quits [Client Quit] |
| 00:13:08 | | eggdrop (eggdrop) joins |
| 00:16:33 | <pabs> | project10: I definitely think AT needs a system for feeding links discovered by #archivebot, #wikibot, #// and other places to different projects. imgur links for eg usually 429 in AB. the mailman/bugzilla/codearchiver/SWH/wikibot/etc projects could also use those auto-discovery services |
| 00:19:02 | <project10> | especially with (seemingly?) longer-running projects now like imgur, reddit, telegram. I assume things like imgur discovered through #// are sent via backfeed to the imgur queue in that case? |
| 00:19:27 | <nicolas17> | I *think* #// and other projects do feed into imgur, but not archivebot |
| 00:20:07 | <nulldata> | https://sh.itjust.works/post/4842435 |
| 00:20:47 | | BearFortress joins |
| 00:20:55 | <project10> | so what, it was based on the metro novels? or just a metro-like idea? |
| 00:21:26 | | eggdrop quits [Ping timeout: 252 seconds] |
| 00:25:19 | | eggdrop (eggdrop) joins |
| 00:45:57 | | KoalaFritto joins |
| 00:46:48 | | KoalaFritto30 joins |
| 00:47:50 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 00:47:50 | | KoalaFritto30 quits [Remote host closed the connection] |
| 00:51:05 | | etnguyen03 (etnguyen03) joins |
| 00:51:17 | | KoalaFritto quits [Ping timeout: 265 seconds] |
| 01:14:00 | | etnguyen03 quits [Ping timeout: 265 seconds] |
| 01:15:02 | | etnguyen03 (etnguyen03) joins |
| 01:26:38 | <h2ibot> | PaulWise edited Bugzilla (+785, add bugzilla-url-list by JAA strategy): https://wiki.archiveteam.org/?diff=50756&oldid=50599 |
| 01:28:41 | | eggdrop quits [Client Quit] |
| 01:30:11 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 01:31:51 | | eggdrop (eggdrop) joins |
| 01:51:22 | | eggdrop quits [Client Quit] |
| 01:55:09 | | eggdrop (eggdrop) joins |
| 02:42:29 | <thuban> | arkiver: i've checked periodically, but i still just get redirects to the shutdown notice. plcp might know more |
| 02:56:29 | | etnguyen03 (etnguyen03) joins |
| 02:58:56 | <h2ibot> | DigitalDragon edited NewsGrabber (+18): https://wiki.archiveteam.org/?diff=50757&oldid=50706 |
| 02:59:38 | <fireonlive> | that works |
| 03:02:46 | <project10> | does AB have a max size per fetched URL? I see the debian.org/releases job fetching netinst ISOs but no others, I assume size limit at play? |
| 03:04:43 | <pabs> | hmm, didn't mean to fetch those |
| 03:22:06 | | krvme joins |
| 03:25:08 | | decagon__ quits [Ping timeout: 252 seconds] |
| 03:34:04 | <thuban> | archivebot jobs for katapult are all done; i will grab the meta files and extract srcset components when they get uploaded (probably tomorrow) |
| 03:42:23 | | dumbgoy quits [Ping timeout: 265 seconds] |
| 03:50:58 | <DogsRNice> | what does ab do with 429 errors? |
| 03:53:09 | <DogsRNice> | im noticing on the empire minecraft job that its not getting imgur links and some of them arent in the wbm, the rest were grabbed by the imgur project already |
| 04:06:33 | | kiryu quits [Ping timeout: 265 seconds] |
| 04:07:31 | <pokechu22> | It retries them twice and then dismisses them, but imgur will never succeed with AB - you'll need to download the meta-warc and send a list of them to the imgur project |
| 04:08:18 | | kiryu (kiryu) joins |
| 04:11:21 | <DogsRNice> | not really sure how to do that lol |
| 04:11:53 | <DogsRNice> | kinda sounds like something that could be automated (not that i know how to do that either) |
| 04:15:15 | | etnguyen03 quits [Ping timeout: 265 seconds] |
| 04:21:46 | | DogsRNice quits [Read error: Connection reset by peer] |
| 04:22:36 | | etnguyen03 (etnguyen03) joins |
| 04:33:34 | | kiryu quits [Remote host closed the connection] |
| 04:35:07 | | kiryu joins |
| 04:35:07 | | kiryu is now authenticated as kiryu |
| 04:35:07 | | kiryu quits [Changing host] |
| 04:35:07 | | kiryu (kiryu) joins |
| 04:51:09 | | etnguyen03 quits [Client Quit] |
| 05:57:14 | | BlueMaxima quits [Read error: Connection reset by peer] |
| 06:22:26 | | Dango360 quits [Read error: Connection reset by peer] |
| 06:24:46 | | nicolas17 quits [Client Quit] |
| 06:51:27 | | railen63 quits [Remote host closed the connection] |
| 07:00:08 | | nfriedly quits [Remote host closed the connection] |
| 07:02:31 | | BigBrain_ quits [Ping timeout: 245 seconds] |
| 07:02:51 | | Arcorann (Arcorann) joins |
| 07:03:18 | | Arcorann quits [Remote host closed the connection] |
| 07:05:03 | | Unholy236131661808515 quits [Remote host closed the connection] |
| 07:07:10 | | Unholy236131661808515 (Unholy2361) joins |
| 07:13:56 | | nulldata quits [Ping timeout: 252 seconds] |
| 07:15:58 | | greg joins |
| 07:16:59 | | nulldata (nulldata) joins |
| 07:21:02 | | greg quits [Remote host closed the connection] |
| 07:24:30 | | Arcorann (Arcorann) joins |
| 07:32:04 | | BigBrain_ (bigbrain) joins |
| 08:04:09 | | Shampoo2140 quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
| 08:04:35 | | Shampoo2140 joins |
| 08:05:48 | | nulldata quits [Ping timeout: 265 seconds] |
| 08:06:16 | | Shampoo2140 quits [Client Quit] |
| 08:07:57 | | Shampoo2140 joins |
| 08:08:29 | | nulldata (nulldata) joins |
| 08:47:51 | | qw3rty joins |
| 09:03:23 | | nulldata quits [Ping timeout: 252 seconds] |
| 09:06:31 | | nulldata (nulldata) joins |
| 09:22:56 | | BigBrain_ quits [Ping timeout: 245 seconds] |
| 09:25:28 | | BigBrain_ (bigbrain) joins |
| 09:35:50 | | gfhh quits [Ping timeout: 252 seconds] |
| 09:41:12 | | bilboed quits [Quit: The Lounge - https://thelounge.chat] |
| 09:41:32 | | bilboed joins |
| 09:45:37 | | Shampoo2140 quits [Client Quit] |
| 09:47:22 | | Shampoo2140 joins |
| 10:02:00 | | igloo22225 quits [Quit: The Lounge - https://thelounge.chat] |
| 10:03:17 | | igloo22225 (igloo22225) joins |
| 10:08:31 | | nfriedly joins |
| 10:14:29 | | Shampoo2140 quits [Client Quit] |
| 10:16:13 | | Shampoo2140 joins |
| 11:01:17 | | icedice (icedice) joins |
| 11:03:26 | | Shampoo2140 quits [Client Quit] |
| 11:03:44 | | Shampoo2140 joins |
| 11:04:55 | | Shampoo2140 quits [Client Quit] |
| 11:06:37 | | Shampoo2140 joins |
| 11:35:22 | | gfhh joins |
| 11:59:03 | | icedice quits [Client Quit] |
| 12:07:59 | | Carnildo_again joins |
| 12:08:39 | | Carnildo quits [Read error: Connection reset by peer] |
| 12:21:00 | | Island quits [Ping timeout: 265 seconds] |
| 12:28:00 | | Megame (Megame) joins |
| 12:38:14 | | etnguyen03 (etnguyen03) joins |
| 12:52:49 | | icedice (icedice) joins |
| 13:11:59 | | gfhh quits [Ping timeout: 252 seconds] |
| 13:12:31 | | gfhh joins |
| 13:29:17 | | JohnnyJ joins |
| 13:41:14 | | Arcorann quits [Ping timeout: 265 seconds] |
| 13:44:30 | | andrew quits [Client Quit] |
| 13:45:32 | | etnguyen03 quits [Ping timeout: 252 seconds] |
| 13:47:08 | | andrew (andrew) joins |
| 13:50:53 | | LeGoupil joins |
| 14:12:45 | | PredatorIWD_ joins |
| 14:16:02 | | PredatorIWD quits [Ping timeout: 265 seconds] |
| 14:17:10 | <h2ibot> | JustAnotherArchivist edited The WARC Ecosystem (+304, /* Tools */ Add ArchiveBox): https://wiki.archiveteam.org/?diff=50758&oldid=50711 |
| 14:39:24 | | Island joins |
| 14:44:57 | <anarcat> | not sure if this is -ot but we might need a watch on bandcamp https://teddydd.me/2023/backup-your-bandcamp-music/ |
| 14:49:00 | | LeGoupil quits [Client Quit] |
| 15:03:53 | | HP_Archivist quits [Ping timeout: 265 seconds] |
| 15:05:58 | | etnguyen03 (etnguyen03) joins |
| 15:13:52 | | Megame quits [Client Quit] |
| 15:13:55 | | icedice quits [Client Quit] |
| 15:18:51 | <TheTechRobo> | Wonder if archivebox could use wget-AT |
| 15:26:21 | | railen63 joins |
| 15:35:29 | | kiryu quits [Remote host closed the connection] |
| 15:36:46 | | kiryu joins |
| 15:36:46 | | kiryu is now authenticated as kiryu |
| 15:36:46 | | kiryu quits [Changing host] |
| 15:36:46 | | kiryu (kiryu) joins |
| 15:41:36 | | icedice (icedice) joins |
| 15:42:22 | <icedice> | <anarcat> not sure if this is -ot but we might need a watch on bandcamp https://teddydd.me/2023/backup-your-bandcamp-music/ |
| 15:42:31 | <icedice> | Reminds me of Amazon Prime Video's bs |
| 15:43:41 | <fireonlive> | TheTechRobo: if it did i would be so happy |
| 15:59:48 | | Dango360 (Dango360) joins |
| 16:08:38 | | kiryu quits [Remote host closed the connection] |
| 16:10:24 | | dumbgoy joins |
| 16:17:09 | <qq44|m> | how can I mirror a site with wget-lua and include all page requisites? |
| 16:17:23 | <qq44|m> | --recursive, --mirror, and --page-requisites don't seem to work |
| 16:20:39 | <imer> | qq44|m: are you using a lua script? (I dont know the solution, I assume wget-lua behaved like regular wget, but with more scripting) |
| 16:21:28 | <qq44|m> | imer: im not using a script, just plain old wgetlua |
| 16:22:40 | <qq44|m> | grab-site seems to work properly with page requisites, but wget doesn't seem to pull them with recursive downloads |
| 16:37:39 | | etnguyen03 quits [Ping timeout: 265 seconds] |
| 17:01:06 | | andrew6 (andrew) joins |
| 17:02:17 | | ferro joins |
| 17:03:16 | | andrew quits [Ping timeout: 265 seconds] |
| 17:03:23 | | andrew6 is now known as andrew |
| 17:03:35 | | ferro quits [Remote host closed the connection] |
| 18:00:57 | <h2ibot> | FireonLive edited Current Projects (-163, Remove NG -- superseded by URLs): https://wiki.archiveteam.org/?diff=50759&oldid=50685 |
| 18:03:41 | | xarph quits [Ping timeout: 265 seconds] |
| 18:27:39 | <pokechu22> | Looking at https://web.archive.org/web/20230000000000*/https://e.orange.fr/error404.html some captures show in blue and some show in orange - I'm pretty sure https://e.orange.fr/error404.html always returns 404, so is there a reason for them being blue? (that page has a ton of captures because any personal page that had a 404 or didn't exist would *redirect* there, and |
| 18:27:42 | <pokechu22> | archivebot doesn't deduplicate redirect targets) |
| 18:28:53 | <qq44|m> | pokechu22: perhaps the server didn't return 404 error code in the headers, and instead returned 200 but said 404 on the page? |
| 18:30:02 | <pokechu22> | Picking a snapshot from april 2 that shows as blue (https://web.archive.org/web/20230402090744/https://e.orange.fr/error404.html) still shows a 404 in my developer tools when loading the page |
| 18:46:30 | <@JAA> | I've found the colours in the calendar to be wildly inaccurate all the time. |
| 18:50:01 | <fireonlive> | calendars, the bane of our existence |
| 18:51:15 | <@JAA> | In other news, my FuzzyMemories.TV grab-site crawl finished. |
| 18:52:05 | | petrichor quits [Quit: ZNC 1.8.2 - https://znc.in] |
| 18:52:08 | <@JAA> | Three /watch/ URLs failed, otherwise it looks fine. |
| 18:52:26 | <imer> | nice |
| 18:53:02 | <@JAA> | 4232 video files from 4761 attempted IDs |
| 18:53:13 | | petrichor (petrichor) joins |
| 18:53:51 | | petrichor quits [Client Quit] |
| 18:54:50 | <@JAA> | Random example of a video where the file is a 404: http://www.fuzzymemories.tv/watch/2276/kiddieland-amusement-park-commercial-1-1990/ |
| 18:55:04 | | petrichor (petrichor) joins |
| 18:55:17 | <fireonlive> | awesome :) |
| 18:55:28 | | jacksonchen666 (jacksonchen666) joins |
| 18:55:36 | | petrichor quits [Client Quit] |
| 18:55:46 | <@JAA> | Total WARC size is 107 GiB. It'll be on its slow way to IA soon. |
| 18:57:03 | | petrichor (petrichor) joins |
| 19:07:03 | | jacksonchen666 quits [Client Quit] |
| 19:07:22 | | etnguyen03 (etnguyen03) joins |
| 19:14:42 | | KoalaFritto joins |
| 19:20:19 | | Island_ joins |
| 19:22:57 | | Island quits [Ping timeout: 265 seconds] |
| 19:23:56 | | erkinalp joins |
| 19:24:57 | | andrew quits [Client Quit] |
| 19:28:05 | | andrew (andrew) joins |
| 19:31:16 | | Carnildo_again is now known as Carnildo |
| 19:32:51 | | leo60228 quits [Quit: ZNC 1.8.2 - https://znc.in] |
| 19:33:14 | | leo60228 (leo60228) joins |
| 19:43:46 | | erkinalp quits [Remote host closed the connection] |
| 19:56:17 | <h2ibot> | JustAnotherArchivist created The Museum of Classic Chicago Television (+611, Created page with "{{Infobox project | URL =…): https://wiki.archiveteam.org/?title=The%20Museum%20of%20Classic%20Chicago%20Television |
| 19:57:18 | <h2ibot> | JustAnotherArchivist created FuzzyMemories.TV (+54, Redirected page to [[The Museum of Classic…): https://wiki.archiveteam.org/?title=FuzzyMemories.TV |
| 19:57:19 | <h2ibot> | JustAnotherArchivist created FuzzyMemoriesTV (+54, Redirected page to [[The Museum of Classic…): https://wiki.archiveteam.org/?title=FuzzyMemoriesTV |
| 20:06:57 | | katocala quits [Remote host closed the connection] |
| 20:08:50 | | givemeawhisper joins |
| 20:09:13 | | givemeawhisper quits [Remote host closed the connection] |
| 20:54:10 | | efeafewa quits [Remote host closed the connection] |
| 21:06:21 | | efeafewa joins |
| 21:09:11 | | shinji257 quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.] |
| 21:12:37 | <h2ibot> | FireonLive edited Reddit (+129, wording fix?): https://wiki.archiveteam.org/?diff=50763&oldid=50722 |
| 21:13:18 | <flashfire42> | https://wiki.archiveteam.org/index.php/ArchiveBot/2019_Australian_federal_election a question do pages like this still work as originally intended? |
| 21:13:42 | <pokechu22> | https://wiki.archiveteam.org/index.php/Special:Contributions/HadeanEon makes me think no |
| 21:13:52 | <@JAA> | No |
| 21:14:31 | <flashfire42> | Alright then. I will have to use the viewer to work out what to put in and what not then. All good. Still good sources of things to throw in. |
| 21:20:01 | | nicolas17 joins |
| 21:41:59 | | DogsRNice joins |
| 21:45:39 | | KoalaFritto quits [Remote host closed the connection] |
| 21:50:15 | | shinji257 (shinji257) joins |
| 21:57:08 | | etnguyen03 quits [Ping timeout: 265 seconds] |
| 22:11:52 | <h2ibot> | JustAnotherArchivist edited The Museum of Classic Chicago Television (+597, Add known archives): https://wiki.archiveteam.org/?diff=50764&oldid=50760 |
| 22:17:54 | | BlueMaxima joins |
| 22:39:46 | | etnguyen03 (etnguyen03) joins |
| 22:57:17 | <fireonlive> | -+rss:#hackernews- Microsoft to kill off third-party printer drivers in Windows: https://www.theregister.com/2023/09/11/go_native_or_go_home/ https://news.ycombinator.com/item?id=37473628 |
| 22:57:19 | <fireonlive> | "To be clear, the end of servicing applies to drivers provided via Windows Update. Manufacturers will, according to Microsoft, "need to provide customers with an alternative means to download and install those printer drivers." Legacy v3 and v4 Windows printer drivers are facing the end of servicing ax." |
| 23:25:07 | <nicolas17> | I have never seen 3rd party drivers updating via Windows Update |
| 23:26:47 | <fireonlive> | looks like this 'Mopria' has existed for a while and more newer printers are using it? |
| 23:38:26 | | nicolas17 quits [Ping timeout: 252 seconds] |
| 23:42:41 | | nicolas17 joins |
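
Editor's sketch, not part of the log: the 04:07-04:11 exchange suggests collecting the imgur links that ArchiveBot could not fetch and handing them to the imgur project, and notes that this could probably be automated. A minimal Python illustration of the collection step follows. It assumes the job's log has already been extracted from the meta-WARC into plain text; the function name `extract_imgur_urls`, the regex, and the sample log lines are all hypothetical, and how the resulting list is actually submitted to the imgur project is not covered here.

```python
import re

# Hypothetical pattern: matches imgur.com and any subdomain (e.g. i.imgur.com)
# in free text. This is an assumption, not a format defined by ArchiveBot.
IMGUR_URL = re.compile(
    r"https?://(?:[a-z0-9-]+\.)?imgur\.com/[^\s\"'<>]+",
    re.IGNORECASE,
)


def extract_imgur_urls(text):
    """Return the unique imgur URLs found in text, in first-seen order."""
    seen = {}
    for match in IMGUR_URL.finditer(text):
        # Drop trailing punctuation that often clings to URLs in logs.
        url = match.group(0).rstrip(".,;)")
        seen.setdefault(url, None)
    return list(seen)


if __name__ == "__main__":
    # Made-up sample lines standing in for an extracted job log.
    sample = (
        "200 https://example.com/thread/1\n"
        "429 https://i.imgur.com/abc123.png\n"
        "429 https://imgur.com/a/XyZ\n"
        "429 https://i.imgur.com/abc123.png\n"
    )
    for url in extract_imgur_urls(sample):
        print(url)
```

The output is one URL per line, deduplicated, which is a convenient shape for pasting or uploading as a list.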