| 00:09:57 | | levomi quits [Ping timeout: 265 seconds] |
| 00:26:05 | <project10> | arkiver: do you mean, when was the service established? |
| 00:32:58 | <project10> | seems to be ~2001, going by the last-mod dates of the earliest pages I can find. The service may have existed under another incarnation, with France Télécom, since 1996 (less certainty on that) |
| 00:33:44 | <thuban> | there are definitely wanadoo.fr user pages from the late 90s, still trying to munge ia cdx data to find the earliest |
| 00:39:16 | <@arkiver> | thanks! |
| 00:39:21 | <@arkiver> | project10: btw, we do have an update |
| 00:39:28 | <@arkiver> | i didn't set the new version yet but will do soon |
| 00:43:57 | <project10> | Thanks arkiver, I should be updating within the next 2 hours or so |
| 01:12:25 | <thuban> | arkiver: oldest known wanadoo captures are late 1998 for both 'perso' and 'pro' (https://web.archive.org/web/19981201033556/http://perso.wanadoo.fr/florent.buttazzoni/urgences.htm; https://web.archive.org/web/19981201035723/http://mairie.wanadoo.fr/f6flv/index.html) |
| 01:23:29 | | sonick (sonick) joins |
| 04:21:24 | | BornOn420 quits [Read error: Connection reset by peer] |
| 04:21:55 | | BornOn420 (BornOn420) joins |
| 05:24:38 | | threedeeitguy39 quits [Client Quit] |
| 05:27:00 | | threedeeitguy39 (threedeeitguy) joins |
| 06:33:25 | <pokechu22> | https://ecole.pagespro-orange.fr/therese.eveilleau.mairie.assoc.mairie.assoc.ecole.mairie.mairie.mairie.ecole.ecole.assoc.assoc.mairie/ - this doesn't seem to have been cleaned up :| |
| 06:35:39 | <pokechu22> | Hmm, it also tried to retrieve http://orange.et.rose.assoc.assoc.ecole.mairie.assoc.ecole.ecole.ecole.assoc.assoc.assoc.ecole.mairie.ecole.assoc.pagespro-orange.fr/ |
| 06:35:47 | <pokechu22> | so I don't think these are being cleaned up if they already exist :| |
| 06:38:28 | <@flashfire42> | pokechu22 the to do is going down not up at least |
| 06:42:16 | | levomi joins |
| 06:46:03 | <thuban> | i believe they are getting cleaned up, it's just taking a while |
| 06:46:57 | <thuban> | that's to say, they'll get retrieved if they're already in the tracker, but neither of those should queue anything new |
| 06:56:25 | <thuban> | (if admins manually purged the tracker and those are new, then yes, problematic, but afaik that hasn't been done...? arkiver only said "this is somewhat annoying to filter out actually") |
| 06:57:55 | | jacksonchen666 (jacksonchen666) joins |
| 06:59:42 | <fireonlive> | 33=302 http://hansi.mairie.assoc.ecole.assoc.mairie.ecole.ecole.mairie.mairie.mairie.ecole.ecole.mairie.ecole.assoc.ecole.ecole.ecole.assoc.assoc.ecole.ecole.ecole.assoc.mairie.ecole.ecole.ecole.ecole.mairie.assoc.pagespro-orange.fr/toiles/un%20long%20dimanche%202.jpg |
| 06:59:42 | <fireonlive> | those french have interesting subdomains |
| 07:07:29 | <project10> | backfeed queue is dropping fast |
| 07:21:22 | <BornOn420> | Are there any valid URLs left? |
| 07:25:08 | <thuban> | a few |
| 07:37:32 | | jacksonchen666 quits [Client Quit] |
| 08:10:59 | <thuban> | per logs and grafana, seems to be all legit again--not sure what that bolus was about |
| 08:15:01 | | shinji257_ (shinji257) joins |
| 09:16:07 | | imer quits [Ping timeout: 265 seconds] |
| 09:16:12 | | yts98 leaves |
| 09:16:29 | | yts98 joins |
| 09:17:04 | | imer (imer) joins |
| 09:21:55 | | imer quits [Ping timeout: 265 seconds] |
| 09:25:20 | | imer (imer) joins |
| 09:29:16 | | imer quits [Read error: Connection reset by peer] |
| 09:48:19 | | imer (imer) joins |
| 09:51:55 | | imer quits [Read error: Connection reset by peer] |
| 09:52:45 | | imer (imer) joins |
| 09:57:56 | | imer quits [Ping timeout: 252 seconds] |
| 09:59:41 | | kallemarc joins |
| 10:05:03 | | imer (imer) joins |
| 10:18:11 | <kallemarc> | Will new tasks be added, or can I switch to another project myself? |
| 10:19:15 | <thuban> | kallemarc: new tasks are being added as links are discovered, but we have enough workers that it's probably safe for you to switch |
| 10:19:37 | | imer quits [Read error: Connection reset by peer] |
| 10:20:26 | <kallemarc> | (y) |
| 10:20:30 | | imer (imer) joins |
| 10:23:25 | | kallemarc leaves |
| 10:31:13 | <plcp> | \ô/ |
| 10:34:16 | <thuban> | indeed |
| 10:35:22 | <thuban> | the ridiculous eta on the remaining claims makes me wonder whether they're held by workers that have been b&, but that's easily remedied |
| 10:39:09 | <BornOn420> | still 17.7M out? seems like a lot |
| 10:45:47 | | imer quits [Ping timeout: 252 seconds] |
| 10:54:01 | | imer (imer) joins |
| 11:00:32 | <nstrom|m> | Huh yeah we got through a lot last night. Not sure if auto reclaim is on or if we need someone to manually move out to redo |
| 11:05:49 | | Exorcism quits [Remote host closed the connection] |
| 11:06:58 | | Exorcism (exorcism) joins |
| 11:35:39 | <nstrom|m> | arkiver ^ |
| 12:24:47 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 12:35:43 | | Exorcism quits [Remote host closed the connection] |
| 12:36:28 | | Exorcism (exorcism) joins |
| 13:41:43 | <@arkiver> | pokechu22: thuban: yes, this is likely because a ton of these URLs were queued previously and they were offloaded in lists "at the end" |
| 13:42:02 | <@arkiver> | which were queued in and quickly moved out, but not fast enough to stop some from ending up with warriors |
| 13:42:29 | <@arkiver> | to try to explain |
| 13:43:02 | <@arkiver> | URLs are queued to a set, where they get mixed together. the set has a maximum size, and we take items out of this set to store on disk in 'offloaded lists' |
| 13:43:29 | <@arkiver> | so, when we saw assoc.mairie.assoc.ecole... etc., it shows there had already been a ton of rounds of queueing of this, the loop went on pretty long |
| 13:44:11 | <@arkiver> | the 'tip of the iceberg' was only visible through what was now in redis, while the majority was in offloaded lists which would only be loaded in again much later |
| 13:44:24 | | shinji257_ leaves |
| 13:44:34 | <@arkiver> | when those were eventually loaded in, we got through them fast as most of the stuff was taken out again |
| 13:45:27 | <thuban> | huh, interesting |
| 13:45:50 | <@arkiver> | i'll do some checks here and there to make sure all went well |
| 13:45:56 | <@arkiver> | claims are being requeued now |
| 13:46:16 | <thuban> | cool cool |
| 13:46:59 | <@arkiver> | paused as we requeue everything |
| 13:48:54 | <thuban> | any idea why things ground to a halt so completely? (naïvely i would have expected banned workers to fail their items due to connect timeouts even if the tracker didn't have a time-to-live configured) |
| 13:52:26 | <@arkiver> | not sure what you mean |
| 13:52:34 | <@arkiver> | because there are no items left to do |
| 13:52:40 | <@arkiver> | but you probably mean something else |
| 13:52:55 | <thuban> | i mean why weren't the out items being returned |
| 13:54:03 | <@arkiver> | because i never set for that to happen |
| 13:54:09 | <@arkiver> | you mean reclaimed? |
| 13:54:12 | <@arkiver> | or |
| 13:54:20 | <@arkiver> | those that were not returned failed for some reason |
| 13:58:00 | <thuban> | i thought that items with connection issues would be aborted by their workers (without the tracker needing to reclaim them) |
| 13:59:52 | <@arkiver> | thuban: no, items are in claims when they are claimed. we can enable some auto-reclaiming with a timeout, or not |
| 13:59:56 | <@arkiver> | that was not set in this case |
| 14:01:04 | <thuban> | huh, i see. so aborting an item doesn't report anything to the tracker, it just punts and assumes the tracker will figure it out later? |
| 14:03:06 | <@arkiver> | no |
| 14:03:11 | <@arkiver> | yes |
| 14:03:18 | <@arkiver> | the tracker doesn't get informed about aborts |
| 14:03:54 | <thuban> | makes sense i guess, since broken items may require other manual intervention like code changes |
| 14:05:47 | <thuban> | ty for explaining |
| 14:06:30 | <@arkiver> | thanks! |
| 14:06:45 | <@arkiver> | as for if it makes sense - i don't know, it's just how it currently is done, not with a very good reason in mind |
| 14:07:16 | <@arkiver> | we could probably think of a reason why we would want to do it the current way, but there is no good reason we do it this way - it's just being done this way |
| 14:07:31 | <thuban> | that also makes sense :P |
| 14:39:10 | <project10> | tracker 500s, that's new |
| 15:40:41 | | Flo99 joins |
| 15:44:11 | | magmaus3 quits [Quit: Ping timeout (120 seconds)] |
| 15:44:49 | | magmaus3 (magmaus3) joins |
| 15:58:26 | | BornOn420 quits [Client Quit] |
| 16:16:33 | | Exorcism quits [Remote host closed the connection] |
| 16:17:19 | | Exorcism (exorcism) joins |
| 16:27:15 | | sonick quits [Client Quit] |
| 16:44:11 | <pokechu22> | Archiving item url:http://leclairefontaine.pagesperso-orange.fr/1/'http://perso.orange.fr/leclairefontaine/cariboost1/' |
| 16:45:06 | <pokechu22> | doesn't seem to exist at all, not sure where that came from |
| 17:06:53 | <@arkiver> | pokechu22: yeah i think we'll eventually be left with these types of problematic URLs |
| 17:06:58 | <@arkiver> | i'll go through them soon |
| 17:43:23 | | BornOn420 (BornOn420) joins |
| 17:49:02 | | thuban quits [Read error: Connection reset by peer] |
| 17:49:35 | | thuban joins |
| 18:21:44 | | @flashfire42 quits [Ping timeout: 252 seconds] |
| 18:22:17 | | kiska quits [Ping timeout: 265 seconds] |
| 18:29:17 | | flashfire42 joins |
| 18:30:21 | | kiska (kiska) joins |
| 18:36:05 | | fireonlive quits [Quit: Connection gently closed by peer] |
| 18:37:00 | | fireonlive (fireonlive) joins |
| 18:51:31 | | project10 quits [Remote host closed the connection] |
| 18:53:19 | | project10 (project10) joins |
| 19:01:31 | | fireonlive quits [Client Quit] |
| 19:02:24 | | fireonlive (fireonlive) joins |
| 19:31:53 | | Exorcism quits [Remote host closed the connection] |
| 19:31:53 | | yts98 leaves |
| 19:32:10 | | yts98 joins |
| 19:32:41 | | Exorcism (exorcism) joins |
| 20:02:31 | | flashfire42 is now authenticated as flashfire42 |
| 20:02:31 | | @ChanServ sets mode: +o flashfire42 |
| 20:02:38 | <@flashfire42> | https://server8.kiska.pw/uploads/0d2d5e1feae4eede/image.png |
| 20:02:46 | <@flashfire42> | I am no expert but it's not meant to do that |
| 20:33:59 | <@arkiver> | flashfire42: ah |
| 20:39:59 | <@arkiver> | fixed |
| 20:40:03 | <@arkiver> | and forced the new version |
| 21:14:11 | <@arkiver> | DLoader: project10 FYI update is in |
| 21:17:47 | | kalle joins |
| 21:22:28 | <fireonlive> | did they end up raising the rate limits? |
| 21:25:12 | <@flashfire42> | https://transfer.archivete.am/LNLvH/range.frsommaire.htmurlhttpscpa25.p.txt |
| 21:25:41 | <@flashfire42> | arkiver ton of errors just now coming through |
| 21:29:18 | <phaeton> | yeah, I'm seeing the same |
| 21:30:00 | <flashfire42|m> | I have to go do school drop off but I’m assuming someone can push a fix |
| 21:33:16 | | yts98 leaves |
| 21:33:34 | | yts98 joins |
| 21:44:24 | | Flo99 quits [Remote host closed the connection] |
| 21:50:19 | <@flashfire42> | yeah ok looks like almost all of them are failing arkiver |
| 22:05:29 | | Exorcism quits [Remote host closed the connection] |
| 22:06:48 | | Exorcism (exorcism) joins |
| 22:12:31 | | LukeMax joins |
| 22:12:48 | <LukeMax> | anyone else experiencing download problems from ppo? |
| 22:18:18 | <pokechu22> | If you're seeing "Exception: Unknown item" that's happening to me too |
| 22:42:34 | | Exorcism quits [Remote host closed the connection] |
| 22:43:21 | | Exorcism (exorcism) joins |
| 22:47:28 | <@flashfire42> | *pokes arkiver * |
| 22:49:37 | <@arkiver> | fix coming |
| 22:49:50 | <@flashfire42> | neat |
| 22:56:06 | <@arkiver> | fixed |
| 23:03:41 | <LukeMax> | yeah it's exception unknown item |
| 23:04:17 | <@arkiver> | it's fixed with latest version |
| 23:04:39 | <LukeMax> | ok ill try it |
| 23:04:49 | <LukeMax> | wait what latest version |
| 23:07:15 | <LukeMax> | nvm |
| 23:07:47 | <LukeMax> | thank you perso works |
| 23:07:59 | | LukeMax quits [Remote host closed the connection] |
| 23:26:53 | | Exorcism quits [Remote host closed the connection] |
| 23:27:39 | | Exorcism (exorcism) joins |
| 23:50:18 | | kalle quits [Remote host closed the connection] |
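
For readers following arkiver's explanation at 13:43 to 13:44, here is a rough toy model of why only the "tip of the iceberg" was visible. It is a sketch under assumptions, not the tracker's actual code: the class `OffloadingQueue`, its methods, the `max_live` cap and the `is_looping` check are all hypothetical names made up for illustration, the "offloaded lists" are kept in memory rather than on disk, and items are offloaded in sorted order only to keep the example deterministic (the log says the real set mixes items together).

```python
from collections import deque

# Labels seen repeating in the looping hostnames quoted in the log.
LOOP_LABELS = {"assoc", "ecole", "mairie"}


def is_looping(url):
    """Hypothetical check: more than one loop label in the hostname."""
    host = url.split("//", 1)[-1].split("/", 1)[0]
    return sum(label in LOOP_LABELS for label in host.split(".")) > 1


class OffloadingQueue:
    """Toy model: a size-capped live set plus 'offloaded lists'."""

    def __init__(self, max_live):
        self.max_live = max_live   # assumed cap on the live set
        self.live = set()          # stands in for the set held in redis
        self.offloaded = deque()   # stands in for lists stored on disk

    def queue(self, urls):
        self.live.update(urls)
        excess = len(self.live) - self.max_live
        if excess > 0:
            # Move the overflow out of the live set as one offloaded list.
            batch = sorted(self.live)[-excess:]
            self.live.difference_update(batch)
            self.offloaded.append(batch)

    def purge_live(self, is_bad):
        """Drop bad URLs from the live set only; offloaded lists stay as-is."""
        bad = {url for url in self.live if is_bad(url)}
        self.live -= bad
        return len(bad)

    def load_next(self):
        """Load the oldest offloaded list back into the live set."""
        if self.offloaded:
            self.live.update(self.offloaded.popleft())


def demo():
    q = OffloadingQueue(max_live=3)
    good = [f"http://site{i}.pagespro-orange.fr/" for i in range(2)]
    bad = [f"http://x{i}.assoc.ecole.mairie.pagespro-orange.fr/" for i in range(6)]
    q.queue(good + bad)
    first = q.purge_live(is_looping)    # only sees what is live right now
    q.load_next()                       # offloaded items come back later
    second = q.purge_live(is_looping)   # the rest can only be cleaned now
    return first, second, len(q.live)


if __name__ == "__main__":
    print(demo())
```

In this toy run the first purge catches only one of the six looping URLs, because the other five are sitting in an offloaded list. They can only be filtered once that list is loaded back in, which matches the late burst of bad items described in the log.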