00:05:06BlueMaxima joins
00:15:00etnguyen03 quits [Client Quit]
00:29:43@Sanqui quits [Quit: .]
00:34:21qwertyasdfuiopghjkl2 joins
00:34:58qwertyasdfuiopghjkl2 leaves
00:35:44qwertyasdfuiopghjkl2 joins
00:37:13qwertyasdfuiopghjkl2 leaves
00:37:42<immibis>TheTechRobo: I think they are mostly concerned about layer 7 transparent proxies and similar. NAT is allowed, so if your fancy setup amounts to a NAT, my judgement says it's allowed... my judgement also doesn't count for anything.
00:38:20<pabs>katia: see the wiki page I'm maintaining https://wiki.archiveteam.org/index.php/SmolNet
00:39:31<pabs>katia: short answer: it's not possible in a standards-compliant way, but others have already done it without updating the WARC standard, and I have done it using HTTP to SmolNet proxies, but the AB jobs failed both times: one crashed, and the other was killed due to pipeline disappearance
00:40:01<pabs>kokos: ^
00:40:40<pabs>see also the Gopher wiki page
00:40:47etnguyen03 (etnguyen03) joins
00:41:55<h2ibot>PaulWise created Gemini (+21, create page): https://wiki.archiveteam.org/?title=Gemini
00:41:56<h2ibot>PaulWise edited Regex Rodeo (-9, update redirect): https://wiki.archiveteam.org/?diff=53843&oldid=53827
00:42:55<h2ibot>PaulWise edited ArchiveBot/Regex rodeo (-9, update redirect): https://wiki.archiveteam.org/?diff=53844&oldid=53825
00:44:29<immibis>given the pedantry around WARCs when applied to HTTP i would have expected WARC should be updated to support Gopher/Gemini/HTTP2 exchanges and then the exchanges should be stored in the file without modification. anything else seems to defeat the point of the pedantry, no?
00:53:07<pabs>I've no idea about WARC. currently it doesn't support anything but HTTP 1.1 I thought
00:53:30<pabs>the wiki page includes some links to WARC standard stuff about Gopher/etc too
00:53:57<h2ibot>Switchnode edited CuriousCat (+44, add data link): https://wiki.archiveteam.org/?diff=53845&oldid=53572
00:55:31<pabs>oh I forgot to add them
00:55:46<pabs>https://github.com/iipc/warc-specifications/issues/87
00:56:16<pabs>ah no, third paragraph
00:56:18<pabs>https://github.com/iipc/warc-specifications/issues/85
00:56:26<pabs>https://github.com/iipc/warc-specifications/issues/42
00:59:40mls quits [Ping timeout: 260 seconds]
01:16:38mls (mls) joins
01:25:55tbc1887 quits [Ping timeout: 260 seconds]
01:29:49<that_lurker>https://img.kuhaon.fun/u/BfK6vP.mp4
01:33:20that_lurker forgot to add context
01:33:56<that_lurker>CNN tiktok about researchers rushing to catalogue and save scientific work post-election
01:34:13<@OrIdow6>Namely about the IA (no surprise in the information given there) and "EDGI"
01:35:18<@OrIdow6>Also contains what I think is a really old photo of Brewster Kahle at 0:44?
01:39:33<pabs>heh "engineers"
01:49:23Sanqui joins
01:49:25Sanqui quits [Changing host]
01:49:25Sanqui (Sanqui) joins
01:49:25@ChanServ sets mode: +o Sanqui
02:11:25<@OrIdow6>Parallel now thinks it'll be 4 days to do my (limited) scan
02:21:49Hackerpcs quits [Quit: Hackerpcs]
02:23:32Hackerpcs (Hackerpcs) joins
02:23:44DopefishJustin quits [Remote host closed the connection]
02:28:20pokechu22 quits [Ping timeout: 260 seconds]
02:29:30Hackerpcs quits [Ping timeout: 260 seconds]
02:31:51Hackerpcs (Hackerpcs) joins
02:37:05Hackerpcs quits [Ping timeout: 260 seconds]
02:41:31Hackerpcs (Hackerpcs) joins
03:09:45decky_e quits [Ping timeout: 260 seconds]
03:26:56datechnoman quits [Quit: The Lounge - https://thelounge.chat]
03:27:40datechnoman (datechnoman) joins
03:29:34BlueMaxima quits [Read error: Connection reset by peer]
03:36:55etnguyen03 quits [Remote host closed the connection]
03:50:33Matthww quits [Quit: Ping timeout (120 seconds)]
03:52:05Matthww joins
04:09:26Matthww quits [Client Quit]
04:10:59Matthww joins
04:28:59Island quits [Read error: Connection reset by peer]
04:47:41Island joins
04:49:21<@OrIdow6>Looking thru some of the CDXs in WARCs which are accessible to me I find that hashes consisting of all 'A's occur about 7 times as often as those consisting of e.g. 1 byte and then all As after that
04:50:21<@JAA>That sounds about right for the base32 strings.
04:50:31<@JAA>Should be a factor 8, actually.
04:50:43<@OrIdow6>Ahh forgot about base32
04:52:09<@OrIdow6>Explains it
04:52:21<@JAA>Hashes with all As should be about as common as hashes with two arbitrary base32 chars and then all As.
04:52:45<@JAA>(Actually not quite arbitrary, but I'm too lazy to figure out the possible value for the second char.)
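The factor of 8 above can be checked with a small sketch. It assumes, as a simplified model that the log itself does not spell out, that an affected SHA-1 digest keeps only the bytes before its first NUL byte and is zero-padded back to 20 bytes before base32 encoding. The helper names `b32_digest` and `truncate_at_nul` are hypothetical and only serve this illustration.

```python
import base64

DIGEST_LEN = 20  # SHA-1 digests are 20 bytes, i.e. exactly 32 base32 chars


def b32_digest(raw: bytes) -> str:
    """Base32-encode a raw digest the way CDX files usually show it."""
    return base64.b32encode(raw).decode("ascii")


def truncate_at_nul(digest: bytes) -> bytes:
    """Assumed model: keep bytes before the first NUL, zero-pad the rest."""
    head = digest.split(b"\x00", 1)[0]
    return head.ljust(len(digest), b"\x00")


# A digest whose first byte is NUL collapses to 32 'A' characters.
all_as = b32_digest(truncate_at_nul(bytes([0, 1, 2]) + bytes(DIGEST_LEN - 3)))

# One surviving byte spans the first two base32 chars (5 bits + 3 bits).
# The second char is 'A' only when the low 3 bits of that byte are zero.
one_char_first_bytes = [
    b for b in range(1, 256)
    if b32_digest(bytes([b]) + bytes(DIGEST_LEN - 1))[1] == "A"
]

# Possible values of the second char when exactly one byte survives.
second_chars = {
    b32_digest(bytes([b]) + bytes(DIGEST_LEN - 1))[1] for b in range(1, 256)
}

# All-As needs first byte == 0 (1 value out of 256); "one char then As"
# needs one of len(one_char_first_bytes) values followed by a NUL byte.
ratio = 256 / len(one_char_first_bytes)

print(all_as)
print(len(one_char_first_bytes), sorted(second_chars), round(ratio, 2))
```

Under this model, 31 first-byte values give a single non-A character, so all-As hashes come out roughly 8 times as common as that class, while the "two chars then all As" class (255 first-byte values) is about as common as all-As.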
04:53:00Wohlstand quits [Ping timeout: 260 seconds]
05:19:50ArchivalEfforts quits [Ping timeout: 260 seconds]
05:20:29ArchivalEfforts joins
05:21:15Commander001 quits [Read error: Connection reset by peer]
05:21:28Commander001 joins
05:21:59AlsoHP_Archivist joins
05:24:30HP_Archivist quits [Ping timeout: 260 seconds]
05:26:54wickedplayer494 quits [Ping timeout: 252 seconds]
05:27:45wickedplayer494 joins
05:28:35Commander001 quits [Ping timeout: 260 seconds]
05:28:52<h2ibot>JustAnotherArchivist edited ArchiveBot/Ignore (+560, /* Drupal */ Include basePath and fix profiles): https://wiki.archiveteam.org/?diff=53846&oldid=53832
05:28:57<@JAA>c3manu: ^
05:35:54<h2ibot>JustAnotherArchivist edited ArchiveBot/Ignore (+572, /* Drupal */ Add Backdrop CMS): https://wiki.archiveteam.org/?diff=53847&oldid=53846
05:38:42Guest54 quits [Quit: My MacBook has gone to sleep. ZZZzzz…]
05:38:49<@arkiver>JAA: IA recalculates yes
05:39:07Guest54 joins
05:48:05<@arkiver>OrIdow6: on the SHA1 collision attacks, likely yes. (unrelated to the issue Wget-AT had of course)
05:58:29wessel15126 joins
05:58:51wessel1512 quits [Read error: Connection reset by peer]
05:58:51wessel15126 is now known as wessel1512
06:08:51ArchivalEfforts quits [Client Quit]
06:09:01ArchivalEfforts joins
06:49:29pixel (pixel) joins
06:50:01<@arkiver>OrIdow6: JAA: doing a scan similar to what OrIdow6 is doing, but on all items
06:50:14<@arkiver>also those not reachable outside
06:50:20<@arkiver>using the CDX GZ files
06:52:54<thuban>JAA: the archivebot job for forum.pclab.pl (shuts down 29/30 november) definitely will not finish in time, partly because of size but also because of the enumeration issues (i thought user pages could get us around this but they can't).
06:53:15<thuban>can you get a qwarc job going / would it be helpful if i tried writing a spec file?
07:05:28qwertyasdfuiopghjkl quits [Ping timeout: 255 seconds]
07:05:50Unholy2361924645377131 (Unholy2361) joins
07:07:40<@arkiver>OrIdow6: JAA: looking into requeuing items to projects that are still up
07:19:13<@JAA>thuban: Right, will get that started this week. I've done Invision that way before, so can mostly copy the spec file from one of those.
07:22:07<thuban>ok, cool. unfortunately the php error pages have status 200, so there'll need to be a specific check for that (i don't _think_ it can happen on thread pages, but probably safest to do it everywhere)
07:22:43<@JAA>Right
07:22:54<@JAA>Do you have an example of a page that always fails?
07:24:08<@JAA>I do always have a check whether the expected content is in the response (in this case, whether there are posts on a thread page), but it normally just generates a warning rather than rejecting the response and retrying.
07:24:44<@JAA>If it's not something that fixes itself within a minute though, I guess it doesn't matter.
07:25:03<thuban>no, sorry. a lot of them are frequent but i don't think any are 100% consistent
07:28:14loug8318142 joins
07:39:10pixel leaves
07:51:41pixel (pixel) joins
07:57:14Wohlstand (Wohlstand) joins
08:02:03<@arkiver>first doing a general scan for all SHA1s ending with a NUL byte, then doing a second scan over those results to get the items these revisit records belong to
08:05:02pixel leaves
08:13:49Island quits [Read error: Connection reset by peer]
08:16:51wessel1512 quits [Ping timeout: 252 seconds]
08:37:54tek_dmn quits [Quit: ZNC - https://znc.in]
08:38:11tek_dmn (tek_dmn) joins
08:44:43<@arkiver>the queuing bot is back up!
08:44:56<@arkiver>feel free to queue whatever was not picked up
08:45:14<@JAA>qubert++
08:45:15<eggdrop>[karma] 'qubert' now has 1 karma!
08:45:23<@arkiver>i will also take some time today or tomorrow to go through my logs and find !a commands that have not been run yet
08:45:28<@arkiver>first karma!
08:45:51<@arkiver>also let's release this
08:46:09<@arkiver>i like qubert
08:46:16<@arkiver>the long version is Quantum BERT
09:04:11<@arkiver>no opinions on Quantum BERT? :P
09:04:45<@arkiver>if it's a horrible idea, tell me... and if it's a great horrible idea, also tell me :P
09:05:51<@JAA>We can call it that once it runs on a quantum computer and processes items in parallel. :-P
09:06:46<@arkiver>it processes in parallel
09:07:10<@JAA>But not in the quantum computing sense. :-)
09:07:42<@JAA>(I don't know whether this makes any sense and should probably be in bed anyway.)
09:07:47<@arkiver>it will... we're just a bit early with the name
09:52:09Wohlstand quits [Client Quit]
10:07:33Wohlstand (Wohlstand) joins
10:18:13<Vokun>qubert++
10:18:13<eggdrop>[karma] 'qubert' now has 2 karma!
10:21:17qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
10:30:27sralracer (sralracer) joins
10:51:21Wohlstand quits [Client Quit]
11:05:49ducky_ (ducky) joins
11:06:13ducky quits [Ping timeout: 260 seconds]
11:06:24ducky_ is now known as ducky
11:10:00LomanicOld|m joins
11:11:28<pabs>JAA: re AB monitoring for puu.sh, can you add it to the AB/Monitoring wiki in a new ideas section?
11:14:58<h2ibot>JustAnotherArchivist edited ArchiveBot/Monitoring (+74, Add puush as an idea): https://wiki.archiveteam.org/?diff=53848&oldid=53810
11:33:45Naruyoko5 quits [Ping timeout: 252 seconds]
12:00:06Bleo182600722719623 quits [Quit: The Lounge - https://thelounge.chat]
12:02:37qwertyasdfuiopghjkl2 joins
12:02:48Bleo182600722719623 joins
12:04:42qwertyasdfuiopghjkl2 quits [Client Quit]
12:05:17decky_e joins
12:05:52qwertyasdfuiopghjkl2 joins
12:28:27<imer>arkiver++
12:28:28<eggdrop>[karma] 'arkiver' now has 37 karma!
12:33:07qwertyasdfuiopghjkl2 quits [Client Quit]
12:33:50f_ quits [Ping timeout: 260 seconds]
12:34:15qwertyasdfuiopghjkl2 joins
12:34:24qwertyasdfuiopghjkl2 leaves
12:34:55qwertyasdfuiopghjkl2 joins
12:36:02f_ (funderscore) joins
12:38:29SkilledAlpaca41896 quits [Quit: SkilledAlpaca41896]
12:39:58SkilledAlpaca41896 joins
12:44:01qwertyasdfuiopghjkl2 quits [Excess Flood]
12:45:15Sluggs quits [Ping timeout: 252 seconds]
12:48:34Sluggs joins
12:50:01qwertyasdfuiopghjkl2 joins
12:57:01qwertyasdfuiopghjkl2 quits [Client Quit]
12:58:41qwertyasdfuiopghjkl2 joins
12:59:40qwertyasdfuiopghjkl2 leaves
13:03:23qwertyasdfuiopghjkl2 joins
13:11:39qwertyasdfuiopghjkl2 quits [Client Quit]
13:12:15qwertyasdfuiopghjkl2 joins
13:12:27qwertyasdfuiopghjkl2 quits [Excess Flood]
13:14:20qwertyasdfuiopghjkl2 joins
13:14:23qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
13:14:37qwertyasdfuiopghjkl2 joins
13:14:37qwertyasdfuiopghjkl2 quits [Excess Flood]
13:16:54<@arkiver>thanks imer :)
13:18:04qwertyasdfuiopghjkl2 joins
13:18:04qwertyasdfuiopghjkl2 quits [Excess Flood]
13:20:31qwertyasdfuiopghjkl2 joins
13:20:34qwertyasdfuiopghjkl2 quits [Max SendQ exceeded]
13:22:41qwertyasdfuiopghjkl2 joins
13:28:06qwertyasdfuiopghjkl2 quits [Client Quit]
13:45:39<f_>arkiver++
13:45:39<eggdrop>[karma] 'arkiver' now has 38 karma!
13:46:07qwertyasdfuiopghjkl2 joins
13:46:07qwertyasdfuiopghjkl2 quits [Excess Flood]
13:47:34qwertyasdfuiopghjkl2 joins
13:47:34qwertyasdfuiopghjkl2 quits [Excess Flood]
14:00:34f_ is now known as funderscore
14:01:40funderscore is now known as f_
14:05:16qwertyasdfuiopghjkl2 joins
14:09:20qwertyasdfuiopghjkl2 leaves
14:12:46qwertyasdfuiopghjkl2 joins
14:13:26qwertyasdfuiopghjkl2 leaves
14:14:08qwertyasdfuiopghjkl2 joins
14:21:23qwertyasdfuiopghjkl2 quits [Client Quit]
14:24:37qwertyasdfuiopghjkl2 joins
14:26:06qwertyasdfuiopghjkl2 quits [Client Quit]
14:28:24qwertyasdfuiopghjkl2 joins
14:30:02vix5110_ joins
14:34:55qwertyasdfuiopghjkl2 quits [Client Quit]
14:58:20qwertyasdfuiopghjkl2 joins
15:03:14qwertyasdfuiopghjkl2 quits [Client Quit]
15:03:22qwertyasdfuiopghjkl2 joins
15:03:41qwertyasdfuiopghjkl2 quits [Client Quit]
15:03:49qwertyasdfuiopghjkl2 joins
15:06:39qwertyasdfuiopghjkl2 quits [Client Quit]
15:06:57qwertyasdfuiopghjkl2 joins
15:15:57katocala quits [Ping timeout: 252 seconds]
15:16:43katocala joins
15:21:26qwertyasdfuiopghjkl2 leaves
15:21:43AlsoHP_Archivist quits [Quit: Leaving]
15:21:57HP_Archivist (HP_Archivist) joins
15:23:54Muad-Dib quits [Quit: ZNC - http://znc.in]
15:24:18qwertyasdfuiopghjkl2 joins
15:29:15sludge quits [Remote host closed the connection]
15:29:16DopefishJustin joins
15:29:28sludge joins
15:46:47qwertyasdfuiopghjkl2 quits [Client Quit]
15:51:24qwertyasdfuiopghjkl2 joins
15:57:46Webuser618972 joins
15:58:54Webuser618972 quits [Client Quit]
16:07:20katocala quits [Ping timeout: 260 seconds]
16:08:07katocala joins
16:15:54qwertyasdfuiopghjkl2 quits [Excess Flood]
16:18:41qwertyasdfuiopghjkl2 joins
16:35:11Muad-Dib joins
16:45:15Doranwen quits [Ping timeout: 260 seconds]
16:46:18Doranwen (Doranwen) joins
16:51:05Naruyoko5 joins
17:00:48<c3manu>JAA++
17:00:48<eggdrop>[karma] 'JAA' now has 170 karma!
17:03:16<that_lurker>arkiver++
17:03:16<eggdrop>[karma] 'arkiver' now has 39 karma!
17:03:23<that_lurker>JAA++
17:03:23<eggdrop>[karma] 'JAA' now has 171 karma!
17:03:28<that_lurker>c3manu++
17:03:29<eggdrop>[karma] 'c3manu' now has 49 karma!
17:07:36<@OrIdow6>arkiver: I'll just go with your scan then, because you have access to these things + are closer to the data
17:07:48<@OrIdow6>What are your plans for what you're going to do with the results?
17:09:38ducky quits [Ping timeout: 260 seconds]
17:11:07ducky (ducky) joins
17:18:36HP_Archivist quits [Ping timeout: 252 seconds]
17:25:40<nicolas17>h2ibot: wb
17:28:23<nicolas17>anyone queueing stuff from IRC logs? I don't think I have complete-enough logs to do that myself
17:40:28HP_Archivist (HP_Archivist) joins
18:27:44khaoohs joins
18:27:47<kiska>nicolas17: I have done #frogger #pastalavista #imgone and #mediaonfire but #down-the-tube has an error and I am waiting for arkiver to fix that before resuming queuing from my logs, and I don't have +v or +o in #telegrab
18:28:34<kiska>But #telegrab requires more... careful selection from the logs due to new restrictions
18:51:28<h2ibot>Manu edited Discourse/archived (+91, Grabbing https://discourse.joplinapp.org/): https://wiki.archiveteam.org/?diff=53849&oldid=53839
19:01:38HP_Archivist quits [Read error: Connection reset by peer]
19:02:02HP_Archivist (HP_Archivist) joins
19:22:00<@JAA>kiska++
19:22:01<eggdrop>[karma] 'kiska' now has 7 karma!
19:30:46pabs quits [Read error: Connection reset by peer]
19:31:59pabs (pabs) joins
20:30:23Commander001 joins
21:00:18matoro quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
21:00:38matoro joins
21:54:11vix5110_ quits [Quit: Ooops, wrong browser tab.]
21:56:37Island joins
22:27:14simon8162 quits [Quit: ZNC 1.9.1 - https://znc.in]
22:30:51simon816 (simon816) joins
22:32:37etnguyen03 (etnguyen03) joins
22:40:13ducksauce joins
22:40:34ducksauce quits [Client Quit]
22:56:21loug8318142 quits [Quit: The Lounge - https://thelounge.chat]
23:00:35Radzig quits [Remote host closed the connection]
23:06:38Radzig joins
23:27:41BornOn420 quits [Remote host closed the connection]
23:28:03BornOn420 (BornOn420) joins