00:10:39Notrealname1234 (Notrealname1234) joins
00:50:50Notrealname1234 quits [Client Quit]
01:01:27eightthree quits [Ping timeout: 272 seconds]
01:04:42eightthree joins
01:18:50Jake quits [Quit: Leaving for a bit!]
01:19:14Jake (Jake) joins
01:25:10<fireonlive>pabs: denada :)
01:27:31qw3rty__ quits [Ping timeout: 255 seconds]
01:33:34qw3rty__ joins
01:45:58lemuria joins
01:48:07qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins
02:04:14lemuria quits [Client Quit]
02:04:36lemuria (lemuria) joins
02:05:16lemuria quits [Changing host]
02:05:16lemuria (lemuria) joins
02:14:35Guest54 quits [Client Quit]
02:23:38<lemuria>hi there, is --level 2 a good option for crawling a wordpress site from 2014
02:47:14muklumsum quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
02:48:23muklumsum joins
03:21:10Panasonic joins
03:21:48Ravenloft quits [Read error: Connection reset by peer]
03:27:39Wohlstand quits [Client Quit]
03:45:33etnguyen03 quits [Client Quit]
04:54:01BlueMaxima quits [Read error: Connection reset by peer]
04:57:56DogsRNice quits [Read error: Connection reset by peer]
06:41:11loug4 joins
06:47:49<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52976&oldid=52975
06:56:08<fireonlive>lemuria: which program are you using? #archivebot might be a good choice instead to get it in the wayback machine (and you can grab the warcs after)
07:10:03Unholy236192464537713 quits [Ping timeout: 272 seconds]
07:29:50Unholy236192464537713 (Unholy2361) joins
08:10:24<lemuria>grab-site, i forgot to say, fireonlive
08:11:08<lemuria>the archive was kinda OK-ish but fonts were missing and one of the images too, is that normal when grabbing sites?
08:23:05<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52977&oldid=52976
08:25:06<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52978&oldid=52977
08:42:09<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52979&oldid=52978
08:46:09<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52980&oldid=52979
08:47:10<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52981&oldid=52980
08:58:11<h2ibot>Exorcism edited Bugzilla (+0, /* Status */ aborted): https://wiki.archiveteam.org/?diff=52982&oldid=52981
09:00:03Bleo1826007227196 quits [Client Quit]
09:00:12<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52983&oldid=52982
09:01:25Bleo1826007227196 joins
09:02:55<thuban>asie, nullpeta, c3manu: as expected, my bruteforce did not turn up any sites not found in https://asie.pl/files/hp_vector_urls_20161012_plus.txt (except 403s)
09:03:12<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52984&oldid=52983
09:03:14<thuban>i can produce a list of the 403s if that's wanted but idk how useful it would be
09:04:32<thuban>also, c3manu, how did you run that list? `!ao <` or `!a <`?
09:08:20<c3manu>!ao and !a <
09:08:45<thuban>ty!
09:08:49<c3manu>np :)
09:09:10<c3manu>depends on what causes the 403s. if AB can somehow get around that, it would be pretty useful ^^
09:10:20<thuban>i doubt it. my (100% speculative) guess is that they're sites set to private in some way by their authors
09:10:36<lemuria>what site are we investigating the 403s for
09:10:41<lemuria>multiple domains?
09:10:45<thuban>hp.vector.co.jp
09:14:12<thuban>personally i'd really like to know more about the non-'VA\d{6}' authors. given that there were only two in the 2016 directory and that that directory was 99.9% complete, we're probably not missing much in that regard, but maybe somebody can grep CC/IA CDX and see if anything turns up?
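[editor's note] A minimal sketch of the check thuban describes, assuming the original URLs have already been pulled out of a CDX dump into a list and that author sites live under a path of the form /authors/<id>/ (the path layout and the helper name non_standard_authors are assumptions for illustration, not something stated in the log). It keeps only author IDs that do not match 'VA\d{6}'. It should be fed the original-URL field rather than the lowercased SURT key, since the pattern is case-sensitive.

```python
import re
from urllib.parse import urlsplit

# Pattern quoted in the log for the usual author IDs.
AUTHOR_ID = re.compile(r"VA\d{6}")


def non_standard_authors(urls):
    """Collect author directory names that do not look like 'VA' plus six digits.

    Assumes (hypothetically) that author pages live under /authors/<id>/.
    """
    found = set()
    for url in urls:
        parts = urlsplit(url).path.split("/")
        # parts[0] is the empty string before the leading slash.
        if len(parts) >= 3 and parts[1] == "authors" and parts[2]:
            if not AUTHOR_ID.fullmatch(parts[2]):
                found.add(parts[2])
    return found
```

Anything this returns would still need a manual look, since a stray path segment is not necessarily a real author directory.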
09:14:14<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52985&oldid=52984
09:38:18<h2ibot>Exorcism edited Bugzilla (+28, /* Status */): https://wiki.archiveteam.org/?diff=52986&oldid=52985
09:46:20<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52987&oldid=52986
09:46:21<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52988&oldid=52987
09:50:21<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52989&oldid=52988
09:52:21<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52990&oldid=52989
09:54:21<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52991&oldid=52990
10:00:22<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52992&oldid=52991
10:03:23<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52993&oldid=52992
10:03:24<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52994&oldid=52993
10:05:23<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52995&oldid=52994
10:21:26<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52996&oldid=52995
10:22:26<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52997&oldid=52996
10:22:27<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52998&oldid=52997
10:23:26<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=52999&oldid=52998
10:31:27<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=53000&oldid=52999
10:53:31<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=53001&oldid=53000
11:00:00Bleo1826007227196 quits [Client Quit]
11:01:17Bleo1826007227196 joins
11:43:56SkilledAlpaca quits [Client Quit]
11:46:42yarrow2 quits [Quit: Connection closed for inactivity]
11:47:13SkilledAlpaca joins
12:06:29etnguyen03 (etnguyen03) joins
12:16:16danwellby quits [Quit: Watch out For sysops carrying carpet and quicklime]
12:37:20Guest54 joins
13:04:59<h2ibot>Exorcism edited Bugzilla (+18, /* Status */): https://wiki.archiveteam.org/?diff=53002&oldid=53001
13:05:59<h2ibot>PaulWise edited SmolNet (+345, add link to mercury protocl doc, mention…): https://wiki.archiveteam.org/?diff=53003&oldid=52708
13:17:01<h2ibot>Exorcism edited Bugzilla (+27, /* Archived */): https://wiki.archiveteam.org/?diff=53004&oldid=53002
13:46:15<JaffaCakes118>Could someone archive https://hlelo101.github.io/ with archivebot please (no coverage)
13:54:07<h2ibot>Exorcism edited Bugzilla (+38, /* Archived */): https://wiki.archiveteam.org/?diff=53005&oldid=53004
14:04:53<@JAA>(That's been handled in #archivebot since.)
14:10:10<h2ibot>Exorcism edited Bugzilla (+34, /* Archived */): https://wiki.archiveteam.org/?diff=53006&oldid=53005
14:14:11<h2ibot>Exorcism edited Bugzilla (+36, /* Archived */): https://wiki.archiveteam.org/?diff=53007&oldid=53006
14:18:11<h2ibot>Exorcism edited Bugzilla (+39, /* Archived */): https://wiki.archiveteam.org/?diff=53008&oldid=53007
14:23:12<h2ibot>Exorcism edited Bugzilla (+40, /* Archived */): https://wiki.archiveteam.org/?diff=53009&oldid=53008
14:26:01Dango360 quits [Ping timeout: 255 seconds]
14:33:14<h2ibot>Exorcism edited Bugzilla (+45, /* Archived */): https://wiki.archiveteam.org/?diff=53010&oldid=53009
14:38:15<h2ibot>Exorcism edited Bugzilla (+34, /* Archived */): https://wiki.archiveteam.org/?diff=53011&oldid=53010
14:49:00Notrealname1234 (Notrealname1234) joins
14:52:42Dango360 (Dango360) joins
14:58:34Notrealname1234 quits [Remote host closed the connection]
14:58:40Notrealname1234 (Notrealname1234) joins
14:59:17Notrealname1234 quits [Client Quit]
15:35:05danwellby joins
15:54:25Wohlstand (Wohlstand) joins
16:20:45<@arkiver>JAA: what is sense even :P
16:35:35<h2ibot>Exorcism edited Bugzilla (+0, /* Status */): https://wiki.archiveteam.org/?diff=53012&oldid=53011
17:01:40<h2ibot>Bzc6p edited Demotivalo.net (-28, /* Sister sites */ kommenthuszar.com restored): https://wiki.archiveteam.org/?diff=53013&oldid=50595
17:44:27DogsRNice joins
18:03:39ilnrja quits [Remote host closed the connection]
18:03:56ilnrja (ilnrja) joins
18:16:52danwellby quits [Read error: Connection reset by peer]
18:24:01Island joins
18:29:29<lemuria>HELP VER
18:29:36<lemuria>(sorry forgot the /)
18:36:20danwellby joins
18:44:29pseudorizer quits [Quit: ZNC 1.9.1 - https://znc.in]
18:45:07<katia>/)
18:45:47pseudorizer (pseudorizer) joins
19:00:28JaffaCakes118 quits [Remote host closed the connection]
19:30:40SkilledAlpaca quits [Ping timeout: 255 seconds]
19:34:38SkilledAlpaca joins
19:48:40SkilledAlpaca quits [Ping timeout: 255 seconds]
19:49:18ilnrja quits [Remote host closed the connection]
19:49:56ilnrja (ilnrja) joins
20:33:04SkilledAlpaca joins
20:36:25Exorcism quits [Remote host closed the connection]
20:36:25DigitalDragons quits [Read error: Connection reset by peer]
20:36:54DigitalDragons (DigitalDragons) joins
20:36:57Exorcism (exorcism) joins
20:47:57TheGamer2000 joins
20:48:04TheGamer2000 quits [Client Quit]
20:51:07icedice (icedice) joins
21:28:03pixel leaves
21:28:04pixel (pixel) joins
21:39:24Megame (Megame) joins
21:52:29yarrow2 joins
22:29:37BlueMaxima joins
22:41:41yarrow_irccloud (yarrow_irccloud) joins
23:17:19sec^nd quits [Remote host closed the connection]
23:17:19shgaqnyrjp quits [Remote host closed the connection]
23:17:43shgaqnyrjp (shgaqnyrjp) joins
23:17:50sec^nd (second) joins
23:22:11CookMePlox joins
23:24:24<CookMePlox>hi friends! i am wondering if anyone knows specifics about how the "length" field from wayback machine's cdx api is calculated
23:24:53<CookMePlox>specifically, I'm seeing a bunch of cases where the digest matches, but the length is different, for example
23:24:57<CookMePlox>it,tip)/zenit/natale-6.jpg 20010725190547 http://www.tip.it:80/zenit/natale-6.JPG image/jpeg 200 PO53TOR6WEL4F3CFWDXP3OEUEY7F25NW 29785
23:24:58<CookMePlox>it,tip)/zenit/natale-6.jpg 20011230210923 http://www.tip.it:80/zenit/natale-6.JPG image/jpeg 200 PO53TOR6WEL4F3CFWDXP3OEUEY7F25NW 29773
23:26:30<CookMePlox>it's not obvious to me why the length would vary if the hash is the same. is the length maybe including some http headers that varied between responses, even though the response payloads were otherwise identical?
23:32:06<CookMePlox>ah, I see! the headers are preserved under X-Archive-Orig, and they are indeed different. so I think the length must be the compressed (gzip maybe?) size of the original entire network request, including headers
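[editor's note] A small sketch of why CookMePlox's reading is plausible, assuming the digest is a base32-encoded SHA-1 of the response payload alone and the length is the gzip-compressed size of the whole stored record, headers included. The helper names and the record layout here are simplified and hypothetical; this is not the Wayback Machine's actual code. Under those assumptions, two captures with the same payload but different headers share a digest and can still differ in length.

```python
import base64
import gzip
import hashlib


def payload_digest(payload: bytes) -> str:
    """Base32-encoded SHA-1 of the payload only (headers are not hashed)."""
    return base64.b32encode(hashlib.sha1(payload).digest()).decode("ascii")


def compressed_record_length(headers: bytes, payload: bytes) -> int:
    """Size in bytes of headers plus payload after gzip compression.

    A simplified stand-in for a compressed archive record: a real record
    also carries its own record-level headers.
    """
    return len(gzip.compress(headers + b"\r\n" + payload))


payload = bytes(range(256)) * 4  # stand-in for the image bytes
headers_a = b"HTTP/1.1 200 OK\r\nContent-Type: image/jpeg\r\n"
headers_b = headers_a + b"Date: Sun, 30 Dec 2001 21:09:23 GMT\r\nServer: example\r\n"

# The digest depends only on the payload, so both captures agree on it,
# while the compressed size also reflects whatever headers were stored.
print(payload_digest(payload))
print(compressed_record_length(headers_a, payload))
print(compressed_record_length(headers_b, payload))
```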
23:33:46CookMePlox quits [Client Quit]
23:43:15<@OrIdow6>Wish more people could just answer their questions by looking at the list of users in the channel