00:03:41dumbgoy quits [Read error: Connection reset by peer]
00:04:24dumbgoy joins
00:04:53Mateon1 quits [Remote host closed the connection]
00:07:20Mateon1 joins
00:09:10cascode quits [Ping timeout: 265 seconds]
00:09:22cascode joins
00:09:43Dango360 quits [Read error: Connection reset by peer]
00:11:54nepeat (nepeat) joins
00:12:25Dango360 (Dango360) joins
00:29:38cascode quits [Read error: Connection reset by peer]
00:29:58cascode joins
00:48:44cascode quits [Ping timeout: 252 seconds]
00:49:19cascode joins
00:50:56lk quits [Ping timeout: 252 seconds]
00:51:38lk (lk) joins
01:00:45Arcorann (Arcorann) joins
01:26:12cascode quits [Ping timeout: 258 seconds]
01:26:25imer5 (imer) joins
01:27:15cascode joins
01:29:59imer quits [Ping timeout: 252 seconds]
01:29:59imer5 is now known as imer
01:51:07AmAnd0A quits [Ping timeout: 258 seconds]
01:54:51AmAnd0A joins
02:18:05nicolas17 joins
02:20:41Gaelan_ is now known as Gaelan
02:45:01xkey quits [Quit: xkey]
02:46:00benjinsm joins
02:47:09tbc1887 (tbc1887) joins
02:49:38tbc1887_ quits [Ping timeout: 265 seconds]
02:49:38benjins quits [Ping timeout: 265 seconds]
02:52:00superkuh_ joins
02:52:01superkuh quits [Remote host closed the connection]
02:52:02railen63 quits [Remote host closed the connection]
02:52:02yano quits [Remote host closed the connection]
02:52:02Dango360 quits [Remote host closed the connection]
02:52:03railen63 joins
02:52:06Dango360 (Dango360) joins
02:52:16yano (yano) joins
03:31:12imer quits [Ping timeout: 265 seconds]
03:32:09imer (imer) joins
03:35:30imer6 (imer) joins
03:37:00imer quits [Ping timeout: 265 seconds]
03:37:00imer6 is now known as imer
04:18:19cascode quits [Read error: Connection reset by peer]
04:18:36cascode joins
04:39:06Island quits [Read error: Connection reset by peer]
04:48:21Lord_Nightmare quits [Quit: ZNC - http://znc.in]
04:51:23Lord_Nightmare (Lord_Nightmare) joins
05:14:40killsushi quits [Ping timeout: 258 seconds]
05:25:02BlueMaxima quits [Read error: Connection reset by peer]
05:33:27nicolas17 quits [Ping timeout: 258 seconds]
05:45:41xkey (xkey) joins
06:20:56cascode quits [Ping timeout: 252 seconds]
06:42:04dumbgoy quits [Ping timeout: 258 seconds]
06:46:36jtagcat quits [Client Quit]
06:47:03jtagcat (jtagcat) joins
07:08:59hellow joins
07:09:25hellow quits [Remote host closed the connection]
07:13:27hitgrr8 joins
07:52:46bf_ joins
08:17:01bf__ joins
08:18:47bf_ quits [Ping timeout: 265 seconds]
08:35:07IDK (IDK) joins
09:13:02BigBrain_ (bigbrain) joins
09:14:36BigBrain quits [Ping timeout: 245 seconds]
09:15:39albertlarsan684 (AlbertLarsan68) joins
09:15:40railen64 joins
09:15:40marto_ quits [Quit: Ping timeout (120 seconds)]
09:15:41marto_9 joins
09:15:43albertlarsan68 quits [Quit: Ping timeout (120 seconds)]
09:15:43lflare quits [Client Quit]
09:15:43TastyWiener95 quits [Quit: Ping timeout (120 seconds)]
09:15:43ell quits [Quit: Ping timeout (120 seconds)]
09:15:43railen63 quits [Remote host closed the connection]
09:15:43albertlarsan684 is now known as albertlarsan68
09:15:46ell4 (ell) joins
09:15:49lflare (lflare) joins
09:15:49TastyWiener952 (TastyWiener95) joins
10:00:01pie_ quits []
10:00:01railen64 quits [Remote host closed the connection]
10:02:14railen64 joins
10:09:29pie_ joins
10:09:33pie_ quits [Client Quit]
10:09:54pie_ joins
10:10:03pie_ quits [Client Quit]
10:15:34pie_ joins
10:15:46pie_ quits [Client Quit]
10:31:44pie_ joins
10:33:39pie_ quits [Client Quit]
10:39:27benjinsm is now known as benjins
10:40:19pie_ joins
10:40:47pie_ quits [Client Quit]
10:40:50pie_ joins
10:40:52pie_ quits [Client Quit]
10:44:34pie_ joins
10:46:20spirit quits [Client Quit]
10:46:47spirit joins
10:52:11spirit quits [Client Quit]
10:57:29pie_ quits [Client Quit]
10:59:58pie_ joins
11:00:38pie_ quits [Client Quit]
11:00:45pie_ joins
11:01:49pie_ quits [Remote host closed the connection]
11:01:56pie_ joins
11:44:49pie_ quits [Client Quit]
11:44:51pie_ joins
11:47:28pie_ quits [Remote host closed the connection]
11:47:30pie_ joins
11:48:37railen69 joins
11:52:11railen64 quits [Ping timeout: 258 seconds]
11:56:24railen69 quits [Ping timeout: 258 seconds]
12:35:46girst quits [Quit: ZNC 1.8.2 - https://znc.in]
12:37:05girst (girst) joins
12:48:57<@OrIdow6>So URLs involved in fetching gallery pages in Wysp are tied to window size, as I have spent quite a bit of time figuring out
12:49:22<@OrIdow6>So playback will not work properly on those, but it should still be fine on individual images
13:11:34pie_ quits [Client Quit]
13:19:35pie_ joins
13:48:05Arcorann quits [Ping timeout: 252 seconds]
13:49:52AmAnd0A quits [Ping timeout: 265 seconds]
13:50:02AmAnd0A joins
14:01:44onetruth joins
14:17:05pie_ quits [Ping timeout: 258 seconds]
14:37:44pie_ joins
14:51:48spirit joins
15:05:50<spirit>someone archivebot https://geomaticblog.net please, thanks :) no date given at https://geomaticblog.net/2023/07/06/retiring-geomaticblog.net/ but should be a quick job so no issue
15:08:24<Barto>spirit: it's in
15:08:29<spirit>cheers!
15:09:07AmAnd0A quits [Read error: Connection reset by peer]
15:09:24AmAnd0A joins
15:13:14<Barto>spirit: https://github.com/jsanz/geomaticblog/ and https://jorgesanz.net/ is also being taken care of :)
15:22:28us3rrr joins
15:22:36redbees_ joins
15:22:42jspiros_ (jspiros) joins
15:22:46jspiros quits []
15:22:46onetruth quits [Remote host closed the connection]
15:22:46HotSwap quits [Quit: ZNC - http://znc.in]
15:22:46redbees quits [Quit: ZNC 1.7.5+deb4 - https://znc.in]
15:22:46Billy549 quits [Quit: ZNC 1.8.2+deb2build5 - https://znc.in]
15:22:46Ryz2 quits [Quit: Ping timeout (120 seconds)]
15:22:59Ryz2 (Ryz) joins
15:24:43HotSwap joins
15:29:11<spirit>Barto: <3
15:29:48Billy549 (Billy549) joins
15:33:32mattx4332 (mattx433) joins
15:33:35celestial_ joins
15:33:36ell (ell) joins
15:33:37msrn_ joins
15:33:37T31M_ joins
15:33:40wyatt8750 joins
15:33:40Justin[home] joins
15:33:43fredgido_ joins
15:33:59fangfufu_ joins
15:34:00s-crypt5 (s-crypt) joins
15:34:09mattx433 quits [Quit: Ping timeout (120 seconds)]
15:34:09mattx4332 is now known as mattx433
15:34:13Ryz2 quits [Client Quit]
15:34:13iCaotix quits [Quit: ZNC 1.8.2 - https://znc.in]
15:34:13Minkafighter quits [Client Quit]
15:34:13imer quits [Client Quit]
15:34:13Matthww1 quits [Quit: Ping timeout (120 seconds)]
15:34:13IDK_ quits [Quit: Ping timeout (120 seconds)]
15:34:13ell4 quits [Client Quit]
15:34:13mikael quits [Quit: ZNC - http://znc.in]
15:34:13Suika_ quits [Quit: Server is ded]
15:34:14kiska quits [Quit: Ping timeout (120 seconds)]
15:34:14celestial quits [Quit: ZNC 1.8.0 - https://znc.in]
15:34:14Meroje quits [Quit: bye!]
15:34:14fredgido quits [Quit: I will be back]
15:34:14Aoede quits [Quit: ZNC - https://znc.in]
15:34:14T31M quits [Quit: ZNC - https://znc.in]
15:34:14s-crypt quits [Quit: Ping timeout (120 seconds)]
15:34:14wyatt8740 quits [Quit: ZNC got killed or something else has gone wrong, probably.]
15:34:14DopefishJustin quits [Remote host closed the connection]
15:34:14@dxrt quits [Quit: ZNC - http://znc.sourceforge.net]
15:34:14fangfufu quits [Quit: ZNC 1.8.2+deb2+b1 - https://znc.in]
15:34:14s-crypt5 is now known as s-crypt
15:34:14T31M_ is now known as T31M
15:34:16Aoede_ (Aoede) joins
15:34:18iCaotix joins
15:34:18Ryz2 (Ryz) joins
15:34:19Suika joins
15:34:19IDK_ joins
15:34:20dxrt_ joins
15:34:22Matthww1 joins
15:34:23Minkafighter joins
15:34:25Meroje joins
15:34:26Meroje quits [Changing host]
15:34:26Meroje (Meroje) joins
15:34:27imer (imer) joins
15:35:04kiska (kiska) joins
15:41:54dumbgoy joins
15:45:38AmAnd0A quits [Ping timeout: 258 seconds]
15:46:06AmAnd0A joins
16:06:50IDK quits [Client Quit]
16:29:32AmAnd0A quits [Read error: Connection reset by peer]
16:29:49AmAnd0A joins
17:14:59killsushi joins
17:31:03bf__ quits [Ping timeout: 258 seconds]
17:39:41Earendil7 quits [Quit: Leaving]
17:40:43Earendil7 (Earendil7) joins
17:44:11Earendil7 quits [Client Quit]
17:45:26Earendil7 (Earendil7) joins
17:50:36tzt quits [Ping timeout: 258 seconds]
17:55:12yawkat quits [Ping timeout: 258 seconds]
17:57:30killsushi quits [Ping timeout: 258 seconds]
17:57:34tzt (tzt) joins
18:13:23JustZac2 joins
18:13:50<JustZac2>Hello?
18:14:17<JustZac2>I have no idea what im doing i wanted to find a deleted or privated youtube video
18:23:59<pokechu22>I'm not an expert at that, but does the video work on web.archive.org?
18:24:22Lord_Nightmare quits [Client Quit]
18:24:25<JustZac2>havent tried it yet lemme check
18:24:56<JustZac2>Yeah its not there
18:26:08<@JAA>What's the video URL?
18:26:28bf_ joins
18:27:10<JustZac2>Wait a min
18:27:38<JustZac2>https://www.youtube.com/watch?v=8gR1Vm3yoMQ
18:27:43<JustZac2>this is the one
18:28:18<@JAA>#youtubearchive has a copy. You can ask there and wait patiently until someone has time to pull it from storage.
18:29:25<JustZac2>Thanks. how long will that take?
18:29:32<JustZac2>any estimates?
18:30:42<pokechu22>I'm pretty sure it's a manual process so it could take a few hours to a day depending on who's available (but I'm not part of that project so I don't know the details)
18:31:26<JustZac2>Oh Thanks man
18:31:31Lord_Nightmare (Lord_Nightmare) joins
18:32:49<@JAA>Yep, something like that.
18:38:08bf_ quits [Client Quit]
18:39:09bf_ joins
18:39:10qq44|m joins
18:39:22<qq44|m>hello
18:39:38<qq44|m>anyone know how to get http/s proxy working with grab-site or wpull?
18:39:46<qq44|m>i have an http proxy working, but having trouble with https proxy
18:40:34<qq44|m>wpull keeps sending a bad request to the proxy. i think it's a cert error, but am not sure how to debug it
18:40:46<qq44|m>anyone try this before and know how to get it working?
18:41:16<@JAA>I know that HTTPS proxying is pretty broken in wpull 2.x. Not overly familiar with the changes in ludios_wpull (which is what grab-site uses), so can't comment on whether it applies there as well.
18:42:13<qq44|m>ahh that sucks, thought it must have been on wpull's end, tried a few different proxies and none worked with https
18:42:31<@JAA>I think you should get a relatively clear error though, not 400 or similar.
18:42:49<@JAA>'CONNECT is intentionally not supported' should appear somewhere.
18:42:51<qq44|m>I get: code 400, message Bad request version
18:43:37<@JAA>Oh wait, you're trying to use an external proxy, not wpull's proxy.
18:43:39<@JAA>Nevermind then.
18:45:15<qq44|m>yeah trying to use an external proxy
18:46:25<qq44|m>i have a pretty specific crawl im trying to do, and need to modify the logic of wpull to do it. thought the easy way would be to run it through a proxy and have the proxy handle that logic
18:47:53<@JAA>Hmm, what kind of logic?
18:50:11<qq44|m>Im trying to download all pages from a site that were published from a specific date range
18:50:41<qq44|m>my first idea was to use igsets and see if the date is available in the url path
18:52:03<qq44|m>its not though unfortunately, so I have to crawl a few index pages, and use those pages to find pages between date ranges
19:00:50VickoSaviour joins
19:02:22<VickoSaviour>hey guys, can someone please archive https://www.progaming.ba because it is shutting down and i want it to be archived.
19:02:32JustZac2 leaves
19:03:14<Barto>spirit: saved
19:03:19<spirit>yaaay
19:03:29<Barto>!a https://www.progaming.ba --igset blogs,badvideos -e 'for VickoSaviour'
19:03:34<Barto>wrong place lol
19:03:45<Barto>now it's at the right place :-)
19:04:41<Barto>VickoSaviour: looks like it's login walled, cant do much
19:04:56<VickoSaviour>oh fk
19:05:13<VickoSaviour>so how is it bad?
19:05:34<Barto>no account, no data
19:05:49<VickoSaviour>welp. shit.
19:05:56<Barto>:(
19:06:43<VickoSaviour>and also who tf uses login wall...
19:07:31<thuban>VickoSaviour: if you have an account, you can save it yourself by giving your login cookies to https://github.com/ArchiveTeam/grab-site/ (or another spidering program)
19:07:47<VickoSaviour>OH YES
19:07:56<VickoSaviour>i have an acc already
19:08:14<thuban>that can't go in the wayback machine, but it's better than nothing, and you could upload it to the internet archive if you want
19:12:27Island joins
19:13:55<spirit>fucking sourceforge broke my https://github.com/SpiritQuaddicted/sourceforge-file-download download script =(
19:20:20<masterX244>qq44: sometimes a crude selfwritten program for enumerating and then crawling a url list without recursion works for pages like that
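The enumerate-then-crawl approach suggested above can be sketched roughly as follows. This is a minimal illustration, not anyone's actual tooling: it assumes the index pages have already been parsed into (URL, ISO date) pairs, and the `select_urls` helper and the example.org entries are hypothetical.

```python
from datetime import date


def select_urls(entries, start, end):
    """Return the URLs whose publication date lies in [start, end].

    `entries` is an iterable of (url, iso_date_string) pairs, assumed to
    have been scraped from the site's index pages beforehand.
    """
    selected = []
    for url, iso in entries:
        published = date.fromisoformat(iso)
        if start <= published <= end:
            selected.append(url)
    return selected


# Hypothetical data standing in for what the index pages would yield.
entries = [
    ("https://example.org/a", "2023-01-15"),
    ("https://example.org/b", "2023-03-02"),
    ("https://example.org/c", "2023-06-30"),
]

urls = select_urls(entries, date(2023, 2, 1), date(2023, 6, 30))
print("\n".join(urls))
```

The resulting list can be written to a file and handed to a crawler as a fixed URL list with recursion turned off, so the date logic stays outside the archiving tool.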
19:26:37JensRex quits [Quit: JensRex]
19:27:24<spirit>fixed, i think
19:36:09JensRex (JensRex) joins
19:36:17JensRex quits [Client Quit]
19:36:48JensRex (JensRex) joins
19:55:42yawkat (yawkat) joins
19:55:52JensRex quits [Client Quit]
19:56:24JensRex (JensRex) joins
19:57:27<thuban>update: progaming.ba is gutted already, nothing to do but contact the admins :(
20:03:22<TheTechRobo>VickoSaviour: If you do upload it to archive.org, remember that the cookies you pass it will be stored inside the WARC
20:09:48<fireonlive>also any personal data will be saved in the WARC as well such as your username if it’s returned in the pages
20:10:55<thuban>true, but moot
20:11:52<fireonlive>:)
20:12:29<fireonlive>just a note I guess to those grab siteing ao3 or something ig
20:14:44JensRex quits [Ping timeout: 258 seconds]
20:15:42<@JAA>qq44|m: I'm not sure how a proxy would help you there, unless you mangle data there (in which case I sure hope you aren't producing WARCs). I'd do it with a wpull plugin.
20:16:01JensRex (JensRex) joins
20:16:43<@JAA>Or well, I'd really do it with my own stuff (qwarc) instead, but that's not really user-friendly especially since there's zero documentation.
20:23:42<qq44|m>JAA: I want to preserve a specific directory structure in the WARC. The proxy in this case would take the URL, do some fetches to find the relevant URL within the date range, and return that page to wpull
20:23:50<qq44|m>unless im misunderstanding how proxies work
20:24:27<qq44|m>also in some cases I want the proxy to modify the contents of the page that it's sending back to wpull
20:24:55<@JAA>As long as you don't write that to WARC, that's fine. WARC is supposed to be an exact reproduction of what the target server sent.
20:30:52<qq44|m>i do want to write it to the warc, but i know im misusing warc in this case
20:32:38<qq44|m>i save a lot of documentation, and in those cases I mostly care about having a usable copy of the documentation as opposed to a faithful copy of the web pages
20:33:28<@JAA>Then I'd recommend at least adding a custom WARC header explaining that in detail. Not sure what I'd call that header, but probably something with an X- prefix.
20:33:40<@JAA>--warc-header on wpull
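One way the `--warc-header` suggestion might look in practice is sketched below. The header name `X-Modification-Note`, the note text, and the `wpull_args` helper are assumptions made for illustration; only `--warc-file` and `--warc-header` are taken as existing wpull options.

```python
import shlex


def wpull_args(url, warc_name, note):
    """Build a wpull command line that records a custom WARC header.

    The header name X-Modification-Note is a made-up example; any
    X- prefixed name that explains the modification would do.
    """
    return [
        "wpull",
        url,
        "--warc-file", warc_name,
        "--warc-header", f"X-Modification-Note: {note}",
    ]


args = wpull_args(
    "https://example.org/docs/",  # placeholder URL
    "example-docs",
    "responses rewritten by a local proxy; not a faithful capture",
)
print(shlex.join(args))
```

Building the argument list rather than a single string keeps the header value intact as one argument even though it contains spaces.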
20:43:31VickoSaviour leaves
20:47:00VickoSaviour joins
20:48:23<VickoSaviour>what's the progress on reddit.com website? is the content earlier than January of 2021 saved?
20:50:03VickoSaviour leaves
20:51:36<@JAA>100 seconds, longer than some other people.
20:55:32eroc1990 quits [Quit: The Lounge - https://thelounge.chat]
20:55:58<fireonlive>we should have a leaderboard at some point, JAA
20:57:54<@JAA>I'd rather spend my time saving shit. :-)
20:58:37<masterX244>before the shredders reach the data
20:59:04<fireonlive>:)
21:01:24<fireonlive>it's like a conveyer belt we're trapped on, constantly running, with a meat shredder screaming at the end
21:06:26<myself>We should have a bot that makes someone pass a "welcome to IRC" quiz before voicing them...
21:07:01<fireonlive>i have seen such a long time ago lol
21:07:13<fireonlive>read the rules at <link> and enter the password hidden in the rules
21:07:18<fireonlive>but it didn't help a lot
21:07:31<fireonlive>s/rules/faq/
21:08:08<fireonlive>it was a game of find password asap, ask question already answered :3
21:09:10<fireonlive>'you do understand you might have to wait minutes or hours for this right' 'yes yes get out of my way i want to type'
21:09:16<fireonlive>:D
21:09:36<@JAA>I mean, we generally want people to be able to reach us with as few barriers as possible in general.
21:13:02<fireonlive>that too
21:13:19<fireonlive>sometimes there are gems that come in, sometimes you get me :3
21:18:46beario__ joins
21:20:11bf_ quits [Ping timeout: 252 seconds]
21:20:11eroc1990 (eroc1990) joins
21:21:17beario_ quits [Ping timeout: 252 seconds]
21:29:56BigBrain_ quits [Remote host closed the connection]
21:30:23BigBrain_ (bigbrain) joins
21:49:25AmAnd0A quits [Ping timeout: 258 seconds]
21:52:40AmAnd0A joins
21:54:12Minkafighter5 joins
21:54:12Matthww13 joins
21:54:17eroc19905 (eroc1990) joins
21:54:17imer5 (imer) joins
21:54:24imer quits [Client Quit]
21:54:24Ryz2 quits [Client Quit]
21:54:24Minkafighter quits [Client Quit]
21:54:24eroc1990 quits [Client Quit]
21:54:24Matthww1 quits [Client Quit]
21:54:24Minkafighter5 is now known as Minkafighter
21:54:24imer5 is now known as imer
21:54:24Matthww13 is now known as Matthww1
21:54:29Ryz2 (Ryz) joins
21:55:48AmAnd0A quits [Read error: Connection reset by peer]
21:56:08AmAnd0A joins
22:22:51<nulldata>https://news.ycombinator.com/item?id=36657829
22:23:21<nulldata>"InfluxDB Cloud shuts down in Belgium; some weren't notified before data deletion"
22:23:24<nulldata>Oof
22:24:07<qq44|m>JAA: I tried https proxy with grab-site, but get the same error as wpull 2.0.3. Do you know of any other archiving tools similar to wpull or grab-site that work with https proxies?
22:26:45<@JAA>qq44|m: No idea, I don't use proxies for archival precisely because of the potential for data corruption.
22:28:58<qq44|m>wget works with proxy, do you know if there is a way to download page requisites with wget?
22:29:16<qq44|m>when ive used it in the past it only downloaded files from the first party domain, no third party files
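For reference, wget does have options for this: `--page-requisites` fetches the images, stylesheets and scripts a page needs, and `--span-hosts` lets it fetch from hosts other than the starting one. The sketch below only assembles such a command; the proxy address and URL are placeholders, and the `wget_args` helper is hypothetical.

```python
import shlex


def wget_args(url, proxy):
    """Build a wget command that fetches a page plus its requisites,
    including those served from third-party hosts, through a proxy."""
    return [
        "wget",
        "--page-requisites",          # also fetch images, CSS, scripts
        "--span-hosts",               # allow requisites from other hosts
        "-e", f"https_proxy={proxy}",  # wgetrc-style proxy setting
        url,
    ]


args = wget_args("https://example.org/page.html", "http://127.0.0.1:8080")
print(shlex.join(args))
```

Note that when combined with recursion, `--span-hosts` also lets wget follow links onto other hosts, so a domain restriction may be needed in that case.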
22:35:45hitgrr8 quits [Client Quit]
22:50:55jtagcat quits [Client Quit]
22:51:33jtagcat (jtagcat) joins
23:11:27rohvani quits [Ping timeout: 258 seconds]
23:22:08jtagcat quits [Client Quit]
23:22:50jtagcat (jtagcat) joins
23:37:40etnguyen03 (etnguyen03) joins
23:40:56<tech234a>Came across https://radar.cloudflare.com/domains which has a top 1 million domains list sourced from users of Cloudflare's 1.1.1.1 DNS. Lots of other interesting information on that site including a list of known bots.
23:41:48<fireonlive>in csv format, too!
23:41:50<fireonlive>=]
23:54:20nicolas17 joins