00:25:30etnguyen03 quits [Client Quit]
00:40:03Guest58 joins
00:55:00Guest58 quits [Client Quit]
00:55:19Guest58 joins
01:01:44DogsRNice_ joins
01:03:14DogsRNice quits [Ping timeout: 260 seconds]
01:03:41DogsRNice joins
01:06:44DogsRNice_ quits [Ping timeout: 260 seconds]
01:07:59etnguyen03 (etnguyen03) joins
01:17:26egallager joins
01:18:59ericgallager quits [Ping timeout: 260 seconds]
01:19:55Guest58 quits [Client Quit]
01:20:46Guest58 joins
01:36:29Guest58 quits [Ping timeout: 260 seconds]
01:39:00Guest58 joins
01:49:45lennier2 quits [Ping timeout: 258 seconds]
01:53:18lennier2 joins
01:55:53Guest58 quits [Client Quit]
02:16:18Guest58 joins
02:30:22Guest58 quits [Client Quit]
02:31:44Guest58 joins
02:35:51croissant_ joins
02:38:03croissant quits [Ping timeout: 258 seconds]
02:59:34Guest58 quits [Client Quit]
03:07:40etnguyen03 quits [Remote host closed the connection]
03:09:55Guest58 joins
03:11:10ericgallager joins
03:11:34egallager quits [Ping timeout: 260 seconds]
03:16:46Guest58_ joins
03:16:49Guest58 quits [Ping timeout: 260 seconds]
03:17:55Webuser728583 joins
03:18:11Webuser728583 quits [Client Quit]
03:18:35Guest58_ quits [Client Quit]
03:21:47Guest58 joins
03:24:09DogsRNice quits [Read error: Connection reset by peer]
03:26:41Guest58 quits [Client Quit]
03:32:52Shjosan quits [Quit: Am sleepy (-, – )…zzzZZZ]
03:33:24Shjosan (Shjosan) joins
03:34:58Island quits [Read error: Connection reset by peer]
04:01:46Radzig quits [Quit: ZNC 1.10.1 - https://znc.in]
04:02:15Radzig joins
04:24:35Webuser593583 joins
04:28:34ericgallager quits [Ping timeout: 260 seconds]
04:29:39Webuser593583 quits [Client Quit]
04:29:41ericgallager joins
04:44:55Guest58 joins
06:08:15SootBector quits [Remote host closed the connection]
06:09:23SootBector (SootBector) joins
06:13:52rohvani quits [Ping timeout: 258 seconds]
06:29:27HackMii quits [Remote host closed the connection]
06:33:41HackMii (hacktheplanet) joins
06:36:46ConstantK joins
07:05:26lemuria quits [Remote host closed the connection]
07:16:36dendory quits [Read error: Connection reset by peer]
07:16:58dendory (dendory) joins
07:21:49Cornelius quits [Ping timeout: 260 seconds]
07:23:33Wohlstand (Wohlstand) joins
07:42:25chunkynutz60 quits [Read error: Connection reset by peer]
07:42:42chunkynutz60 joins
08:13:33Cornelius (Cornelius) joins
08:26:07tertu2 quits [Ping timeout: 258 seconds]
08:34:04tertu (tertu) joins
11:00:02Bleo182600722719623455222 quits [Quit: The Lounge - https://thelounge.chat]
11:02:47Bleo182600722719623455222 joins
11:45:31Dada joins
12:18:25tek_dmn quits [Ping timeout: 258 seconds]
12:19:19Snivy quits [Ping timeout: 260 seconds]
12:22:38benjins3_ quits [Ping timeout: 258 seconds]
12:38:29benjins3 joins
12:40:06tek_dmn (tek_dmn) joins
12:56:25beastbg8_ quits [Read error: Connection reset by peer]
12:57:07beastbg8_ joins
12:59:12<twiswist>Just tried ludios-wpull because someone mentioned it above, but it's missing --rejected-log so I'll stick with wget. This omission is not documented in the "Differences between Wpull and Wget" page
13:13:21egallager joins
13:15:19ericgallager quits [Ping timeout: 260 seconds]
13:22:26FiTheArchiver joins
13:22:57FiTheArchiver quits [Read error: Connection reset by peer]
13:25:01FiTheArchiver joins
13:26:24egallager quits [Ping timeout: 260 seconds]
13:28:22ericgallager joins
13:31:30FiTheArchiver1 joins
13:35:09FiTheArchiver quits [Ping timeout: 260 seconds]
13:40:13FiTheArchiver joins
13:40:19Dada quits [Remote host closed the connection]
13:41:28FiTheArchiver3 joins
13:42:46Dada joins
13:43:54FiTheArchiver1 quits [Ping timeout: 258 seconds]
13:44:40FiTheArchiver quits [Ping timeout: 258 seconds]
13:45:03FiTheArchiver1 joins
13:48:15FiTheArchiver joins
13:48:34FiTheArchiver3 quits [Ping timeout: 260 seconds]
13:48:55FiTheArchiver quits [Read error: Connection reset by peer]
13:51:04FiTheArchiver joins
13:51:11FiTheArchiver1 quits [Ping timeout: 258 seconds]
13:51:27Island joins
13:53:28FiTheArchiver1 joins
13:55:23FiTheArchiver3 joins
13:55:23FiTheArchiver3 quits [Remote host closed the connection]
13:56:05cipherrot (petrichor) joins
13:56:56FiTheArchiver quits [Ping timeout: 258 seconds]
13:57:19petrichor quits [Ping timeout: 258 seconds]
13:59:04FiTheArchiver1 quits [Ping timeout: 260 seconds]
14:04:16Webuser694459 joins
14:25:19pabs quits [Ping timeout: 260 seconds]
14:26:05pabs (pabs) joins
14:34:30ducky quits [Ping timeout: 258 seconds]
14:35:14<@JAA>twiswist: It is mentioned there. It falls under 'Features greater than Wget 1.15'.
14:39:21<twiswist>Oh, alright
14:48:26ducky (ducky) joins
15:05:56wickedplayer494 quits [Ping timeout: 258 seconds]
15:13:23<pabs>https://www.iconfinder.com/ "9,375,000+ free and premium vector icons..." shuts down November 15, 2025 according to https://support.freepik.com/s/article/Iconfinder-Closure-FAQs?language=en_US https://en.wikipedia.org/wiki/Iconfinder
15:13:35Dada quits [Remote host closed the connection]
15:15:18Dada joins
15:28:17Webuser694459 quits [Client Quit]
15:30:04wickedplayer494 joins
15:45:14wickedplayer494 quits [Ping timeout: 260 seconds]
15:45:33wickedplayer494 joins
16:37:09AK quits [Ping timeout: 260 seconds]
16:42:01AK (AK) joins
16:53:29BornOn420 quits [Ping timeout: 260 seconds]
17:05:46<b3nzo>the sitemap likely has all the urls https://www.iconfinder.com/iconfinder-googlesitemap-index.xml
17:13:41<@arkiver>ouch
17:13:50<@arkiver>nearly 10 million icons sounds amazing though
17:14:01<@arkiver>b3nzo: make sure you archive that! will come in handy if there's a project for it
17:14:05<@arkiver>(unless AB can handle it)
17:14:09<@arkiver>(can it?)
17:19:44<b3nzo>arkiver: i can run my grab-site instance for it, but if its added to AB or warrior then ig its better to just go through that. im pretty sure, my IPs would get blocked pretty soon if i have to archive 10mil
17:25:16<@JAA>AB will not be happy with this one. Each icon has over a dozen downloads, so that alone brings it to well over 100M URLs.
17:25:21<dendory>The site seems to already have tons of crawls on the wayback machine, and perhaps more useful, there's a zip file of all their icons here: https://archive.org/details/perma_cc_4L5X-3ZBR
17:26:41<@JAA>Wrong link? That's just one specific icon.
17:27:37Snivy (Snivy) joins
17:27:45<nicolas17>discover the urls and put them into #//
17:27:47<nicolas17>hashtag yolo
17:27:55<nicolas17>(don't do that we'd kill the site)
17:28:01<dendory>My bad.. seems like there's multiple zip files of single icons.. hmm.. Maybe it'd be worth it to compile an archive of all the icons.
17:30:09<@JAA>Interestingly, the homepage mentions '6 million icons' (the 9M+ figure includes illustrations and whatnot), but then https://www.iconfinder.com/icons says <title>18,861+ high-quality icons - Iconfinder</title>.
17:31:02<dendory>Well I easily found that you can go from https://www.iconfinder.com/icons/1/download/png/128 to https://www.iconfinder.com/icons/100000/download/png/128 and download those as 128x128 png files.. might get a script up and zip them up.
17:31:57<@JAA>Yes, the IDs are sequential, and you don't need to know the slug.
17:32:07<@JAA>IDs go to over 13 million: https://www.iconfinder.com/search/icons?sort=timestamp_published
17:32:10<@arkiver>JAA: alright, i'll look into getting a Warrior project started for this one
17:32:14<@arkiver>nice that the URLs are sequential
17:32:29<@arkiver>let's make sure it's on deathwatch so it's not forgotten
17:32:38<nicolas17>JAA: might need to know the slug for WBM replay right?
17:32:53<@JAA>Looks like 'only' 216k are free: https://www.iconfinder.com/search/icons?sort=timestamp_published&price=free
17:32:59<@JAA>nicolas17: It redirects.
17:33:07<@arkiver>JAA: does that mean only web pages accessible for those 216k?
17:33:16<@JAA>No, web pages are accessible for all.
17:33:20<nicolas17>oh perfect
17:33:30<@JAA>'Download with Pro': https://www.iconfinder.com/icons/13085800/advertising_and_promotions_business_management_finance_planning_analysis_technology_icon
17:33:35<@arkiver>just checked premium is available too as web pages
17:35:01<@arkiver>i was just going to take some time off
17:35:31<@arkiver>but let's add it to deathwatch and i'll look later into a project for it
17:37:59<@JAA>Deathwatch++
17:37:59<eggdrop>[karma] 'Deathwatch' now has 1 karma!
17:38:07<dendory>Will need a lot of throttling because it blocks with error 429 Too Many Requests after just a handful of requests.
17:38:30<@JAA>Ah, I was going to say I could try with qwarc. Nevermind then.
17:38:51<@arkiver>i'm sure we can come up with a solution :)
17:38:53<dendory>It blocks at around 50 for a couple of seconds..
17:38:54<@arkiver>yay for many IPs
17:39:05<@arkiver>dendory: what does "around 50" mean?
17:40:03<dendory>I did around 50 wget requests manually just to test various IDs and got blocked
17:40:39<b3nzo>the sitemaps have a total of 7,203,306 urls
17:40:58<@arkiver>ah
17:41:00<@arkiver>that is not too bad
17:41:07@arkiver will be afk for some time
17:41:31<dendory>I'm just trying to get the highest resolution free PNG icons to zip them up, I think that will be far more useful than all 7M pages
17:42:05<@JAA>Metadata is important, especially licence and author.
17:42:12<@JAA>URL preservation is also important.
17:42:48<@arkiver>yeah
17:42:57Lord_Nightmare quits [Quit: ZNC - http://znc.in]
17:43:29<@JAA>(It looks like a lot of the free icons there are under CC BY, so attribution is even required.)
17:44:04<dendory>Yeah every icon I found so far is CC BY 4.0
17:44:40<dendory>I just think a zip file of all the free icons would be useful to have
17:45:02<b3nzo>are there any specific reasons for archiving the highest quality image rather than archiving the preview page?
17:45:21<nicolas17>can confirm, "wget https://www.iconfinder.com/icons/1000{00..99}/download/png/128" got me 429'd after 57 requests
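The brace expansion in the command above covers icon IDs 100000 through 100099. A minimal Python sketch that builds the same URL list, assuming the download URL pattern quoted in the chat; the function name `download_urls` and its parameters are hypothetical:

```python
def download_urls(start, stop, fmt="png", size=128):
    """Return download URLs for icon IDs in the inclusive range [start, stop]."""
    # URL pattern as quoted in the chat; fmt and size are illustrative parameters.
    base = "https://www.iconfinder.com/icons/{id}/download/{fmt}/{size}"
    return [base.format(id=i, fmt=fmt, size=size) for i in range(start, stop + 1)]


# Same 100 IDs as the shell expansion 1000{00..99}.
urls = download_urls(100000, 100099)
```

This only builds the list; it sends no requests, so the 429 behaviour described above is not triggered by it.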
17:46:05<dendory>the block lasts 60 secs btw.
17:47:09<@arkiver>that rate limiting is really not too bad
17:48:15Lord_Nightmare (Lord_Nightmare) joins
17:55:50<@JAA>So almost 1 req/s, yeah, should be fine then with DPoS.
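A rough back-of-the-envelope sketch of what "almost 1 req/s" means for total crawl time, assuming one request per sitemap URL (the 7,203,306 figure b3nzo reported) and a steady 1 request per second per IP. Both are simplifying assumptions, and the function name `crawl_days` is hypothetical:

```python
SECONDS_PER_DAY = 86_400


def crawl_days(total_urls, req_per_sec_per_ip=1.0, ips=1):
    """Estimate wall-clock days to fetch total_urls at a fixed per-IP rate."""
    return total_urls / (req_per_sec_per_ip * ips * SECONDS_PER_DAY)


sitemap_urls = 7_203_306                      # count reported from the sitemaps
single_ip = crawl_days(sitemap_urls)          # one IP at 1 req/s
hundred_ips = crawl_days(sitemap_urls, ips=100)
```

Under these assumptions a single IP needs a bit over 83 days, while 100 IPs need under a day, which is why a distributed approach looks workable. Fetching the per-icon downloads as well would multiply the URL count.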
17:58:00<nicolas17>sleep 1 between requests, tell people to use concurrency 1 per IP?
18:00:52<nicolas17>wait I found another issue
18:01:24<nicolas17>my wget started getting redirects to login
18:06:24ducky quits [Ping timeout: 260 seconds]
18:07:05<nicolas17>can't reproduce...
18:07:58<nicolas17>oh there it is
18:08:18<@JAA>We might want a channel for this one.
18:08:27<@JAA>Is #iconloser too harsh?
18:09:45<nicolas17>https://www.iconfinder.com/icons/100530/download/png/128 -> 302 Found https://www.iconfinder.com/user/login?redirect_to=https%3A//www.iconfinder.com/icons/100530/download/png/128
18:09:54<nicolas17>went away when I restarted wget, is this per TCP connection?
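A small sketch for spotting this login redirect in a crawl log, assuming the status code and `Location` header are already available as plain values. The function names are hypothetical, and the `/user/login` path and `redirect_to` parameter are taken from the single example above:

```python
from urllib.parse import parse_qs, urlsplit


def is_login_redirect(status, location):
    """True if the response is a 302 pointing at the login page."""
    if status != 302 or not location:
        return False
    return urlsplit(location).path == "/user/login"


def original_target(location):
    """Return the URL the login page would redirect back to, or None."""
    query = parse_qs(urlsplit(location).query)
    return query.get("redirect_to", [None])[0]


loc = ("https://www.iconfinder.com/user/login?redirect_to="
       "https%3A//www.iconfinder.com/icons/100530/download/png/128")
blocked = is_login_redirect(302, loc)
retry_url = original_target(loc)
```

URLs flagged this way could be queued for a retry on a fresh connection rather than treated as missing icons.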
18:17:18<dendory>I've downloaded over a thousand so far using wget and it's going fine. There are lots of missing IDs btw, doesn't seem to be an issue to just skip them.
18:17:46<nicolas17>are you using separate wget calls for each URL?
18:19:35<dendory>Just did a really quick shell script, more of a test than anything: https://sharetext.io/45b80631
18:20:28<nicolas17>yeah one wget per URL
18:20:42<nicolas17>if you pass multiple URLs at once, wget reuses the connection for multiple requests
18:20:49<@JAA>You can use https://transfer.archivete.am/ to share things without JS nonsense sites.
18:21:17<nicolas17>and it turns out once you send 100 requests in the same connection, the website starts redirecting to login (?!)
18:22:02<nicolas17>(I'm using -w 0.9 to make wget wait between requests, to avoid the 429)
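One way to combine the two observations (login redirects after 100 requests on one connection, 429s without a wait) is to split the IDs into batches and run one wget invocation per batch. A minimal sketch that only builds the command lines; the batch size of 90 is an assumed safety margin below 100, and the function names are hypothetical:

```python
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def wget_commands(ids, per_connection=90, wait=0.9):
    """Build one wget argv per batch so each batch gets a fresh invocation."""
    cmds = []
    for chunk in batched(ids, per_connection):
        urls = [f"https://www.iconfinder.com/icons/{i}/download/png/128"
                for i in chunk]
        # -w is wget's wait-between-requests option, as used in the chat.
        cmds.append(["wget", "-w", str(wait), *urls])
    return cmds


commands = wget_commands(range(100000, 100250))
```

Each argv could then be passed to `subprocess.run`. Whether a new invocation always resets the per-connection counter is an assumption based on the restart behaviour described above.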
18:25:08<h2ibot>Nicolas17v2 edited Deathwatch (+193, /* 2025 */ Add iconfinder): https://wiki.archiveteam.org/?diff=57592&oldid=57574
18:25:50<dendory>Not sure, I just left the script running and I'm at 1300... other than the 60sec waits and random 404s it's going fine.
18:26:19<dendory>A better way to do this would be to fetch metadata to at least get the name of the icons instead of just IDs.. might look into that
18:26:45<nicolas17>there's a name in the HTTP headers
18:27:08<dendory>sweet
18:27:10<nicolas17>content-disposition: attachment; filename=99400_dribbble_icon.png
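A short sketch of pulling the filename out of that header with Python's standard library, assuming the header value has the shape shown above. The function name `filename_from_header` is hypothetical, and the `basename` call is an added precaution against path components in the name:

```python
import os
from email.message import Message


def filename_from_header(value):
    """Extract the filename parameter from a Content-Disposition value."""
    msg = Message()
    msg["Content-Disposition"] = value
    name = msg.get_filename()  # None when no filename parameter is present
    return os.path.basename(name) if name else None


name = filename_from_header("attachment; filename=99400_dribbble_icon.png")
```

When the header carries no filename, the function returns `None`, so a caller can fall back to naming the file by icon ID.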
18:32:30ducky (ducky) joins
18:33:47SootBector quits [Remote host closed the connection]
18:34:54SootBector (SootBector) joins
18:39:37DogsRNice joins
18:41:59<dendory>Updated the script to use the right filenames: https://transfer.archivete.am/lxpaJ/iconfinder.sh
18:42:01<eggdrop>inline (for browser viewing): https://transfer.archivete.am/inline/lxpaJ/iconfinder.sh
19:01:42Ointment8862 (Ointment8862) joins
19:03:30Guest58 quits [Quit: My Mac has gone to sleep. ZZZzzz…]
20:10:04NF885 (NF885) joins
20:18:53NF885 quits [Client Quit]
20:33:17ThreeHM quits [Quit: WeeChat 4.7.1]
20:35:15ThreeHM (ThreeHeadedMonkey) joins
20:39:19Webuser537497 joins
20:39:47Webuser537497 quits [Client Quit]
20:55:34etnguyen03 (etnguyen03) joins
21:02:35etnguyen03 quits [Client Quit]
21:03:04etnguyen03 (etnguyen03) joins
21:20:45cyanbox quits [Read error: Connection reset by peer]
21:22:47cyanbox joins
21:24:16twiswist quits [Quit: twiswist]
21:25:21twiswist (twiswist) joins
21:31:25etnguyen03 quits [Client Quit]
21:39:47etnguyen03 (etnguyen03) joins
21:41:37Dada quits [Remote host closed the connection]
21:51:58Wohlstand quits [Remote host closed the connection]
22:21:12BearFortress joins
22:31:02abirkill (abirkill) joins
22:32:22Webuser301963 joins
22:32:57<Webuser301963>Hey could you archive this site please? https://crba.dedyn.io
22:41:25<pokechu22>Archivebot job started
22:44:41Webuser301963 quits [Client Quit]
22:48:55matoro quits [Quit: https://quassel-irc.org - Chat comfortably. Anywhere.]
22:49:14matoro joins
23:13:15etnguyen03 quits [Client Quit]
23:26:37rohvani joins
23:40:37Shard7 quits [Quit: Im doing something rq. Il brb]
23:45:43Shard7 (Shard) joins
23:53:07etnguyen03 (etnguyen03) joins