| 00:01:06 | | VerifiedJ quits [Remote host closed the connection] |
| 00:01:40 | | VerifiedJ (VerifiedJ) joins |
| 00:12:17 | | Megame quits [Ping timeout: 252 seconds] |
| 00:23:21 | | sec^nd quits [Ping timeout: 245 seconds] |
| 00:24:22 | | Megame (Megame) joins |
| 00:33:55 | <h2ibot> | Arkiver edited Deathwatch (+245, SAPO Videos deleting data on September 17): https://wiki.archiveteam.org/?diff=50616&oldid=50603 |
| 00:34:03 | | sec^nd (second) joins |
| 00:34:04 | | lukash97 joins |
| 00:34:52 | | lukash9 quits [Ping timeout: 265 seconds] |
| 00:34:52 | | lukash97 is now known as lukash9 |
| 00:35:21 | | yasom1 (yasomi) joins |
| 00:35:50 | | yasomi quits [Ping timeout: 265 seconds] |
| 00:42:35 | | Megame quits [Read error: Connection reset by peer] |
| 00:51:37 | | yts98 leaves |
| 00:51:43 | | yts98 joins |
| 01:03:15 | | parfait (kdqep) joins |
| 01:45:26 | | wyatt8740 joins |
| 01:48:57 | | wyatt8740 quits [Remote host closed the connection] |
| 02:06:19 | | Megame (Megame) joins |
| 02:18:43 | | imer quits [Client Quit] |
| 02:41:32 | | imer (imer) joins |
| 02:42:07 | | imer quits [Excess Flood] |
| 02:42:33 | | imer (imer) joins |
| 02:53:26 | <h2ibot> | FireonLive edited Current Projects (+2, fix comments (i've removed these auto…): https://wiki.archiveteam.org/?diff=50617&oldid=50613 |
| 02:54:21 | <fireonlive> | that link has a virus |
| 02:54:25 | <fireonlive> | no one click it |
| 02:54:28 | <fireonlive> | thanks |
| 02:59:00 | | wyatt8740 joins |
| 03:00:46 | <@JAA> | Still wrestling with https://pkg.fig.io/ and it's such a mess. There's a script at https://pkg.fig.io/install.sh that you're supposed to pipe to a shell (because of course), which then invokes other scripts, adds their repo to your package manager, and installs stuff. I haven't been able to get my hands on the actual .deb or .rpm files though, just get 403s there. Maybe I'm doing something wrong. |
| 03:02:51 | | kiryu joins |
| 03:03:43 | | wyatt8750 joins |
| 03:05:11 | <nicolas17> | found it |
| 03:05:22 | <nicolas17> | well I didn't check if the deb actually downloads |
| 03:05:32 | | wyatt8740 quits [Ping timeout: 252 seconds] |
| 03:06:00 | <nicolas17> | oh, I see |
| 03:06:06 | <@JAA> | :-) |
| 03:06:08 | <nicolas17> | it 403s in the end |
| 03:06:51 | <@JAA> | It might also be the old way of installing it. https://repo.fig.io/ has far newer versions. |
| 03:07:18 | <nicolas17> | what's the latest version? |
| 03:07:40 | <@JAA> | Not a clue, but repo has 2.16.0 builds. |
| 03:08:25 | <@JAA> | The files are all helpfully named 'fig.$packageManagerExtension' without a version, and the download page link goes to 'latest'. |
| 03:08:36 | <nicolas17> | open bucket listing nice |
| 03:08:45 | <@JAA> | (The download page also says there are no Linux builds.) |
| 03:09:11 | <@JAA> | I found hints somewhere that there are also Windows builds. Haven't seen those in the bucket. |
| 03:10:23 | <fireonlive> | hmm even setting a user-agent to apt doesn't seem to work, though i'm just guessing at the correct value lol |
| 03:10:32 | <nicolas17> | yeah I think this script is just outdated |
| 03:10:34 | <fireonlive> | curl -A 'debian APT-HTTP/1.3 (2.6.1)' https://pkg-cdn.fig.io/2.5.3/linux/x86_64/fig.deb |
| 03:10:39 | <fireonlive> | :( |
| 03:11:08 | <fireonlive> | and no open bucket either |
| 03:13:03 | <@JAA> | I wanted to try running the script in a Debian container to see what'd happen, but there are some libseccomp2 issues on my test machine. |
| 03:13:18 | <nicolas17> | I did that |
| 03:13:22 | <@JAA> | Ah |
| 03:13:25 | <@JAA> | Thanks |
| 03:13:29 | <nicolas17> | it's not letting me install the package because it says the repository signature expired |
| 03:13:39 | <@JAA> | Heh |
| 03:13:59 | <@JAA> | libfaketime time? |
| 03:14:49 | <@JAA> | Or just --allow-unauthenticated |
| 03:15:04 | <nicolas17> | E: Failed to fetch https://pkg-cdn.fig.io/2.5.3/linux/x86_64/fig.deb 403 Forbidden [IP: 13.227.83.128 443] |
| 03:15:07 | <nicolas17> | this repo is just broken |
| 03:15:18 | <nicolas17> | nothing to see here move along |
| 03:15:22 | <@JAA> | Welp |
| 03:15:25 | <fireonlive> | :( |
| 03:15:36 | <@JAA> | I'll collect the scripts and stuff at least. |
| 03:15:46 | <@JAA> | The redirects will be broken in AB anyway thanks to the 307 bug. |
| 03:16:18 | <fireonlive> | ah it doesn't like the new 307 hotness? |
| 03:17:18 | <@JAA> | It's a bit overzealous at preserving the request: https://github.com/ArchiveTeam/wpull/issues/425 |
| 03:18:41 | | nic quits [Client Quit] |
| 03:18:48 | <fireonlive> | ahh i see |
| 03:19:38 | | nic (nic) joins |
| 03:20:39 | | dumbgoy_ quits [Ping timeout: 265 seconds] |
| 03:21:44 | <nicolas17> | >2019 |
| 03:21:52 | | monoxane quits [Quit: estoy fuera] |
| 03:21:52 | | mindstrut quits [Read error: Connection reset by peer] |
| 03:22:10 | | mindstrut joins |
| 03:23:24 | <fireonlive> | >https://github.com/search?q=org%3AArchiveTeam+author%3Anicolas17&type=pullrequests |
| 03:23:37 | <fireonlive> | :P |
| 03:24:16 | <fireonlive> | just ignore the spider webs (lol webs) in https://github.com/ArchiveTeam/wpull/pulls |
| 03:25:48 | <@JAA> | pkg.fig.io was still in use under a year ago per the GitHub issues: https://github.com/withfig/fig/issues?q=is%3Aissue+pkg.fig.io |
| 03:26:16 | <fireonlive> | hmm |
| 03:28:20 | | DogsRNice quits [Read error: Connection reset by peer] |
| 03:29:50 | | dumbgoy_ joins |
| 03:31:12 | | monoxane (monoxane) joins |
| 03:39:54 | | Megame quits [Client Quit] |
| 03:46:09 | <@JAA> | lol: https://pkg.fig.io/install-headless.sh |
| 03:48:05 | | krvme joins |
| 03:49:05 | <fireonlive> | lol |
| 03:49:21 | | monoxane quits [Client Quit] |
| 03:51:44 | | Krume quits [Ping timeout: 252 seconds] |
| 03:52:46 | | Island quits [Read error: Connection reset by peer] |
| 04:01:23 | | monoxane (monoxane) joins |
| 04:18:19 | | Naruyoko joins |
| 04:30:06 | | HP_Archivist quits [Read error: Connection reset by peer] |
| 04:45:43 | | icedice quits [Client Quit] |
| 05:04:56 | <nicolas17> | JAA: so, what projects are currently running? |
| 05:05:55 | <@JAA> | nicolas17: Xuite, Gfycat, Telegram, not sure what else. |
| 05:08:20 | <fireonlive> | urlteam2, mediafire maybe?, github maybe? |
| 05:10:23 | <nicolas17> | mediafire has no work |
| 05:10:36 | <nicolas17> | JAA: what's keeping reddit and urls paused? |
| 05:10:40 | <fireonlive> | i mean it's technically running though |
| 05:10:42 | <fireonlive> | :p |
| 05:10:54 | <fireonlive> | just needs some sweet !a lovin' |
| 05:11:09 | | monoxane quits [Client Quit] |
| 05:11:24 | <fireonlive> | reddit -> arkiver verifying i.reddit.com; urls... i think just sheer size |
| 05:11:43 | <@JAA> | i.redd.it* but yeah |
| 05:11:45 | <fireonlive> | or 'load shedding' for the latter |
| 05:11:49 | <fireonlive> | ah yeah |
| 05:12:03 | <fireonlive> | i.reddit.com is the now-killed web interface |
| 05:12:16 | <fireonlive> | i did correct myself from i.imgur.com before i hit enter though :D |
| 05:12:16 | <nicolas17> | oh yeah they made the image links even worse |
| 05:12:21 | <fireonlive> | indeed! |
| 05:12:24 | <fireonlive> | it's awful :D |
| 05:12:46 | <nicolas17> | it used to be that clicking an image link showed me the webpage, and I had to right click the image and open in a new tab to see the actual image with usable zooming |
| 05:13:01 | <nicolas17> | now if I open the image in a new tab it loads the goddamn webpage too |
| 05:13:02 | <fireonlive> | now right click does noooothing |
| 05:13:04 | <fireonlive> | :D |
| 05:13:25 | <@JAA> | For a while, I could use view-source to get the image itself. No idea why, never bothered to look into it. |
| 05:14:04 | <fireonlive> | i decided to follow the thot leaders at reddit and host my very own image: https://mkx9delh5a.execute-api.ca-central-1.amazonaws.com/uploads/a-very-nice-image.png |
| 05:14:18 | <@JAA> | Every time I click on an image now, I get redirected to https://old.reddit.com/r/funny/comments/media/nice_hat/ due to my URL rewrites from www.reddit.com to old.reddit.com. |
| 05:14:36 | <fireonlive> | lol |
| 05:14:37 | <nicolas17> | if workers are bored we could resume imgur at a low rate >.> |
| 05:14:54 | <fireonlive> | at least you can see gaga's hat |
| 05:15:29 | <@JAA> | It redirects to https://www.reddit.com/media?url=..., but on old.reddit.com, that redirects to the post with ID 'media' instead. :-) |
| 05:18:08 | <nicolas17> | I think I'm done configuring allll the Apple update assets in my script... now I have 600MB of json responses |
| 05:45:56 | <imer> | tracker taking a nap? "Tracker returned status code 500. The tracker has probably malfunctioned. Retrying after 80 seconds.." |
| 05:51:38 | <fireonlive> | looks like tracker isn't happy |
| 05:52:06 | <fireonlive> | cc JAA |
| 05:54:04 | | owenwastaken joins |
| 05:54:23 | | drunkmoon joins |
| 06:01:34 | <fireonlive> | https://mkx9delh5a.execute-api.ca-central-1.amazonaws.com/uploads/b20a08951272ce78/fix-it.gif |
| 06:01:55 | <imer> | Fusl: hi! tracker has been erroring out on item requests/backfeed for the last ~15min |
| 06:02:49 | <drunkmoon> | seems to be back up |
| 06:02:50 | <imer> | looks to be recovering, just have to start pinging people :D |
| 06:07:13 | <fireonlive> | the Fusl-bat-phone |
| 06:18:10 | <masterX244> | Someone on a game-alpha Discord noticed that Epic Games cleared from their servers the UT assets used by the cancelled UT game that is/was on GitHub for UE license holders. Luckily, I got a full mirror of that data |
| 06:19:00 | | rubberduckie quits [Ping timeout: 265 seconds] |
| 06:22:25 | | Exorcism (exorcism) joins |
| 06:24:44 | | Exorcism quits [Remote host closed the connection] |
| 06:25:48 | | Exorcism (exorcism) joins |
| 06:39:18 | | datechnoman quits [Quit: The Lounge - https://thelounge.chat] |
| 06:39:59 | | datechnoman (datechnoman) joins |
| 06:48:29 | <that_lurker> | fireonlive: https://lounge.kuhaon.fun/folder/b912317d19b4b8b6/JaaSignal.png |
| 06:48:40 | <fireonlive> | 😂 |
| 06:48:44 | <fireonlive> | yes. |
| 06:54:24 | <project10> | https://media.tenor.com/KJYhAJa46UYAAAAC/old-school-batman.gif |
| 06:54:49 | <project10> | bonus points if you hear the sound effect while viewing |
| 07:03:18 | | bladem quits [Quit: Leaving] |
| 07:03:40 | | bladem (bladem) joins |
| 07:09:11 | | Unholy2361316618085 quits [Ping timeout: 252 seconds] |
| 07:10:51 | | z joins |
| 07:11:16 | | z quits [Remote host closed the connection] |
| 07:13:54 | | rubberduckie joins |
| 07:14:09 | | Exorcism quits [Remote host closed the connection] |
| 07:14:26 | | Exorcism (exorcism) joins |
| 07:15:47 | | nulldata quits [Ping timeout: 252 seconds] |
| 07:18:51 | | nulldata (nulldata) joins |
| 07:19:38 | | nicolas17 quits [Ping timeout: 252 seconds] |
| 07:20:11 | | owenwastaken quits [Remote host closed the connection] |
| 07:23:44 | | owen joins |
| 07:24:33 | | Arcorann (Arcorann) joins |
| 07:26:12 | | owen quits [Client Quit] |
| 07:33:58 | | PredatorIWD_ joins |
| 07:35:02 | | PredatorIWD quits [Ping timeout: 252 seconds] |
| 08:45:50 | | sec^nd quits [Remote host closed the connection] |
| 08:46:17 | | sec^nd (second) joins |
| 08:46:32 | | dumbgoy_ quits [Ping timeout: 252 seconds] |
| 09:20:05 | | kiryu quits [Ping timeout: 252 seconds] |
| 09:55:01 | | BlueMaxima quits [Client Quit] |
| 10:00:01 | | railen63 quits [Remote host closed the connection] |
| 10:00:19 | | railen63 joins |
| 10:06:47 | | kiryu joins |
| 11:03:06 | <nstrom|m> | Tracker /backfeed unhappy again |
| 11:03:19 | <yts98> | the tracker is returning 500 and failing to accept backfeeds again |
| 11:27:21 | | Exorcism quits [Client Quit] |
| 11:37:20 | | test joins |
| 11:38:17 | | test quits [Remote host closed the connection] |
| 11:47:51 | | Krume (Krume) joins |
| 11:48:02 | | krvme quits [Ping timeout: 252 seconds] |
| 11:53:33 | | ssssss quits [Remote host closed the connection] |
| 12:09:58 | | plcp_ joins |
| 12:10:03 | <plcp_> | re |
| 12:11:31 | <pabs> | :) pls repeat for those not on #archiveteam plcp_ |
| 12:11:38 | <plcp_> | okok |
| 12:11:42 | <plcp_> | buckle up |
| 12:12:46 | <plcp_> | All personal websites hosted on the personal-pages service of the main telco operator in France are going offline by September 5th; they have a registry here https://annuaire-pp.orange.fr/accueil |
| 12:13:19 | <plcp_> | https://pages.perso.orange.fr/ |
| 12:13:27 | <plcp_> | The announcement (in French, sry) |
| 12:14:09 | <pabs> | pokechu22 flashfire42 JAA have been doing some ArchiveBot jobs for orange |
| 12:14:10 | <qyxojzh|m> | I can help translate if need be |
| 12:14:22 | <plcp_> | all *.pagesperso-orange.fr and all *.monsite-orange.fr |
| 12:15:23 | <pabs> | pokechu22: did your orange !a < jobs cover https://telecommunications.monsite-orange.fr/ ? plcp_ mentioned that as an example |
| 12:15:39 | <plcp_> | I'm worried, especially because it's composed mostly of non-tech-savvy people, non-profits and older folks, who for the most part built tens of thousands of pages on topics they're passionate about, and won't be migrate anywhere |
| 12:16:23 | <plcp_> | *be migrated |
| 12:17:36 | <pabs> | yeah, ISP hosting is quite endangered in general https://wiki.archiveteam.org/index.php?title=ISP_Hosting |
| 12:18:33 | <yts98> | Pixnet (https://pixnet.net/), the last large blog service provider in Taiwan, which accepted migrations from Yahoo! Blog, Wretch, yam天空部落 and Xuite, announced that it will delete inactive accounts (before 2020-01-01) on 2023-12-01: https://admin.pixnet.net/blog/post/49016232 |
| 12:19:49 | <yts98> | I consider that Pixnet is partially endangered and it's going to be another large DPoS project |
| 12:21:46 | <pabs> | seems like something to mention on the announce channel #archiveteam too |
| 12:22:07 | <pabs> | and add it to deathwatch https://wiki.archiveteam.org/index.php/Deathwatch |
| 12:22:52 | <yts98> | pabs: I briefly mentioned it on #archiveteam and am editing the wiki :p |
| 12:24:06 | | plcp_ quits [Remote host closed the connection] |
| 12:26:20 | <h2ibot> | Yts98 edited Deathwatch (+164, Add Pixnet): https://wiki.archiveteam.org/?diff=50618&oldid=50616 |
| 12:32:26 | | plcp joins |
| 12:32:29 | <plcp> | re |
| 12:33:48 | <plcp> | (the "web interface" link here https://wiki.archiveteam.org/index.php/Archiveteam:IRC#How_do_I_chat_on_IRC? may be updated from #archiveteam to #archiveteam-bs to avoid ppl in a hurry polluting the announce chan) |
| 12:35:00 | <plcp> | and thanks for the rapid answer pabs |
| 12:38:05 | <pabs> | good idea, fixed |
| 12:38:23 | <h2ibot> | PaulWise edited Archiveteam:IRC (+3, set #archiveteam-bs as the default channel): https://wiki.archiveteam.org/?diff=50619&oldid=50560 |
| 12:40:24 | <h2ibot> | PaulWise edited Archiveteam:IRC (+3, fix web link too): https://wiki.archiveteam.org/?diff=50620&oldid=50619 |
| 12:51:02 | | owen joins |
| 13:36:23 | | Arcorann quits [Ping timeout: 252 seconds] |
| 14:00:02 | | AmAnd0A quits [Ping timeout: 252 seconds] |
| 14:00:11 | | AmAnd0A joins |
| 14:20:07 | | owen quits [Client Quit] |
| 14:20:46 | | DogsRNice joins |
| 14:21:44 | <h2ibot> | Yts98 created PIXNET (+4100, inactive accounts of PIXNET is endangered): https://wiki.archiveteam.org/?title=PIXNET |
| 14:23:06 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 14:23:20 | | AmAnd0A joins |
| 14:25:45 | <h2ibot> | Yts98 edited Deathwatch (+0, Capitalize PIXNET): https://wiki.archiveteam.org/?diff=50622&oldid=50618 |
| 15:00:02 | | kiryu quits [Ping timeout: 265 seconds] |
| 15:02:44 | | pabs quits [Ping timeout: 252 seconds] |
| 15:05:50 | | pabs (pabs) joins |
| 15:09:31 | | Island joins |
| 15:10:29 | | Exorcism (exorcism) joins |
| 15:14:14 | | kiryu joins |
| 15:24:37 | | HP_Archivist (HP_Archivist) joins |
| 15:31:20 | | rubberduckie quits [Ping timeout: 252 seconds] |
| 15:33:54 | | rubberduckie joins |
| 15:42:10 | | dumbgoy_ joins |
| 15:50:18 | | qwertyasdfuiopghjkl quits [Ping timeout: 265 seconds] |
| 15:57:33 | | rubberduckie quits [Ping timeout: 265 seconds] |
| 16:32:01 | | nicolas17 joins |
| 16:38:29 | | VerifiedJ quits [Client Quit] |
| 16:40:21 | | VerifiedJ (VerifiedJ) joins |
| 16:42:16 | | rubberduckie joins |
| 16:44:45 | <pokechu22> | plcp: no, I don't think I've got any of https://telecommunications.monsite-orange.fr |
| 16:45:05 | <pokechu22> | er, wait, one sec |
| 16:45:25 | <pokechu22> | still waking up, thought that was something like telecommunications-orange.fr and not a subdomain of monsite-orange.fr |
| 16:58:36 | <pokechu22> | plcp: Yeah, that's on the priority list running in AB. flashfire42 also did several jobs for it starting on various pages (but would have recursed over the whole site on each one), see https://archive.fart.website/archivebot/viewer/domain/telecommunications.monsite-orange.fr |
| 17:04:05 | <plcp> | nice |
| 17:04:35 | <plcp> | I'm going through some of these websites, looks like there's some amount of badly rewritten ones |
| 17:06:28 | <plcp> | some have their homepages hosted as "<handle>.pagesperso-orange.fr" but when crawling they use legacy "http://perso.wanadoo.fr/<handle>/" urls that no longer work |
| 17:06:49 | <plcp> | but rewriting these urls to pagesperso fixes the website |
| 17:06:55 | <plcp> | what a nightmare |
| 17:07:39 | <Exorcism|TheLounge> | plcp: french operator woohoo ☆*: .。. o(≧▽≦)o .。.:*☆ |
| 17:12:16 | <pokechu22> | Unfortunately the site bans for 24 hours if you request faster than 1 page/second so it's unlikely we'll get everything - if there was more time it'd probably be possible to handle those legacy URLs but I don't think we will be able to :| |
| 17:18:05 | | kdy is now known as kdy_ |
| 17:19:00 | | kdy (kdy) joins |
| 17:23:29 | | kdy_ quits [Quit: Fly Away~] |
| 17:24:19 | <AntoninDelFabbro|m> | Also, an history, from as far as I know:... (full message at <https://matrix.hackint.org/_matrix/media/v3/download/hackint.org/ifjFDwHYwnddPQjBIuGqPxQh>) |
| 17:28:33 | <pokechu22> | I don't think http versus https indicates which site builder is used. Instead, if the username has multiple dots in it, it gets http, and if it doesn't, it gets https, because an SSL certificate for *.monsite-orange.fr only covers subdomains without dots and there isn't a way to do *.*.monsite-orange.fr |
| 17:29:24 | <pokechu22> | that can also be seen by looking at what http://perso.orange.fr/DEMO and http://perso.orange.fr/FOO.BAR redirect to |
| 17:29:43 | <plcp> | pokechu22: they rate limit that aggressively? |
| 17:29:59 | <AntoninDelFabbro|m> | Bruh, I forgot |
| 17:29:59 | <AntoninDelFabbro|m> | ↓ http://monsite.orange.fr/DEMO |
| 17:29:59 | <AntoninDelFabbro|m> | ↓ http://DEMO.monsite.orange.fr/ |
| 17:29:59 | <AntoninDelFabbro|m> | ↓ ... |
| 17:30:22 | <pokechu22> | They apply a ban after an hour or two of sustained requests at a high speed, but it does seem like it's that strict overall |
| 17:30:39 | <plcp> | ah that's why I was able to wget one site |
| 17:31:11 | <plcp> | but if I go for the 44k something pages, it won't work |
| 17:31:24 | <pokechu22> | Yeah |
| 17:31:26 | <plcp> | (still scraping their registry) |
| 17:32:34 | <pokechu22> | The other annoying factor is that sites and pages that don't exist redirect to https://r.orange.fr/r/Oerreur_404 and then https://e.orange.fr/error404.html, and both of those pages also count into the rate limit. (And ArchiveBot doesn't have a way of applying ignores to redirect targets, so it requests those every time) |
| 17:32:38 | <plcp> | even just downloading one page per site, the front index.html, will require days with one ip |
| 17:33:27 | <plcp> | with that rate limit, should have started a year ago :D |
| 17:33:36 | <plcp> | pokechu22: when did you start? |
| 17:34:09 | <pokechu22> | A few days ago |
| 17:34:19 | <plcp> | well shit |
| 17:34:42 | <pokechu22> | The list of high-priority sites that are likely to exist (https://transfer.archivete.am/6gcam/pagesperso-orange.fr_pagespro-orange.fr_monsite-orange.fr_seed_urls_thuban_priority.txt) has already downloaded all of the front pages at least |
| 17:34:55 | <pokechu22> | but it seems unlikely it'll get everything else |
| 17:35:20 | <AntoninDelFabbro|m> | as he said: well shit |
| 17:36:11 | | rubberduckie quits [Ping timeout: 252 seconds] |
| 17:36:30 | <pokechu22> | I have a bunch of other jobs running on different IPs based on other lists I generated (e.g. sites that have no existing coverage at all, most of which don't exist but it's found 646 of them so far that do, and some other generated lists) |
| 17:36:59 | <pokechu22> | but we should have started a while back :| |
| 17:37:13 | <plcp> | the aforementioned list looks like their registry scraped |
| 17:37:28 | <plcp> | pokechu22: they announced it like three months ago iirc |
| 17:37:28 | <pokechu22> | Yes |
| 17:37:41 | <plcp> | but the information reached me like, today |
| 17:38:07 | <pokechu22> | flashfire42 has been running individual sites for a while: https://archive.fart.website/archivebot/viewer/?q=orange.fr - it just took a while to build up lists of sites |
| 17:40:36 | <pokechu22> | We only got a full registry list 2 days ago. See https://hackint.logs.kiska.pw/archiveteam-bs/20230828 (and https://hackint.logs.kiska.pw/archiveteam-bs/20230827#c374594) |
| 17:47:50 | <plcp> | 159k pages! |
| 17:47:51 | <plcp> | wow |
| 17:48:00 | <plcp> | that's triple the amount from the registry |
| 17:55:06 | <plcp> | okok, brb spamming all friends that may have worked once in their life at orange |
| 17:57:31 | | Exorcism quits [Ping timeout: 245 seconds] |
| 17:58:31 | <plcp> | https://drop.chapril.org/download/37075644d302ef4f/#p_ubcDqTNiANhPEyXHb3Qw |
| 17:58:37 | <plcp> | here's my list |
| 18:01:36 | <@JAA> | Rehosted because JS nonsense: https://transfer.archivete.am/7gCmW/orange-list.txt.zst |
| 18:03:54 | | threedeeitguy3 quits [Remote host closed the connection] |
| 18:06:04 | <pokechu22> | http://1a1.emploi.pour.cadre.technique.top.performant.pagesperso-orange.fr/ - this is an excellent URL and site... |
| 18:06:57 | <@JAA> | (.tar.gz unpacked and then recompressed with zstd, to be precise.) |
| 18:10:06 | <pokechu22> | It looks like a few of those are new |
| 18:10:18 | <fireonlive> | --ultra --22? :p |
| 18:10:59 | | rubberduckie joins |
| 18:13:15 | <fireonlive> | does higher # have any effect on the decompressor? or just when compressing |
| 18:13:29 | <@JAA> | fireonlive: Nah, -10 is my go-to. And yes, the --ultra levels require more memory to decompress IIRC. |
| 18:13:42 | <nicolas17> | fireonlive: you can test that |
| 18:14:11 | <fireonlive> | nicolas17: technically correct |
| 18:14:15 | <fireonlive> | JAA: ah :) |
| 18:14:17 | <nicolas17> | I mean like, easily |
| 18:14:20 | <pokechu22> | https://belleilescapade.monsite-orange.fr/ and https://patrocle.monsite-orange.fr/ are completely new; they weren't on any previous list. https://erawylersitedefleur.pagesperso-orange.fr/ and https://iniri.pagesperso-orange.fr/ were on one of the lists but not the priority one. |
| 18:14:33 | <nicolas17> | "zstd -b1 -e19 file.txt" will benchmark all levels 1 to 19 and give you the compression ratio, and compression and decompression speed |
| 18:14:40 | <fireonlive> | ah! nice |
| 18:14:42 | <pokechu22> | otherwise your list matched the orangefr_online_raw.txt one pretty closely |
| 18:15:27 | <nicolas17> | and if either compression or decompression takes less than 1 second, it runs multiple times to get a better measurement |
| 18:17:24 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
| 18:19:06 | <imer> | neat, zstd continues to impress me |
| 18:19:52 | | threedeeitguy3 (threedeeitguy) joins |
| 18:20:36 | <nicolas17> | there's one disappointing thing though |
| 18:21:29 | <nicolas17> | "--format=FORMAT: compress and decompress in other formats. If compiled with support, zstd can compress to or decompress from other compression algorithm formats. Possibly available options are zstd, gzip, xz, lzma, and lz4." |
| 18:21:46 | | itachi1706 (itachi1706) joins |
| 18:21:49 | <nicolas17> | it doesn't support benchmarking them :( -b only does zstd format |
| 18:37:12 | | Megame (Megame) joins |
| 18:39:15 | | AmAnd0A quits [Read error: Connection reset by peer] |
| 18:39:56 | | AmAnd0A joins |
| 18:42:42 | <fireonlive> | :( |
| 18:49:54 | | c joins |
| 18:51:26 | | c quits [Remote host closed the connection] |
| 18:55:53 | <plcp> | JAA: thanks |
| 19:19:11 | | railen63 quits [Remote host closed the connection] |
| 19:23:10 | | railen63 joins |
| 19:26:49 | | AlsoHP_Archivist joins |
| 19:31:11 | | HP_Archivist quits [Ping timeout: 265 seconds] |
| 19:31:40 | | AlsoHP_Archivist quits [Client Quit] |
| 19:31:57 | | HP_Archivist (HP_Archivist) joins |
| 20:07:59 | | Shampoo2140 quits [Ping timeout: 252 seconds] |
| 20:10:20 | | lukash98 joins |
| 20:11:18 | | lukash9 quits [Ping timeout: 265 seconds] |
| 20:11:18 | | lukash98 is now known as lukash9 |
| 20:13:24 | | Shampoo2140 joins |
| 20:16:59 | <h2ibot> | JustAnotherArchivist edited Deathwatch (+10, Link to Game Atsumaru section on [[Niconico]]): https://wiki.archiveteam.org/?diff=50623&oldid=50622 |
| 20:27:11 | <@JAA> | transfer will be getting a bit of an upgrade soonish. Planned changes include adding on-the-fly zstd compression support on upload, removing the forced download (i.e. no longer requiring /inline/ for browser access), and pasting content directly on the web interface (thanks to upstream's implementation of that). Now's your opportunity for further ideas. :-) |
| 20:30:57 | <@arkiver> | wooh :) |
| 20:31:21 | <@arkiver> | so also no need for zstd'ing stuff ourselves and taking .zst off from the URL? |
| 20:31:24 | <@arkiver> | JAA: ^ |
| 20:31:55 | <fireonlive> | JAA: 🥳🥳🥳🥳🥳🥳🥳🥳🥳🥳🥳 |
| 20:32:01 | <@JAA> | arkiver: Correct, no need for that anymore, although it might still be preferable if you want to minimise the amount of data transferred (e.g. slow connections). |
| 20:32:29 | <fireonlive> | paste text -> uploads a .txt file? :3 |
| 20:32:45 | <fireonlive> | is that what you mean or did they finally add paste binary -> uploads binary |
| 20:33:21 | <@arkiver> | JAA: nice |
| 20:34:15 | <@JAA> | fireonlive: I don't know exactly how it works, just saw it in the changelog. |
| 20:34:20 | <fireonlive> | ahh |
| 20:34:27 | <@JAA> | But I assume pasting text, yeah. |
| 20:35:13 | <@JAA> | transfer.sh-web is a clusterfuck, so the diff is very useful: https://github.com/dutchcoders/transfer.sh-web/pull/58/files |
| 20:35:21 | <fireonlive> | oh thanks |
| 20:35:44 | <fireonlive> | allow a certain UA to access image/video files? :3 |
| 20:36:07 | <fireonlive> | though idk if the 🐰 is advanced enough for that |
| 20:36:34 | <@JAA> | Oh, I guess this is really it: https://github.com/dutchcoders/transfer.sh-web/pull/58/files#diff-738ca807f137aa95054f4d49bc42f48f8f85b1acf13e381d268415f6d4f09417 |
| 20:36:45 | <@JAA> | So that seems a bit underwhelming. We'll see though. |
| 20:36:52 | <fireonlive> | 300,000 changes to 'modTime: time.Unix(1668857825, 0),' |
| 20:37:05 | <@JAA> | What do you mean regarding UA access? |
| 20:37:09 | <fireonlive> | ah yeah, listening for files in the clipboard |
| 20:37:22 | <fireonlive> | TheLounge's link preview thingy |
| 20:37:46 | <fireonlive> | dunno if you can allowlist say just stuff ending in .jpg/.png/etc |
| 20:38:27 | <@JAA> | The Lounge is blocked specifically because dozens of people would spam the server within milliseconds of a link getting shared, and it caused problems on the server side including a fun crash due to a mutex bug. |
| 20:38:44 | <fireonlive> | bindata_gen.go scares me: var _bindataDistScriptsMainJs = |
| 20:38:53 | <fireonlive> | https://github.com/dutchcoders/transfer.sh-web/pull/58/files#diff-eef14c30d770fdc35b929095526891a4d3b2dc4ae748face27cafe361367926aR2037 haha |
| 20:39:54 | <fireonlive> | ah ye, after that was patched I thought it was more of a bandwidth thing |
| 20:40:07 | <@JAA> | https://github.com/dutchcoders/transfer.sh/issues/380 |
| 20:41:31 | <fireonlive> | hm, make delete urls available if possible? |
| 20:41:39 | <fireonlive> | if one were to accidentally shove a file? |
| 20:41:51 | <fireonlive> | they seem to be hidden on the AT instance |
| 20:41:55 | <fireonlive> | (or i'm dumb) |
| 20:42:11 | <@JAA> | Yeah, that's the other part of it. When a large file gets linked, a dozen downloads of it would be started simultaneously, which is *great*. |
| 20:42:58 | <fireonlive> | ye, i figured limiting it to images at least would be somewhat better instead of everyone trying to download 100MB files to immediately throw out haha |
| 20:43:02 | <fireonlive> | but either way is fine |
| 20:43:27 | <@JAA> | Don't remember as I hardly ever use the web interface, but will check. |
| 20:44:53 | <fireonlive> | oh, i guess i just misremembered: they show in curl at least: x-url-delete: https://transfer.archivete.am/PMcII/test.txt/kyjYcQjrG1 |
| 20:45:36 | <fireonlive> | ah yeah but not on web |
| 20:46:34 | <@JAA> | Yeah, the header exists, but it isn't always present. Depends on how the upload is done. |
| 20:47:17 | <fireonlive> | i was like I tried to delete this but ye it's probably just cached |
| 20:47:52 | | owen joins |
| 21:08:40 | | mcint joins |
| 21:35:48 | | Kline joins |
| 21:36:45 | <Kline> | Question. After I installed my AT Warrior I was able to access the UI on localhost:8001 once and now loads forever. How can I solve this? |
| 21:42:30 | | efawfeawfew joins |
| 21:55:03 | <pokechu22> | Is it still running? If it's not running (or it's just starting up) it'll either load forever or immediately fail to load |
| 21:55:18 | <plcp> | pokechu22: question |
| 21:56:17 | <plcp> | I have a half day of free time before leaving for holidays, away from my computers, until next monday (more or less ~5 days of continuous querying with up to 3 unique IPs & machines) |
| 21:56:26 | <plcp> | what do I do during this half day |
| 21:56:53 | <plcp> | is it worth it to learn to set up an "archive warrior" to contribute to the effort? |
| 21:56:57 | <Kline> | pokechu22 it's running like it would normally, just cant access localhost |
| 21:57:00 | <pokechu22> | I don't think we have any kind of distributed project set up for orange |
| 21:57:24 | <pokechu22> | Setting up the warrior isn't too hard but it wouldn't be targeting orange specifically |
| 21:58:09 | <plcp> | so I can just get wget to spit out as many WARCs as possible w/o being banned, and it would be somewhat useful |
| 21:58:18 | <pokechu22> | You're connecting to http://localhost:8001/ and not https://localhost:8001/ right? |
| 21:59:05 | | Kline quits [Client Quit] |
| 21:59:13 | <pokechu22> | Yeah, that'd be useful, though it'd be hard to avoid duplicating other work |
| 21:59:30 | <thuban> | the orange.fr priority job is onto its third pass, so that's pretty cool--we have assets (and one layer of links) for front pages of all those sites |
| 21:59:34 | <thuban> | that said, queue has been slowly growing, so while we might finish the majority of sites (which are small), we definitely will not completely get the large ones by the deadline |
| 22:00:00 | <plcp> | nice |
| 22:00:13 | | Kline joins |
| 22:00:21 | <Kline> | pokechu22 yup |
| 22:00:58 | <pokechu22> | You could try http://127.0.0.1:8001/ or something like that maybe? |
| 22:01:45 | <Kline> | there we go, loaded after around 30 seconds |
| 22:01:54 | <Kline> | thanks for the help :] |
| 22:04:35 | | JTL quits [Ping timeout: 252 seconds] |
| 22:08:51 | <plcp> | mmmh I guess I'll find a way to prioritize some orange sites over others, and get as much shit as possible before the deadline |
| 22:10:23 | <Kline> | ok well it loaded.. but i cant do anything on the interface :p |
| 22:10:28 | <Kline> | screenshot for reference: https://ibb.co/VwngwJG |
| 22:12:56 | | BlueMaxima joins |
| 22:13:47 | | siinus quits [Quit: rage quit] |
| 22:33:27 | | siinus (siinus) joins |
| 22:39:14 | <AntoninDelFabbro|m> | <thuban> "that said, queue has been slowly..." <- Where to find it in the warrior list? |
| 23:05:05 | | bf_ quits [Ping timeout: 252 seconds] |
| 23:15:55 | <fireonlive> | AntoninDelFabbro|m: those are archivebot jobs, so no warrior support: http://archivebot.com/ |
| 23:16:16 | <fireonlive> | can type 'orange' in the "Show" box to see |
| 23:19:17 | <@JAA> | http://archivebot.com/?initialFilter=orange |
| 23:19:23 | | AmAnd0A quits [Ping timeout: 252 seconds] |
| 23:20:20 | | AmAnd0A joins |
| 23:28:51 | <fireonlive> | ah yeah that’s better |
| 23:39:44 | | hogchips quits [Ping timeout: 252 seconds] |
| 23:50:06 | | hogchips joins |
| 23:50:06 | | hogchips is now authenticated as shoghicp |
| 23:50:07 | | hogchips quits [Changing host] |
| 23:50:07 | | hogchips (shoghicp) joins |
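
Note on the 17:28:33 remark about wildcard certificates: pokechu22's point that a certificate for *.monsite-orange.fr covers only subdomains without dots can be illustrated with a small sketch. This is not taken from the log or from any TLS library; `wildcard_matches` is a hypothetical helper that assumes a simplified form of the usual matching rule, where `*` is honoured only as the whole leftmost label and stands for exactly one label. Real TLS stacks apply further checks (internationalised names, public suffixes, partial-label wildcards).

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Return True if a certificate name such as '*.example.org' covers hostname.

    Simplified, assumed rule: '*' is only honoured as the entire leftmost
    label and stands for exactly one label, so it never spans a dot.
    """
    pattern_labels = pattern.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    # A wildcard replaces one label, so both names need the same label count.
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        # The wildcard label must stand for a non-empty label; the rest must match exactly.
        return bool(host_labels[0]) and pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


if __name__ == "__main__":
    # Hostnames modelled on the DEMO and FOO.BAR examples from the log.
    for host in ("demo.monsite-orange.fr", "foo.bar.monsite-orange.fr"):
        print(host, wildcard_matches("*.monsite-orange.fr", host))
```

Under these assumptions a single-label username is covered by the wildcard, while a username containing a dot is not, which is consistent with the observation that dotted usernames are served over plain http.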