00:02:35 | <@JAA> | We could just throw it into AB. Probably without offsite links and possibly ignoring the individual post URLs. It's standard XenForo, so that should work well. |
00:03:35 | <Webuser230> | Not sure if your Xenforo archiver accounts for this but clicking the "posts by user" link on a user's page creates a new search permalink |
00:03:42 | <Webuser230> | and there's about 3 million users |
00:04:09 | <@JAA> | ArchiveBot is just a generic recursive crawler. But we can ignore URLs by regex. |
00:04:52 | <@JAA> | There are also 'ignore sets' for some common things, including some forum softwares, but I don't recall whether XenForo is covered there. I think not. |
00:07:21 | <@JAA> | Looks like the forums are not very active, so there's also the risk of them getting shut down at some point due to that, and content getting shoved into walled gardens like Discord instead. |
00:08:33 | <Webuser230> | A lot of it is in Discord already, they only really have the forum because they want to make an inbuilt comment feature on the mainsite, but they don't want to handroll their own forum software since it leads to vulnerabilities |
00:10:23 | <fireonlive> | discord :( |
00:16:47 | <@JAA> | I've started an ArchiveBot job. You can watch the progress at http://archivebot.com/ under job 5j95pfluutwfmi6azi184pty6. |
00:17:08 | <katia> | can i put these 3,304.6 GB from 37c3 ftps anywhere? |
00:19:03 | <fireonlive> | would probably be a good idea.. i think there was interest on JAA's side but can't remember if arkiver said anything at the time |
00:20:11 | <fireonlive> | katia: going to retire it to the great farm up in the sky soon-ish? |
00:20:17 | <@JAA> | Yeah, this would be nice to get onto IA, I think. |
00:20:35 | <@JAA> | Plain files, separate dir per server, I assume? |
00:20:40 | <katia> | yeah |
00:21:00 | <@JAA> | How much data is the largest server? |
00:21:23 | <@JAA> | Also, how many servers? |
00:21:34 | <katia> | https://transfer.archivete.am/UCKXB/hello |
00:21:35 | <eggdrop> | inline (for browser viewing): https://transfer.archivete.am/inline/UCKXB/hello |
00:21:40 | <katia> | file list |
00:21:53 | <katia> | if you're good at math you can add the bytes together i suppose |
00:22:38 | <fireonlive> | math is hard :( |
00:22:40 | <fireonlive> | :p |
00:22:41 | <@JAA> | Hold my beer. |
00:22:46 | | katia drinks it |
00:22:55 | <@JAA> | 41 servers |
00:23:03 | <katia> | off-by-one error |
00:23:06 | <fireonlive> | :D |
00:25:01 | <@JAA> | Ah, looks like it's choking a little on those non-UTF8 filenames. |
00:25:05 | <fireonlive> | would be interesting to do a diff of sorts at 38c3 next year |
00:25:25 | <@JAA> | 21 servers |
00:25:34 | <katia> | i wish i had the idea to do this earlier :') |
00:25:36 | <fireonlive> | i.e. grab the stuff from the item and see what's available there next year and output "what's new" |
00:25:52 | <katia> | this |
00:25:57 | <fireonlive> | ugh me too. most weeks are bad but that wasn't the best week for me due to christmas |
00:26:06 | <fireonlive> | woulda grabbed more of the gay porn ftp at least :P |
00:26:25 | <fireonlive> | fire? missed out on the gpftp? impossible! |
00:27:30 | <@JAA> | Looks like the largest is 151.217.62.109 at 561.20 GiB. |
00:28:11 | <@JAA> | I'd lean towards one item per server with a tar (uncompressed) inside. |
00:28:32 | <@JAA> | But let's see what arkiver has to say. :-) |
00:33:42 | <fireonlive> | !seen arkiver |
00:33:43 | <eggdrop> | [seen] arkiver (~arkiver@2a01:4f9:c010:4d02::1) was last seen talking on #frogger 13 hours 42 minutes 31 seconds ago. arkiver is still on #frogger. |
00:33:49 | <fireonlive> | soon :3 |
00:37:31 | | jasons quits [Ping timeout: 272 seconds] |
01:15:23 | | lennier2_ quits [Client Quit] |
01:19:40 | | lennier1 (lennier1) joins |
01:40:14 | | jasons (jasons) joins |
01:44:37 | | Webuser230 quits [Ping timeout: 265 seconds] |
02:24:07 | <ScenarioPlanet> | Has anyone tried to save GD levels? I've been picking a pause time and got cloudflared for two months |
02:25:41 | <pokechu22> | GD? |
02:25:52 | <ScenarioPlanet> | Should be 10s or more per request, they have a crap ton of IDs |
02:25:59 | <ScenarioPlanet> | Geometry Dash |
02:26:15 | <ScenarioPlanet> | Plus, it uses POST & secret token in form data |
02:34:24 | <TheTechRobo> | I looked into it awhile back, shouldn't be too hard |
02:41:39 | | jasons quits [Ping timeout: 272 seconds] |
02:44:44 | | tailormadekelso24 joins |
02:45:13 | <tailormadekelso24> | Can I access the archive |
02:45:42 | <nicolas17> | what |
02:45:43 | | tailormadekelso24 quits [Remote host closed the connection] |
02:49:31 | | kiryu joins |
02:49:31 | | kiryu is now authenticated as kiryu |
02:49:31 | | kiryu quits [Changing host] |
02:49:31 | | kiryu (kiryu) joins |
02:51:55 | <project10> | ¯\_ʘ‿ʘ_/¯ |
03:00:33 | | pseudorizer quits [Quit: ZNC 1.8.2 - https://znc.in] |
03:01:25 | | pseudorizer (pseudorizer) joins |
03:13:36 | | qtyz joins |
03:14:16 | <qtyz> | hi, is there a way to get all followers and/or following list of a twitter account? |
03:17:27 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
03:27:31 | <pabs> | qtyz: probably via nitter.net? |
03:32:10 | <qtyz> | pabs: it doesn't look like it shows followers and such |
03:32:33 | <pabs> | hmm |
03:32:37 | <@JAA> | Followers and followings are only visible when logged in. |
03:33:51 | <qtyz> | yeah and even then twitter only returns around 100 or something |
03:34:11 | <qtyz> | it should be possible with api to get all followers i think |
03:37:11 | <@JAA> | Which API? ;-) |
03:37:36 | <nicolas17> | the undocumented one used by the official app? |
03:37:53 | <qtyz> | i dont know |
03:38:40 | <@JAA> | Beautiful, the 'Getting access to the Twitter API' link on https://developer.twitter.com/en/docs/twitter-api/getting-started/about-twitter-api is a 404. |
03:39:55 | <audrooku|m> | It also calls it Twitter, I guess they haven't had a single api customer to mention it, yet |
03:40:55 | <@JAA> | According to tweepy's docs, the followers and followings API endpoints require the Enterprise API, so have fun. |
03:41:20 | <nicolas17> | I saw some deletions happening on the samsung opensource site so I'm now just trying to upload stuff and I'll deal with metadata later https://archive.org/details/@nicolas09f9?and[]=creator%3Asamsung&sort=titleSorter |
03:44:15 | <nicolas17> | but I'm getting pretty awful speeds in both directions |
03:44:37 | | jasons (jasons) joins |
03:50:41 | <qtyz> | @JAA yeah I got a lot of 404s while trying to browser api docs as well. what a shame |
03:50:50 | <qtyz> | trying to browse* |
03:51:04 | <@JAA> | Twitter breaking in all sorts of ways has been a meme for over a year now. |
03:51:15 | <nicolas17> | qtyz: well in case you missed it the basic API costs US$100/month |
03:51:16 | <@JAA> | I wonder what happened a bit over a year ago... :-) |
03:51:54 | <nicolas17> | and JAA says it's not enough to get follower info :P |
03:52:50 | <@JAA> | Yeah, the Basic and Pro APIs are quite limited. |
03:53:01 | <nicolas17> | Pro is $5000/month |
03:53:03 | <@JAA> | And Enterprise is 'contact us' pricing. |
03:53:20 | | MetaNova quits [Ping timeout: 240 seconds] |
03:59:05 | | MetaNova (MetaNova) joins |
04:05:28 | | sec^nd quits [Remote host closed the connection] |
04:07:35 | <audrooku|m> | There are certainly cheaper ways to get much of the info |
04:08:34 | | sec^nd (second) joins |
04:45:09 | | jasons quits [Ping timeout: 272 seconds] |
05:01:50 | | emberquill080 quits [Quit: The Lounge - https://thelounge.chat] |
05:02:16 | | emberquill080 (emberquill) joins |
05:02:48 | | Megame quits [Client Quit] |
05:15:25 | | DogsRNice quits [Read error: Connection reset by peer] |
05:17:42 | | qtyz quits [Client Quit] |
05:17:43 | | qwertyasdfuiopghjkl quits [Client Quit] |
05:20:51 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
05:21:46 | | emberquill080 quits [Client Quit] |
05:22:34 | | emberquill080 (emberquill) joins |
05:34:20 | | nulldata quits [Ping timeout: 240 seconds] |
05:44:52 | | nulldata (nulldata) joins |
05:48:10 | | jasons (jasons) joins |
06:04:36 | | qwertyasdfuiopghjkl quits [Client Quit] |
06:06:15 | | sec^nd quits [Ping timeout: 255 seconds] |
06:13:42 | | sec^nd (second) joins |
06:22:45 | | jtagcat quits [Quit: Bye!] |
06:23:07 | | jtagcat (jtagcat) joins |
06:27:58 | | UserH quits [Client Quit] |
06:30:09 | | Island quits [Read error: Connection reset by peer] |
06:37:57 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
06:43:20 | | jasons quits [Ping timeout: 240 seconds] |
06:51:50 | | icedice quits [Client Quit] |
07:36:22 | <@arkiver> | JAA: katia: fireonlive: yes, one tar or zip per ftp. one ftp per item |
07:36:31 | <@arkiver> | we may soon also be able to crawl them with a Warrior project |
07:47:10 | | jasons (jasons) joins |
08:04:42 | | lizardexile joins |
08:06:21 | | lizardexile quits [Read error: Connection reset by peer] |
08:06:25 | | lizardexile joins |
08:47:43 | | jasons quits [Ping timeout: 272 seconds] |
08:49:36 | | lizardexile quits [Client Quit] |
09:32:29 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
09:50:49 | | jasons (jasons) joins |
10:00:04 | | Bleo18260 quits [Client Quit] |
10:01:26 | | Bleo18260 joins |
10:04:07 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
10:49:57 | | jasons quits [Ping timeout: 272 seconds] |
11:24:52 | | BlueMaxima quits [Read error: Connection reset by peer] |
11:48:13 | | qwertyasdfuiopghjkl quits [Client Quit] |
11:52:55 | | jasons (jasons) joins |
11:54:12 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
12:00:14 | | bladem (bladem) joins |
12:25:55 | <Pedrosso> | What happened to the 8f579nrvtpp3qgdc5axt01kjw job? https://transfer.archivete.am/GqdCj/www.curseforge.com_api_later_pages.txt |
12:25:55 | <eggdrop> | inline (for browser viewing): https://transfer.archivete.am/inline/GqdCj/www.curseforge.com_api_later_pages.txt |
12:44:25 | | tertu quits [Quit: so long...] |
12:45:22 | | tertu (tertu) joins |
12:48:20 | | jasons quits [Ping timeout: 240 seconds] |
13:16:50 | | Arcorann quits [Ping timeout: 240 seconds] |
13:41:02 | | Wohlstand (Wohlstand) joins |
13:47:34 | | qwertyasdfuiopghjkl quits [Client Quit] |
13:49:31 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
13:51:57 | | jasons (jasons) joins |
14:02:31 | | eroc1990 quits [Quit: Ping timeout (120 seconds)] |
14:02:55 | | eroc1990 (eroc1990) joins |
16:35:37 | | simon816 quits [Quit: ZNC 1.8.2 - https://znc.in] |
16:40:49 | | simon816 (simon816) joins |
16:43:21 | | jasons quits [Ping timeout: 272 seconds] |
16:56:10 | | simon816 quits [Client Quit] |
17:00:50 | | simon816 (simon816) joins |
17:01:14 | | qwertyasdfuiopghjkl quits [Remote host closed the connection] |
17:25:52 | | aninternettroll quits [Remote host closed the connection] |
17:28:13 | | aninternettroll (aninternettroll) joins |
17:38:58 | | icedice (icedice) joins |
17:41:57 | <Pedrosso> | Nevermind, it took a long time to be registered to the log site |
17:43:27 | | simon816 quits [Client Quit] |
17:45:11 | <@JAA> | Pedrosso: The data needs to go from the pipelines to the rsync target, then to IA, and then get indexed. Yes, it'll take a while, and that's expected. |
17:45:41 | <Pedrosso> | I waited about an hour or so until asking expecting for that to be the case |
17:45:56 | <@JAA> | It can take days. |
17:45:59 | <Pedrosso> | wow |
17:46:22 | <@JAA> | It all depends on whether there's an upload backlog, how fast IA is at processing the uploads, and when the indexer runs. |
17:46:31 | | jasons (jasons) joins |
17:46:53 | <@JAA> | Usually, things show up within 1-2 days. |
17:48:14 | <Pedrosso> | I'd never experienced any delay before due to not usually checking on jobs small enough for only 1 WARC |
17:48:17 | <Pedrosso> | or well |
17:48:19 | <Pedrosso> | only 1 item I mean |
17:49:18 | <@JAA> | Oh right, you mean the viewer, not the WBM. It should show up a bit quicker there, yeah. |
17:49:35 | <@JAA> | Still, hours to a day delay is not unusual. |
17:50:48 | | simon816 (simon816) joins |
17:55:56 | | DogsRNice joins |
18:19:34 | | qwertyasdfuiopghjkl (qwertyasdfuiopghjkl) joins |
18:27:54 | | godane2 joins |
18:30:20 | | godane1 quits [Ping timeout: 240 seconds] |
18:43:03 | | jasons quits [Ping timeout: 272 seconds] |
18:50:53 | | godane1 joins |
18:53:49 | | godane2 quits [Ping timeout: 272 seconds] |
19:29:50 | | lennier1 quits [Ping timeout: 240 seconds] |
19:31:09 | | lennier1 (lennier1) joins |
19:46:02 | | jasons (jasons) joins |
19:49:48 | | jacksonchen666 (jacksonchen666) joins |
20:09:10 | | qwertyasdfuiopghjkl quits [Client Quit] |
20:34:31 | <@rewby> | JAA: Can you re-enable those projects with hel1 you paused? I've got the corrupt files out of there |
20:40:04 | | jacksonchen666 quits [Client Quit] |
20:40:42 | | jacksonchen666 (jacksonchen666) joins |
20:41:46 | <fireonlive> | i guess then comes the fun of what has to be requeued :o |
20:45:55 | | jasons quits [Ping timeout: 272 seconds] |
21:00:20 | | icedice quits [Client Quit] |
21:03:43 | <@JAA> | rewby: Ack |
21:20:41 | | jacksonchen666 quits [Remote host closed the connection] |
21:21:17 | | jacksonchen666 (jacksonchen666) joins |
21:49:18 | | jasons (jasons) joins |
22:03:35 | | Island joins |
22:40:45 | <that_lurker> | Could someone grab https://0.honda/ at some point. The new concept car that most likely will never actually come out. |
22:45:31 | <fireonlive> | started an AB job |
22:46:15 | | jasons quits [Ping timeout: 272 seconds] |
22:47:02 | | aninternettroll quits [Remote host closed the connection] |
22:49:19 | | aninternettroll (aninternettroll) joins |
22:50:32 | <that_lurker> | <3 |
22:50:43 | <fireonlive> | <3 |
23:00:57 | | Arcorann (Arcorann) joins |
23:04:22 | | onetruth joins |
23:14:26 | | qvp joins |
23:14:43 | | qvp quits [Remote host closed the connection] |
23:27:57 | | katia quits [Remote host closed the connection] |
23:28:13 | | katia (katia) joins |
23:34:24 | | dave3 quits [Quit: WeeChat 3.8] |
23:49:17 | | jasons (jasons) joins |