00:15:09Arcorann (Arcorann) joins
00:32:15<@JAA>Something is changing soon at GitLab SaaS (aka gitlab.com) about introducing a limit on the number of users within a namespace, but the details are a bit messy to follow if you're not already familiar with their offerings, it seems: https://about.gitlab.com/blog/2022/03/24/efficient-free-tier/
00:32:45dm4v_ joins
00:33:39dm4v quits [Ping timeout: 265 seconds]
00:33:39dm4v_ is now known as dm4v
00:33:40dm4v quits [Changing host]
00:33:40dm4v (dm4v) joins
00:33:52<@JAA>The CI/CD limits are kind of irrelevant archival-wise, but this in the FAQ caught my attention: 'You will not lose any data, but you will not be able to create any new jobs or artifacts nor access Ultimate features if you have exceeded the usage limits of the free tier.'
00:50:16kallsyms quits [Quit: probably segfaulted]
00:50:31kallsyms joins
00:58:47Discant quits [Ping timeout: 265 seconds]
01:01:50dm4v_ joins
01:03:08dm4v quits [Ping timeout: 265 seconds]
01:03:08dm4v_ is now known as dm4v
01:03:09dm4v quits [Changing host]
01:03:09dm4v (dm4v) joins
01:12:39igloo22225 quits [Quit: The Lounge - https://thelounge.chat]
01:14:04igloo22225 (igloo22225) joins
01:26:52<h2ibot>TheTechRobo edited Speedrun.com (+408, /* Archiving tools */ Add bit about Warcprox): https://wiki.archiveteam.org/?diff=48679&oldid=47989
03:30:27sec^nd quits [Remote host closed the connection]
03:32:16sec^nd (second) joins
05:27:52BlueMaxima quits [Read error: Connection reset by peer]
05:56:02DogsRNice quits [Read error: Connection reset by peer]
07:53:44fetty joins
07:54:31fetty quits [Remote host closed the connection]
08:02:46fetty joins
08:11:09<Ryz>Heya folks, have there been any progress updates regarding https://wiki.archiveteam.org/index.php/Google_Code ? systwi suggested archiving Google Code projects with ArchiveBot, which prompted me to check the AT wiki; it doesn't look like the project has been finished...
08:15:08sec^nd quits [Remote host closed the connection]
08:16:04nothere quits [Quit: Leaving]
08:17:30sec^nd (second) joins
08:17:38<systwi>For reference, I was looking to save https://code.google.com/archive/p/hotween/ and https://code.google.com/archive/p/dotween/ , but likely a couple others in the future.
08:18:11<systwi>I can think of only one or two other projects hosted on Google Code.
08:20:35<systwi>Oh, also http://code.google.com/p/hounitylibs/ , but that wasn't one of the two.
08:21:02fetty quits [Ping timeout: 265 seconds]
08:26:57nothere joins
08:32:23<systwi>Assuming everything was grabbed properly in AB, all three of those repositories totaled 35.6MiB.
09:03:27Discant joins
09:12:46fooob joins
09:14:52<fooob>Hello. I checked recently and am having trouble saving VK pages. It looks like the account that was used for archiving is currently logged out.
09:19:26fooob quits [Remote host closed the connection]
10:00:07nepeat quits [Ping timeout: 245 seconds]
10:29:18spirit joins
10:36:22nepeat (nepeat) joins
12:01:51thetechrobo_ joins
12:02:00qwertyasdfuiopghjkl quits [Remote host closed the connection]
12:02:00dm4v quits [Client Quit]
12:02:00Arcorann quits [Remote host closed the connection]
12:02:00TheTechRobo quits [Remote host closed the connection]
12:02:08dm4v joins
12:02:10dm4v quits [Changing host]
12:02:10dm4v (dm4v) joins
12:06:42ave quits [Client Quit]
12:06:47ave9 (ave) joins
12:07:37Arcorann (Arcorann) joins
12:23:49qwertyasdfuiopghjkl joins
12:57:37Discant quits [Ping timeout: 245 seconds]
13:17:15thetechrobo_ is now known as TheTechRobo
14:05:10yay quits [Ping timeout: 265 seconds]
14:09:23Arcorann quits [Ping timeout: 265 seconds]
14:15:04yay joins
14:29:17Sutilbo joins
14:29:54<Sutilbo>Hello all, I have a question about ArchiveBot: does it work the same way as wget internally? I was telling TheTechRobo that this could cause problems when saving hispafiles.ru, because when I tried downloading it myself (as an experiment), the threads (along with the files they contain) were skipped.
14:30:52<TheTechRobo>Sutilbo: As I said before, I'm 99% sure the temporary downloaded files are discarded after they're written to the WARC.
14:31:20<TheTechRobo>(For reference, the wget problems were files overwriting each other when the last part of their URLs was the same.)
14:32:04<TheTechRobo>(As an example, https://hispafiles.ru/style.css and https://hispafiles.ru/styles/style.css would overwrite each other, if I understand Sutilbo correctly.)
14:35:46<Sutilbo>In fact, I had proposed the following example: wget first downloads a file like hispafiles.ru/c (which is a page), and when it then wants to download something like hispafiles.ru/c/res/31742.html, that cannot be saved because the directory cannot be created: there is already a file with the same name. How would ArchiveBot deal with that?
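The collision Sutilbo describes can be reproduced without wget. The following is a minimal sketch, assuming a downloader that maps each URL path directly onto a filesystem path; `save_mirrored` and `demo_collision` are hypothetical helpers written for illustration, not wget or ArchiveBot code.

```python
import os
import tempfile


def save_mirrored(root, url_path, body):
    """Save body under root, using the URL path as the file path (hypothetical helper)."""
    target = os.path.join(root, *url_path.strip("/").split("/"))
    # Parent directories are created on demand, as a naive mirroring tool might do.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as fh:
        fh.write(body)
    return target


def demo_collision():
    """Return the name of the error raised by the second save, or None if it succeeded."""
    with tempfile.TemporaryDirectory() as root:
        # The board page /c is saved as a plain file named "c".
        save_mirrored(root, "/c", b"board index")
        try:
            # The thread needs a directory named "c", which cannot be created
            # because a file with that name already exists.
            save_mirrored(root, "/c/res/31742.html", b"thread")
        except OSError as exc:
            return type(exc).__name__
        return None


if __name__ == "__main__":
    print(demo_collision())
```

The exact exception type depends on the operating system, so the sketch only catches the common base class `OSError`.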
14:36:15spirit quits [Client Quit]
14:38:29spirit joins
14:39:37<Sutilbo>You can still save hispafiles.ru directly if you want, but if it's like I say, then it will only download the site structure and skip everything else.
14:40:11<TheTechRobo>Sutilbo: The temporary files downloaded have random names such as tmp-wpull-warcsesreq-jqdts3ht.tmp.
14:40:34<TheTechRobo>ArchiveBot won't run into that problem.
14:41:01<TheTechRobo>And I mean the temporary files written by ArchiveBot.
14:41:13<TheTechRobo>They're random names, so even if they weren't deleted they won't collide.
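Why random temporary names avoid the problem can be shown with Python's standard `tempfile` module. This is only a sketch of the idea: the `tmp-sketch-` prefix and the helper names are assumptions made for illustration, and this is not a copy of how wpull or ArchiveBot creates its files.

```python
import os
import tempfile


def temp_path_for_response(directory):
    """Create an empty, uniquely named temporary file and return its path (hypothetical helper)."""
    # mkstemp picks a random name and never reuses one that already exists.
    fd, path = tempfile.mkstemp(prefix="tmp-sketch-", suffix=".tmp", dir=directory)
    os.close(fd)
    return path


def demo_distinct_names():
    """Return True if two URLs with the same final path segment get different temp files."""
    urls = [
        "https://hispafiles.ru/style.css",
        "https://hispafiles.ru/styles/style.css",
    ]
    with tempfile.TemporaryDirectory() as directory:
        # The temp name is independent of the URL, so equal basenames cannot clash.
        paths = {url: temp_path_for_response(directory) for url in urls}
        return len(set(paths.values())) == len(urls)


if __name__ == "__main__":
    print(demo_distinct_names())
```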
14:41:33spirit quits [Client Quit]
14:42:50<Sutilbo>Ah, good to know, in that case you can archive hispafiles.ru directly, but ignoring these domains so ArchiveBot will focus only on that site: github.com www.hispachan.org www.googletagmanager.com www.google.com imgops.com iqdb.org saucenao.com t.me
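The effect of ignoring those hosts can be sketched as a simple filter over discovered URLs. This is an illustration only, assuming exact host-name matching; `should_fetch` is a hypothetical function and does not represent ArchiveBot's actual ignore mechanism.

```python
from urllib.parse import urlsplit

# Hosts listed in the message above.
IGNORED_HOSTS = frozenset({
    "github.com",
    "www.hispachan.org",
    "www.googletagmanager.com",
    "www.google.com",
    "imgops.com",
    "iqdb.org",
    "saucenao.com",
    "t.me",
})


def should_fetch(url, ignored_hosts=IGNORED_HOSTS):
    """Return False when the URL's host is on the ignore list (hypothetical helper)."""
    # urlsplit(...).hostname is already lower-cased; it is None for URLs without a host.
    host = urlsplit(url).hostname or ""
    return host not in ignored_hosts


if __name__ == "__main__":
    for candidate in (
        "https://hispafiles.ru/c/res/31742.html",
        "https://t.me/example",
    ):
        print(candidate, should_fetch(candidate))
```

Because matching is exact, a host such as google.com (without the www. prefix) would not be filtered by this sketch.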
14:46:22spirit joins
14:55:29spirit quits [Client Quit]
15:19:07Brella quits [Ping timeout: 265 seconds]
15:57:07ats quits [Quit: new irssi]
15:57:29ats (ats) joins
17:17:35<systwi>Sutilbo: You can always try `grab-site' (https://github.com/ArchiveTeam/grab-site) which functions very similarly to ArchiveBot.
17:18:16<systwi>Both use `wpull' (https://github.com/ArchiveTeam/wpull), actually, which is a Wget-compatible crawler rather than a fork of `wget' itself. Not sure how the two compare/differ regarding that.
17:56:47zoe joins
17:57:02<zoe>hello
18:14:31<TheTechRobo>zoe: hi
18:14:49<zoe>how are you
18:17:23<TheTechRobo>systwi: "Not sure how the two compare/differ regarding that." what two? AB vs grab-site, or wget vs wpull?
18:17:27<TheTechRobo>zoe: good, thanks, you?
18:17:42<zoe>i'm doing well
18:17:51<zoe>I have some questions if that's alright
18:22:52<systwi>TheTechRobo: `wpull' vs. `wget', sorry for the lack of clarity.
18:23:20<systwi>Also, "regarding that" = "regarding temporary file management/retention"
18:23:39<systwi>zoe: A common saying on IRC, "don't ask to ask, just ask!" :)
18:23:48<zoe>alright well
18:24:12<zoe>i'm looking for a soundcloud api scrape prior to the one that was published on archive
18:24:35<zoe>i know there was some trouble with copyright infringement some time ago and the archive project was put on hold
18:25:16<zoe>but i'm just looking for the api scrapes , specifically one I saw in a log from 2016
18:25:26<zoe>the original upload link was deleted
18:28:42<zoe>not looking for the audio files just specifically the metadata scrapes from a prior time
18:28:47<zoe>i will see if i can find the log
18:32:24<@JAA>Ryz, systwi: Google Code is useless without JS, so I doubt it'll work in AB (or anything really) without specific code.
18:32:40<@JAA>Sutilbo: hispafiles.ru is running through ArchiveBot now.
18:33:59<@JAA>zoe: The project halt thing was at the same time as my first scrape. Also, SoundCloud discussion should go to the dedicated channel, #soundbutt
18:34:32<zoe>will do thank you
19:00:18knecht4202 quits [Read error: Connection reset by peer]
19:00:22knecht42024 (knecht420) joins
19:04:53zoe quits [Client Quit]
19:16:46<h2ibot>Entartet edited List of websites excluded from the Wayback Machine (+42, Added crazney.net and hrt.cafe.): https://wiki.archiveteam.org/?diff=48680&oldid=48676
19:47:47DiscantX joins
20:17:57<h2ibot>Wickedplayer494 edited Coub (+505, Grab was halted, shutdown notice disappeared): https://wiki.archiveteam.org/?diff=48681&oldid=48424
20:32:59<h2ibot>Wickedplayer494 edited Current Projects (-329, Coub to finished, at least for now): https://wiki.archiveteam.org/?diff=48682&oldid=48389
20:41:19<TheTechRobo>Oh no
20:41:23<TheTechRobo>I think I just crashed a website
20:41:46<TheTechRobo>https://ibb.co/JC97qT1
20:47:27<systwi>TIL ImgBB will claim a page is "not found" if it doesn't like the IP. T_T
20:47:59<TheTechRobo>What image hosting service should I use?
20:48:16<TheTechRobo>It would need to support pasting an image.
20:48:42<systwi>For permanent stuff I use catbox.moe, typically, or for temporary things uguu.se is fine.
20:49:59<systwi>I've come across other ones that are also good, but they're bookmarked on a different HDD, now imaged, and compressed, and stored on some _other_ disk in my house. :P
20:50:08<@JAA>https://transfer.archivete.am/
20:50:20<TheTechRobo>JAA: Does that support pasting...?
20:50:29<@JAA>It does not.
20:50:47<TheTechRobo>Yeah, that's my criteria. Too lazy to browse through my Pictures folder, lol.
20:50:57<systwi>Paste, like, Ctrl+V?
20:51:00<TheTechRobo>Yeah
20:51:03<@JAA>Just image files, and then you can insert 'inline' to make it show in a browser.
20:51:09<systwi>Oh, huh, can't think of any.
20:51:12<@JAA>(Rather than forcing a download)
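The 'inline' form JAA mentions matches the pair of links posted later in this log, where `inline` appears as the first path segment. A minimal sketch of that rewrite, assuming the segment always goes directly after the host; `to_inline_url` is a hypothetical helper, not part of transfer.sh.

```python
from urllib.parse import urlsplit, urlunsplit


def to_inline_url(url):
    """Insert 'inline' as the first path segment of a transfer URL (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.path.startswith("/inline/"):
        # Already in inline form; leave it untouched.
        return url
    return urlunsplit(parts._replace(path="/inline" + parts.path))


if __name__ == "__main__":
    print(to_inline_url("https://transfer.archivete.am/DuWE5/screenshot.png"))
```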
20:51:41<TheTechRobo>systwi: Imgbb does which is why I use it. But if there's a better one I'd love to hear.
20:53:14<@JAA>Hmm, I wonder if this could be added to transfer.sh. Shouldn't be too difficult I imagine.
20:53:21<systwi>What about: xclip -selection clipboard -o | curl --upload-file https://transfer.archivete.am/
20:53:29<Jake>do we have a fork of transfer.sh somewhere?
20:53:45<systwi>(might need some extra `curl' arguments)
20:53:47<@JAA>Jake: Yes, it's on Gitea.
20:54:18<@JAA>curl --upload-file - https://transfer.archivete.am/filename.jpg or similar. No idea what image format you get there.
20:54:26<Jake>ty!
20:55:13<TheTechRobo>> Error: target STRING not available
20:55:18<TheTechRobo>> Could not upload empty file
20:55:37<@JAA>Yeah, not surprised there.
20:55:42<TheTechRobo>Oh, nevermind, it works.
20:55:47<TheTechRobo>Just can't copy an image from Firefox.
20:55:48<TheTechRobo>https://transfer.archivete.am/DuWE5/screenshot.png
20:55:52<TheTechRobo>https://transfer.archivete.am/inline/DuWE5/screenshot.png
20:56:05<TheTechRobo>not sure what he means by a wipe, but looks like I wasn't the cause
20:56:14<TheTechRobo>just a coincidence that I started hammering his website
20:56:17<TheTechRobo>I'll go a bit easier
20:56:44cpina_ quits [Read error: Connection reset by peer]
20:56:57cpina joins
21:06:22<@JAA>Heh: https://github.com/dutchcoders/transfer.sh/issues/425
21:23:38dm4v_ joins
21:23:38dm4v quits [Ping timeout: 265 seconds]
21:23:38jamesp quits [Ping timeout: 265 seconds]
21:23:38dm4v_ is now known as dm4v
21:23:38thuban quits [Ping timeout: 265 seconds]
21:23:38Barto quits [Ping timeout: 265 seconds]
21:23:38asie quits [Ping timeout: 265 seconds]
21:23:38marked1 quits [Ping timeout: 265 seconds]
21:23:38phuzion quits [Remote host closed the connection]
21:23:38summerisle quits [Remote host closed the connection]
21:23:40aismallard quits [Remote host closed the connection]
21:23:40fionera quits [Remote host closed the connection]
21:23:40yawkat quits [Quit: No Ping reply in 180 seconds.]
21:23:40dm4v quits [Changing host]
21:23:40dm4v (dm4v) joins
21:23:43jamesp joins
21:23:43jamesp quits [Changing host]
21:23:43jamesp (jamesp) joins
21:23:49thuban joins
21:23:53Barto (Barto) joins
21:23:55asie joins
21:23:59marked1 (marked1) joins
21:24:22phuzion (phuzion) joins
21:24:23summerisle (summerisle) joins
21:24:34fionera (Fionera) joins
21:24:41aismallard joins
21:24:50yawkat (yawkat) joins
21:28:23qwertyasdfuiopghjkl quits [Client Quit]
21:41:57qwertyasdfuiopghjkl joins
21:57:08HP_Archivist (HP_Archivist) joins
22:04:42Mateon1 quits [Ping timeout: 245 seconds]
22:06:42Mateon1 joins
22:08:30DiscantX quits [Ping timeout: 265 seconds]
22:08:43DiscantX joins
22:14:33HP_Archivist quits [Client Quit]
22:44:16DiscantX quits [Ping timeout: 265 seconds]
22:50:59michaelblob quits [Read error: Connection reset by peer]
22:52:17michaelblob (michaelblob) joins
23:58:32<h2ibot>TheTechRobo edited TwitLonger (+321, Add URL format): https://wiki.archiveteam.org/?diff=48683&oldid=29802