05:00:00<tuankiet>@alard: How is Yahoo Blog?
08:55:00<alard>tuankiet: Still there, I guess?
08:56:00<alard>(I'm not doing anything with it at the moment, if that's what you mean.)
08:56:00<ersi>Yeah, he just wrote in #dailybooth so I think so
08:57:00<alard>Ah, haha. I wanted to write that Yahoo Blog is still alive, but my question works in the other sense as well. :)
08:58:00<ersi>Ha
08:58:00<ersi>Yeah :) I'm a bit sleepy I guess
09:17:00<alard>tuankiet: Yahoo search could perhaps produce blog names? http://search.yahoo.com/search?p=T%C3%B4i&n=100&ei=UTF-8&vs=blog.yahoo.com
09:19:00<alard>Unfriendly urls though: "Hey, read my blog, it's on http://blog.yahoo.com/_XVBWXXWGXGFK4VJXNK5EGIRKLQ/"
09:21:00<SmileyG>lol thats actually a link :D
09:21:00SmileyG thought you'd headbutted the keyboard
09:22:00<SmileyG>About 12,200,000 results (0.31 seconds) << google site: search
09:24:00<alard>Is Yahoo good at blocking?
09:24:00<ersi>uh, weird that they display ads for Yahoo! Korea on Yahoo! Vietnam
09:51:00<tuankiet>Sorry, I am here
10:05:00<ersi>tuankiet: alard said "Still there." to "How is Yahoo Blog?"
10:21:00<alard>tuankiet: Do you know about these unreadable blog names? Do those blogs also have a 'normal' name, or is that ID the only way to reach them?
11:26:00<tuankiet>I don't know. There are readable blog names like http://blog.yahoo.com/blog.thietke/ or http://blog.yahoo.com/hungno1 but some aren't readable, like http://blog.yahoo.com/_A4YNU4FWW57SQE2KMZPX5LN3RE. I am looking for the reason
11:27:00<tuankiet>Unreadable blog names like http://blog.yahoo.com/_A4YNU4FWW57SQE2KMZPX5LN3RE are autogenerated by Yahoo.
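The difference between readable and autogenerated names can be sketched as a simple pattern check. This is only an illustration: the pattern (an underscore followed by 26 characters from A-Z and 2-7) is an assumption inferred from the two example URLs in the log, and `is_autogenerated` is a hypothetical helper, not part of any script mentioned here.

```python
import re

# Assumption: autogenerated IDs are an underscore followed by 26 characters
# drawn from A-Z and 2-7. This is inferred only from the two example URLs
# in the log, not from any Yahoo documentation.
AUTOGENERATED_ID = re.compile(r"_[A-Z2-7]{26}")


def is_autogenerated(blog_name):
    """Return True if blog_name looks like a Yahoo-generated ID."""
    return AUTOGENERATED_ID.fullmatch(blog_name) is not None
```

Names such as `hungno1` or `blog.thietke` would not match, so a crawler could use a check like this to count how many discovered blogs only have a generated ID.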
11:30:00<tuankiet>Ah, bad news. There are Yahoo blogs in Vietnam, Hong Kong and Taiwan, and they all use the same domain (http://blog.yahoo.com). So if we search based on http://blog.yahoo.com/{username}, we will also rescue Yahoo Blog Hong Kong and Taiwan
11:36:00<alard>https://gist.github.com/4fa302a54ea8b5aa5c28
11:39:00<tuankiet>Sorry to say this, but I DON'T KNOW Python. Only Pascal
11:40:00<alard>It's a blog finder. It searches Yahoo with words from the dictionary, extracts the blog names and reports them to the tracker.
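The "extracts the blog names" step might look roughly like the sketch below. It is not the code from the gist: `extract_blog_names` is a hypothetical helper, and it assumes blog links show up in the result page as plain `http://blog.yahoo.com/<name>` URLs, which may not match Yahoo's real markup.

```python
import re

# Assumption: blog links appear in the result page as plain
# http://blog.yahoo.com/<name> URLs; the real markup may differ.
BLOG_URL = re.compile(r"https?://blog\.yahoo\.com/([A-Za-z0-9_.\-]+)")


def extract_blog_names(html):
    """Return the distinct blog names found in html, in first-seen order."""
    seen = []
    for name in BLOG_URL.findall(html):
        if name not in seen:
            seen.append(name)
    return seen
```

Deduplicating before reporting keeps the same blog from being sent to the tracker several times when it appears in more than one search result.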
11:40:00<chronomex>are you interested in learning? most decent programmers can muddle along in python without great difficulty
11:42:00<chronomex>(I say this not to taunt or belittle, just to be friendly)
11:43:00<tuankiet>In Vietnam, people don't learn Python
11:43:00<alard>There's a great niche for you then! The first Python programmer in Vietnam. :)
11:43:00<chronomex>hah
11:44:00<chronomex>why don't they, tuankiet?
11:44:00<chronomex>pascal is an unusual language to learn, is it more common there?
11:44:00<tuankiet>Very much. Every student knows Pascal because they must learn Pascal in 8th grade.
11:45:00<ersi>Whaat, awesome
11:45:00<tuankiet>I am in 7th and I also learn it
11:45:00<chronomex>fascinating
11:45:00<ersi>yeah, it really is :o
11:45:00<chronomex>you're in 7th grade right now??
11:46:00<tuankiet>To 11th or 11th, they learn about Microsoft Access. It's useless, too
11:46:00<chronomex>I must say, you seem remarkably mature for 7th grade
11:47:00<chronomex>also it's exciting to hear that lots of students learn to program
11:47:00<chronomex>:D
11:48:00<tuankiet>Oh, Sorry ;)
11:48:00<chronomex>sorry for what?
11:48:00<tuankiet>Look above: "I must say, you seem remarkably mature for 7th grade"
11:49:00<chronomex>no, keep up the good work
11:49:00<tuankiet>OK
11:49:00<chronomex>many 7th grade people on the internet act immature, I absolutely didn't guess you were so young
11:50:00<alard>I don't want to discourage this discussion, but shall I ring the #archiveteam-bs bell?
11:50:00<chronomex>yes, I was about to do the same
11:51:00<chronomex>tuankiet: I suggest you also join #archiveteam-bs , it's where we put off-topic conversation
11:57:00<tuankiet>OK
12:05:00<tuankiet>Does the Python script work?
12:08:00<alard>Technically, yes. Practically, somewhat. The timeouts need a bit of tuning. Yahoo blocks it.
12:16:00<tuankiet>Running on Ubuntu 12.10 and I get "no module named tornado". How can I fix this?
12:18:00<alard>sudo apt-get install python-tornado
12:19:00<alard>(may or may not work)
12:22:00<tuankiet>Thanks. The code is running
12:35:00<alard>Good. Here's a counter: http://tracker.archiveteam.org:8124/
12:39:00<tuankiet>I know it. Thanks
12:39:00<tuankiet>Should I run it every day now?
12:43:00<tuankiet>Here is the output: http://pastebin.com/ZNRUA1k9
13:01:00<tuankiet>Error 999 blocked!
13:14:00<alard>Yahoo returns error 999 if they have had enough for a while. The script will wait for a while and then retry.
13:16:00<alard>You may want to increase the WAIT time, to 30 seconds, for instance. (Line 16 of the script.)
13:34:00<tuankiet>How do I make this script run forever?
13:35:00<SmileyG>&
13:36:00<tuankiet>What?
13:36:00<SmileyG>i don't know what you mean by running forever.
13:36:00<alard>Something like while true ; do python script.py ; done
13:36:00<SmileyG>yeah, that'd work
13:39:00<tuankiet>Thanks!
15:12:00<tuankiet>@alard: you should change the BLOCK_TIMEOUT to 600 (10 mins)
15:12:00<alard>tuankiet: Is that how long it takes?
15:14:00<tuankiet>No, I think it takes 30 mins. So maybe BLOCK_TIMEOUT should be 1800
15:14:00<tuankiet>I am not sure
15:14:00<tuankiet>;)
15:14:00<tuankiet>Should be 600
15:15:00<tuankiet>600, sure
15:17:00<alard>Ideally the WAIT should be long enough to avoid the blocking.
15:26:00<tuankiet>What does the WAIT do?
15:27:00<alard>It's the time between each request to Yahoo.
15:27:00<tuankiet>ok
15:28:00<tuankiet>What should it be? 60 or 120?
15:28:00<alard>I don't know. It should be just long enough not to trigger the alarm.
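How WAIT and BLOCK_TIMEOUT interact can be sketched as follows. This is a minimal illustration under stated assumptions, not the script from the gist: `fetch_all` and the `fetch` callable are hypothetical, and the two constants simply reuse the values floated in the conversation (30 and 600 seconds).

```python
import time

WAIT = 30            # seconds between requests (value suggested in the log)
BLOCK_TIMEOUT = 600  # seconds to pause after a 999 (value proposed in the log)


def fetch_all(queries, fetch, sleep=time.sleep):
    """Fetch every query, pacing requests and backing off when blocked.

    fetch(query) is a hypothetical callable returning (status, body).
    sleep is injectable so the pacing can be checked without real delays.
    """
    results = {}
    for query in queries:
        while True:
            status, body = fetch(query)
            if status != 999:
                break
            sleep(BLOCK_TIMEOUT)  # blocked: back off, then retry the same query
        results[query] = body
        sleep(WAIT)  # normal pacing between requests
    return results
```

The trade-off is the one described above: a longer WAIT slows every request a little, while a WAIT that is too short costs a full BLOCK_TIMEOUT each time Yahoo answers with 999.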
15:31:00<tuankiet>Just testing with WAIT=30. It may take a few days to know
15:32:00<tuankiet>And now I have to sleep. It's half past 10 in Vietnam
15:32:00<tuankiet>Goodbye ;)
15:32:00<alard>Goodnight.
15:55:00<SketchCow>AAAAAAND morning
15:56:00<SketchCow>Did we just have a 7th grade archive team member?
15:57:00<ersi>Indeedily. He's still here and #archiveteam-bs is still there :-)
15:59:00<alard>SketchCow: Want to join #dailybooth?
16:10:00<SketchCow>dashcloud: All of rainsoft uploaded? 136mb?