laike9m's blog

JAN 10TH, 2015

Musings on Quora, Zhihu, and StackOverflow

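I used to find it hard to understand why anyone would spend so long writing such lengthy answers on Quora or Zhihu. It wasn't that answering questions on a Q&A site puzzled me; I was curious how point-free systems like Quora and Zhihu motivate people to answer. Having spent several years on iAsk and more than a year on StackOverflow, I enjoy answering questions: partly because answering in a field I know well sharpens my skills further, and partly because watching my points go up feels great. Points matter a lot to me. Without them, I would probably struggle to find the motivation to answer at all.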

So the popularity of Quora and Zhihu puzzled me. But once I signed up for Zhihu and answered a few questions, the puzzle vanished.

These two sites give people an enormous sense of satisfaction, and that satisfaction motivates far more effectively than points do.

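Getting upvoted for an answer is satisfying, and there is nothing strange about that. But why does SO rely on points and badges to motivate people to answer while Quora and Zhihu don't need them? In other words, the question I really want to explore is why answering on Quora/Zhihu feels more satisfying. Not everyone will agree with that comparison, of course, but it is how it feels to me.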

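First, upvotes are easier to earn on a general-purpose Q&A site. StackOverflow is, after all, a site for programmers, and its audience is quite narrow. More broadly, every StackExchange site is like this: each one offers different content and targets a specific group, such as ServerFault for sysadmins and AskUbuntu for Ubuntu users. StackOverflow and Programmers already have the broadest audience among them, and that audience is still only programmers. Quora and Zhihu go without saying: outside China, pretty much anyone who is online has run into a Quora question, and the same goes for Chinese users and Zhihu. Both sites deliberately avoid segmenting their users. The home page pushes popular new questions at you, and as long as you stay on the site you are bound to see something that interests you, even if it is a topic you would never have thought to look into. Browsing these sites is addictive, and it is an addiction you feel good about, since after all you are "learning". Back to upvote counts: more people means more views, which naturally means more upvotes. An answer that earns 100 upvotes on Zhihu might get only 20 on SO at the same quality. Which one feels more satisfying?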

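The second reason comes second but is just as important: Quora and Zhihu are social networks, and StackOverflow is not. Q&A sites today have long since moved beyond "ask" and "answer", and the core of those changes is one word: social. I can't speak for Quora, but people arranging to meet up through Zhihu is quite common, which is one example.

So which features make Quora/Zhihu social networks? I'll start with the most important and most easily overlooked one: named upvotes. On StackOverflow you can't see who gave you an upvote, whereas Zhihu tells you every person who upvoted you. That is a huge difference. I used to think StackOverflow lacked named upvotes simply because it launched early, until I saw that SegmentFault doesn't have them either. Then it clicked: it's not that they don't care or can't build it, but that SO and SF are not social networks to begin with (SF is actually slightly more social, but its upvotes still follow the traditional form). A Q&A site doesn't need much interaction between users; you ask, I answer, done. A social network, however, must do everything it can to increase interaction, because interaction is its lifeblood. For example, when someone upvotes one of my answers on Zhihu, I might go check out who that person is, and if we have interacted before, the relationship may get a little closer. None of this happens when upvotes are anonymous.

The second feature is following. It is so obvious that everyone understands it, so there is not much to say. To boost interaction further, Zhihu also came up with something called "thanks", and to this day I don't understand how it differs from an upvote.

Everything above was to establish that Quora and Zhihu are social networks at heart. That doesn't mean they aren't Q&A sites: if Quora is a Q&A site with social features, then Zhihu is a social network built around Q&A. Why talk so much about social networks when the topic is satisfaction? Simple: answering on a social network is far more satisfying than answering on a Q&A site. And now I can finally say what this satisfaction really is: the feeling of being noticed. Everyone wants to be a super star, but not everyone can. Quora/Zhihu provide a stage for exactly that, and they have set off the people who know a field well but never had a place to show it. You're a homebody? No problem, look at how many followers the big names of Zhihu's anime circle have. You like programming? No problem, Quora has at least 100 "how to learn programming" questions, and good answers easily get over a thousand upvotes. An expert in an obscure field? Couldn't be better: nobody else can answer those questions, so all the attention is yours.

Note too that seeing real users upvote you beats 100 anonymous upvotes, because it makes you feel that 100 living people upvoted you, not that you received 100 votes. Not to mention the sense of achievement from a rising follower count. After all, isn't having fans what makes a star? SO has a C# guru named Jon Skeet, whom basically every SO regular knows. So what? Apart from showing up as a meme in certain questions, he is much like any ordinary user: he answers, and people upvote. Nobody treats him as a star for being a guru, and he probably can't feel like one either, even though everyone knows him. Social networks have stars; Q&A sites only have experts. Besides feature differences such as followers, this also comes from the different habits the two kinds of sites cultivate: Q&A users don't tend to post things that are empty or off-topic, while social network users say whatever comes to mind, so a good answer draws a pile of praise, which makes the answerer feel noticed all the more. So you won't find the super star feeling on SO, but you can on Zhihu or Quora.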

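In fact Quora and Zhihu differ a little too. I said above that Quora is "a Q&A site with social features" and Zhihu is "a social network built around Q&A". Social network users care more about "fun" than about "answers", so you can't rack up hundreds of upvotes on Quora with a witty one-liner, while on Zhihu such answers are everywhere. Precisely because Zhihu leans social, getting noticed there is easier. Beyond the domain experts mentioned earlier, you just need to be clever, or happen to leave a witty reply under a hot question, and the upvotes pour in by the hundreds. Happiness goes through the roof, and a stream of comments praises your wit, possibly more praise than you get in a year of real life. How could an ordinary person resist? Add the notification in the top-right corner, "xx, xxx, xx and xxx others followed you", and you feel like a celebrity. A good answer is a one-person concert: type a few lines and you get to feel like a super star. What could be better? So Quora and Zhihu users don't need points. The satisfaction of answering already has them hooked, and the two sites can conjure that satisfaction out of their huge user bases and hand it to answerers, much as drug lords produce drugs for addicts. Drugs cost money; answering doesn't. Why answer questions? It's that simple.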

DEC 28TH, 2014

A tutorial on using PeerJS in a node-webkit app

In this article I'll talk about using node-webkit and PeerJS together, but first let's look at the two projects. node-webkit is an app runtime that lets developers use web technologies (HTML5, CSS, and JavaScript) to build native apps. PeerJS wraps the browser's WebRTC implementation to provide a complete, configurable, and easy-to-use peer-to-peer connection API. If you haven't heard of them, I suggest visiting their websites for a quick look at what they do, because both are really interesting projects.

Why am I writing this article?

There have been some discussions about running the PeerJS client in a node.js application, but apparently that isn't easy because PeerJS relies on WebRTC, which is built into WebKit. Then, as node-webkit became more popular, people started to ask: why not use PeerJS in node-webkit so that we can build p2p apps? Here is an attempt, and as you can see it really works; it's basically the same as running PeerJS in a browser.

Yet simply using node-webkit as a browser and running PeerJS in it wastes the most powerful feature node-webkit provides: the node.js runtime. The browser is cool, and HTML5 gives us the ability to work with files stored on the local computer, but those features are minor compared to what node.js can do. If a node-webkit app doesn't make use of node.js, why bother using node-webkit instead of writing a pure web app?

The node-webkit project I'm working on needs node.js (db storage, watching files, etc.) as well as PeerJS. First I tried something like this:

<!DOCTYPE html>
<html>
<head lang="en">
  <script src="peer.js"></script>
</head>
<body>
<script>
  var peer = new Peer('username', {key: 'my-key'});
  var conn = peer.connect('hehe');
  var fs = require('fs');

  // sender side code
  conn.on('open', function(){
    console.log("connect to peer");
    var data = fs.readFileSync('file-you-want-to-transfer');
    conn.send(data);
    console.log('data sent');
  });

  // receiver side code
  peer.on('connection', function(conn) {
    conn.on('data', function(data){
      fs.writeFileSync('received_file', data);
      console.log("received complete: ", Date());
    });
  });
</script>
</body>
</html>

It didn't work for me. As far as I could tell, there's no node.js runtime in the HTML page, so you can't read from or write to local disk using fs.readFileSync/fs.writeFileSync. You probably know that node-webkit's node runtime can't really interact with the DOM environment: it lets you require a node.js script and call its functions from the DOM, but you CAN'T GET THE RETURN VALUE. The function you invoke runs in the node runtime and knows nothing about the DOM, which is why the code above couldn't work.

Then a friend suggested running an express server on localhost and using socket.io to let node and the DOM talk to each other. I've written a demo and put it on GitHub to show how this works:

https://github.com/laike9m/peerjs-with-nodewebkit-tutorial

This demo can be run on either a single machine or two machines. It transfers a .gitignore file to the other end. Its GUI consists of two buttons, send and receive.

To run this app, run npm install first, then launch it following node-webkit's documentation. Assuming you only have one computer, click the receive button, then click the send button; a new file called received_gitignore will appear in the app's directory. Be sure to click receive before send, whether running on one machine or two.

Finally, let's get down to business and explain how this demo works.

First is package.json:

{
    "main": "main.html"
}

So main.html is the first HTML page node-webkit should display.

<html>
<script>
  var main = require('./main.js');
  window.location.href = "http://127.0.0.1:12345/";
</script>
</html>

Here's the interesting part: our main.html doesn't contain anything to display. Its only purpose is to call require('./main.js'), which launches an express server listening on 127.0.0.1:12345, and then to navigate to that server.

Let's see how it's done in main.js:

// main.js part 1
var app = require('express')();
var server = require('http').Server(app);
var io = require('socket.io')(server);

server.listen(12345);
app.use(require('express').static(__dirname + '/static'));
app.get('/', function (req, res) {
  res.sendfile(__dirname + '/index.html');
});

Nothing special, just a regular express http server with socket.io.
As we can see, visiting http://127.0.0.1:12345/ gets index.html displayed. Here's index.html:

<!DOCTYPE html>
<html>
<head lang="en">
  <meta charset="UTF-8">
  <meta name="author" content="laike9m">
  <title>demo</title>
  <script type="text/javascript" src="peer.js"></script>
  <script type="text/javascript" src="/socket.io/socket.io.js"></script>
</head>
<body>
  <script>
    window.socket = io.connect('http://localhost/', { port: 12345 });
    function clickSend() {
      var peer = new Peer('sender', {key: '45rvl4l8vjn3766r'});
      var conn = peer.connect('receiver');
      conn.on('open', function () {
        console.log("sender dataconn open");
        window.socket.on('sendToPeer', function(data) {
          console.log("sent data: ", Date());
          conn.send(data);
          peer.disconnect();
        });
        window.socket.emit('send');
      });
    }
    function clickRecv(){
      var peer = new Peer('receiver', {key: '45rvl4l8vjn3766r'});
      peer.on('connection', function(conn) {
        conn.on("open", function(){
          console.log("receiver dataconn open");
          conn.on('data', function(data){
            console.log("received data: ", Date());
            window.socket.emit('receive', data);
          });
        });
      });
    }
  </script>
  peerjs with nodewebkit demo
  <button onclick="clickSend()">send</button>
  <button onclick="clickRecv()">receive</button>
</body>
</html>

It contains two buttons: send and receive. When clicked, they call clickSend and clickRecv respectively. To understand what these functions do, let's look at the other part of main.js.

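// main.js part 2
var fs = require('fs');  // needed by the handlers below

io.on('connection', function(socket){
  // The DOM asked for the file: read it in the node runtime and send it back
  socket.on('send', function(data){
    socket.emit('sendToPeer', fs.readFileSync('.gitignore'));
  });
  // The DOM relayed data received from the peer: write it to disk
  socket.on('receive', function(data){
    fs.writeFileSync('received_gitignore', data);
  });
});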

Say you click the receive button: PeerJS creates a Peer with id receiver and a valid key I registered, then waits for a connection from sender. Then you click the send button, and another Peer is created with id sender. sender tries to connect to receiver, and once the data connection is established, window.socket emits a send event, which is handled in main.js. The handler simply reads the file content and sends it back to the DOM environment. Note that socket.io supports binary data transfer from version 1.0 onward, so you can't use an older version of socket.io.

Back in clickSend, we registered a handler for sendToPeer before emitting the send event, and now it fires. conn.send(data) sends the data to the receiver peer. In clickRecv, when conn receives data, it uses window.socket to emit a receive event, passing the data from sender on to the node runtime. Finally fs.writeFileSync('received_gitignore', data) writes the data to disk, and we're done.
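The order of events on the sending side can be traced without a browser. The following is only a sketch under assumptions: createLinkedPair and traceSenderFlow are hypothetical helpers, and plain EventEmitter objects stand in for the two ends of the socket.io connection, imitating just the fact that an emit on one side fires the handlers registered on the other.

```javascript
var EventEmitter = require('events').EventEmitter;

// Two linked endpoints: emitting on one side fires the handlers registered
// on the other side. This is a stand-in, not real socket.io.
function createLinkedPair() {
  var left = new EventEmitter();
  var right = new EventEmitter();
  function endpoint(own, remote) {
    return {
      on: function (event, handler) { own.on(event, handler); },
      emit: function (event, data) { remote.emit(event, data); }
    };
  }
  return { dom: endpoint(left, right), node: endpoint(right, left) };
}

// Trace the sender half of the demo: the DOM emits 'send',
// and the node side answers with 'sendToPeer'.
function traceSenderFlow(fileContent) {
  var pair = createLinkedPair();
  var log = [];

  // Node side (main.js in the demo): answer 'send' with the file content.
  pair.node.on('send', function () {
    log.push('node: got send');
    pair.node.emit('sendToPeer', fileContent);
  });

  // DOM side (clickSend in the demo): register the handler first, then emit 'send'.
  pair.dom.on('sendToPeer', function (data) {
    log.push('dom: got sendToPeer (' + data.length + ' bytes)');
  });
  pair.dom.emit('send');

  return log;
}
```

Registering the sendToPeer handler before emitting send matters here for the same reason as in clickSend: the reply would be lost if no handler were listening when it arrived.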

You might wonder whether this actually works for large file transfers. It does. My project, which grew out of this demo, transfers large files at a decent speed. Of course you need to write many more lines to turn this prototype into a usable app: for instance, data needs to be sliced when transferring large files, and you should handle all kinds of PeerJS errors.
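The slicing of large files could look roughly like the following. This is a minimal sketch, not code from the demo: sliceIntoChunks, reassembleChunks, and the chunk fields index, total, and data are hypothetical names, and the chunk size is left for the caller to choose.

```javascript
// Hypothetical helpers, not part of the demo repo.

// Split a Buffer into fixed-size chunks so that each send call stays small.
function sliceIntoChunks(buffer, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  var total = Math.ceil(buffer.length / chunkSize);
  var chunks = [];
  for (var index = 0; index < total; index++) {
    var start = index * chunkSize;
    chunks.push({
      index: index,   // position of this chunk in the file
      total: total,   // lets the receiver know when it has everything
      data: buffer.slice(start, start + chunkSize)
    });
  }
  return chunks;
}

// Rebuild the original Buffer on the receiving side,
// tolerating chunks that arrived out of order.
function reassembleChunks(chunks) {
  var ordered = chunks.slice().sort(function (a, b) { return a.index - b.index; });
  return Buffer.concat(ordered.map(function (chunk) { return chunk.data; }));
}
```

On the sending side each chunk would go out in its own send call, and the receiver would collect chunks until it holds total of them before calling reassembleChunks. How the chunks travel through socket.io and PeerJS in between is left out of this sketch.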

That's all for this tutorial.

DEC 1ST, 2014

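Language usage in internet companies' GitHub repos

Making slides got too boring, and it occurred to me that I could tally this up, so I did.
Nowadays basically all the big foreign companies and some Chinese ones have open-sourced part of their code on GitHub. Counting the languages used in that code gives some indication of each company's internal language preferences. Many companies host their popular projects in standalone repos, which this tally can't capture, so take the numbers with a grain of salt. It uses the GitHub API and is under forty lines of code, so I'm pasting it right here; copy it, install requests, and it runs as is.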

# coding: utf-8

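"""
Tally the languages used across repos in big companies' GitHub organizations.
"""

import requests
from collections import defaultdict
from pprint import pprint


class GetLangStat():

    api_url = "https://api.github.com/orgs"
    ORGANIZATIONS = (
        'Microsoft', 'aws', 'google', 'twitter', 'facebook',
        'alibaba'
    )
    # stats[org][language] -> number of repos with that primary language
    stats = {org: defaultdict(int) for org in ORGANIZATIONS}

    @classmethod
    def get_one_org_repos(cls, org):
        print(org)
        # Join with '/' explicitly; os.path.join would use '\\' on Windows
        url = '/'.join((cls.api_url, org, 'repos'))
        # Only the first page of results is fetched (no pagination)
        r = requests.get(url)
        r.raise_for_status()
        for repo in r.json():
            # 'language' is None when GitHub detected no language
            cls.stats[org][repo['language']] += 1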

    @classmethod
    def get_all_org_repos(cls):
        for org in cls.ORGANIZATIONS:
            cls.get_one_org_repos(org)
        pprint(cls.stats)


if __name__ == '__main__':
    GetLangStat.get_all_org_repos()

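The companies counted are MS, Amazon, Google, Twitter, Facebook, and Alibaba. Amazon seems to have open-sourced only AWS-related code, but I counted it anyway. I wanted to include Baidu and Tencent too, but Baidu has no unified organization (its repos are scattered by product), and Tencent has basically no open-source code...

Below are the results, top five per company.

Alibaba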

Language	repo count
Java	13
C	8
C++	2
JavaScript	2
Perl	2

google

Language	repo count
JavaScript	9
C++	4
Ruby	4
Python	3
Java	3

twitter

Language	repo count
Scala	14
Ruby	9
Java	3
Python, CSS, JavaScript, Shell	1

facebook

Language	repo count
Java	8
C++	5
PHP	4
C	3
Python, JavaScript, Objective-C	2

aws

Language	repo count
Java	8
Ruby	6
PHP	4
JavaScript	3
Objective-C	3

Ahem, last comes mighty Microsoft. Honestly, I'm not sure whether Microsoft counts as an internet company...

Language	repo count
C#	29
C++	1

Judging from this very unreliable tally, Java really is still the most popular language... Not knowing Java feels like a lot of pressure orz
