My PyCon China 2019 Recap (Part 1)

I started typing this post on the high-speed train from Shanghai to Beijing.

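Kenneth Reitz once said that he measures his years in PyCons. Despite the many controversies surrounding him, that line strangely resonates with me. For him, "PyCon" naturally means PyCon US; for me, it means PyCon China.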

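2013 was the first time I attended PyCon. I was still studying at UCAS, and the venue happened to be in one of its teaching buildings (although I didn't actually have classes there). The slides from that year are still in my Google Drive, but all I remember of the conference is that one speaker gave a Chinese version of the Python Epiphanies talk, and that a very sharp audience member discussed with the speaker some concepts I only half understood.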

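2014 was different. As a newbie, I submitted a talk proposal, and to my surprise it was accepted (I didn't know back then that speakers are in short supply every year). So I gave a talk on concurrent.futures, essentially walking through the documentation. While preparing it I emailed the module's author, Brian; this year I met him at PyCon US and chatted with him as a colleague. Fate really does work in wonderful ways.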

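The experience in 2015 and 2016 was basically the same: go to the venue alone, pick up the swag, listen to the talks, leave. I ran into a senior schoolmate in 2016, but apart from him there was nobody to talk to. 2017 was a special case. I disagreed with some of the organizers' ideas, so I didn't attend. That also seems to have been the year PyCon China's reputation hit rock bottom (by my observation it declined every year from 2013 to 2017), to the point that people couldn't even be bothered to review it.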

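As the saying goes, things turn around once they hit bottom, and the low point of 2017 also held the seeds of hope. Because there was no PyCon in Beijing that year, some friends organized a small offline meetup on their own. Among the organizers of that meetup was Manjusaka, who later organized PyCon China 2018 and 2019 and who is also a good friend of mine.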

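Starting in 2018, Manjusaka and I gradually got to know each other online. From my admittedly one-sided perspective, he was the key person behind PyCon China's rebirth. Of course, the other organizers put in a great deal of effort too, such as 辛老师, who coordinated the overall effort. Manjusaka often vented to me: running PyCon lost a huge amount of money, the video recording vendor refused to hand over the videos, holding an event in Beijing required paying an extra security fee, and so on. It made me appreciate how hard it is to run an event. Despite so many difficulties, PyCon China 2018 delivered a surprisingly good result. If you browse the answers under the questions "参加北京 PyCon2014 是怎么样的体验?" ("What was it like to attend PyCon 2014 in Beijing?") and "参加 PyCon China 2018 北京场是怎么样的体验?" ("What was it like to attend the Beijing session of PyCon China 2018?"), you will see a stark contrast in audience feedback: overwhelmingly negative for 2014 and overwhelmingly positive for 2018. Most interestingly, gashero spoke at both and gave completely different reviews (1, 2), which indirectly shows how much PyCon's organization has improved.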

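In short, PyCon China 2018 was a great success and turned the previously ruined reputation around in one stroke. A few months before the conference Manjusaka asked whether I wanted to give a talk, but I didn't have a good topic, so I declined. At the venue he asked again whether I would speak the following year. Seeing how well the conference went, I got excited and said yes (really because I had taken a ticket from him and felt I couldn't refuse XD).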

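For half a year after PyCon I had no time for anything else: on one hand there was work and applying for promotion, on the other hand finding a new team and transferring. Only when I moved into an apartment near the Irvine office in March did things settle down, and I could finally sit down and think about my earlier promise.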

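Actually, I wasn't without ideas about what to talk about. I'd had one since 2017, born from a terrible experience refactoring other people's Python code at work: I still vividly remember the bewilderment of facing ten parameters and calls jumping all over the place. I thought how nice it would be if a tool could tell me directly how the program runs. I later refined the idea into "tracing variables back to their sources". But it was only an idea; I hadn't even told anyone, because I had no clue how to implement it or whether it could be implemented at all. At PyCon 2018 I met 红姐, the first (and so far only) person from the PLT community I've been able to talk to. I briefly described what I wanted to build, and he suggested analyzing the AST of everything imported, using the AST to find each variable's scope, and recursing from the target variable. I tried to implement that approach, but there were many problems I hadn't thought through; the code grew more and more complex and drifted further from the goal, and with the transfer to deal with I had to shelve it.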

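This May, just before I left for PyCon US, I suddenly had some new thoughts. The previous approach preprocessed the program, which amounts to full static analysis, and that is too complex. Why not collect information at runtime instead? For example, take a snapshot of the state at every line: that gives both the state at each moment and the order of the states, and comparing adjacent states tells you what happened. I was excited and felt it might work. With this idea in mind I talked to several people at PyCon US, such as colleagues at my company who work on the Python toolchain, and Liran Haimovitch, who gave a talk on how debuggers work. Liran suggested working at the bytecode level because he thought bytecode was more stable than the AST. Still, I decided to go my own way first. So I typed the first line of code for the new approach at the PyCon US sprints. Yes, I hadn't signed up for the sprints, but I hung around for half a day and was lucky enough to find an empty meeting room.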

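The following months were intense work. I put in nearly all of my spare time and finally got a demoable version working a few weeks before PyCon. At some point along the way I felt 95% sure, so I formally signed up as a speaker for the Shanghai session of PyCon China 2019. The slides took a lot of effort too, and I did a dry run with my team, but that was negligible compared with the project itself. After all, Cyberbrain is the biggest challenge I have faced in my programming career so far. It's not like a web project, where a search usually turns up an answer. Nobody had built this before; every line of code and every decision carried uncertainty, and any unsound design could sink the project, so I had to be extremely careful. During this time I agonized over it countless times, overturned earlier ideas countless times, and thought about giving up countless times. But with help from 红姐, 信涛, 翔哥 and other friends, I managed to push it through. The current code may all be thrown away and rewritten later, but it convinced me that the idea is feasible, and that is what matters most.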

And so I boarded the flight home, officially bound for Shanghai. (To be continued)

Giving away a PyCon ticket

Update: the ticket has been given away.

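I have a complimentary ticket for PyCon Shanghai (2019-09-21). If you want it, let me know by any means; first come, first served. For details, see

PyCon2019 中国Python开发者大会- 上海站 (PyCon 2019 China Python Developer Conference, Shanghai)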

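This year's PyCon China has a star-studded lineup; don't miss it.

P.S. If you ask for the ticket, I hope you will attend in person rather than pass it on to someone else.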

Demystifying EXTENDED_ARG

Recently I was studying Python bytecode for a personal project. One particular thing that bothered me was EXTENDED_ARG. Guess what: the documentation is outdated (or, you could say, wrong), which caused my confusion. But even without the error, it is not easy to understand at first glance. In this article, I'll explain it in detail.

Early warning: If you've never heard of Python's bytecode, you probably want to learn about it before you continue reading.

Before Python 3.6

Let's first take a look at the *original* documentation.

EXTENDED_ARG(ext)

Prefixes any opcode which has an argument too big to fit into the default two bytes. ext holds two additional bytes which, taken together with the subsequent opcode’s argument, comprise a four-byte argument, ext being the two most-significant bytes.

Before Python 3.6, each instruction takes 1 or 3 bytes, depending on whether it takes an argument. Example:

RETURN_VALUE             # opcode takes 1 byte, total is 1
LOAD_CONST     0x0000    # argument takes 2 bytes, total is 3

But what if you want a larger argument that cannot fit into 2 bytes? That's when EXTENDED_ARG comes into play. Let's say the argument is 0x123456:

LOAD_CONST     0x123456  # INVALID!! argument exceeds size limit
##################################################
EXTENDED_ARG   0x0012
LOAD_CONST     0x3456    # valid

Here's what Python does: it splits the argument into two parts of 2 bytes each. The most significant 2 bytes, 0x0012, become the argument of EXTENDED_ARG, and the remaining 2 bytes become the argument of LOAD_CONST. When the Python virtual machine sees EXTENDED_ARG, it knows to read the next instruction and combine the two arguments, with the EXTENDED_ARG value forming the high-order bytes. So the operation Python actually performs is LOAD_CONST with argument 0x00123456.
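
The combining step can be sketched in plain Python. This is only an illustration of the arithmetic, not CPython's actual implementation; the helper name combine_pre36 is hypothetical.

```python
def combine_pre36(ext: int, arg: int) -> int:
    """Combine a 2-byte EXTENDED_ARG value with the 2-byte argument
    of the instruction that follows it (pre-3.6 layout)."""
    if not (0 <= ext <= 0xFFFF and 0 <= arg <= 0xFFFF):
        raise ValueError("both parts must fit into 2 bytes")
    # ext supplies the two most significant bytes, arg the two least significant.
    return (ext << 16) | arg


print(hex(combine_pre36(0x0012, 0x3456)))  # 0x123456
```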

Python 3.6 and later

Python 3.6 changed the size of instructions. Starting from Python 3.6, every instruction takes 2 bytes: the opcode still takes 1 byte, and the argument now also takes 1 byte. If an instruction has no argument, the argument byte is simply zero.

So what's the equivalent bytecode representation of LOAD_CONST 0x123456 in the latest version of Python? It's like this:

EXTENDED_ARG   0x12
EXTENDED_ARG   0x34
LOAD_CONST     0x56

The change is pretty straightforward; the only difference is the use of multiple EXTENDED_ARG instructions. The size limit for an argument is 4 bytes, so at most 3 EXTENDED_ARG instructions can precede any given instruction (LOAD_CONST in this example).
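
As a rough sketch of the splitting described here (the helper name split_arg is hypothetical and not part of CPython or any library), an argument can be cut into one-byte chunks, most significant first. Every chunk except the last would be carried by an EXTENDED_ARG prefix, and the last chunk by the real instruction:

```python
def split_arg(arg: int) -> list:
    """Split an argument into 1-byte chunks, most significant first."""
    if not 0 <= arg <= 0xFFFFFFFF:
        raise ValueError("argument must fit into 4 bytes")
    chunks = [arg & 0xFF]  # lowest byte belongs to the real instruction
    arg >>= 8
    while arg:
        chunks.append(arg & 0xFF)  # each higher byte needs one EXTENDED_ARG
        arg >>= 8
    chunks.reverse()
    return chunks


print([hex(c) for c in split_arg(0x123456)])  # ['0x12', '0x34', '0x56']
```

A 4-byte argument yields four chunks, which is why at most three EXTENDED_ARG prefixes are ever needed.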

Wait, the documentation is wrong?

Funnily enough, the documentation was not updated to reflect this change (which is understandable given the amount of work that needed to be done). So I made a PR to fix it.

Play with it yourself

Finally, some code to prove the things we talked about.

I'll be using a brilliant library made by Victor Stinner, called bytecode. Thanks to @thautwarm for introducing it to me.

The complete program can be found here.

In this program, I construct a series of instructions by hand that does two things: LOAD_CONST 0x1234567, and then RETURN_VALUE to return the loaded value from the top of the VM stack. With the help of the bytecode library, things become really simple.

import sys

from bytecode import ConcreteBytecode, ConcreteInstr

CONST_ARG = 0x1234567  # The real argument we want to set.
cbc = ConcreteBytecode()
cbc.consts = [None] * (CONST_ARG + 1)  # Make sure co_consts is big enough.
cbc.consts[CONST_ARG] = "foo"  # Sets the value we want to load.

if sys.version_info >= (3, 6):
    cbc.extend([
        ConcreteInstr("EXTENDED_ARG", 0x1),
        ConcreteInstr("EXTENDED_ARG", 0x23),
        ConcreteInstr("EXTENDED_ARG", 0x45),
        ConcreteInstr("LOAD_CONST", 0x67),
        ConcreteInstr("RETURN_VALUE"),
    ])

cbc.extend manually constructs the instructions, which should look familiar by now. The tricky part is preparing the value. Since we want to execute LOAD_CONST 0x1234567, there needs to be a value located at co_consts[0x1234567] (if you don't know what co_consts is, check out the doc). So we set the value manually: cbc.consts[0x1234567] = "foo".

Now comes the interesting part. Let's dis the code object we just created:

  1           0 EXTENDED_ARG             1
              2 EXTENDED_ARG           291
              4 EXTENDED_ARG         74565
              6 LOAD_CONST           19088743 ('foo')
              8 RETURN_VALUE

Is something wrong? Why are the arguments different from what we set? Here's how it works:

1 = 0x01
291 = 0x0123
74565 = 0x012345
19088743 = 0x01234567

Now it's clear: Python accumulates the arguments each time it sees EXTENDED_ARG, and the accumulated value is what shows up as Instruction.arg. Under the hood, however, the byte values are still exactly what we set.

# Assumes `code` is the code object built from cbc, e.g. code = cbc.to_code().
for raw_byte in code.co_code:
    print("raw code is: ", raw_byte)

"""
raw code is:  144   # EXTENDED_ARG
raw code is:  1     # 0x01
raw code is:  144   # EXTENDED_ARG
raw code is:  35    # 0x23
raw code is:  144   # EXTENDED_ARG
raw code is:  69    # 0x45
raw code is:  100   # LOAD_CONST
raw code is:  103   # 0x67
raw code is:  83    # RETURN_VALUE
raw code is:  0     # no argument, so zero
"""

The program also runs on versions before 3.6; the only difference is that each argument takes two bytes:

cbc.extend([
    ConcreteInstr("EXTENDED_ARG", 0x123),
    ConcreteInstr("LOAD_CONST", 0x4567),
    ConcreteInstr("RETURN_VALUE"),
])
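
To tie the raw bytes of the 3.6+ version back to the accumulated arguments that dis displays, the decoding can be sketched as follows. This is a minimal illustration that operates on the byte values printed earlier, not CPython's real interpreter loop; the function name decode is hypothetical, and the opcode number 144 for EXTENDED_ARG is taken from that output and may differ in other Python versions.

```python
EXTENDED_ARG = 144  # opcode number as printed in the raw byte listing


def decode(raw):
    """Yield (opcode, effective_arg) pairs from 2-byte instruction units."""
    ext = 0
    for i in range(0, len(raw), 2):
        opcode, byte = raw[i], raw[i + 1]
        arg = (ext << 8) | byte
        # An EXTENDED_ARG carries its accumulated value into the next instruction;
        # any other opcode resets the accumulator.
        ext = arg if opcode == EXTENDED_ARG else 0
        yield opcode, arg


raw = bytes([144, 1, 144, 35, 144, 69, 100, 103, 83, 0])
for opcode, arg in decode(raw):
    print(opcode, arg)
```

The values printed for the three EXTENDED_ARG units and for LOAD_CONST are 1, 291, 74565 and 19088743, matching the dis output.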
