YY Censorship Research

Jeffrey Knockel
YY user profile censorship warning
“【自我说明】包含敏感字符,请重新输入。” ([Profile] contains sensitive characters, please try again.) Note that for censorship of text chat, triggering messages do not display this warning.

Check out the...

Censorship Analysis

YY 7.1 downloads three different keyword lists:

Surveillance Analysis

When a user sends a message containing a keyword from the "Normal" or "High" lists above, the client sends a surveillance message via an HTTP GET request to a URL of the form:

http://sere.hiido.com/do.action?id=<id>&content=<content>

<id> is a hash computed as md5(⌊<seconds since unix epoch> / 1000⌋ + ";username=report;password=pswd@1234"), hex-encoded. Note that the username and password in the hashed string are hardcoded; these are not the username and password of the sender or receiver of the triggering message.
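The <id> computation described above can be sketched in Python as follows. This is a minimal illustration, not code taken from YY: the function name compute_id is hypothetical, and it assumes the floored value is written as a decimal integer and that the concatenated string is hashed as ASCII bytes.

```python
import hashlib
import time

# Hardcoded suffix from the description above; these are not real user credentials.
SUFFIX = ";username=report;password=pswd@1234"

def compute_id(now_seconds=None):
    """Return the hex-encoded MD5 used as <id> (hypothetical helper)."""
    if now_seconds is None:
        now_seconds = time.time()
    # floor(<seconds since unix epoch> / 1000)
    bucket = int(now_seconds) // 1000
    # Assumption: the integer is rendered in decimal and hashed as ASCII.
    return hashlib.md5((str(bucket) + SUFFIX).encode("ascii")).hexdigest()
```

Under these assumptions, the value changes only once every 1000 seconds, so all reports sent within the same window carry the same <id>.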

<content> is a base64-encoded string of the following form:

type=2;uid=<sending user id #>;touid=<receiving user id #>;keyword=<triggering keyword>;txt=<triggering message in its entirety>

In the version of YY analyzed, type is hardcoded to 2.
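To make the <content> format concrete, here is a short Python sketch that builds and decodes such a string. The helper names encode_content and decode_content are hypothetical, and the sketch assumes the plain text is UTF-8 encoded and that the standard base64 alphabet is used; neither detail is stated above.

```python
import base64

def encode_content(uid, touid, keyword, txt):
    """Build a <content> value (hypothetical helper; type is fixed to 2)."""
    plain = f"type=2;uid={uid};touid={touid};keyword={keyword};txt={txt}"
    # Assumption: UTF-8 text, standard base64 alphabet.
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")

def decode_content(content):
    """Decode a <content> value into a dict of its five fields."""
    plain = base64.b64decode(content).decode("utf-8")
    fields = {}
    # Split at most four times so that any ';' inside the message stays in txt.
    for part in plain.split(";", 4):
        name, _, value = part.partition("=")
        fields[name] = value
    return fields
```

For example, decode_content(encode_content(111, 222, "kw", "hello; kw=1")) recovers the sender id, receiver id, keyword, and the full message text. The decoder relies on txt being the last field, so it would misparse a keyword that itself contains a semicolon.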

Code

decode.py is a Python script that automates decoding the "Normal" and "High" lists into plain-text UTF-8.