scrcpy for developers

Overview

This application is composed of two parts:

  • the server (scrcpy-server), to be executed on the device,
  • the client (the scrcpy binary), executed on the host computer.

The client is responsible for pushing the server to the device and starting its
execution.

The client and the server establish communication using separate sockets for
video, audio and controls. Any of them may be disabled (but not all), so
there are 1, 2 or 3 socket(s).

The server initially sends the device name on the first socket (it is used for
the scrcpy window title), then each socket is used for its own purpose. All
reads and writes are performed from a dedicated thread for each socket, both on
the client and on the server.

If video is enabled, then the server sends a raw video stream (H.264 by default)
of the device screen, with some additional headers for each packet. The client
decodes the video frames, and displays them as soon as possible, without
buffering (unless --display-buffer=delay is specified) to minimize latency.
The client is not aware of the device rotation (which is handled by the server);
it only knows the dimensions of the video frames it receives.

Similarly, if audio is enabled, then the server sends a raw audio stream (OPUS
by default) of the device audio output (or the microphone if
--audio-source=mic is specified), with some additional headers for each
packet. The client decodes the stream and attempts to keep latency minimal by
maintaining an average buffering level. The blog post of the scrcpy v2.0
release gives more details about the audio feature.

If control is enabled, then the client captures relevant keyboard and mouse
events and transmits them to the server, which injects them into the device.
This is the only socket used in both directions: input events are sent from
the client to the device, and when the device clipboard changes, the new content
is sent from the device to the client to support seamless copy-paste.

Note that the client-server roles are expressed at the application level:

  • the server serves video and audio streams, and handles requests from the
    client,
  • the client controls the device through the server.

However, by default (when --force-adb-forward is not set), the roles are
reversed at the network level:

  • the client opens a server socket and listens on a port before starting the
    server,
  • the server connects to the client.

This role inversion guarantees, without polling, that the connection will not
fail due to race conditions.
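
The ordering can be illustrated with plain TCP sockets on the loopback
interface. This is only a minimal sketch, not scrcpy's code: fake_server is a
hypothetical stand-in for the device-side server, and an ephemeral local port
replaces the adb tunnel.

```python
import socket
import threading


def fake_server(port: int) -> None:
    """Hypothetical stand-in for the device-side server: it connects out."""
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall(b"hello")


def run_reversed_roles() -> bytes:
    """Listen first, then start the peer, so connect cannot race listen."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]

        # The peer is started only now: the listening socket already exists,
        # so its connection attempt cannot arrive too early.
        peer = threading.Thread(target=fake_server, args=(port,))
        peer.start()

        conn, _ = listener.accept()
        with conn:
            data = b""
            while len(data) < 5:
                chunk = conn.recv(5 - len(data))
                if not chunk:
                    break
                data += chunk
        peer.join()
        return data
```

Had the roles not been reversed, the connecting side would need to retry until
the listening side is ready, which is the polling this design avoids.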

Server

Privileges

Capturing the screen requires some privileges, which are granted to shell.

The server is a Java application (with a public static void main(String... args)
method), compiled against the Android framework, and executed as shell on the
Android device.

To run such a Java application, the classes must be dexed (typically,
to classes.dex). If my.package.MainClass is the main class, compiled to
classes.dex, pushed to the device in /data/local/tmp, then it can be run
with:

adb shell CLASSPATH=/data/local/tmp/classes.dex app_process / my.package.MainClass

The path /data/local/tmp is a good location for the server, since it's
readable and writable by shell, but not world-writable, so a malicious
application cannot replace the server just before the client executes it.

Instead of a raw dex file, app_process accepts a jar containing
classes.dex (e.g. an APK). For simplicity, and to benefit from the gradle
build system, the server is built to an (unsigned) APK (renamed to
scrcpy-server.jar).

Hidden methods

Although compiled against the Android framework, hidden methods and classes are
not directly accessible (and they may differ from one Android version to
another).

They can be called using reflection though. The communication with hidden
components is provided by wrapper classes and aidl.

Execution

The client starts the server by executing, in essence, the following
commands:

adb push scrcpy-server /data/local/tmp/scrcpy-server.jar
adb forward tcp:27183 localabstract:scrcpy
adb shell CLASSPATH=/data/local/tmp/scrcpy-server.jar app_process / com.genymobile.scrcpy.Server 2.1

The first argument (2.1 in the example) is the client scrcpy version. The
server fails if the client and the server do not have the exact same version.
The protocol between the client and the server may change from version to
version (see protocol below), and there is no backward or forward
compatibility (there is no point in using different client and server versions).
This check makes it possible to detect misconfiguration (running an older or
newer server by mistake).

It is followed by any number of arguments, in the form of key=value pairs.
Their order is irrelevant. The possible keys and associated value types can be
found in the server and client code.

For example, if we execute scrcpy -m1920 --no-audio, then the server
execution will look like this:

# scid is a random number to identify different clients running on the same device
adb shell CLASSPATH=/data/local/tmp/scrcpy-server.jar app_process / com.genymobile.scrcpy.Server 2.1 scid=12345678 log_level=info audio=false max_size=1920
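
As an illustration of how such a command line is assembled, here is a minimal
sketch. The function name build_server_command is hypothetical, and the option
values are passed as strings exactly as they should appear after the = sign;
the actual client builds its argument list in C.

```python
def build_server_command(version: str, options: dict[str, str]) -> list[str]:
    """Build the adb argv that starts the server with key=value options."""
    command = [
        "adb", "shell",
        "CLASSPATH=/data/local/tmp/scrcpy-server.jar",
        "app_process", "/", "com.genymobile.scrcpy.Server",
        version,  # first argument: must match the server version exactly
    ]
    # The server does not care about the order of the key=value pairs;
    # the insertion order of the dict is simply kept as given.
    command += [f"{key}={value}" for key, value in options.items()]
    return command
```

The resulting list can be handed to subprocess.run() as is, which avoids any
shell quoting issue on the host side.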

Components

When the server starts, its main() method runs (on the "main" thread).
It parses the arguments, establishes the connection with the client and starts
the other "components":

  • the video streamer: it captures the screen and sends encoded video
    packets on the video socket (from the video thread).
  • the audio streamer: it uses several threads to capture raw packets,
    submit them to encoding and retrieve encoded packets, which it sends on the
    audio socket.
  • the controller: it receives control messages (typically input events)
    on the control socket from one thread, and sends device messages (e.g. to
    transmit the device clipboard content to the client) on the same control
    socket from another thread. Thus, the control socket is used in both
    directions (contrary to the video and audio sockets).

Screen video encoding

The encoding is managed by ScreenEncoder.

The video is encoded using the MediaCodec API. The codec encodes the content
of a Surface associated with the display, and writes the encoded packets to the
client (on the video socket).

On device rotation (or folding), the encoding session is reset and restarted.

New frames are produced only when changes occur on the surface. This avoids
sending unnecessary frames, but by default there might be drawbacks:

  • it does not send any frame on start if the device screen does not change,
  • after fast motion changes, the last frame may have poor quality.

Both problems are solved by the flag KEY_REPEAT_PREVIOUS_FRAME_AFTER.

Audio encoding

Similarly, the audio is captured using an AudioRecord, and encoded using
the MediaCodec asynchronous API.

More details are available in the blog post introducing the audio feature.

Input events injection

Control messages are received from the client by the Controller (run in a
separate thread). There are several types of input events:

  • keycode (cf KeyEvent),
  • text (special characters may not be handled by keycodes directly),
  • mouse motion/click,
  • mouse scroll,
  • other commands (e.g. to switch the screen on or to copy the clipboard).

Some of them need to inject input events into the system. To do so, they use
the hidden method InputManager.injectInputEvent() (exposed by the
InputManager wrapper).

Client

The client relies on SDL, which provides a cross-platform API for UI, input
events, threading, etc.

The video and audio streams are decoded by FFmpeg.

Initialization

The client parses the command line arguments, then runs one of two code
paths: the "normal" mode or the OTG mode.

In the remainder of this document, we assume that the "normal" mode is used
(read the code for the OTG mode).

On startup, the client:

  • opens the video, audio and control sockets;
  • pushes and starts the server on the device;
  • initializes its components (demuxers, decoders, recorder…).

Video and audio streams

Depending on the arguments passed to scrcpy, several components may be used.
Here is an overview of the video and audio components:

                                                 V4L2 sink
                                               /
                                       decoder
                                     /         \
        VIDEO -------------> demuxer             display
                                     \
                                       recorder
                                     /
        AUDIO -------------> demuxer
                                     \
                                       decoder --- audio player

The demuxer is responsible for extracting video and audio packets (read some
header, split the video stream into packets at correct boundaries, etc.).

The demuxed packets may be sent to a decoder (one per stream, to produce
frames) and to a recorder (receiving both video and audio stream to record a
single file). The packets are encoded on the device (by MediaCodec), but when
recording, they are muxed (asynchronously) into a container (MKV or MP4) on
the client side.
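
The fan-out from one demuxer to several consumers can be pictured with the
following sketch. It is only an illustration of the idea: the Demuxer class,
the add_sink method and the PacketSink type are hypothetical names, and the
real client implements this in C with its own packet structures.

```python
from typing import Callable, Iterable, List

# A sink is anything that accepts one encoded packet (decoder, recorder...).
PacketSink = Callable[[bytes], None]


class Demuxer:
    """Forwards each demuxed packet to every registered sink."""

    def __init__(self) -> None:
        self._sinks: List[PacketSink] = []

    def add_sink(self, sink: PacketSink) -> None:
        self._sinks.append(sink)

    def run(self, packets: Iterable[bytes]) -> None:
        # Every sink receives the same encoded packet: the decoder turns it
        # into a frame, the recorder muxes it into the output file.
        for packet in packets:
            for sink in self._sinks:
                sink(packet)
```

With this shape, recording without display (or display without recording) is
just a matter of which sinks are registered.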

Video frames are sent to the screen/display to be rendered in the scrcpy window.
They may also be sent to a V4L2 sink.

Audio "frames" (an array of decoded samples) are sent to the audio player.

Controller

The controller is responsible for sending control messages to the device. It
runs in a separate thread, to avoid I/O on the main thread.

On SDL event, received on the main thread, the input manager creates
appropriate control messages. It is responsible for converting SDL events to
Android events. It then pushes the control messages to a queue held by the
controller. On its own thread, the controller takes messages from the queue,
serializes them and sends them to the server.
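
The queue-plus-thread arrangement can be sketched as follows. This is not the
client's implementation (which is written in C): the Controller class below is
a simplified model, and its wire format (type, length, payload) is purely
hypothetical, chosen only to make the example runnable.

```python
import queue
import struct
import threading
from typing import Callable, Optional, Tuple


class Controller:
    """Owns a message queue and a thread that serializes and sends messages."""

    def __init__(self, send: Callable[[bytes], None]) -> None:
        self._send = send
        self._queue: "queue.Queue[Optional[Tuple[int, bytes]]]" = queue.Queue()
        self._thread = threading.Thread(target=self._run)

    def start(self) -> None:
        self._thread.start()

    def push(self, msg_type: int, payload: bytes) -> None:
        """Called from the main thread: only enqueues, never performs I/O."""
        self._queue.put((msg_type, payload))

    def stop(self) -> None:
        self._queue.put(None)  # sentinel asking the thread to exit
        self._thread.join()

    def _run(self) -> None:
        while True:
            msg = self._queue.get()
            if msg is None:
                break
            msg_type, payload = msg
            # Hypothetical wire format: type (u8), length (u32, big-endian),
            # then the payload. The real format differs per message type.
            self._send(struct.pack(">BI", msg_type, len(payload)) + payload)
```

The main thread only ever calls push(), so a slow or blocked socket delays the
controller thread but not event handling or rendering.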

Protocol

The protocol between the client and the server must be considered internal: it
may (and will) change at any time for any reason. Everything may change (the
number of sockets, the order in which the sockets must be opened, the data
format on the wire…) from version to version. A client must always be run with a
matching server version.

This section documents the current protocol in scrcpy v2.1.

Connection

Firstly, the client sets up an adb tunnel:

# By default, a reverse redirection: the computer listens, the device connects
adb reverse localabstract:scrcpy_<SCID> tcp:27183

# As a fallback (or if --force-adb-forward is set), a forward redirection:
# the device listens, the computer connects
adb forward tcp:27183 localabstract:scrcpy_<SCID>

(<SCID> is a 31-bit random number, so that it does not fail when several
scrcpy instances start "at the same time" for the same device.)
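
A minimal sketch of generating such an identifier and deriving the socket name
from it is shown below. The function names are hypothetical, and formatting
the SCID as 8 hexadecimal digits is an assumption made for the example; check
the client code for the exact representation.

```python
import secrets


def generate_scid() -> int:
    """Return a random 31-bit identifier (0 <= scid < 2**31)."""
    return secrets.randbits(31)


def socket_name(scid: int) -> str:
    """Abstract socket name for this SCID (hex formatting is an assumption)."""
    return f"scrcpy_{scid:08x}"
```

A 31-bit value always fits in a non-negative signed 32-bit integer, which is
convenient when the same number has to be handled on the Java side.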

Then, up to 3 sockets are opened, in that order:

  • a video socket
  • an audio socket
  • a control socket

Each one may be disabled (respectively by --no-video, --no-audio and
--no-control, directly or indirectly). For example, if --no-audio is set,
then the video socket is opened first, then the control socket.
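
The ordering rule can be written down in a few lines. This is a sketch with a
hypothetical function name, where the three booleans mirror the effect of
--no-video, --no-audio and --no-control:

```python
def sockets_to_open(video: bool = True, audio: bool = True,
                    control: bool = True) -> list[str]:
    """Return the names of the sockets to open, in protocol order."""
    if not (video or audio or control):
        raise ValueError("video, audio and control cannot all be disabled")
    order = [("video", video), ("audio", audio), ("control", control)]
    return [name for name, enabled in order if enabled]
```

The first element of the returned list is the "first socket" that the
following paragraphs refer to.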

On the first socket opened (whichever it is), if the tunnel is forward, then
a dummy byte is sent from the device to the client. This makes it possible to
detect a connection error (the client connection does not fail as long as there
is an adb forward redirection, even if nothing is listening on the device side).
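
On the client side, the check boils down to reading one byte right after
connecting. The sketch below uses a hypothetical helper name and assumes that
a tunnel with nothing listening behind it shows up as end-of-stream or as a
socket error on that first read:

```python
import socket


def wait_for_dummy_byte(sock: socket.socket) -> bool:
    """Return True if the peer sent its dummy byte, False otherwise."""
    try:
        # recv() returns b"" when the connection was closed without data.
        return len(sock.recv(1)) == 1
    except OSError:
        return False
```

A caller would typically retry the whole connection a few times when this
returns False, since the server may not be listening yet.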

Still on this first socket, the device sends some metadata to
the client (currently only the device name, used as the window title, but there
might be other fields in the future).

You can read the client and server code for more details.

Then each socket is used for its intended purpose.

Video and audio

On the video and audio sockets, the device first sends some codec metadata:

  • On the video socket, 12 bytes:
    • the codec id (u32) (H264, H265 or AV1)
    • the initial video width (u32)
    • the initial video height (u32)
  • On the audio socket, 4 bytes:
    • the codec id (u32) (OPUS, AAC or RAW)
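
Reading these fields is straightforward with fixed-size unpacking. The sketch
below assumes big-endian (network) byte order, which this document does not
state explicitly; the type and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class VideoCodecMeta(NamedTuple):
    codec_id: int
    width: int
    height: int


def parse_video_codec_meta(data: bytes) -> VideoCodecMeta:
    """Parse the 12-byte video metadata: three u32 (assumed big-endian)."""
    if len(data) != 12:
        raise ValueError("video codec metadata must be exactly 12 bytes")
    return VideoCodecMeta(*struct.unpack(">III", data))


def parse_audio_codec_meta(data: bytes) -> int:
    """Parse the 4-byte audio metadata: the codec id (assumed big-endian)."""
    if len(data) != 4:
        raise ValueError("audio codec metadata must be exactly 4 bytes")
    (codec_id,) = struct.unpack(">I", data)
    return codec_id
```

The width and height are only the initial values: the client learns about
later size changes from the decoded frames themselves.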

Then each packet produced by MediaCodec is sent, prefixed by a 12-byte frame
header:

  • config packet flag (u1)
  • key frame flag (u1)
  • PTS (u62)
  • packet size (u32)

Here is a schema describing the frame header:

    [. . . . . . . .|. . . .]. . . . . . . . . . . . . . . ...
     <-------------> <-----> <-----------------------------...
           PTS        packet        raw packet
                       size
     <--------------------->
           frame header

The most significant bits of the PTS are used for packet flags:

     byte 7   byte 6   byte 5   byte 4   byte 3   byte 2   byte 1   byte 0
    CK...... ........ ........ ........ ........ ........ ........ ........
    ^^<------------------------------------------------------------------->
    ||                                PTS
    | `- key frame
     `-- config packet
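
The layout above translates into the following parsing sketch. The names are
hypothetical, and big-endian byte order is an assumption (the diagrams show
the most significant byte first, but the byte order on the wire is not stated
here).

```python
import struct
from typing import NamedTuple

FLAG_CONFIG = 1 << 63            # most significant bit: config packet
FLAG_KEY_FRAME = 1 << 62         # next bit: key frame
PTS_MASK = FLAG_KEY_FRAME - 1    # remaining 62 bits: PTS


class FrameHeader(NamedTuple):
    is_config: bool
    is_key_frame: bool
    pts: int
    packet_size: int


def parse_frame_header(data: bytes) -> FrameHeader:
    """Parse the 12-byte header: u64 (flags + PTS), then u32 packet size."""
    if len(data) != 12:
        raise ValueError("frame header must be exactly 12 bytes")
    pts_and_flags, packet_size = struct.unpack(">QI", data)
    return FrameHeader(
        is_config=bool(pts_and_flags & FLAG_CONFIG),
        is_key_frame=bool(pts_and_flags & FLAG_KEY_FRAME),
        pts=pts_and_flags & PTS_MASK,
        packet_size=packet_size,
    )
```

After parsing a header, a reader consumes exactly packet_size bytes as the raw
packet, then expects the next header.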

Controls

Control messages are sent via a custom binary protocol.

The only documentation for this protocol is the set of unit tests on both sides.

Standalone server

Although the server is designed to work for the scrcpy client, it can be used
with any client which uses the same protocol.

For simplicity, some server-specific options have been added to produce raw
streams easily:

  • send_device_meta=false: disable the device metadata (in practice, the device
    name) sent on the first socket
  • send_frame_meta=false: disable the 12-byte header for each packet
  • send_dummy_byte=false: disable the dummy byte sent on forward connections
  • send_codec_meta=false: disable the codec information (and initial device size
    for video)
  • raw_stream=true: disable all the above

Concretely, here is how to expose a raw H.264 stream on a TCP socket:

adb push scrcpy-server-v2.1 /data/local/tmp/scrcpy-server-manual.jar
adb forward tcp:1234 localabstract:scrcpy
adb shell CLASSPATH=/data/local/tmp/scrcpy-server-manual.jar \
    app_process / com.genymobile.scrcpy.Server 2.1 \
    tunnel_forward=true audio=false control=false cleanup=false \
    raw_stream=true max_size=1920

As soon as a client connects over TCP on port 1234, the device will start
streaming the video. For example, VLC can play the video (although you will
experience a very high latency):

vlc -Idummy --demux=h264 --network-caching=0 tcp://localhost:1234
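
Any program able to open a TCP connection can consume the stream. As a minimal
sketch (the function names and the output path are hypothetical), the raw
stream can be dumped to a file for later inspection:

```python
import socket
from typing import BinaryIO


def dump_stream(sock: socket.socket, out: BinaryIO,
                chunk_size: int = 65536) -> int:
    """Copy everything received on sock to out; return the byte count."""
    total = 0
    while True:
        chunk = sock.recv(chunk_size)
        if not chunk:  # the peer closed the connection
            return total
        out.write(chunk)
        total += len(chunk)


def record_raw_h264(path: str, host: str = "localhost",
                    port: int = 1234) -> int:
    """Connect to the forwarded port and save the raw H.264 stream to path."""
    with socket.create_connection((host, port)) as sock, \
            open(path, "wb") as out:
        return dump_stream(sock, out)
```

Since raw_stream=true removes every header, the resulting file contains only
the encoded video as produced on the device.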

Hack

For more details, go read the code!

If you find a bug, or have an awesome idea to implement, please discuss and
contribute ;-)

Debug the server

The server is pushed to the device by the client on startup.

To debug it, enable the server debugger during configuration:

meson setup x -Dserver_debugger=true
# or, if x is already configured
meson configure x -Dserver_debugger=true

If your device runs Android 8 or below, set the server_debugger_method to
old in addition:

meson setup x -Dserver_debugger=true -Dserver_debugger_method=old
# or, if x is already configured
meson configure x -Dserver_debugger=true -Dserver_debugger_method=old

Then recompile.

When you start scrcpy, it will start a debugger on port 5005 on the device.
Redirect that port to the computer:

adb forward tcp:5005 tcp:5005

In Android Studio, Run > Debug > Edit configurations... On the left, click on
+, Remote, and fill the form:

  • Host: localhost
  • Port: 5005

Then click on Debug.
