COLO-proxy
----------
Copyright (c) 2016 Intel Corporation
Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
Copyright (c) 2016 Fujitsu, Corp.

This work is licensed under the terms of the GNU GPL, version 2 or later.
See the COPYING file in the top-level directory.

This document gives an overview of COLO proxy's design.

== Background ==
COLO-proxy is a part of the COLO project. It compares
the network packets of the primary and secondary VMs to
help COLO decide whether to do a checkpoint. With
COLO-proxy's help, COLO's performance improves greatly.

The COLO-proxy consists of filter-redirector, filter-mirror,
colo-compare and filter-rewriter.

== Architecture ==

COLO-proxy is based on qemu netfilter, and its parts (except colo-compare)
are qemu netfilter plugins. It keeps the Secondary VM (SVM) connected
normally to the client and compares the packets sent by the Primary VM (PVM)
with those sent by the SVM. If the packets differ, it notifies COLO-frame to
do a checkpoint and sends all queued primary packets. Otherwise it just sends
the queued primary packet and drops the queued secondary packet.

Below is an ASCII figure of the COLO-proxy:

 Primary qemu                                                           Secondary qemu
+--------------------------------------------------------------+       +----------------------------------------------------------------+
| +----------------------------------------------------------+ |       |  +-----------------------------------------------------------+ |
| |                                                          | |       |  |                                                           | |
| |                        guest                             | |       |  |                        guest                              | |
| |                                                          | |       |  |                                                           | |
| +-------^--------------------------+-----------------------+ |       |  +---------------------+--------+----------------------------+ |
|         |                          |                         |       |                        ^        |                              |
|         |                          |                         |       |                        |        |                              |
|         |  +------------------------------------------------------+  |                        |        |                              |
|netfilter|  |                       |                         |    |  |   netfilter            |        |                              |
| +----------+ +----------------------------+                  |    |  |  +-----------------------------------------------------------+ |
| |       |  |                       |      |        out       |    |  |  |                     |        |  filter execute order      | |
| |       |  |          +-----------------------------+        |    |  |  |                     |        | +------------------->      | |
| |       |  |          |            |      |         |        |    |  |  |                     |        |   TCP                      | |
| | +-----+--+-+  +-----v----+ +-----v----+ |pri +----+----+sec|    |  |  | +------------+  +---+----+---v+rewriter++  +------------+ | |
| | |          |  |          | |          | |in  |         |in |    |  |  | |            |  |        |              |  |            | | |
| | |  filter  |  |  filter  | |  filter  +------>  colo   <------+ +-------->  filter   +--> adjust |   adjust     +-->   filter   | | |
| | |  mirror  |  |redirector| |redirector| |    | compare |   |  |    |  | | redirector |  | ack    |   seq        |  | redirector | | |
| | |          |  |          | |          | |    |         |   |  |    |  | |            |  |        |              |  |            | | |
| | +----^-----+  +----+-----+ +----------+ |    +---------+   |  |    |  | +------------+  +--------+--------------+  +---+--------+ | |
| |      |   tx        |   rx           rx  |                  |  |    |  |            tx                        all       |  rx      | |
| |      |             |                    |                  |  |    |  +-----------------------------------------------------------+ |
| |      |             +--------------+     |                  |  |    |                                                   |            |
| |      |   filter execute order     |     |                  |  |    |                                                   |            |
| |      |  +---------------->        |     |                  |  +--------------------------------------------------------+            |
| +-----------------------------------------+                  |       |                                                                |
|        |                            |                        |       |                                                                |
+--------------------------------------------------------------+       +----------------------------------------------------------------+
         |guest receive               | guest send
         |                            |
+--------+----------------------------v------------------------+
|                                                              |                          NOTE: filter direction is rx/tx/all
|                         tap                                  |                          rx:receive packets sent to the netdev
|                                                              |                          tx:receive packets sent by the netdev
+--------------------------------------------------------------+

1. Guest receive packet route:

Primary:

Tap --> Mirror Client Filter
The mirror client sends the packet to the guest and, at the
same time, copies and forwards it to the secondary
mirror server.

Secondary:

Mirror Server Filter --> TCP Rewriter
If the received packet is a TCP packet, we adjust its ack
and update the TCP checksum, then send it to the secondary
guest. Otherwise we send it directly to the guest.

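The "update TCP checksum" step can be illustrated with a generic
16-bit one's-complement checksum in the style of RFC 1071. This is only a
sketch: the function name and the 8-byte sample segment are made up for
illustration, a real TCP checksum also covers a pseudo-header and the whole
segment, and this is not the helper qemu itself uses.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Hypothetical segment: a 4-byte "ack" field followed by 4 payload bytes.
segment = bytes.fromhex("0001f203f4f5f6f7")
before = internet_checksum(segment)

# Rewrite the ack field (first 4 bytes), then recompute the checksum,
# because the old value no longer matches the modified bytes.
rewritten = (0x0001F203 + 0x20).to_bytes(4, "big") + segment[4:]
after = internet_checksum(rewritten)
```

Appending the checksum to the data it was computed over and running the
function again yields 0, which is how a receiver validates the packet.
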
2. Guest send packet route:

Primary:

Guest --> Redirect Server Filter
The redirect server filter receives the primary guest's packet
but does nothing with it; it just passes it to the next filter.

Redirect Server Filter --> COLO-Compare
COLO-compare receives the primary guest's packet, then
waits for the redirected secondary packet to compare it with.
If the packets are the same, it sends the queued primary packet and
clears the queued secondary packet. Otherwise it sends the primary
packet and does a checkpoint.

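The compare decision described above can be modelled with two queues.
The following is a toy sketch, not qemu's implementation: the class and
attribute names are hypothetical, packets are compared byte for byte, and
a single queue pair is used for all traffic.

```python
from collections import deque


class CompareSketch:
    """Toy model of the COLO-compare decision; names are illustrative."""

    def __init__(self):
        self.primary = deque()    # queued packets from the primary guest
        self.secondary = deque()  # queued packets from the secondary guest
        self.sent = []            # packets released to outdev
        self.checkpoints = 0      # checkpoint requests raised so far

    def on_primary(self, pkt: bytes) -> None:
        self.primary.append(pkt)
        self._compare()

    def on_secondary(self, pkt: bytes) -> None:
        self.secondary.append(pkt)
        self._compare()

    def _compare(self) -> None:
        while self.primary and self.secondary:
            if self.primary[0] == self.secondary[0]:
                # Same packet: release the primary copy, drop the secondary.
                self.sent.append(self.primary.popleft())
                self.secondary.popleft()
            else:
                # Different: request a checkpoint, flush all queued primary
                # packets and discard the queued secondary packets.
                self.checkpoints += 1
                self.sent.extend(self.primary)
                self.primary.clear()
                self.secondary.clear()


c = CompareSketch()
c.on_primary(b"a")
c.on_secondary(b"a")   # match: b"a" is released
c.on_primary(b"b")
c.on_primary(b"c")
c.on_secondary(b"x")   # mismatch: checkpoint, b"b" and b"c" are flushed
```

Note that a primary packet is held back until its secondary counterpart
arrives, so the client never sees output that the secondary could not
have produced.
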
COLO-Compare --> Another Redirector Filter
The redirector gets the packet from colo-compare through a
chardev socket.

Redirector Filter --> Tap
Send the packet.

Secondary:

Guest --> TCP Rewriter Filter
If the packet is a TCP packet, we adjust its seq
and update the TCP checksum, then send it to the
redirect client filter. Otherwise we send it directly to the
redirect client filter.

Redirect Client Filter --> Redirect Server Filter
Forward the packet to the primary.

== Components introduction ==

Filter-mirror is a netfilter plugin.
It gives qemu the ability to mirror
packets to a chardev.

Filter-redirector is a netfilter plugin.
It gives qemu the ability to redirect net packets.
The redirector can redirect the filter's net packets to outdev,
and redirect indev's packets to the filter.

                    filter
                      +
          redirector  |
             +--------------+
             |        |     |
             |        |     |
             |        |     |
  indev +---------+   +---------->  outdev
             |    |         |
             |    |         |
             |    |         |
             +--------------+
                  |
                  v
                filter

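The two directions in the figure can be sketched as follows. This is a
simplified model with hypothetical names, assuming that a packet
redirected to outdev is not also passed down the filter chain, and that a
redirector without an outdev passes packets through untouched.

```python
class RedirectorSketch:
    """Toy model of filter-redirector; not qemu's implementation."""

    def __init__(self, has_indev: bool = False, has_outdev: bool = False):
        self.has_indev = has_indev
        self.has_outdev = has_outdev
        self.outdev = []       # packets written to the outdev chardev
        self.next_filter = []  # packets handed to the next filter

    def from_filter(self, pkt: bytes) -> None:
        """A packet arrives through the netfilter chain."""
        if self.has_outdev:
            self.outdev.append(pkt)       # redirected away from the chain
        else:
            self.next_filter.append(pkt)  # nothing to do, pass it on

    def from_indev(self, pkt: bytes) -> None:
        """A packet arrives on the indev chardev."""
        if not self.has_indev:
            raise ValueError("this redirector has no indev")
        self.next_filter.append(pkt)      # injected into the chain


to_compare = RedirectorSketch(has_outdev=True)
to_compare.from_filter(b"guest packet")

from_compare = RedirectorSketch(has_indev=True)
from_compare.from_filter(b"passes through")
from_compare.from_indev(b"released by compare")
```
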
COLO-compare does the packet comparing job.
Packets coming from the primary chardev (primary_in) will be sent to outdev.
Packets coming from the secondary chardev (secondary_in) will be dropped after comparing.
COLO-compare needs two input chardevs and one output chardev:
primary_in=chardev1-id (source: primary send packet)
secondary_in=chardev2-id (source: secondary send packet)
outdev=chardev3-id

Filter-rewriter rewrites some of the secondary's packets so that the
secondary guest's TCP connections can be established successfully.
This module rewrites the ack of TCP packets going to the secondary
from the primary, and the seq of TCP packets going to the primary
from the secondary.

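The rewriting is needed because the two guests choose their initial TCP
sequence numbers independently. A minimal sketch of the arithmetic,
assuming a single fixed offset per connection (the class and method names
are hypothetical, and checksum updating is left out):

```python
MASK32 = 0xFFFFFFFF  # TCP sequence numbers wrap at 2**32


class SeqOffset:
    """Per-connection offset between secondary and primary sequence space."""

    def __init__(self, primary_isn: int, secondary_isn: int):
        self.offset = (secondary_isn - primary_isn) & MASK32

    def ack_to_secondary(self, ack: int) -> int:
        """Packet going to the secondary: shift ack into its seq space."""
        return (ack + self.offset) & MASK32

    def seq_to_primary(self, seq: int) -> int:
        """Packet coming from the secondary: shift seq into primary space."""
        return (seq - self.offset) & MASK32


conn = SeqOffset(primary_isn=1000, secondary_isn=5000)
seq_seen_by_compare = conn.seq_to_primary(5001)    # 1001
ack_seen_by_guest = conn.ack_to_secondary(1001)    # 5001
```

After the seq is rewritten, a secondary packet carries the same sequence
number as the matching primary packet, so the two can be compared.
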
== Usage ==

Here is an example that uses demonstration IP addresses and ports to
describe the usage more clearly.

Primary(ip:3.3.3.3):
-netdev tap,id=hn0,vhost=off
-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
-chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
-chardev socket,id=compare0,host=3.3.3.3,port=9001,server=on,wait=off
-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server=on,wait=off
-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
-object iothread,id=iothread1
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0,iothread=iothread1

Secondary(ip:3.3.3.8):
-netdev tap,id=hn0,vhost=off
-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=red0,host=3.3.3.3,port=9003
-chardev socket,id=red1,host=3.3.3.3,port=9004
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
-object filter-rewriter,id=f3,netdev=hn0,queue=all

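In the example above, every chardev without server=on is a client of a
server chardev on the primary with the same host and port. The following
sketch checks that pairing on the option strings; the helper names are
hypothetical and it only understands the key=value form used in this
document.

```python
def parse_chardev(opt: str) -> dict:
    """Split 'socket,id=x,host=h,port=p,...' into its key=value fields."""
    return dict(item.split("=", 1) for item in opt.split(",") if "=" in item)


def unmatched_clients(chardevs: list) -> list:
    """Return ids of client chardevs with no server on the same host:port."""
    parsed = [parse_chardev(c) for c in chardevs]
    servers = {(p["host"], p["port"]) for p in parsed if p.get("server") == "on"}
    return [p["id"] for p in parsed
            if p.get("server") != "on" and (p["host"], p["port"]) not in servers]


PRIMARY = [
    "socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off",
    "socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off",
    "socket,id=compare0,host=3.3.3.3,port=9001,server=on,wait=off",
    "socket,id=compare0-0,host=3.3.3.3,port=9001",
    "socket,id=compare_out,host=3.3.3.3,port=9005,server=on,wait=off",
    "socket,id=compare_out0,host=3.3.3.3,port=9005",
]
SECONDARY = [
    "socket,id=red0,host=3.3.3.3,port=9003",
    "socket,id=red1,host=3.3.3.3,port=9004",
]

both_sides = unmatched_clients(PRIMARY + SECONDARY)
secondary_alone = unmatched_clients(SECONDARY)
```

With both sides present nothing is unmatched, while the secondary on its
own has no server to connect to for red0 and red1.
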
If you want to use virtio-net-pci or another driver with a vnet header,
add vnet_hdr_support to every filter and to colo-compare:

Primary(ip:3.3.3.3):
-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
-chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
-chardev socket,id=compare0,host=3.3.3.3,port=9001,server=on,wait=off
-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server=on,wait=off
-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0,vnet_hdr_support
-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out,vnet_hdr_support
-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0,vnet_hdr_support
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0,vnet_hdr_support

Secondary(ip:3.3.3.8):
-netdev tap,id=hn0,vhost=off
-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=red0,host=3.3.3.3,port=9003
-chardev socket,id=red1,host=3.3.3.3,port=9004
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0,vnet_hdr_support
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1,vnet_hdr_support
-object filter-rewriter,id=f3,netdev=hn0,queue=all,vnet_hdr_support

Note:
  a. COLO-proxy must work with COLO-frame and Block-replication.
  b. The primary COLO must be started first, because COLO-proxy needs
     the chardev socket servers to be running before the secondary starts.
  c. Filter-rewriter only rewrites TCP packets.
