COLO-proxy
----------
Copyright (c) 2016 Intel Corporation
Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
Copyright (c) 2016 Fujitsu, Corp.

This work is licensed under the terms of the GNU GPL, version 2 or later.
See the COPYING file in the top-level directory.

This document gives an overview of COLO proxy's design.

== Background ==
COLO-proxy is a part of the COLO project. It compares
network packets to help COLO decide whether to do a
checkpoint. With COLO-proxy's help, COLO greatly
improves performance.

The filter-redirector, filter-mirror, colo-compare
and filter-rewriter compose the COLO-proxy.

== Architecture ==

COLO-proxy is based on the qemu netfilter framework, and its components
(except colo-compare) are netfilter plugins. It keeps the Secondary VM (SVM)
connected normally to the client and compares packets sent by the Primary
VM (PVM) with those sent by the SVM. If the packets differ, it notifies
COLO-frame to do a checkpoint and sends all queued primary packets.
Otherwise it just sends the queued primary packet and drops the queued
secondary packet.
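
The compare decision described above can be sketched in a few lines of
Python. This is only an illustrative model, not QEMU's implementation: the
CompareQueue class, its method names, the byte-for-byte equality check and
the choice to discard queued secondary packets on a mismatch are all
assumptions made for the example.

```python
from collections import deque


class CompareQueue:
    """Toy model of the primary/secondary packet queues (hypothetical)."""

    def __init__(self):
        self.primary = deque()    # packets sent by the PVM, not yet released
        self.secondary = deque()  # packets sent by the SVM, awaiting comparison
        self.sent = []            # packets released towards the client
        self.checkpoints = 0      # number of checkpoint requests issued

    def on_primary(self, pkt: bytes) -> None:
        self.primary.append(pkt)
        self._compare()

    def on_secondary(self, pkt: bytes) -> None:
        self.secondary.append(pkt)
        self._compare()

    def _compare(self) -> None:
        # Compare only while both sides have a queued packet.
        while self.primary and self.secondary:
            if self.primary[0] == self.secondary[0]:
                # Same packet: release the primary copy, drop the secondary copy.
                self.sent.append(self.primary.popleft())
                self.secondary.popleft()
            else:
                # Different: request a checkpoint and flush every queued
                # primary packet. Discarding the secondary queue here is an
                # assumption of this sketch.
                self.checkpoints += 1
                self.sent.extend(self.primary)
                self.primary.clear()
                self.secondary.clear()
```

Feeding the model a matching pair followed by a mismatching pair releases
both primary packets and records a single checkpoint request.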

Below is a COLO proxy ascii figure:

[Figure: COLO proxy architecture.
 Primary qemu: guest above a netfilter layer containing filter-mirror (tx),
 two filter-redirectors (rx, rx) and colo-compare (inputs "pri in" and
 "sec in", output "out"), with the filters shown in execute order.
 Secondary qemu: guest above a netfilter layer containing
 filter-redirector (tx), the TCP rewriter (all; adjust ack, adjust seq) and
 filter-redirector (rx), with the filters shown in execute order.
 The primary connects to the tap device for guest receive and guest send.]

NOTE: filter direction is rx/tx/all
      rx: receive packets sent to the netdev
      tx: receive packets sent by the netdev

1. Guest receive packet route:

Primary:

Tap --> Mirror Client Filter
The mirror client sends the packet to the guest and, at the
same time, copies and forwards it to the secondary
mirror server.

Secondary:

Mirror Server Filter --> TCP Rewriter
If the received packet is a TCP packet, we adjust the ack
and update the TCP checksum, then send it to the secondary
guest. Otherwise we send it directly to the guest.
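
As a rough illustration of the "adjust ack and update TCP checksum" step,
the sketch below shows 32-bit wraparound arithmetic on the ack/seq fields
and the standard Internet checksum (RFC 1071). It is not the filter-rewriter
code: the function names are hypothetical, and the idea that a single offset
(assumed here to be the difference between the secondary and primary guests'
initial sequence numbers) drives both adjustments is an assumption of the
example. The same offset is used in the opposite direction for the seq of
packets sent by the secondary guest.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def adjust_ack(ack: int, offset: int) -> int:
    """Ack for a packet going to the secondary guest (hypothetical rule)."""
    return (ack + offset) & 0xFFFFFFFF  # TCP numbers wrap at 2**32


def adjust_seq(seq: int, offset: int) -> int:
    """Seq for a packet leaving the secondary guest (hypothetical rule)."""
    return (seq - offset) & 0xFFFFFFFF
```

For a real TCP segment the checksum input is the pseudo-header followed by
the TCP header (with its checksum field zeroed) and the payload; assembling
that buffer is left out of the sketch.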

2. Guest send packet route:

Primary:

Guest --> Redirect Server Filter
The redirect server filter receives the primary guest packet
but does nothing with it; it just passes it to the next filter.

Redirect Server Filter --> COLO-Compare
COLO-compare receives the primary guest packet, then
waits for the secondary redirect packet to compare against it.
If the packets are the same, it sends the queued primary packet and
clears the queued secondary packet. Otherwise it sends the primary packet
and does a checkpoint.

COLO-Compare --> Another Redirector Filter
The redirector gets the packet from colo-compare through a
chardev socket.

Redirector Filter --> Tap
Send the packet.

Secondary:

Guest --> TCP Rewriter Filter
If the packet is a TCP packet, we adjust the seq
and update the TCP checksum, then send it to the
redirect client filter. Otherwise we send it directly to the
redirect client filter.

Redirect Client Filter --> Redirect Server Filter
Forward the packet to the primary.

== Components introduction ==

Filter-mirror is a netfilter plugin.
It gives qemu the ability to mirror
packets to a chardev.

Filter-redirector is a netfilter plugin.
It gives qemu the ability to redirect net packets.
Redirector can redirect filter's net packet to outdev,
and redirect indev's packet to filter.

                      filter
                        +
            redirector  |
               +--------------+
               |        |     |
               |        |     |
               |        |     |
  indev +---------+     +---------->  outdev
               |  |           |
               |  |           |
               |  |           |
               +--------------+
                  |
                  v
               filter

COLO-compare does the packet comparing job.
Packets coming from the primary char indev will be sent to outdev.
Packets coming from the secondary char dev will be dropped after comparing.
COLO-compare needs two input chardevs and one output chardev:
primary_in=chardev1-id (source: primary send packet)
secondary_in=chardev2-id (source: secondary send packet)
outdev=chardev3-id

Filter-rewriter rewrites some of the secondary's packets so that the
secondary guest's TCP connection can be established successfully.
This module rewrites the ack of TCP packets sent to the secondary
from the primary, and rewrites the seq of TCP packets sent to the
primary from the secondary.

== Usage ==

Here is an example using demonstration IP and port addresses to more
clearly describe the usage.

Primary(ip:3.3.3.3):
-netdev tap,id=hn0,vhost=off
-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
-chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
-chardev socket,id=compare0,host=3.3.3.3,port=9001,server=on,wait=off
-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server=on,wait=off
-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
-object iothread,id=iothread1
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0
-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out
-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0,iothread=iothread1

Secondary(ip:3.3.3.8):
-netdev tap,id=hn0,vhost=off
-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=red0,host=3.3.3.3,port=9003
-chardev socket,id=red1,host=3.3.3.3,port=9004
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1
-object filter-rewriter,id=f3,netdev=hn0,queue=all

If you want to use virtio-net-pci or another driver with vnet_header:

Primary(ip:3.3.3.3):
-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
-device e1000,id=e0,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=mirror0,host=3.3.3.3,port=9003,server=on,wait=off
-chardev socket,id=compare1,host=3.3.3.3,port=9004,server=on,wait=off
-chardev socket,id=compare0,host=3.3.3.3,port=9001,server=on,wait=off
-chardev socket,id=compare0-0,host=3.3.3.3,port=9001
-chardev socket,id=compare_out,host=3.3.3.3,port=9005,server=on,wait=off
-chardev socket,id=compare_out0,host=3.3.3.3,port=9005
-object filter-mirror,id=m0,netdev=hn0,queue=tx,outdev=mirror0,vnet_hdr_support
-object filter-redirector,netdev=hn0,id=redire0,queue=rx,indev=compare_out,vnet_hdr_support
-object filter-redirector,netdev=hn0,id=redire1,queue=rx,outdev=compare0,vnet_hdr_support
-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,outdev=compare_out0,vnet_hdr_support

Secondary(ip:3.3.3.8):
-netdev tap,id=hn0,vhost=off
-device e1000,netdev=hn0,mac=52:a4:00:12:78:66
-chardev socket,id=red0,host=3.3.3.3,port=9003
-chardev socket,id=red1,host=3.3.3.3,port=9004
-object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0,vnet_hdr_support
-object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1,vnet_hdr_support
-object filter-rewriter,id=f3,netdev=hn0,queue=all,vnet_hdr_support

Note:
    a. COLO-proxy must work with COLO-frame and Block-replication.
    b. Primary COLO must be started first, because COLO-proxy needs the
       chardev socket servers running before the secondary is started.
    c. Filter-rewriter only rewrites TCP packets.
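
The filter options in the two primary examples of the Usage section differ
only in the vnet_hdr_support suffix and in the iothread property of
colo-compare. A small helper can generate them consistently so that no
filter is left without the suffix. This is a convenience sketch only: the
function name and its parameters are hypothetical, while the option strings
are copied from the examples in this document.

```python
def primary_filter_objects(netdev="hn0", vnet_hdr=False, iothread=None):
    """Build the primary-side -object options used in the Usage examples."""
    hdr = ",vnet_hdr_support" if vnet_hdr else ""
    iot = f",iothread={iothread}" if iothread else ""
    return [
        f"-object filter-mirror,id=m0,netdev={netdev},queue=tx,outdev=mirror0{hdr}",
        f"-object filter-redirector,netdev={netdev},id=redire0,queue=rx,indev=compare_out{hdr}",
        f"-object filter-redirector,netdev={netdev},id=redire1,queue=rx,outdev=compare0{hdr}",
        # colo-compare takes the optional iothread before the vnet header flag.
        "-object colo-compare,id=comp0,primary_in=compare0-0,secondary_in=compare1,"
        f"outdev=compare_out0{iot}{hdr}",
    ]


if __name__ == "__main__":
    # Print the options for a driver that uses a vnet header.
    for option in primary_filter_objects(vnet_hdr=True):
        print(option)
```

When an iothread is requested, the matching "-object iothread,id=..." option
still has to be added to the command line separately, as in the first
primary example.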