.. SPDX-License-Identifier: GPL-2.0

====
XFRM
====

The sync patches work is based on initial patches from
Krisztian <hidden@balabit.hu> and others and additional patches
from Jamal <hadi@cyberus.ca>.

The end goal for syncing is to be able to insert attributes + generate
events so that the SA can be safely moved from one machine to another
for HA purposes.
The idea is to synchronize the SA so that the takeover machine can do
the processing of the SA as accurately as possible if it has access to it.

We already have the ability to generate SA add/del/upd events.
These patches add the ability to sync and have accurate lifetime byte (to
ensure proper decay of SAs) and replay counters to avoid replay attacks
with minimal loss at failover time.
This way a backup stays as closely up-to-date as an active member.

Because the above items change for every packet the SA receives,
it is possible for a lot of the events to be generated.
For this reason, we also add a nagle-like algorithm to restrict
the events.
That is, we are going to set thresholds to say "let me
know if the replay sequence threshold is reached or 10 secs have passed".
These thresholds are set system-wide via sysctls or can be updated
per SA.

The identified items that need to be synchronized are:

- the lifetime byte counter

  Note that the lifetime time limit is not important if you assume the
  failover machine is known ahead of time, since the decay of the time
  countdown is not driven by packet arrival.
- the replay sequence for both inbound and outbound

1) Message Structure
----------------------

nlmsghdr:aevent_id:optional-TLVs.

The netlink message types are:

XFRM_MSG_NEWAE and XFRM_MSG_GETAE.

An XFRM_MSG_GETAE does not have TLVs.

An XFRM_MSG_NEWAE will have at least two TLVs (as is
discussed further below).

aevent_id structure looks like::

   struct xfrm_aevent_id {
        struct xfrm_usersa_id           sa_id;
        xfrm_address_t                  saddr;
        __u32                           flags;
        __u32                           reqid;
   };

The unique SA is identified by the combination of xfrm_usersa_id,
reqid and saddr.

flags are used to indicate different things. The possible
flags are::

        XFRM_AE_RTHR=1, /* replay threshold */
        XFRM_AE_RVAL=2, /* replay value */
        XFRM_AE_LVAL=4, /* lifetime value */
        XFRM_AE_ETHR=8, /* expiry timer threshold */
        XFRM_AE_CR=16, /* Event cause is replay update */
        XFRM_AE_CE=32, /* Event cause is timer expiry */
        XFRM_AE_CU=64, /* Event cause is policy update */

How these flags are used is dependent on the direction of the
message (kernel<->user) as well as the cause (config, query or event).
This is described below in the different messages.
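
As a rough illustration of how a listener might interpret the cause bits,
the sketch below classifies a ``flags`` word. The flag values are copied from
the list above so that the example compiles on its own; a real program would
take them from ``<linux/xfrm.h>``. The names ``enum ae_cause`` and
``aevent_cause`` are hypothetical and not part of any kernel or library API,
and treating more than one cause bit as "mixed" is an assumption of this
sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the flag list above (normally from <linux/xfrm.h>). */
enum {
    XFRM_AE_RTHR = 1,   /* replay threshold */
    XFRM_AE_RVAL = 2,   /* replay value */
    XFRM_AE_LVAL = 4,   /* lifetime value */
    XFRM_AE_ETHR = 8,   /* expiry timer threshold */
    XFRM_AE_CR   = 16,  /* event cause is replay update */
    XFRM_AE_CE   = 32,  /* event cause is timer expiry */
    XFRM_AE_CU   = 64,  /* event cause is policy update */
};

/* The cause bits must not overlap, or the switch below is meaningless. */
static_assert((XFRM_AE_CR & XFRM_AE_CE) == 0 && (XFRM_AE_CR & XFRM_AE_CU) == 0 &&
              (XFRM_AE_CE & XFRM_AE_CU) == 0, "cause bits must be distinct");

/* Hypothetical classification of the cause carried in aevent_id.flags. */
enum ae_cause {
    AE_CAUSE_NONE,      /* no cause bit: a config request or a query reply */
    AE_CAUSE_REPLAY,    /* XFRM_AE_CR: replay threshold was exceeded */
    AE_CAUSE_TIMER,     /* XFRM_AE_CE: timer expired */
    AE_CAUSE_UPDATE,    /* XFRM_AE_CU: someone updated the SA's values */
    AE_CAUSE_MIXED,     /* more than one cause bit set (assumed unexpected) */
};

static enum ae_cause aevent_cause(uint32_t flags)
{
    /* Mask out the value/threshold selector bits, keep only the cause. */
    switch (flags & (XFRM_AE_CR | XFRM_AE_CE | XFRM_AE_CU)) {
    case 0:
        return AE_CAUSE_NONE;
    case XFRM_AE_CR:
        return AE_CAUSE_REPLAY;
    case XFRM_AE_CE:
        return AE_CAUSE_TIMER;
    case XFRM_AE_CU:
        return AE_CAUSE_UPDATE;
    default:
        return AE_CAUSE_MIXED;
    }
}
```

A listener could call ``aevent_cause(id->flags)`` on each received
XFRM_MSG_NEWAE to decide whether it is looking at a replay event, a timeout
event, or an update notification.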

The pid will be set appropriately in netlink to recognize direction
(0 to the kernel and pid = processid that created the event
when going from kernel to user space).

A program needs to subscribe to multicast group XFRMNLGRP_AEVENTS
to get notified of these events.

2) TLVs reflect the different parameters:
-----------------------------------------

a) byte value (XFRMA_LTIME_VAL)

This TLV carries the running/current counter for byte lifetime since
last event.

b) replay value (XFRMA_REPLAY_VAL)

This TLV carries the running/current counter for replay sequence since
last event.

c) replay threshold (XFRMA_REPLAY_THRESH)

This TLV carries the threshold being used by the kernel to trigger events
when the replay sequence is exceeded.

d) expiry timer (XFRMA_ETIMER_THRESH)

This is a timer value in milliseconds which is used as the nagle
value to rate limit the events.
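
The TLVs above follow the usual netlink attribute layout: a 4-byte header
holding a 16-bit length (header included) and a 16-bit type, then the
payload, padded to a 4-byte boundary. A minimal sketch of locating one TLV
in the attribute area that follows ``struct xfrm_aevent_id`` might look like
the following. ``struct tlv_hdr``, ``TLV_ALIGN`` and ``tlv_find`` are
illustrative names that mirror ``struct nlattr`` and ``NLA_ALIGN`` from
``<linux/netlink.h>``; a real program would more likely use those
definitions, or a netlink helper library, and pass the type constants from
``<linux/xfrm.h>``.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct nlattr: length includes this 4-byte header. */
struct tlv_hdr {
    uint16_t len;
    uint16_t type;
};

/* Attributes are padded so that the next one starts on a 4-byte boundary. */
#define TLV_ALIGN(n) (((size_t)(n) + 3) & ~(size_t)3)

/*
 * Hypothetical helper: find the first TLV of the given type in buf.
 * Returns a pointer to its payload and stores the payload size in
 * *payload_len, or returns NULL if the TLV is absent or the buffer is
 * malformed.
 */
static const void *tlv_find(const void *buf, size_t buflen, uint16_t type,
                            size_t *payload_len)
{
    const unsigned char *p = buf;

    assert(buf != NULL && payload_len != NULL);

    while (buflen >= sizeof(struct tlv_hdr)) {
        struct tlv_hdr h;
        size_t step;

        memcpy(&h, p, sizeof(h));        /* avoid unaligned reads */
        if (h.len < sizeof(h) || (size_t)h.len > buflen)
            return NULL;                 /* truncated or corrupt TLV */
        if (h.type == type) {
            *payload_len = h.len - sizeof(h);
            return p + sizeof(h);
        }
        step = TLV_ALIGN(h.len);         /* skip payload plus padding */
        if (step >= buflen)
            break;                       /* that was the last TLV */
        p += step;
        buflen -= step;
    }
    return NULL;
}
```

For instance, calling ``tlv_find()`` with ``XFRMA_REPLAY_THRESH`` on the
attribute area of an XFRM_MSG_NEWAE would be expected to return NULL unless
the threshold was requested or reported, since only XFRMA_LTIME_VAL and
XFRMA_REPLAY_VAL are always present.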

3) Default configurations for the parameters:
---------------------------------------------

By default these events should be turned off unless there is
at least one listener registered to listen to the multicast
group XFRMNLGRP_AEVENTS.

Programs installing SAs will need to specify the two thresholds; however,
in order not to change existing applications such as racoon,
we also provide default threshold values for these different parameters
in case they are not specified.

The two sysctls/proc entries are:

a) /proc/sys/net/core/sysctl_xfrm_aevent_etime
   used to provide default values for the XFRMA_ETIMER_THRESH in incremental
   units of time of 100ms. The default is 10 (1 second).

b) /proc/sys/net/core/sysctl_xfrm_aevent_rseqth
   used to provide default values for XFRMA_REPLAY_THRESH parameter
   in incremental packet count. The default is two packets.

4) Message types
----------------

a) XFRM_MSG_GETAE issued by user-->kernel.
   XFRM_MSG_GETAE does not carry any TLVs.

The response is an XFRM_MSG_NEWAE which is formatted based on what
XFRM_MSG_GETAE queried for.

The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

* if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved
* if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved

b) XFRM_MSG_NEWAE is issued by either user space to configure
   or kernel to announce events or respond to an XFRM_MSG_GETAE.

i) user --> kernel to configure a specific SA.

Any of the values or threshold parameters can be updated by passing the
appropriate TLV.

A response is issued back to the sender in user space to indicate success
or failure.

In the case of success, an event with
XFRM_MSG_NEWAE is also issued to any listeners as described in iii).

ii) kernel->user direction as a response to XFRM_MSG_GETAE

The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

The threshold TLVs will be included if explicitly requested in
the XFRM_MSG_GETAE message.

iii) kernel->user to report as event if someone sets any values or
     thresholds for an SA using XFRM_MSG_NEWAE (as described in #i above).
     In such a case XFRM_AE_CU flag is set to inform the user that
     the change happened as a result of an update.
     The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

iv) kernel->user to report event when replay threshold or a timeout
    is exceeded.

In such a case either XFRM_AE_CR (replay exceeded) or XFRM_AE_CE (timeout
happened) is set to inform the user what happened.
Note the two flags are mutually exclusive.
The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

Exceptions to threshold settings
--------------------------------

If you have an SA that is getting hit by traffic in bursts such that
there is a period where the timer threshold expires with no packets
seen, then an odd behavior is seen as follows:
The first packet arrival after a timer expiry will trigger a timeout
event; i.e. we don't wait for a timeout period or a packet threshold
to be reached. This is done for simplicity and efficiency reasons.

-JHS