.. SPDX-License-Identifier: GPL-2.0

====
XFRM
====

The sync patches work is based on initial patches from
Krisztian <hidden@balabit.hu> and others and additional patches
from Jamal <hadi@cyberus.ca>.

The end goal for syncing is to be able to insert attributes + generate
events so that the SA can be safely moved from one machine to another
for HA purposes.
The idea is to synchronize the SA so that the takeover machine can do
the processing of the SA as accurately as possible if it has access to it.

We already have the ability to generate SA add/del/upd events.
These patches add the ability to sync accurate lifetime byte counters (to
ensure proper decay of SAs) and replay counters (to avoid replay attacks)
with minimal loss at failover time.
This way a backup stays as closely up-to-date as an active member.

Because the above items change for every packet the SA receives,
it is possible for a lot of events to be generated.
For this reason, we also add a nagle-like algorithm to restrict
the events, i.e. we set thresholds that say "let me
know if the replay sequence threshold is reached or 10 secs have passed".
These thresholds are set system-wide via sysctls or can be updated
per SA.
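
The throttling rule described above can be sketched as a small C helper.
This is only an illustration of the "whichever threshold is hit first" idea,
not the kernel's implementation; the structure ``aevent_throttle``, its
fields and the function ``aevent_due()`` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-SA bookkeeping; field names are illustrative only. */
struct aevent_throttle {
    uint32_t last_reported_seq; /* replay sequence sent in the last event */
    uint64_t last_event_ms;     /* time of the last event, in milliseconds */
    uint32_t seq_threshold;     /* replay sequence threshold, in packets */
    uint64_t timer_ms;          /* maximum quiet time between events */
};

/*
 * Returns true when either threshold has been reached and records the
 * reported state, so that the next event is measured from this one.
 */
static bool aevent_due(struct aevent_throttle *t, uint32_t cur_seq,
                       uint64_t now_ms)
{
    bool seq_hit = (uint32_t)(cur_seq - t->last_reported_seq) >= t->seq_threshold;
    bool time_hit = (now_ms - t->last_event_ms) >= t->timer_ms;

    if (!seq_hit && !time_hit)
        return false;

    t->last_reported_seq = cur_seq;
    t->last_event_ms = now_ms;
    return true;
}
```

With a threshold of two packets and a 10 second timer, the second packet
produces an event, and so does a check made 10 seconds after the last event
if the sequence has not advanced far enough in the meantime.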

The identified items that need to be synchronized are:

- the lifetime byte counter.
  Note that the lifetime time limit is not important if you assume the
  failover machine is known ahead of time, since the decay of the time
  countdown is not driven by packet arrival.
- the replay sequence for both inbound and outbound

1) Message Structure
----------------------

nlmsghdr:aevent_id:optional-TLVs.

The netlink message types are:

XFRM_MSG_NEWAE and XFRM_MSG_GETAE.

An XFRM_MSG_GETAE does not have TLVs.

An XFRM_MSG_NEWAE will have at least two TLVs (as is
discussed further below).

aevent_id structure looks like::

   struct xfrm_aevent_id {
        struct xfrm_usersa_id           sa_id;
        xfrm_address_t                  saddr;
        __u32                           flags;
        __u32                           reqid;
   };

The unique SA is identified by the combination of xfrm_usersa_id,
reqid and saddr.

flags are used to indicate different things. The possible
flags are::

        XFRM_AE_RTHR=1, /* replay threshold */
        XFRM_AE_RVAL=2, /* replay value */
        XFRM_AE_LVAL=4, /* lifetime value */
        XFRM_AE_ETHR=8, /* expiry timer threshold */
        XFRM_AE_CR=16, /* Event cause is replay update */
        XFRM_AE_CE=32, /* Event cause is timer expiry */
        XFRM_AE_CU=64, /* Event cause is policy update */

How these flags are used is dependent on the direction of the
message (kernel<->user) as well as the cause (config, query or event).
This is described below in the different messages.

The pid will be set appropriately in netlink to recognize direction
(0 to the kernel and pid = processid that created the event
when going from kernel to user space).

A program needs to subscribe to multicast group XFRMNLGRP_AEVENTS
to get notified of these events.
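
A minimal sketch of such a subscription, assuming a Linux system with the
kernel UAPI headers installed. The helper name ``aevent_subscribe()`` is
hypothetical; the socket calls and constants come from ``<sys/socket.h>``,
``<linux/netlink.h>`` and ``<linux/xfrm.h>``. Joining the group typically
requires sufficient privileges, so the call may fail for an ordinary user.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/xfrm.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270 /* fallback for C libraries that do not define it */
#endif

/*
 * Opens a NETLINK_XFRM socket and joins XFRMNLGRP_AEVENTS.
 * Returns the socket descriptor, or -1 with errno set on failure.
 */
static int aevent_subscribe(void)
{
    struct sockaddr_nl local = { .nl_family = AF_NETLINK };
    int group = XFRMNLGRP_AEVENTS;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_XFRM);

    if (fd < 0)
        return -1;

    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0 ||
        setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
                   &group, sizeof(group)) < 0) {
        int saved = errno; /* keep the real error across close() */

        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}
```

After a successful call, the descriptor can be read with ``recv()`` to
receive XFRM_MSG_NEWAE notifications as they are generated.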

2) TLVs reflect the different parameters:
-----------------------------------------

a) byte value (XFRMA_LTIME_VAL)

This TLV carries the running/current counter for byte lifetime since
last event.

b) replay value (XFRMA_REPLAY_VAL)

This TLV carries the running/current counter for replay sequence since
last event.

c) replay threshold (XFRMA_REPLAY_THRESH)

This TLV carries the threshold being used by the kernel to trigger events
when the replay sequence is exceeded.

d) expiry timer (XFRMA_ETIMER_THRESH)

This is a timer value in milliseconds which is used as the nagle
value to rate limit the events.
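
To illustrate how a receiver might locate one of these TLVs in a message
payload, here is a self-contained sketch of a TLV walk. It assumes the usual
netlink attribute layout (a 16-bit length covering header and payload, a
16-bit type, and padding to a 4-byte boundary). ``struct tlv_hdr`` and
``tlv_find()`` are hypothetical stand-ins; real code would normally use the
attribute macros from the kernel headers instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Local stand-in for the netlink attribute header: a 16-bit total length
 * (header plus payload) followed by a 16-bit type.
 */
struct tlv_hdr {
    uint16_t len;
    uint16_t type;
};

/* Attributes are padded to a multiple of 4 bytes. */
#define TLV_ALIGN(n) (((size_t)(n) + 3) & ~(size_t)3)

/*
 * Returns a pointer to the payload of the first TLV of the given type and
 * stores the payload length in *out_len. Returns NULL if the TLV is absent
 * or the buffer is malformed.
 */
static const void *tlv_find(const void *buf, size_t buflen, uint16_t type,
                            size_t *out_len)
{
    const unsigned char *p = buf;

    while (buflen >= sizeof(struct tlv_hdr)) {
        struct tlv_hdr h;
        size_t step;

        memcpy(&h, p, sizeof(h)); /* avoid unaligned access */
        if (h.len < sizeof(h) || h.len > buflen)
            return NULL;
        if (h.type == type) {
            *out_len = h.len - sizeof(h);
            return p + sizeof(h);
        }
        step = TLV_ALIGN(h.len);
        if (step > buflen)
            break;
        p += step;
        buflen -= step;
    }
    return NULL;
}
```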

3) Default configurations for the parameters:
---------------------------------------------

By default these events should be turned off unless there is
at least one listener registered to listen to the multicast
group XFRMNLGRP_AEVENTS.

Programs installing SAs will need to specify the two thresholds.
However, so that existing applications such as racoon need not change,
we also provide default threshold values for these parameters
in case they are not specified.

The two sysctls/proc entries are:

a) /proc/sys/net/core/sysctl_xfrm_aevent_etime
   used to provide default values for the XFRMA_ETIMER_THRESH in incremental
   units of time of 100ms. The default is 10 (1 second).

b) /proc/sys/net/core/sysctl_xfrm_aevent_rseqth
   used to provide default values for XFRMA_REPLAY_THRESH parameter
   in incremental packet count. The default is two packets.
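
As a small worked example of the units involved, the conversion from the
etime sysctl value to milliseconds can be written as follows. The macro and
function names are hypothetical and only restate the 100ms unit given above.

```c
#include <stdint.h>

/* One etime unit is 100 ms, as stated in the text above. */
#define AEVENT_ETIME_UNIT_MS 100u

/* Converts an etime sysctl value (units of 100 ms) to milliseconds. */
static uint32_t aevent_etime_to_ms(uint32_t etime_units)
{
    return etime_units * AEVENT_ETIME_UNIT_MS;
}
```

For the default value of 10 this gives 1000 ms, which matches the
"1 second" stated for the default.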

4) Message types
----------------

a) XFRM_MSG_GETAE issued by user-->kernel.
   XFRM_MSG_GETAE does not carry any TLVs.

The response is an XFRM_MSG_NEWAE which is formatted based on what
XFRM_MSG_GETAE queried for.

The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

* if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved
* if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved
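
A sketch of how user space might assemble such a query, assuming the kernel
UAPI headers ``<linux/netlink.h>`` and ``<linux/xfrm.h>`` are available.
The helper ``build_getae()`` is a hypothetical name; it only fills a
caller-supplied, suitably aligned buffer and leaves sending the message to
the caller.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/xfrm.h>

/*
 * Fills buf with an XFRM_MSG_GETAE request for the SA described by id and
 * returns the message length, or 0 if buf is too small. want_flags selects
 * the optional threshold TLVs (XFRM_AE_RTHR and/or XFRM_AE_ETHR).
 * The message consists of nlmsghdr plus xfrm_aevent_id and carries no TLVs.
 */
static size_t build_getae(void *buf, size_t buflen,
                          const struct xfrm_aevent_id *id,
                          __u32 want_flags, __u32 seq)
{
    size_t len = NLMSG_LENGTH(sizeof(*id));
    struct nlmsghdr *nlh = buf;
    struct xfrm_aevent_id *req;

    if (buflen < NLMSG_ALIGN(len))
        return 0;

    memset(buf, 0, NLMSG_ALIGN(len));
    nlh->nlmsg_len = (__u32)len;
    nlh->nlmsg_type = XFRM_MSG_GETAE;
    nlh->nlmsg_flags = NLM_F_REQUEST;
    nlh->nlmsg_seq = seq;
    nlh->nlmsg_pid = 0; /* 0 addresses the kernel */

    req = NLMSG_DATA(nlh);
    *req = *id;
    req->flags = want_flags;
    return len;
}
```

The caller fills in ``sa_id`` (destination address, SPI, family and
protocol), ``saddr`` and ``reqid`` to identify the SA before calling the
helper.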

b) XFRM_MSG_NEWAE is issued by either user space to configure
   or kernel to announce events or respond to an XFRM_MSG_GETAE.

i) user --> kernel to configure a specific SA.

Any of the values or threshold parameters can be updated by passing the
appropriate TLV.

A response is issued back to the sender in user space to indicate success
or failure.

On success, an XFRM_MSG_NEWAE event is also issued to any listeners,
as described in iii).

ii) kernel->user direction as a response to XFRM_MSG_GETAE

The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

The threshold TLVs will be included if explicitly requested in
the XFRM_MSG_GETAE message.
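
The rule for which TLVs a reply carries can be expressed as a small
function. This is a self-contained sketch: the ``AE_*`` values are copied
from the flag list in section 1, while the ``TLV_*`` bitmask and the
function ``expected_reply_tlvs()`` are hypothetical names introduced only
for the illustration.

```c
#include <stdint.h>

/* Flag values copied from the flag list in section 1. */
enum { AE_RTHR = 1, AE_ETHR = 8 };

/* Hypothetical bitmask naming the four TLVs of section 2. */
enum {
    TLV_LTIME_VAL     = 1 << 0,
    TLV_REPLAY_VAL    = 1 << 1,
    TLV_REPLAY_THRESH = 1 << 2,
    TLV_ETIMER_THRESH = 1 << 3,
};

/* Returns the set of TLVs expected in the reply to a GETAE query. */
static unsigned int expected_reply_tlvs(uint32_t getae_flags)
{
    /* The two value TLVs are always present. */
    unsigned int tlvs = TLV_LTIME_VAL | TLV_REPLAY_VAL;

    /* Threshold TLVs appear only when explicitly requested. */
    if (getae_flags & AE_RTHR)
        tlvs |= TLV_REPLAY_THRESH;
    if (getae_flags & AE_ETHR)
        tlvs |= TLV_ETIMER_THRESH;
    return tlvs;
}
```

A receiver could use such a mask to check that a reply contains everything
it asked for before acting on it.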

iii) kernel->user to report as event if someone sets any values or
     thresholds for an SA using XFRM_MSG_NEWAE (as described in #i above).
     In such a case the XFRM_AE_CU flag is set to inform the user that
     the change happened as a result of an update.
     The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

iv) kernel->user to report event when replay threshold or a timeout
    is exceeded.

In such a case either XFRM_AE_CR (replay exceeded) or XFRM_AE_CE (timeout
happened) is set to inform the user what happened.
Note that the two flags are mutually exclusive.
The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
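
A listener can classify an incoming XFRM_MSG_NEWAE by its cause flags.
The sketch below is self-contained: the ``AE_C*`` values are copied from
the flag list in section 1, and ``enum ae_cause`` and ``aevent_cause()``
are hypothetical names. Treating any combination of cause flags as invalid
is an assumption of this sketch; the text only states that XFRM_AE_CR and
XFRM_AE_CE are mutually exclusive.

```c
#include <stdint.h>

/* Cause flag values copied from the flag list in section 1. */
enum { AE_CR = 16, AE_CE = 32, AE_CU = 64 };

enum ae_cause {
    AE_CAUSE_NONE,    /* no cause flag, e.g. a reply to a query */
    AE_CAUSE_REPLAY,  /* replay threshold exceeded */
    AE_CAUSE_TIMEOUT, /* timer expired */
    AE_CAUSE_UPDATE,  /* values or thresholds were updated */
    AE_CAUSE_INVALID  /* more than one cause flag set */
};

/* Maps the flags field of an aevent_id to a single cause. */
static enum ae_cause aevent_cause(uint32_t flags)
{
    switch (flags & (AE_CR | AE_CE | AE_CU)) {
    case 0:
        return AE_CAUSE_NONE;
    case AE_CR:
        return AE_CAUSE_REPLAY;
    case AE_CE:
        return AE_CAUSE_TIMEOUT;
    case AE_CU:
        return AE_CAUSE_UPDATE;
    default:
        return AE_CAUSE_INVALID;
    }
}
```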

Exceptions to threshold settings
--------------------------------

If you have an SA that is getting hit by traffic in bursts such that
there is a period where the timer threshold expires with no packets
seen, then an odd behavior is seen as follows:
The first packet arrival after a timer expiry will trigger a timeout
event; i.e. we don't wait for a timeout period or a packet threshold
to be reached. This is done for simplicity and efficiency reasons.
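
The behavior can be modelled with a tiny state machine. This is a
simplified sketch rather than the kernel code: it assumes that a timer
expiry with no traffic since the last event produces no event itself and is
merely remembered, so that the next packet reports it. All names
(``struct sa_model``, ``on_timer()``, ``on_packet()``) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum ae_event { AE_EV_NONE, AE_EV_REPLAY, AE_EV_TIMEOUT };

struct sa_model {
    uint32_t seq;              /* current replay sequence */
    uint32_t reported_seq;     /* sequence carried by the last event */
    uint32_t seq_threshold;    /* replay threshold, in packets */
    bool     timeout_deferred; /* timer fired while nothing had changed */
};

/*
 * Timer expiry: report only if the sequence moved since the last event,
 * otherwise remember that a timeout event is owed.
 */
static enum ae_event on_timer(struct sa_model *sa)
{
    if (sa->seq == sa->reported_seq) {
        sa->timeout_deferred = true;
        return AE_EV_NONE;
    }
    sa->reported_seq = sa->seq;
    return AE_EV_TIMEOUT;
}

/*
 * Packet arrival: a deferred timeout is reported immediately, without
 * waiting for the packet threshold or another timer period.
 */
static enum ae_event on_packet(struct sa_model *sa)
{
    sa->seq++;
    if (sa->timeout_deferred) {
        sa->timeout_deferred = false;
        sa->reported_seq = sa->seq;
        return AE_EV_TIMEOUT;
    }
    if (sa->seq - sa->reported_seq >= sa->seq_threshold) {
        sa->reported_seq = sa->seq;
        return AE_EV_REPLAY;
    }
    return AE_EV_NONE;
}
```

In this model, a burst followed by an idle timer expiry and then a single
packet yields a timeout event on that packet, even though the packet
threshold has not been reached.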

-JHS