dr_icm_pool.c - OpenGrok history log for /openbmc/linux/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm

Revision	Date	Author	Comments
Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78
# 57295e06	08-Nov-2022	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Add memory statistics for domain object Add counters for number of buddies that are currently in use per domain per buddy type (STE, MODIFY-HEADER, MODIFY-PATTERN). Signed-off-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 72b2cff6	08-Nov-2022	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Calculate sync threshold of each pool according to its type When certain ICM chunk is no longer needed, it needs to be freed. Fully freeing ICM memory involves issuing FW SYNC_STEERING command. This is very time consuming, and it is impractical to do it for every freed chunk. Instead, we manage these 'freed' chunks in hot list (list of chunks that are not required by SW any more, but HW might still access them). When size of the hot list reaches certain threshold, we purge it and issue SYNC_STEERING FW command. There is one threshold for all the different ICM types, which is not optimal, as different ICM types require different approach: STEs pool is very large, and it is very 'dynamic' in its nature, so letting hot list to become too large will result in a significant perf hiccup when purging the hot list. Modify action is much smaller and less dynamic, so we can let the hot list to grow to almost the size of the whole pool. This patch fixes this problem: instead of having same hot memory threshold for all the pools, sync operation will be triggered in accordance with the ICM type. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64
# 108ff821	29-Aug-2022	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Add modify-header-pattern ICM pool There is a new ICM area for that memory, so we need to handle it as we did for the others ICM types. The patch added that specific pool with its requirements and management. Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52
# edaea001	30-Jun-2022	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Remove the buddy used_list No need to have the used_list - we don't need to keep track of the used chunks, we only need to know the amount of used memory. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44
# 4519fc45	25-May-2022	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Keep track of hot ICM chunks in an array instead of list When ICM chunk is freed, it might still be accessed by HW until we do sync with HW. This sync is expensive operation, so we don't do it often. Instead, when the chunk is freed, it is moved to the buddy's "hot memory" list. Once sync is done, we traverse the hot list and finally free all the chunks. It appears that traversing a long list takes unusually long time due to cache misses on many entries, which causes a big "hiccup" during rule insertion. This patch deals with this issue the following way: - Move hot chunks list from buddy to pool, so that the pool will keep track of all its hot memory. - Replace the list with pre-allocated array on the memory pool struct, and store only the information that is needed to later free this chunk in its buddy allocator. This cost additional memory for the array that is dynamically allocated, but it allows not to save long list of hot chunks, so at peak times it actually saves memory due to the fact that each array entry is much smaller than the chunk struct. This way an overhead of traversing the long list is virtually removed: the loop of freeing hot chunks takes ~27 msec instead of ~70 msec, where most of it are the actual freeing activities. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
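The array-based bookkeeping described in 4519fc45 might look roughly like the following user-space sketch. It is not the driver's code: `hot_entry`, `hot_pool`, and every function name here are hypothetical, and "freeing" an entry is reduced to summing its size so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical compact record: only what is needed to free the chunk later. */
struct hot_entry {
	void *buddy;    /* owning buddy allocator (opaque in this sketch) */
	uint32_t seg;   /* segment offset inside the buddy */
	uint32_t size;  /* chunk size in bytes */
};

/* Hypothetical pool with a pre-allocated array of hot entries. */
struct hot_pool {
	struct hot_entry *hot;
	size_t hot_cap;
	size_t hot_num;
};

/* Allocate the array once, up front; returns 0 on success, -1 on failure. */
int hot_pool_init(struct hot_pool *p, size_t cap)
{
	assert(p);
	p->hot = calloc(cap, sizeof(*p->hot));
	if (!p->hot)
		return -1;
	p->hot_cap = cap;
	p->hot_num = 0;
	return 0;
}

/* Record a freed chunk as hot; returns -1 when the array is full. */
int hot_pool_add(struct hot_pool *p, void *buddy, uint32_t seg, uint32_t size)
{
	assert(p);
	if (p->hot_num == p->hot_cap)
		return -1;
	p->hot[p->hot_num++] = (struct hot_entry){ buddy, seg, size };
	return 0;
}

/*
 * Walk the contiguous array after a (hypothetical) sync and release every
 * entry. Here "release" only sums the sizes; returns the bytes released.
 */
uint64_t hot_pool_flush(struct hot_pool *p)
{
	uint64_t released = 0;
	size_t i;

	assert(p);
	for (i = 0; i < p->hot_num; i++)
		released += p->hot[i].size;
	p->hot_num = 0;
	return released;
}

void hot_pool_destroy(struct hot_pool *p)
{
	assert(p);
	free(p->hot);
	p->hot = NULL;
	p->hot_cap = 0;
	p->hot_num = 0;
}
```

The point of the layout is that the flush loop touches one contiguous allocation instead of chasing list pointers, which is the cache-miss cost the commit message attributes to the old list.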
# 133ea373	26-May-2022	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Lower sync threshold for ICM hot memory Instead of hiding the math in the code, define a value that sets the fraction of allowed hot memory of ICM pool. Set the threshold for sync of ICM hot chunks to 1/4 of the pool instead of 1/2 of the pool. Although we will have more syncs, each sync will be shorter and will help with insertion rate stability. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
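A minimal sketch of the "named fraction" idea from 133ea373, assuming a pool that tracks its total and hot byte counts. The macro name `HOT_MEM_FRACTION`, the struct, and the function are illustrative assumptions, not the driver's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed name: hot memory may reach 1/HOT_MEM_FRACTION of the pool. */
#define HOT_MEM_FRACTION 4

/* Hypothetical pool accounting, in bytes. */
struct pool_sketch {
	uint64_t total_bytes;
	uint64_t hot_bytes;
};

/* True once hot memory has reached the configured fraction of the pool. */
bool pool_sync_needed(const struct pool_sketch *pool)
{
	assert(pool);
	return pool->hot_bytes >= pool->total_bytes / HOT_MEM_FRACTION;
}
```

With a 1024-byte pool, this check stays false at 255 hot bytes and turns true at 256. Changing the fraction from 2 to 4 trades more frequent syncs for shorter ones, as the commit message states.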
# fb628b71	25-May-2022	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Allocate htbl from its own slab allocator SW steering allocates/frees lots of htbl structs. Create a separate kmem_cache and allocate htbls from this allocator. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# fd785e52	25-May-2022	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Allocate icm_chunks from their own slab allocator SW steering allocates/frees lots of icm_chunk structs. To make this more efficient, create a separate kmem_cache and allocate these chunks from this allocator. By doing this we observe that the alloc/free "hiccups" frequency has become much lower, which allows for a more steady rule insertion rate. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 06ab4a40	25-May-2022	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Initialize chunk's ste_arrays at chunk creation Rather than cleaning the corresponding chunk's section of ste_arrays on chunk deletion, initialize these areas upon chunk creation. Chunk destruction tend to come in large batches (during pool syncing). To reduce the "hiccup" in such cases, moving ste_arrays init from chunk destruction to initialization. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# d277b55f	26-May-2022	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Remove unneeded argument from dr_icm_chunk_destroy Remove an argument that can be extracted in the function. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18
# f51bb517	27-Jan-2022	Rongwei Liu <rongweil@nvidia.com>	net/mlx5: DR, Remove num_of_entries byte_size from struct mlx5_dr_icm_chunk Target to reduce the memory consumption in large scale of flow rules. They can be calculated quickly from buddy memory pool. 1. num_of_entries calls dr_icm_pool_get_chunk_num_of_entries(). 2. byte_size calls dr_icm_pool_get_chunk_byte_size(). Use chunk size in dr_icm_chunk to speed up and the one in dr_ste_htbl will be removed in the upcoming commit. This commit reduce 8 bytes from struct mlx5_dr_icm_chunk and its current size is 56 bytes. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Reviewed-by: Shun Hao <shunh@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 5c4f9b6e	27-Jan-2022	Rongwei Liu <rongweil@nvidia.com>	net/mlx5: DR, Remove icm_addr from mlx5dr_icm_chunk to reduce memory It can be calculated quickly from buddy memory pool by function mlx5dr_icm_pool_get_chunk_icm_addr(). This function is very lightweight and straightforward. Reduce 8 bytes and current size of struct mlx5_dr_icm_chunk is 64 bytes. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Reviewed-by: Shun Hao <shunh@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 003f4f9a	27-Jan-2022	Rongwei Liu <rongweil@nvidia.com>	net/mlx5: DR, Remove mr_addr rkey from struct mlx5dr_icm_chunk Reduce memory footprint by removing mr_addr and rkey from mlx5_dr_icm_chunk. 1. mr_addr is calculated by mlx5dr_icm_pool_get_chunk_mr_addr() 2. rkey is calculated by mlx5dr_icm_pool_get_chunk_rkey() The two new functions are very lightweight and straightforward. Reduce 8 bytes from struct mlx5_dr_icm_chunk, its current size is 72 bytes. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Reviewed-by: Shun Hao <shunh@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16
# ecd9c5cd	29-Dec-2021	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Fix the threshold that defines when pool sync is initiated When deciding whether to start syncing and actually free all the "hot" ICM chunks, we need to consider the type of the ICM chunks that we're dealing with. For instance, the amount of available ICM for MODIFY_ACTION is significantly lower than the usual STE ICM, so the threshold should account for that - otherwise we can deplete MODIFY_ACTION memory just by creating and deleting the same modify header action in a continuous loop. This patch replaces the hard-coded threshold with a dynamic value. Fixes: 1c58651412bb ("net/mlx5: DR, ICM memory pools sync optimization") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# e5b2bc30	23-Dec-2021	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Cache STE shadow memory During rule insertion on each ICM memory chunk we also allocate shadow memory used for management. This includes the hw_ste, dr_ste and miss list per entry. Since the scale of these allocations is large we noticed a performance hiccup that happens once malloc and free are stressed. In extreme usecases when ~1M chunks are freed at once, it might take up to 40 seconds to complete this, up to the point the kernel sees this as self-detected stall on CPU: rcu: INFO: rcu_sched self-detected stall on CPU To resolve this we will increase the reuse of shadow memory. Doing this we see that a time in the aforementioned usecase dropped from ~40 seconds to ~8-10 seconds. Fixes: 29cf8febd185 ("net/mlx5: DR, ICM pool memory allocator") Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12
# 83fec3f1	12-Oct-2021	Aharon Landau <aharonl@nvidia.com>	RDMA/mlx5: Replace struct mlx5_core_mkey by u32 key In mlx5_core and vdpa there is no use of mlx5_core_mkey members except for the key itself. As preparation for moving mlx5_core_mkey to mlx5_ib, the occurrences of struct mlx5_core_mkey in all modules except for mlx5_ib are replaced by a u32 key. Signed-off-by: Aharon Landau <aharonl@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
# d4d18848	29-Dec-2021	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Fix the threshold that defines when pool sync is initiated commit ecd9c5cd46e013659e2fad433057bad1ba66888e upstream. When deciding whether to start syncing and actually free all the "hot" ICM chunks, we need to consider the type of the ICM chunks that we're dealing with. For instance, the amount of available ICM for MODIFY_ACTION is significantly lower than the usual STE ICM, so the threshold should account for that - otherwise we can deplete MODIFY_ACTION memory just by creating and deleting the same modify header action in a continuous loop. This patch replaces the hard-coded threshold with a dynamic value. Fixes: 1c58651412bb ("net/mlx5: DR, ICM memory pools sync optimization") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# 117a5a7f	23-Dec-2021	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Cache STE shadow memory commit e5b2bc30c21139ae10f0e56989389d0bc7b7b1d6 upstream. During rule insertion on each ICM memory chunk we also allocate shadow memory used for management. This includes the hw_ste, dr_ste and miss list per entry. Since the scale of these allocations is large we noticed a performance hiccup that happens once malloc and free are stressed. In extreme usecases when ~1M chunks are freed at once, it might take up to 40 seconds to complete this, up to the point the kernel sees this as self-detected stall on CPU: rcu: INFO: rcu_sched self-detected stall on CPU To resolve this we will increase the reuse of shadow memory. Doing this we see that a time in the aforementioned usecase dropped from ~40 seconds to ~8-10 seconds. Fixes: 29cf8febd185 ("net/mlx5: DR, ICM pool memory allocator") Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Revision tags: v5.14.11, v5.14.10, v5.14.9, v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10
# 284836d9	14-Sep-2020	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Free unused buddy ICM memory Track buddy's used ICM memory, and free it if all of the buddy's memory became unused. Do this only for STEs. MODIFY_ACTION buddies are much smaller, so in case there is a large amount of modify_header actions, which result in large amount of MODIFY_ACTION buddies, doing this cleanup during sync will result in performance hit while not freeing significant amount of memory. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 1c586514	14-Sep-2020	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, ICM memory pools sync optimization Track the pool's hot ICM memory when freeing/allocating chunk, so that when checking if the sync is required, just check if the pool hot memory has reached the sync threshold. Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# 3eb1006a	14-Sep-2020	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Sync chunks only during free When freeing chunks, we want to sync the steering so that all the "hot" memory will be written to ICM and all the chunks that are in the hot_list will be actually destroyed. When allocating from the pool, we don't have a need to sync the steering, as we're not freeing anything, and sync might just hurt the performance in terms of flow-per-second offloaded. Signed-off-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
# a00cd878	14-Sep-2020	Yevgeny Kliteynik <kliteyn@nvidia.com>	net/mlx5: DR, Handle ICM memory via buddy allocation instead of buckets Till now in order to manage the ICM memory we used bucket mechanism, which kept a bucket per specified size (sizes were between 1 block to 2^21 blocks). Now changing that with buddy-system mechanism, which gives us much more flexible way to manage the ICM memory. Its biggest advantage over the bucket is by using the same ICM memory area for all the sizes of blocks, which reduces the memory consumption. Signed-off-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Revision tags: v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36
# dff8e2d1	24-Apr-2020	Erez Shitrit <erezsh@mellanox.com>	net/mlx5: Use aligned variable while allocating ICM memory The alignment value is part of the input structure, so use it and spare extra memory allocation when is not needed. Now, using the new ability when allocating icm for Direct-Rule insertion. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Revision tags: v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11
# b7d0db55	12-Jan-2020	Erez Shitrit <erezsh@mellanox.com>	net/mlx5: DR, Improve log messages Few print messages are in debug level where they should be in error, and few messages are missing. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Revision tags: v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3
# 8b6b82ad	02-Oct-2019	Michal Kubecek <mkubecek@suse.cz>	mlx5: avoid 64-bit division in dr_icm_pool_mr_create() Recently added code introduces 64-bit division in dr_icm_pool_mr_create() so that build on 32-bit architectures fails with ERROR: "__umoddi3" [drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko] undefined! As the divisor is always a power of 2, we can use bitwise operation instead. Fixes: 29cf8febd185 ("net/mlx5: DR, ICM pool memory allocator") Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
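The bitwise replacement mentioned in 8b6b82ad relies on a general identity: for a power-of-two divisor, `x % d` equals `x & (d - 1)`, so no 64-bit division helper such as `__umoddi3` is needed on 32-bit builds. The sketch below illustrates that identity in plain C; the helper names are hypothetical and this is not the code from dr_icm_pool_mr_create().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when v is a non-zero power of two. */
bool is_pow2(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Remainder of addr / align using a mask; valid only for power-of-two align. */
uint64_t mod_pow2(uint64_t addr, uint64_t align)
{
	assert(is_pow2(align));
	return addr & (align - 1);
}

/* Bytes to add to addr to reach the next align boundary (0 if aligned). */
uint64_t align_pad(uint64_t addr, uint64_t align)
{
	uint64_t rem = mod_pow2(addr, align);

	return rem ? align - rem : 0;
}
```

For example, `mod_pow2(0x1234, 0x1000)` yields `0x234` and `align_pad(0x1234, 0x1000)` yields `0xdcc`. The mask form compiles to a single AND on any architecture, whereas a `%` on `u64` operands pulls in a libgcc routine that the kernel does not provide on 32-bit targets.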