/openbmc/linux/drivers/gpu/drm/radeon/ |
H A D | r300.c | diff 9e5acbc21380ca8bd470ec23c139dd4e245a221d Wed May 20 06:02:37 CDT 2015 Denys Vlasenko <dvlasenk@redhat.com> radeon: Deinline indirect register accessor functions
This patch deinlines indirect register accessor functions.
These functions perform two mmio accesses, framed by spin lock/unlock. Spin lock/unlock by itself takes more than 50 cycles in ideal case (if lock is exclusively cached on current CPU).
With this .config: http://busybox.net/~vda/kernel_config, after uninlining these functions have sizes and callsite counts as follows:
r600_uvd_ctx_rreg: 111 bytes, 4 callsites r600_uvd_ctx_wreg: 113 bytes, 5 callsites eg_pif_phy0_rreg: 106 bytes, 13 callsites eg_pif_phy0_wreg: 108 bytes, 13 callsites eg_pif_phy1_rreg: 107 bytes, 13 callsites eg_pif_phy1_wreg: 108 bytes, 13 callsites rv370_pcie_rreg: 111 bytes, 21 callsites rv370_pcie_wreg: 113 bytes, 24 callsites r600_rcu_rreg: 111 bytes, 16 callsites r600_rcu_wreg: 113 bytes, 25 callsites cik_didt_rreg: 106 bytes, 10 callsites cik_didt_wreg: 107 bytes, 10 callsites tn_smc_rreg: 106 bytes, 126 callsites tn_smc_wreg: 107 bytes, 116 callsites eg_cg_rreg: 107 bytes, 20 callsites eg_cg_wreg: 108 bytes, 52 callsites
Functions r100_mm_rreg() and r100_mm_rreg() have a fast path and a locked (slow) path. This patch deinlines only slow path.
r100_mm_rreg_slow: 78 bytes, 2083 callsites r100_mm_wreg_slow: 81 bytes, 3570 callsites
Reduction in code size is more than 65,000 bytes:
text data bss dec hex filename 85740176 22294680 20627456 128662312 7ab3b28 vmlinux.before 85674192 22294776 20627456 128598664 7aa4288 vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
H A D | cik.c | diff 9e5acbc21380ca8bd470ec23c139dd4e245a221d Wed May 20 06:02:37 CDT 2015 Denys Vlasenko <dvlasenk@redhat.com> radeon: Deinline indirect register accessor functions
This patch deinlines indirect register accessor functions.
These functions perform two mmio accesses, framed by spin lock/unlock. Spin lock/unlock by itself takes more than 50 cycles in ideal case (if lock is exclusively cached on current CPU).
With this .config: http://busybox.net/~vda/kernel_config, after uninlining these functions have sizes and callsite counts as follows:
r600_uvd_ctx_rreg: 111 bytes, 4 callsites r600_uvd_ctx_wreg: 113 bytes, 5 callsites eg_pif_phy0_rreg: 106 bytes, 13 callsites eg_pif_phy0_wreg: 108 bytes, 13 callsites eg_pif_phy1_rreg: 107 bytes, 13 callsites eg_pif_phy1_wreg: 108 bytes, 13 callsites rv370_pcie_rreg: 111 bytes, 21 callsites rv370_pcie_wreg: 113 bytes, 24 callsites r600_rcu_rreg: 111 bytes, 16 callsites r600_rcu_wreg: 113 bytes, 25 callsites cik_didt_rreg: 106 bytes, 10 callsites cik_didt_wreg: 107 bytes, 10 callsites tn_smc_rreg: 106 bytes, 126 callsites tn_smc_wreg: 107 bytes, 116 callsites eg_cg_rreg: 107 bytes, 20 callsites eg_cg_wreg: 108 bytes, 52 callsites
Functions r100_mm_rreg() and r100_mm_rreg() have a fast path and a locked (slow) path. This patch deinlines only slow path.
r100_mm_rreg_slow: 78 bytes, 2083 callsites r100_mm_wreg_slow: 81 bytes, 3570 callsites
Reduction in code size is more than 65,000 bytes:
text data bss dec hex filename 85740176 22294680 20627456 128662312 7ab3b28 vmlinux.before 85674192 22294776 20627456 128598664 7aa4288 vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
H A D | ni.c | diff 9e5acbc21380ca8bd470ec23c139dd4e245a221d Wed May 20 06:02:37 CDT 2015 Denys Vlasenko <dvlasenk@redhat.com> radeon: Deinline indirect register accessor functions
This patch deinlines indirect register accessor functions.
These functions perform two mmio accesses, framed by spin lock/unlock. Spin lock/unlock by itself takes more than 50 cycles in ideal case (if lock is exclusively cached on current CPU).
With this .config: http://busybox.net/~vda/kernel_config, after uninlining these functions have sizes and callsite counts as follows:
r600_uvd_ctx_rreg: 111 bytes, 4 callsites r600_uvd_ctx_wreg: 113 bytes, 5 callsites eg_pif_phy0_rreg: 106 bytes, 13 callsites eg_pif_phy0_wreg: 108 bytes, 13 callsites eg_pif_phy1_rreg: 107 bytes, 13 callsites eg_pif_phy1_wreg: 108 bytes, 13 callsites rv370_pcie_rreg: 111 bytes, 21 callsites rv370_pcie_wreg: 113 bytes, 24 callsites r600_rcu_rreg: 111 bytes, 16 callsites r600_rcu_wreg: 113 bytes, 25 callsites cik_didt_rreg: 106 bytes, 10 callsites cik_didt_wreg: 107 bytes, 10 callsites tn_smc_rreg: 106 bytes, 126 callsites tn_smc_wreg: 107 bytes, 116 callsites eg_cg_rreg: 107 bytes, 20 callsites eg_cg_wreg: 108 bytes, 52 callsites
Functions r100_mm_rreg() and r100_mm_rreg() have a fast path and a locked (slow) path. This patch deinlines only slow path.
r100_mm_rreg_slow: 78 bytes, 2083 callsites r100_mm_wreg_slow: 81 bytes, 3570 callsites
Reduction in code size is more than 65,000 bytes:
text data bss dec hex filename 85740176 22294680 20627456 128662312 7ab3b28 vmlinux.before 85674192 22294776 20627456 128598664 7aa4288 vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
H A D | r100.c | diff 9e5acbc21380ca8bd470ec23c139dd4e245a221d Wed May 20 06:02:37 CDT 2015 Denys Vlasenko <dvlasenk@redhat.com> radeon: Deinline indirect register accessor functions
This patch deinlines indirect register accessor functions.
These functions perform two mmio accesses, framed by spin lock/unlock. Spin lock/unlock by itself takes more than 50 cycles in ideal case (if lock is exclusively cached on current CPU).
With this .config: http://busybox.net/~vda/kernel_config, after uninlining these functions have sizes and callsite counts as follows:
r600_uvd_ctx_rreg: 111 bytes, 4 callsites r600_uvd_ctx_wreg: 113 bytes, 5 callsites eg_pif_phy0_rreg: 106 bytes, 13 callsites eg_pif_phy0_wreg: 108 bytes, 13 callsites eg_pif_phy1_rreg: 107 bytes, 13 callsites eg_pif_phy1_wreg: 108 bytes, 13 callsites rv370_pcie_rreg: 111 bytes, 21 callsites rv370_pcie_wreg: 113 bytes, 24 callsites r600_rcu_rreg: 111 bytes, 16 callsites r600_rcu_wreg: 113 bytes, 25 callsites cik_didt_rreg: 106 bytes, 10 callsites cik_didt_wreg: 107 bytes, 10 callsites tn_smc_rreg: 106 bytes, 126 callsites tn_smc_wreg: 107 bytes, 116 callsites eg_cg_rreg: 107 bytes, 20 callsites eg_cg_wreg: 108 bytes, 52 callsites
Functions r100_mm_rreg() and r100_mm_rreg() have a fast path and a locked (slow) path. This patch deinlines only slow path.
r100_mm_rreg_slow: 78 bytes, 2083 callsites r100_mm_wreg_slow: 81 bytes, 3570 callsites
Reduction in code size is more than 65,000 bytes:
text data bss dec hex filename 85740176 22294680 20627456 128662312 7ab3b28 vmlinux.before 85674192 22294776 20627456 128598664 7aa4288 vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
H A D | evergreen.c | diff 9e5acbc21380ca8bd470ec23c139dd4e245a221d Wed May 20 06:02:37 CDT 2015 Denys Vlasenko <dvlasenk@redhat.com> radeon: Deinline indirect register accessor functions
This patch deinlines indirect register accessor functions.
These functions perform two mmio accesses, framed by spin lock/unlock. Spin lock/unlock by itself takes more than 50 cycles in ideal case (if lock is exclusively cached on current CPU).
With this .config: http://busybox.net/~vda/kernel_config, after uninlining these functions have sizes and callsite counts as follows:
r600_uvd_ctx_rreg: 111 bytes, 4 callsites r600_uvd_ctx_wreg: 113 bytes, 5 callsites eg_pif_phy0_rreg: 106 bytes, 13 callsites eg_pif_phy0_wreg: 108 bytes, 13 callsites eg_pif_phy1_rreg: 107 bytes, 13 callsites eg_pif_phy1_wreg: 108 bytes, 13 callsites rv370_pcie_rreg: 111 bytes, 21 callsites rv370_pcie_wreg: 113 bytes, 24 callsites r600_rcu_rreg: 111 bytes, 16 callsites r600_rcu_wreg: 113 bytes, 25 callsites cik_didt_rreg: 106 bytes, 10 callsites cik_didt_wreg: 107 bytes, 10 callsites tn_smc_rreg: 106 bytes, 126 callsites tn_smc_wreg: 107 bytes, 116 callsites eg_cg_rreg: 107 bytes, 20 callsites eg_cg_wreg: 108 bytes, 52 callsites
Functions r100_mm_rreg() and r100_mm_rreg() have a fast path and a locked (slow) path. This patch deinlines only slow path.
r100_mm_rreg_slow: 78 bytes, 2083 callsites r100_mm_wreg_slow: 81 bytes, 3570 callsites
Reduction in code size is more than 65,000 bytes:
text data bss dec hex filename 85740176 22294680 20627456 128662312 7ab3b28 vmlinux.before 85674192 22294776 20627456 128598664 7aa4288 vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
H A D | r600.c | diff 9e5acbc21380ca8bd470ec23c139dd4e245a221d Wed May 20 06:02:37 CDT 2015 Denys Vlasenko <dvlasenk@redhat.com> radeon: Deinline indirect register accessor functions
This patch deinlines indirect register accessor functions.
These functions perform two mmio accesses, framed by spin lock/unlock. Spin lock/unlock by itself takes more than 50 cycles in ideal case (if lock is exclusively cached on current CPU).
With this .config: http://busybox.net/~vda/kernel_config, after uninlining these functions have sizes and callsite counts as follows:
r600_uvd_ctx_rreg: 111 bytes, 4 callsites r600_uvd_ctx_wreg: 113 bytes, 5 callsites eg_pif_phy0_rreg: 106 bytes, 13 callsites eg_pif_phy0_wreg: 108 bytes, 13 callsites eg_pif_phy1_rreg: 107 bytes, 13 callsites eg_pif_phy1_wreg: 108 bytes, 13 callsites rv370_pcie_rreg: 111 bytes, 21 callsites rv370_pcie_wreg: 113 bytes, 24 callsites r600_rcu_rreg: 111 bytes, 16 callsites r600_rcu_wreg: 113 bytes, 25 callsites cik_didt_rreg: 106 bytes, 10 callsites cik_didt_wreg: 107 bytes, 10 callsites tn_smc_rreg: 106 bytes, 126 callsites tn_smc_wreg: 107 bytes, 116 callsites eg_cg_rreg: 107 bytes, 20 callsites eg_cg_wreg: 108 bytes, 52 callsites
Functions r100_mm_rreg() and r100_mm_rreg() have a fast path and a locked (slow) path. This patch deinlines only slow path.
r100_mm_rreg_slow: 78 bytes, 2083 callsites r100_mm_wreg_slow: 81 bytes, 3570 callsites
Reduction in code size is more than 65,000 bytes:
text data bss dec hex filename 85740176 22294680 20627456 128662312 7ab3b28 vmlinux.before 85674192 22294776 20627456 128598664 7aa4288 vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
H A D | radeon.h | diff 9e5acbc21380ca8bd470ec23c139dd4e245a221d Wed May 20 06:02:37 CDT 2015 Denys Vlasenko <dvlasenk@redhat.com> radeon: Deinline indirect register accessor functions
This patch deinlines indirect register accessor functions.
These functions perform two mmio accesses, framed by spin lock/unlock. Spin lock/unlock by itself takes more than 50 cycles in ideal case (if lock is exclusively cached on current CPU).
With this .config: http://busybox.net/~vda/kernel_config, after uninlining these functions have sizes and callsite counts as follows:
r600_uvd_ctx_rreg: 111 bytes, 4 callsites r600_uvd_ctx_wreg: 113 bytes, 5 callsites eg_pif_phy0_rreg: 106 bytes, 13 callsites eg_pif_phy0_wreg: 108 bytes, 13 callsites eg_pif_phy1_rreg: 107 bytes, 13 callsites eg_pif_phy1_wreg: 108 bytes, 13 callsites rv370_pcie_rreg: 111 bytes, 21 callsites rv370_pcie_wreg: 113 bytes, 24 callsites r600_rcu_rreg: 111 bytes, 16 callsites r600_rcu_wreg: 113 bytes, 25 callsites cik_didt_rreg: 106 bytes, 10 callsites cik_didt_wreg: 107 bytes, 10 callsites tn_smc_rreg: 106 bytes, 126 callsites tn_smc_wreg: 107 bytes, 116 callsites eg_cg_rreg: 107 bytes, 20 callsites eg_cg_wreg: 108 bytes, 52 callsites
Functions r100_mm_rreg() and r100_mm_rreg() have a fast path and a locked (slow) path. This patch deinlines only slow path.
r100_mm_rreg_slow: 78 bytes, 2083 callsites r100_mm_wreg_slow: 81 bytes, 3570 callsites
Reduction in code size is more than 65,000 bytes:
text data bss dec hex filename 85740176 22294680 20627456 128662312 7ab3b28 vmlinux.before 85674192 22294776 20627456 128598664 7aa4288 vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|