01c10f88 | 18-Apr-2023 |
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> |
platform/x86/intel-uncore-freq: tpmi: Provide cluster level control
The new generation of CPUs has granular control at a cluster level. Each package/die can have multiple power domains, each of which can have multiple fabric clusters. The TPMI interface allows control at the fabric cluster level.
Use the updated uncore sysfs feature to expose controls at cluster level. Each cluster has a control for maximum and minimum uncore frequency. The current uncore frequency is also presented per cluster.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
Link: https://lore.kernel.org/r/20230418171340.681662-4-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
9b8dea80 | 18-Apr-2023 |
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> |
platform/x86/intel-uncore-freq: Support for cluster level controls
An SoC can contain multiple power domains, each with an individual mesh partition or a collection of mesh partitions. Such a partition is called a fabric cluster.
Certain types of meshes need to run at the same frequency, so they are placed in the same fabric cluster. The benefit of fabric clusters is that they offer a scalable mechanism to deal with partitioned fabrics in an SoC.
The current sysfs interface supports control at package and die level. This interface is not enough to support more granular control at fabric cluster level.
SoCs that support TPMI (Topology Aware Register and PM Capsule Interface) can have multiple power domains. Each power domain can contain one or more fabric clusters.
To support such granular controls, enhance the uncore common code to optionally create new directories that provide controls at fabric cluster level. It is also important to have the flexibility to change granularity for future versions of SoCs. If the directory name encodes the scope, as in "package_*_die_*_power_domain_*_cluster_*", then the naming is not expandable.
The cpufreq policies also have different scopes. There, the scope of the policy (affected_cpus) is specified by attributes inside each policy. So, follow the same model for uncore frequency scaling sysfs as: "/sys/devices/system/cpu/cpufreq/policy*"
Allow client drivers to optionally support granular control for each fabric cluster. Here, the directory name will be "uncore" suffixed with a unique instance number. For example: uncore00, uncore01 etc. Attributes in the directory identify package id, power domain and fabric cluster id. This interface is expandable even if some new level of granularity is introduced. A new sysfs attribute can identify the new level.
For compatibility with the existing sysfs, and to provide an easy way to set limits for each fabric cluster in the package/die, the existing controls at package/die level are still provided. For the majority of users, this is the easy approach.
For example: On a single package/die system, with three power domains and one fabric cluster per power domain:
$ tree -L 2 /sys/devices/system/cpu/intel_uncore_frequency/
/sys/devices/system/cpu/intel_uncore_frequency/
├── package_00_die_00
│   ├── current_freq_khz
│   ├── initial_max_freq_khz
│   ├── initial_min_freq_khz
│   ├── max_freq_khz
│   └── min_freq_khz
├── uncore00
│   ├── current_freq_khz
│   ├── domain_id
│   ├── fabric_cluster_id
│   ├── initial_max_freq_khz
│   ├── initial_min_freq_khz
│   ├── max_freq_khz
│   ├── min_freq_khz
│   └── package_id
├── uncore01
│   ├── current_freq_khz
│   ├── domain_id
│   ├── fabric_cluster_id
│   ├── initial_max_freq_khz
│   ├── initial_min_freq_khz
│   ├── max_freq_khz
│   ├── min_freq_khz
│   └── package_id
└── uncore02
    ├── current_freq_khz
    ├── domain_id
    ├── fabric_cluster_id
    ├── initial_max_freq_khz
    ├── initial_min_freq_khz
    ├── max_freq_khz
    ├── min_freq_khz
    └── package_id
The attribute for cluster id is named "fabric_cluster_id" instead of just "cluster_id" to avoid confusion with the use of the term "cluster" in other parts of the Linux kernel.
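Because the scope of each uncoreNN directory is carried by attributes rather than by the directory name, a user-space tool has to read those attributes to learn which package, power domain and fabric cluster an instance belongs to. The following is a minimal Python sketch of that lookup, not code from this patch. It assumes the attribute names shown in the tree listing; the helper names `read_int` and `list_fabric_clusters` are hypothetical.

```python
from pathlib import Path

# Default location of the uncore sysfs directories, as shown in the listing above.
SYSFS_ROOT = Path("/sys/devices/system/cpu/intel_uncore_frequency")


def read_int(path):
    """Read a sysfs attribute file that holds a single decimal integer."""
    return int(Path(path).read_text().strip())


def list_fabric_clusters(root=SYSFS_ROOT):
    """Return one dict per uncoreNN directory found under root.

    The scope of each instance is taken from its attributes
    (package_id, domain_id, fabric_cluster_id), not from its name.
    """
    root = Path(root)
    if not root.is_dir():
        # Driver not loaded or hardware not supported: nothing to list.
        return []
    clusters = []
    for entry in sorted(root.glob("uncore[0-9]*")):
        if not entry.is_dir():
            continue
        clusters.append({
            "name": entry.name,
            "package_id": read_int(entry / "package_id"),
            "domain_id": read_int(entry / "domain_id"),
            "fabric_cluster_id": read_int(entry / "fabric_cluster_id"),
            "min_freq_khz": read_int(entry / "min_freq_khz"),
            "max_freq_khz": read_int(entry / "max_freq_khz"),
        })
    return clusters


if __name__ == "__main__":
    for cluster in list_fabric_clusters():
        print(cluster)
```

The legacy package_00_die_00 directory is skipped on purpose: it lacks the scope attributes and keeps its old meaning.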
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
Link: https://lore.kernel.org/r/20230418171340.681662-3-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
8a54e225 | 20-Apr-2023 |
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> |
platform/x86/intel-uncore-freq: Uncore frequency control via TPMI
Implement support of uncore frequency control via TPMI (Topology Aware Register and PM Capsule Interface). This driver provides functionality similar to the current uncore frequency driver, which uses MSRs.
The hardware interface to read/write is basically a substitution for MSRs 0x620 and 0x621. There are specific MMIO offsets and bits to get/set the minimum and maximum uncore ratio, similar to the MSRs.
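To make the "ratio in a register" idea concrete, here is a small Python sketch of how a minimum and maximum ratio could be unpacked from, and packed into, a 64-bit register value. The bit positions (maximum ratio in bits 6:0, minimum ratio in bits 14:8) and the 100 MHz ratio unit are assumptions taken from how the existing MSR 0x620 path is commonly described, not something this commit message states. The TPMI MMIO offsets and bit fields differ and are not shown here; see the TPMI documentation for those. All names are hypothetical.

```python
# Assumption: one ratio unit corresponds to 100 MHz, i.e. 100000 kHz.
UNCORE_FREQ_KHZ_MULTIPLIER = 100_000

# Assumption: 7-bit ratio fields, max in bits 6:0 and min in bits 14:8.
RATIO_MASK = 0x7F
MIN_RATIO_SHIFT = 8


def decode_ratio_limits(reg_value):
    """Return (min_khz, max_khz) decoded from a ratio-limit register value."""
    max_ratio = reg_value & RATIO_MASK
    min_ratio = (reg_value >> MIN_RATIO_SHIFT) & RATIO_MASK
    return (min_ratio * UNCORE_FREQ_KHZ_MULTIPLIER,
            max_ratio * UNCORE_FREQ_KHZ_MULTIPLIER)


def encode_max_ratio(reg_value, max_khz):
    """Return reg_value with the max-ratio field replaced.

    All other bits are preserved, which mirrors a read-modify-write update.
    """
    ratio = max_khz // UNCORE_FREQ_KHZ_MULTIPLIER
    if not 0 <= ratio <= RATIO_MASK:
        raise ValueError(f"ratio {ratio} does not fit in 7 bits")
    return (reg_value & ~RATIO_MASK) | ratio


if __name__ == "__main__":
    value = 0x0C18  # example value: min ratio 12, max ratio 24
    print(decode_ratio_limits(value))
    print(hex(encode_max_ratio(value, 2_000_000)))
```

The read-modify-write shape matters because the minimum and maximum fields share one register, so updating one limit must not disturb the other.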
The scope of the uncore MSRs is package/die. But the new generation of CPUs has more granular control at a cluster level. Each package/die can have multiple power domains, each of which can have multiple clusters. The TPMI interface allows control at cluster level.
The primary use case for uncore sysfs is to set maximum and minimum uncore frequency to reduce power consumption or latency. The current uncore sysfs control is per package/die. This is enough for the majority of users, because a workload moves to different power domains as it moves between CPUs.
The current uncore sysfs provides controls at package/die level. When a user sets maximum/minimum limits there, the driver applies the same limits to each cluster.
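The fan-out described above happens inside the driver. As an illustration only, the same effect can be approximated from user space by writing one limit to every uncoreNN instance that reports a given package id. This Python sketch assumes the uncoreNN directories and attribute names from the cluster-level sysfs commit listed earlier in this log; `set_package_max` is a hypothetical helper, and writing to the real sysfs files requires root.

```python
from pathlib import Path


def set_package_max(root, package_id, max_khz):
    """Write max_khz to every uncoreNN instance belonging to package_id.

    Returns the names of the instances that were updated. This mimics,
    from user space, a package-level limit being applied to each cluster.
    """
    updated = []
    for entry in sorted(Path(root).glob("uncore[0-9]*")):
        pkg_file = entry / "package_id"
        if not pkg_file.is_file():
            continue
        if int(pkg_file.read_text().strip()) != package_id:
            continue
        (entry / "max_freq_khz").write_text(f"{max_khz}\n")
        updated.append(entry.name)
    return updated
```

On a real system the call would look like `set_package_max("/sys/devices/system/cpu/intel_uncore_frequency", 0, 2000000)`. Writing the package_00_die_00 control is simpler when every cluster should share one limit.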
Here, the number of power domains equals the number of resources in this aux device. There are offsets and bits to discover the number of clusters and the offset of each cluster-level control.
The TPMI documentation can be downloaded from: https://github.com/intel/tpmi_power_management
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
Link: https://lore.kernel.org/r/20230420220514.747573-1-srinivas.pandruvada@linux.intel.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
dbce412a | 03-Feb-2022 |
Srinivas Pandruvada <srinivas.pandruvada@intel.com> |
platform/x86/intel-uncore-freq: Split common and enumeration part
Split the current driver in two parts:
- Common part: All the common functions other than the enumeration function.
- Enumeration/HW specific part: The current enumeration using CPU model is left in the old module. This uses the services of the common driver to register sysfs objects. It also provides callbacks for MSR access related to uncore.
- Add MODULE_DEVICE_TABLE to uncore-frequency.c
No functional changes are expected.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220204000306.2517447-5-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
414eef27 | 03-Feb-2022 |
Srinivas Pandruvada <srinivas.pandruvada@intel.com> |
platform/x86/intel/uncore-freq: Display uncore current frequency
Add a new sysfs attribute "current_freq_khz" to display current uncore frequency. This value is read from MSR 0x621.
Root user permission is required to read uncore current frequency.
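Since the attribute is readable only by root, a monitoring script should expect the read to fail for unprivileged users and on kernels that predate the attribute. A minimal Python sketch, assuming the attribute file holds a single decimal kHz value; `read_current_freq_khz` is a hypothetical helper name.

```python
from pathlib import Path


def read_current_freq_khz(instance_dir):
    """Return the current uncore frequency in kHz, or None if unavailable.

    None is returned when the caller lacks permission (the attribute is
    root-only) or when the attribute does not exist.
    """
    path = Path(instance_dir) / "current_freq_khz"
    try:
        return int(path.read_text().strip())
    except (PermissionError, FileNotFoundError):
        return None
```

A caller would pass an instance directory such as `/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00`.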
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220204000306.2517447-4-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|
ae7b2ce5 | 03-Feb-2022 |
Srinivas Pandruvada <srinivas.pandruvada@intel.com> |
platform/x86/intel/uncore-freq: Use sysfs API to create attributes
Using the sysfs API is always preferable to using kobject calls to create attributes. Remove usage of kobject_init_and_add() and use sysfs_create_group(). To create a relationship between a sysfs attribute and an uncore instance, use device_attribute*, which is defined per uncore instance.
To make locking uniform for both read and write attributes, take the lock in the sysfs callbacks, not in the functions where the MSRs are actually read or updated.
No functional changes are expected.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220204000306.2517447-3-srinivas.pandruvada@linux.intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
|