diff mbox

[v5] ACPICA: Tables: Add mechanism to allow to balance late stage acpi_get_table() independently

Message ID cb1126b2eb1b83a5a97a33c1c5c37a02c32fd688.1496811175.git.lv.zheng@intel.com (mailing list archive)
State Accepted, archived
Delegated to: Rafael Wysocki
Headers show

Commit Message

Lv Zheng June 7, 2017, 4:54 a.m. UTC
Considering this case:
1. A program opens a sysfs table file 65535 times, it can increase
   validation_count and first increment cause the table to be mapped:
    validation_count = 65535
2. AML execution causes "Load" to be executed on the same table, this time
   it cannot increase validation_count, so validation_count remains:
    validation_count = 65535
3. The program closes sysfs table file 65535 times, it can decrease
   validation_count and the last decrement cause the table to be unmapped:
    validation_count = 0
4. AML code still accessing the loaded table, kernel crash can be observed.

This is because orginally ACPICA doesn't support unmapping tables during
OS late stage. So the current code only allows unmapping tables during OS
early stage, and for late stage, no acpi_put_table() clones should be
invoked, especially cases that can trigger frequent invocations of
acpi_get_table()/acpi_put_table() are forbidden:
1. sysfs table accesses
2. dynamic Load/Unload opcode executions
3. acpi_load_table()
4. etc.
Such frequent acpi_put_table() balance changes have to be done altogether.

This philosophy is not convenient for Linux driver writers. Since the API
is just there, developers will start to use acpi_put_table() during late
stage. So we need to consider a better mechanism to allow them to safely
invoke acpi_put_table().

This patch provides such a mechanism by adding a validation_count
threashold. When it is reached, the validation_count can no longer be
incremented/decremented to invalidate the table descriptor (means
preventing table unmappings) so that acpi_put_table() balance changes can be
done independently to each others.

Note: code added in acpi_tb_put_table() is actually a no-op but changes the
warning message into a warning once message. Lv Zheng.

Signed-off-by: Lv Zheng <lv.zheng@intel.com>
---
 drivers/acpi/acpica/tbutils.c | 36 +++++++++++++++++++++++++++---------
 include/acpi/actbl.h          | 13 +++++++++++++
 2 files changed, 40 insertions(+), 9 deletions(-)

Comments

Dan Williams June 7, 2017, 6:41 a.m. UTC | #1
On Tue, Jun 6, 2017 at 9:54 PM, Lv Zheng <lv.zheng@intel.com> wrote:
> Considering this case:
> 1. A program opens a sysfs table file 65535 times, it can increase
>    validation_count and first increment cause the table to be mapped:
>     validation_count = 65535
> 2. AML execution causes "Load" to be executed on the same table, this time
>    it cannot increase validation_count, so validation_count remains:
>     validation_count = 65535
> 3. The program closes sysfs table file 65535 times, it can decrease
>    validation_count and the last decrement cause the table to be unmapped:
>     validation_count = 0
> 4. AML code still accessing the loaded table, kernel crash can be observed.
>
> This is because orginally ACPICA doesn't support unmapping tables during
> OS late stage. So the current code only allows unmapping tables during OS
> early stage, and for late stage, no acpi_put_table() clones should be
> invoked, especially cases that can trigger frequent invocations of
> acpi_get_table()/acpi_put_table() are forbidden:
> 1. sysfs table accesses
> 2. dynamic Load/Unload opcode executions
> 3. acpi_load_table()
> 4. etc.
> Such frequent acpi_put_table() balance changes have to be done altogether.
>
> This philosophy is not convenient for Linux driver writers. Since the API
> is just there, developers will start to use acpi_put_table() during late
> stage. So we need to consider a better mechanism to allow them to safely
> invoke acpi_put_table().
>
> This patch provides such a mechanism by adding a validation_count
> threashold. When it is reached, the validation_count can no longer be
> incremented/decremented to invalidate the table descriptor (means
> preventing table unmappings) so that acpi_put_table() balance changes can be
> done independently to each others.
>
> Note: code added in acpi_tb_put_table() is actually a no-op but changes the
> warning message into a warning once message. Lv Zheng.
>

This still seems to be unnecessary gymnastics to keep the validation
count around and make it work for random drivers. Which ACPI tables
might be hot removed? If it's only a small handful of tables why not
teach the code that handles those exceptional cases to manage a
dedicated reference count mechanism? That way the other cases can be
left alone and not worry about balancing their references.
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki June 7, 2017, 9:14 p.m. UTC | #2
On Wed, Jun 7, 2017 at 8:41 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Tue, Jun 6, 2017 at 9:54 PM, Lv Zheng <lv.zheng@intel.com> wrote:
>> Considering this case:
>> 1. A program opens a sysfs table file 65535 times, it can increase
>>    validation_count and first increment cause the table to be mapped:
>>     validation_count = 65535
>> 2. AML execution causes "Load" to be executed on the same table, this time
>>    it cannot increase validation_count, so validation_count remains:
>>     validation_count = 65535
>> 3. The program closes sysfs table file 65535 times, it can decrease
>>    validation_count and the last decrement cause the table to be unmapped:
>>     validation_count = 0
>> 4. AML code still accessing the loaded table, kernel crash can be observed.
>>
>> This is because orginally ACPICA doesn't support unmapping tables during
>> OS late stage. So the current code only allows unmapping tables during OS
>> early stage, and for late stage, no acpi_put_table() clones should be
>> invoked, especially cases that can trigger frequent invocations of
>> acpi_get_table()/acpi_put_table() are forbidden:
>> 1. sysfs table accesses
>> 2. dynamic Load/Unload opcode executions
>> 3. acpi_load_table()
>> 4. etc.
>> Such frequent acpi_put_table() balance changes have to be done altogether.
>>
>> This philosophy is not convenient for Linux driver writers. Since the API
>> is just there, developers will start to use acpi_put_table() during late
>> stage. So we need to consider a better mechanism to allow them to safely
>> invoke acpi_put_table().
>>
>> This patch provides such a mechanism by adding a validation_count
>> threashold. When it is reached, the validation_count can no longer be
>> incremented/decremented to invalidate the table descriptor (means
>> preventing table unmappings) so that acpi_put_table() balance changes can be
>> done independently to each others.
>>
>> Note: code added in acpi_tb_put_table() is actually a no-op but changes the
>> warning message into a warning once message. Lv Zheng.
>>
>
> This still seems to be unnecessary gymnastics to keep the validation
> count around and make it work for random drivers.

Well, I'm not sure I agree here.

If we can make it work at one point, it should not be too hard to
maintain that status.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Williams June 7, 2017, 9:24 p.m. UTC | #3
On Wed, Jun 7, 2017 at 2:14 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:
> On Wed, Jun 7, 2017 at 8:41 AM, Dan Williams <dan.j.williams@intel.com> wrote:
>> On Tue, Jun 6, 2017 at 9:54 PM, Lv Zheng <lv.zheng@intel.com> wrote:
>>> Considering this case:
>>> 1. A program opens a sysfs table file 65535 times, it can increase
>>>    validation_count and first increment cause the table to be mapped:
>>>     validation_count = 65535
>>> 2. AML execution causes "Load" to be executed on the same table, this time
>>>    it cannot increase validation_count, so validation_count remains:
>>>     validation_count = 65535
>>> 3. The program closes sysfs table file 65535 times, it can decrease
>>>    validation_count and the last decrement cause the table to be unmapped:
>>>     validation_count = 0
>>> 4. AML code still accessing the loaded table, kernel crash can be observed.
>>>
>>> This is because orginally ACPICA doesn't support unmapping tables during
>>> OS late stage. So the current code only allows unmapping tables during OS
>>> early stage, and for late stage, no acpi_put_table() clones should be
>>> invoked, especially cases that can trigger frequent invocations of
>>> acpi_get_table()/acpi_put_table() are forbidden:
>>> 1. sysfs table accesses
>>> 2. dynamic Load/Unload opcode executions
>>> 3. acpi_load_table()
>>> 4. etc.
>>> Such frequent acpi_put_table() balance changes have to be done altogether.
>>>
>>> This philosophy is not convenient for Linux driver writers. Since the API
>>> is just there, developers will start to use acpi_put_table() during late
>>> stage. So we need to consider a better mechanism to allow them to safely
>>> invoke acpi_put_table().
>>>
>>> This patch provides such a mechanism by adding a validation_count
>>> threashold. When it is reached, the validation_count can no longer be
>>> incremented/decremented to invalidate the table descriptor (means
>>> preventing table unmappings) so that acpi_put_table() balance changes can be
>>> done independently to each others.
>>>
>>> Note: code added in acpi_tb_put_table() is actually a no-op but changes the
>>> warning message into a warning once message. Lv Zheng.
>>>
>>
>> This still seems to be unnecessary gymnastics to keep the validation
>> count around and make it work for random drivers.
>
> Well, I'm not sure I agree here.
>
> If we can make it work at one point, it should not be too hard to
> maintain that status.
>

I agree with that, my concern was with driver writers needing to be
worried about when it is safe to call acpi_put_table(). This reference
count behaves differently than other reference counts like kobjects.
The difference is not necessarily bad, but hopefully it can be
contained within the acpi core.
--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lv Zheng June 8, 2017, 2:24 a.m. UTC | #4
Hi, Dan

> From: Dan Williams [mailto:dan.j.williams@intel.com]

> Subject: Re: [PATCH v5] ACPICA: Tables: Add mechanism to allow to balance late stage acpi_get_table()

> independently

> 

> On Wed, Jun 7, 2017 at 2:14 PM, Rafael J. Wysocki <rafael@kernel.org> wrote:

> > On Wed, Jun 7, 2017 at 8:41 AM, Dan Williams <dan.j.williams@intel.com> wrote:

> >> On Tue, Jun 6, 2017 at 9:54 PM, Lv Zheng <lv.zheng@intel.com> wrote:

> >>> Considering this case:

> >>> 1. A program opens a sysfs table file 65535 times, it can increase

> >>>    validation_count and first increment cause the table to be mapped:

> >>>     validation_count = 65535

> >>> 2. AML execution causes "Load" to be executed on the same table, this time

> >>>    it cannot increase validation_count, so validation_count remains:

> >>>     validation_count = 65535

> >>> 3. The program closes sysfs table file 65535 times, it can decrease

> >>>    validation_count and the last decrement cause the table to be unmapped:

> >>>     validation_count = 0

> >>> 4. AML code still accessing the loaded table, kernel crash can be observed.

> >>>

> >>> This is because orginally ACPICA doesn't support unmapping tables during

> >>> OS late stage. So the current code only allows unmapping tables during OS

> >>> early stage, and for late stage, no acpi_put_table() clones should be

> >>> invoked, especially cases that can trigger frequent invocations of

> >>> acpi_get_table()/acpi_put_table() are forbidden:

> >>> 1. sysfs table accesses

> >>> 2. dynamic Load/Unload opcode executions

> >>> 3. acpi_load_table()

> >>> 4. etc.

> >>> Such frequent acpi_put_table() balance changes have to be done altogether.

> >>>

> >>> This philosophy is not convenient for Linux driver writers. Since the API

> >>> is just there, developers will start to use acpi_put_table() during late

> >>> stage. So we need to consider a better mechanism to allow them to safely

> >>> invoke acpi_put_table().

> >>>

> >>> This patch provides such a mechanism by adding a validation_count

> >>> threashold. When it is reached, the validation_count can no longer be

> >>> incremented/decremented to invalidate the table descriptor (means

> >>> preventing table unmappings) so that acpi_put_table() balance changes can be

> >>> done independently to each others.

> >>>

> >>> Note: code added in acpi_tb_put_table() is actually a no-op but changes the

> >>> warning message into a warning once message. Lv Zheng.

> >>>

> >>

> >> This still seems to be unnecessary gymnastics to keep the validation

> >> count around and make it work for random drivers.

> >

> > Well, I'm not sure I agree here.

> >

> > If we can make it work at one point, it should not be too hard to

> > maintain that status.

> >

> 

> I agree with that, my concern was with driver writers needing to be

> worried about when it is safe to call acpi_put_table(). This reference

> count behaves differently than other reference counts like kobjects.


I don't think they behave differently.

"kref" needn't consider unbalanced "get/put".
Because when the drivers(users) are deploying "kref",
they are responsible for ensuring balanced "get/put".
"kref" needn't take too much care about "overflow/underflow"
as if all users ensure balanced "get/put",
"overflow/underflow" is not possible.
Occurrence of "overflow/underflow" means bugs.
And can be further captured as "panic".

If "kref" considers to "warn_once" overflow/underflow users,
the logic in this commit can also be introduced to kref.
However it's useless as all users have ensured balanced "get/put".
Putting useless check than panic on hot path could be a waste.

> The difference is not necessarily bad, but hopefully it can be

> contained within the acpi core.


The old warning logic for table desc is just derived from utdelete.c.
Which reduces communication cost when the mechanism is upstreamed.

ACPICA table "validation_count" is deployed on top of old design.
Where "table unmap" is forbidden for late stage.
Thus there is no users ensuring balanced "get/put".
Under this circumstances, when we start to deploy balanced "get/put",
we need to consider all users as a whole.
You cannot say current unbalanced "get/put" users have bugs.
They are there just because of historical reasons.

Fortunately after applying this patch,
drivers should be able to have a better environment to use the new APIs.

Cheers,
Lv
diff mbox

Patch

diff --git a/drivers/acpi/acpica/tbutils.c b/drivers/acpi/acpica/tbutils.c
index cd96026..4048523 100644
--- a/drivers/acpi/acpica/tbutils.c
+++ b/drivers/acpi/acpica/tbutils.c
@@ -416,9 +416,19 @@  acpi_tb_get_table(struct acpi_table_desc *table_desc,
 		}
 	}
 
-	table_desc->validation_count++;
-	if (table_desc->validation_count == 0) {
-		table_desc->validation_count--;
+	if (table_desc->validation_count < ACPI_MAX_TABLE_VALIDATIONS) {
+		table_desc->validation_count++;
+
+		/*
+		 * Ensure the warning message can only be displayed once. The
+		 * warning message occurs when the "get" operations are performed
+		 * more than "put" operations, causing count overflow.
+		 */
+		if (table_desc->validation_count >= ACPI_MAX_TABLE_VALIDATIONS) {
+			ACPI_WARNING((AE_INFO,
+				      "Table %p, Validation count overflows\n",
+				      table_desc));
+		}
 	}
 
 	*out_table = table_desc->pointer;
@@ -445,13 +455,21 @@  void acpi_tb_put_table(struct acpi_table_desc *table_desc)
 
 	ACPI_FUNCTION_TRACE(acpi_tb_put_table);
 
-	if (table_desc->validation_count == 0) {
-		ACPI_WARNING((AE_INFO,
-			      "Table %p, Validation count is zero before decrement\n",
-			      table_desc));
-		return_VOID;
+	if (table_desc->validation_count < ACPI_MAX_TABLE_VALIDATIONS) {
+		table_desc->validation_count--;
+
+		/*
+		 * Ensure the warning message can only be displayed once. The
+		 * warning message occurs when the "put" operations are performed
+		 * more than "get" operations, causing count underflow.
+		 */
+		if (table_desc->validation_count >= ACPI_MAX_TABLE_VALIDATIONS) {
+			ACPI_WARNING((AE_INFO,
+				      "Table %p, Validation count underflows\n",
+				      table_desc));
+			return_VOID;
+		}
 	}
-	table_desc->validation_count--;
 
 	if (table_desc->validation_count == 0) {
 
diff --git a/include/acpi/actbl.h b/include/acpi/actbl.h
index d92543f..f42e6d5 100644
--- a/include/acpi/actbl.h
+++ b/include/acpi/actbl.h
@@ -374,6 +374,19 @@  struct acpi_table_desc {
 	u16 validation_count;
 };
 
+/*
+ * Maximum validation count, when it is reached, validation count can no
+ * longer be changed. Which means, the table can no longer be invalidated.
+ * This mechanism is implemented for backward compatibility, where in OS
+ * late stage, old drivers are not facilitated with paired validations and
+ * invalidations.
+ * The maximum validation count can be defined to any value, but should be
+ * greater than the maximum number of OS early stage mapping slots as it
+ * must be ensured that no early stage mappings can be leaked to the late
+ * stage.
+ */
+#define ACPI_MAX_TABLE_VALIDATIONS          ACPI_UINT16_MAX
+
 /* Masks for Flags field above */
 
 #define ACPI_TABLE_ORIGIN_EXTERNAL_VIRTUAL  (0)	/* Virtual address, external maintained */