[6/8] tools/xenalyze: Fix off-by-one in MAX_CPUS range checks

Message ID 1456411743-17741-7-git-send-email-george.dunlap@eu.citrix.com (mailing list archive)
State New, archived

Commit Message

George Dunlap Feb. 25, 2016, 2:49 p.m. UTC
Skip action / throw error if cpu/vcpu >= MAX_CPUS rather than >.

Also add an assertion to vcpu_find, so that future errors of this kind
are caught rather than resulting in out-of-bounds accesses.

CID 1306871
CID 1306870
CID 1306869
CID 1306867

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
---
CC: Ian Jackson <ian.jackson@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/xentrace/xenalyze.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)
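
For readers skimming the thread, here is a minimal sketch of why `>` is off by one as a range check. It assumes an array with MAX_CPUS slots, which is what the patch's indexing of d->vcpu[vid] suggests. The MAX_CPUS value, the `slots` array and all three helper functions are hypothetical and are not taken from xenalyze.c.

```c
#include <stdbool.h>

/* Hypothetical stand-in; the real MAX_CPUS value is not shown in this thread. */
#define MAX_CPUS 4

/* An array with MAX_CPUS slots: valid indices are 0 .. MAX_CPUS - 1. */
static int slots[MAX_CPUS];

/* Old-style check: rejects only idx > MAX_CPUS, so idx == MAX_CPUS slips through. */
static bool rejected_by_old_check(unsigned int idx)
{
    return idx > MAX_CPUS;
}

/* Fixed check: rejects every index that is not a valid slot. */
static bool rejected_by_new_check(unsigned int idx)
{
    return idx >= MAX_CPUS;
}

/* Store a value only when the index is in range; returns true on success. */
static bool store_checked(unsigned int idx, int value)
{
    if (rejected_by_new_check(idx))
        return false;
    slots[idx] = value;
    return true;
}
```

With the old check, an index equal to MAX_CPUS is accepted and the subsequent array access lands one element past the end. With the new check, that index is rejected like any other out-of-range value.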

Comments

Ian Jackson Feb. 26, 2016, 12:30 p.m. UTC | #1
George Dunlap writes ("[PATCH 6/8] tools/xenalyze: Fix off-by-one in MAX_CPUS range checks"):
> Skip action / throw error if cpu/vcpu >= MAX_CPUS  rather than >.
> 
> Also add an assertion to vcpu_find, to make future errors of this kind
> not out-of-bounds.
...
> +    /* "Graceful" handling of vid >= MAX_CPUS should be handled elsewhere */
> +    if ( vid >= MAX_CPUS ) {
> +        fprintf(stderr, "%s: vcpu %d exceeds MAX_CPUS %d!\n",
> +                __func__, vid, MAX_CPUS);
> +        error(ERR_ASSERT, NULL);
> +    }

I'm not convinced by the existence of error(ERR_ASSERT,...).  What is
wrong with assert() ?

If you agree that ERR_ASSERT should be got rid of, then you could
start here...

But:

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
George Dunlap Feb. 29, 2016, 4:58 p.m. UTC | #2
On 26/02/16 12:30, Ian Jackson wrote:
> George Dunlap writes ("[PATCH 6/8] tools/xenalyze: Fix off-by-one in MAX_CPUS range checks"):
>> Skip action / throw error if cpu/vcpu >= MAX_CPUS  rather than >.
>>
>> Also add an assertion to vcpu_find, to make future errors of this kind
>> not out-of-bounds.
> ...
>> +    /* "Graceful" handling of vid >= MAX_CPUS should be handled elsewhere */
>> +    if ( vid >= MAX_CPUS ) {
>> +        fprintf(stderr, "%s: vcpu %d exceeds MAX_CPUS %d!\n",
>> +                __func__, vid, MAX_CPUS);
>> +        error(ERR_ASSERT, NULL);
>> +    }
> 
> I'm not convinced by the existence of error(ERR_ASSERT,...).  What is
> wrong with assert() ?

Well, one half of the reason for error() in general is to print out the
record which caused (or was involved in) the error before dying.  And
I'm guessing that once I decided I'd have error(ERR_ASSERT, xxx), I just
decided to use error(ERR_ASSERT,...) everywhere for consistency.

But at least at this point, no instance of error(ERR_ASSERT...) actually
takes a pointer to a record, so that probably is something that could
just go away.

I'll send a new series with this updated.

 -George

> 
> If you agree that ERR_ASSERT should be got rid of, then you could
> start here...
> 
> But:
> 
> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
>
George Dunlap March 3, 2016, 12:44 p.m. UTC | #3
On 29/02/16 16:58, George Dunlap wrote:
> On 26/02/16 12:30, Ian Jackson wrote:
>> George Dunlap writes ("[PATCH 6/8] tools/xenalyze: Fix off-by-one in MAX_CPUS range checks"):
>>> Skip action / throw error if cpu/vcpu >= MAX_CPUS  rather than >.
>>>
>>> Also add an assertion to vcpu_find, to make future errors of this kind
>>> not out-of-bounds.
>> ...
>>> +    /* "Graceful" handling of vid >= MAX_CPUS should be handled elsewhere */
>>> +    if ( vid >= MAX_CPUS ) {
>>> +        fprintf(stderr, "%s: vcpu %d exceeds MAX_CPUS %d!\n",
>>> +                __func__, vid, MAX_CPUS);
>>> +        error(ERR_ASSERT, NULL);
>>> +    }
>>
>> I'm not convinced by the existence of error(ERR_ASSERT,...).  What is
>> wrong with assert() ?
> 
> Well one half of the reason for error() in general is to print out the
> record which caused (or was involved in) the error before dying.  And
> I'm guessing that once I decided I'd have error(ERR_ASSERT, xxx), that
> for consistency I just decided to use error(ERR_ASSERT,...) everywhere.

Oh, actually -- going through and implementing this change, I *think*
that the problem I had was actually that assert() doesn't flush stdout
before calling abort().  In dump mode every single trace record is
printed to stdout, which makes it fairly easy to figure out how you got
to the point of the assertion -- as long as it's actually printed out.

In fact in one location I had commented out an assert() and replaced it
with an if() {fprintf(...) error(...)}, presumably for exactly that reason.

In the case of xenalyze, all the recent trace records after an error
message are actually a lot more useful for forensics than the
stack trace (which is what abort() gives you).

 -George
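
The flushing concern can be illustrated with a small sketch. The C standard leaves it implementation-defined whether abort() flushes open streams with unwritten buffered data, so a checker that wants the preceding stdout output to survive can flush explicitly before dying. The helper names `die_after_flush` and `checked_vcpu_index` and the MAX_CPUS value below are hypothetical. This is not xenalyze's error() implementation, only one way the idea could look.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in; the real MAX_CPUS value is not shown in this thread. */
#define MAX_CPUS 4

/*
 * Hypothetical helper: push any buffered stdout output (for example,
 * dumped trace records) out first, then report the failure and abort.
 */
static void die_after_flush(const char *func, const char *msg)
{
    fflush(stdout);                          /* keep earlier output visible */
    fprintf(stderr, "%s: %s\n", func, msg);  /* stderr is not fully buffered */
    abort();
}

/* Return vid unchanged if it is a valid index; otherwise die loudly. */
static int checked_vcpu_index(int vid)
{
    if (vid < 0 || vid >= MAX_CPUS)
        die_after_flush(__func__, "vcpu id out of range");
    return vid;
}
```

A caller would index its per-vcpu array with checked_vcpu_index(vid). In-range values pass through untouched, and an out-of-range value terminates the process only after everything already written to stdout has been flushed.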
Patch

diff --git a/tools/xentrace/xenalyze.c b/tools/xentrace/xenalyze.c
index 249bebd..3e26a4c 100644
--- a/tools/xentrace/xenalyze.c
+++ b/tools/xentrace/xenalyze.c
@@ -6860,6 +6860,13 @@  struct vcpu_data * vcpu_find(int did, int vid)
     struct domain_data *d;
     struct vcpu_data *v;
 
+    /* "Graceful" handling of vid >= MAX_CPUS should be handled elsewhere */
+    if ( vid >= MAX_CPUS ) {
+        fprintf(stderr, "%s: vcpu %d exceeds MAX_CPUS %d!\n",
+                __func__, vid, MAX_CPUS);
+        error(ERR_ASSERT, NULL);
+    }
+
     d = domain_find(did);
 
     v = d->vcpu[vid];
@@ -7131,7 +7138,7 @@  void sched_runstate_process(struct pcpu_info *p)
         }
     }
 
-    if(r->vcpu > MAX_CPUS)
+    if(r->vcpu >= MAX_CPUS)
     {
         fprintf(warn, "%s: vcpu %u > MAX_VCPUS %d!\n",
                 __func__, r->vcpu, MAX_CPUS);
@@ -7441,14 +7448,14 @@  void sched_switch_process(struct pcpu_info *p)
                r->prev_dom, r->prev_vcpu,
                r->next_dom, r->next_vcpu);
 
-    if(r->prev_vcpu > MAX_CPUS)
+    if(r->prev_vcpu >= MAX_CPUS)
     {
         fprintf(warn, "%s: prev_vcpu %u > MAX_VCPUS %d!\n",
                 __func__, r->prev_vcpu, MAX_CPUS);
         return;
     }
 
-    if(r->next_vcpu > MAX_CPUS)
+    if(r->next_vcpu >= MAX_CPUS)
     {
         fprintf(warn, "%s: next_vcpu %u > MAX_VCPUS %d!\n",
                 __func__, r->next_vcpu, MAX_CPUS);
@@ -8518,7 +8525,7 @@  off_t scan_for_new_pcpu(off_t offset) {
 
     cd = (typeof(cd))rec.u.notsc.data;
 
-    if ( cd->cpu > MAX_CPUS )
+    if ( cd->cpu >= MAX_CPUS )
     {
         fprintf(stderr, "%s: cpu %d exceeds MAX_CPU %d!\n",
                 __func__, cd->cpu, MAX_CPUS);
@@ -8738,7 +8745,7 @@  void process_cpu_change(struct pcpu_info *p) {
                 (unsigned long long)p->file_offset);
     }
 
-    if(r->cpu > MAX_CPUS)
+    if(r->cpu >= MAX_CPUS)
     {
         fprintf(stderr, "FATAL: cpu %d > MAX_CPUS %d.\n",
                 r->cpu, MAX_CPUS);