Message ID | 20130208210149.GB26660@mtj.dyndns.org (mailing list archive) |
---|---|
State | Not Applicable |
> * drivers/infiniband/core/cm.c:cm_alloc_id()
>   drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()
>
>   Used to wrap cyclic @start.  Can be replaced with max(next, 0).
>   Note that this type of cyclic allocation using idr is buggy.  These
>   are prone to spurious -ENOSPC failure after the first wraparound.

The replacement code looks fine, but can you explain why the use is buggy?

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello,

On Fri, Feb 08, 2013 at 10:09:13PM +0000, Hefty, Sean wrote:
> >   Used to wrap cyclic @start.  Can be replaced with max(next, 0).
> >   Note that this type of cyclic allocation using idr is buggy.  These
> >   are prone to spurious -ENOSPC failure after the first wraparound.
>
> The replacement code looks fine, but can you explain why the use is buggy?

So, if you want a cyclic allocation, the allocation should be tried in
[start, END) and then in [0, start).  Otherwise, after the allocation
wraps for the first time, the closer the starting point gets to END,
the higher the chance of not finding a vacant slot in [start, END).
When @start equals END - 1 for the second time, if the first END - 1
allocation is still around, you'll get -ENOSPC.

In practice, I don't think anyone is hitting this.  idr has always
been horribly broken when it reaches the higher range (> 1<<30 on
64bit), so things would have broken even before the first wraparound.
It is still a theoretical possibility which may trigger if idr is used
for, say, ipc messages or storage commands.

Thanks.
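The two-range search described above can be sketched in userspace C. This is only an illustration under stated assumptions, not kernel code: toy_alloc() is a hypothetical bitmap stand-in assumed to behave like idr_alloc() in returning the lowest free ID in [start, end) or -ENOSPC (it does not model idr_alloc()'s "end == 0 means no limit" convention), and TOY_END is an artificially small END.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_END 8	/* hypothetical tiny ID space: [0, TOY_END) */

static bool toy_used[TOY_END];

/* Toy stand-in for idr_alloc(): lowest free ID in [start, end), else -ENOSPC. */
static int toy_alloc(int start, int end)
{
	for (int id = start; id < end; id++) {
		if (!toy_used[id]) {
			toy_used[id] = true;
			return id;
		}
	}
	return -ENOSPC;
}

static void toy_free(int id)
{
	toy_used[id] = false;
}

/* Pattern criticised in the thread: only [*next, END) is ever searched. */
static int alloc_cyclic_buggy(int *next)
{
	int id = toy_alloc(*next, TOY_END);

	if (id >= 0)
		*next = (id + 1) % TOY_END;
	return id;
}

/* Suggested pattern: try [*next, END) first, then fall back to [0, *next). */
static int alloc_cyclic(int *next)
{
	int id = toy_alloc(*next, TOY_END);

	if (id < 0)
		id = toy_alloc(0, *next);
	if (id >= 0)
		*next = (id + 1) % TOY_END;
	return id;
}
```

With ID END - 1 still held from the first pass and the start back at END - 1, the single-range version reports -ENOSPC even though lower IDs are free, while the two-range version hands out one of them.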
On Fri, Feb 08, 2013 at 01:01:49PM -0800, Tejun Heo wrote:
> MAX_IDR_MASK is another weirdness in the idr interface.  As idr covers
> whole positive integer range, it's defined as 0x7fffffff or INT_MAX.
>
> Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
> They basically mask off the sign bit and operate on the rest, so if
> the caller, by accident, passes in a negative number, the sign bit
> will be masked off and the remaining part will be used as if that was
> the input, which is worse than crashing.
>
> The constant is visible in idr.h and there are several users in the
> kernel.
>
> * drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()
>
>   Basically used to test if adap->nr is a negative number which isn't
>   -1 and returns -EINVAL if so.  idr_alloc() already has negative
>   @start checking (w/ WARN_ON_ONCE), so this can go away.
>
> * drivers/infiniband/core/cm.c:cm_alloc_id()
>   drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()
>
>   Used to wrap cyclic @start.  Can be replaced with max(next, 0).
>   Note that this type of cyclic allocation using idr is buggy.  These
>   are prone to spurious -ENOSPC failure after the first wraparound.
>
> * fs/super.c:get_anon_bdev()
>
>   The ID allocated from ida is masked off before being tested whether
>   it's inside valid range.  ida allocated ID can never be a negative
>   number and the masking is unnecessary.
>
> Update idr_*() functions to fail with -EINVAL when negative @id is
> specified and update other MAX_IDR_MASK users as described above.
>
> This leaves MAX_IDR_MASK without any user, remove it and relocate
> other MAX_IDR_* constants to lib/idr.c.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>

For the i2c-part:

Acked-by: Wolfram Sang <wolfram@the-dreams.de>
> So, if you want a cyclic allocation, the allocation should be tried in
> [start, END) and then [0, start); otherwise, after the allocation
> wraps for the first time, as the closer the starting point gets to
> END, the chance of not finding a vacant slot in [start, END) goes
> higher.  When @start equals END - 1 for the second time, if the first
> END - 1 allocation is still around, you'll get -ENOSPC.

Got it - thanks.  I'll make a note to fix this.
--- a/drivers/i2c/i2c-core.c
+++ b/drivers/i2c/i2c-core.c
@@ -978,8 +978,6 @@ int i2c_add_numbered_adapter(struct i2c_
 
 	if (adap->nr == -1) /* -1 means dynamically assign bus id */
 		return i2c_add_adapter(adap);
-	if (adap->nr & ~MAX_IDR_MASK)
-		return -EINVAL;
 
 	mutex_lock(&core_lock);
 	id = idr_alloc(&i2c_adapter_idr, adap, adap->nr, adap->nr + 1,
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -390,7 +390,7 @@ static int cm_alloc_id(struct cm_id_priv
 
 	id = idr_alloc(&cm.local_id_table, cm_id_priv, next_id, 0, GFP_NOWAIT);
 	if (id >= 0)
-		next_id = ((unsigned) id + 1) & MAX_IDR_MASK;
+		next_id = max(id + 1, 0);
 
 	spin_unlock_irqrestore(&cm.lock, flags);
 	idr_preload_end();
--- a/drivers/infiniband/hw/mlx4/cm.c
+++ b/drivers/infiniband/hw/mlx4/cm.c
@@ -225,7 +225,7 @@ id_map_alloc(struct ib_device *ibdev, in
 
 	ret = idr_alloc(&sriov->pv_id_table, ent, next_id, 0, GFP_NOWAIT);
 	if (ret >= 0) {
-		next_id = ((unsigned)ret + 1) & MAX_IDR_MASK;
+		next_id = max(ret + 1, 0);
 		ent->pv_cm_id = (u32)ret;
 		sl_id_map_add(ibdev, ent);
 		list_add_tail(&ent->list, &sriov->cm_list);
--- a/fs/super.c
+++ b/fs/super.c
@@ -842,7 +842,7 @@ int get_anon_bdev(dev_t *p)
 	else if (error)
 		return -EAGAIN;
 
-	if ((dev & MAX_IDR_MASK) == (1 << MINORBITS)) {
+	if (dev == (1 << MINORBITS)) {
 		spin_lock(&unnamed_dev_lock);
 		ida_remove(&unnamed_dev_ida, dev);
 		if (unnamed_dev_start > dev)
--- a/include/linux/idr.h
+++ b/include/linux/idr.h
@@ -38,16 +38,6 @@
 #define IDR_SIZE (1 << IDR_BITS)
 #define IDR_MASK ((1 << IDR_BITS)-1)
 
-#define MAX_IDR_SHIFT (sizeof(int)*8 - 1)
-#define MAX_IDR_BIT (1U << MAX_IDR_SHIFT)
-#define MAX_IDR_MASK (MAX_IDR_BIT - 1)
-
-/* Leave the possibility of an incomplete final layer */
-#define MAX_IDR_LEVEL ((MAX_IDR_SHIFT + IDR_BITS - 1) / IDR_BITS)
-
-/* Number of id_layer structs to leave in free list */
-#define MAX_IDR_FREE (MAX_IDR_LEVEL * 2)
-
 struct idr_layer {
 	unsigned long bitmap; /* A zero bit means "space here" */
 	struct idr_layer __rcu *ary[1<<IDR_BITS];
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -38,6 +38,15 @@
 #include <linux/percpu.h>
 #include <linux/hardirq.h>
 
+#define MAX_IDR_SHIFT (sizeof(int) * 8 - 1)
+#define MAX_IDR_BIT (1U << MAX_IDR_SHIFT)
+
+/* Leave the possibility of an incomplete final layer */
+#define MAX_IDR_LEVEL ((MAX_IDR_SHIFT + IDR_BITS - 1) / IDR_BITS)
+
+/* Number of id_layer structs to leave in free list */
+#define MAX_IDR_FREE (MAX_IDR_LEVEL * 2)
+
 static struct kmem_cache *idr_layer_cache;
 static DEFINE_PER_CPU(struct idr_layer *, idr_preload_head);
 static DEFINE_PER_CPU(int, idr_preload_cnt);
@@ -539,8 +548,8 @@ void idr_remove(struct idr *idp, int id)
 	struct idr_layer *p;
 	struct idr_layer *to_free;
 
-	/* Mask off upper bits we don't use for the search. */
-	id &= MAX_IDR_MASK;
+	if (WARN_ON_ONCE(id < 0))
+		return;
 
 	sub_remove(idp, (idp->layers - 1) * IDR_BITS, id);
 	if (idp->top && idp->top->count == 1 && (idp->layers > 1) &&
@@ -647,14 +656,14 @@ void *idr_find(struct idr *idp, int id)
 	int n;
 	struct idr_layer *p;
 
+	if (WARN_ON_ONCE(id < 0))
+		return NULL;
+
 	p = rcu_dereference_raw(idp->top);
 	if (!p)
 		return NULL;
 	n = (p->layer+1) * IDR_BITS;
 
-	/* Mask off upper bits we don't use for the search. */
-	id &= MAX_IDR_MASK;
-
 	if (id > idr_max(p->layer + 1))
 		return NULL;
 	BUG_ON(n == 0);
@@ -796,14 +805,15 @@ void *idr_replace(struct idr *idp, void
 	int n;
 	struct idr_layer *p, *old_p;
 
+	if (WARN_ON_ONCE(id < 0))
+		return ERR_PTR(-EINVAL);
+
 	p = idp->top;
 	if (!p)
 		return ERR_PTR(-EINVAL);
 
 	n = (p->layer+1) * IDR_BITS;
 
-	id &= MAX_IDR_MASK;
-
 	if (id >= (1 << n))
 		return ERR_PTR(-EINVAL);
 
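To see why the masking removed from idr_find(), idr_replace() and idr_remove() is worse than an outright failure, the following userspace sketch contrasts the two behaviours. It is an illustration only: TOY_MAX_IDR_MASK, masked_id() and checked_id() are hypothetical names, a 32-bit two's complement int is assumed, and returning -EINVAL stands in for the kernel's WARN_ON_ONCE() early exits.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

#define TOY_MAX_IDR_MASK 0x7fffffff	/* same value as the removed MAX_IDR_MASK */

/* Old behaviour: drop the sign bit and carry on with whatever is left. */
static int masked_id(int id)
{
	return id & TOY_MAX_IDR_MASK;
}

/* New behaviour (userspace stand-in): refuse a negative ID outright. */
static int checked_id(int id, int *out)
{
	if (id < 0)
		return -EINVAL;	/* the kernel code warns once and bails out here */
	*out = id;
	return 0;
}
```

Under these assumptions masked_id(INT_MIN) yields 0 and masked_id(-1) yields INT_MAX, so a caller passing a bogus negative ID would silently operate on an unrelated, possibly live, entry. checked_id() leaves the output untouched and reports the error instead.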
MAX_IDR_MASK is another weirdness in the idr interface.  As idr covers
the whole positive integer range, it's defined as 0x7fffffff or INT_MAX.

Its usage in idr_find(), idr_replace() and idr_remove() is bizarre.
They basically mask off the sign bit and operate on the rest, so if
the caller, by accident, passes in a negative number, the sign bit
will be masked off and the remaining part will be used as if that was
the input, which is worse than crashing.

The constant is visible in idr.h and there are several users in the
kernel.

* drivers/i2c/i2c-core.c:i2c_add_numbered_adapter()

  Basically used to test if adap->nr is a negative number which isn't
  -1 and returns -EINVAL if so.  idr_alloc() already has negative
  @start checking (w/ WARN_ON_ONCE), so this can go away.

* drivers/infiniband/core/cm.c:cm_alloc_id()
  drivers/infiniband/hw/mlx4/cm.c:id_map_alloc()

  Used to wrap cyclic @start.  Can be replaced with max(next, 0).
  Note that this type of cyclic allocation using idr is buggy.  These
  are prone to spurious -ENOSPC failure after the first wraparound.

* fs/super.c:get_anon_bdev()

  The ID allocated from ida is masked off before being tested for
  whether it's inside the valid range.  An ida-allocated ID can never
  be a negative number, so the masking is unnecessary.

Update the idr_*() functions to reject a negative @id with
WARN_ON_ONCE() (idr_replace() returns -EINVAL, idr_find() returns NULL
and idr_remove() simply returns), and update the other MAX_IDR_MASK
users as described above.

This leaves MAX_IDR_MASK without any users, so remove it and relocate
the other MAX_IDR_* constants to lib/idr.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jean Delvare <khali@linux-fr.org>
Cc: linux-i2c@vger.kernel.org
Cc: Roland Dreier <roland@kernel.org>
Cc: Sean Hefty <sean.hefty@intel.com>
Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
Cc: "Marciniszyn, Mike" <mike.marciniszyn@intel.com>
Cc: Jack Morgenstein <jackm@dev.mellanox.co.il>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Cc: linux-rdma@vger.kernel.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/i2c/i2c-core.c          |    2 --
 drivers/infiniband/core/cm.c    |    2 +-
 drivers/infiniband/hw/mlx4/cm.c |    2 +-
 fs/super.c                      |    2 +-
 include/linux/idr.h             |   10 ----------
 lib/idr.c                       |   24 +++++++++++++++++-------
 6 files changed, 20 insertions(+), 22 deletions(-)
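The call-site replacements described in the commit message can be checked in isolation. The sketch below is a userspace illustration with hypothetical helper names and assumes a 32-bit int. Note that new_next_id() spells the wrap out explicitly: in portable C, id + 1 overflows when id is INT_MAX, which is undefined behaviour, so writing max(id + 1, 0) literally would only be valid where signed arithmetic is known to wrap.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define TOY_MAX_IDR_MASK 0x7fffffffu	/* same value as the removed MAX_IDR_MASK */

/* Old i2c check: true when a bit outside the mask (the sign bit) is set. */
static bool old_nr_invalid(int nr)
{
	return ((unsigned int)nr & ~TOY_MAX_IDR_MASK) != 0;
}

/* Old cm.c / mlx4 update of the cyclic start: wrap to 0 past INT_MAX. */
static int old_next_id(int id)
{
	return (int)(((unsigned int)id + 1) & TOY_MAX_IDR_MASK);
}

/* Intent of max(id + 1, 0), written without signed overflow. */
static int new_next_id(int id)
{
	return id == INT_MAX ? 0 : id + 1;
}
```

For non-negative IDs the old and new start updates agree, and the old i2c test reduces to "nr is negative", which is the case idr_alloc() already rejects.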