
[RFC,RESEND,5/5] Input: evdev - add new EVIOCGABSRANGE ioctl

Message ID 1405775445-4454-6-git-send-email-dh.herrmann@gmail.com (mailing list archive)
State New, archived

Commit Message

David Herrmann July 19, 2014, 1:10 p.m. UTC
When we introduced the slotted MT ABS extensions, we didn't take care to
make SYN_DROPPED recoverable. Imagine a client receives a SYN_DROPPED and
syncs its current state via EVIOCGABS. It has to call this ioctl for each
and every ABS code separately. Besides being horribly slow, this series
of ioctl-calls is not atomic. The kernel might queue new ABS events while
the client fetches the data.

Now for normal ABS codes this is negligible as ABS values provide absolute
data. That is, there is usually no need to resync ABS codes as we don't
need previous values to interpret the next ABS code. Furthermore, most ABS
codes are also sent pretty frequently so a refresh is usually useless.

However, with the introduction of slotted ABS axes we added a relative
component: ABS_MT_SLOT. If a client syncs all ABS codes via EVIOCGABS
while the kernel has ABS-events and an ABS_MT_SLOT event queued, the
client will think any read ABS-event is for the retrieved SLOT. However,
this is not true, as all events until the next ABS_MT_SLOT event are for
the previously active slot:

    Kernel queue is: { ABS_DROPPED,
                       ABS_MT_POSITION_X(slot: 0),
                       ABS_MT_SLOT(slot: 1),
                       ABS_MT_POSITION_X(slot: 1) }
    Client reads ABS_DROPPED from queue.
    Client syncs all ABS values:
      As part of that, client syncs ABS_MT_SLOT, which is 1 in the current
      view of the kernel.
    Client reads ABS_MT_POSITION_X and attributes it to slot 1 instead of
    slot 0, as the slot-value is not explicit.

This is just a simple example of how the relative information provided by the
ABS_MT_SLOT axis can be problematic for clients.

Now there are many ways to fix this:
 * Make ABS_MT_SLOT a per-evdev-client attribute. On each
   EVIOCGABS(ABS_MT_SLOT) we can add fake ABS_MT_SLOT events to the queue.
   => Ugly and overkill
 * Flush all ABS events when clients read ABS_MT_SLOT.
   => Ugly hack and the client might lose important ABS_MT_* events
 * Provide atomic EVIOCGABS API.
   => Sounds good!

This patch introduces EVIOCGABSRANGE. Unlike EVIOCGABS, this ioctl only
fetches ABS values, rather than the whole "struct input_absinfo" set.
However, the new ioctl can fetch a range of ABS axes atomically and will
flush matching events from the client's receive queue. Moreover, you can
fetch all axes for *all* slots with a single call.

This way, a client can simply run EVIOCGABSRANGE(0, ABS_CNT) and it will
receive a consistent view of the whole ABS state, while the kernel flushes
the matching events from the receive-buffer.
While most clients probably only need
EVIOCGABSRANGE(ABS_MT_SLOT, ABS_MT_TOOL_Y - ABS_MT_SLOT + 1), the ioctl
allows fetching an arbitrary range of axes.
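
For illustration only (not part of the patch), a minimal user-space sketch of
the intended call; EVIOCGABSRANGE and struct input_absrange come from this
patch, everything else (helper name, error handling) is made up:

#include <linux/input.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* fetch the current values of all MT axes for all slots in one call */
static int resync_mt_state(int fd, unsigned int nslots, int32_t *values)
{
        struct input_absrange req = {
                .slots  = nslots,
                .code   = ABS_MT_SLOT,
                .count  = ABS_MT_TOOL_Y - ABS_MT_SLOT + 1,
                .buffer = (uintptr_t)values,
        };

        /* "values" must hold at least req.count * req.slots __s32 entries */
        return ioctl(fd, EVIOCGABSRANGE, &req);
}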

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
---
 drivers/input/evdev.c      | 180 ++++++++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/input.h |  44 ++++++++++-
 2 files changed, 219 insertions(+), 5 deletions(-)

Comments

Peter Hutterer Aug. 6, 2014, 1:35 a.m. UTC | #1
sorry for the late comments, not sure how that slipped through but it hadn't
shown up in my inbox until Benjamin poked me.

On Sat, Jul 19, 2014 at 03:10:45PM +0200, David Herrmann wrote:
> When we introduced the slotted MT ABS extensions, we didn't take care to
> make SYN_DROPPED recoverable. Imagine a client recevies a SYN_DROPPED and
> syncs its current state via EVIOCGABS. It has to call this ioctl for each
> and every ABS code separately. Besides being horribly slow, this series
> of ioctl-calls is not atomic. The kernel might queue new ABS events while
> the client fetches the data.
> 
> Now for normal ABS codes this is negligible as ABS values provide absolute
> data. That is, there is usually no need to resync ABS codes as we don't
> need previous values to interpret the next ABS code. Furthermore, most ABS
> codes are also sent pretty frequently so a refresh is usually useless.
> 
> However, with the introduction of slotted ABS axes we added a relative
> component: ABS_MT_SLOT. If a client syncs all ABS codes via EVIOCGABS
> while the kernel has ABS-events and an ABS_MT_SLOT event queued, the
> client will think any read ABS-event is for the retrieved SLOT, however,
> this is not true as all events until the next ABS_MT_SLOT event are for
> the previously active slot:
> 
>     Kernel queue is: { ABS_DROPPED,

shouldn't this be SYN_DROPPED?

>                        ABS_MT_POSITION_X(slot: 0),
>                        ABS_MT_SLOT(slot: 1),
>                        ABS_MT_POSITION_X(slot: 1) }
>     Client reads ABS_DROPPED from queue.

here too

>     Client syncs all ABS values:
>       As part of that, client syncs ABS_MT_SLOT, which is 1 in the current
>       view of the kernel.
>     Client reads ABS_MT_POSITION_X and attributes it to slot 1 instead of
>     slot 0, as the slot-value is not explicit.
> 
> This is just a simple example how the relative information provided by the
> ABS_MT_SLOT axis can be problematic to clients.
> 
> Now there are many ways to fix this:
>  * Make ABS_MT_SLOT a per-evdev-client attribute. On each
>    EVIOCGABS(ABS_MT_SLOT) we can add fake ABS_MT_SLOT events to the queue.
>    => Ugly and overkill
>  * Flush all ABS events when clients read ABS_MT_SLOT.
>    => Ugly hack and client might loose important ABS_MT_* events
>  * Provide atomic EVIOCGABS API.
>    => Sounds good!
> 
> This patch introduces EVIOCGABSRANGE. Unlike EVIOCGABS, this ioctl only
> fetches ABS values, rather than the whole "struct input_absinfo" set.
> However, the new ioctl can fetch a range of ABS axes atomically and will
> flush matching events from the client's receive queue. Moreover, you can
> fetch all axes for *all* slots with a single call.
> 
> This way, a client can simply run EVIOCGABSRANGE(0, ABS_CNT) and it will
> receive a consistent view of the whole ABS state, while the kernel flushes
> the receive-buffer for a consistent view.
> While most clients probably only need
> EVIOCGABSRANGE(ABS_MT_SLOT, ABS_MT_TOOL_y - ABS_MT_SLOT + 1), the ioctl
> allows to receive an arbitrary range of axes.
> 
> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
> ---
>  drivers/input/evdev.c      | 180 ++++++++++++++++++++++++++++++++++++++++++++-
>  include/uapi/linux/input.h |  44 ++++++++++-
>  2 files changed, 219 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
> index 6386882..7a25a7a 100644
> --- a/drivers/input/evdev.c
> +++ b/drivers/input/evdev.c
> @@ -175,8 +175,9 @@ static bool __evdev_is_filtered(struct evdev_client *client,
>  	return mask && !test_bit(code, mask);
>  }
>  
> -/* flush queued events of type @type, caller must hold client->buffer_lock */
> -static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
> +/* flush queued, matching events, caller must hold client->buffer_lock */
> +static void __evdev_flush_queue(struct evdev_client *client, unsigned int type,
> +				unsigned int code_first, unsigned int code_last)
>  {
>  	unsigned int i, head, num;
>  	unsigned int mask = client->bufsize - 1;
> @@ -195,7 +196,9 @@ static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
>  		ev = &client->buffer[i];
>  		is_report = ev->type == EV_SYN && ev->code == SYN_REPORT;
>  
> -		if (ev->type == type) {
> +		if (ev->type == type &&
> +		    ev->code >= code_first &&
> +		    ev->code <= code_last) {
>  			/* drop matched entry */
>  			continue;
>  		} else if (is_report && !num) {
> @@ -786,6 +789,172 @@ static int handle_eviocgbit(struct input_dev *dev,
>  	return bits_to_user(bits, len, size, p, compat_mode);
>  }
>  
> +static inline void free_absrange(s32 **pages, size_t page_cnt)
> +{
> +	if (page_cnt > 1) {
> +		while (page_cnt > 0) {
> +			if (!pages[--page_cnt])
> +				break;
> +			__free_page(virt_to_page(pages[page_cnt]));
> +		}
> +		kfree(pages);
> +	} else if (page_cnt == 1) {
> +		kfree(pages);
> +	}
> +}
> +
> +static inline s32 *absrange_ptr(s32 **pages, size_t page_cnt, size_t slots,
> +				size_t i_code, size_t j_slot)
> +{
> +	size_t idx, off;
> +
> +	idx = (i_code * slots + j_slot) / (PAGE_SIZE / sizeof(s32));
> +	off = (i_code * slots + j_slot) % (PAGE_SIZE / sizeof(s32));
> +
> +	if (page_cnt == 1)
> +		return &((s32*)pages)[off];
> +	else
> +		return &pages[idx][off];
> +}
> +
> +static inline ssize_t fetch_absrange(struct evdev_client *client,
> +				     struct input_dev *dev, size_t start,
> +				     size_t count, size_t slots, s32 ***out)
> +{
> +	size_t size, page_cnt, i, j;
> +	unsigned long flags;
> +	s32 **pages;
> +
> +	/*
> +	 * Fetch data atomically from the device and flush buffers. We need to
> +	 * allocate a temporary buffer as copy_to_user() is not allowed while
> +	 * holding spinlocks. However, to-be-copied data might be huge and
> +	 * high-order allocations should be avoided. Therefore, do the
> +	 * page-allocation manually.
> +	 */
> +
> +	BUILD_BUG_ON(PAGE_SIZE % sizeof(s32) != 0);
> +
> +	size = sizeof(s32) * count * slots;
> +	page_cnt = DIV_ROUND_UP(size, PAGE_SIZE);
> +	if (page_cnt < 1) {
> +		return 0;
> +	} else if (page_cnt == 1) {
> +		pages = kzalloc(size, GFP_TEMPORARY);
> +		if (!pages)
> +			return -ENOMEM;
> +	} else {
> +		pages = kzalloc(sizeof(*pages) * page_cnt, GFP_TEMPORARY);
> +		if (!pages)
> +			return -ENOMEM;
> +
> +		for (i = 0; i < page_cnt; ++i) {
> +			pages[i] = (void*)get_zeroed_page(GFP_TEMPORARY);
> +			if (!pages[i]) {
> +				free_absrange(pages, page_cnt);
> +				return -ENOMEM;
> +			}
> +		}
> +	}
> +
> +	spin_lock_irqsave(&dev->event_lock, flags);
> +	spin_lock(&client->buffer_lock);
> +
> +	for (i = 0; i < count; ++i) {
> +		__u16 code;
> +		bool is_mt;
> +
> +		code = start + i;
> +		is_mt = input_is_mt_value(code);
> +		if (is_mt && !dev->mt)
> +			continue;
> +
> +		for (j = 0; j < slots; ++j) {
> +			__s32 v;
> +
> +			if (is_mt)
> +				v = input_mt_get_value(&dev->mt->slots[j],
> +						       code);
> +			else
> +				v = dev->absinfo[code].value;
> +
> +			*absrange_ptr(pages, page_cnt, slots, i, j) = v;
> +
> +			if (!is_mt)
> +				break;
> +		}
> +	}
> +
> +	spin_unlock(&client->buffer_lock);
> +	__evdev_flush_queue(client, EV_ABS, start, start + count - 1);
> +	spin_unlock_irqrestore(&dev->event_lock, flags);
> +
> +	*out = pages;
> +	return page_cnt;
> +}
> +
> +static int evdev_handle_get_absrange(struct evdev_client *client,
> +				     struct input_dev *dev,
> +				     struct input_absrange __user *p)
> +{
> +	size_t slots, code, count, i, j;
> +	struct input_absrange absbuf;
> +	s32 **vals = NULL;
> +	ssize_t val_cnt;
> +	s32 __user *b;
> +	int retval;
> +
> +	if (!dev->absinfo)
> +		return -EINVAL;
> +	if (copy_from_user(&absbuf, p, sizeof(absbuf)))
> +		return -EFAULT;
> +
> +	slots = min_t(size_t, dev->mt ? dev->mt->num_slots : 1, absbuf.slots);
> +	code = min_t(size_t, absbuf.code, ABS_CNT);
> +	count = min_t(size_t, absbuf.count, ABS_CNT);
> +
> +	/* first fetch data atomically from device */
> +
> +	if (code + count > ABS_CNT)
> +		count = ABS_CNT - code;
> +
> +	if (!slots || !count) {
> +		val_cnt = 0;
> +	} else {
> +		val_cnt = fetch_absrange(client, dev, code, count,
> +					 slots, &vals);
> +		if (val_cnt < 0)
> +			return val_cnt;
> +	}
> +
> +	/* now copy data to user-space */
> +
> +	b = (void __user*)(unsigned long)absbuf.buffer;
> +	for (i = 0; i < absbuf.count; ++i) {
> +		for (j = 0; j < absbuf.slots; ++j, ++b) {
> +			s32 v;
> +
> +			if (i >= count || j >= slots)
> +				v = 0;
> +			else
> +				v = *absrange_ptr(vals, val_cnt, slots, i, j);
> +
> +			if (put_user(v, b)) {
> +				retval = -EFAULT;
> +				goto out;
> +			}
> +		}
> +	}
> +
> +	retval = 0;
> +
> +out:
> +	free_absrange(vals, val_cnt);
> +	if (retval < 0)
> +		evdev_queue_syn_dropped(client);
> +	return retval;
> +}
> +
>  static int evdev_handle_get_keycode(struct input_dev *dev, void __user *p)
>  {
>  	struct input_keymap_entry ke = {
> @@ -889,7 +1058,7 @@ static int evdev_handle_get_val(struct evdev_client *client,
>  
>  	spin_unlock(&dev->event_lock);
>  
> -	__evdev_flush_queue(client, type);
> +	__evdev_flush_queue(client, type, 0, UINT_MAX);
>  
>  	spin_unlock_irq(&client->buffer_lock);
>  
> @@ -1006,6 +1175,9 @@ static long evdev_do_ioctl(struct file *file, unsigned int cmd,
>  		else
>  			return evdev_revoke(evdev, client, file);
>  
> +	case EVIOCGABSRANGE:
> +		return evdev_handle_get_absrange(client, dev, p);
> +
>  	case EVIOCGMASK:
>  		if (copy_from_user(&mask, p, sizeof(mask)))
>  			return -EFAULT;
> diff --git a/include/uapi/linux/input.h b/include/uapi/linux/input.h
> index f6ace0e..9f851d4 100644
> --- a/include/uapi/linux/input.h
> +++ b/include/uapi/linux/input.h
> @@ -32,7 +32,7 @@ struct input_event {
>   * Protocol version.
>   */
>  
> -#define EV_VERSION		0x010001
> +#define EV_VERSION		0x010002
>  
>  /*
>   * IOCTLs (0x00 - 0x7f)
> @@ -210,6 +210,48 @@ struct input_mask {
>   */
>  #define EVIOCSMASK		_IOW('E', 0x93, struct input_mask)	/* Set event-masks */
>  
> +struct input_absrange {
> +	__u16 slots;
> +	__u16 code;
> +	__u32 count;
> +	__u64 buffer;
> +};
> +
> +/**
> + * EVIOCGABSRANGE - Fetch range of ABS values
> + *
> + * This fetches the current values of a range of ABS codes atomically. The range
> + * of codes to fetch and the buffer-types are passed as "struct input_absrange",
> + * which has the following fields:
> + *      slots: Number of MT slots to fetch data for.
> + *       code: First ABS axis to query.
> + *      count: Number of ABS axes to query starting at @code.
> + *     buffer: Pointer to a receive buffer where to store the fetched ABS
> + *             values. This buffer must be an array of __s32 with at least
> + *             (@slots * @code) elements. The buffer is interpreted as two
> + *             dimensional __s32 array, declared as: __s32[slots][codes]

tbh this seems more complicated than necessary. Have you thought about
just dumping the events into the client buffer as if they came fresh in from
the device? So to sync, the client calls the ioctl with a buffer and a
buffer size, and the kernel simply writes a series of struct input_events
into that buffer, with ABS_MT_SLOT as required for all slots, (optionally?)
followed by a SYN_DROPPED. So the buffer afterwards could look like this:
   EV_ABS ABS_X 30
   EV_ABS ABS_X 1202
   EV_ABS ABS_MT_SLOT 0
   EV_ABS ABS_MT_POSITION_X 30
   EV_ABS ABS_MT_POSITION_Y 1202
   EV_ABS ABS_MT_SLOT 1
   EV_ABS ABS_MT_POSITION_X 80
   EV_ABS ABS_MT_POSITION_Y 1800
   EV_SYN SYN_REPORT 0

the client can then go through and just process the events one-by-one as it
would otherwise with real events.

This approach could be even extended to include EV_KEY, etc. providing a
single ioctl to sync the whole state of the device atomically.
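
A rough sketch of the client side of that idea (purely illustrative; the
helper and how the buffer is obtained are assumptions, the loop is just
ordinary event processing over a buffer like the one above):

#include <stddef.h>
#include <linux/input.h>

static void apply_sync_events(const struct input_event *evs, size_t n)
{
        int slot = 0;
        size_t i;

        for (i = 0; i < n; ++i) {
                const struct input_event *ev = &evs[i];

                if (ev->type == EV_SYN && ev->code == SYN_REPORT)
                        break;                  /* end of the synced state */
                if (ev->type != EV_ABS)
                        continue;
                if (ev->code == ABS_MT_SLOT) {
                        slot = ev->value;       /* switch active slot */
                        continue;
                }
                /* update cached state for (slot, ev->code) = ev->value */
        }
}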

comments?

> + *
> + * Compared to EVIOCGABS this ioctl allows to retrieve a range of ABS codes
> + * atomically regarding any concurrent buffer modifications. Furthermore, any
> + * pending events for codes that were retrived via this call are flushed from

typo: retrieved

Cheers,
   Peter

> + * the client's receive buffer. But unlike EVIOCGABS, this ioctl only returns
> + * the current value of an axis, rather than the whole "struct input_absinfo"
> + * set. All fields of "struct input_absinfo" except for the value are constant,
> + * though.
> + *
> + * The kernel's current view of the ABS axes is copied into the provided buffer.
> + * If an ABS axis is not enabled on the device, its value will be zero. Also, if
> + * an axis is not a slotted MT-axis, values for all but the first slot will be
> + * 0. If @slots is greater than the actual number of slots provided by the
> + * device, values for all slots higher than that will be 0.
> + *
> + * This call may fail with -EINVAL if the kernel doesn't support this call or
> + * the arguments are invalid, with -ENODEV if access was revoked, -ENOMEM if the
> + * kernel couldn't allocate temporary buffers for data-copy or -EFAULT if the
> + * passed pointer was invalid.
> + */
> +#define EVIOCGABSRANGE		_IOR('E', 0x94, struct input_absrange)
> +
>  #define EVIOCSCLOCKID		_IOW('E', 0xa0, int)			/* Set clockid to be used for timestamps */
>  
>  /*
> -- 
> 2.0.2
> 
David Herrmann Aug. 8, 2014, 1:26 p.m. UTC | #2
Hi

On Wed, Aug 6, 2014 at 3:35 AM, Peter Hutterer <peter.hutterer@who-t.net> wrote:
> sorry for the late comments, not sure how that slipped through but it hadn't
> shown up in my inbox unil Benjamin poked me.
>
> On Sat, Jul 19, 2014 at 03:10:45PM +0200, David Herrmann wrote:
>> When we introduced the slotted MT ABS extensions, we didn't take care to
>> make SYN_DROPPED recoverable. Imagine a client recevies a SYN_DROPPED and
>> syncs its current state via EVIOCGABS. It has to call this ioctl for each
>> and every ABS code separately. Besides being horribly slow, this series
>> of ioctl-calls is not atomic. The kernel might queue new ABS events while
>> the client fetches the data.
>>
>> Now for normal ABS codes this is negligible as ABS values provide absolute
>> data. That is, there is usually no need to resync ABS codes as we don't
>> need previous values to interpret the next ABS code. Furthermore, most ABS
>> codes are also sent pretty frequently so a refresh is usually useless.
>>
>> However, with the introduction of slotted ABS axes we added a relative
>> component: ABS_MT_SLOT. If a client syncs all ABS codes via EVIOCGABS
>> while the kernel has ABS-events and an ABS_MT_SLOT event queued, the
>> client will think any read ABS-event is for the retrieved SLOT, however,
>> this is not true as all events until the next ABS_MT_SLOT event are for
>> the previously active slot:
>>
>>     Kernel queue is: { ABS_DROPPED,
>
> shouldn't this be SYN_DROPPED?

Whoops, indeed.

>>                        ABS_MT_POSITION_X(slot: 0),
>>                        ABS_MT_SLOT(slot: 1),
>>                        ABS_MT_POSITION_X(slot: 1) }
>>     Client reads ABS_DROPPED from queue.
>
> here too

Yep!

>>     Client syncs all ABS values:
>>       As part of that, client syncs ABS_MT_SLOT, which is 1 in the current
>>       view of the kernel.
>>     Client reads ABS_MT_POSITION_X and attributes it to slot 1 instead of
>>     slot 0, as the slot-value is not explicit.
>>
>> This is just a simple example how the relative information provided by the
>> ABS_MT_SLOT axis can be problematic to clients.
>>
>> Now there are many ways to fix this:
>>  * Make ABS_MT_SLOT a per-evdev-client attribute. On each
>>    EVIOCGABS(ABS_MT_SLOT) we can add fake ABS_MT_SLOT events to the queue.
>>    => Ugly and overkill
>>  * Flush all ABS events when clients read ABS_MT_SLOT.
>>    => Ugly hack and client might loose important ABS_MT_* events
>>  * Provide atomic EVIOCGABS API.
>>    => Sounds good!
>>
>> This patch introduces EVIOCGABSRANGE. Unlike EVIOCGABS, this ioctl only
>> fetches ABS values, rather than the whole "struct input_absinfo" set.
>> However, the new ioctl can fetch a range of ABS axes atomically and will
>> flush matching events from the client's receive queue. Moreover, you can
>> fetch all axes for *all* slots with a single call.
>>
>> This way, a client can simply run EVIOCGABSRANGE(0, ABS_CNT) and it will
>> receive a consistent view of the whole ABS state, while the kernel flushes
>> the receive-buffer for a consistent view.
>> While most clients probably only need
>> EVIOCGABSRANGE(ABS_MT_SLOT, ABS_MT_TOOL_y - ABS_MT_SLOT + 1), the ioctl
>> allows to receive an arbitrary range of axes.
>>
>> Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
>> ---
>>  drivers/input/evdev.c      | 180 ++++++++++++++++++++++++++++++++++++++++++++-
>>  include/uapi/linux/input.h |  44 ++++++++++-
>>  2 files changed, 219 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
>> index 6386882..7a25a7a 100644
>> --- a/drivers/input/evdev.c
>> +++ b/drivers/input/evdev.c
>> @@ -175,8 +175,9 @@ static bool __evdev_is_filtered(struct evdev_client *client,
>>       return mask && !test_bit(code, mask);
>>  }
>>
>> -/* flush queued events of type @type, caller must hold client->buffer_lock */
>> -static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
>> +/* flush queued, matching events, caller must hold client->buffer_lock */
>> +static void __evdev_flush_queue(struct evdev_client *client, unsigned int type,
>> +                             unsigned int code_first, unsigned int code_last)
>>  {
>>       unsigned int i, head, num;
>>       unsigned int mask = client->bufsize - 1;
>> @@ -195,7 +196,9 @@ static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
>>               ev = &client->buffer[i];
>>               is_report = ev->type == EV_SYN && ev->code == SYN_REPORT;
>>
>> -             if (ev->type == type) {
>> +             if (ev->type == type &&
>> +                 ev->code >= code_first &&
>> +                 ev->code <= code_last) {
>>                       /* drop matched entry */
>>                       continue;
>>               } else if (is_report && !num) {
>> @@ -786,6 +789,172 @@ static int handle_eviocgbit(struct input_dev *dev,
>>       return bits_to_user(bits, len, size, p, compat_mode);
>>  }
>>
>> +static inline void free_absrange(s32 **pages, size_t page_cnt)
>> +{
>> +     if (page_cnt > 1) {
>> +             while (page_cnt > 0) {
>> +                     if (!pages[--page_cnt])
>> +                             break;
>> +                     __free_page(virt_to_page(pages[page_cnt]));
>> +             }
>> +             kfree(pages);
>> +     } else if (page_cnt == 1) {
>> +             kfree(pages);
>> +     }
>> +}
>> +
>> +static inline s32 *absrange_ptr(s32 **pages, size_t page_cnt, size_t slots,
>> +                             size_t i_code, size_t j_slot)
>> +{
>> +     size_t idx, off;
>> +
>> +     idx = (i_code * slots + j_slot) / (PAGE_SIZE / sizeof(s32));
>> +     off = (i_code * slots + j_slot) % (PAGE_SIZE / sizeof(s32));
>> +
>> +     if (page_cnt == 1)
>> +             return &((s32*)pages)[off];
>> +     else
>> +             return &pages[idx][off];
>> +}
>> +
>> +static inline ssize_t fetch_absrange(struct evdev_client *client,
>> +                                  struct input_dev *dev, size_t start,
>> +                                  size_t count, size_t slots, s32 ***out)
>> +{
>> +     size_t size, page_cnt, i, j;
>> +     unsigned long flags;
>> +     s32 **pages;
>> +
>> +     /*
>> +      * Fetch data atomically from the device and flush buffers. We need to
>> +      * allocate a temporary buffer as copy_to_user() is not allowed while
>> +      * holding spinlocks. However, to-be-copied data might be huge and
>> +      * high-order allocations should be avoided. Therefore, do the
>> +      * page-allocation manually.
>> +      */
>> +
>> +     BUILD_BUG_ON(PAGE_SIZE % sizeof(s32) != 0);
>> +
>> +     size = sizeof(s32) * count * slots;
>> +     page_cnt = DIV_ROUND_UP(size, PAGE_SIZE);
>> +     if (page_cnt < 1) {
>> +             return 0;
>> +     } else if (page_cnt == 1) {
>> +             pages = kzalloc(size, GFP_TEMPORARY);
>> +             if (!pages)
>> +                     return -ENOMEM;
>> +     } else {
>> +             pages = kzalloc(sizeof(*pages) * page_cnt, GFP_TEMPORARY);
>> +             if (!pages)
>> +                     return -ENOMEM;
>> +
>> +             for (i = 0; i < page_cnt; ++i) {
>> +                     pages[i] = (void*)get_zeroed_page(GFP_TEMPORARY);
>> +                     if (!pages[i]) {
>> +                             free_absrange(pages, page_cnt);
>> +                             return -ENOMEM;
>> +                     }
>> +             }
>> +     }
>> +
>> +     spin_lock_irqsave(&dev->event_lock, flags);
>> +     spin_lock(&client->buffer_lock);
>> +
>> +     for (i = 0; i < count; ++i) {
>> +             __u16 code;
>> +             bool is_mt;
>> +
>> +             code = start + i;
>> +             is_mt = input_is_mt_value(code);
>> +             if (is_mt && !dev->mt)
>> +                     continue;
>> +
>> +             for (j = 0; j < slots; ++j) {
>> +                     __s32 v;
>> +
>> +                     if (is_mt)
>> +                             v = input_mt_get_value(&dev->mt->slots[j],
>> +                                                    code);
>> +                     else
>> +                             v = dev->absinfo[code].value;
>> +
>> +                     *absrange_ptr(pages, page_cnt, slots, i, j) = v;
>> +
>> +                     if (!is_mt)
>> +                             break;
>> +             }
>> +     }
>> +
>> +     spin_unlock(&client->buffer_lock);
>> +     __evdev_flush_queue(client, EV_ABS, start, start + count - 1);
>> +     spin_unlock_irqrestore(&dev->event_lock, flags);
>> +
>> +     *out = pages;
>> +     return page_cnt;
>> +}
>> +
>> +static int evdev_handle_get_absrange(struct evdev_client *client,
>> +                                  struct input_dev *dev,
>> +                                  struct input_absrange __user *p)
>> +{
>> +     size_t slots, code, count, i, j;
>> +     struct input_absrange absbuf;
>> +     s32 **vals = NULL;
>> +     ssize_t val_cnt;
>> +     s32 __user *b;
>> +     int retval;
>> +
>> +     if (!dev->absinfo)
>> +             return -EINVAL;
>> +     if (copy_from_user(&absbuf, p, sizeof(absbuf)))
>> +             return -EFAULT;
>> +
>> +     slots = min_t(size_t, dev->mt ? dev->mt->num_slots : 1, absbuf.slots);
>> +     code = min_t(size_t, absbuf.code, ABS_CNT);
>> +     count = min_t(size_t, absbuf.count, ABS_CNT);
>> +
>> +     /* first fetch data atomically from device */
>> +
>> +     if (code + count > ABS_CNT)
>> +             count = ABS_CNT - code;
>> +
>> +     if (!slots || !count) {
>> +             val_cnt = 0;
>> +     } else {
>> +             val_cnt = fetch_absrange(client, dev, code, count,
>> +                                      slots, &vals);
>> +             if (val_cnt < 0)
>> +                     return val_cnt;
>> +     }
>> +
>> +     /* now copy data to user-space */
>> +
>> +     b = (void __user*)(unsigned long)absbuf.buffer;
>> +     for (i = 0; i < absbuf.count; ++i) {
>> +             for (j = 0; j < absbuf.slots; ++j, ++b) {
>> +                     s32 v;
>> +
>> +                     if (i >= count || j >= slots)
>> +                             v = 0;
>> +                     else
>> +                             v = *absrange_ptr(vals, val_cnt, slots, i, j);
>> +
>> +                     if (put_user(v, b)) {
>> +                             retval = -EFAULT;
>> +                             goto out;
>> +                     }
>> +             }
>> +     }
>> +
>> +     retval = 0;
>> +
>> +out:
>> +     free_absrange(vals, val_cnt);
>> +     if (retval < 0)
>> +             evdev_queue_syn_dropped(client);
>> +     return retval;
>> +}
>> +
>>  static int evdev_handle_get_keycode(struct input_dev *dev, void __user *p)
>>  {
>>       struct input_keymap_entry ke = {
>> @@ -889,7 +1058,7 @@ static int evdev_handle_get_val(struct evdev_client *client,
>>
>>       spin_unlock(&dev->event_lock);
>>
>> -     __evdev_flush_queue(client, type);
>> +     __evdev_flush_queue(client, type, 0, UINT_MAX);
>>
>>       spin_unlock_irq(&client->buffer_lock);
>>
>> @@ -1006,6 +1175,9 @@ static long evdev_do_ioctl(struct file *file, unsigned int cmd,
>>               else
>>                       return evdev_revoke(evdev, client, file);
>>
>> +     case EVIOCGABSRANGE:
>> +             return evdev_handle_get_absrange(client, dev, p);
>> +
>>       case EVIOCGMASK:
>>               if (copy_from_user(&mask, p, sizeof(mask)))
>>                       return -EFAULT;
>> diff --git a/include/uapi/linux/input.h b/include/uapi/linux/input.h
>> index f6ace0e..9f851d4 100644
>> --- a/include/uapi/linux/input.h
>> +++ b/include/uapi/linux/input.h
>> @@ -32,7 +32,7 @@ struct input_event {
>>   * Protocol version.
>>   */
>>
>> -#define EV_VERSION           0x010001
>> +#define EV_VERSION           0x010002
>>
>>  /*
>>   * IOCTLs (0x00 - 0x7f)
>> @@ -210,6 +210,48 @@ struct input_mask {
>>   */
>>  #define EVIOCSMASK           _IOW('E', 0x93, struct input_mask)      /* Set event-masks */
>>
>> +struct input_absrange {
>> +     __u16 slots;
>> +     __u16 code;
>> +     __u32 count;
>> +     __u64 buffer;
>> +};
>> +
>> +/**
>> + * EVIOCGABSRANGE - Fetch range of ABS values
>> + *
>> + * This fetches the current values of a range of ABS codes atomically. The range
>> + * of codes to fetch and the buffer-types are passed as "struct input_absrange",
>> + * which has the following fields:
>> + *      slots: Number of MT slots to fetch data for.
>> + *       code: First ABS axis to query.
>> + *      count: Number of ABS axes to query starting at @code.
>> + *     buffer: Pointer to a receive buffer where to store the fetched ABS
>> + *             values. This buffer must be an array of __s32 with at least
>> + *             (@slots * @code) elements. The buffer is interpreted as two
>> + *             dimensional __s32 array, declared as: __s32[slots][codes]
>
> tbh this seems more complicated than necessary. Have you thought about
> just dumping the events into the client buffer as if they came fresh in from
> the device? So to sync, the client calls the ioctl with a buffer and a
> buffer size, and the kernel simply writes a series of struct input_events
> into that buffer, with ABS_MT_SLOT as required for all slots, (optionally?)
> followed by a SYN_DROPPED. So the buffer afterwards could look like this:
>    EV_ABS ABS_X 30
>    EV_ABS ABS_X 1202
>    EV_ABS ABS_MT_SLOT 0
>    EV_ABS ABS_MT_POSITION_X 30
>    EV_ABS ABS_MT_POSITION_Y 1202
>    EV_ABS ABS_MT_SLOT 1
>    EV_ABS ABS_MT_POSITION_X 80
>    EV_ABS ABS_MT_POSITION_Y 1800
>    EV_SYN SYN_REPORT 0
>
> the client can then go through and just process the events on-by-one as it
> would otherwise with real events.
>
> This approach could be even extended to include EV_KEY, etc. providing a
> single ioctl to sync the whole state of the device atomically.
>
> comments?

So you mean instead of passing a __s32 array we pass a "struct
input_event" array and write it there? So bypassing the receive-queue?
That does sound quite nice, indeed. We could replace all the other
"sync" ioctls and just provide a way to receive all the events
directly.

Something like:

EVIOCQUERY(struct input_query)

struct input_query {
        __u16 type;
        __u16 start_code;
        __u16 end_code;
        __u16 slots;

        struct input_event buffer[];
};

This way, you specify the event type as "type", the start/end code and
the kernel copies the queried events into "buffer". For ABS we need
the extra "slots" variable, so you can query all slots atomically.
I think I will give it a try. I like the generic touch it has.
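
A purely hypothetical user-space sketch of that idea; struct input_query is
only the proposal above, EVIOCQUERY does not exist (the ioctl number below is
a placeholder) and how the buffer size is communicated is still unspecified:

#include <linux/input.h>
#include <stdlib.h>
#include <sys/ioctl.h>

struct input_query {                    /* as proposed above */
        __u16 type;
        __u16 start_code;
        __u16 end_code;
        __u16 slots;

        struct input_event buffer[];
};

#define EVIOCQUERY _IOWR('E', 0xa1, struct input_query) /* placeholder number */

static struct input_query *query_mt_state(int fd, unsigned int nslots,
                                          size_t max_events)
{
        /* caller picks a maximum event count; size negotiation is left open */
        struct input_query *q;

        q = calloc(1, sizeof(*q) + max_events * sizeof(struct input_event));
        if (!q)
                return NULL;

        q->type = EV_ABS;
        q->start_code = ABS_MT_SLOT;
        q->end_code = ABS_MT_TOOL_Y;
        q->slots = nslots;

        if (ioctl(fd, EVIOCQUERY, q) < 0) {     /* hypothetical ioctl */
                free(q);
                return NULL;
        }

        return q;                               /* q->buffer[] holds the events */
}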

Btw., I wonder whether it is cheaper to use get_user_pages() on the
receive buffer instead of allocating temporary pages, and then mapping
them temporarily for direct access. Hm... stupid huge buffers..

Thanks
David
Dmitry Torokhov Aug. 8, 2014, 5:47 p.m. UTC | #3
On Fri, Aug 08, 2014 at 03:26:56PM +0200, David Herrmann wrote:
> Hi
> 
> On Wed, Aug 6, 2014 at 3:35 AM, Peter Hutterer <peter.hutterer@who-t.net> wrote:
> >> +
> >> +/**
> >> + * EVIOCGABSRANGE - Fetch range of ABS values
> >> + *
> >> + * This fetches the current values of a range of ABS codes atomically. The range
> >> + * of codes to fetch and the buffer-types are passed as "struct input_absrange",
> >> + * which has the following fields:
> >> + *      slots: Number of MT slots to fetch data for.
> >> + *       code: First ABS axis to query.
> >> + *      count: Number of ABS axes to query starting at @code.
> >> + *     buffer: Pointer to a receive buffer where to store the fetched ABS
> >> + *             values. This buffer must be an array of __s32 with at least
> >> + *             (@slots * @code) elements. The buffer is interpreted as two
> >> + *             dimensional __s32 array, declared as: __s32[slots][codes]
> >
> > tbh this seems more complicated than necessary. Have you thought about
> > just dumping the events into the client buffer as if they came fresh in from
> > the device? So to sync, the client calls the ioctl with a buffer and a
> > buffer size, and the kernel simply writes a series of struct input_events
> > into that buffer, with ABS_MT_SLOT as required for all slots, (optionally?)
> > followed by a SYN_DROPPED. So the buffer afterwards could look like this:
> >    EV_ABS ABS_X 30
> >    EV_ABS ABS_X 1202
> >    EV_ABS ABS_MT_SLOT 0
> >    EV_ABS ABS_MT_POSITION_X 30
> >    EV_ABS ABS_MT_POSITION_Y 1202
> >    EV_ABS ABS_MT_SLOT 1
> >    EV_ABS ABS_MT_POSITION_X 80
> >    EV_ABS ABS_MT_POSITION_Y 1800
> >    EV_SYN SYN_REPORT 0
> >
> > the client can then go through and just process the events on-by-one as it
> > would otherwise with real events.
> >
> > This approach could be even extended to include EV_KEY, etc. providing a
> > single ioctl to sync the whole state of the device atomically.
> >
> > comments?

I like it.

> 
> So you mean instead of passing a __32 array we pass a "struct
> input_event" array and write it there? So bypassing the receive-queue?
> That does sound quite nice, indeed. We could replace all the other
> "sync" ioctls and just provide a way to receive all the events
> directly.
> 
> Something like:
> 
> EVIOCQUERY(struct input_query)
> 
> struct input_query {
>         __u16 type;
>         __u16 start_code;
>         __u16 end_code;
>         __u16 slots;
> 
>         struct input_event buffer[];
> };

No, it is more like EVIOCRESYNC(void), which makes the input core dump all
existing state into the client's standard event queue so that there is no
need to reconcile/reconstruct anything. We could add a new SYN marker
to indicate the end-of-state boundary.

Thanks.
David Herrmann Aug. 10, 2014, 3:21 p.m. UTC | #4
Hi

On Fri, Aug 8, 2014 at 7:47 PM, Dmitry Torokhov
<dmitry.torokhov@gmail.com> wrote:
> On Fri, Aug 08, 2014 at 03:26:56PM +0200, David Herrmann wrote:
>> Hi
>>
>> On Wed, Aug 6, 2014 at 3:35 AM, Peter Hutterer <peter.hutterer@who-t.net> wrote:
>> >> +
>> >> +/**
>> >> + * EVIOCGABSRANGE - Fetch range of ABS values
>> >> + *
>> >> + * This fetches the current values of a range of ABS codes atomically. The range
>> >> + * of codes to fetch and the buffer-types are passed as "struct input_absrange",
>> >> + * which has the following fields:
>> >> + *      slots: Number of MT slots to fetch data for.
>> >> + *       code: First ABS axis to query.
>> >> + *      count: Number of ABS axes to query starting at @code.
>> >> + *     buffer: Pointer to a receive buffer where to store the fetched ABS
>> >> + *             values. This buffer must be an array of __s32 with at least
>> >> + *             (@slots * @code) elements. The buffer is interpreted as two
>> >> + *             dimensional __s32 array, declared as: __s32[slots][codes]
>> >
>> > tbh this seems more complicated than necessary. Have you thought about
>> > just dumping the events into the client buffer as if they came fresh in from
>> > the device? So to sync, the client calls the ioctl with a buffer and a
>> > buffer size, and the kernel simply writes a series of struct input_events
>> > into that buffer, with ABS_MT_SLOT as required for all slots, (optionally?)
>> > followed by a SYN_DROPPED. So the buffer afterwards could look like this:
>> >    EV_ABS ABS_X 30
>> >    EV_ABS ABS_X 1202
>> >    EV_ABS ABS_MT_SLOT 0
>> >    EV_ABS ABS_MT_POSITION_X 30
>> >    EV_ABS ABS_MT_POSITION_Y 1202
>> >    EV_ABS ABS_MT_SLOT 1
>> >    EV_ABS ABS_MT_POSITION_X 80
>> >    EV_ABS ABS_MT_POSITION_Y 1800
>> >    EV_SYN SYN_REPORT 0
>> >
>> > the client can then go through and just process the events on-by-one as it
>> > would otherwise with real events.
>> >
>> > This approach could be even extended to include EV_KEY, etc. providing a
>> > single ioctl to sync the whole state of the device atomically.
>> >
>> > comments?
>
> I like it.
>
>>
>> So you mean instead of passing a __32 array we pass a "struct
>> input_event" array and write it there? So bypassing the receive-queue?
>> That does sound quite nice, indeed. We could replace all the other
>> "sync" ioctls and just provide a way to receive all the events
>> directly.
>>
>> Something like:
>>
>> EVIOCQUERY(struct input_query)
>>
>> struct input_query {
>>         __u16 type;
>>         __u16 start_code;
>>         __u16 end_code;
>>         __u16 slots;
>>
>>         struct input_event buffer[];
>> };
>
> No, it is more like EVIOCRESYNC(void) which makes input core to dump all
> existing state into the client's standard event queue so that here is no
> need to reconcile/reconstruct anything. We could give a new SYN marker
> to indicate end-of-state boundary.

This doesn't make sense to me, for two reasons:
 * Events received during re-sync are usually handled differently than
   normal events. For instance, you *must* never handle key events
   from re-syncs for shortcuts, because you don't know the order
   they were pressed in. If we put sync events in the queue, users
   cannot know where an event came from.
   Inserting a SYNC marker is ridiculous, because following normal
   events may drop those (via SYN_DROPPED). We'd have to
   protect it. Furthermore, it's awful to handle this from user-space,
   because you have no idea how many events are sent as part
   of the sync.

 * The receive-queue is usually too small to hold all those events.
   I mean, evdev uses EVDEV_BUF_PACKETS as number of buffer
   slots (defined as 8!). You cannot even hold a whole keyboard
   state there. Yes, usually it should be enough, but re-syncing
   is about handling corner-cases. If we make SYN_DROPPED
   handling cause SYN_DROPPED, we can just ignore it.

I understand why EVIOCRESYNC(void) is tempting. It avoids all the
complexity of my patch and makes all the other sync-ioctls we have so
far obsolete. But the reason we want it, is to avoid shortcomings of
the limited receive-queue. I don't think any solution involving the
receive-queue is going to work out. Even if we always make sure the
receive-queue is big enough, we might still get new events coming in
before user-space can drain it.

How about this:

struct input_resync {
        __u64 count;
        struct input_event buffer[];
};

EVIOCSYNC(struct input_resync) copies the current state of all
available event-types of the device into the buffer. It returns the
number of events that it wants to write (so user-space can pass
count==0 the first time to figure out the required buffer-size). The
ioctl then flushes the _whole_ input queue.
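
A sketch of that two-call pattern; EVIOCSYNC, its ioctl number and struct
input_resync are proposals here, not existing UAPI:

#include <linux/input.h>
#include <stdlib.h>
#include <sys/ioctl.h>

struct input_resync {                   /* as proposed above */
        __u64 count;
        struct input_event buffer[];
};

#define EVIOCSYNC _IOWR('E', 0xa2, struct input_resync) /* placeholder number */

static struct input_resync *fetch_state(int fd)
{
        struct input_resync probe = { .count = 0 };
        struct input_resync *res;
        int n;

        /* first call: count == 0 just reports how many events are needed */
        n = ioctl(fd, EVIOCSYNC, &probe);
        if (n < 0)
                return NULL;

        res = calloc(1, sizeof(*res) + n * sizeof(struct input_event));
        if (!res)
                return NULL;
        res->count = n;

        /* second call: fills res->buffer[] and flushes the whole queue */
        if (ioctl(fd, EVIOCSYNC, res) < 0) {
                free(res);
                return NULL;
        }

        return res;
}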

Note that even if we used the client's receive buffer, we'd have to
copy huge amounts of events while holding the event lock. So both
ideas have to lock the buffers during the operation. The downside of
my solution is that we have to allocate temporary buffers as we cannot
copy to user-space while holding spin-locks. However, my code already
makes sure we don't do high-order allocations, so I think this is
fine.

This ioctl would also allow us to get rid of all the other QUERY
ioctls. User-space can use it to initialize its state and then update
it according to incoming events. This way, we simplify the API
considerably! I really like this idea. I know, we only need this to
handle corner-cases. But as Peter already said, it's really easy to
trigger and ignoring it causes awful bugs in user-space.

Comments?
David
Peter Hutterer Aug. 10, 2014, 11:17 p.m. UTC | #5
On Sun, Aug 10, 2014 at 05:21:47PM +0200, David Herrmann wrote:
> Hi
> 
> On Fri, Aug 8, 2014 at 7:47 PM, Dmitry Torokhov
> <dmitry.torokhov@gmail.com> wrote:
> > On Fri, Aug 08, 2014 at 03:26:56PM +0200, David Herrmann wrote:
> >> Hi
> >>
> >> On Wed, Aug 6, 2014 at 3:35 AM, Peter Hutterer <peter.hutterer@who-t.net> wrote:
> >> >> +
> >> >> +/**
> >> >> + * EVIOCGABSRANGE - Fetch range of ABS values
> >> >> + *
> >> >> + * This fetches the current values of a range of ABS codes atomically. The range
> >> >> + * of codes to fetch and the buffer-types are passed as "struct input_absrange",
> >> >> + * which has the following fields:
> >> >> + *      slots: Number of MT slots to fetch data for.
> >> >> + *       code: First ABS axis to query.
> >> >> + *      count: Number of ABS axes to query starting at @code.
> >> >> + *     buffer: Pointer to a receive buffer where to store the fetched ABS
> >> >> + *             values. This buffer must be an array of __s32 with at least
> >> >> + *             (@slots * @code) elements. The buffer is interpreted as two
> >> >> + *             dimensional __s32 array, declared as: __s32[slots][codes]
> >> >
> >> > tbh this seems more complicated than necessary. Have you thought about
> >> > just dumping the events into the client buffer as if they came fresh in from
> >> > the device? So to sync, the client calls the ioctl with a buffer and a
> >> > buffer size, and the kernel simply writes a series of struct input_events
> >> > into that buffer, with ABS_MT_SLOT as required for all slots, (optionally?)
> >> > followed by a SYN_DROPPED. So the buffer afterwards could look like this:
> >> >    EV_ABS ABS_X 30
> >> >    EV_ABS ABS_X 1202
> >> >    EV_ABS ABS_MT_SLOT 0
> >> >    EV_ABS ABS_MT_POSITION_X 30
> >> >    EV_ABS ABS_MT_POSITION_Y 1202
> >> >    EV_ABS ABS_MT_SLOT 1
> >> >    EV_ABS ABS_MT_POSITION_X 80
> >> >    EV_ABS ABS_MT_POSITION_Y 1800
> >> >    EV_SYN SYN_REPORT 0
> >> >
> >> > the client can then go through and just process the events on-by-one as it
> >> > would otherwise with real events.
> >> >
> >> > This approach could be even extended to include EV_KEY, etc. providing a
> >> > single ioctl to sync the whole state of the device atomically.
> >> >
> >> > comments?
> >
> > I like it.
> >
> >>
> >> So you mean instead of passing a __32 array we pass a "struct
> >> input_event" array and write it there? So bypassing the receive-queue?
> >> That does sound quite nice, indeed. We could replace all the other
> >> "sync" ioctls and just provide a way to receive all the events
> >> directly.
> >>
> >> Something like:
> >>
> >> EVIOCQUERY(struct input_query)
> >>
> >> struct input_query {
> >>         __u16 type;
> >>         __u16 start_code;
> >>         __u16 end_code;
> >>         __u16 slots;
> >>
> >>         struct input_event buffer[];
> >> };
> >
> > No, it is more like EVIOCRESYNC(void) which makes input core to dump all
> > existing state into the client's standard event queue so that here is no
> > need to reconcile/reconstruct anything. We could give a new SYN marker
> > to indicate end-of-state boundary.
> 
> This doesn't make sense to me, for two reasons:
>  * Events received during re-sync are usually handled differently than
>    normal events. For instance, you *must* never handle key events
>    from re-syncs for shortcuts, because you don't know the order
>    they were pressed in. 

I think you're vastly overestimating what the majority of userspace does ;)
both the xorg and the libinput-based stacks pretty much ignore this at the
moment and by the time we handle shortcuts we're already one or more APIs
away from the SYN_DROPPED handling.

also, I haven't seen SYN_DROPPED issues from keyboard events. They usually
happen only on absolute touch devices and ironically enough that's the one
case where the kernel currently doesn't allow for a race-less resync.

>    If we put sync events in the queue, users cannot know where an event came
>    from.
>    Inserting a SYNC marker is ridiculous, because following normal
>    events may drop those (via SYN_DROPPED). We'd have to
>    protect it. Furthermore, it's awful to handle this from user-space,
>    because you have no idea how many events are sent as part
>    of the sync.

I suspect Dmitry meant a new EV_SYN code. From a userspace POV, I'd know
that after the ioctl() all events up until SYN_SYNC_DONE are the syncing
events. That would work well and is coincidentally almost identical to the
libevdev public API.
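
For illustration, a sketch of that userspace flow; EVIOCRESYNC and
SYN_SYNC_DONE are placeholders here (neither exists), the point is only that
the state dump arrives through the normal read() path:

#include <linux/input.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define EVIOCRESYNC   _IO('E', 0xa3)    /* placeholder, not real UAPI */
#define SYN_SYNC_DONE 4                 /* placeholder code, not real UAPI */

static void resync_via_queue(int fd)
{
        struct input_event ev;

        if (ioctl(fd, EVIOCRESYNC) < 0)
                return;

        /* the dumped state is read like any other events */
        while (read(fd, &ev, sizeof(ev)) == sizeof(ev)) {
                if (ev.type == EV_SYN && ev.code == SYN_SYNC_DONE)
                        break;          /* end of the synced state */
                /* apply ev to the cached device state */
        }
}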

In libevdev we already have code to calculate the maximum number of events
generated by a device for the current SYN_DROPPED handling. And you can reduce the
number of events by only sending those in a non-zero state for EV_KEY,
EV_LED, etc.

IMO the real issue preventing this approach is:

>  * The receive-queue is usually too small to hold all those events.
>    I mean, evdev uses EVDEV_BUF_PACKETS as number of buffer
>    slots (defined as 8!). You cannot even hold a whole keyboard
>    state there. Yes, usually it should be enough, but re-syncing
>    is about handling corner-cases. If we make SYN_DROPPED
>    handling cause SYN_DROPPED, we can just ignore it.

yep, that too was my first thought. With a plain resync ioctl you're pretty
much guaranteed to get SYN_DROPPED before the client manages to handle the
resync, even if you reduce the number of events as above, because the most
common occurrence of SYN_DROPPED is in the ABS_MT range, which we cannot
meaningfully reduce.

> I understand why EVIORESYNC(void) is tempting. It avoids all the
> complexity of my patch and makes all the other sync-ioctls we have so
> far obsolete. But the reason we want it, is to avoid shortcomings of
> the limited receive-queue. I don't think any solution involving the
> receive-queue is going to work out. Even if we always make sure the
> receive-queue is big enough, we might still get new events coming in
> before user-space can drain it.
> 
> How about this:
> 
> struct input_resync {
>         __u64 count;
>         struct input_event buffer[];
> };
> 
> EVIOCSYNC(struct input_resync) copies the current state of all
> available event-types of the device into the buffer. It returns the
> number of events that it wants to write (so user-space can pass
> count==0 the first time to figure out the required buffer-size). The
> ioctl then flushed the _whole_ input queue.

if you go with the suggestion above of only putting events in with a
non-zero state, then a count of 0 is guaranteed to return the wrong number
since you may get events between the two ioctl calls. I don't think
the double ioctl() call is a case we need; as I said above, the case of
all events must already be handled in userspace anyway.

what's the reason you don't use an EVIOCSYNC(len) approach btw? A quick check
shows 13 or 14 bits for len, which should be enough for pretty much any
device for a long time.

> Note that even if we used the client's receive buffer, we'd have to
> copy huge amounts of events while holding the event lock. So both
> ideas have to lock the buffers during the operation. The downside of
> my solution is that we have to allocate temporary buffers as we cannot
> copy to user-space while holding spin-locks. However, my code already
> makes sure we don't do high-order allocations, so I think this is
> fine.
> 
> This ioctl would also allow us to get rid of all the other QUERY
> ioctls. User-space can use it to initialize it's state and then update
> it according to incoming events. This way, we simplify the API
> considerably! I really like this idea. I know, we only need this to
> handle corner-cases. But as Peter already said, it's really easy to
> trigger and ignoring it causes awful bugs in user-space.

summary from my side: I'd prefer EVIOCSYNC() with a buffer parameter, only
sending nonzero state for key/led/sw/etc. and the full state for abs. no
need for a new EV_SYN code, and provided it empties the whole queue we
should be race-free.

Cheers,
   Peter
Dmitry Torokhov Aug. 11, 2014, 12:01 a.m. UTC | #6
On Mon, Aug 11, 2014 at 09:17:59AM +1000, Peter Hutterer wrote:
> On Sun, Aug 10, 2014 at 05:21:47PM +0200, David Herrmann wrote:
> > Hi
> > 
> > On Fri, Aug 8, 2014 at 7:47 PM, Dmitry Torokhov
> > <dmitry.torokhov@gmail.com> wrote:
> > > On Fri, Aug 08, 2014 at 03:26:56PM +0200, David Herrmann wrote:
> > >> Hi
> > >>
> > >> On Wed, Aug 6, 2014 at 3:35 AM, Peter Hutterer <peter.hutterer@who-t.net> wrote:
> > >> >> +
> > >> >> +/**
> > >> >> + * EVIOCGABSRANGE - Fetch range of ABS values
> > >> >> + *
> > >> >> + * This fetches the current values of a range of ABS codes atomically. The range
> > >> >> + * of codes to fetch and the buffer-types are passed as "struct input_absrange",
> > >> >> + * which has the following fields:
> > >> >> + *      slots: Number of MT slots to fetch data for.
> > >> >> + *       code: First ABS axis to query.
> > >> >> + *      count: Number of ABS axes to query starting at @code.
> > >> >> + *     buffer: Pointer to a receive buffer where to store the fetched ABS
> > >> >> + *             values. This buffer must be an array of __s32 with at least
> > >> >> + *             (@slots * @code) elements. The buffer is interpreted as two
> > >> >> + *             dimensional __s32 array, declared as: __s32[slots][codes]
> > >> >
> > >> > tbh this seems more complicated than necessary. Have you thought about
> > >> > just dumping the events into the client buffer as if they came fresh in from
> > >> > the device? So to sync, the client calls the ioctl with a buffer and a
> > >> > buffer size, and the kernel simply writes a series of struct input_events
> > >> > into that buffer, with ABS_MT_SLOT as required for all slots, (optionally?)
> > >> > followed by a SYN_DROPPED. So the buffer afterwards could look like this:
> > >> >    EV_ABS ABS_X 30
> > >> >    EV_ABS ABS_X 1202
> > >> >    EV_ABS ABS_MT_SLOT 0
> > >> >    EV_ABS ABS_MT_POSITION_X 30
> > >> >    EV_ABS ABS_MT_POSITION_Y 1202
> > >> >    EV_ABS ABS_MT_SLOT 1
> > >> >    EV_ABS ABS_MT_POSITION_X 80
> > >> >    EV_ABS ABS_MT_POSITION_Y 1800
> > >> >    EV_SYN SYN_REPORT 0
> > >> >
> > >> > the client can then go through and just process the events on-by-one as it
> > >> > would otherwise with real events.
> > >> >
> > >> > This approach could be even extended to include EV_KEY, etc. providing a
> > >> > single ioctl to sync the whole state of the device atomically.
> > >> >
> > >> > comments?
> > >
> > > I like it.
> > >
> > >>
> > >> So you mean instead of passing a __32 array we pass a "struct
> > >> input_event" array and write it there? So bypassing the receive-queue?
> > >> That does sound quite nice, indeed. We could replace all the other
> > >> "sync" ioctls and just provide a way to receive all the events
> > >> directly.
> > >>
> > >> Something like:
> > >>
> > >> EVIOCQUERY(struct input_query)
> > >>
> > >> struct input_query {
> > >>         __u16 type;
> > >>         __u16 start_code;
> > >>         __u16 end_code;
> > >>         __u16 slots;
> > >>
> > >>         struct input_event buffer[];
> > >> };
> > >
> > > No, it is more like EVIOCRESYNC(void) which makes input core to dump all
> > > existing state into the client's standard event queue so that here is no
> > > need to reconcile/reconstruct anything. We could give a new SYN marker
> > > to indicate end-of-state boundary.
> > 
> > This doesn't make sense to me, for two reasons:
> >  * Events received during re-sync are usually handled differently than
> >    normal events. For instance, you *must* never handle key events
> >    from re-syncs for shortcuts, because you don't know the order
> >    they were pressed in. 
> 
> I think you're vastly overestimating what the majority of userspace does ;)
> both the xorg and the libinput-based stacks pretty much ignore this at the
> moment and by the time we handle shortcuts we're already one or more APIs
> away from the SYN_DROPPED handling.

Also, the client would not all of a sudden get a bunch of events; it would have
requested them to be resent, so it would know how to handle them properly.

> 
> also, I haven't seen SYN_DROPPED issues from keyboard events. They usually
> happen only on absolute touch devices and ironically enough that's the one
> case where the kernel currently doesn't allow for a race-less resync.
> 
> >    If we put sync events in the queue, users cannot know where an event came
> >    from.
> >    Inserting a SYNC marker is ridiculous, because following normal
> >    events may drop those (via SYN_DROPPED). We'd have to
> >    protect it. Furthermore, it's awful to handle this from user-space,
> >    because you have no idea how many events are sent as part
> >    of the sync.
> 
> I suspect Dmitry meant a new EV_SYN code. From a userspace POV, I'd know
> that after the ioctl() all events up until SYN_SYNC_DONE are the syncing
> events. That would work well and is coincidentally almost identical to the
> libevdev public API.

Yes, that's what I meant.


> 
> In libevdev we already have code to calculate the maximum number of events
> by a device for the current SYN_DROPPED handling. And you can reduce the
> number of events by only sending those in a non-zero state for EV_KEY,
> EV_LED, etc.
> 
> IMO the real issue preventing this approach is:
> 
> >  * The receive-queue is usually too small to hold all those events.
> >    I mean, evdev uses EVDEV_BUF_PACKETS as number of buffer
> >    slots (defined as 8!). You cannot even hold a whole keyboard
> >    state there. Yes, usually it should be enough, but re-syncing

Note that EVDEV_BUF_PACKETS is not the raw number of events in the queue, but
the number of full "packets". For plain keyboards we end up with a buffer big
enough to hold 8 * 8 = 64 events. Since we only need to transmit active keys,
that should be enough for any keyboard, and if there are keyboards reporting
more then we'd have to change their hint anyway.

> >
> >    is about handling corner-cases. If we make SYN_DROPPED
> >    handling cause SYN_DROPPED, we can just ignore it.
> 
> yep, that too was my first thought. with a plain resync ioctl you're pretty
> much guaranteed to get SYN_DROPPED before the client manages to handle the
> resync. Even if you reduce the number of events as above because the most
> common occurance for SYN_DROPPED is in the ABS_MT range which we cannot
> meaningfully reduce.

Hmm, that's a problem... But is it? We need to make sure that the buffer is large
enough for the MT device to transmit all its contacts properly. We cannot
expect that we'll always be able to reduce the number of events if a user actively
uses 10 contacts. IOW we need to solve this issue regardless of this proposed
sync ioctl.

Maybe we need to review drivers and see if they need to supply their own hints
or update hinting logic in core?

Thanks.
Peter Hutterer Aug. 11, 2014, 2:13 a.m. UTC | #7
On Sun, Aug 10, 2014 at 05:01:35PM -0700, Dmitry Torokhov wrote:
[...]
> > >
> > >    is about handling corner-cases. If we make SYN_DROPPED
> > >    handling cause SYN_DROPPED, we can just ignore it.
> > 
> > yep, that too was my first thought. with a plain resync ioctl you're pretty
> > much guaranteed to get SYN_DROPPED before the client manages to handle the
> > resync. Even if you reduce the number of events as above because the most
> > common occurance for SYN_DROPPED is in the ABS_MT range which we cannot
> > meaningfully reduce.
> 
> Hmm, that's a problem... But is it? We need to make sure that buffer is large
> enough for the MT device to transmit all it's contacts properly. We can not
> expect that we'll always be able to reduce number of events if a user actively
> uses 10 contacts. IOW we need to solve this issue regardless of this proposed
> sync ioctl.
>
> Maybe we need to review drivers and see if they need to supply their own hints
> or update hinting logic in core?
 
The buffer is already large enough for at least one full report from the
device plus a few extra events [1]. For the devices we see SYN_REPORT from
most frequently, dumping the state means filling up the buffer almost to the
maximum. To give some room for movement, we need to increase the queue by at
least a factor of 2. That gives us room for one whole sync report and at
least one full extra event. Anything smaller and we get the side-effect
that a client that is too slow and gets a SYN_DROPPED is actually worse off,
because now the buffer is so full from the sync that a SYN_DROPPED is even
more likely to occur than before.
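
As a rough example (my numbers, purely illustrative): a 10-slot touchscreen
reporting X/Y/tracking-id/pressure per contact needs about

    10 slots * (1 ABS_MT_SLOT + 4 ABS_MT_*) + 1 SYN_REPORT = 51 events

for a full state dump, which is already close to a 64-event minimum queue.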

We also need to define the behaviour for the queue filling up while the
client is in the middle of a sync. That means either the client must be
able to handle SYN_DROPPED as well as SYN_SYNC_DONE during a sync, or the
kernel must protect the events up to SYN_SYNC_DONE in the queue in the case
of a SYN_DROPPED.

Either way it's IMO more complicated than having a separate buffer for the
sync state.

Cheers,
   Peter

[1] almost, it doesn't account for EV_SW for example
David Herrmann Aug. 11, 2014, 10 a.m. UTC | #8
Hi

>> > No, it is more like EVIOCRESYNC(void), which makes the input core dump all
>> > existing state into the client's standard event queue so that there is no
>> > need to reconcile/reconstruct anything. We could add a new SYN marker
>> > to indicate the end-of-state boundary.
>>
>> This doesn't make sense to me, for two reasons:
>>  * Events received during re-sync are usually handled differently than
>>    normal events. For instance, you *must* never handle key events
>>    from re-syncs for shortcuts, because you don't know the order
>>    they were pressed in.
>
> I think you're vastly overestimating what the majority of userspace does ;)
> both the xorg and the libinput-based stacks pretty much ignore this at the
> moment and by the time we handle shortcuts we're already one or more APIs
> away from the SYN_DROPPED handling.

I know what Xorg and weston do and, honestly, I don't care. If major
user-space components decide to ignore those side-effects, I'm fine.
They may have legitimate reason to do so. But I don't think we should
use this to argue against proper API design in the kernel.

> also, I haven't seen SYN_DROPPED issues from keyboard events. They usually
> happen only on absolute touch devices and ironically enough that's the one
> case where the kernel currently doesn't allow for a race-less resync.

Yes, obviously ABS and REL events are the culprits as they can occur
in vast amounts. However, any other events in between (like button
presses) may be dropped due to excessive ABS events. So I disagree
that this is an exclusive issue of ABS/REL; they just happen to
trigger the SYN_DROPPED.

>>    If we put sync events in the queue, users cannot know where an event
>>    came from.
>>    Inserting a SYNC marker is ridiculous, because following normal
>>    events may drop those (via SYN_DROPPED). We'd have to
>>    protect it. Furthermore, it's awful to handle this from user-space,
>>    because you have no idea how many events are sent as part
>>    of the sync.
>
> I suspect Dmitry meant a new EV_SYN code. From a userspace POV, I'd know
> that after the ioctl() all events up until SYN_SYNC_DONE are the syncing
> events. That would work well and is coincidentally almost identical to the
> libevdev public API.
>
> In libevdev we already have code to calculate the maximum number of events
> by a device for the current SYN_DROPPED handling. And you can reduce the
> number of events by only sending those in a non-zero state for EV_KEY,
> EV_LED, etc.

Yes, I understand what Dmitry meant and I see the similarities to
libevdev. However, there are several issues:

1) *If* you flush all events into the queue, followed by a
SYN_SYNC_DONE marker, any following *real* events might cause the
queue to overflow and thus drop the SYN_SYNC_DONE marker. So if
user-space requested a sync and then reads a SYN_DROPPED *before*
reading a SYN_SYNC_DONE, it knows the sync didn't complete and has to
redo the sync all over (see the sketch after 3) below). This is
doable, but makes me wonder whether it's the right design.

2) The evdev queue is not designed to hold this many events. Even if
it is, a SYN_DROPPED happens during excessive event reports. So if we
now also fill the queue with all sync-events, we increase the chance
of overflowing the queue again, instead of clearing it and syncing
separately.

3) Flushing sync-events into the queue with special semantics (like only
sending non-zero state) sounds to me like a dirty hack, not like
proper API design. I mean, the queue was designed to send realtime
events, not to provide initial/SYN_DROPPED syncing. I don't see why
this is preferable to a separate user-provided buffer.
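
To make 1) a bit more concrete, a client of such an in-queue sync would end up
with a loop like this (SYN_SYNC_DONE, EVIOCRESYNC and apply_sync_event() are
hypothetical; this is only a sketch):

	ioctl(fd, EVIOCRESYNC);                      /* hypothetical ioctl */

	for (;;) {
		struct input_event ev;

		if (read(fd, &ev, sizeof(ev)) != sizeof(ev))
			break;

		if (ev.type == EV_SYN && ev.code == SYN_DROPPED) {
			/* queue overflowed mid-sync, start all over */
			ioctl(fd, EVIOCRESYNC);
			continue;
		}

		if (ev.type == EV_SYN && ev.code == SYN_SYNC_DONE)
			break;                       /* sync complete */

		apply_sync_event(&ev);
	}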

> IMO the real issue preventing this approach is:
>
>>  * The receive-queue is usually too small to hold all those events.
>>    I mean, evdev uses EVDEV_BUF_PACKETS as number of buffer
>>    slots (defined as 8!). You cannot even hold a whole keyboard
>>    state there. Yes, usually it should be enough, but re-syncing
>>    is about handling corner-cases. If we make SYN_DROPPED
>>    handling cause SYN_DROPPED, we can just ignore it.
>
> yep, that too was my first thought. with a plain resync ioctl you're pretty
> much guaranteed to get SYN_DROPPED before the client manages to handle the
> resync. Even if you reduce the number of events as above because the most
> common occurrence for SYN_DROPPED is in the ABS_MT range which we cannot
> meaningfully reduce.
>
>> I understand why EVIOCRESYNC(void) is tempting. It avoids all the
>> complexity of my patch and makes all the other sync-ioctls we have so
>> far obsolete. But the reason we want it, is to avoid shortcomings of
>> the limited receive-queue. I don't think any solution involving the
>> receive-queue is going to work out. Even if we always make sure the
>> receive-queue is big enough, we might still get new events coming in
>> before user-space can drain it.
>>
>> How about this:
>>
>> struct input_resync {
>>         __u64 count;
>>         struct input_event buffer[];
>> };
>>
>> EVIOCSYNC(struct input_resync) copies the current state of all
>> available event-types of the device into the buffer. It returns the
>> number of events that it wants to write (so user-space can pass
>> count==0 the first time to figure out the required buffer-size). The
>> ioctl then flushes the _whole_ input queue.
>
> if you go with the suggestion above of only putting events in with a
> non-zero state, then a count of 0 is guaranteed to return the wrong number
> since you may get events between the two ioctl calls. I don't think
> the double ioctl() call is a case we need. As I said above, the case of
> all events must already be handled in userspace anyway.

I think it's a bad idea to only send events with non-zero state. I can
live with it, but I don't understand why we do such optimizations for
an edge-case like SYN_DROPPED handling. It just complicates things and
no-one cares whether we copy 10 or 100 events, right?

> what's the reason you don't use an EVIOCSYNC(len) approach btw? A quick check
> shows that 13 or 14 bits for len should be enough for pretty much any device
> for a long time.

I think we can just make user-space read the event-bitmasks and
calculate the length itself. This way, we avoid double ioctl()
calls, and it's easy enough to do from user-space.
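
On the user-space side something like this gives a safe upper bound for the ABS
part (a sketch; EVIOCSYNC is the hypothetical ioctl discussed here, everything
else is existing evdev API):

#define BITS_PER_LONG	(8 * sizeof(unsigned long))
#define NLONGS(x)	(((x) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define TEST_BIT(bit, arr) \
	(((arr)[(bit) / BITS_PER_LONG] >> ((bit) % BITS_PER_LONG)) & 1)

	unsigned long absbits[NLONGS(ABS_CNT)] = { 0 };
	struct input_absinfo slot = { 0 };
	unsigned int i, slots = 1, count = 0;

	ioctl(fd, EVIOCGBIT(EV_ABS, sizeof(absbits)), absbits);
	if (ioctl(fd, EVIOCGABS(ABS_MT_SLOT), &slot) >= 0)
		slots = slot.maximum + 1;

	for (i = 0; i < ABS_CNT; ++i)
		if (TEST_BIT(i, absbits))
			/* ABS_MT_* axes are per-slot */
			count += (i >= ABS_MT_SLOT) ? slots : 1;

	/* repeat for EV_KEY, EV_LED, EV_SW, ... and size the buffer passed to
	 * the (hypothetical) EVIOCSYNC from the sum */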

>> Note that even if we used the client's receive buffer, we'd have to
>> copy huge amounts of events while holding the event lock. So both
>> ideas have to lock the buffers during the operation. The downside of
>> my solution is that we have to allocate temporary buffers as we cannot
>> copy to user-space while holding spin-locks. However, my code already
>> makes sure we don't do high-order allocations, so I think this is
>> fine.
>>
>> This ioctl would also allow us to get rid of all the other QUERY
>> ioctls. User-space can use it to initialize its state and then update
>> it according to incoming events. This way, we simplify the API
>> considerably! I really like this idea. I know, we only need this to
>> handle corner-cases. But as Peter already said, it's really easy to
>> trigger and ignoring it causes awful bugs in user-space.
>
> summary from my side: I'd prefer EVIOCSYNC() with a buffer parameter, only
> sending nonzero state for key/led/sw/etc. and the full state for abs. no
> need for a new EV_SYN code, and provided it empties the whole queue we
> should be race-free.

Except for "only send non-zero state" I fully agree. But I would also
accept it with that limitation/optimization.

Thanks
David
David Herrmann Aug. 11, 2014, 10:02 a.m. UTC | #9
Hi

On Mon, Aug 11, 2014 at 4:13 AM, Peter Hutterer
<peter.hutterer@who-t.net> wrote:
> On Sun, Aug 10, 2014 at 05:01:35PM -0700, Dmitry Torokhov wrote:
> [...]
>> > >
>> > >    is about handling corner-cases. If we make SYN_DROPPED
>> > >    handling cause SYN_DROPPED, we can just ignore it.
>> >
>> > yep, that too was my first thought. with a plain resync ioctl you're pretty
>> > much guaranteed to get SYN_DROPPED before the client manages to handle the
>> > resync. Even if you reduce the number of events as above because the most
>> > common occurrence for SYN_DROPPED is in the ABS_MT range which we cannot
>> > meaningfully reduce.
>>
>> Hmm, that's a problem... But is it? We need to make sure that the buffer is
>> large enough for the MT device to transmit all its contacts properly. We
>> cannot expect that we'll always be able to reduce the number of events if a
>> user actively uses 10 contacts. IOW we need to solve this issue regardless
>> of this proposed sync ioctl.
>>
>> Maybe we need to review the drivers and see if they need to supply their own
>> hints, or update the hinting logic in core?
>
> The buffer is already large enough for at least one full report from the
> device plus a few extra events [1]. For the devices we see SYN_REPORT from
> most frequently, dumping the state means filling up the buffer almost to the
> maximum. To give some room for movement, we need to increase the queue by at
> least a factor of 2. That gives us room for one whole sync report and at
> least one full extra event. Anything smaller and we get the side-effect
> that a client that is too slow and gets a SYN_DROPPED is actually worse off,
> because now the buffer is so full from the sync that a SYN_DROPPED is even
> more likely to occur than before.
>
> We also need to define the behaviour for the queue filling up while the
> client is in the middle of a sync. That means either the client must be
> able to handle SYN_DROPPED as well as SYN_SYNC_DONE during a sync, or the
> kernel must protect the events up to SYN_SYNC_DONE in the queue in the case
> of a SYN_DROPPED.
>
> Either way it's IMO more complicated than having a separate buffer for the
> sync state.

Yepp, fully agree on both points you make (I summarized them way worse
than you did!).

Thanks
David
diff mbox

Patch

diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index 6386882..7a25a7a 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -175,8 +175,9 @@  static bool __evdev_is_filtered(struct evdev_client *client,
 	return mask && !test_bit(code, mask);
 }
 
-/* flush queued events of type @type, caller must hold client->buffer_lock */
-static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
+/* flush queued, matching events, caller must hold client->buffer_lock */
+static void __evdev_flush_queue(struct evdev_client *client, unsigned int type,
+				unsigned int code_first, unsigned int code_last)
 {
 	unsigned int i, head, num;
 	unsigned int mask = client->bufsize - 1;
@@ -195,7 +196,9 @@  static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
 		ev = &client->buffer[i];
 		is_report = ev->type == EV_SYN && ev->code == SYN_REPORT;
 
-		if (ev->type == type) {
+		if (ev->type == type &&
+		    ev->code >= code_first &&
+		    ev->code <= code_last) {
 			/* drop matched entry */
 			continue;
 		} else if (is_report && !num) {
@@ -786,6 +789,172 @@  static int handle_eviocgbit(struct input_dev *dev,
 	return bits_to_user(bits, len, size, p, compat_mode);
 }
 
+static inline void free_absrange(s32 **pages, size_t page_cnt)
+{
+	if (page_cnt > 1) {
+		while (page_cnt > 0) {
+			if (!pages[--page_cnt])
+				continue;
+			__free_page(virt_to_page(pages[page_cnt]));
+		}
+		kfree(pages);
+	} else if (page_cnt == 1) {
+		kfree(pages);
+	}
+}
+
+static inline s32 *absrange_ptr(s32 **pages, size_t page_cnt, size_t slots,
+				size_t i_code, size_t j_slot)
+{
+	size_t idx, off;
+
+	idx = (i_code * slots + j_slot) / (PAGE_SIZE / sizeof(s32));
+	off = (i_code * slots + j_slot) % (PAGE_SIZE / sizeof(s32));
+
+	if (page_cnt == 1)
+		return &((s32*)pages)[off];
+	else
+		return &pages[idx][off];
+}
+
+static inline ssize_t fetch_absrange(struct evdev_client *client,
+				     struct input_dev *dev, size_t start,
+				     size_t count, size_t slots, s32 ***out)
+{
+	size_t size, page_cnt, i, j;
+	unsigned long flags;
+	s32 **pages;
+
+	/*
+	 * Fetch data atomically from the device and flush buffers. We need to
+	 * allocate a temporary buffer as copy_to_user() is not allowed while
+	 * holding spinlocks. However, to-be-copied data might be huge and
+	 * high-order allocations should be avoided. Therefore, do the
+	 * page-allocation manually.
+	 */
+
+	BUILD_BUG_ON(PAGE_SIZE % sizeof(s32) != 0);
+
+	size = sizeof(s32) * count * slots;
+	page_cnt = DIV_ROUND_UP(size, PAGE_SIZE);
+	if (page_cnt < 1) {
+		return 0;
+	} else if (page_cnt == 1) {
+		pages = kzalloc(size, GFP_TEMPORARY);
+		if (!pages)
+			return -ENOMEM;
+	} else {
+		pages = kzalloc(sizeof(*pages) * page_cnt, GFP_TEMPORARY);
+		if (!pages)
+			return -ENOMEM;
+
+		for (i = 0; i < page_cnt; ++i) {
+			pages[i] = (void*)get_zeroed_page(GFP_TEMPORARY);
+			if (!pages[i]) {
+				free_absrange(pages, page_cnt);
+				return -ENOMEM;
+			}
+		}
+	}
+
+	spin_lock_irqsave(&dev->event_lock, flags);
+	spin_lock(&client->buffer_lock);
+
+	for (i = 0; i < count; ++i) {
+		__u16 code;
+		bool is_mt;
+
+		code = start + i;
+		is_mt = input_is_mt_value(code);
+		if (is_mt && !dev->mt)
+			continue;
+
+		for (j = 0; j < slots; ++j) {
+			__s32 v;
+
+			if (is_mt)
+				v = input_mt_get_value(&dev->mt->slots[j],
+						       code);
+			else
+				v = dev->absinfo[code].value;
+
+			*absrange_ptr(pages, page_cnt, slots, i, j) = v;
+
+			if (!is_mt)
+				break;
+		}
+	}
+
+	__evdev_flush_queue(client, EV_ABS, start, start + count - 1);
+	spin_unlock(&client->buffer_lock);
+	spin_unlock_irqrestore(&dev->event_lock, flags);
+
+	*out = pages;
+	return page_cnt;
+}
+
+static int evdev_handle_get_absrange(struct evdev_client *client,
+				     struct input_dev *dev,
+				     struct input_absrange __user *p)
+{
+	size_t slots, code, count, i, j;
+	struct input_absrange absbuf;
+	s32 **vals = NULL;
+	ssize_t val_cnt;
+	s32 __user *b;
+	int retval;
+
+	if (!dev->absinfo)
+		return -EINVAL;
+	if (copy_from_user(&absbuf, p, sizeof(absbuf)))
+		return -EFAULT;
+
+	slots = min_t(size_t, dev->mt ? dev->mt->num_slots : 1, absbuf.slots);
+	code = min_t(size_t, absbuf.code, ABS_CNT);
+	count = min_t(size_t, absbuf.count, ABS_CNT);
+
+	/* first fetch data atomically from device */
+
+	if (code + count > ABS_CNT)
+		count = ABS_CNT - code;
+
+	if (!slots || !count) {
+		val_cnt = 0;
+	} else {
+		val_cnt = fetch_absrange(client, dev, code, count,
+					 slots, &vals);
+		if (val_cnt < 0)
+			return val_cnt;
+	}
+
+	/* now copy data to user-space */
+
+	b = (void __user*)(unsigned long)absbuf.buffer;
+	for (i = 0; i < absbuf.count; ++i) {
+		for (j = 0; j < absbuf.slots; ++j, ++b) {
+			s32 v;
+
+			if (i >= count || j >= slots)
+				v = 0;
+			else
+				v = *absrange_ptr(vals, val_cnt, slots, i, j);
+
+			if (put_user(v, b)) {
+				retval = -EFAULT;
+				goto out;
+			}
+		}
+	}
+
+	retval = 0;
+
+out:
+	free_absrange(vals, val_cnt);
+	if (retval < 0)
+		evdev_queue_syn_dropped(client);
+	return retval;
+}
+
 static int evdev_handle_get_keycode(struct input_dev *dev, void __user *p)
 {
 	struct input_keymap_entry ke = {
@@ -889,7 +1058,7 @@  static int evdev_handle_get_val(struct evdev_client *client,
 
 	spin_unlock(&dev->event_lock);
 
-	__evdev_flush_queue(client, type);
+	__evdev_flush_queue(client, type, 0, UINT_MAX);
 
 	spin_unlock_irq(&client->buffer_lock);
 
@@ -1006,6 +1175,9 @@  static long evdev_do_ioctl(struct file *file, unsigned int cmd,
 		else
 			return evdev_revoke(evdev, client, file);
 
+	case EVIOCGABSRANGE:
+		return evdev_handle_get_absrange(client, dev, p);
+
 	case EVIOCGMASK:
 		if (copy_from_user(&mask, p, sizeof(mask)))
 			return -EFAULT;
diff --git a/include/uapi/linux/input.h b/include/uapi/linux/input.h
index f6ace0e..9f851d4 100644
--- a/include/uapi/linux/input.h
+++ b/include/uapi/linux/input.h
@@ -32,7 +32,7 @@  struct input_event {
  * Protocol version.
  */
 
-#define EV_VERSION		0x010001
+#define EV_VERSION		0x010002
 
 /*
  * IOCTLs (0x00 - 0x7f)
@@ -210,6 +210,48 @@  struct input_mask {
  */
 #define EVIOCSMASK		_IOW('E', 0x93, struct input_mask)	/* Set event-masks */
 
+struct input_absrange {
+	__u16 slots;
+	__u16 code;
+	__u32 count;
+	__u64 buffer;
+};
+
+/**
+ * EVIOCGABSRANGE - Fetch range of ABS values
+ *
+ * This fetches the current values of a range of ABS codes atomically. The range
+ * of codes to fetch and the target buffer are passed as "struct input_absrange",
+ * which has the following fields:
+ *      slots: Number of MT slots to fetch data for.
+ *       code: First ABS axis to query.
+ *      count: Number of ABS axes to query starting at @code.
+ *     buffer: Pointer to a receive buffer where to store the fetched ABS
+ *             values. This buffer must be an array of __s32 with at least
+ *             (@count * @slots) elements. The buffer is interpreted as a
+ *             two-dimensional __s32 array, declared as: __s32[count][slots]
+ *
+ * Compared to EVIOCGABS, this ioctl allows retrieving a range of ABS codes
+ * atomically with respect to any concurrent buffer modifications. Furthermore,
+ * any pending events for codes that were retrieved via this call are flushed
+ * from the client's receive buffer. But unlike EVIOCGABS, this ioctl only returns
+ * the current value of an axis, rather than the whole "struct input_absinfo"
+ * set. All fields of "struct input_absinfo" except for the value are constant,
+ * though.
+ *
+ * The kernel's current view of the ABS axes is copied into the provided buffer.
+ * If an ABS axis is not enabled on the device, its value will be zero. Also, if
+ * an axis is not a slotted MT-axis, values for all but the first slot will be
+ * 0. If @slots is greater than the actual number of slots provided by the
+ * device, values for all slots higher than that will be 0.
+ *
+ * This call may fail with -EINVAL if the kernel doesn't support this call or
+ * the arguments are invalid, with -ENODEV if access was revoked, -ENOMEM if the
+ * kernel couldn't allocate temporary buffers for data-copy or -EFAULT if the
+ * passed pointer was invalid.
+ */
+#define EVIOCGABSRANGE		_IOR('E', 0x94, struct input_absrange)
+
 #define EVIOCSCLOCKID		_IOW('E', 0xa0, int)			/* Set clockid to be used for timestamps */
 
 /*