From patchwork Tue May 31 02:19:49 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Martin K. Petersen"
X-Patchwork-Id: 831362
From: "Martin K. Petersen"
Organization: Oracle
To: Mike Snitzer
Cc: jaxboe@fusionio.com, msb@chromium.org, dm-devel@redhat.com,
    linux-kernel@vger.kernel.org, "Martin K. Petersen"
Subject: Re: [dm-devel] [PATCH 2/3] block: Move non-rotational flag to
    queue limits
Date: Mon, 30 May 2011 22:19:49 -0400
In-Reply-To: <20110527130248.GA17219@redhat.com> (Mike Snitzer's message of
    "Fri, 27 May 2011 09:02:49 -0400")
References: <4DDEA689.2090004@fusionio.com>
    <1306464169-4291-1-git-send-email-martin.petersen@oracle.com>
    <1306464169-4291-3-git-send-email-martin.petersen@oracle.com>
    <20110527130248.GA17219@redhat.com>
List-Id: device-mapper development

>>>>> "Mike" == Mike Snitzer writes:

Mike> blk_queue_nonrot vs blk_queue_non_rotational lends itself to a
Mike> small amount of confusion.

Yeah, I just didn't feel like mucking with the existing call. But it
looks like there are only a handful of users.

Mike> What about:
Mike> s/blk_queue_nonrot/blk_queue_non_rotational/
Mike> s/blk_queue_non_rotational/blk_queue_set_non_rotational/
Mike> ?

Most of our other block layer calls take the form blk_queue_max_foo()
for setting foo and {bdev,queue}_max_foo() for querying.
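In that scheme the non-rotational interface comes out as follows (a
summary of the declarations the patch below adds; the two queries are
implemented as static inlines in blkdev.h):

    void blk_queue_non_rotational(struct request_queue *q);      /* set   */
    unsigned int queue_non_rotational(struct request_queue *q);  /* query */
    unsigned int bdev_non_rotational(struct block_device *bdev); /* query */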
Petersen" Subject: Re: [dm-devel] [PATCH 2/3] block: Move non-rotational flag to queue limits X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk Reply-To: device-mapper development List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by milter-greylist-4.2.6 (demeter1.kernel.org [140.211.167.41]); Tue, 31 May 2011 02:23:26 +0000 (UTC) >>>>> "Mike" == Mike Snitzer writes: Mike> blk_queue_nonrot vs blk_queue_non_rotational lends itself to a Mike> small amount of confusion. Yeah, I just didn't feel like mucking with the existing call. But it looks like there are only a handful of users. Mike> What about: Mike> s/blk_queue_nonrot/blk_queue_non_rotational/ Mike> s/blk_queue_non_rotational/blk_queue_set_non_rotational/ Mike> ? Most of our other block layer calls take the form blk_queue_max_foo() for setting foo and {bdev,queue}_max_foo() for querying. So I guess the most appropriate thing to do would be to do something like this? block: Move non-rotational flag to queue limits To avoid special-casing the non-rotational flag when stacking it is moved from the queue flags to be part of the queue limits. This allows us to handle it like the remaining I/O topology information. Also rename blk_queue_nonrot() to be consistent with block layer calling conventions. Signed-off-by: Martin K. Petersen --- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel diff --git a/block/blk-settings.c b/block/blk-settings.c index b373721..f95760d 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -124,6 +124,7 @@ void blk_set_default_limits(struct queue_limits *lim) lim->io_opt = 0; lim->misaligned = 0; lim->cluster = 1; + lim->non_rotational = 0; } EXPORT_SYMBOL(blk_set_default_limits); @@ -143,6 +144,7 @@ void blk_set_stacking_limits(struct queue_limits *lim) lim->max_hw_sectors = INT_MAX; lim->max_sectors = BLK_DEF_MAX_SECTORS; lim->discard_zeroes_data = 1; + lim->non_rotational = 1; } EXPORT_SYMBOL(blk_set_stacking_limits); @@ -471,6 +473,22 @@ void blk_queue_io_opt(struct request_queue *q, unsigned int opt) EXPORT_SYMBOL(blk_queue_io_opt); /** + * blk_queue_non_rotational - set this queue as non-rotational + * @q: the request queue for the device + * + * Description: + * This setting may be used by drivers to indicate that the physical + * device is non-rotational (solid state device, array with + * non-volatile cache). Setting this may affect I/O scheduler + * decisions and readahead behavior. + */ +void blk_queue_non_rotational(struct request_queue *q) +{ + q->limits.non_rotational = 1; +} +EXPORT_SYMBOL(blk_queue_non_rotational); + +/** * blk_queue_stack_limits - inherit underlying queue limits for stacked drivers * @t: the stacking driver (top) * @b: the underlying device (bottom) @@ -552,6 +570,7 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b, t->cluster &= b->cluster; t->discard_zeroes_data &= b->discard_zeroes_data; + t->non_rotational &= b->non_rotational; /* Physical block size a multiple of the logical block size? 
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index d935bd8..2669b15 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -186,6 +186,22 @@ static ssize_t queue_max_hw_sectors_show(struct request_queue *q, char *page)
 	return queue_var_show(max_hw_sectors_kb, (page));
 }
 
+static ssize_t queue_rotational_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(!queue_non_rotational(q), page);
+}
+
+static ssize_t queue_rotational_store(struct request_queue *q,
+				      const char *page, size_t count)
+{
+	unsigned long rotational;
+	ssize_t ret = queue_var_store(&rotational, page, count);
+
+	q->limits.non_rotational = !rotational;
+
+	return ret;
+}
+
 #define QUEUE_SYSFS_BIT_FNS(name, flag, neg)				\
 static ssize_t								\
 queue_show_##name(struct request_queue *q, char *page)			\
@@ -212,7 +228,6 @@ queue_store_##name(struct request_queue *q, const char *page, size_t count) \
 	return ret;							\
 }
 
-QUEUE_SYSFS_BIT_FNS(nonrot, NONROT, 1);
 QUEUE_SYSFS_BIT_FNS(random, ADD_RANDOM, 0);
 QUEUE_SYSFS_BIT_FNS(iostats, IO_STAT, 0);
 #undef QUEUE_SYSFS_BIT_FNS
@@ -352,8 +367,8 @@ static struct queue_sysfs_entry queue_discard_zeroes_data_entry = {
 
 static struct queue_sysfs_entry queue_nonrot_entry = {
 	.attr = {.name = "rotational", .mode = S_IRUGO | S_IWUSR },
-	.show = queue_show_nonrot,
-	.store = queue_store_nonrot,
+	.show = queue_rotational_show,
+	.store = queue_rotational_store,
 };
 
 static struct queue_sysfs_entry queue_nomerges_entry = {
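Note that the sysfs ABI keeps its historical polarity: the attribute is
still called "rotational" and userspace still reads/writes 1 for
spinning media, while the limit stores the inverse. A toy userspace
model of the double inversion the two handlers above perform
(illustration only; names are made up):

    #include <stdio.h>

    struct limits { unsigned char non_rotational; };

    static unsigned long rotational_show(const struct limits *lim)
    {
            return !lim->non_rotational;  /* what reading the file prints */
    }

    static void rotational_store(struct limits *lim, unsigned long val)
    {
            lim->non_rotational = !val;   /* what writing the file sets */
    }

    int main(void)
    {
            struct limits lim = { .non_rotational = 0 };

            rotational_store(&lim, 0);    /* admin marks the device as SSD */

            /* prints rotational=0 non_rotational=1 */
            printf("rotational=%lu non_rotational=%u\n",
                   rotational_show(&lim), (unsigned)lim.non_rotational);
            return 0;
    }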
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 7c52d68..c84b539 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1961,7 +1961,7 @@ static bool cfq_should_idle(struct cfq_data *cfqd, struct cfq_queue *cfqq)
 
 	/* We do for queues that were marked with idle window flag. */
 	if (cfq_cfqq_idle_window(cfqq) &&
-	   !(blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag))
+	   !(queue_non_rotational(cfqd->queue) && cfqd->hw_tag))
 		return true;
 
 	/*
@@ -1986,7 +1986,7 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd)
 	 * for devices that support queuing, otherwise we still have a problem
 	 * with sync vs async workloads.
 	 */
-	if (blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag)
+	if (queue_non_rotational(cfqd->queue) && cfqd->hw_tag)
 		return;
 
 	WARN_ON(!RB_EMPTY_ROOT(&cfqq->sort_list));
@@ -3237,7 +3237,7 @@ cfq_update_io_seektime(struct cfq_data *cfqd, struct cfq_queue *cfqq,
 	}
 
 	cfqq->seek_history <<= 1;
-	if (blk_queue_nonrot(cfqd->queue))
+	if (queue_non_rotational(cfqd->queue))
 		cfqq->seek_history |= (n_sec < CFQQ_SECT_THR_NONROT);
 	else
 		cfqq->seek_history |= (sdist > CFQQ_SEEK_THR);
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index e6fc716..fd96b44 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -774,7 +774,7 @@ static int __init nbd_init(void)
 		/*
 		 * Tell the block layer that we are not a rotational device
 		 */
-		queue_flag_set_unlocked(QUEUE_FLAG_NONROT, disk->queue);
+		blk_queue_non_rotational(disk->queue);
 	}
 
 	if (register_blkdev(NBD_MAJOR, "nbd")) {
diff --git a/drivers/ide/ide-disk.c b/drivers/ide/ide-disk.c
index 2747980..422c558 100644
--- a/drivers/ide/ide-disk.c
+++ b/drivers/ide/ide-disk.c
@@ -682,7 +682,7 @@ static void ide_disk_setup(ide_drive_t *drive)
 			 queue_max_sectors(q) / 2);
 
 	if (ata_id_is_ssd(id))
-		queue_flag_set_unlocked(QUEUE_FLAG_NONROT, q);
+		blk_queue_non_rotational(q);
 
 	/* calculate drive capacity, and select LBA if possible */
 	ide_disk_get_capacity(drive);
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index c07322c..9adce86 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -127,7 +127,7 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 	mq->req = NULL;
 
 	blk_queue_prep_rq(mq->queue, mmc_prep_request);
-	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, mq->queue);
+	blk_queue_non_rotational(mq->queue);
 	if (mmc_can_erase(card)) {
 		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, mq->queue);
 		mq->queue->limits.max_discard_sectors = UINT_MAX;
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index bd0806e..7a5cf28 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -2257,7 +2257,7 @@ static void sd_read_block_characteristics(struct scsi_disk *sdkp)
 
 	rot = get_unaligned_be16(&buffer[4]);
 
 	if (rot == 1)
-		queue_flag_set_unlocked(QUEUE_FLAG_NONROT, sdkp->disk->queue);
+		blk_queue_non_rotational(sdkp->disk->queue);
 
  out:
 	kfree(buffer);
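For context (not part of the patch): sd_read_block_characteristics()
is parsing the SCSI Block Device Characteristics VPD page (0xb1),
whose bytes 4-5 hold the MEDIUM ROTATION RATE field. Per SBC-3 a value
of 1 means a non-rotating medium (e.g. an SSD), 0 means the rate is
not reported, and larger values give the nominal spindle speed in RPM.
A hypothetical decode helper, just to document the encoding:

    static bool sbc_medium_is_non_rotating(const u8 *vpd_b1)
    {
            u16 rate = get_unaligned_be16(&vpd_b1[4]);

            return rate == 1;       /* 0001h: non-rotating medium */
    }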
diff --git a/drivers/staging/zram/zram_drv.c b/drivers/staging/zram/zram_drv.c
index aab4ec4..9bd0874 100644
--- a/drivers/staging/zram/zram_drv.c
+++ b/drivers/staging/zram/zram_drv.c
@@ -538,7 +538,7 @@ int zram_init_device(struct zram *zram)
 	set_capacity(zram->disk, zram->disksize >> SECTOR_SHIFT);
 
 	/* zram devices sort of resembles non-rotational disks */
-	queue_flag_set_unlocked(QUEUE_FLAG_NONROT, zram->disk->queue);
+	blk_queue_non_rotational(zram->disk->queue);
 
 	zram->mem_pool = xv_create_pool();
 	if (!zram->mem_pool) {
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index c7367ae..b32650c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -588,7 +588,7 @@ static int __btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 		device->in_fs_metadata = 0;
 		device->mode = flags;
 
-		if (!blk_queue_nonrot(bdev_get_queue(bdev)))
+		if (!bdev_non_rotational(bdev))
 			fs_devices->rotating = 1;
 
 		fs_devices->open_devices++;
@@ -1619,7 +1619,7 @@ int btrfs_init_new_device(struct btrfs_root *root, char *device_path)
 	root->fs_info->fs_devices->rw_devices++;
 	root->fs_info->fs_devices->total_rw_bytes += device->total_bytes;
 
-	if (!blk_queue_nonrot(bdev_get_queue(bdev)))
+	if (!bdev_non_rotational(bdev))
 		root->fs_info->fs_devices->rotating = 1;
 
 	total_bytes = btrfs_super_total_bytes(&root->fs_info->super_copy);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 517247d..fb8da90 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -258,6 +258,8 @@ struct queue_limits {
 	unsigned char		discard_misaligned;
 	unsigned char		cluster;
 	unsigned char		discard_zeroes_data;
+
+	unsigned char		non_rotational;
 };
 
 struct request_queue
@@ -396,13 +398,11 @@ struct request_queue
 #define QUEUE_FLAG_SAME_COMP	9	/* force complete on same CPU */
 #define QUEUE_FLAG_FAIL_IO	10	/* fake timeout */
 #define QUEUE_FLAG_STACKABLE	11	/* supports request stacking */
-#define QUEUE_FLAG_NONROT	12	/* non-rotational device (SSD) */
-#define QUEUE_FLAG_VIRT		QUEUE_FLAG_NONROT /* paravirt device */
-#define QUEUE_FLAG_IO_STAT	13	/* do IO stats */
-#define QUEUE_FLAG_DISCARD	14	/* supports DISCARD */
-#define QUEUE_FLAG_NOXMERGES	15	/* No extended merges */
-#define QUEUE_FLAG_ADD_RANDOM	16	/* Contributes to random pool */
-#define QUEUE_FLAG_SECDISCARD	17	/* supports SECDISCARD */
+#define QUEUE_FLAG_IO_STAT	12	/* do IO stats */
+#define QUEUE_FLAG_DISCARD	13	/* supports DISCARD */
+#define QUEUE_FLAG_NOXMERGES	14	/* No extended merges */
+#define QUEUE_FLAG_ADD_RANDOM	15	/* Contributes to random pool */
+#define QUEUE_FLAG_SECDISCARD	16	/* supports SECDISCARD */
 
 #define QUEUE_FLAG_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_STACKABLE)	|	\
@@ -479,7 +479,6 @@ static inline void queue_flag_clear(unsigned int flag, struct request_queue *q)
 #define blk_queue_nomerges(q)	test_bit(QUEUE_FLAG_NOMERGES, &(q)->queue_flags)
 #define blk_queue_noxmerges(q)	\
 	test_bit(QUEUE_FLAG_NOXMERGES, &(q)->queue_flags)
-#define blk_queue_nonrot(q)	test_bit(QUEUE_FLAG_NONROT, &(q)->queue_flags)
 #define blk_queue_io_stat(q)	test_bit(QUEUE_FLAG_IO_STAT, &(q)->queue_flags)
 #define blk_queue_add_random(q)	test_bit(QUEUE_FLAG_ADD_RANDOM, &(q)->queue_flags)
 #define blk_queue_stackable(q)	\
@@ -821,6 +820,7 @@ extern void blk_limits_io_min(struct queue_limits *limits, unsigned int min);
 extern void blk_queue_io_min(struct request_queue *q, unsigned int min);
 extern void blk_limits_io_opt(struct queue_limits *limits, unsigned int opt);
 extern void blk_queue_io_opt(struct request_queue *q, unsigned int opt);
+extern void blk_queue_non_rotational(struct request_queue *q);
 extern void blk_set_default_limits(struct queue_limits *lim);
 extern void blk_set_stacking_limits(struct queue_limits *lim);
 extern int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
@@ -1028,6 +1028,16 @@ static inline int bdev_io_opt(struct block_device *bdev)
 	return queue_io_opt(bdev_get_queue(bdev));
 }
 
+static inline unsigned int queue_non_rotational(struct request_queue *q)
+{
+	return q->limits.non_rotational;
+}
+
+static inline unsigned int bdev_non_rotational(struct block_device *bdev)
+{
+	return queue_non_rotational(bdev_get_queue(bdev));
+}
+
 static inline int queue_alignment_offset(struct request_queue *q)
 {
 	if (q->limits.misaligned)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index d537d29..3f35457 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2110,7 +2110,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
 	}
 
 	if (p->bdev) {
-		if (blk_queue_nonrot(bdev_get_queue(p->bdev))) {
+		if (bdev_non_rotational(p->bdev)) {
 			p->flags |= SWP_SOLIDSTATE;
 			p->cluster_next = 1 + (random32() % p->highest_bit);
 		}
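To close the loop, this is how the pieces are meant to fit together
after the conversion (hypothetical driver and helper names, shown only
to illustrate the flow):

    /* A driver for a solid state device marks its queue once at
     * probe/setup time... */
    static void example_ssd_setup(struct request_queue *q)
    {
            blk_queue_non_rotational(q);  /* q->limits.non_rotational = 1 */
    }

    /* ...while consumers such as filesystems and swap simply query the
     * limit, and a stacked DM/MD device inherits it automatically via
     * blk_stack_limits(): */
    static int example_wants_ssd_policy(struct block_device *bdev)
    {
            return bdev_non_rotational(bdev);
    }

The point of the move is exactly that the stacking drivers no longer
need to special-case the flag; it now propagates through the existing
limits plumbing like the rest of the I/O topology.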