Message ID | e35afc461c9a79f568f2bcd4ef2cd1e07fb50082.1471890326.git.tom.ty89@gmail.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On Mon, Aug 22, 2016 at 1:55 PM, <tom.ty89@gmail.com> wrote: > From: Tom Yan <tom.ty89@gmail.com> > > In commit 5c79097a28c2 ("libata-scsi: reject WRITE SAME (16) with > n_block that exceeds limit"), it is made sure that > ata_set_lba_range_entries() will never be called with a request > size (n_block) that is larger than the number of blocks that a > 512-byte block TRIM payload can describe (65535 * 64 = 4194240), > in addition to acknowlegding the SCSI/block layer with the same > limit by advertising it as the Maximum Write Same Length. > > Therefore, it is unnecessary to hard code the same limit in > ata_set_lba_range_entries() itself, which would only cost extra > maintenance effort. Such effort can be noticed in, for example, > commit 2983860c7668 ("libata-scsi: avoid repeated calculation of > number of TRIM ranges"). > > Signed-off-by: Tom Yan <tom.ty89@gmail.com> > > diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c > index be9c76c..9b74ecb 100644 > --- a/drivers/ata/libata-scsi.c > +++ b/drivers/ata/libata-scsi.c > @@ -3322,7 +3322,7 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc) > buf = page_address(sg_page(scsi_sglist(scmd))); > > if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) { > - size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block); > + size = ata_set_lba_range_entries(buf, block, n_block); > } else { > fp = 2; > goto invalid_fld; > diff --git a/include/linux/ata.h b/include/linux/ata.h > index adbc812..5e2e9ad 100644 > --- a/include/linux/ata.h > +++ b/include/linux/ata.h > @@ -1077,19 +1077,19 @@ static inline void ata_id_to_hd_driveid(u16 *id) > * TO NV CACHE PINNED SET. > */ > static inline unsigned ata_set_lba_range_entries(void *_buffer, > - unsigned num, u64 sector, unsigned long count) > + u64 sector, unsigned long count) > { > __le64 *buffer = _buffer; > unsigned i = 0, used_bytes; > > - while (i < num) { > - u64 entry = sector | > - ((u64)(count > 0xffff ? 0xffff : count) << 48); > + while (count > 0) { > + u64 range, entry; > + > + range = count > 0xffff ? 0xffff : count; > + entry = sector | (range << 48); > buffer[i++] = __cpu_to_le64(entry); > - if (count <= 0xffff) > - break; > - count -= 0xffff; > - sector += 0xffff; > + count -= range; > + sector += range; > } I think the problem here is that I can now inject a buffer overflow via SG_IO. > used_bytes = ALIGN(i * 8, 512); > -- > 2.9.3 > -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
I don't see how that's possible. count / n_block will always be smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention that isn't even a "buffer limit" anyway. By SG_IO do you mean like SCSI Write Same commands that issued with sg_write_same or so? If that's the case, that's what exactly commit 5c79097a28c2 ("libata-scsi: reject WRITE SAME (16) with n_block that exceeds limit") is for. On 23 August 2016 at 03:58, Shaun Tancheff <shaun@tancheff.com> wrote: > On Mon, Aug 22, 2016 at 1:55 PM, <tom.ty89@gmail.com> wrote: >> From: Tom Yan <tom.ty89@gmail.com> >> >> In commit 5c79097a28c2 ("libata-scsi: reject WRITE SAME (16) with >> n_block that exceeds limit"), it is made sure that >> ata_set_lba_range_entries() will never be called with a request >> size (n_block) that is larger than the number of blocks that a >> 512-byte block TRIM payload can describe (65535 * 64 = 4194240), >> in addition to acknowlegding the SCSI/block layer with the same >> limit by advertising it as the Maximum Write Same Length. >> >> Therefore, it is unnecessary to hard code the same limit in >> ata_set_lba_range_entries() itself, which would only cost extra >> maintenance effort. Such effort can be noticed in, for example, >> commit 2983860c7668 ("libata-scsi: avoid repeated calculation of >> number of TRIM ranges"). >> >> Signed-off-by: Tom Yan <tom.ty89@gmail.com> >> >> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c >> index be9c76c..9b74ecb 100644 >> --- a/drivers/ata/libata-scsi.c >> +++ b/drivers/ata/libata-scsi.c >> @@ -3322,7 +3322,7 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc) >> buf = page_address(sg_page(scsi_sglist(scmd))); >> >> if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) { >> - size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block); >> + size = ata_set_lba_range_entries(buf, block, n_block); >> } else { >> fp = 2; >> goto invalid_fld; >> diff --git a/include/linux/ata.h b/include/linux/ata.h >> index adbc812..5e2e9ad 100644 >> --- a/include/linux/ata.h >> +++ b/include/linux/ata.h >> @@ -1077,19 +1077,19 @@ static inline void ata_id_to_hd_driveid(u16 *id) >> * TO NV CACHE PINNED SET. >> */ >> static inline unsigned ata_set_lba_range_entries(void *_buffer, >> - unsigned num, u64 sector, unsigned long count) >> + u64 sector, unsigned long count) >> { >> __le64 *buffer = _buffer; >> unsigned i = 0, used_bytes; >> >> - while (i < num) { >> - u64 entry = sector | >> - ((u64)(count > 0xffff ? 0xffff : count) << 48); >> + while (count > 0) { >> + u64 range, entry; >> + >> + range = count > 0xffff ? 0xffff : count; >> + entry = sector | (range << 48); >> buffer[i++] = __cpu_to_le64(entry); >> - if (count <= 0xffff) >> - break; >> - count -= 0xffff; >> - sector += 0xffff; >> + count -= range; >> + sector += range; >> } > > I think the problem here is that I can now inject a buffer overflow > via SG_IO. > >> used_bytes = ALIGN(i * 8, 512); >> -- >> 2.9.3 >> -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 22, 2016 at 3:07 PM, Tom Yan <tom.ty89@gmail.com> wrote: > I don't see how that's possible. count / n_block will always be > smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention > that isn't even a "buffer limit" anyway. By SG_IO do you mean like > SCSI Write Same commands that issued with sg_write_same or so? If > that's the case, that's what exactly commit 5c79097a28c2 > ("libata-scsi: reject WRITE SAME (16) with n_block that exceeds > limit") is for. Ah, I see. You are guarding the only user of ata_set_lba_range_entries(). Still if you are going to do that you have to alert any new user that they must have an appropriately sized buffer to be overwriting. Better to move it out of ata.h then the limit the scope of accidental misuse? Regards, Shaun -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 22 August 2016 at 20:32, Shaun Tancheff <shaun.tancheff@seagate.com> wrote: > On Mon, Aug 22, 2016 at 3:07 PM, Tom Yan <tom.ty89@gmail.com> wrote: >> I don't see how that's possible. count / n_block will always be >> smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention >> that isn't even a "buffer limit" anyway. By SG_IO do you mean like >> SCSI Write Same commands that issued with sg_write_same or so? If >> that's the case, that's what exactly commit 5c79097a28c2 >> ("libata-scsi: reject WRITE SAME (16) with n_block that exceeds >> limit") is for. > > Ah, I see. You are guarding the only user of ata_set_lba_range_entries(). Yup. It is the only right thing to do anyway, that we leave the function "open" and guard per context when we use it. Say if ata_set_lba_range_entries() is gonna be a function that is shared by others, it would only make this commit more important. As I said, we did not guard it with a certain buffer limit, but merely redundantly guard it with a ("humanized") limit that applies to TRIM only. > Still if you are going to do that you have to alert any new user that they > must have an appropriately sized buffer to be overwriting. > > Better to move it out of ata.h then the limit the scope of accidental > misuse? I am not sure if it is really necessary but that's fine to me. I see that you have been doing it in your SCT Write Same patch series. Probably I can/should just leave the move to you? > > Regards, > Shaun -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 22, 2016 at 3:53 PM, Tom Yan <tom.ty89@gmail.com> wrote: > On 22 August 2016 at 20:32, Shaun Tancheff <shaun.tancheff@seagate.com> wrote: >> On Mon, Aug 22, 2016 at 3:07 PM, Tom Yan <tom.ty89@gmail.com> wrote: >>> I don't see how that's possible. count / n_block will always be >>> smaller than 65535 * ATA_MAX_TRIM_RNUM(64) = 4194240. Not to mention >>> that isn't even a "buffer limit" anyway. By SG_IO do you mean like >>> SCSI Write Same commands that issued with sg_write_same or so? If >>> that's the case, that's what exactly commit 5c79097a28c2 >>> ("libata-scsi: reject WRITE SAME (16) with n_block that exceeds >>> limit") is for. >> >> Ah, I see. You are guarding the only user of ata_set_lba_range_entries(). > > Yup. It is the only right thing to do anyway, that we leave the > function "open" and guard per context when we use it. Say if > ata_set_lba_range_entries() is gonna be a function that is shared by > others, it would only make this commit more important. As I said, we > did not guard it with a certain buffer limit, but merely redundantly > guard it with a ("humanized") limit that applies to TRIM only. But the "humanized" limit is the one you just added and proceeded to change ata_set_lba_range_entries(). You changed from a buffer size to use "num" instead and now you want to remove the protection entirely? Why not just change to put this in front of ata_set_lba_range_entries() if (n_block > 65535 * ATA_MAX_TRIM_RNUM) { fp = 2; goto invalid_fld; } And then restore ata_set_lba_range_entries() to how it looked before you changed it in commit: 2983860c7 (libata-scsi: avoid repeated calculation of number of TRIM ranges) Then you can have ata_set_lba_range_entries() take the buffer size ... something like the following would be fine: size = ata_set_lba_range_entries(buf, scmd->device->sector_size, block, n_block); Now things are happily protected against both exceeding the b0 limit(s) and overflowing the sglist buffer. -- Shaun -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote: > > But the "humanized" limit is the one you just added and proceeded to > change ata_set_lba_range_entries(). You changed from a buffer size > to use "num" instead and now you want to remove the protection > entirely? I am not sure about what you are talking about here. What I did is just to avoid setting an (inappropriate and unnecessary) hard-coded limit (with the original while-condition) for i (the unaligned buffer size) in this general (in the sense that TRIM payload is _not always_ of one 512-byte block, even if we may want to forever practice that in our kernel) function. If we really want/need to avoid hitting some real buffer limit (e.g. maximum length of scatter/gather list?), then we should in some way check n_block against that. If it is too large we then return used_bytes = 0 (optionally with some follow-up to add a response to such return value or so). We advertise 4194240 as Maximum Write Same Length so that the SCSI/block layer will know how to split the discard request, and then we make sure that the SATL reject SCSI commands (that is not issued through the discard / write same ioctl but with SG_IO / sg_write_same). That's all we really need to do, end of story. > > Why not just change to put this in front of ata_set_lba_range_entries() > > if (n_block > 65535 * ATA_MAX_TRIM_RNUM) { > fp = 2; > goto invalid_fld; > } Well you can change the if-else clause to that. Apparently you've been doing that in your patch series. But that has nothing to do with what I am fixing here. Neither does it affect the actual behavior. > > And then restore ata_set_lba_range_entries() to how it looked > before you changed it in commit: > > 2983860c7 (libata-scsi: avoid repeated calculation of number of TRIM ranges) I don't see how it is relevant at all. Though I should have introduced _this_ patch before that to save some work. Unfortunately I wasn't aware how ugly ata_set_lba_range_entries was. > > Then you can have ata_set_lba_range_entries() take the buffer size ... > something like the following would be fine: > > size = ata_set_lba_range_entries(buf, scmd->device->sector_size, > block, n_block); > > Now things are happily protected against both exceeding the b0 limit(s) and > overflowing the sglist buffer. We have no reason at all to bring the logical sector size in here. Only if in future ACS standards it is stated that payload block are counted in logical block size instead of statically 512, we would only need to make the now static "512" in ALIGN() in to 4096. For possible sglist buffer overflow, see what I mentioned above about checking n_block and return 0. We would not want to do it with the original while-way anyway. All it does would be to truncate larger request and partially complete it silently. Can you tell exactly how the size of the sglist buffer is limited? AFAIK it has nothing to do with logical sector size. I doubt that we will ever cause an overflow, given how conservative we have been on the size of TRIM payload. We would probably ignore the reported value once it is larger than 16 or so, if we would ever enable multi-block payload in the future. Realistically speaking, I sent this patch not really to get the function ready for multi-block payload (neither I have mentioned that in the commit message), but just to make the code proper and avoid silly effort (and confusion caused) you and I spent in our patches. > > -- > Shaun -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 23 August 2016 at 08:37, Tom Yan <tom.ty89@gmail.com> wrote: > On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote: >> >> But the "humanized" limit is the one you just added and proceeded to >> change ata_set_lba_range_entries(). You changed from a buffer size >> to use "num" instead and now you want to remove the protection >> entirely? > > I am not sure about what you are talking about here. What I did is > just to avoid setting an (inappropriate and unnecessary) hard-coded > limit (with the original while-condition) for i (the unaligned buffer > size) in this general (in the sense that TRIM payload is _not always_ > of one 512-byte block, even if we may want to forever practice that in > our kernel) function. > > If we really want/need to avoid hitting some real buffer limit (e.g. > maximum length of scatter/gather list?), then we should in some way > check n_block against that. If it is too large we then return > used_bytes = 0 (optionally with some follow-up to add a response to > such return value or so). > > We advertise 4194240 as Maximum Write Same Length so that the > SCSI/block layer will know how to split the discard request, and then > we make sure that the SATL reject SCSI commands (that is not issued > through the discard / write same ioctl but with SG_IO / > sg_write_same). That's all we really need to do, end of story. > >> >> Why not just change to put this in front of ata_set_lba_range_entries() >> >> if (n_block > 65535 * ATA_MAX_TRIM_RNUM) { >> fp = 2; >> goto invalid_fld; >> } > > Well you can change the if-else clause to that. Apparently you've been > doing that in your patch series. But that has nothing to do with what > I am fixing here. Neither does it affect the actual behavior. > >> >> And then restore ata_set_lba_range_entries() to how it looked >> before you changed it in commit: >> >> 2983860c7 (libata-scsi: avoid repeated calculation of number of TRIM ranges) > > I don't see how it is relevant at all. Though I should have introduced > _this_ patch before that to save some work. Unfortunately I wasn't > aware how ugly ata_set_lba_range_entries was. > >> >> Then you can have ata_set_lba_range_entries() take the buffer size ... >> something like the following would be fine: >> >> size = ata_set_lba_range_entries(buf, scmd->device->sector_size, >> block, n_block); >> >> Now things are happily protected against both exceeding the b0 limit(s) and >> overflowing the sglist buffer. > > We have no reason at all to bring the logical sector size in here. > Only if in future ACS standards it is stated that payload block are > counted in logical block size instead of statically 512, we would only > need to make the now static "512" in ALIGN() in to 4096. typo. I meant from static "512" to logical sector size, and I really think we should do it ONLY if the later ACS changes its statement. > > For possible sglist buffer overflow, see what I mentioned above about > checking n_block and return 0. We would not want to do it with the > original while-way anyway. All it does would be to truncate larger > request and partially complete it silently. > > Can you tell exactly how the size of the sglist buffer is limited? > AFAIK it has nothing to do with logical sector size. I doubt that we > will ever cause an overflow, given how conservative we have been on > the size of TRIM payload. We would probably ignore the reported value > once it is larger than 16 or so, if we would ever enable multi-block > payload in the future. > > Realistically speaking, I sent this patch not really to get the > function ready for multi-block payload (neither I have mentioned that > in the commit message), but just to make the code proper and avoid > silly effort (and confusion caused) you and I spent in our patches. > >> >> -- >> Shaun -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Aug 23, 2016 at 3:37 AM, Tom Yan <tom.ty89@gmail.com> wrote: > On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote: > If we really want/need to avoid hitting some real buffer limit (e.g. > maximum length of scatter/gather list?), then we should in some way > check n_block against that. If it is too large we then return > used_bytes = 0 (optionally with some follow-up to add a response to > such return value or so). Yes there is a real buffer limit, I can think of these two options: 1- Assume the setups from sd_setup_discard_cmnd() and/ or sd_setup_write_same_cmnd() are providing an sglist of sdp->sector_size via scsi_init_io() 2- Find (or write) a suitable sg_get_size(sgl, nents) to walk the sglist and calculate the available buffer size. #2 sounds like more fun but I'm not sure it's what people would prefer to see. -- Shaun -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 23 August 2016 at 09:18, Shaun Tancheff <shaun.tancheff@seagate.com> wrote: > On Tue, Aug 23, 2016 at 3:37 AM, Tom Yan <tom.ty89@gmail.com> wrote: >> On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote: > >> If we really want/need to avoid hitting some real buffer limit (e.g. >> maximum length of scatter/gather list?), then we should in some way >> check n_block against that. If it is too large we then return >> used_bytes = 0 (optionally with some follow-up to add a response to >> such return value or so). > > Yes there is a real buffer limit, I can think of these two options: > 1- Assume the setups from sd_setup_discard_cmnd() and/ > or sd_setup_write_same_cmnd() are providing an sglist of > sdp->sector_size via scsi_init_io() That sounds completely wrong. The scatter/gather list we are talking about here has nothing to do with the SCSI or block layer anymore. The SATL has _already_ parsed the SCSI Write Same (16) command and is packing ranges/payload according to that in this stage. If there is any limit it would probably the max_segment allowed by the host driver (e.g. ahci). It doesn't seem to make sense to me either that we would need to prevent sglist overflow in such level. Doesn't that mean we would need to do the same checking (specifically, as in hard coding checks in all kinds of procedures) in every use of scatter/gather list? That doesn't sound right at all. > > 2- Find (or write) a suitable sg_get_size(sgl, nents) to walk the > sglist and calculate the available buffer size. > > #2 sounds like more fun but I'm not sure it's what people would prefer to see. No idea if such thing exists / makes sense at all. > > -- > Shaun -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Wait a minute. I think you missed or misunderstood something when you listen to someone's opinion in that we should switch to sglist buffer. I think the danger people referred to is exactly what is revealed when the ugly code is removed in this commit (it doesn't mean that the code should be kept though). The original buffer appears to be open: buf = page_address(sg_page(scsi_sglist(scmd))); While the new buffer you adopted in in ata_format_sct_write_same() and ata_format_dsm_trim_descr() is of fixed size: buffer = ((void *)ata_scsi_rbuf); sctpg = ((void *)ata_scsi_rbuf); because: #define ATA_SCSI_RBUF_SIZE 4096 ... static u8 ata_scsi_rbuf[ATA_SCSI_RBUF_SIZE]; So the sglist buffer is always 4096 bytes. And hence you can probably safely use ATA_SCSI_RBUF_SIZE as the buflen param in the sg_copy_from_buffer() calls (at least in the case of ata_format_sct_write_same()). However, the return value of ata_format_dsm_trim_descr() should still always be used_bytes since that is needed by the ata taskfile construction. You may want to check (n_block / 65535 * 8 > ATA_SCSI_RBUF_SIZE). If it is true, then perhaps we may want to return 0, and make the SATL response with invalid CDB field if we catch that. Though IMHO this is really NOT a reason that is strong enough to prevent this patch from entering the repo first. On 23 August 2016 at 09:36, Tom Yan <tom.ty89@gmail.com> wrote: > On 23 August 2016 at 09:18, Shaun Tancheff <shaun.tancheff@seagate.com> wrote: >> On Tue, Aug 23, 2016 at 3:37 AM, Tom Yan <tom.ty89@gmail.com> wrote: >>> On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote: >> >>> If we really want/need to avoid hitting some real buffer limit (e.g. >>> maximum length of scatter/gather list?), then we should in some way >>> check n_block against that. If it is too large we then return >>> used_bytes = 0 (optionally with some follow-up to add a response to >>> such return value or so). >> >> Yes there is a real buffer limit, I can think of these two options: >> 1- Assume the setups from sd_setup_discard_cmnd() and/ >> or sd_setup_write_same_cmnd() are providing an sglist of >> sdp->sector_size via scsi_init_io() > > That sounds completely wrong. The scatter/gather list we are talking > about here has nothing to do with the SCSI or block layer anymore. The > SATL has _already_ parsed the SCSI Write Same (16) command and is > packing ranges/payload according to that in this stage. If there is > any limit it would probably the max_segment allowed by the host driver > (e.g. ahci). > > It doesn't seem to make sense to me either that we would need to > prevent sglist overflow in such level. Doesn't that mean we would need > to do the same checking (specifically, as in hard coding checks in all > kinds of procedures) in every use of scatter/gather list? That doesn't > sound right at all. > >> >> 2- Find (or write) a suitable sg_get_size(sgl, nents) to walk the >> sglist and calculate the available buffer size. >> >> #2 sounds like more fun but I'm not sure it's what people would prefer to see. > > No idea if such thing exists / makes sense at all. > >> >> -- >> Shaun -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Aug 23, 2016 at 5:17 AM, Tom Yan <tom.ty89@gmail.com> wrote: > Wait a minute. I think you missed or misunderstood something when you > listen to someone's opinion in that we should switch to sglist buffer. No, I think I can trust Christoph Hellwig <hch@lst.de> > I think the danger people referred to is exactly what is revealed when > the ugly code is removed in this commit (it doesn't mean that the code > should be kept though). > > The original buffer appears to be open: > buf = page_address(sg_page(scsi_sglist(scmd))); Which is unsafe. > While the new buffer you adopted in in ata_format_sct_write_same() and > ata_format_dsm_trim_descr() is of fixed size: Yes ... it is a temporary response buffer for simulated commands used to copy data to and from the command sg_list so as not to hold irqs while modifying the buffer. > buffer = ((void *)ata_scsi_rbuf); > > sctpg = ((void *)ata_scsi_rbuf); > > because: > > #define ATA_SCSI_RBUF_SIZE 4096 > ... > static u8 ata_scsi_rbuf[ATA_SCSI_RBUF_SIZE]; > > So the sglist buffer is always 4096 bytes. No. The sglist buffer attached to the write same / trim command is always sdp->sector_size > And hence you can probably safely use ATA_SCSI_RBUF_SIZE as the buflen > param in the sg_copy_from_buffer() calls (at least in the case of > ata_format_sct_write_same()). No. SCT Write Same has a fixed single 512 byte transfer. However, the return value of > ata_format_dsm_trim_descr() should still always be used_bytes since > that is needed by the ata taskfile construction. So long as it does not exceed its sglist/sector_size buffer. > You may want to check (n_block / 65535 * 8 > ATA_SCSI_RBUF_SIZE). If > it is true, then perhaps we may want to return 0, and make the SATL > response with invalid CDB field if we catch that. No that is not quite right you need to check if you are overflowing either RBUF or sdp->sector_size. > Though IMHO this is really NOT a reason that is strong enough to > prevent this patch from entering the repo first. > On 23 August 2016 at 09:36, Tom Yan <tom.ty89@gmail.com> wrote: >> On 23 August 2016 at 09:18, Shaun Tancheff <shaun.tancheff@seagate.com> wrote: >>> On Tue, Aug 23, 2016 at 3:37 AM, Tom Yan <tom.ty89@gmail.com> wrote: >>>> On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote: >>> >>>> If we really want/need to avoid hitting some real buffer limit (e.g. >>>> maximum length of scatter/gather list?), then we should in some way >>>> check n_block against that. If it is too large we then return >>>> used_bytes = 0 (optionally with some follow-up to add a response to >>>> such return value or so). >>> >>> Yes there is a real buffer limit, I can think of these two options: >>> 1- Assume the setups from sd_setup_discard_cmnd() and/ >>> or sd_setup_write_same_cmnd() are providing an sglist of >>> sdp->sector_size via scsi_init_io() >> >> That sounds completely wrong. The scatter/gather list we are talking >> about here has nothing to do with the SCSI or block layer anymore. The >> SATL has _already_ parsed the SCSI Write Same (16) command and is >> packing ranges/payload according to that in this stage. If there is >> any limit it would probably the max_segment allowed by the host driver >> (e.g. ahci). >> >> It doesn't seem to make sense to me either that we would need to >> prevent sglist overflow in such level. Doesn't that mean we would need >> to do the same checking (specifically, as in hard coding checks in all >> kinds of procedures) in every use of scatter/gather list? That doesn't >> sound right at all. >> >>> >>> 2- Find (or write) a suitable sg_get_size(sgl, nents) to walk the >>> sglist and calculate the available buffer size. >>> >>> #2 sounds like more fun but I'm not sure it's what people would prefer to see. >> >> No idea if such thing exists / makes sense at all. >> >>> >>> -- >>> Shaun
On 23 August 2016 at 10:45, Shaun Tancheff <shaun.tancheff@seagate.com> wrote: > On Tue, Aug 23, 2016 at 5:17 AM, Tom Yan <tom.ty89@gmail.com> wrote: >> Wait a minute. I think you missed or misunderstood something when you >> listen to someone's opinion in that we should switch to sglist buffer. > > No, I think I can trust Christoph Hellwig <hch@lst.de> I didn't say that you cannot trust him, but I just wonder you might have misinterpreted what he said. I haven't really follow your patch series earlier though. > >> I think the danger people referred to is exactly what is revealed when >> the ugly code is removed in this commit (it doesn't mean that the code >> should be kept though). >> >> The original buffer appears to be open: >> buf = page_address(sg_page(scsi_sglist(scmd))); > > Which is unsafe. Yup, but the while loop is not the right way to make it safe. It probably prevent any overflow, but does not handle / response to the request (n_block) properly. It would only be practically unsafe if we would ever call it with a request that needs a multi-block payload. > >> While the new buffer you adopted in in ata_format_sct_write_same() and >> ata_format_dsm_trim_descr() is of fixed size: > > Yes ... it is a temporary response buffer for simulated commands used to > copy data to and from the command sg_list so as not to hold irqs while > modifying the buffer. > >> buffer = ((void *)ata_scsi_rbuf); >> >> sctpg = ((void *)ata_scsi_rbuf); >> >> because: >> >> #define ATA_SCSI_RBUF_SIZE 4096 >> ... >> static u8 ata_scsi_rbuf[ATA_SCSI_RBUF_SIZE]; >> >> So the sglist buffer is always 4096 bytes. > > No. The sglist buffer attached to the write same / trim command Don't you think it is in appropriate to use ata_scsi_rbuf here then? > is always sdp->sector_size I can hardly see how sdp->sector_size is relevant to the sglist buffer (especially the sglist here is facing to the ATA device). Scatter/gather list could have multiple entries of a certain size (that is not necessarily, or unlikely, logical sector size). Although I don't really know how scatter/gather list works, it hardly seems making any sense by saying that the sglist buffer is always logical sector size. AFAIK, for example, USB Attached SCSI driver does not limit the maximum number of entries (max_segment) so the kernel fall back to SG_MAX_SEGMENTS (2048), while it reports a max_segment_size of 4096 in spite of the logical sector size with dma_get_max_seg_size(). For AHCI, it reports a max_segment of 168, while it does not seem to report any max_segment_size so dma_get_max_seg_size() fallback to 65536 (though practically it seems 4096 will still be used in normal writing). > >> And hence you can probably safely use ATA_SCSI_RBUF_SIZE as the buflen >> param in the sg_copy_from_buffer() calls (at least in the case of >> ata_format_sct_write_same()). > > No. SCT Write Same has a fixed single 512 byte transfer. Never mind on that. I have presumption that you should always fully copy ata_scsi_rbuf to the sglist. > > However, the return value of >> ata_format_dsm_trim_descr() should still always be used_bytes since >> that is needed by the ata taskfile construction. > > So long as it does not exceed its sglist/sector_size buffer. > >> You may want to check (n_block / 65535 * 8 > ATA_SCSI_RBUF_SIZE). If >> it is true, then perhaps we may want to return 0, and make the SATL >> response with invalid CDB field if we catch that. > > No that is not quite right you need to check if you are > overflowing either RBUF or sdp->sector_size. Again, this does not make any sense. See above. Apparently you really have no reason to use ata_scsi_rbuf. > >> Though IMHO this is really NOT a reason that is strong enough to >> prevent this patch from entering the repo first. > >> On 23 August 2016 at 09:36, Tom Yan <tom.ty89@gmail.com> wrote: >>> On 23 August 2016 at 09:18, Shaun Tancheff <shaun.tancheff@seagate.com> wrote: >>>> On Tue, Aug 23, 2016 at 3:37 AM, Tom Yan <tom.ty89@gmail.com> wrote: >>>>> On 23 August 2016 at 07:30, Shaun Tancheff <shaun@tancheff.com> wrote: >>>> >>>>> If we really want/need to avoid hitting some real buffer limit (e.g. >>>>> maximum length of scatter/gather list?), then we should in some way >>>>> check n_block against that. If it is too large we then return >>>>> used_bytes = 0 (optionally with some follow-up to add a response to >>>>> such return value or so). >>>> >>>> Yes there is a real buffer limit, I can think of these two options: >>>> 1- Assume the setups from sd_setup_discard_cmnd() and/ >>>> or sd_setup_write_same_cmnd() are providing an sglist of >>>> sdp->sector_size via scsi_init_io() >>> >>> That sounds completely wrong. The scatter/gather list we are talking >>> about here has nothing to do with the SCSI or block layer anymore. The >>> SATL has _already_ parsed the SCSI Write Same (16) command and is >>> packing ranges/payload according to that in this stage. If there is >>> any limit it would probably the max_segment allowed by the host driver >>> (e.g. ahci). >>> >>> It doesn't seem to make sense to me either that we would need to >>> prevent sglist overflow in such level. Doesn't that mean we would need >>> to do the same checking (specifically, as in hard coding checks in all >>> kinds of procedures) in every use of scatter/gather list? That doesn't >>> sound right at all. >>> >>>> >>>> 2- Find (or write) a suitable sg_get_size(sgl, nents) to walk the >>>> sglist and calculate the available buffer size. >>>> >>>> #2 sounds like more fun but I'm not sure it's what people would prefer to see. >>> >>> No idea if such thing exists / makes sense at all. >>> >>>> >>>> -- >>>> Shaun > > > > -- > Shaun Tancheff -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Tom, In my opinion this patch you submitted is simply making the code less safe against a buffer overflow without a sufficiently good reason. In future please comment on other patches as replies to those patches. Mixing them together is just confusing. --Shaun -- To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index be9c76c..9b74ecb 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -3322,7 +3322,7 @@ static unsigned int ata_scsi_write_same_xlat(struct ata_queued_cmd *qc) buf = page_address(sg_page(scsi_sglist(scmd))); if (n_block <= 65535 * ATA_MAX_TRIM_RNUM) { - size = ata_set_lba_range_entries(buf, ATA_MAX_TRIM_RNUM, block, n_block); + size = ata_set_lba_range_entries(buf, block, n_block); } else { fp = 2; goto invalid_fld; diff --git a/include/linux/ata.h b/include/linux/ata.h index adbc812..5e2e9ad 100644 --- a/include/linux/ata.h +++ b/include/linux/ata.h @@ -1077,19 +1077,19 @@ static inline void ata_id_to_hd_driveid(u16 *id) * TO NV CACHE PINNED SET. */ static inline unsigned ata_set_lba_range_entries(void *_buffer, - unsigned num, u64 sector, unsigned long count) + u64 sector, unsigned long count) { __le64 *buffer = _buffer; unsigned i = 0, used_bytes; - while (i < num) { - u64 entry = sector | - ((u64)(count > 0xffff ? 0xffff : count) << 48); + while (count > 0) { + u64 range, entry; + + range = count > 0xffff ? 0xffff : count; + entry = sector | (range << 48); buffer[i++] = __cpu_to_le64(entry); - if (count <= 0xffff) - break; - count -= 0xffff; - sector += 0xffff; + count -= range; + sector += range; } used_bytes = ALIGN(i * 8, 512);