Message ID | 1422896713-25367-2-git-send-email-holler@ahsoftware.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Feb 02, 2015 at 06:05:09PM +0100, Alexander Holler wrote: > + if (inode) { > + // TODO: > + // if (inode is file and 's' flag is set) > + // secure = true; > + if (!secure) > + iput(inode); /* truncate the inode here */ > + else { > + struct super_block *sb = inode->i_sb; > + if (sb->s_op->set_secure_delete) > + sb->s_op->set_secure_delete(sb, true); > + // TODO: We should fail if secure isn't supported, > + // look up how that's possible here. > + iput(inode); /* truncate the inode here */ > + // TODO: check if sb is still valid after the inode is gone > + sync_filesystem(sb); > + if (sb->s_op->set_secure_delete) > + sb->s_op->set_secure_delete(sb, false); > + } Charming. Now, what exactly happens if two such syscalls overlap in time? Moroever, what makes you equate unlink() with inode removal? What happens if you race e.g. with stat(2) on the same thing? Or if there's an opened file over that sucker, for that matter? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 03.02.2015 um 07:05 schrieb Al Viro: > On Mon, Feb 02, 2015 at 06:05:09PM +0100, Alexander Holler wrote: >> + if (inode) { >> + // TODO: >> + // if (inode is file and 's' flag is set) >> + // secure = true; >> + if (!secure) >> + iput(inode); /* truncate the inode here */ >> + else { >> + struct super_block *sb = inode->i_sb; >> + if (sb->s_op->set_secure_delete) >> + sb->s_op->set_secure_delete(sb, true); >> + // TODO: We should fail if secure isn't supported, >> + // look up how that's possible here. >> + iput(inode); /* truncate the inode here */ >> + // TODO: check if sb is still valid after the inode is gone >> + sync_filesystem(sb); >> + if (sb->s_op->set_secure_delete) >> + sb->s_op->set_secure_delete(sb, false); >> + } > > Charming. Now, what exactly happens if two such syscalls overlap in time? What do you think will happen? I assume you haven't looked at how I've implemented set_secure_delete(). CHarming. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 03.02.2015 um 07:05 schrieb Al Viro: > On Mon, Feb 02, 2015 at 06:05:09PM +0100, Alexander Holler wrote: >> + if (inode) { >> + // TODO: >> + // if (inode is file and 's' flag is set) >> + // secure = true; >> + if (!secure) >> + iput(inode); /* truncate the inode here */ >> + else { >> + struct super_block *sb = inode->i_sb; >> + if (sb->s_op->set_secure_delete) >> + sb->s_op->set_secure_delete(sb, true); >> + // TODO: We should fail if secure isn't supported, >> + // look up how that's possible here. >> + iput(inode); /* truncate the inode here */ >> + // TODO: check if sb is still valid after the inode is gone >> + sync_filesystem(sb); >> + if (sb->s_op->set_secure_delete) >> + sb->s_op->set_secure_delete(sb, false); >> + } > > Charming. Now, what exactly happens if two such syscalls overlap in time? > Moroever, what makes you equate unlink() with inode removal? What happens > if you race e.g. with stat(2) on the same thing? Or if there's an opened > file over that sucker, for that matter? Sorry, but I first had to make breakfast after I've got angry about the usual arrogance of most Linux kernel maintainers. I've already answered the first question. Now to the second. That still might be a problem. But that's why this is a RFC, why there is a WIP (Work In Progress) before the patch, why I've written I've never looked at those sources before, why I've written they are imperfect and why I've written I have not spend much time on these patches. I've posted them to show how I think the problem might be solved. These patches evolved out of desperation that otherwise users have to wait another 30 years until they will be offered a way to really delete files. If the removal is somehow scheduled then, of course, the secure flag has to be scheduled too. Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 03, 2015 at 07:58:50AM +0100, Alexander Holler wrote: > > Charming. Now, what exactly happens if two such syscalls overlap in time? > > What do you think will happen? I assume you haven't looked at how I've > implemented set_secure_delete(). CHarming. AFAICS, you get random unlink() happening at the same time hit by that mess, whether they'd asked for it or not. What's more, this counter of yours is *not* guaranteed to be elevated during the final iput() of the inode you wanted to get - again, ls -lR racing with that syscall can elevate the refcount of dentry, making d_delete() in vfs_unlink() just remove that dentry from hash, while keeping it positive. If dentry reference grabbed by stat(2) is released after both dput() and iput() in do_unlinkat(), the final iput() will be done when stat(2) drops its reference to dentry, triggering immediate dentry_kill() (since dentry has already been unhashed) and dentry_iput() from it. IOW, this counter is both too crude (it's fs-wide, for crying out loud) *and* not guaranteed to cover enough. _IF_ you want that behaviour at all, it ought to be an in-core inode flag set by that syscall and checked by truncation logics to decide whether to do normal truncate of this "overwrite with zeroes" thing. While we are at it, "overwrite with zeroes" is too weak if the attacker might get hold of the actual hardware. Google for details - it's far too long story for l-k posting. Look for data recovery and secure data erasure... -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-02-03 at 07:58 +0100, Alexander Holler wrote: > Am 03.02.2015 um 07:05 schrieb Al Viro: > > On Mon, Feb 02, 2015 at 06:05:09PM +0100, Alexander Holler wrote: > >> + if (inode) { > >> + // TODO: > >> + // if (inode is file and 's' flag is set) > >> + // secure = true; > >> + if (!secure) > >> + iput(inode); /* truncate the inode here */ > >> + else { > >> + struct super_block *sb = inode->i_sb; > >> + if (sb->s_op->set_secure_delete) > >> + sb->s_op->set_secure_delete(sb, true); > >> + // TODO: We should fail if secure isn't supported, > >> + // look up how that's possible here. > >> + iput(inode); /* truncate the inode here */ > >> + // TODO: check if sb is still valid after the inode is gone > >> + sync_filesystem(sb); > >> + if (sb->s_op->set_secure_delete) > >> + sb->s_op->set_secure_delete(sb, false); > >> + } > > > > Charming. Now, what exactly happens if two such syscalls overlap in time? > > What do you think will happen? I assume you haven't looked at how I've > implemented set_secure_delete(). CHarming. Chill, why don't you. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 03.02.2015 um 08:56 schrieb Al Viro: > While we are at it, "overwrite with zeroes" is too weak if the attacker > might get hold of the actual hardware. Google for details - it's far too > long story for l-k posting. Look for data recovery and secure data erasure... You might read http://link.springer.com/chapter/10.1007/978-3-540-89862-7_21 Here is an article in german about that: http://www.heise.de/security/meldung/Sicheres-Loeschen-Einmal-ueberschreiben-genuegt-198816.html In short, it's enough to overwrite it once with zeros, -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 03, 2015 at 09:01:36AM +0100, Alexander Holler wrote: > Am 03.02.2015 um 08:56 schrieb Al Viro: > > >While we are at it, "overwrite with zeroes" is too weak if the attacker > >might get hold of the actual hardware. Google for details - it's far too > >long story for l-k posting. Look for data recovery and secure data erasure... > > You might read > > http://link.springer.com/chapter/10.1007/978-3-540-89862-7_21 > > Here is an article in german about that: > > http://www.heise.de/security/meldung/Sicheres-Loeschen-Einmal-ueberschreiben-genuegt-198816.html > > In short, it's enough to overwrite it once with zeros, Regardless of the media used? How does that work on e.g. flash? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 03.02.2015 um 09:10 schrieb Al Viro: > On Tue, Feb 03, 2015 at 09:01:36AM +0100, Alexander Holler wrote: >> Am 03.02.2015 um 08:56 schrieb Al Viro: >> >>> While we are at it, "overwrite with zeroes" is too weak if the attacker >>> might get hold of the actual hardware. Google for details - it's far too >>> long story for l-k posting. Look for data recovery and secure data erasure... >> >> You might read >> >> http://link.springer.com/chapter/10.1007/978-3-540-89862-7_21 >> >> Here is an article in german about that: >> >> http://www.heise.de/security/meldung/Sicheres-Loeschen-Einmal-ueberschreiben-genuegt-198816.html >> >> In short, it's enough to overwrite it once with zeros, > > Regardless of the media used? How does that work on e.g. flash? That's why "secure trim" should be used if available. Blame the storage people for not offering it. But as I've already mentioned, they would just answer that filesystems don't (didn't) delete files anyway. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 03.02.2015 um 08:56 schrieb Al Viro: > On Tue, Feb 03, 2015 at 07:58:50AM +0100, Alexander Holler wrote: > >>> Charming. Now, what exactly happens if two such syscalls overlap in time? >> >> What do you think will happen? I assume you haven't looked at how I've >> implemented set_secure_delete(). CHarming. > > AFAICS, you get random unlink() happening at the same time hit by that > mess, whether they'd asked for it or not. What's more, this counter > of yours is *not* guaranteed to be elevated during the final iput() of the > inode you wanted to get - again, ls -lR racing with that syscall can > elevate the refcount of dentry, making d_delete() in vfs_unlink() just > remove that dentry from hash, while keeping it positive. If dentry > reference grabbed by stat(2) is released after both dput() and iput() in > do_unlinkat(), the final iput() will be done when stat(2) drops its > reference to dentry, triggering immediate dentry_kill() (since dentry > has already been unhashed) and dentry_iput() from it. Thanks for the short explanation. I will see if I can make sense out of it for me to get an idea how to solve that. > > IOW, this counter is both too crude (it's fs-wide, for crying out loud) > *and* not guaranteed to cover enough. _IF_ you want that behaviour at Sure it is crude. But it keeps the patches simple. As I've written, unlinkat_s() isn't meant for everyday usage, just for the rare case when one really wants to get rid of some contents. Therefor execution speed or an i/o slowdown while the "secure deletion" is in work is totally ignored And that "rare case" doesn't include military security levels, it's just meant for ordinary people which want make it much, much harder for other ordinary people (or geeks or kernel maintainers) to read the deleted content ever again. It's far too easy to use grep or something similiar to find seemingly deleted stuff at device level again (after it was deleted by what filesystems are offering nowadays). Especially if one thinks at stuff like certificates and similiar which can be identified by common patterns (bit sequences) they use. Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 03.02.2015 um 09:51 schrieb Alexander Holler: > Am 03.02.2015 um 08:56 schrieb Al Viro: >> On Tue, Feb 03, 2015 at 07:58:50AM +0100, Alexander Holler wrote: >> >>>> Charming. Now, what exactly happens if two such syscalls overlap in >>>> time? >>> >>> What do you think will happen? I assume you haven't looked at how I've >>> implemented set_secure_delete(). CHarming. >> >> AFAICS, you get random unlink() happening at the same time hit by that >> mess, whether they'd asked for it or not. What's more, this counter >> of yours is *not* guaranteed to be elevated during the final iput() of >> the >> inode you wanted to get - again, ls -lR racing with that syscall can >> elevate the refcount of dentry, making d_delete() in vfs_unlink() just >> remove that dentry from hash, while keeping it positive. If dentry >> reference grabbed by stat(2) is released after both dput() and iput() in >> do_unlinkat(), the final iput() will be done when stat(2) drops its >> reference to dentry, triggering immediate dentry_kill() (since dentry >> has already been unhashed) and dentry_iput() from it. > > Thanks for the short explanation. I will see if I can make sense out of > it for me to get an idea how to solve that. > >> >> IOW, this counter is both too crude (it's fs-wide, for crying out loud) >> *and* not guaranteed to cover enough. _IF_ you want that behaviour at > > Sure it is crude. > > But it keeps the patches simple. As I've written, unlinkat_s() isn't > meant for everyday usage, just for the rare case when one really wants > to get rid of some contents. Therefor execution speed or an i/o slowdown > while the "secure deletion" is in work is totally ignored > > And that "rare case" doesn't include military security levels, it's just > meant for ordinary people which want make it much, much harder for other > ordinary people (or geeks or kernel maintainers) to read the deleted > content ever again. It's far too easy to use grep or something similiar > to find seemingly deleted stuff at device level again (after it was > deleted by what filesystems are offering nowadays). Especially if one > thinks at stuff like certificates and similiar which can be identified > by common patterns (bit sequences) they use. Or to give another more common example: If you delete your contact list, I likely might find again by just searching for 0x6f726956 at the device level (assuming you've stored a contact in that list with the same surname as yours. And, because I've only mentioned in a different thread, now think at the problem that nowadays storage is often fixed (soldered) to devices which don't offer a way to delete the whole storage. You might have luck if the contact list in question was stored in some encrypted part, but that presumes that the key for that encrypted part isn't somehow stored on the same device too. Which unfortunately isn't always the case (maybe because of usability). And ... That's why I think filesystems should offer a way to really delete files. Most people would be happy, even if filesystems won't delete stuff at military security levels and would disregard all the cases when they couldn't make sure that stuff is really deleted. To conclude, most people would be already happy if the most trivial case would be handled right and not just by marking files as deleted but leaving the contents intact. Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 03.02.2015 um 10:23 schrieb Alexander Holler: > Or to give another more common example: If you delete your contact list, > I likely might find again by just searching for 0x6f726956 at the device > level (assuming you've stored a contact in that list with the same > surname as yours. > > And, because I've only mentioned in a different thread, now think at the > problem that nowadays storage is often fixed (soldered) to devices which > don't offer a way to delete the whole storage. You might have luck if > the contact list in question was stored in some encrypted part, but that > presumes that the key for that encrypted part isn't somehow stored on > the same device too. Which unfortunately isn't always the case (maybe > because of usability). And ... > > That's why I think filesystems should offer a way to really delete > files. Most people would be happy, even if filesystems won't delete > stuff at military security levels and would disregard all the cases when > they couldn't make sure that stuff is really deleted. > > To conclude, most people would be already happy if the most trivial case > would be handled right and not just by marking files as deleted but > leaving the contents intact. Maybe I should rename me to Quijote, switch back to pen and paper and should start to raise carrier pigeons. ;) E.g. my parents are stull successfully using contact lists on paper. These are still more readable, easier to handle and smaller than any available electronic replacement. And they have absolutely no problem to destroy an old one when they replace it with a new one. Regards, Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 03.02.2015 um 13:48 schrieb Alexander Holler: > E.g. my parents are stull successfully using contact lists on paper. > These are still more readable, easier to handle and smaller than any > available electronic replacement. And they have absolutely no problem to > destroy an old one when they replace it with a new one. Which very seldom happens, no screen which can crack and no battery. ;) -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler wrote: > Am 03.02.2015 um 09:51 schrieb Alexander Holler: >> Am 03.02.2015 um 08:56 schrieb Al Viro: >>> On Tue, Feb 03, 2015 at 07:58:50AM +0100, Alexander Holler wrote: >>> >>>>> Charming. Now, what exactly happens if two such syscalls overlap in >>>>> time? >>>> >>>> What do you think will happen? I assume you haven't looked at how I've >>>> implemented set_secure_delete(). CHarming. >>> >>> AFAICS, you get random unlink() happening at the same time hit by that >>> mess, whether they'd asked for it or not. What's more, this counter >>> of yours is *not* guaranteed to be elevated during the final iput() of >>> the >>> inode you wanted to get - again, ls -lR racing with that syscall can >>> elevate the refcount of dentry, making d_delete() in vfs_unlink() just >>> remove that dentry from hash, while keeping it positive. If dentry >>> reference grabbed by stat(2) is released after both dput() and iput() in >>> do_unlinkat(), the final iput() will be done when stat(2) drops its >>> reference to dentry, triggering immediate dentry_kill() (since dentry >>> has already been unhashed) and dentry_iput() from it. >> >> Thanks for the short explanation. I will see if I can make sense out of >> it for me to get an idea how to solve that. >> >>> >>> IOW, this counter is both too crude (it's fs-wide, for crying out loud) >>> *and* not guaranteed to cover enough. _IF_ you want that behaviour at >> >> Sure it is crude. >> >> But it keeps the patches simple. As I've written, unlinkat_s() isn't >> meant for everyday usage, just for the rare case when one really wants >> to get rid of some contents. Therefor execution speed or an i/o slowdown >> while the "secure deletion" is in work is totally ignored >> >> And that "rare case" doesn't include military security levels, it's just >> meant for ordinary people which want make it much, much harder for other >> ordinary people (or geeks or kernel maintainers) to read the deleted >> content ever again. It's far too easy to use grep or something similiar >> to find seemingly deleted stuff at device level again (after it was >> deleted by what filesystems are offering nowadays). Especially if one >> thinks at stuff like certificates and similiar which can be identified >> by common patterns (bit sequences) they use. > > Or to give another more common example: If you delete your contact list, > I likely might find again by just searching for 0x6f726956 at the device > level (assuming you've stored a contact in that list with the same > surname as yours. > > And, because I've only mentioned in a different thread, now think at the > problem that nowadays storage is often fixed (soldered) to devices which > don't offer a way to delete the whole storage. You might have luck if > the contact list in question was stored in some encrypted part, but that > presumes that the key for that encrypted part isn't somehow stored on > the same device too. Which unfortunately isn't always the case (maybe > because of usability). And ... > > That's why I think filesystems should offer a way to really delete > files. Most people would be happy, even if filesystems won't delete > stuff at military security levels and would disregard all the cases when > they couldn't make sure that stuff is really deleted. > > To conclude, most people would be already happy if the most trivial case > would be handled right and not just by marking files as deleted but > leaving the contents intact. Well, one other issue is that this only ensures that the extents referenced at the time of explicit deletion are wiped. On COW filesystems this is most obviously insufficient (All the old unreferenced-after-COW copies will have been left in place rather than erased in this manner). However, on other filesystems it can still happen anyway. So even just architecturally, this operates at the wrong point in time to make the guarantees it's claiming. Secure deletion behavior affects the entire filesystem, because it may very well make internal copies, not just explicit user-driven ones. Because of that, it'd pretty much have to be a mount flag of some variety, as far as I understand. When enabled, any time an extent is freed, it must be wiped. And even then, a single mount with that option disabled would destroy any such guarantees. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 03, 2015 at 01:48:56PM +0100, Alexander Holler wrote: > > E.g. my parents are stull successfully using contact lists on paper. These > are still more readable, easier to handle and smaller than any available > electronic replacement. And they have absolutely no problem to destroy an > old one when they replace it with a new one. Sort of using crypto, which I think is still the best response to this particular use case --- the trick is making it easy to use, but that's a desktop/distro integration problem --- the most secure way to really make sure information on paper and on an SSD is secure is actually the same --- you use a shredder. And this isn't just for "military grade security". There are some commercial cloud providers which do precisely this, because they know that their customer's security and their reputation is not easily valued, and certainly the cost of some piddling flash chips, after they have been depreciated, is defintiely cheaper than a security exposure. Look at the estimates of how much money Target lost with their little security oopsie. Again, this is commercial, not military security. Cheers, - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 03.02.2015 um 18:48 schrieb Theodore Ts'o: > On Tue, Feb 03, 2015 at 01:48:56PM +0100, Alexander Holler wrote: >> >> E.g. my parents are stull successfully using contact lists on paper. These >> are still more readable, easier to handle and smaller than any available >> electronic replacement. And they have absolutely no problem to destroy an >> old one when they replace it with a new one. > > Sort of using crypto, which I think is still the best response to this > particular use case --- the trick is making it easy to use, but that's > a desktop/distro integration problem --- the most secure way to really > make sure information on paper and on an SSD is secure is actually the > same --- you use a shredder. > > And this isn't just for "military grade security". There are some > commercial cloud providers which do precisely this, because they know > that their customer's security and their reputation is not easily > valued, and certainly the cost of some piddling flash chips, after > they have been depreciated, is defintiely cheaper than a security > exposure. Look at the estimates of how much money Target lost with > their little security oopsie. Again, this is commercial, not military > security. Yeah, as I've already admitted in the bug, I never should have use the word secure, because everyone nowadays seems to end up in panic when reading that word. So, if I would be able to use sed on my mails, I would replace unlinkat_s() with unlinkat_w() (for wipe) or would say that _s does stand for 'shred' in the means of shred(1). Anyways, as I hopefully still have some years left to live (also I'm writing SW since 30a, I'm not that old as some of you guys and still have to work at least two decades), I might be able to see where we will end up with all that fiasco we've already managed to drive into. Regards, Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 03, 2015 at 07:01:50PM +0100, Alexander Holler wrote: > Yeah, as I've already admitted in the bug, I never should have use > the word secure, because everyone nowadays seems to end up in panic > when reading that word. > > So, if I would be able to use sed on my mails, I would replace > unlinkat_s() with unlinkat_w() (for wipe) or would say that _s does > stand for 'shred' in the means of shred(1). TBH, I suspect that the saner API would be something like EXT2_IOC_[SG[ETFLAGS, allowing to set and query that along with other flags (append-only, etc.). Forget about unlink; first of all, whatever API you use should only _mark_ the inode as "zero freed blocks" (or trim, for that matter). You can't force freeing of an inode, so either you make sure that subsequent freeing of inode, whenever it happens, will do that work, or your API is hopelessly racy. Moreover, when link has been removed it's too late to report that fs has no way to e.g. trim those blocks, so you really want to have it done _before_ the actual link removal. And if the file contents is that sensitive, you'd better extend the same protection to all operations that free its blocks, including truncate(), fallocate() hole-punching, whatever. What's more, if you divorce that from link removal, you probably don't want it as in-core-only flag - have it stored in inode, if fs supports that. Alternatively, you might want to represent it as xattr - as much as I hate those, it might turn out to be the best fit in this case, if we end up with several variants for freed blocks disposal. Not sure... But whichever way we represent that state, IMO a) operation should be similar to chmod/chattr/setfattr - modifying inode metadata. b) it should affect _all_ operations freeing blocks of that file from that point on c) it should be able to fail, telling you that you can't do that for this backing store. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Feb 3, 2015, at 4:33 PM, Al Viro <viro@ZenIV.linux.org.uk> wrote: > On Tue, Feb 03, 2015 at 07:01:50PM +0100, Alexander Holler wrote: >> Yeah, as I've already admitted in the bug, I never should have use >> the word secure, because everyone nowadays seems to end up in panic >> when reading that word. >> >> So, if I would be able to use sed on my mails, I would replace >> unlinkat_s() with unlinkat_w() (for wipe) or would say that _s does >> stand for 'shred' in the means of shred(1). > > TBH, I suspect that the saner API would be something like > EXT2_IOC_[SG]ETFLAGS, allowing to set and query that along with other > flags (append-only, etc.). > > Forget about unlink; first of all, whatever API you use should only > _mark_ the inode as "zero freed blocks" (or trim, for that matter). This already exists for a long time. "chattr +s file [file...]" marks inodes for "secure deletion" (EXT2_SECRM_FL), but this wasn't implemented. Cheers, Andreas > You can't force freeing of an inode, so either you make sure that > subsequent freeing of inode, whenever it happens, will do that work, > or your API is hopelessly racy. Moreover, when link has been removed > it's too late to report that fs has no way to e.g. trim those blocks, > so you really want to have it done _before_ the actual link removal. > And if the file contents is that sensitive, > you'd better extend the same protection to all operations that free its > blocks, including truncate(), fallocate() hole-punching, whatever. What's > more, if you divorce that from link removal, you probably don't want it as > in-core-only flag - have it stored in inode, if fs supports that. > > Alternatively, you might want to represent it as xattr - as much as I hate > those, it might turn out to be the best fit in this case, if we end up > with several variants for freed blocks disposal. Not sure... > > But whichever way we represent that state, IMO > a) operation should be similar to chmod/chattr/setfattr - > modifying inode metadata. > b) it should affect _all_ operations freeing blocks of that > file from that point on > c) it should be able to fail, telling you that you can't do > that for this backing store. Cheers, Andreas -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[CC += linux-api@] On Mon, Feb 2, 2015 at 6:05 PM, Alexander Holler <holler@ahsoftware.de> wrote: > Signed-off-by: Alexander Holler <holler@ahsoftware.de> > --- > arch/x86/syscalls/syscall_32.tbl | 1 + > arch/x86/syscalls/syscall_64.tbl | 1 + > fs/namei.c | 38 ++++++++++++++++++++++++++++++----- > include/asm-generic/audit_dir_write.h | 1 + > include/linux/fs.h | 1 + > include/linux/syscalls.h | 1 + > include/uapi/asm-generic/unistd.h | 4 +++- > tools/perf/builtin-trace.c | 2 ++ > 8 files changed, 43 insertions(+), 6 deletions(-) > > diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl > index 9fe1b5d..7a3d530 100644 > --- a/arch/x86/syscalls/syscall_32.tbl > +++ b/arch/x86/syscalls/syscall_32.tbl > @@ -364,3 +364,4 @@ > 355 i386 getrandom sys_getrandom > 356 i386 memfd_create sys_memfd_create > 357 i386 bpf sys_bpf > +359 i386 unlinkat_s sys_unlinkat_s > diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl > index 281150b..97eaf01 100644 > --- a/arch/x86/syscalls/syscall_64.tbl > +++ b/arch/x86/syscalls/syscall_64.tbl > @@ -328,6 +328,7 @@ > 319 common memfd_create sys_memfd_create > 320 common kexec_file_load sys_kexec_file_load > 321 common bpf sys_bpf > +322 common unlinkat_s sys_unlinkat_s > > # > # x32-specific system call numbers start at 512 to avoid cache impact > diff --git a/fs/namei.c b/fs/namei.c > index db5fe86..1ad3724 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -3717,7 +3717,7 @@ EXPORT_SYMBOL(vfs_unlink); > * writeout happening, and we don't want to prevent access to the directory > * while waiting on the I/O. > */ > -static long do_unlinkat(int dfd, const char __user *pathname) > +static long do_unlinkat(int dfd, const char __user *pathname, bool secure) > { > int error; > struct filename *name; > @@ -3759,8 +3759,25 @@ exit2: > dput(dentry); > } > mutex_unlock(&nd.path.dentry->d_inode->i_mutex); > - if (inode) > - iput(inode); /* truncate the inode here */ > + if (inode) { > + // TODO: > + // if (inode is file and 's' flag is set) > + // secure = true; > + if (!secure) > + iput(inode); /* truncate the inode here */ > + else { > + struct super_block *sb = inode->i_sb; > + if (sb->s_op->set_secure_delete) > + sb->s_op->set_secure_delete(sb, true); > + // TODO: We should fail if secure isn't supported, > + // look up how that's possible here. > + iput(inode); /* truncate the inode here */ > + // TODO: check if sb is still valid after the inode is gone > + sync_filesystem(sb); > + if (sb->s_op->set_secure_delete) > + sb->s_op->set_secure_delete(sb, false); > + } > + } > inode = NULL; > if (delegated_inode) { > error = break_deleg_wait(&delegated_inode); > @@ -3796,12 +3813,23 @@ SYSCALL_DEFINE3(unlinkat, int, dfd, const char __user *, pathname, int, flag) > if (flag & AT_REMOVEDIR) > return do_rmdir(dfd, pathname); > > - return do_unlinkat(dfd, pathname); > + return do_unlinkat(dfd, pathname, false); > } > > SYSCALL_DEFINE1(unlink, const char __user *, pathname) > { > - return do_unlinkat(AT_FDCWD, pathname); > + return do_unlinkat(AT_FDCWD, pathname, false); > +} > + > +SYSCALL_DEFINE3(unlinkat_s, int, dfd, const char __user *, pathname, int, flag) > +{ > + if ((flag & ~AT_REMOVEDIR) != 0) > + return -EINVAL; > + > + if (flag & AT_REMOVEDIR) > + return do_rmdir(dfd, pathname); > + > + return do_unlinkat(dfd, pathname, true); > } > > int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname) > diff --git a/include/asm-generic/audit_dir_write.h b/include/asm-generic/audit_dir_write.h > index 7b61db4..5282aba 100644 > --- a/include/asm-generic/audit_dir_write.h > +++ b/include/asm-generic/audit_dir_write.h > @@ -29,4 +29,5 @@ __NR_unlinkat, > __NR_renameat, > __NR_linkat, > __NR_symlinkat, > +__NR_unlinkat_s, > #endif > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 9ab779e..039e969 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -1594,6 +1594,7 @@ struct super_operations { > int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t); > long (*nr_cached_objects)(struct super_block *, int); > long (*free_cached_objects)(struct super_block *, long, int); > + void (*set_secure_delete) (struct super_block *, bool); > }; > > /* > diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h > index bda9b81..b88019b 100644 > --- a/include/linux/syscalls.h > +++ b/include/linux/syscalls.h > @@ -877,4 +877,5 @@ asmlinkage long sys_seccomp(unsigned int op, unsigned int flags, > asmlinkage long sys_getrandom(char __user *buf, size_t count, > unsigned int flags); > asmlinkage long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size); > +asmlinkage long sys_unlinkat_s(int dfd, const char __user * pathname, int flag); > #endif > diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h > index 22749c1..2ba072e 100644 > --- a/include/uapi/asm-generic/unistd.h > +++ b/include/uapi/asm-generic/unistd.h > @@ -707,9 +707,11 @@ __SYSCALL(__NR_getrandom, sys_getrandom) > __SYSCALL(__NR_memfd_create, sys_memfd_create) > #define __NR_bpf 280 > __SYSCALL(__NR_bpf, sys_bpf) > +#define __NR_unlinkat_s 281 > +__SYSCALL(__NR_unlinkat_s, sys_unlinkat_s) > > #undef __NR_syscalls > -#define __NR_syscalls 281 > +#define __NR_syscalls 282 > > /* > * All syscalls below here should go away really, > diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c > index fb12645..1507335 100644 > --- a/tools/perf/builtin-trace.c > +++ b/tools/perf/builtin-trace.c > @@ -1110,6 +1110,8 @@ static struct syscall_fmt { > { .name = "uname", .errmsg = true, .alias = "newuname", }, > { .name = "unlinkat", .errmsg = true, > .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, > + { .name = "unlinkat_s", .errmsg = true, > + .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, > { .name = "utimensat", .errmsg = true, > .arg_scnprintf = { [0] = SCA_FDAT, /* dirfd */ }, }, > { .name = "write", .errmsg = true, > -- > 2.1.0 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/
Am 04.02.2015 um 00:33 schrieb Al Viro: > On Tue, Feb 03, 2015 at 07:01:50PM +0100, Alexander Holler wrote: > >> Yeah, as I've already admitted in the bug, I never should have use >> the word secure, because everyone nowadays seems to end up in panic >> when reading that word. >> >> So, if I would be able to use sed on my mails, I would replace >> unlinkat_s() with unlinkat_w() (for wipe) or would say that _s does >> stand for 'shred' in the means of shred(1). > > TBH, I suspect that the saner API would be something like EXT2_IOC_[SG[ETFLAGS, > allowing to set and query that along with other flags (append-only, etc.). > > Forget about unlink; first of all, whatever API you use should only _mark_ > the inode as "zero freed blocks" (or trim, for that matter). You can't > force freeing of an inode, so either you make sure that subsequent freeing > of inode, whenever it happens, will do that work, or your API is hopelessly > racy. Moreover, when link has been removed it's too late to report that > fs has no way to e.g. trim those blocks, so you really want to have it done > _before_ the actual link removal. And if the file contents is that sensitive, > you'd better extend the same protection to all operations that free its > blocks, including truncate(), fallocate() hole-punching, whatever. What's > more, if you divorce that from link removal, you probably don't want it as > in-core-only flag - have it stored in inode, if fs supports that. > > Alternatively, you might want to represent it as xattr - as much as I hate > those, it might turn out to be the best fit in this case, if we end up > with several variants for freed blocks disposal. Not sure... > > But whichever way we represent that state, IMO > a) operation should be similar to chmod/chattr/setfattr - modifying > inode metadata. > b) it should affect _all_ operations freeing blocks of that file > from that point on > c) it should be able to fail, telling you that you can't do that for > this backing store. My intention to use unlinkat() or unlinkat_s() was the following: - It can be supported by most filesystems (see my fat patch) - It doesn't really make any promises it can't, like deleting leftovers of an already modified file. That's where a much more complicated solution like the 's' attribute would appropriate. It just should try to wipe the current contents of a file. The second reason was also the reason why I've crafted the subject of the RFC very carefully: "Offer a way for userspace to request real deletion of files". I did that to avoid the nitpickers. It doesn't say how the request is or has to be handled. I was aware of all the problems which arise if one tries to fullfill what the 's' flag promises. The final result of trying to get a 100 percent solution is just what we have now: nothing at all. It wasn't the first time I've posted a patch to LKML, I know that maintainers like to request high towers from ordinary people and therefor very often nice dog houses were refused. There might be a legitimate reason to request a high tower from a big company, but that's something totally different. And I refuse to try to understand why maintainers request high towers. ;) And because hope never dies, I was again silly enough to post a simple patch. ;) Regards, Alexander Hpller -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 4 Feb 2015, Alexander Holler wrote: > Date: Wed, 04 Feb 2015 11:19:01 +0100 > From: Alexander Holler <holler@ahsoftware.de> > To: Al Viro <viro@ZenIV.linux.org.uk> > Cc: Theodore Ts'o <tytso@mit.edu>, linux-fsdevel@vger.kernel.org, > linux-kernel@vger.kernel.org > Subject: Re: [PATCH 1/5] WIP: Add syscall unlinkat_s (currently x86* only) > > Am 04.02.2015 um 00:33 schrieb Al Viro: > > On Tue, Feb 03, 2015 at 07:01:50PM +0100, Alexander Holler wrote: > > > > > Yeah, as I've already admitted in the bug, I never should have use > > > the word secure, because everyone nowadays seems to end up in panic > > > when reading that word. > > > > > > So, if I would be able to use sed on my mails, I would replace > > > unlinkat_s() with unlinkat_w() (for wipe) or would say that _s does > > > stand for 'shred' in the means of shred(1). > > > > TBH, I suspect that the saner API would be something like > > EXT2_IOC_[SG[ETFLAGS, > > allowing to set and query that along with other flags (append-only, etc.). > > > > Forget about unlink; first of all, whatever API you use should only _mark_ > > the inode as "zero freed blocks" (or trim, for that matter). You can't > > force freeing of an inode, so either you make sure that subsequent freeing > > of inode, whenever it happens, will do that work, or your API is hopelessly > > racy. Moreover, when link has been removed it's too late to report that > > fs has no way to e.g. trim those blocks, so you really want to have it done > > _before_ the actual link removal. And if the file contents is that > > sensitive, > > you'd better extend the same protection to all operations that free its > > blocks, including truncate(), fallocate() hole-punching, whatever. What's > > more, if you divorce that from link removal, you probably don't want it as > > in-core-only flag - have it stored in inode, if fs supports that. > > > > Alternatively, you might want to represent it as xattr - as much as I hate > > those, it might turn out to be the best fit in this case, if we end up > > with several variants for freed blocks disposal. Not sure... > > > > But whichever way we represent that state, IMO > > a) operation should be similar to chmod/chattr/setfattr - modifying > > inode metadata. > > b) it should affect _all_ operations freeing blocks of that file > > from that point on > > c) it should be able to fail, telling you that you can't do that for > > this backing store. > > My intention to use unlinkat() or unlinkat_s() was the following: > > - It can be supported by most filesystems (see my fat patch) > > - It doesn't really make any promises it can't, like deleting leftovers of an > already modified file. That's where a much more complicated solution like the > 's' attribute would appropriate. It just should try to wipe the current > contents of a file. > > The second reason was also the reason why I've crafted the subject of the RFC > very carefully: "Offer a way for userspace to request real deletion of files". > > I did that to avoid the nitpickers. It doesn't say how the request is or has > to be handled. I was aware of all the problems which arise if one tries to > fullfill what the 's' flag promises. The final result of trying to get a 100 > percent solution is just what we have now: nothing at all. > > It wasn't the first time I've posted a patch to LKML, I know that maintainers > like to request high towers from ordinary people and therefor very often nice > dog houses were refused. There might be a legitimate reason to request a high > tower from a big company, but that's something totally different. > > And I refuse to try to understand why maintainers request high towers. ;) > > And because hope never dies, I was again silly enough to post a simple patch. > ;) It really comes down to what your intentions and expectations are. If you just wanted to have a discussion about this feature, you have it and that's fine. If your intention was to propose the actual solution, the actual feature that would be merged to kernel, then your expectation should have been exactly this, discussion, improvement propositions, cases to take care of and so on. For example the interface Al proposed is much better than the solution in the patch. And he is right about the other block freeing operations. No one told you not to proceed with the implementation AFAIK. Some are skeptical and that's fine, but you have all means to proof them wrong. The fact is that the current patches are useless for anything other than proof-of-concept. Now you know more that needs to be done or thought about, but if you're not willing to do the work, then please stop complaining about "high towers". I am not a maintainer and I thinks that the feedback you've got is entirely reasonable. Take it as you will. One more thing, can I ask you what were your expectations when posting those patches ? Regards, -Lukas > > Regards, > > Alexander Hpller > -- > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 04.02.2015 um 13:07 schrieb Lukáš Czerner: > The fact is that the current patches are useless for anything other > than proof-of-concept. Now you know more that needs to be done or That's wrong. The patches already work. If you delete a file which isn't in use by something else, the current contents will be wiped on traditional harddrives. I assume that already fulfills more than 50% of use cases of ordinary people. > thought about, but if you're not willing to do the work, then please > stop complaining about "high towers". I am not a maintainer and I > thinks that the feedback you've got is entirely reasonable. Take it > as you will. > > One more thing, can I ask you what were your expectations when > posting those patches ? I've posted them for other users which are happy with what I've explained above. Besides requesting an API which makes such a simple solution, in contrast to the the 's' bit, possible. Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 04.02.2015 um 13:22 schrieb Alexander Holler: > Am 04.02.2015 um 13:07 schrieb Lukáš Czerner: > >> The fact is that the current patches are useless for anything other >> than proof-of-concept. Now you know more that needs to be done or > > That's wrong. The patches already work. If you delete a file which isn't > in use by something else, the current contents will be wiped on > traditional harddrives. I assume that already fulfills more than 50% of > use cases of ordinary people. > >> thought about, but if you're not willing to do the work, then please >> stop complaining about "high towers". I am not a maintainer and I >> thinks that the feedback you've got is entirely reasonable. Take it >> as you will. >> >> One more thing, can I ask you what were your expectations when >> posting those patches ? > > I've posted them for other users which are happy with what I've > explained above. Besides requesting an API which makes such a simple > solution, in contrast to the the 's' bit, possible. To be more precise: How do you add something like EXT2_IOC_[SG[ETFLAGS to vfat or one of the dozens other filesystems which don't know about linux-specific flags? I don't see a way to do that, so there's only unlinkat() left. Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 04.02.2015 um 13:42 schrieb Alexander Holler: > Am 04.02.2015 um 13:22 schrieb Alexander Holler: >> Am 04.02.2015 um 13:07 schrieb Lukáš Czerner: >> >>> The fact is that the current patches are useless for anything other >>> than proof-of-concept. Now you know more that needs to be done or >> >> That's wrong. The patches already work. If you delete a file which isn't >> in use by something else, the current contents will be wiped on >> traditional harddrives. I assume that already fulfills more than 50% of >> use cases of ordinary people. >> >>> thought about, but if you're not willing to do the work, then please >>> stop complaining about "high towers". I am not a maintainer and I >>> thinks that the feedback you've got is entirely reasonable. Take it >>> as you will. >>> >>> One more thing, can I ask you what were your expectations when >>> posting those patches ? >> >> I've posted them for other users which are happy with what I've >> explained above. Besides requesting an API which makes such a simple >> solution, in contrast to the the 's' bit, possible. > > To be more precise: How do you add something like EXT2_IOC_[SG[ETFLAGS > to vfat or one of the dozens other filesystems which don't know about > linux-specific flags? I don't see a way to do that, so there's only > unlinkat() left. Or to be give an actual use case, mount a (v)fat formatted usb-stick, -hdd or mmc, delete a file with the patches I offered, unmount it, try to find the contents of the deleted file at device-level (e.g. by grepping the partition). Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander, On Wed, Feb 4, 2015 at 1:22 PM, Alexander Holler <holler@ahsoftware.de> wrote: > Am 04.02.2015 um 13:07 schrieb Lukáš Czerner: > >> The fact is that the current patches are useless for anything other >> than proof-of-concept. Now you know more that needs to be done or > > > That's wrong. The patches already work. If you delete a file which isn't in > use by something else, the current contents will be wiped on traditional > harddrives. I assume that already fulfills more than 50% of use cases of > ordinary people. You are getting various feedback from people, that you seem to be ignoring. Al Viro, in his curmedgeonly way, points out that the problems are much deeper than you realize. He does not say so explicitly, but I imagine his point is that he does not want to see the kernel cluttered with "partial" solutions that will simply increase the maintenance burden in the long term, and leave bugs to be fixed further down the line. You seem not to be listening. Lukáš points out to you that getting a feature like this into the kernel is complex process. You seem unwilling to hear that, and still just want your partial solution. I tell you that discussions of APIs should CC linux-api, which I am now CCing into this thread, again, because, again, you're not listening to feedback. Nobody is asking for "high towers"; they just have their eyes on the big picture. And the people here are just "ordinary people" with a *lot* of experience dealing with kernel code (I exclude myself) . They see many complexities that you don't. Getting intersting features into the kernel requires a lot of work, and careful listening. Thanks, Michael
Am 04.02.2015 um 13:50 schrieb Alexander Holler: > Am 04.02.2015 um 13:42 schrieb Alexander Holler: >> Am 04.02.2015 um 13:22 schrieb Alexander Holler: >>> Am 04.02.2015 um 13:07 schrieb Lukáš Czerner: >>> >>>> The fact is that the current patches are useless for anything other >>>> than proof-of-concept. Now you know more that needs to be done or >>> >>> That's wrong. The patches already work. If you delete a file which isn't >>> in use by something else, the current contents will be wiped on >>> traditional harddrives. I assume that already fulfills more than 50% of >>> use cases of ordinary people. >>> >>>> thought about, but if you're not willing to do the work, then please >>>> stop complaining about "high towers". I am not a maintainer and I >>>> thinks that the feedback you've got is entirely reasonable. Take it >>>> as you will. >>>> >>>> One more thing, can I ask you what were your expectations when >>>> posting those patches ? >>> >>> I've posted them for other users which are happy with what I've >>> explained above. Besides requesting an API which makes such a simple >>> solution, in contrast to the the 's' bit, possible. >> >> To be more precise: How do you add something like EXT2_IOC_[SG[ETFLAGS >> to vfat or one of the dozens other filesystems which don't know about >> linux-specific flags? I don't see a way to do that, so there's only >> unlinkat() left. > > Or to be give an actual use case, mount a (v)fat formatted usb-stick, > -hdd or mmc, delete a file with the patches I offered, unmount it, try > to find the contents of the deleted file at device-level (e.g. by > grepping the partition). Maybe I should mention that I've tried it with bug reports instead patches before. Beeing aware that I might be unable to write perfect patches with the resources I'm able or willing to spend. I just needed some days until the one for ext4 was closed, leaving no hope that it might become fixed without trying it myself. Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 04.02.2015 um 14:06 schrieb Michael Kerrisk: > Alexander, > > On Wed, Feb 4, 2015 at 1:22 PM, Alexander Holler <holler@ahsoftware.de> wrote: >> Am 04.02.2015 um 13:07 schrieb Lukáš Czerner: >> >>> The fact is that the current patches are useless for anything other >>> than proof-of-concept. Now you know more that needs to be done or >> >> >> That's wrong. The patches already work. If you delete a file which isn't in >> use by something else, the current contents will be wiped on traditional >> harddrives. I assume that already fulfills more than 50% of use cases of >> ordinary people. > > You are getting various feedback from people, that you seem to be ignoring. I'm happy for all the feedback. But it doesn't help me. I'm not going to spend the necessary time unpaid. . > Al Viro, in his curmedgeonly way, points out that the problems are > much deeper than you realize. He does not say so explicitly, but I > imagine his point is that he does not want to see the kernel cluttered > with "partial" solutions that will simply increase the maintenance > burden in the long term, and leave bugs to be fixed further down the > line. You seem not to be listening. It doesn't help me nor anyone else. As Eric Sandeen made me aware through in bug, look at http://lwn.net/Articles/462437/ what already happened. > Lukáš points out to you that getting a feature like this into the > kernel is complex process. You seem unwilling to hear that, and still > just want your partial solution. Wrong. I don't want my partial solution to be part of the official kernel. I don't care. I offered it for other users because I'm aware that has become almost impossible for normal people to get something into the kernel without spending an unbelievable amount of time most people can't afford to spend. > I tell you that discussions of APIs should CC linux-api, which I am > now CCing into this thread, again, because, again, you're not > listening to feedback. Please don't confuse "not listening" with "unable to fulfill Linux kernel maintainer requests". Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 04.02.2015 um 14:21 schrieb Alexander Holler: >> I tell you that discussions of APIs should CC linux-api, which I am >> now CCing into this thread, again, because, again, you're not >> listening to feedback. > > Please don't confuse "not listening" with "unable to fulfill Linux > kernel maintainer requests". I really wonder what do you expect from people not getting paid to spend time for fulfilling maintainer request? I've written bugs and even offered some patches (regardless how usefull there are in your eyes, it's more than most other people can do). And all what it brought me is that I receive flames like your one. Do you really think that's the right way to stimulate people in helping to make Linux better? Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 04.02.2015 um 14:29 schrieb Alexander Holler: > Am 04.02.2015 um 14:21 schrieb Alexander Holler: > >>> I tell you that discussions of APIs should CC linux-api, which I am >>> now CCing into this thread, again, because, again, you're not >>> listening to feedback. >> >> Please don't confuse "not listening" with "unable to fulfill Linux >> kernel maintainer requests". > > I really wonder what do you expect from people not getting paid to spend > time for fulfilling maintainer request? > > I've written bugs and even offered some patches (regardless how usefull > there are in your eyes, it's more than most other people can do). > > And all what it brought me is that I receive flames like your one. > > Do you really think that's the right way to stimulate people in helping > to make Linux better? I'm really sorry that I can't spend several unpaid months with reading and understanding ever changing linux kernel sources in order to become a Linux filesystem expert and send some fully working perfect patches which do fix the problem in question. And I can't spend the necessary time to play remote keyboard for kernel maintainers which might be willing to explain me what has to be done according to their view. I've already offered what I was willing to do, for the price of having to defend myself over and over. And unfortunately that wasn't the first time I've ended up with having to defend myself. My conclusion is that I'm a real fool having posted multiple times patches to this list. It just doesn't make any sense and most of the time the only reward are flames. Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 4 Feb 2015, Alexander Holler wrote: > Date: Wed, 04 Feb 2015 14:21:12 +0100 > From: Alexander Holler <holler@ahsoftware.de> > To: Michael Kerrisk <mtk.manpages@gmail.com> > Cc: Lukáš Czerner <lczerner@redhat.com>, Al Viro <viro@zeniv.linux.org.uk>, > Theodore Ts'o <tytso@mit.edu>, > Linux-Fsdevel <linux-fsdevel@vger.kernel.org>, > Linux Kernel <linux-kernel@vger.kernel.org>, > Linux API <linux-api@vger.kernel.org> > Subject: Re: [PATCH 1/5] WIP: Add syscall unlinkat_s (currently x86* only) > > Am 04.02.2015 um 14:06 schrieb Michael Kerrisk: > > Alexander, > > > > On Wed, Feb 4, 2015 at 1:22 PM, Alexander Holler <holler@ahsoftware.de> > > wrote: > > > Am 04.02.2015 um 13:07 schrieb Lukáš Czerner: > > > > > > > The fact is that the current patches are useless for anything other > > > > than proof-of-concept. Now you know more that needs to be done or > > > > > > > > > That's wrong. The patches already work. If you delete a file which isn't > > > in > > > use by something else, the current contents will be wiped on traditional > > > harddrives. I assume that already fulfills more than 50% of use cases of > > > ordinary people. > > > > You are getting various feedback from people, that you seem to be ignoring. > > I'm happy for all the feedback. But it doesn't help me. I'm not going to spend > the necessary time unpaid. Right, you'd much rather have someone else to spend the time on your request unpaid. That's understandable, but unreasonable. You want it, implement it, or pay someone else to do it for you. > . > > Al Viro, in his curmedgeonly way, points out that the problems are > > much deeper than you realize. He does not say so explicitly, but I > > imagine his point is that he does not want to see the kernel cluttered > > with "partial" solutions that will simply increase the maintenance > > burden in the long term, and leave bugs to be fixed further down the > > line. You seem not to be listening. > > It doesn't help me nor anyone else. As Eric Sandeen made me aware through in > bug, look at http://lwn.net/Articles/462437/ what already happened. That's what people have been trying to tell you. It's not an easy task and there are plenty of cases to think about. As you can see IBM tasked their developer to do it, but they did not succeed. And here you come with your simplistic patches crying about "high towers. But you're the one apparently interested in this feature and you've been warned that's it's not a simple task. But if you really want it I really do encourage you to try. I'd be happy to have a working and reliable secure delete feature but it's not my priority at all. -Lukas > > > Lukáš points out to you that getting a feature like this into the > > kernel is complex process. You seem unwilling to hear that, and still > > just want your partial solution. > > Wrong. I don't want my partial solution to be part of the official kernel. I > don't care. I offered it for other users because I'm aware that has become > almost impossible for normal people to get something into the kernel without > spending an unbelievable amount of time most people can't afford to spend. > > > I tell you that discussions of APIs should CC linux-api, which I am > > now CCing into this thread, again, because, again, you're not > > listening to feedback. > > Please don't confuse "not listening" with "unable to fulfill Linux kernel > maintainer requests". > > Alexander Holler > >
On 2015-02-04 09:19, Alexander Holler wrote: > Am 04.02.2015 um 14:29 schrieb Alexander Holler: > I'm really sorry that I can't spend several unpaid months with reading > and understanding ever changing linux kernel sources in order to become > a Linux filesystem expert and send some fully working perfect patches > which do fix the problem in question. You aren't expected to do so. Code review is an integral part of the development process here, and only truly trivial patches (stuff like fixing typos in kernel messages and documentation) get merged without it. If you pay attention to the list itself, even the veteran kernel developers almost never manage to produce a patch that is deemed absolutely perfect, and end up revising things multiple times before they get merged. > And I can't spend the necessary time to play remote keyboard for kernel > maintainers which might be willing to explain me what has to be done > according to their view. I've already offered what I was willing to do, > for the price of having to defend myself over and over. And > unfortunately that wasn't the first time I've ended up with having to > defend myself. You seem to fail to understand that open source development runs primarily on volunteer work (yes there are people paid to work on open source software, but that is a generally exceptional case). A large majority of the people who are kernel maintainers are donating their free time to the project. > My conclusion is that I'm a real fool having posted multiple times > patches to this list. It just doesn't make any sense and most of the > time the only reward are flames. If you aren't serious about trying to get something into the mainline kernel, you should be tagging _all_ of the e-mails in that patch-set with [RFC] in the subject line. In none of the responses that I've seen has anyone been anything but polite (albeit in some cases moderately annoyed). If you really consider such attempts at constructive criticism to be flames, then a development mailing list isn't the place you should be posting patches. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 04.02.2015 um 15:52 schrieb Lukáš Czerner: > On Wed, 4 Feb 2015, Alexander Holler wrote: >> I'm happy for all the feedback. But it doesn't help me. I'm not going to spend >> the necessary time unpaid. > > Right, you'd much rather have someone else to spend the time on your > request unpaid. That's understandable, but unreasonable. You want > it, implement it, or pay someone else to do it for you. Maybe you should attach a big fat red warning to the kernels bugzilla that filing a bug means either to fix it yourself or pay somone to do that. I've never demanded that someone else fixes it. I've just explained a problem. Unbelievable how someone could do such without paying someone else to fix it or by fixing it themself ... -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 4 Feb 2015, Alexander Holler wrote: > Date: Wed, 04 Feb 2015 17:12:52 +0100 > From: Alexander Holler <holler@ahsoftware.de> > To: Lukáš Czerner <lczerner@redhat.com> > Cc: Michael Kerrisk <mtk.manpages@gmail.com>, > Al Viro <viro@zeniv.linux.org.uk>, Theodore Ts'o <tytso@mit.edu>, > Linux-Fsdevel <linux-fsdevel@vger.kernel.org>, > Linux Kernel <linux-kernel@vger.kernel.org>, > Linux API <linux-api@vger.kernel.org> > Subject: Re: [PATCH 1/5] WIP: Add syscall unlinkat_s (currently x86* only) > > Am 04.02.2015 um 15:52 schrieb Lukáš Czerner: > > On Wed, 4 Feb 2015, Alexander Holler wrote: > > >> I'm happy for all the feedback. But it doesn't help me. I'm not going to spend > >> the necessary time unpaid. > > > > Right, you'd much rather have someone else to spend the time on your > > request unpaid. That's understandable, but unreasonable. You want > > it, implement it, or pay someone else to do it for you. > > Maybe you should attach a big fat red warning to the kernels bugzilla > that filing a bug means either to fix it yourself or pay somone to do that. > > I've never demanded that someone else fixes it. > > I've just explained a problem. > > Unbelievable how someone could do such without paying someone else to > fix it or by fixing it themself ... It's not a bug, you're requesting a feature.
Am 04.02.2015 um 17:25 schrieb Lukáš Czerner: > On Wed, 4 Feb 2015, Alexander Holler wrote: >> Am 04.02.2015 um 15:52 schrieb Lukáš Czerner: >>> On Wed, 4 Feb 2015, Alexander Holler wrote: >> >>>> I'm happy for all the feedback. But it doesn't help me. I'm not going to spend >>>> the necessary time unpaid. >>> >>> Right, you'd much rather have someone else to spend the time on your >>> request unpaid. That's understandable, but unreasonable. You want >>> it, implement it, or pay someone else to do it for you. >> >> Maybe you should attach a big fat red warning to the kernels bugzilla >> that filing a bug means either to fix it yourself or pay somone to do that. >> >> I've never demanded that someone else fixes it. >> >> I've just explained a problem. >> >> Unbelievable how someone could do such without paying someone else to >> fix it or by fixing it themself ... > > It's not a bug, you're requesting a feature. > Ok, I'm guilty. May I ask if there's somewhere a feature request tracker which doesn't cruzify someone because he suggest a (maybe wrong) solution and tries to show that this might work with some prelimary, broken, silly, quick and dirty patches? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 04.02.2015 um 17:45 schrieb Alexander Holler: > Am 04.02.2015 um 17:25 schrieb Lukáš Czerner: >> On Wed, 4 Feb 2015, Alexander Holler wrote: > >>> Am 04.02.2015 um 15:52 schrieb Lukáš Czerner: >>>> On Wed, 4 Feb 2015, Alexander Holler wrote: >>> >>>>> I'm happy for all the feedback. But it doesn't help me. I'm not >>>>> going to spend >>>>> the necessary time unpaid. >>>> >>>> Right, you'd much rather have someone else to spend the time on your >>>> request unpaid. That's understandable, but unreasonable. You want >>>> it, implement it, or pay someone else to do it for you. >>> >>> Maybe you should attach a big fat red warning to the kernels bugzilla >>> that filing a bug means either to fix it yourself or pay somone to do >>> that. >>> >>> I've never demanded that someone else fixes it. >>> >>> I've just explained a problem. >>> >>> Unbelievable how someone could do such without paying someone else to >>> fix it or by fixing it themself ... >> >> It's not a bug, you're requesting a feature. >> > > Ok, I'm guilty. > > May I ask if there's somewhere a feature request tracker which doesn't > cruzify someone because he suggest a (maybe wrong) solution and tries to > show that this might work with some prelimary, broken, silly, quick and > dirty patches? I guess the answer is FreeBSD or similiar. ;) -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Feb 04, 2015 at 03:52:02PM +0100, Lukáš Czerner wrote: > > I'm happy for all the feedback. But it doesn't help me. I'm not going to spend > > the necessary time unpaid. > > Right, you'd much rather have someone else to spend the time on your > request unpaid. That's understandable, but unreasonable. You want > it, implement it, or pay someone else to do it for you. > > > It doesn't help me nor anyone else. As Eric Sandeen made me aware through in > > bug, look at http://lwn.net/Articles/462437/ what already happened. > > That's what people have been trying to tell you. It's not an easy > task and there are plenty of cases to think about. As you can see > IBM tasked their developer to do it, but they did not succeed. And > here you come with your simplistic patches crying about "high > towers. But you're the one apparently interested in this feature > and you've been warned that's it's not a simple task. And indeed, people who do have salaries paid by companies who care about this general problem in actual products have been working on addressing it using encryption, such that when the user is removed from the device, the key is blasted. More importantly, when the user is not logged in, the key isn't even *available* on the device. So it solves more problems than the one that you are concerned about, and in general maintainers prefer solutions that solve multiple problems, because that minimizes the number of one-time hacks and partial/toy solutions which turn into long-term maintainance headaches. (After all, if you insist on having a partial/toy solution merged, that turns into an unfunded mandate which the maintainers effectively have to support for free, forever.) You've rejected encryption as a proposed solution as not meeting your requirements (which if I understand your objections, can be summarized as "encryption is too hard"). This is fine, but if you want someone *else* to implement your partial toy solution which is less secure, then you will either need to pay someone to do it or do it yourself. > > Wrong. I don't want my partial solution to be part of the official kernel. I > > don't care. I offered it for other users because I'm aware that has become > > almost impossible for normal people to get something into the kernel without > > spending an unbelievable amount of time most people can't afford to spend. So you expect other users to just apply your patches and use an unofficial system call number that might get reassigned to some other user later on? If that's all you want, then ok, you're done. The patches have been posted to LKML, and you can give people URL's if they want to try applying the patches on their own. Cheers, - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 04.02.2015 um 20:33 schrieb Theodore Ts'o: > And indeed, people who do have salaries paid by companies who care > about this general problem in actual products have been working on > addressing it using encryption, such that when the user is removed > from the device, the key is blasted. More importantly, when the user > is not logged in, the key isn't even *available* on the device. So it > solves more problems than the one that you are concerned about, and in > general maintainers prefer solutions that solve multiple problems, > because that minimizes the number of one-time hacks and partial/toy > solutions which turn into long-term maintainance headaches. (After > all, if you insist on having a partial/toy solution merged, that turns > into an unfunded mandate which the maintainers effectively have to > support for free, forever.) It's just another layer above and an rather ugly workaround which ends up in having to manage keys and doesn't solve the real problem. Besides that it's much more complicated especially in kind of kernel sources to manage. > You've rejected encryption as a proposed solution as not meeting your > requirements (which if I understand your objections, can be summarized > as "encryption is too hard"). This is fine, but if you want someone > *else* to implement your partial toy solution which is less secure, > then you will either need to pay someone to do it or do it yourself. I haven't rejected it. I'm using that myself since around 10 years, because of the impossibility to really delete files when using Linux. >>> Wrong. I don't want my partial solution to be part of the official kernel. I >>> don't care. I offered it for other users because I'm aware that has become >>> almost impossible for normal people to get something into the kernel without >>> spending an unbelievable amount of time most people can't afford to spend. > > So you expect other users to just apply your patches and use an > unofficial system call number that might get reassigned to some other > user later on? People do such all the time because the mainline kernel is otherwise unusable on many boards. Besides that, I don't expect that anyone uses my patches. As said multiple times before, they are an offer and were primarily meant to show a possible simple solution for many use cases. They already work with inside some, of course maybe uncomfortable, limits and don't do any worse. just better. Alexander Holler -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl index 9fe1b5d..7a3d530 100644 --- a/arch/x86/syscalls/syscall_32.tbl +++ b/arch/x86/syscalls/syscall_32.tbl @@ -364,3 +364,4 @@ 355 i386 getrandom sys_getrandom 356 i386 memfd_create sys_memfd_create 357 i386 bpf sys_bpf +359 i386 unlinkat_s sys_unlinkat_s diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl index 281150b..97eaf01 100644 --- a/arch/x86/syscalls/syscall_64.tbl +++ b/arch/x86/syscalls/syscall_64.tbl @@ -328,6 +328,7 @@ 319 common memfd_create sys_memfd_create 320 common kexec_file_load sys_kexec_file_load 321 common bpf sys_bpf +322 common unlinkat_s sys_unlinkat_s # # x32-specific system call numbers start at 512 to avoid cache impact diff --git a/fs/namei.c b/fs/namei.c index db5fe86..1ad3724 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -3717,7 +3717,7 @@ EXPORT_SYMBOL(vfs_unlink); * writeout happening, and we don't want to prevent access to the directory * while waiting on the I/O. */ -static long do_unlinkat(int dfd, const char __user *pathname) +static long do_unlinkat(int dfd, const char __user *pathname, bool secure) { int error; struct filename *name; @@ -3759,8 +3759,25 @@ exit2: dput(dentry); } mutex_unlock(&nd.path.dentry->d_inode->i_mutex); - if (inode) - iput(inode); /* truncate the inode here */ + if (inode) { + // TODO: + // if (inode is file and 's' flag is set) + // secure = true; + if (!secure) + iput(inode); /* truncate the inode here */ + else { + struct super_block *sb = inode->i_sb; + if (sb->s_op->set_secure_delete) + sb->s_op->set_secure_delete(sb, true); + // TODO: We should fail if secure isn't supported, + // look up how that's possible here. + iput(inode); /* truncate the inode here */ + // TODO: check if sb is still valid after the inode is gone + sync_filesystem(sb); + if (sb->s_op->set_secure_delete) + sb->s_op->set_secure_delete(sb, false); + } + } inode = NULL; if (delegated_inode) { error = break_deleg_wait(&delegated_inode); @@ -3796,12 +3813,23 @@ SYSCALL_DEFINE3(unlinkat, int, dfd, const char __user *, pathname, int, flag) if (flag & AT_REMOVEDIR) return do_rmdir(dfd, pathname); - return do_unlinkat(dfd, pathname); + return do_unlinkat(dfd, pathname, false); } SYSCALL_DEFINE1(unlink, const char __user *, pathname) { - return do_unlinkat(AT_FDCWD, pathname); + return do_unlinkat(AT_FDCWD, pathname, false); +} + +SYSCALL_DEFINE3(unlinkat_s, int, dfd, const char __user *, pathname, int, flag) +{ + if ((flag & ~AT_REMOVEDIR) != 0) + return -EINVAL; + + if (flag & AT_REMOVEDIR) + return do_rmdir(dfd, pathname); + + return do_unlinkat(dfd, pathname, true); } int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname) diff --git a/include/asm-generic/audit_dir_write.h b/include/asm-generic/audit_dir_write.h index 7b61db4..5282aba 100644 --- a/include/asm-generic/audit_dir_write.h +++ b/include/asm-generic/audit_dir_write.h @@ -29,4 +29,5 @@ __NR_unlinkat, __NR_renameat, __NR_linkat, __NR_symlinkat, +__NR_unlinkat_s, #endif diff --git a/include/linux/fs.h b/include/linux/fs.h index 9ab779e..039e969 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1594,6 +1594,7 @@ struct super_operations { int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t); long (*nr_cached_objects)(struct super_block *, int); long (*free_cached_objects)(struct super_block *, long, int); + void (*set_secure_delete) (struct super_block *, bool); }; /* diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h index bda9b81..b88019b 100644 --- a/include/linux/syscalls.h +++ b/include/linux/syscalls.h @@ -877,4 +877,5 @@ asmlinkage long sys_seccomp(unsigned int op, unsigned int flags, asmlinkage long sys_getrandom(char __user *buf, size_t count, unsigned int flags); asmlinkage long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size); +asmlinkage long sys_unlinkat_s(int dfd, const char __user * pathname, int flag); #endif diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h index 22749c1..2ba072e 100644 --- a/include/uapi/asm-generic/unistd.h +++ b/include/uapi/asm-generic/unistd.h @@ -707,9 +707,11 @@ __SYSCALL(__NR_getrandom, sys_getrandom) __SYSCALL(__NR_memfd_create, sys_memfd_create) #define __NR_bpf 280 __SYSCALL(__NR_bpf, sys_bpf) +#define __NR_unlinkat_s 281 +__SYSCALL(__NR_unlinkat_s, sys_unlinkat_s) #undef __NR_syscalls -#define __NR_syscalls 281 +#define __NR_syscalls 282 /* * All syscalls below here should go away really, diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index fb12645..1507335 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -1110,6 +1110,8 @@ static struct syscall_fmt { { .name = "uname", .errmsg = true, .alias = "newuname", }, { .name = "unlinkat", .errmsg = true, .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, + { .name = "unlinkat_s", .errmsg = true, + .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, }, { .name = "utimensat", .errmsg = true, .arg_scnprintf = { [0] = SCA_FDAT, /* dirfd */ }, }, { .name = "write", .errmsg = true,
Signed-off-by: Alexander Holler <holler@ahsoftware.de> --- arch/x86/syscalls/syscall_32.tbl | 1 + arch/x86/syscalls/syscall_64.tbl | 1 + fs/namei.c | 38 ++++++++++++++++++++++++++++++----- include/asm-generic/audit_dir_write.h | 1 + include/linux/fs.h | 1 + include/linux/syscalls.h | 1 + include/uapi/asm-generic/unistd.h | 4 +++- tools/perf/builtin-trace.c | 2 ++ 8 files changed, 43 insertions(+), 6 deletions(-)