diff mbox

[1/5] WIP: Add syscall unlinkat_s (currently x86* only)

Message ID 1422896713-25367-2-git-send-email-holler@ahsoftware.de (mailing list archive)
State New, archived
Headers show

Commit Message

Alexander Holler Feb. 2, 2015, 5:05 p.m. UTC
Signed-off-by: Alexander Holler <holler@ahsoftware.de>
---
 arch/x86/syscalls/syscall_32.tbl      |  1 +
 arch/x86/syscalls/syscall_64.tbl      |  1 +
 fs/namei.c                            | 38 ++++++++++++++++++++++++++++++-----
 include/asm-generic/audit_dir_write.h |  1 +
 include/linux/fs.h                    |  1 +
 include/linux/syscalls.h              |  1 +
 include/uapi/asm-generic/unistd.h     |  4 +++-
 tools/perf/builtin-trace.c            |  2 ++
 8 files changed, 43 insertions(+), 6 deletions(-)

Comments

Al Viro Feb. 3, 2015, 6:05 a.m. UTC | #1
On Mon, Feb 02, 2015 at 06:05:09PM +0100, Alexander Holler wrote:
> +	if (inode) {
> +		// TODO:
> +		// if (inode is file and 's' flag is set)
> +		// 	secure = true;
> +		if (!secure)
> +			iput(inode);	/* truncate the inode here */
> +		else {
> +			struct super_block *sb = inode->i_sb;
> +			if (sb->s_op->set_secure_delete)
> +				sb->s_op->set_secure_delete(sb, true);
> +			// TODO: We should fail if secure isn't supported,
> +			// look up how that's possible here.
> +			iput(inode);	/* truncate the inode here */
> +			// TODO: check if sb is still valid after the inode is gone
> +			sync_filesystem(sb);
> +			if (sb->s_op->set_secure_delete)
> +				sb->s_op->set_secure_delete(sb, false);
> +		}

Charming.  Now, what exactly happens if two such syscalls overlap in time?
Moroever, what makes you equate unlink() with inode removal?  What happens
if you race e.g. with stat(2) on the same thing?  Or if there's an opened
file over that sucker, for that matter?
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 3, 2015, 6:58 a.m. UTC | #2
Am 03.02.2015 um 07:05 schrieb Al Viro:
> On Mon, Feb 02, 2015 at 06:05:09PM +0100, Alexander Holler wrote:
>> +	if (inode) {
>> +		// TODO:
>> +		// if (inode is file and 's' flag is set)
>> +		// 	secure = true;
>> +		if (!secure)
>> +			iput(inode);	/* truncate the inode here */
>> +		else {
>> +			struct super_block *sb = inode->i_sb;
>> +			if (sb->s_op->set_secure_delete)
>> +				sb->s_op->set_secure_delete(sb, true);
>> +			// TODO: We should fail if secure isn't supported,
>> +			// look up how that's possible here.
>> +			iput(inode);	/* truncate the inode here */
>> +			// TODO: check if sb is still valid after the inode is gone
>> +			sync_filesystem(sb);
>> +			if (sb->s_op->set_secure_delete)
>> +				sb->s_op->set_secure_delete(sb, false);
>> +		}
> 
> Charming.  Now, what exactly happens if two such syscalls overlap in time?

What do you think will happen? I assume you haven't looked at how I've
implemented set_secure_delete(). CHarming.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 3, 2015, 7:52 a.m. UTC | #3
Am 03.02.2015 um 07:05 schrieb Al Viro:
> On Mon, Feb 02, 2015 at 06:05:09PM +0100, Alexander Holler wrote:
>> +	if (inode) {
>> +		// TODO:
>> +		// if (inode is file and 's' flag is set)
>> +		// 	secure = true;
>> +		if (!secure)
>> +			iput(inode);	/* truncate the inode here */
>> +		else {
>> +			struct super_block *sb = inode->i_sb;
>> +			if (sb->s_op->set_secure_delete)
>> +				sb->s_op->set_secure_delete(sb, true);
>> +			// TODO: We should fail if secure isn't supported,
>> +			// look up how that's possible here.
>> +			iput(inode);	/* truncate the inode here */
>> +			// TODO: check if sb is still valid after the inode is gone
>> +			sync_filesystem(sb);
>> +			if (sb->s_op->set_secure_delete)
>> +				sb->s_op->set_secure_delete(sb, false);
>> +		}
>
> Charming.  Now, what exactly happens if two such syscalls overlap in time?
> Moroever, what makes you equate unlink() with inode removal?  What happens
> if you race e.g. with stat(2) on the same thing?  Or if there's an opened
> file over that sucker, for that matter?

Sorry, but I first had to make breakfast after I've got angry about the 
usual arrogance of most Linux kernel maintainers.

I've already answered the first question.

Now to the second. That still might be a problem. But that's why this is 
a RFC, why there is a WIP (Work In Progress) before the patch, why I've 
written I've never looked at those sources before, why I've written they 
are imperfect and why I've written I have not spend much time on these 
patches. I've posted them to show how I think the problem might be 
solved. These patches evolved out of desperation that otherwise users 
have to wait another 30 years until they will be offered a way to really 
delete files.

If the removal is somehow scheduled then, of course, the secure flag has 
to be scheduled too.

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Al Viro Feb. 3, 2015, 7:56 a.m. UTC | #4
On Tue, Feb 03, 2015 at 07:58:50AM +0100, Alexander Holler wrote:

> > Charming.  Now, what exactly happens if two such syscalls overlap in time?
> 
> What do you think will happen? I assume you haven't looked at how I've
> implemented set_secure_delete(). CHarming.

AFAICS, you get random unlink() happening at the same time hit by that
mess, whether they'd asked for it or not.  What's more, this counter
of yours is *not* guaranteed to be elevated during the final iput() of the
inode you wanted to get - again, ls -lR racing with that syscall can
elevate the refcount of dentry, making d_delete() in vfs_unlink() just
remove that dentry from hash, while keeping it positive.  If dentry
reference grabbed by stat(2) is released after both dput() and iput() in
do_unlinkat(), the final iput() will be done when stat(2) drops its
reference to dentry, triggering immediate dentry_kill() (since dentry
has already been unhashed) and dentry_iput() from it.

IOW, this counter is both too crude (it's fs-wide, for crying out loud)
*and* not guaranteed to cover enough.  _IF_ you want that behaviour at
all, it ought to be an in-core inode flag set by that syscall and
checked by truncation logics to decide whether to do normal truncate of
this "overwrite with zeroes" thing.

While we are at it, "overwrite with zeroes" is too weak if the attacker
might get hold of the actual hardware.  Google for details - it's far too
long story for l-k posting.  Look for data recovery and secure data erasure...
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Davidlohr Bueso Feb. 3, 2015, 7:58 a.m. UTC | #5
On Tue, 2015-02-03 at 07:58 +0100, Alexander Holler wrote:
> Am 03.02.2015 um 07:05 schrieb Al Viro:
> > On Mon, Feb 02, 2015 at 06:05:09PM +0100, Alexander Holler wrote:
> >> +	if (inode) {
> >> +		// TODO:
> >> +		// if (inode is file and 's' flag is set)
> >> +		// 	secure = true;
> >> +		if (!secure)
> >> +			iput(inode);	/* truncate the inode here */
> >> +		else {
> >> +			struct super_block *sb = inode->i_sb;
> >> +			if (sb->s_op->set_secure_delete)
> >> +				sb->s_op->set_secure_delete(sb, true);
> >> +			// TODO: We should fail if secure isn't supported,
> >> +			// look up how that's possible here.
> >> +			iput(inode);	/* truncate the inode here */
> >> +			// TODO: check if sb is still valid after the inode is gone
> >> +			sync_filesystem(sb);
> >> +			if (sb->s_op->set_secure_delete)
> >> +				sb->s_op->set_secure_delete(sb, false);
> >> +		}
> > 
> > Charming.  Now, what exactly happens if two such syscalls overlap in time?
> 
> What do you think will happen? I assume you haven't looked at how I've
> implemented set_secure_delete(). CHarming.

Chill, why don't you.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 3, 2015, 8:01 a.m. UTC | #6
Am 03.02.2015 um 08:56 schrieb Al Viro:

> While we are at it, "overwrite with zeroes" is too weak if the attacker
> might get hold of the actual hardware.  Google for details - it's far too
> long story for l-k posting.  Look for data recovery and secure data erasure...

You might read

http://link.springer.com/chapter/10.1007/978-3-540-89862-7_21

Here is an article in german about that:

http://www.heise.de/security/meldung/Sicheres-Loeschen-Einmal-ueberschreiben-genuegt-198816.html

In short, it's enough to overwrite it once with zeros,
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Al Viro Feb. 3, 2015, 8:10 a.m. UTC | #7
On Tue, Feb 03, 2015 at 09:01:36AM +0100, Alexander Holler wrote:
> Am 03.02.2015 um 08:56 schrieb Al Viro:
> 
> >While we are at it, "overwrite with zeroes" is too weak if the attacker
> >might get hold of the actual hardware.  Google for details - it's far too
> >long story for l-k posting.  Look for data recovery and secure data erasure...
> 
> You might read
> 
> http://link.springer.com/chapter/10.1007/978-3-540-89862-7_21
> 
> Here is an article in german about that:
> 
> http://www.heise.de/security/meldung/Sicheres-Loeschen-Einmal-ueberschreiben-genuegt-198816.html
> 
> In short, it's enough to overwrite it once with zeros,

Regardless of the media used?  How does that work on e.g. flash?
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 3, 2015, 8:17 a.m. UTC | #8
Am 03.02.2015 um 09:10 schrieb Al Viro:
> On Tue, Feb 03, 2015 at 09:01:36AM +0100, Alexander Holler wrote:
>> Am 03.02.2015 um 08:56 schrieb Al Viro:
>>
>>> While we are at it, "overwrite with zeroes" is too weak if the attacker
>>> might get hold of the actual hardware.  Google for details - it's far too
>>> long story for l-k posting.  Look for data recovery and secure data erasure...
>>
>> You might read
>>
>> http://link.springer.com/chapter/10.1007/978-3-540-89862-7_21
>>
>> Here is an article in german about that:
>>
>> http://www.heise.de/security/meldung/Sicheres-Loeschen-Einmal-ueberschreiben-genuegt-198816.html
>>
>> In short, it's enough to overwrite it once with zeros,
>
> Regardless of the media used?  How does that work on e.g. flash?

That's why "secure trim" should be used if available. Blame the storage 
people for not offering it. But as I've already mentioned, they would 
just answer that filesystems don't (didn't) delete files anyway.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 3, 2015, 8:51 a.m. UTC | #9
Am 03.02.2015 um 08:56 schrieb Al Viro:
> On Tue, Feb 03, 2015 at 07:58:50AM +0100, Alexander Holler wrote:
>
>>> Charming.  Now, what exactly happens if two such syscalls overlap in time?
>>
>> What do you think will happen? I assume you haven't looked at how I've
>> implemented set_secure_delete(). CHarming.
>
> AFAICS, you get random unlink() happening at the same time hit by that
> mess, whether they'd asked for it or not.  What's more, this counter
> of yours is *not* guaranteed to be elevated during the final iput() of the
> inode you wanted to get - again, ls -lR racing with that syscall can
> elevate the refcount of dentry, making d_delete() in vfs_unlink() just
> remove that dentry from hash, while keeping it positive.  If dentry
> reference grabbed by stat(2) is released after both dput() and iput() in
> do_unlinkat(), the final iput() will be done when stat(2) drops its
> reference to dentry, triggering immediate dentry_kill() (since dentry
> has already been unhashed) and dentry_iput() from it.

Thanks for the short explanation. I will see if I can make sense out of 
it for me to get an idea how to solve that.

>
> IOW, this counter is both too crude (it's fs-wide, for crying out loud)
> *and* not guaranteed to cover enough.  _IF_ you want that behaviour at

Sure it is crude.

But it keeps the patches simple. As I've written, unlinkat_s() isn't 
meant for everyday usage, just for the rare case when one really wants 
to get rid of some contents. Therefor execution speed or an i/o slowdown 
while the "secure deletion" is in work is totally ignored

And that "rare case" doesn't include military security levels, it's just 
meant for ordinary people which want make it much, much harder for other 
ordinary people (or geeks or kernel maintainers) to read the deleted 
content ever again. It's far too easy to use grep or something similiar 
to find seemingly deleted stuff at device level again (after it was 
deleted by what filesystems are offering nowadays). Especially if one 
thinks at stuff like certificates and similiar which can be identified 
by common patterns (bit sequences) they use.

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 3, 2015, 9:23 a.m. UTC | #10
Am 03.02.2015 um 09:51 schrieb Alexander Holler:
> Am 03.02.2015 um 08:56 schrieb Al Viro:
>> On Tue, Feb 03, 2015 at 07:58:50AM +0100, Alexander Holler wrote:
>>
>>>> Charming.  Now, what exactly happens if two such syscalls overlap in
>>>> time?
>>>
>>> What do you think will happen? I assume you haven't looked at how I've
>>> implemented set_secure_delete(). CHarming.
>>
>> AFAICS, you get random unlink() happening at the same time hit by that
>> mess, whether they'd asked for it or not.  What's more, this counter
>> of yours is *not* guaranteed to be elevated during the final iput() of
>> the
>> inode you wanted to get - again, ls -lR racing with that syscall can
>> elevate the refcount of dentry, making d_delete() in vfs_unlink() just
>> remove that dentry from hash, while keeping it positive.  If dentry
>> reference grabbed by stat(2) is released after both dput() and iput() in
>> do_unlinkat(), the final iput() will be done when stat(2) drops its
>> reference to dentry, triggering immediate dentry_kill() (since dentry
>> has already been unhashed) and dentry_iput() from it.
>
> Thanks for the short explanation. I will see if I can make sense out of
> it for me to get an idea how to solve that.
>
>>
>> IOW, this counter is both too crude (it's fs-wide, for crying out loud)
>> *and* not guaranteed to cover enough.  _IF_ you want that behaviour at
>
> Sure it is crude.
>
> But it keeps the patches simple. As I've written, unlinkat_s() isn't
> meant for everyday usage, just for the rare case when one really wants
> to get rid of some contents. Therefor execution speed or an i/o slowdown
> while the "secure deletion" is in work is totally ignored
>
> And that "rare case" doesn't include military security levels, it's just
> meant for ordinary people which want make it much, much harder for other
> ordinary people (or geeks or kernel maintainers) to read the deleted
> content ever again. It's far too easy to use grep or something similiar
> to find seemingly deleted stuff at device level again (after it was
> deleted by what filesystems are offering nowadays). Especially if one
> thinks at stuff like certificates and similiar which can be identified
> by common patterns (bit sequences) they use.

Or to give another more common example: If you delete your contact list, 
I likely might find again by just searching for 0x6f726956 at the device 
level (assuming you've stored a contact in that list with the same 
surname as yours.

And, because I've only mentioned in a different thread, now think at the 
problem that nowadays storage is often fixed (soldered) to devices which 
don't offer a way to delete the whole storage. You might have luck if 
the contact list in question was stored in some encrypted part, but that 
presumes that the key for that encrypted part isn't somehow stored on 
the same device too. Which unfortunately isn't always the case (maybe 
because of usability). And ...

That's why I think filesystems should offer a way to really delete 
files. Most people would be happy, even if filesystems won't delete 
stuff at military security levels and would disregard all the cases when 
they couldn't make sure that stuff is really deleted.

To conclude, most people would be already happy if the most trivial case 
would be handled right and not just by marking files as deleted but 
leaving the contents intact.

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 3, 2015, 12:48 p.m. UTC | #11
Am 03.02.2015 um 10:23 schrieb Alexander Holler:

> Or to give another more common example: If you delete your contact list,
> I likely might find again by just searching for 0x6f726956 at the device
> level (assuming you've stored a contact in that list with the same
> surname as yours.
>
> And, because I've only mentioned in a different thread, now think at the
> problem that nowadays storage is often fixed (soldered) to devices which
> don't offer a way to delete the whole storage. You might have luck if
> the contact list in question was stored in some encrypted part, but that
> presumes that the key for that encrypted part isn't somehow stored on
> the same device too. Which unfortunately isn't always the case (maybe
> because of usability). And ...
>
> That's why I think filesystems should offer a way to really delete
> files. Most people would be happy, even if filesystems won't delete
> stuff at military security levels and would disregard all the cases when
> they couldn't make sure that stuff is really deleted.
>
> To conclude, most people would be already happy if the most trivial case
> would be handled right and not just by marking files as deleted but
> leaving the contents intact.


Maybe I should rename me to Quijote, switch back to pen and paper and
should start to raise carrier pigeons. ;)

E.g. my parents are stull successfully using contact lists on paper. 
These are still more readable, easier to handle and smaller than any 
available electronic replacement. And they have absolutely no problem to 
destroy an old one when they replace it with a new one.

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 3, 2015, 12:54 p.m. UTC | #12
Am 03.02.2015 um 13:48 schrieb Alexander Holler:

> E.g. my parents are stull successfully using contact lists on paper.
> These are still more readable, easier to handle and smaller than any
> available electronic replacement. And they have absolutely no problem to
> destroy an old one when they replace it with a new one.

Which very seldom happens, no screen which can crack and no battery. ;)

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alex Elsayed Feb. 3, 2015, 4:44 p.m. UTC | #13
Alexander Holler wrote:

> Am 03.02.2015 um 09:51 schrieb Alexander Holler:
>> Am 03.02.2015 um 08:56 schrieb Al Viro:
>>> On Tue, Feb 03, 2015 at 07:58:50AM +0100, Alexander Holler wrote:
>>>
>>>>> Charming.  Now, what exactly happens if two such syscalls overlap in
>>>>> time?
>>>>
>>>> What do you think will happen? I assume you haven't looked at how I've
>>>> implemented set_secure_delete(). CHarming.
>>>
>>> AFAICS, you get random unlink() happening at the same time hit by that
>>> mess, whether they'd asked for it or not.  What's more, this counter
>>> of yours is *not* guaranteed to be elevated during the final iput() of
>>> the
>>> inode you wanted to get - again, ls -lR racing with that syscall can
>>> elevate the refcount of dentry, making d_delete() in vfs_unlink() just
>>> remove that dentry from hash, while keeping it positive.  If dentry
>>> reference grabbed by stat(2) is released after both dput() and iput() in
>>> do_unlinkat(), the final iput() will be done when stat(2) drops its
>>> reference to dentry, triggering immediate dentry_kill() (since dentry
>>> has already been unhashed) and dentry_iput() from it.
>>
>> Thanks for the short explanation. I will see if I can make sense out of
>> it for me to get an idea how to solve that.
>>
>>>
>>> IOW, this counter is both too crude (it's fs-wide, for crying out loud)
>>> *and* not guaranteed to cover enough.  _IF_ you want that behaviour at
>>
>> Sure it is crude.
>>
>> But it keeps the patches simple. As I've written, unlinkat_s() isn't
>> meant for everyday usage, just for the rare case when one really wants
>> to get rid of some contents. Therefor execution speed or an i/o slowdown
>> while the "secure deletion" is in work is totally ignored
>>
>> And that "rare case" doesn't include military security levels, it's just
>> meant for ordinary people which want make it much, much harder for other
>> ordinary people (or geeks or kernel maintainers) to read the deleted
>> content ever again. It's far too easy to use grep or something similiar
>> to find seemingly deleted stuff at device level again (after it was
>> deleted by what filesystems are offering nowadays). Especially if one
>> thinks at stuff like certificates and similiar which can be identified
>> by common patterns (bit sequences) they use.
> 
> Or to give another more common example: If you delete your contact list,
> I likely might find again by just searching for 0x6f726956 at the device
> level (assuming you've stored a contact in that list with the same
> surname as yours.
> 
> And, because I've only mentioned in a different thread, now think at the
> problem that nowadays storage is often fixed (soldered) to devices which
> don't offer a way to delete the whole storage. You might have luck if
> the contact list in question was stored in some encrypted part, but that
> presumes that the key for that encrypted part isn't somehow stored on
> the same device too. Which unfortunately isn't always the case (maybe
> because of usability). And ...
> 
> That's why I think filesystems should offer a way to really delete
> files. Most people would be happy, even if filesystems won't delete
> stuff at military security levels and would disregard all the cases when
> they couldn't make sure that stuff is really deleted.
> 
> To conclude, most people would be already happy if the most trivial case
> would be handled right and not just by marking files as deleted but
> leaving the contents intact.

Well, one other issue is that this only ensures that the extents referenced 
at the time of explicit deletion are wiped.

On COW filesystems this is most obviously insufficient (All the old 
unreferenced-after-COW copies will have been left in place rather than 
erased in this manner). However, on other filesystems it can still happen 
anyway.

So even just architecturally, this operates at the wrong point in time to 
make the guarantees it's claiming. Secure deletion behavior affects the 
entire filesystem, because it may very well make internal copies, not just 
explicit user-driven ones.

Because of that, it'd pretty much have to be a mount flag of some variety, 
as far as I understand. When enabled, any time an extent is freed, it must 
be wiped. And even then, a single mount with that option disabled would 
destroy any such guarantees.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o Feb. 3, 2015, 5:48 p.m. UTC | #14
On Tue, Feb 03, 2015 at 01:48:56PM +0100, Alexander Holler wrote:
> 
> E.g. my parents are stull successfully using contact lists on paper. These
> are still more readable, easier to handle and smaller than any available
> electronic replacement. And they have absolutely no problem to destroy an
> old one when they replace it with a new one.

Sort of using crypto, which I think is still the best response to this
particular use case --- the trick is making it easy to use, but that's
a desktop/distro integration problem --- the most secure way to really
make sure information on paper and on an SSD is secure is actually the
same --- you use a shredder.

And this isn't just for "military grade security".  There are some
commercial cloud providers which do precisely this, because they know
that their customer's security and their reputation is not easily
valued, and certainly the cost of some piddling flash chips, after
they have been depreciated, is defintiely cheaper than a security
exposure.  Look at the estimates of how much money Target lost with
their little security oopsie.  Again, this is commercial, not military
security.

Cheers,

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 3, 2015, 6:01 p.m. UTC | #15
Am 03.02.2015 um 18:48 schrieb Theodore Ts'o:
> On Tue, Feb 03, 2015 at 01:48:56PM +0100, Alexander Holler wrote:
>>
>> E.g. my parents are stull successfully using contact lists on paper. These
>> are still more readable, easier to handle and smaller than any available
>> electronic replacement. And they have absolutely no problem to destroy an
>> old one when they replace it with a new one.
>
> Sort of using crypto, which I think is still the best response to this
> particular use case --- the trick is making it easy to use, but that's
> a desktop/distro integration problem --- the most secure way to really
> make sure information on paper and on an SSD is secure is actually the
> same --- you use a shredder.
>
> And this isn't just for "military grade security".  There are some
> commercial cloud providers which do precisely this, because they know
> that their customer's security and their reputation is not easily
> valued, and certainly the cost of some piddling flash chips, after
> they have been depreciated, is defintiely cheaper than a security
> exposure.  Look at the estimates of how much money Target lost with
> their little security oopsie.  Again, this is commercial, not military
> security.

Yeah, as I've already admitted in the bug, I never should have use the 
word secure, because everyone nowadays seems to end up in panic when 
reading that word.

So, if I would be able to use sed on my mails, I would replace 
unlinkat_s() with unlinkat_w() (for wipe) or would say that _s does 
stand for 'shred' in the means of shred(1).

Anyways, as I hopefully still have some years left to live (also I'm 
writing SW since 30a, I'm not that old as some of you guys and still 
have to work at least two decades), I might be able to see where we will 
end up with all that fiasco we've already managed to drive into.

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Al Viro Feb. 3, 2015, 11:33 p.m. UTC | #16
On Tue, Feb 03, 2015 at 07:01:50PM +0100, Alexander Holler wrote:

> Yeah, as I've already admitted in the bug, I never should have use
> the word secure, because everyone nowadays seems to end up in panic
> when reading that word.
> 
> So, if I would be able to use sed on my mails, I would replace
> unlinkat_s() with unlinkat_w() (for wipe) or would say that _s does
> stand for 'shred' in the means of shred(1).

TBH, I suspect that the saner API would be something like EXT2_IOC_[SG[ETFLAGS,
allowing to set and query that along with other flags (append-only, etc.).

Forget about unlink; first of all, whatever API you use should only _mark_
the inode as "zero freed blocks" (or trim, for that matter).  You can't
force freeing of an inode, so either you make sure that subsequent freeing
of inode, whenever it happens, will do that work, or your API is hopelessly
racy.  Moreover, when link has been removed it's too late to report that
fs has no way to e.g. trim those blocks, so you really want to have it done
_before_ the actual link removal.  And if the file contents is that sensitive,
you'd better extend the same protection to all operations that free its
blocks, including truncate(), fallocate() hole-punching, whatever.  What's
more, if you divorce that from link removal, you probably don't want it as
in-core-only flag - have it stored in inode, if fs supports that.

Alternatively, you might want to represent it as xattr - as much as I hate
those, it might turn out to be the best fit in this case, if we end up
with several variants for freed blocks disposal.  Not sure...

But whichever way we represent that state, IMO
	a) operation should be similar to chmod/chattr/setfattr - modifying
inode metadata.
	b) it should affect _all_ operations freeing blocks of that file
from that point on
	c) it should be able to fail, telling you that you can't do that for
this backing store.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andreas Dilger Feb. 4, 2015, 4:16 a.m. UTC | #17
On Feb 3, 2015, at 4:33 PM, Al Viro <viro@ZenIV.linux.org.uk> wrote:
> On Tue, Feb 03, 2015 at 07:01:50PM +0100, Alexander Holler wrote:
>> Yeah, as I've already admitted in the bug, I never should have use
>> the word secure, because everyone nowadays seems to end up in panic
>> when reading that word.
>> 
>> So, if I would be able to use sed on my mails, I would replace
>> unlinkat_s() with unlinkat_w() (for wipe) or would say that _s does
>> stand for 'shred' in the means of shred(1).
> 
> TBH, I suspect that the saner API would be something like
> EXT2_IOC_[SG]ETFLAGS, allowing to set and query that along with other
> flags (append-only, etc.).
> 
> Forget about unlink; first of all, whatever API you use should only
> _mark_ the inode as "zero freed blocks" (or trim, for that matter).

This already exists for a long time.  "chattr +s file [file...]" marks
inodes for "secure deletion" (EXT2_SECRM_FL), but this wasn't implemented.

Cheers, Andreas

> You can't force freeing of an inode, so either you make sure that
> subsequent freeing of inode, whenever it happens, will do that work,
> or your API is hopelessly racy.  Moreover, when link has been removed
> it's too late to report that fs has no way to e.g. trim those blocks,
> so you really want to have it done _before_ the actual link removal.
> And if the file contents is that sensitive,
> you'd better extend the same protection to all operations that free its
> blocks, including truncate(), fallocate() hole-punching, whatever.  What's
> more, if you divorce that from link removal, you probably don't want it as
> in-core-only flag - have it stored in inode, if fs supports that.
> 
> Alternatively, you might want to represent it as xattr - as much as I hate
> those, it might turn out to be the best fit in this case, if we end up
> with several variants for freed blocks disposal.  Not sure...
> 
> But whichever way we represent that state, IMO
> 	a) operation should be similar to chmod/chattr/setfattr -
>            modifying inode metadata.
> 	b) it should affect _all_ operations freeing blocks of that
>            file from that point on
> 	c) it should be able to fail, telling you that you can't do
>            that for this backing store.

Cheers, Andreas





--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Kerrisk (man-pages) Feb. 4, 2015, 8:01 a.m. UTC | #18
[CC += linux-api@]

On Mon, Feb 2, 2015 at 6:05 PM, Alexander Holler <holler@ahsoftware.de> wrote:
> Signed-off-by: Alexander Holler <holler@ahsoftware.de>
> ---
>  arch/x86/syscalls/syscall_32.tbl      |  1 +
>  arch/x86/syscalls/syscall_64.tbl      |  1 +
>  fs/namei.c                            | 38 ++++++++++++++++++++++++++++++-----
>  include/asm-generic/audit_dir_write.h |  1 +
>  include/linux/fs.h                    |  1 +
>  include/linux/syscalls.h              |  1 +
>  include/uapi/asm-generic/unistd.h     |  4 +++-
>  tools/perf/builtin-trace.c            |  2 ++
>  8 files changed, 43 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
> index 9fe1b5d..7a3d530 100644
> --- a/arch/x86/syscalls/syscall_32.tbl
> +++ b/arch/x86/syscalls/syscall_32.tbl
> @@ -364,3 +364,4 @@
>  355    i386    getrandom               sys_getrandom
>  356    i386    memfd_create            sys_memfd_create
>  357    i386    bpf                     sys_bpf
> +359    i386    unlinkat_s              sys_unlinkat_s
> diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
> index 281150b..97eaf01 100644
> --- a/arch/x86/syscalls/syscall_64.tbl
> +++ b/arch/x86/syscalls/syscall_64.tbl
> @@ -328,6 +328,7 @@
>  319    common  memfd_create            sys_memfd_create
>  320    common  kexec_file_load         sys_kexec_file_load
>  321    common  bpf                     sys_bpf
> +322    common  unlinkat_s              sys_unlinkat_s
>
>  #
>  # x32-specific system call numbers start at 512 to avoid cache impact
> diff --git a/fs/namei.c b/fs/namei.c
> index db5fe86..1ad3724 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3717,7 +3717,7 @@ EXPORT_SYMBOL(vfs_unlink);
>   * writeout happening, and we don't want to prevent access to the directory
>   * while waiting on the I/O.
>   */
> -static long do_unlinkat(int dfd, const char __user *pathname)
> +static long do_unlinkat(int dfd, const char __user *pathname, bool secure)
>  {
>         int error;
>         struct filename *name;
> @@ -3759,8 +3759,25 @@ exit2:
>                 dput(dentry);
>         }
>         mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
> -       if (inode)
> -               iput(inode);    /* truncate the inode here */
> +       if (inode) {
> +               // TODO:
> +               // if (inode is file and 's' flag is set)
> +               //      secure = true;
> +               if (!secure)
> +                       iput(inode);    /* truncate the inode here */
> +               else {
> +                       struct super_block *sb = inode->i_sb;
> +                       if (sb->s_op->set_secure_delete)
> +                               sb->s_op->set_secure_delete(sb, true);
> +                       // TODO: We should fail if secure isn't supported,
> +                       // look up how that's possible here.
> +                       iput(inode);    /* truncate the inode here */
> +                       // TODO: check if sb is still valid after the inode is gone
> +                       sync_filesystem(sb);
> +                       if (sb->s_op->set_secure_delete)
> +                               sb->s_op->set_secure_delete(sb, false);
> +               }
> +       }
>         inode = NULL;
>         if (delegated_inode) {
>                 error = break_deleg_wait(&delegated_inode);
> @@ -3796,12 +3813,23 @@ SYSCALL_DEFINE3(unlinkat, int, dfd, const char __user *, pathname, int, flag)
>         if (flag & AT_REMOVEDIR)
>                 return do_rmdir(dfd, pathname);
>
> -       return do_unlinkat(dfd, pathname);
> +       return do_unlinkat(dfd, pathname, false);
>  }
>
>  SYSCALL_DEFINE1(unlink, const char __user *, pathname)
>  {
> -       return do_unlinkat(AT_FDCWD, pathname);
> +       return do_unlinkat(AT_FDCWD, pathname, false);
> +}
> +
> +SYSCALL_DEFINE3(unlinkat_s, int, dfd, const char __user *, pathname, int, flag)
> +{
> +       if ((flag & ~AT_REMOVEDIR) != 0)
> +               return -EINVAL;
> +
> +       if (flag & AT_REMOVEDIR)
> +               return do_rmdir(dfd, pathname);
> +
> +       return do_unlinkat(dfd, pathname, true);
>  }
>
>  int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
> diff --git a/include/asm-generic/audit_dir_write.h b/include/asm-generic/audit_dir_write.h
> index 7b61db4..5282aba 100644
> --- a/include/asm-generic/audit_dir_write.h
> +++ b/include/asm-generic/audit_dir_write.h
> @@ -29,4 +29,5 @@ __NR_unlinkat,
>  __NR_renameat,
>  __NR_linkat,
>  __NR_symlinkat,
> +__NR_unlinkat_s,
>  #endif
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 9ab779e..039e969 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1594,6 +1594,7 @@ struct super_operations {
>         int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
>         long (*nr_cached_objects)(struct super_block *, int);
>         long (*free_cached_objects)(struct super_block *, long, int);
> +       void (*set_secure_delete) (struct super_block *, bool);
>  };
>
>  /*
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index bda9b81..b88019b 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -877,4 +877,5 @@ asmlinkage long sys_seccomp(unsigned int op, unsigned int flags,
>  asmlinkage long sys_getrandom(char __user *buf, size_t count,
>                               unsigned int flags);
>  asmlinkage long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size);
> +asmlinkage long sys_unlinkat_s(int dfd, const char __user * pathname, int flag);
>  #endif
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index 22749c1..2ba072e 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -707,9 +707,11 @@ __SYSCALL(__NR_getrandom, sys_getrandom)
>  __SYSCALL(__NR_memfd_create, sys_memfd_create)
>  #define __NR_bpf 280
>  __SYSCALL(__NR_bpf, sys_bpf)
> +#define __NR_unlinkat_s 281
> +__SYSCALL(__NR_unlinkat_s, sys_unlinkat_s)
>
>  #undef __NR_syscalls
> -#define __NR_syscalls 281
> +#define __NR_syscalls 282
>
>  /*
>   * All syscalls below here should go away really,
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index fb12645..1507335 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -1110,6 +1110,8 @@ static struct syscall_fmt {
>         { .name     = "uname",      .errmsg = true, .alias = "newuname", },
>         { .name     = "unlinkat",   .errmsg = true,
>           .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, },
> +       { .name     = "unlinkat_s",   .errmsg = true,
> +         .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, },
>         { .name     = "utimensat",  .errmsg = true,
>           .arg_scnprintf = { [0] = SCA_FDAT, /* dirfd */ }, },
>         { .name     = "write",      .errmsg = true,
> --
> 2.1.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
Alexander Holler Feb. 4, 2015, 10:19 a.m. UTC | #19
Am 04.02.2015 um 00:33 schrieb Al Viro:
> On Tue, Feb 03, 2015 at 07:01:50PM +0100, Alexander Holler wrote:
>
>> Yeah, as I've already admitted in the bug, I never should have use
>> the word secure, because everyone nowadays seems to end up in panic
>> when reading that word.
>>
>> So, if I would be able to use sed on my mails, I would replace
>> unlinkat_s() with unlinkat_w() (for wipe) or would say that _s does
>> stand for 'shred' in the means of shred(1).
>
> TBH, I suspect that the saner API would be something like EXT2_IOC_[SG[ETFLAGS,
> allowing to set and query that along with other flags (append-only, etc.).
>
> Forget about unlink; first of all, whatever API you use should only _mark_
> the inode as "zero freed blocks" (or trim, for that matter).  You can't
> force freeing of an inode, so either you make sure that subsequent freeing
> of inode, whenever it happens, will do that work, or your API is hopelessly
> racy.  Moreover, when link has been removed it's too late to report that
> fs has no way to e.g. trim those blocks, so you really want to have it done
> _before_ the actual link removal.  And if the file contents is that sensitive,
> you'd better extend the same protection to all operations that free its
> blocks, including truncate(), fallocate() hole-punching, whatever.  What's
> more, if you divorce that from link removal, you probably don't want it as
> in-core-only flag - have it stored in inode, if fs supports that.
>
> Alternatively, you might want to represent it as xattr - as much as I hate
> those, it might turn out to be the best fit in this case, if we end up
> with several variants for freed blocks disposal.  Not sure...
>
> But whichever way we represent that state, IMO
> 	a) operation should be similar to chmod/chattr/setfattr - modifying
> inode metadata.
> 	b) it should affect _all_ operations freeing blocks of that file
> from that point on
> 	c) it should be able to fail, telling you that you can't do that for
> this backing store.

My intention to use unlinkat() or unlinkat_s() was the following:

- It can be supported by most filesystems (see my fat patch)

- It doesn't really make any promises it can't, like deleting leftovers 
of an already modified file. That's where a much more complicated 
solution like the 's' attribute would appropriate. It just should try to 
wipe the current contents of a file.

The second reason was also the reason why I've crafted the subject of 
the RFC very carefully: "Offer a way for userspace to request real 
deletion of files".

I did that to avoid the nitpickers. It doesn't say how the request is or 
has to be handled. I was aware of all the problems which arise if one 
tries to fullfill what the 's' flag promises. The final result of trying 
to get a 100 percent solution is just what we have now: nothing at all.

It wasn't the first time I've posted a patch to LKML, I know that 
maintainers like to request high towers from ordinary people and 
therefor very often nice dog houses were refused. There might be a 
legitimate reason to request a high tower from a big company, but that's 
something totally different.

And I refuse to try to understand why maintainers request high towers. ;)

And because hope never dies, I was again silly enough to post a simple 
patch. ;)

Regards,

Alexander Hpller
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner Feb. 4, 2015, 12:07 p.m. UTC | #20
On Wed, 4 Feb 2015, Alexander Holler wrote:

> Date: Wed, 04 Feb 2015 11:19:01 +0100
> From: Alexander Holler <holler@ahsoftware.de>
> To: Al Viro <viro@ZenIV.linux.org.uk>
> Cc: Theodore Ts'o <tytso@mit.edu>, linux-fsdevel@vger.kernel.org,
>     linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 1/5] WIP: Add syscall unlinkat_s (currently x86* only)
> 
> Am 04.02.2015 um 00:33 schrieb Al Viro:
> > On Tue, Feb 03, 2015 at 07:01:50PM +0100, Alexander Holler wrote:
> > 
> > > Yeah, as I've already admitted in the bug, I never should have use
> > > the word secure, because everyone nowadays seems to end up in panic
> > > when reading that word.
> > > 
> > > So, if I would be able to use sed on my mails, I would replace
> > > unlinkat_s() with unlinkat_w() (for wipe) or would say that _s does
> > > stand for 'shred' in the means of shred(1).
> > 
> > TBH, I suspect that the saner API would be something like
> > EXT2_IOC_[SG[ETFLAGS,
> > allowing to set and query that along with other flags (append-only, etc.).
> > 
> > Forget about unlink; first of all, whatever API you use should only _mark_
> > the inode as "zero freed blocks" (or trim, for that matter).  You can't
> > force freeing of an inode, so either you make sure that subsequent freeing
> > of inode, whenever it happens, will do that work, or your API is hopelessly
> > racy.  Moreover, when link has been removed it's too late to report that
> > fs has no way to e.g. trim those blocks, so you really want to have it done
> > _before_ the actual link removal.  And if the file contents is that
> > sensitive,
> > you'd better extend the same protection to all operations that free its
> > blocks, including truncate(), fallocate() hole-punching, whatever.  What's
> > more, if you divorce that from link removal, you probably don't want it as
> > in-core-only flag - have it stored in inode, if fs supports that.
> > 
> > Alternatively, you might want to represent it as xattr - as much as I hate
> > those, it might turn out to be the best fit in this case, if we end up
> > with several variants for freed blocks disposal.  Not sure...
> > 
> > But whichever way we represent that state, IMO
> > 	a) operation should be similar to chmod/chattr/setfattr - modifying
> > inode metadata.
> > 	b) it should affect _all_ operations freeing blocks of that file
> > from that point on
> > 	c) it should be able to fail, telling you that you can't do that for
> > this backing store.
> 
> My intention to use unlinkat() or unlinkat_s() was the following:
> 
> - It can be supported by most filesystems (see my fat patch)
> 
> - It doesn't really make any promises it can't, like deleting leftovers of an
> already modified file. That's where a much more complicated solution like the
> 's' attribute would appropriate. It just should try to wipe the current
> contents of a file.
> 
> The second reason was also the reason why I've crafted the subject of the RFC
> very carefully: "Offer a way for userspace to request real deletion of files".
> 
> I did that to avoid the nitpickers. It doesn't say how the request is or has
> to be handled. I was aware of all the problems which arise if one tries to
> fullfill what the 's' flag promises. The final result of trying to get a 100
> percent solution is just what we have now: nothing at all.
> 
> It wasn't the first time I've posted a patch to LKML, I know that maintainers
> like to request high towers from ordinary people and therefor very often nice
> dog houses were refused. There might be a legitimate reason to request a high
> tower from a big company, but that's something totally different.
> 
> And I refuse to try to understand why maintainers request high towers. ;)
> 
> And because hope never dies, I was again silly enough to post a simple patch.
> ;)

It really comes down to what your intentions and expectations are.
If you just wanted to have a discussion about this feature, you have
it and that's fine.

If your intention was to propose the actual solution, the actual
feature that would be merged to kernel, then your expectation should
have been exactly this, discussion, improvement propositions, cases
to take care of and so on.

For example the interface Al proposed is much better than the
solution in the patch. And he is right about the other block freeing
operations.

No one told you not to proceed with the implementation AFAIK. Some
are skeptical and that's fine, but you have all means to proof them
wrong.

The fact is that the current patches are useless for anything other
than proof-of-concept. Now you know more that needs to be done or
thought about, but if you're not willing to do the work, then please
stop complaining about "high towers". I am not a maintainer and I
thinks that the feedback you've got is entirely reasonable. Take it
as you will.

One more thing, can I ask you what were your expectations when
posting those patches ?

Regards,
-Lukas

> 
> Regards,
> 
> Alexander Hpller
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 4, 2015, 12:22 p.m. UTC | #21
Am 04.02.2015 um 13:07 schrieb Lukáš Czerner:

> The fact is that the current patches are useless for anything other
> than proof-of-concept. Now you know more that needs to be done or

That's wrong. The patches already work. If you delete a file which isn't 
in use by something else, the current contents will be wiped on 
traditional harddrives. I assume that already fulfills more than 50% of 
use cases of ordinary people.

> thought about, but if you're not willing to do the work, then please
> stop complaining about "high towers". I am not a maintainer and I
> thinks that the feedback you've got is entirely reasonable. Take it
> as you will.
>
> One more thing, can I ask you what were your expectations when
> posting those patches ?

I've posted them for other users which are happy with what I've 
explained above. Besides requesting an API which makes such a simple 
solution, in contrast to the the 's' bit, possible.

Alexander Holler

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 4, 2015, 12:42 p.m. UTC | #22
Am 04.02.2015 um 13:22 schrieb Alexander Holler:
> Am 04.02.2015 um 13:07 schrieb Lukáš Czerner:
>
>> The fact is that the current patches are useless for anything other
>> than proof-of-concept. Now you know more that needs to be done or
>
> That's wrong. The patches already work. If you delete a file which isn't
> in use by something else, the current contents will be wiped on
> traditional harddrives. I assume that already fulfills more than 50% of
> use cases of ordinary people.
>
>> thought about, but if you're not willing to do the work, then please
>> stop complaining about "high towers". I am not a maintainer and I
>> thinks that the feedback you've got is entirely reasonable. Take it
>> as you will.
>>
>> One more thing, can I ask you what were your expectations when
>> posting those patches ?
>
> I've posted them for other users which are happy with what I've
> explained above. Besides requesting an API which makes such a simple
> solution, in contrast to the the 's' bit, possible.

To be more precise: How do you add something like EXT2_IOC_[SG[ETFLAGS 
to vfat or one of the dozens other filesystems which don't know about 
linux-specific flags? I don't see a way to do that, so there's only 
unlinkat() left.

Alexander Holler



--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 4, 2015, 12:50 p.m. UTC | #23
Am 04.02.2015 um 13:42 schrieb Alexander Holler:
> Am 04.02.2015 um 13:22 schrieb Alexander Holler:
>> Am 04.02.2015 um 13:07 schrieb Lukáš Czerner:
>>
>>> The fact is that the current patches are useless for anything other
>>> than proof-of-concept. Now you know more that needs to be done or
>>
>> That's wrong. The patches already work. If you delete a file which isn't
>> in use by something else, the current contents will be wiped on
>> traditional harddrives. I assume that already fulfills more than 50% of
>> use cases of ordinary people.
>>
>>> thought about, but if you're not willing to do the work, then please
>>> stop complaining about "high towers". I am not a maintainer and I
>>> thinks that the feedback you've got is entirely reasonable. Take it
>>> as you will.
>>>
>>> One more thing, can I ask you what were your expectations when
>>> posting those patches ?
>>
>> I've posted them for other users which are happy with what I've
>> explained above. Besides requesting an API which makes such a simple
>> solution, in contrast to the the 's' bit, possible.
>
> To be more precise: How do you add something like EXT2_IOC_[SG[ETFLAGS
> to vfat or one of the dozens other filesystems which don't know about
> linux-specific flags? I don't see a way to do that, so there's only
> unlinkat() left.

Or to be give an actual use case, mount a (v)fat formatted usb-stick, 
-hdd or mmc, delete a file with the patches I offered, unmount it, try 
to find the contents of the deleted file at device-level (e.g. by 
grepping the partition).

Alexander Holler

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Kerrisk (man-pages) Feb. 4, 2015, 1:06 p.m. UTC | #24
Alexander,

On Wed, Feb 4, 2015 at 1:22 PM, Alexander Holler <holler@ahsoftware.de> wrote:
> Am 04.02.2015 um 13:07 schrieb Lukáš Czerner:
>
>> The fact is that the current patches are useless for anything other
>> than proof-of-concept. Now you know more that needs to be done or
>
>
> That's wrong. The patches already work. If you delete a file which isn't in
> use by something else, the current contents will be wiped on traditional
> harddrives. I assume that already fulfills more than 50% of use cases of
> ordinary people.

You are getting various feedback from people, that you seem to be ignoring.

Al Viro, in his curmedgeonly way, points out that the problems are
much deeper than you realize. He does not say so explicitly, but I
imagine his point is that he does not want to see the kernel cluttered
with "partial" solutions that will simply increase the maintenance
burden in the long term, and leave bugs to be fixed further down the
line. You seem not to be listening.

Lukáš points out to you that getting a feature like this into the
kernel is complex process. You seem unwilling to hear that, and still
just want your partial solution.

I tell you that discussions of APIs should CC linux-api, which I am
now CCing into this thread, again, because, again, you're not
listening to feedback.

Nobody is asking for "high towers"; they just have their eyes on the
big picture. And the people here are just "ordinary people" with a
*lot* of experience dealing with kernel code (I exclude myself) . They
see many complexities that you don't. Getting intersting features into
the kernel requires a lot of work, and careful listening.

Thanks,

Michael
Alexander Holler Feb. 4, 2015, 1:07 p.m. UTC | #25
Am 04.02.2015 um 13:50 schrieb Alexander Holler:
> Am 04.02.2015 um 13:42 schrieb Alexander Holler:
>> Am 04.02.2015 um 13:22 schrieb Alexander Holler:
>>> Am 04.02.2015 um 13:07 schrieb Lukáš Czerner:
>>>
>>>> The fact is that the current patches are useless for anything other
>>>> than proof-of-concept. Now you know more that needs to be done or
>>>
>>> That's wrong. The patches already work. If you delete a file which isn't
>>> in use by something else, the current contents will be wiped on
>>> traditional harddrives. I assume that already fulfills more than 50% of
>>> use cases of ordinary people.
>>>
>>>> thought about, but if you're not willing to do the work, then please
>>>> stop complaining about "high towers". I am not a maintainer and I
>>>> thinks that the feedback you've got is entirely reasonable. Take it
>>>> as you will.
>>>>
>>>> One more thing, can I ask you what were your expectations when
>>>> posting those patches ?
>>>
>>> I've posted them for other users which are happy with what I've
>>> explained above. Besides requesting an API which makes such a simple
>>> solution, in contrast to the the 's' bit, possible.
>>
>> To be more precise: How do you add something like EXT2_IOC_[SG[ETFLAGS
>> to vfat or one of the dozens other filesystems which don't know about
>> linux-specific flags? I don't see a way to do that, so there's only
>> unlinkat() left.
>
> Or to be give an actual use case, mount a (v)fat formatted usb-stick,
> -hdd or mmc, delete a file with the patches I offered, unmount it, try
> to find the contents of the deleted file at device-level (e.g. by
> grepping the partition).

Maybe I should mention that I've tried it with bug reports instead 
patches before. Beeing aware that I might be unable to write perfect 
patches with the resources I'm able or willing to spend.

I just needed some days until the one for ext4 was closed, leaving no 
hope that it might become fixed without trying it myself.

Alexander Holler

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 4, 2015, 1:21 p.m. UTC | #26
Am 04.02.2015 um 14:06 schrieb Michael Kerrisk:
> Alexander,
>
> On Wed, Feb 4, 2015 at 1:22 PM, Alexander Holler <holler@ahsoftware.de> wrote:
>> Am 04.02.2015 um 13:07 schrieb Lukáš Czerner:
>>
>>> The fact is that the current patches are useless for anything other
>>> than proof-of-concept. Now you know more that needs to be done or
>>
>>
>> That's wrong. The patches already work. If you delete a file which isn't in
>> use by something else, the current contents will be wiped on traditional
>> harddrives. I assume that already fulfills more than 50% of use cases of
>> ordinary people.
>
> You are getting various feedback from people, that you seem to be ignoring.

I'm happy for all the feedback. But it doesn't help me. I'm not going to 
spend the necessary time unpaid.
.
> Al Viro, in his curmedgeonly way, points out that the problems are
> much deeper than you realize. He does not say so explicitly, but I
> imagine his point is that he does not want to see the kernel cluttered
> with "partial" solutions that will simply increase the maintenance
> burden in the long term, and leave bugs to be fixed further down the
> line. You seem not to be listening.

It doesn't help me nor anyone else. As Eric Sandeen made me aware 
through in bug, look at http://lwn.net/Articles/462437/ what already 
happened.

> Lukáš points out to you that getting a feature like this into the
> kernel is complex process. You seem unwilling to hear that, and still
> just want your partial solution.

Wrong. I don't want my partial solution to be part of the official 
kernel. I don't care. I offered it for other users because I'm aware 
that has become almost impossible for normal people to get something 
into the kernel without spending an unbelievable amount of time most 
people can't afford to spend.

> I tell you that discussions of APIs should CC linux-api, which I am
> now CCing into this thread, again, because, again, you're not
> listening to feedback.

Please don't confuse "not listening" with "unable to fulfill Linux 
kernel maintainer requests".

Alexander Holler

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 4, 2015, 1:29 p.m. UTC | #27
Am 04.02.2015 um 14:21 schrieb Alexander Holler:

>> I tell you that discussions of APIs should CC linux-api, which I am
>> now CCing into this thread, again, because, again, you're not
>> listening to feedback.
>
> Please don't confuse "not listening" with "unable to fulfill Linux
> kernel maintainer requests".

I really wonder what do you expect from people not getting paid to spend 
time for fulfilling maintainer request?

I've written bugs and even offered some patches (regardless how usefull 
there are in your eyes, it's more than most other people can do).

And all what it brought me is that I receive flames like your one.

Do you really think that's the right way to stimulate people in helping 
to make Linux better?

Alexander Holler


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 4, 2015, 2:19 p.m. UTC | #28
Am 04.02.2015 um 14:29 schrieb Alexander Holler:
> Am 04.02.2015 um 14:21 schrieb Alexander Holler:
>
>>> I tell you that discussions of APIs should CC linux-api, which I am
>>> now CCing into this thread, again, because, again, you're not
>>> listening to feedback.
>>
>> Please don't confuse "not listening" with "unable to fulfill Linux
>> kernel maintainer requests".
>
> I really wonder what do you expect from people not getting paid to spend
> time for fulfilling maintainer request?
>
> I've written bugs and even offered some patches (regardless how usefull
> there are in your eyes, it's more than most other people can do).
>
> And all what it brought me is that I receive flames like your one.
>
> Do you really think that's the right way to stimulate people in helping
> to make Linux better?

I'm really sorry that I can't spend several unpaid months with reading 
and understanding ever changing linux kernel sources in order to become 
a Linux filesystem expert and send some fully working perfect patches 
which do fix the problem in question.

And I can't spend the necessary time to play remote keyboard for kernel 
maintainers which might be willing to explain me what has to be done 
according to their view. I've already offered what I was willing to do, 
for the price of having to defend myself over and over. And 
unfortunately that wasn't the first time I've ended up with having to 
defend myself.

My conclusion is that I'm a real fool having posted multiple times 
patches to this list. It just doesn't make any sense and most of the 
time the only reward are flames.

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner Feb. 4, 2015, 2:52 p.m. UTC | #29
On Wed, 4 Feb 2015, Alexander Holler wrote:

> Date: Wed, 04 Feb 2015 14:21:12 +0100
> From: Alexander Holler <holler@ahsoftware.de>
> To: Michael Kerrisk <mtk.manpages@gmail.com>
> Cc: Lukáš Czerner <lczerner@redhat.com>, Al Viro <viro@zeniv.linux.org.uk>,
>     Theodore Ts'o <tytso@mit.edu>,
>     Linux-Fsdevel <linux-fsdevel@vger.kernel.org>,
>     Linux Kernel <linux-kernel@vger.kernel.org>,
>     Linux API <linux-api@vger.kernel.org>
> Subject: Re: [PATCH 1/5] WIP: Add syscall unlinkat_s (currently x86* only)
> 
> Am 04.02.2015 um 14:06 schrieb Michael Kerrisk:
> > Alexander,
> > 
> > On Wed, Feb 4, 2015 at 1:22 PM, Alexander Holler <holler@ahsoftware.de>
> > wrote:
> > > Am 04.02.2015 um 13:07 schrieb Lukáš Czerner:
> > > 
> > > > The fact is that the current patches are useless for anything other
> > > > than proof-of-concept. Now you know more that needs to be done or
> > > 
> > > 
> > > That's wrong. The patches already work. If you delete a file which isn't
> > > in
> > > use by something else, the current contents will be wiped on traditional
> > > harddrives. I assume that already fulfills more than 50% of use cases of
> > > ordinary people.
> > 
> > You are getting various feedback from people, that you seem to be ignoring.
> 
> I'm happy for all the feedback. But it doesn't help me. I'm not going to spend
> the necessary time unpaid.

Right, you'd much rather have someone else to spend the time on your
request unpaid. That's understandable, but unreasonable. You want
it, implement it, or pay someone else to do it for you.

> .
> > Al Viro, in his curmedgeonly way, points out that the problems are
> > much deeper than you realize. He does not say so explicitly, but I
> > imagine his point is that he does not want to see the kernel cluttered
> > with "partial" solutions that will simply increase the maintenance
> > burden in the long term, and leave bugs to be fixed further down the
> > line. You seem not to be listening.
> 
> It doesn't help me nor anyone else. As Eric Sandeen made me aware through in
> bug, look at http://lwn.net/Articles/462437/ what already happened.

That's what people have been trying to tell you. It's not an easy
task and there are plenty of cases to think about. As you can see
IBM tasked their developer to do it, but they did not succeed. And
here you come with your simplistic patches crying about "high
towers. But you're the one apparently interested in this feature
and you've been warned that's it's not a simple task.

But if you really want it I really do encourage you to try. I'd be
happy to have a working and reliable secure delete feature but it's
not my priority at all.

-Lukas

> 
> > Lukáš points out to you that getting a feature like this into the
> > kernel is complex process. You seem unwilling to hear that, and still
> > just want your partial solution.
> 
> Wrong. I don't want my partial solution to be part of the official kernel. I
> don't care. I offered it for other users because I'm aware that has become
> almost impossible for normal people to get something into the kernel without
> spending an unbelievable amount of time most people can't afford to spend.
> 
> > I tell you that discussions of APIs should CC linux-api, which I am
> > now CCing into this thread, again, because, again, you're not
> > listening to feedback.
> 
> Please don't confuse "not listening" with "unable to fulfill Linux kernel
> maintainer requests".
> 
> Alexander Holler
> 
>
Austin S. Hemmelgarn Feb. 4, 2015, 3 p.m. UTC | #30
On 2015-02-04 09:19, Alexander Holler wrote:
> Am 04.02.2015 um 14:29 schrieb Alexander Holler:
> I'm really sorry that I can't spend several unpaid months with reading
> and understanding ever changing linux kernel sources in order to become
> a Linux filesystem expert and send some fully working perfect patches
> which do fix the problem in question.
You aren't expected to do so.  Code review is an integral part of the 
development process here, and only truly trivial patches (stuff like 
fixing typos in kernel messages and documentation) get merged without 
it.  If you pay attention to the list itself, even the veteran kernel 
developers almost never manage to produce a patch that is deemed 
absolutely perfect, and end up revising things multiple times before 
they get merged.
> And I can't spend the necessary time to play remote keyboard for kernel
> maintainers which might be willing to explain me what has to be done
> according to their view. I've already offered what I was willing to do,
> for the price of having to defend myself over and over. And
> unfortunately that wasn't the first time I've ended up with having to
> defend myself.
You seem to fail to understand that open source development runs 
primarily on volunteer work (yes there are people paid to work on open 
source software, but that is a generally exceptional case).  A large 
majority of the people who are kernel maintainers are donating their 
free time to the project.
> My conclusion is that I'm a real fool having posted multiple times
> patches to this list. It just doesn't make any sense and most of the
> time the only reward are flames.
If you aren't serious about trying to get something into the mainline 
kernel, you should be tagging _all_ of the e-mails in that patch-set 
with [RFC] in the subject line.

In none of the responses that I've seen has anyone been anything but 
polite (albeit in some cases moderately annoyed).  If you really 
consider such attempts at constructive criticism to be flames, then a 
development mailing list isn't the place you should be posting patches.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 4, 2015, 4:12 p.m. UTC | #31
Am 04.02.2015 um 15:52 schrieb Lukáš Czerner:
> On Wed, 4 Feb 2015, Alexander Holler wrote:

>> I'm happy for all the feedback. But it doesn't help me. I'm not going to spend
>> the necessary time unpaid.
> 
> Right, you'd much rather have someone else to spend the time on your
> request unpaid. That's understandable, but unreasonable. You want
> it, implement it, or pay someone else to do it for you.

Maybe you should attach a big fat red warning to the kernels bugzilla
that filing a bug means either to fix it yourself or pay somone to do that.

I've never demanded that someone else fixes it.

I've just explained a problem.

Unbelievable how someone could do such without paying someone else to
fix it or by fixing it themself ...
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner Feb. 4, 2015, 4:25 p.m. UTC | #32
On Wed, 4 Feb 2015, Alexander Holler wrote:

> Date: Wed, 04 Feb 2015 17:12:52 +0100
> From: Alexander Holler <holler@ahsoftware.de>
> To: Lukáš Czerner <lczerner@redhat.com>
> Cc: Michael Kerrisk <mtk.manpages@gmail.com>,
>     Al Viro <viro@zeniv.linux.org.uk>, Theodore Ts'o <tytso@mit.edu>,
>     Linux-Fsdevel <linux-fsdevel@vger.kernel.org>,
>     Linux Kernel <linux-kernel@vger.kernel.org>,
>     Linux API <linux-api@vger.kernel.org>
> Subject: Re: [PATCH 1/5] WIP: Add syscall unlinkat_s (currently x86* only)
> 
> Am 04.02.2015 um 15:52 schrieb Lukáš Czerner:
> > On Wed, 4 Feb 2015, Alexander Holler wrote:
> 
> >> I'm happy for all the feedback. But it doesn't help me. I'm not going to spend
> >> the necessary time unpaid.
> > 
> > Right, you'd much rather have someone else to spend the time on your
> > request unpaid. That's understandable, but unreasonable. You want
> > it, implement it, or pay someone else to do it for you.
> 
> Maybe you should attach a big fat red warning to the kernels bugzilla
> that filing a bug means either to fix it yourself or pay somone to do that.
> 
> I've never demanded that someone else fixes it.
> 
> I've just explained a problem.
> 
> Unbelievable how someone could do such without paying someone else to
> fix it or by fixing it themself ...

It's not a bug, you're requesting a feature.
Alexander Holler Feb. 4, 2015, 4:45 p.m. UTC | #33
Am 04.02.2015 um 17:25 schrieb Lukáš Czerner:
> On Wed, 4 Feb 2015, Alexander Holler wrote:

>> Am 04.02.2015 um 15:52 schrieb Lukáš Czerner:
>>> On Wed, 4 Feb 2015, Alexander Holler wrote:
>>
>>>> I'm happy for all the feedback. But it doesn't help me. I'm not going to spend
>>>> the necessary time unpaid.
>>>
>>> Right, you'd much rather have someone else to spend the time on your
>>> request unpaid. That's understandable, but unreasonable. You want
>>> it, implement it, or pay someone else to do it for you.
>>
>> Maybe you should attach a big fat red warning to the kernels bugzilla
>> that filing a bug means either to fix it yourself or pay somone to do that.
>>
>> I've never demanded that someone else fixes it.
>>
>> I've just explained a problem.
>>
>> Unbelievable how someone could do such without paying someone else to
>> fix it or by fixing it themself ...
>
> It's not a bug, you're requesting a feature.
>

Ok, I'm guilty.

May I ask if there's somewhere a feature request tracker which doesn't 
cruzify someone because he suggest a (maybe wrong) solution and tries to 
show that this might work with some prelimary, broken, silly, quick and 
dirty patches?


--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 4, 2015, 4:53 p.m. UTC | #34
Am 04.02.2015 um 17:45 schrieb Alexander Holler:
> Am 04.02.2015 um 17:25 schrieb Lukáš Czerner:
>> On Wed, 4 Feb 2015, Alexander Holler wrote:
>
>>> Am 04.02.2015 um 15:52 schrieb Lukáš Czerner:
>>>> On Wed, 4 Feb 2015, Alexander Holler wrote:
>>>
>>>>> I'm happy for all the feedback. But it doesn't help me. I'm not
>>>>> going to spend
>>>>> the necessary time unpaid.
>>>>
>>>> Right, you'd much rather have someone else to spend the time on your
>>>> request unpaid. That's understandable, but unreasonable. You want
>>>> it, implement it, or pay someone else to do it for you.
>>>
>>> Maybe you should attach a big fat red warning to the kernels bugzilla
>>> that filing a bug means either to fix it yourself or pay somone to do
>>> that.
>>>
>>> I've never demanded that someone else fixes it.
>>>
>>> I've just explained a problem.
>>>
>>> Unbelievable how someone could do such without paying someone else to
>>> fix it or by fixing it themself ...
>>
>> It's not a bug, you're requesting a feature.
>>
>
> Ok, I'm guilty.
>
> May I ask if there's somewhere a feature request tracker which doesn't
> cruzify someone because he suggest a (maybe wrong) solution and tries to
> show that this might work with some prelimary, broken, silly, quick and
> dirty patches?

I guess the answer is FreeBSD or similiar. ;)



--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o Feb. 4, 2015, 7:33 p.m. UTC | #35
On Wed, Feb 04, 2015 at 03:52:02PM +0100, Lukáš Czerner wrote:
> > I'm happy for all the feedback. But it doesn't help me. I'm not going to spend
> > the necessary time unpaid.
> 
> Right, you'd much rather have someone else to spend the time on your
> request unpaid. That's understandable, but unreasonable. You want
> it, implement it, or pay someone else to do it for you.
> 
> > It doesn't help me nor anyone else. As Eric Sandeen made me aware through in
> > bug, look at http://lwn.net/Articles/462437/ what already happened.
> 
> That's what people have been trying to tell you. It's not an easy
> task and there are plenty of cases to think about. As you can see
> IBM tasked their developer to do it, but they did not succeed. And
> here you come with your simplistic patches crying about "high
> towers. But you're the one apparently interested in this feature
> and you've been warned that's it's not a simple task.

And indeed, people who do have salaries paid by companies who care
about this general problem in actual products have been working on
addressing it using encryption, such that when the user is removed
from the device, the key is blasted.  More importantly, when the user
is not logged in, the key isn't even *available* on the device.  So it
solves more problems than the one that you are concerned about, and in
general maintainers prefer solutions that solve multiple problems,
because that minimizes the number of one-time hacks and partial/toy
solutions which turn into long-term maintainance headaches.  (After
all, if you insist on having a partial/toy solution merged, that turns
into an unfunded mandate which the maintainers effectively have to
support for free, forever.)

You've rejected encryption as a proposed solution as not meeting your
requirements (which if I understand your objections, can be summarized
as "encryption is too hard").  This is fine, but if you want someone
*else* to implement your partial toy solution which is less secure,
then you will either need to pay someone to do it or do it yourself.

> > Wrong. I don't want my partial solution to be part of the official kernel. I
> > don't care. I offered it for other users because I'm aware that has become
> > almost impossible for normal people to get something into the kernel without
> > spending an unbelievable amount of time most people can't afford to spend.

So you expect other users to just apply your patches and use an
unofficial system call number that might get reassigned to some other
user later on?

If that's all you want, then ok, you're done.  The patches have been
posted to LKML, and you can give people URL's if they want to try
applying the patches on their own.

Cheers,

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander Holler Feb. 4, 2015, 7:56 p.m. UTC | #36
Am 04.02.2015 um 20:33 schrieb Theodore Ts'o:

> And indeed, people who do have salaries paid by companies who care
> about this general problem in actual products have been working on
> addressing it using encryption, such that when the user is removed
> from the device, the key is blasted.  More importantly, when the user
> is not logged in, the key isn't even *available* on the device.  So it
> solves more problems than the one that you are concerned about, and in
> general maintainers prefer solutions that solve multiple problems,
> because that minimizes the number of one-time hacks and partial/toy
> solutions which turn into long-term maintainance headaches.  (After
> all, if you insist on having a partial/toy solution merged, that turns
> into an unfunded mandate which the maintainers effectively have to
> support for free, forever.)

It's just another layer above and an rather ugly workaround which ends 
up in having to manage keys and doesn't solve the real problem. Besides 
that it's much more complicated especially in kind of kernel sources to 
manage.

> You've rejected encryption as a proposed solution as not meeting your
> requirements (which if I understand your objections, can be summarized
> as "encryption is too hard").  This is fine, but if you want someone
> *else* to implement your partial toy solution which is less secure,
> then you will either need to pay someone to do it or do it yourself.

I haven't rejected it. I'm using that myself since around 10 years, 
because of the impossibility to really delete files when using Linux.

>>> Wrong. I don't want my partial solution to be part of the official kernel. I
>>> don't care. I offered it for other users because I'm aware that has become
>>> almost impossible for normal people to get something into the kernel without
>>> spending an unbelievable amount of time most people can't afford to spend.
>
> So you expect other users to just apply your patches and use an
> unofficial system call number that might get reassigned to some other
> user later on?

People do such all the time because the mainline kernel is otherwise 
unusable on many boards.

Besides that, I don't expect that anyone uses my patches.

As said multiple times before, they are an offer and were primarily 
meant to show a possible simple solution for many use cases. They 
already work with inside some, of course maybe uncomfortable, limits and 
don't do any worse. just better.

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/syscalls/syscall_32.tbl b/arch/x86/syscalls/syscall_32.tbl
index 9fe1b5d..7a3d530 100644
--- a/arch/x86/syscalls/syscall_32.tbl
+++ b/arch/x86/syscalls/syscall_32.tbl
@@ -364,3 +364,4 @@ 
 355	i386	getrandom		sys_getrandom
 356	i386	memfd_create		sys_memfd_create
 357	i386	bpf			sys_bpf
+359	i386	unlinkat_s		sys_unlinkat_s
diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
index 281150b..97eaf01 100644
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -328,6 +328,7 @@ 
 319	common	memfd_create		sys_memfd_create
 320	common	kexec_file_load		sys_kexec_file_load
 321	common	bpf			sys_bpf
+322	common	unlinkat_s		sys_unlinkat_s
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/fs/namei.c b/fs/namei.c
index db5fe86..1ad3724 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3717,7 +3717,7 @@  EXPORT_SYMBOL(vfs_unlink);
  * writeout happening, and we don't want to prevent access to the directory
  * while waiting on the I/O.
  */
-static long do_unlinkat(int dfd, const char __user *pathname)
+static long do_unlinkat(int dfd, const char __user *pathname, bool secure)
 {
 	int error;
 	struct filename *name;
@@ -3759,8 +3759,25 @@  exit2:
 		dput(dentry);
 	}
 	mutex_unlock(&nd.path.dentry->d_inode->i_mutex);
-	if (inode)
-		iput(inode);	/* truncate the inode here */
+	if (inode) {
+		// TODO:
+		// if (inode is file and 's' flag is set)
+		// 	secure = true;
+		if (!secure)
+			iput(inode);	/* truncate the inode here */
+		else {
+			struct super_block *sb = inode->i_sb;
+			if (sb->s_op->set_secure_delete)
+				sb->s_op->set_secure_delete(sb, true);
+			// TODO: We should fail if secure isn't supported,
+			// look up how that's possible here.
+			iput(inode);	/* truncate the inode here */
+			// TODO: check if sb is still valid after the inode is gone
+			sync_filesystem(sb);
+			if (sb->s_op->set_secure_delete)
+				sb->s_op->set_secure_delete(sb, false);
+		}
+	}
 	inode = NULL;
 	if (delegated_inode) {
 		error = break_deleg_wait(&delegated_inode);
@@ -3796,12 +3813,23 @@  SYSCALL_DEFINE3(unlinkat, int, dfd, const char __user *, pathname, int, flag)
 	if (flag & AT_REMOVEDIR)
 		return do_rmdir(dfd, pathname);
 
-	return do_unlinkat(dfd, pathname);
+	return do_unlinkat(dfd, pathname, false);
 }
 
 SYSCALL_DEFINE1(unlink, const char __user *, pathname)
 {
-	return do_unlinkat(AT_FDCWD, pathname);
+	return do_unlinkat(AT_FDCWD, pathname, false);
+}
+
+SYSCALL_DEFINE3(unlinkat_s, int, dfd, const char __user *, pathname, int, flag)
+{
+	if ((flag & ~AT_REMOVEDIR) != 0)
+		return -EINVAL;
+
+	if (flag & AT_REMOVEDIR)
+		return do_rmdir(dfd, pathname);
+
+	return do_unlinkat(dfd, pathname, true);
 }
 
 int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname)
diff --git a/include/asm-generic/audit_dir_write.h b/include/asm-generic/audit_dir_write.h
index 7b61db4..5282aba 100644
--- a/include/asm-generic/audit_dir_write.h
+++ b/include/asm-generic/audit_dir_write.h
@@ -29,4 +29,5 @@  __NR_unlinkat,
 __NR_renameat,
 __NR_linkat,
 __NR_symlinkat,
+__NR_unlinkat_s,
 #endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9ab779e..039e969 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1594,6 +1594,7 @@  struct super_operations {
 	int (*bdev_try_to_free_page)(struct super_block*, struct page*, gfp_t);
 	long (*nr_cached_objects)(struct super_block *, int);
 	long (*free_cached_objects)(struct super_block *, long, int);
+	void (*set_secure_delete) (struct super_block *, bool);
 };
 
 /*
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index bda9b81..b88019b 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -877,4 +877,5 @@  asmlinkage long sys_seccomp(unsigned int op, unsigned int flags,
 asmlinkage long sys_getrandom(char __user *buf, size_t count,
 			      unsigned int flags);
 asmlinkage long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size);
+asmlinkage long sys_unlinkat_s(int dfd, const char __user * pathname, int flag);
 #endif
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 22749c1..2ba072e 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -707,9 +707,11 @@  __SYSCALL(__NR_getrandom, sys_getrandom)
 __SYSCALL(__NR_memfd_create, sys_memfd_create)
 #define __NR_bpf 280
 __SYSCALL(__NR_bpf, sys_bpf)
+#define __NR_unlinkat_s 281
+__SYSCALL(__NR_unlinkat_s, sys_unlinkat_s)
 
 #undef __NR_syscalls
-#define __NR_syscalls 281
+#define __NR_syscalls 282
 
 /*
  * All syscalls below here should go away really,
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index fb12645..1507335 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1110,6 +1110,8 @@  static struct syscall_fmt {
 	{ .name	    = "uname",	    .errmsg = true, .alias = "newuname", },
 	{ .name	    = "unlinkat",   .errmsg = true,
 	  .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, },
+	{ .name	    = "unlinkat_s",   .errmsg = true,
+	  .arg_scnprintf = { [0] = SCA_FDAT, /* dfd */ }, },
 	{ .name	    = "utimensat",  .errmsg = true,
 	  .arg_scnprintf = { [0] = SCA_FDAT, /* dirfd */ }, },
 	{ .name	    = "write",	    .errmsg = true,