Privsep is intended to provide some security in depth. The idea is based on the principal of least privilege. We assume that an attacker has exploited some bug in Nova and is able to execute arbitrary code. If privsep is doing its job, the attacker should at least not be able to perform any privileged operations which Nova does not normally perform. For developers, from the spec[1]: "The intention is to allow slightly more code into the privileged portion - enough that we now have sufficient “context” to make better security decisions. For example move from “run chown” to “take ownership of VM output file”." Unfortunately, I suspect out of expediency in the initial forklift from rootwrap, we've lost this critical principal of moving security-critical logic into privsep itself. Consequently, a brief and non-exhaustive audit reveals we have privsep functions to do the following: fs.py: * mount anything of any type, including remote storage, anywhere. * unmount anything * shred any file or block device, without restriction path.py: * read any file or block device, without restriction * overwrite any existing file or block device utils.py: * send any signal to any process Just the existence of these methods in nova's privsep module essentially gives the unprivileged nova process unfettered root access, which entirely defeats the purpose of privsep. As it stands today, privsep gives us literally nothing. I suspect that it's worse than rootwrap, in fact. If privsep protection is something we want, we need to do 2 things: 1. Immediately stop adding or using generic privsep functions. 2. Audit privsep properly for problematic methods, and remove them. For example, in libvirt.driver.get_console_output() we do nova.privsep.path.writefile(console_log, 'a+', data). Instead, we should have, eg, nova.privsep.libvirt.ensure_console_log_for_instance(instance). This would determine the instance path from its own secure config (see [1] again), and touch that file. We should probably also compile a list of privsep anti-patterns if there isn't one already to make life simpler for reviewers. For example, I suggest that no privsep function should take as an argument a path, or any other reference to a system resource; it should determine these itself. I view this as essentially a continuation of the privsep forklift. As with all such large mechanical changes I expect this to be disruptive and unloved by reviewers. Is it something we'd be prepared to prioritise? Matt [1] https://specs.openstack.org/openstack/oslo-specs/specs/liberty/privsep.html -- Matthew Booth Red Hat OpenStack Engineer, Compute DFG Phone: +442070094448 (UK)
On 2019-03-29 11:18:10 +0000 (+0000), Matthew Booth wrote: [...]
I suspect out of expediency in the initial forklift from rootwrap, we've lost this critical principal of moving security-critical logic into privsep itself. [...]
Yes, the expectation was that once the privsep framework was available, services relying on rootwrap would rework sensitive calls to operate within privsep and minimally limit those services' ability to influence their execution in dangerous ways. Nova isn't the only one still in this state (either with far-too-dangerous privsep functions exposed or still mostly relying on really lax rootwrap filters). This could make for an excellent cross-project effort, perhaps even a cycle goal, so I've added the [tc] tag to the subject. I've also tagged it for the [security-sig] as members there may have an interest in assisting with the effort. -- Jeremy Stanley
The reality is that privsep was always going to be a process. It's taken more than 80 patches to get close to removing rootwrap. There are other advantages to removing rootwrap, mainly around performance, the integration of library code, and general non-bonkersness (cat to tee to write to a file as root), etc. There is president in the code to mark calls as undesirable, and others could be marked like that as well, but ultimately someone needs to do an audit and fix things... That's more than one person can reasonably do. So, who wants to help try and improve this? Patches welcome. Michael On Fri., 29 Mar. 2019, 10:26 pm Matthew Booth, <mbooth@redhat.com> wrote:
Privsep is intended to provide some security in depth. The idea is based on the principal of least privilege. We assume that an attacker has exploited some bug in Nova and is able to execute arbitrary code. If privsep is doing its job, the attacker should at least not be able to perform any privileged operations which Nova does not normally perform.
For developers, from the spec[1]:
"The intention is to allow slightly more code into the privileged portion - enough that we now have sufficient “context” to make better security decisions. For example move from “run chown” to “take ownership of VM output file”."
Unfortunately, I suspect out of expediency in the initial forklift from rootwrap, we've lost this critical principal of moving security-critical logic into privsep itself. Consequently, a brief and non-exhaustive audit reveals we have privsep functions to do the following:
fs.py:
* mount anything of any type, including remote storage, anywhere. * unmount anything * shred any file or block device, without restriction
path.py:
* read any file or block device, without restriction * overwrite any existing file or block device
utils.py:
* send any signal to any process
Just the existence of these methods in nova's privsep module essentially gives the unprivileged nova process unfettered root access, which entirely defeats the purpose of privsep. As it stands today, privsep gives us literally nothing. I suspect that it's worse than rootwrap, in fact. If privsep protection is something we want, we need to do 2 things:
1. Immediately stop adding or using generic privsep functions. 2. Audit privsep properly for problematic methods, and remove them.
For example, in libvirt.driver.get_console_output() we do nova.privsep.path.writefile(console_log, 'a+', data). Instead, we should have, eg, nova.privsep.libvirt.ensure_console_log_for_instance(instance). This would determine the instance path from its own secure config (see [1] again), and touch that file.
We should probably also compile a list of privsep anti-patterns if there isn't one already to make life simpler for reviewers. For example, I suggest that no privsep function should take as an argument a path, or any other reference to a system resource; it should determine these itself.
I view this as essentially a continuation of the privsep forklift. As with all such large mechanical changes I expect this to be disruptive and unloved by reviewers. Is it something we'd be prepared to prioritise?
Matt
[1] https://specs.openstack.org/openstack/oslo-specs/specs/liberty/privsep.html -- Matthew Booth Red Hat OpenStack Engineer, Compute DFG
Phone: +442070094448 (UK)
Michael Still wrote:
The reality is that privsep was always going to be a process. It's taken more than 80 patches to get close to removing rootwrap.
There are other advantages to removing rootwrap, mainly around performance, the integration of library code, and general non-bonkersness (cat to tee to write to a file as root), etc.
There is president in the code to mark calls as undesirable, and others could be marked like that as well, but ultimately someone needs to do an audit and fix things... That's more than one person can reasonably do.
So, who wants to help try and improve this? Patches welcome.
It's been on my priority-2 TODO list for a while to help with that... Now if people would stop adding to my priority-1 TODO list... Agree that's definitely more than a one-person job, but migrating a specific call is also a reasonably self-contained unit of work that (1) does not require a deep understanding of all the code around it, and (2) does not commit you for a lifelong feature maintenance duty... So maybe it would be a good thing to suggest newcomers / students to get a poke at? I'm happy to help with the reviewing if we can come up with a topic name that helps finding those. -- Thierry Carrez (ttx)
On Sat., 30 Mar. 2019, 6:28 pm Thierry Carrez, <thierry@openstack.org> wrote:
Michael Still wrote:
The reality is that privsep was always going to be a process. It's taken more than 80 patches to get close to removing rootwrap.
There are other advantages to removing rootwrap, mainly around performance, the integration of library code, and general non-bonkersness (cat to tee to write to a file as root), etc.
There is president in the code to mark calls as undesirable, and others could be marked like that as well, but ultimately someone needs to do an audit and fix things... That's more than one person can reasonably do.
So, who wants to help try and improve this? Patches welcome.
It's been on my priority-2 TODO list for a while to help with that... Now if people would stop adding to my priority-1 TODO list...
Agree that's definitely more than a one-person job, but migrating a specific call is also a reasonably self-contained unit of work that (1) does not require a deep understanding of all the code around it, and (2) does not commit you for a lifelong feature maintenance duty... So maybe it would be a good thing to suggest newcomers / students to get a poke at? I'm happy to help with the reviewing if we can come up with a topic name that helps finding those.
One concern I have is that I am not sure it's always as simple as it looks. For example, we could enforce that device files are always in /dev, but is that always true on all architectures with all hypervisors? How do we know that? We could add enforcement and at the same time add a workaround flag to turn it off, which is immediately deprecated so people have a release to notice a breakage. Do we do that per enforcement rule? That's a lot of flags! Finally, the initial forklift has been a series of relatively simple code swaps (which is really the root of Mr Booth's concern). Even with taking the easy path there haven't been heaps of volunteers helping and some of this code has been in review for quite a long time. Do we really think that volunteers are going to show up now? Michael
On Sat, 30 Mar 2019 at 08:32, Michael Still <mikal@stillhq.com> wrote:
On Sat., 30 Mar. 2019, 6:28 pm Thierry Carrez, <thierry@openstack.org> wrote:
Michael Still wrote:
The reality is that privsep was always going to be a process. It's taken more than 80 patches to get close to removing rootwrap.
There are other advantages to removing rootwrap, mainly around performance, the integration of library code, and general non-bonkersness (cat to tee to write to a file as root), etc.
There is president in the code to mark calls as undesirable, and others could be marked like that as well, but ultimately someone needs to do an audit and fix things... That's more than one person can reasonably do.
So, who wants to help try and improve this? Patches welcome.
It's been on my priority-2 TODO list for a while to help with that... Now if people would stop adding to my priority-1 TODO list...
Agree that's definitely more than a one-person job, but migrating a specific call is also a reasonably self-contained unit of work that (1) does not require a deep understanding of all the code around it, and (2) does not commit you for a lifelong feature maintenance duty... So maybe it would be a good thing to suggest newcomers / students to get a poke at? I'm happy to help with the reviewing if we can come up with a topic name that helps finding those.
One concern I have is that I am not sure it's always as simple as it looks. For example, we could enforce that device files are always in /dev, but is that always true on all architectures with all hypervisors? How do we know that?
I think the answer here is that we don't have privsep functions like that *at all*. So instead of a generic privsep.data.discombobulate(device) you have privsep.libvirt.volume.foo.discombobulate(volume). I mentioned this as a general principal in my original mail which has since been snipped, that we should probably *never* pass a reference to a system resource (the most obvious example being a path) to a privsep function. Instead, privsep has its own config and can work that out for itself. This does mean a proliferation of specific privsep functions over a few generic ones, but that's how the security model works. In general we should expect classes to have a shadow privsep class containing security-sensitive logic.
We could add enforcement and at the same time add a workaround flag to turn it off, which is immediately deprecated so people have a release to notice a breakage. Do we do that per enforcement rule? That's a lot of flags!
Finally, the initial forklift has been a series of relatively simple code swaps (which is really the root of Mr Booth's concern). Even with taking the easy path there haven't been heaps of volunteers helping and some of this code has been in review for quite a long time. Do we really think that volunteers are going to show up now?
Exactly. Most folks acknowledge the importance of this stuff, but not to the extend of prioritising it for review: it's boring, involves large code churn, and doesn't immediately add anything obvious. Worse the code churn means that unless it lands quickly it's very soon going to be in merge conflict hell. Writing the code is likely the easy bit. Unless we can agree on this as a priority theme for a while I don't see us making any progress on it in practise. Matt -- Matthew Booth Red Hat OpenStack Engineer, Compute DFG Phone: +442070094448 (UK)
On 4/1/19 9:23 AM, Matthew Booth wrote:
On Sat, 30 Mar 2019 at 08:32, Michael Still <mikal@stillhq.com> wrote:
On Sat., 30 Mar. 2019, 6:28 pm Thierry Carrez, <thierry@openstack.org> wrote:
Michael Still wrote:
The reality is that privsep was always going to be a process. It's taken more than 80 patches to get close to removing rootwrap.
There are other advantages to removing rootwrap, mainly around performance, the integration of library code, and general non-bonkersness (cat to tee to write to a file as root), etc.
There is president in the code to mark calls as undesirable, and others could be marked like that as well, but ultimately someone needs to do an audit and fix things... That's more than one person can reasonably do.
So, who wants to help try and improve this? Patches welcome.
It's been on my priority-2 TODO list for a while to help with that... Now if people would stop adding to my priority-1 TODO list...
Agree that's definitely more than a one-person job, but migrating a specific call is also a reasonably self-contained unit of work that (1) does not require a deep understanding of all the code around it, and (2) does not commit you for a lifelong feature maintenance duty... So maybe it would be a good thing to suggest newcomers / students to get a poke at? I'm happy to help with the reviewing if we can come up with a topic name that helps finding those.
One concern I have is that I am not sure it's always as simple as it looks. For example, we could enforce that device files are always in /dev, but is that always true on all architectures with all hypervisors? How do we know that?
I think the answer here is that we don't have privsep functions like that *at all*. So instead of a generic privsep.data.discombobulate(device) you have privsep.libvirt.volume.foo.discombobulate(volume). I mentioned this as a general principal in my original mail which has since been snipped, that we should probably *never* pass a reference to a system resource (the most obvious example being a path) to a privsep function. Instead, privsep has its own config and can work that out for itself. This does mean a proliferation of specific privsep functions over a few generic ones, but that's how the security model works. In general we should expect classes to have a shadow privsep class containing security-sensitive logic.
This matches my understanding of the intent behind privsep. Instead of trying to figure out how to filter out dangerous paths or whatever, you have the privileged code build the path out of sanitized data passed in. You still have to be careful about preventing injection attacks (say I pass in [uuid]/../../..), but at least you don't have to try to write a rootwrap filter for every possible path in every possible driver. The privileged driver code itself knows the correct path and won't allow anything else.
On Tue, Apr 2, 2019 at 5:06 AM Ben Nemec <openstack@nemebean.com> wrote: [snip] This matches my understanding of the intent behind privsep. Instead of
trying to figure out how to filter out dangerous paths or whatever, you have the privileged code build the path out of sanitized data passed in. You still have to be careful about preventing injection attacks (say I pass in [uuid]/../../..), but at least you don't have to try to write a rootwrap filter for every possible path in every possible driver. The privileged driver code itself knows the correct path and won't allow anything else.
I don't think anyone disputes that theory. The issue here is that the calling code in nova makes that very hard in some cases. For example, nova.privsep.path.readfile exists because it is called by nova.virt.disk.localfs.read_file [1], which takes a raw path to read. That is in turn used by calling code to read at least two paths that are generated in calling code. Now all of that is fixable, and that's a relatively simple example, but it requires a call by call analysis that is quite expensive to do and will cause a lot of code churn to fix. No one is saying we shouldn't do it. However, the current privsep code took the approach of an incremental improvement on top of what we already had, instead of attempting to boil the ocean in a single patch series. The code for nova.virt.disk.localfs.read_file used to call a method called read_file_as_root that was just as bad as what we have now, but used rootwrap to cat arbitrary files in the filesystem (git sha f535e8bb9905b5632416135af5789704db6d2867 is the original transition). My point is that we need to start somewhere, and proposing that we abandon privsep entirely because its not immediately perfect isn't very helpful. Michael 1: Well, there is one other caller, but you get the idea.
On 4/1/19 2:59 PM, Michael Still wrote:
On Tue, Apr 2, 2019 at 5:06 AM Ben Nemec <openstack@nemebean.com <mailto:openstack@nemebean.com>> wrote:
[snip]
This matches my understanding of the intent behind privsep. Instead of trying to figure out how to filter out dangerous paths or whatever, you have the privileged code build the path out of sanitized data passed in. You still have to be careful about preventing injection attacks (say I pass in [uuid]/../../..), but at least you don't have to try to write a rootwrap filter for every possible path in every possible driver. The privileged driver code itself knows the correct path and won't allow anything else.
I don't think anyone disputes that theory.
The issue here is that the calling code in nova makes that very hard in some cases. For example, nova.privsep.path.readfile exists because it is called by nova.virt.disk.localfs.read_file [1], which takes a raw path to read. That is in turn used by calling code to read at least two paths that are generated in calling code. Now all of that is fixable, and that's a relatively simple example, but it requires a call by call analysis that is quite expensive to do and will cause a lot of code churn to fix.
No one is saying we shouldn't do it. However, the current privsep code took the approach of an incremental improvement on top of what we already had, instead of attempting to boil the ocean in a single patch series. The code for nova.virt.disk.localfs.read_file used to call a method called read_file_as_root that was just as bad as what we have now, but used rootwrap to cat arbitrary files in the filesystem (git sha f535e8bb9905b5632416135af5789704db6d2867 is the original transition).
My point is that we need to start somewhere, and proposing that we abandon privsep entirely because its not immediately perfect isn't very helpful.
For the record, that was not at all my intent. It doesn't sound to me like the current privsep implementation is meaningfully worse than using rootwrap, and it seems to me that it should be much easier (note: easier, not easy) to move forward now that everything is using privsep. I was just +1'ing the desired end state of privsep calls not taking paths as input anymore.
Michael
1: Well, there is one other caller, but you get the idea.
Ben Nemec wrote:
On 4/1/19 2:59 PM, Michael Still wrote:
On Tue, Apr 2, 2019 at 5:06 AM Ben Nemec <openstack@nemebean.com <mailto:openstack@nemebean.com>> wrote:
[snip] This matches my understanding of the intent behind privsep. Instead of trying to figure out how to filter out dangerous paths or whatever, you have the privileged code build the path out of sanitized data passed in. You still have to be careful about preventing injection attacks (say I pass in [uuid]/../../..), but at least you don't have to try to write a rootwrap filter for every possible path in every possible driver. The privileged driver code itself knows the correct path and won't allow anything else.
I don't think anyone disputes that theory.
The issue here is that the calling code in nova makes that very hard in some cases. For example, nova.privsep.path.readfile exists because it is called by nova.virt.disk.localfs.read_file [1], which takes a raw path to read. That is in turn used by calling code to read at least two paths that are generated in calling code. Now all of that is fixable, and that's a relatively simple example, but it requires a call by call analysis that is quite expensive to do and will cause a lot of code churn to fix.
No one is saying we shouldn't do it. However, the current privsep code took the approach of an incremental improvement on top of what we already had, instead of attempting to boil the ocean in a single patch series. The code for nova.virt.disk.localfs.read_file used to call a method called read_file_as_root that was just as bad as what we have now, but used rootwrap to cat arbitrary files in the filesystem (git sha f535e8bb9905b5632416135af5789704db6d2867 is the original transition).
My point is that we need to start somewhere, and proposing that we abandon privsep entirely because its not immediately perfect isn't very helpful.
For the record, that was not at all my intent. It doesn't sound to me like the current privsep implementation is meaningfully worse than using rootwrap, and it seems to me that it should be much easier (note: easier, not easy) to move forward now that everything is using privsep. I was just +1'ing the desired end state of privsep calls not taking paths as input anymore.
Also, the reason why the calling Nova code has generic things like nova.virt.disk.localfs.read_file is because rootwrap /encouraged/ reusing generic filters across calling code. I see the privsep transition as having 3 steps: 1- introduce privsep 2- change rootwrap calls into generic privsep functions 3- start refactoring calling code so that generic privsep functions can be replaced by narrow, context-aware functions You can tackle (2) and (3) at the same time. You get performance gains as soon as you start doing (2) bits, but you won't really gain security until (3) bits are completed for all the actually-this-means-root generic functions. So I'm not sure we should "immediately stop adding or using generic privsep functions" as they don't really make the situation worse than it already is security-wise. For *new* features we should definitely avoid the privsep generic functions antipattern -- but transforming a generic rootwrap call into a generic privsep function still sounds like a gain. It would be good to describe the antipattern and how to write "good" privsep functions though, if only to be able to point developers and reviewers to that. Suggestions on where we could do that? -- Thierry Carrez (ttx)
1- introduce privsep 2- change rootwrap calls into generic privsep functions 3- start refactoring calling code so that generic privsep functions can be replaced by narrow, context-aware functions
You can tackle (2) and (3) at the same time.
In Nova at least, (2) (without (3)) already has patches proposed all the way up [A], so I'm going to go out on a limb and say we're going to wait to tackle (3) until after that series [B], at least for existing code.
It would be good to describe the antipattern and how to write "good" privsep functions though, if only to be able to point developers and reviewers to that. Suggestions on where we could do that? Agree with this for sure. I understand the rootwrap->privsep thing well enough to review the existing series, but will need help understanding how (3) will need to look.
Long-term, the document should obviously live somewhere non-project-specific, and I don't know where that would be. Short(er)-term, since we have momentum on the issue in Nova, as well as a clear picture of all the places it needs to be applied (thanks to (2)/[A]), how about we include it in a Nova spec, since we're going to need one anyway? -efried [A] https://review.openstack.org/#/q/topic:my-own-personal-alternative-universe+...) [B] Note that that series has been in flight for quite a while. The patch that actually removes rootwrap (https://review.openstack.org/#/c/554438/) was first proposed right about a year ago. I'm hoping this email thread gets the series some more review attention.
On 4/3/19 8:15 AM, Eric Fried wrote:
1- introduce privsep 2- change rootwrap calls into generic privsep functions 3- start refactoring calling code so that generic privsep functions can be replaced by narrow, context-aware functions
You can tackle (2) and (3) at the same time.
In Nova at least, (2) (without (3)) already has patches proposed all the way up [A], so I'm going to go out on a limb and say we're going to wait to tackle (3) until after that series [B], at least for existing code.
It would be good to describe the antipattern and how to write "good" privsep functions though, if only to be able to point developers and reviewers to that. Suggestions on where we could do that? Agree with this for sure. I understand the rootwrap->privsep thing well enough to review the existing series, but will need help understanding how (3) will need to look.
Long-term, the document should obviously live somewhere non-project-specific, and I don't know where that would be. Short(er)-term, since we have momentum on the issue in Nova, as well as a clear picture of all the places it needs to be applied (thanks to (2)/[A]), how about we include it in a Nova spec, since we're going to need one anyway?
Wouldn't we put privsep best practices in the privsep docs? Currently the usage docs[0] just link to Michael's blog posts about implementing privsep, but that seems like the logical place to keep the guidelines for writing good privileged functions. 0: https://docs.openstack.org/oslo.privsep/latest/user/index.html
-efried
[A] https://review.openstack.org/#/q/topic:my-own-personal-alternative-universe+...) [B] Note that that series has been in flight for quite a while. The patch that actually removes rootwrap (https://review.openstack.org/#/c/554438/) was first proposed right about a year ago. I'm hoping this email thread gets the series some more review attention.
Ben Nemec wrote:
[...]
It would be good to describe the antipattern and how to write "good" privsep functions though, if only to be able to point developers and reviewers to that. Suggestions on where we could do that? Agree with this for sure. I understand the rootwrap->privsep thing well enough to review the existing series, but will need help understanding how (3) will need to look.
Long-term, the document should obviously live somewhere non-project-specific, and I don't know where that would be. Short(er)-term, since we have momentum on the issue in Nova, as well as a clear picture of all the places it needs to be applied (thanks to (2)/[A]), how about we include it in a Nova spec, since we're going to need one anyway?
Wouldn't we put privsep best practices in the privsep docs? Currently the usage docs[0] just link to Michael's blog posts about implementing privsep, but that seems like the logical place to keep the guidelines for writing good privileged functions.
Makes sense. I'll try to describe the antipattern, unless someone beats me to it. -- Thierry Carrez (ttx)
Thierry Carrez wrote:
Ben Nemec wrote:
[...]
It would be good to describe the antipattern and how to write "good" privsep functions though, if only to be able to point developers and reviewers to that. Suggestions on where we could do that? Agree with this for sure. I understand the rootwrap->privsep thing well enough to review the existing series, but will need help understanding how (3) will need to look.
Long-term, the document should obviously live somewhere non-project-specific, and I don't know where that would be. Short(er)-term, since we have momentum on the issue in Nova, as well as a clear picture of all the places it needs to be applied (thanks to (2)/[A]), how about we include it in a Nova spec, since we're going to need one anyway?
Wouldn't we put privsep best practices in the privsep docs? Currently the usage docs[0] just link to Michael's blog posts about implementing privsep, but that seems like the logical place to keep the guidelines for writing good privileged functions.
Makes sense. I'll try to describe the antipattern, unless someone beats me to it.
A start at: https://review.openstack.org/649997 -- Thierry Carrez (ttx)
Scrubbing the Nova PTG agenda (hence added [ptg] subject tag), and this is currently on it.
1- introduce privsep 2- change rootwrap calls into generic privsep functions 3- start refactoring calling code so that generic privsep functions can be replaced by narrow, context-aware functions
Based on the discussion in this thread, it sounds to me like nobody disagrees about what should be done; it's going to be a matter of getting mikal's series (2 above, [A] below) finished up and then finding one or more bodies to throw at the next step (3 above). Can I ask someone (perhaps Mr. Booth?) to file a blueprint to track this? Is there any part of 3 that we expect to be able to start/finish in Train? And other than that, is there anything further to discuss, or can we strike this from the PTG agenda?
[A] https://review.openstack.org/#/q/topic:my-own-personal-alternative-universe+...) [B] Note that that series has been in flight for quite a while. The patch that actually removes rootwrap (https://review.openstack.org/#/c/554438/) was first proposed right about a year ago. I'm hoping this email thread gets the series some more review attention.
Thanks, efried .
participants (6)
-
Ben Nemec
-
Eric Fried
-
Jeremy Stanley
-
Matthew Booth
-
Michael Still
-
Thierry Carrez