Anybody have any thoughts on sand-boxing and/or process isolation among
mongrel2 handlers running on the same server? I have a scenario where
one or more of my handler processes may be running untrusted Lua code.
So far, it seems to be really tricky to get right.

And I'm beginning to wonder if it can realistically be done safely after
reading this:
http://blog.phpfog.com/2011/03/22/how-we-got-owned-by-a-few-teenagers-and-why-it-will-never-happen-again/

Has anyone run into a similar situation or have any ideas on how to
approach it?

✈ Matt
On Thu, Mar 24, 2011 at 09:17:09AM -0700, Matt Towers wrote:
> Anybody have any thoughts on sand-boxing and/or process isolation
> among mongrel2 handlers running on the same server? I have a scenario
> where one or more of my handler processes may be running untrusted Lua
> code. So far, it seems to be really tricky to get right.

In Lua this is pretty easy at the language level where you run it in a
limited environment using setfenv. If you need to give them access to
the disk with chroot, then what you do is start up and then after it's
all loaded do a chroot. I *believe* that the lposix library or
lua_posix has chroot.

--
Zed A. Shaw
http://zedshaw.com/
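A minimal sketch of the setfenv approach described above, not Zed's actual code. The whitelist in `make_env` and the helper names `load_in_env` and `run_untrusted` are made up for illustration, and the `load(..., env)` branch is only there so the same sketch also runs on Lua 5.2 or later, where setfenv no longer exists.

```lua
-- Whitelisted environment: only names listed here are visible to the chunk.
-- io, os, debug, require, load*, setfenv/getfenv and _G are deliberately absent.
local function make_env()
  return {
    ipairs = ipairs, pairs = pairs, select = select, type = type,
    tostring = tostring, tonumber = tonumber,
    string = { format = string.format, sub = string.sub, upper = string.upper },
    math = { floor = math.floor, max = math.max, min = math.min },
    table = { concat = table.concat, insert = table.insert },
  }
end

-- Compile `code` so that its global lookups resolve against `env` only.
local function load_in_env(code, env)
  -- Refuse non-strings and precompiled bytecode (which starts with ESC, byte 27).
  if type(code) ~= "string" or code:byte(1) == 27 then
    return nil, "only source text is accepted"
  end
  if setfenv then                                -- Lua 5.1 / LuaJIT
    local f, err = loadstring(code, "=untrusted")
    if not f then return nil, err end
    return setfenv(f, env)                       -- setfenv returns the function
  end
  return load(code, "=untrusted", "t", env)      -- Lua 5.2+
end

-- Run untrusted source in a fresh environment; returns pcall-style results.
local function run_untrusted(code)
  local f, err = load_in_env(code, make_env())
  if not f then return false, err end
  return pcall(f)
end
```

Two things this does not cover: string methods stay reachable through the shared string metatable (`("x"):rep(n)` works even though `string.rep` is not in the whitelist), and nothing here bounds CPU or memory, so process-level limits are still needed on top.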
On Sat, Mar 26, 2011 at 7:41 AM, Zed A. Shaw <zedshaw@zedshaw.com> wrote:
> In Lua this is pretty easy at the language level where you run it in a
> limited environment using setfenv. If you need to give them access to
> the disk with chroot, then what you do is start up and then after it's
> all loaded do a chroot. I *believe* that the lposix library or
> lua_posix has chroot.
>
> --
> Zed A. Shaw
> http://zedshaw.com/

It's also easy enough to manually restrict everything, since you can
trivially create your own wrapper functions around the core system and
enforce any constraints you need there. A chroot is better though, since
you didn't write it. :)

I have a sandbox around somewhere for an old IRC bot I wrote, I'll see
if I can dig it up. Also, if you have trouble, #lua on freenode should be
able to point you in the right direction, Shirik and batrick in
particular, who run lua_bot and batbot respectively.
Hello,

On 2011-03-24 17:17, Matt Towers wrote:
> Anybody have any thoughts on sand-boxing and/or process isolation among
> mongrel2 handlers running on the same server? I have a scenario where
> one or more of my handler processes may be running untrusted Lua code.
> So far, it seems to be really tricky to get right.
>
> And I'm beginning to wonder if it can realistically be done safely after
> reading this:
> http://blog.phpfog.com/2011/03/22/how-we-got-owned-by-a-few-teenagers-and-why-it-will-never-happen-again/

I read it too. The problem they faced was not the same as what you want
to do. What they wanted to do is "easy" to control, as they give an
instance per user. They failed when they started to have shared
instances without the proper security. Heroku is using shared instances
and had security issues too, but it is possible to do it.

> Has anyone run into a similar situation or have any ideas on how to
> approach it?

I would be extremely interested to have some input on the subject too. I
was looking at a way to do that as recently as yesterday. Basically you
need to control two parts:

1- the process's access to the host at the "file level".
2- the zmq sockets.

For the first part, it is easy, just chroot.

For the zmq part, you do not want the handler of a given customer to
bind on the host serving another customer and pull requests for
itself. This is where I haven't found a solution. I tried to look at
per-process control of the open or bound ports. I found an old patch:

http://ex-parrot.com/~pdw/user-port-hack/

Basically, each user would have just a limited range of ports where he
can bind/connect. Maybe just 2 of them (send/recv) for the handler and
basta.

Or you can use iptables with --uid-owner --gid-owner rules.

If you combine chroot + iptables/kernel patch + restricted stack (for
example only Lua or Python or PHP or Ruby) you can also have your
language virtual machine enforce some restrictions through monkey
patching.

In fact, as I write these lines, I think that in PHP it would be easy:
just chroot each script and run each of them with a given php.ini. Put
in the php.ini the authorized ports for zmq and compile a special PHP
zmq extension which will check the php.ini value before opening a zmq
connection. I suppose you can do the same for any language. You would
also need to apply the same "rules" to raw sockets (because you can
trash a zmq socket if you connect with a standard socket call).

Anyway, interesting problem and I am very interested with what you come
up with.

loïc

--
Indefero - Project management and code hosting - http://www.indefero.net
Photon - High Performance PHP Framework - http://photon-project.com
Céondo Ltd - Web + Science = Fun - http://www.ceondo.com
> For the zmq part, you do not want the handler of a given customer to
> bind on the host serving another customer and pulling requests for
> itself. This is where I haven't found a solution. I tried to look at per
> process control of the open ports or binded ports. I found an old patch:
>
> http://ex-parrot.com/~pdw/user-port-hack/
>
> Basically, each user would have just a limited range of ports where he
> can bind/connect. Maybe just 2 of them (send/recv) for the handler and
> basta.
>
> Or you can use iptables with --uid-owner --gid-owner rules.

Obviously this only works if mongrel2 is on the local machine, but what
about unix sockets instead of tcp sockets?
2011/3/25 Jason Miller <jason@milr.com>:
> 2011/3/24 Loic d'Anterroches <loic@ceondo.com>:
>> For the zmq part, you do not want the handler of a given customer to
>> bind on the host serving another customer and pulling requests for
>> itself. This is where I haven't found a solution. I tried to look at per
>> process control of the open ports or binded ports. I found an old patch:
>>
>> http://ex-parrot.com/~pdw/user-port-hack/
>>
>> Basically, each user would have just a limited range of ports where he
>> can bind/connect. Maybe just 2 of them (send/recv) for the handler and
>> basta.
>>
>> Or you can use iptables with --uid-owner --gid-owner rules.
> Obviously this only works if mongrel2 is on the local machine, but what
> about unix sockets instead of tcp sockets?

Another idea could be to have a separate process that creates two
ipc: sockets. This other process would run as a user that can open
sockets. It would just dump all m2 push/pull messages to an ipc://in
socket and read from an ipc://out socket, writing to the m2 sub socket.
So the user process would always read from and write to ipc: sockets.

The process that runs the untrusted code would be restricted with
iptables (with --uid-owner/--gid-owner) so that it cannot open tcp:
sockets. Note that the two ipc: sockets would be inside the chroot, so
each user app would have its own ipc://{in,out} socket pair.

Just remember to check whether this solution brings a prohibitive
performance hit for your needs.

--
Dalton Barreto
http://daltonmatos.wordpress.com
On 2011-03-26 03:35, Dalton Barreto wrote:
> Another idea could be to have a separated process that creates two
> ipc: sockets. This other process would be running as a user that can
> open sockets. It would just dump all m2 push/pull messages to an
> ipc://in socket and read from a ipc://out socket writing to m2 sub
> socket. So the user process would read/write always to an ipc: sockets
>
> The process that runs the untrusted code would be iptables restricted
> to open tcp: sockets (with --uid-owner/--gid-owner). Note that the two
> ipc: socket would be inside the chroot, so each user app would have
> its own ipc://{in,out} socket pair.
>
> Just remember to observe if this solution will bring a impeditive
> performance hit for your needs.

What I would be pleased to have is the kind of flexibility we have with
the firewall/security groups of EC2 but applied to a process. This way I
could chroot my process and then say: you can communicate only with
these IPs and on those ports.

Being able to do that would mean being able to provide shared hosting of
Mongrel2 in a Heroku-like setup. In particular, I could allow the users
to manage their own workers. For example, they could set up a central
worker to process the emails spamming their friends and have their
handlers access it even if they are on different VMs.

At the moment, I am building the system for my company's use; as all the
code is trusted, this is not yet a big issue.

For a really Heroku-like system, the simplest way is maybe to start
small with support for a single language and to secure it directly at
the language level. You have sandboxing or an equivalent for nearly all
languages and you have the sources of all the VMs. For example with
Python, you can just restrict the number of available modules, compile a
customized zmq extension, etc.

I think this is the way AppEngine went: one language with custom libs
to secure the system.

loïc
You could get close by provisioning an EC2 micro instance for a given
developer's handlers.

✈ Matt

On Mar 26, 2011, at 12:54 , Loic d'Anterroches wrote:
> What I would be pleased to have is the kind of flexibility we have with
> the firewall/security groups of EC2 but applied to a process. This way I
> could chroot my process and then say: you can communicate only with
> these IPs and on those ports.
While speed testing Brubeck I used a few micro instances. It took them a
long time to load and response times were really volatile.

I'd be cautious using one for anything where response time is important.

On Sun, Mar 27, 2011 at 7:02 PM, Matt Towers <matt@ziplinegames.com> wrote:
> You could get close by provisioning an EC2 micro instance for a given
> developer's handlers.
>
> ✈ Matt
On 2011-03-28 01:04, James Dennis wrote:
> While speed testing Brubeck I used a few micro instances. It took them a
> long time to load and response times were really volatile.
>
> I'd be cautious using one for anything where response time is important.

A very good description of the reasons for this volatility:

http://perfcap.blogspot.com/2011/03/understanding-and-using-amazon-ebs.html

It comes from an engineer at Netflix. Basically, to get a correct
quality of service with AWS they end up doing their own QoS management
on top of it: they use one 1TB EBS to minimize the number of clients on
the same hardware node, and get only instances which are big enough to
be nearly alone on the hardware node. So, they had to find a way to
minimize contention.

Funny, just as I write this, AWS is announcing dedicated instances:

http://aws.typepad.com/aws/2011/03/amazon-ec2-dedicated-instances.html

> You could get close by provisioning an EC2 micro instance for a
> given developer's handlers.

Very expensive if you want mass hosting. I bet that 99.9% of the
websites running at the moment are basically just a glorified blog,
requiring a maximum of 64MB RAM (data and RAM for the processes
inclusive). And not that flexible, as in that case you cannot spread the
handlers over several VMs to have natural failover in case an instance
fails.

I am looking more and more at full stack control, with the ability to
code only in a given language, to correctly sandbox and control.

loïc
On Fri, Mar 25, 2011 at 3:22 PM, Jason Miller <jason@milr.com> wrote:
> > For the zmq part, you do not want the handler of a given customer to
> > bind on the host serving another customer and pulling requests for
> > itself. This is where I haven't found a solution. I tried to look at per
> > process control of the open ports or binded ports. I found an old patch:
> >
> > http://ex-parrot.com/~pdw/user-port-hack/
> >
> > Basically, each user would have just a limited range of ports where he
> > can bind/connect. Maybe just 2 of them (send/recv) for the handler and
> > basta.
> >
> > Or you can use iptables with --uid-owner --gid-owner rules.
> Obviously this only works if mongrel2 is on the local machine, but what
> about unix sockets instead of tcp sockets?

I'd suggest just giving the untrusted code a single socket, and a
restricted api to use with it. Letting them create things is a recipe
for trouble.
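A rough sketch of what "a single socket and a restricted api" could look like on the Lua side. Everything named here is hypothetical: `conn` stands in for whatever connection object the trusted host code already owns, `conn:send(req, body)` is an invented method rather than Tir's or any zmq binding's real API, and `make_restricted_api` is a made-up helper. The point is only that the untrusted code receives closures, never the socket itself.

```lua
-- Build the only object the untrusted handler ever sees.
-- `conn` and `req` are captured as upvalues, so the handler cannot reach
-- the socket, create new ones, or answer a request it was not given.
local function make_restricted_api(conn, req)
  local replied = false
  return {
    reply = function(body)
      assert(type(body) == "string", "body must be a string")
      assert(not replied, "already replied to this request")
      replied = true
      return conn:send(req, body)   -- hypothetical host-side send method
    end,
  }
end
```

The host would then call the untrusted handler with this table as its only argument, inside a restricted environment that does not expose `debug` (whose `getupvalue` could otherwise be used to reach `conn`).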
On Mar 24, 2011, at 11:31 , Loic d'Anterroches wrote:
> Anyway, interesting problem and I am very interested with what you come
> up with.

Right now I'm thinking that I'll severely sandbox the handler's Lua
environment. No filesystem access, no network access other than
responding to HTTP requests via Tir, and the only persistence will be via
a static, authenticated connection to the data store. And wrap all of
the processes with ulimit to prevent them from getting out of hand.

Here's what I've found so far on Lua sandboxing:
http://lua-users.org/wiki/SandBoxes

✈ Matt
Sandboxing Lua is relatively easy to do. First step is to run untrusted
code in a separate environment in a separate process with ulimit or
similar watching memory and CPU usage. Then you need to limit your
environment, removing everything dangerous and everything that can be
used to break out. In this case we have a nice interop system already in
zmq.

I'll try to give you more specific info later as this is typed on my
phone and I'm getting annoyed at it already :)

On Mar 25, 2011 7:17 AM, "Matt Towers" <matt@ziplinegames.com> wrote:
> Right now I'm thinking that I'll severely sandbox the handler's Lua
> environment. No filesystem access, no network access other than
> responding to HTTP requests via Tir and the only persistence will be via
> a static, authenticated connection to the data store. And wrap all of
> the processes with ulimit to prevent them from getting out of hand.
>
> Here's what I've found so far on Lua sandboxing:
> http://lua-users.org/wiki/SandBoxes
>
> ✈ Matt
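As an in-process complement to ulimit (not a replacement for it), the standard debug library's count hook can abort a chunk after a fixed number of VM instructions. This is a sketch under assumptions: `run_with_quota` is a made-up helper name and the budget is an arbitrary number the host picks.

```lua
-- Run f under an instruction budget; returns pcall-style results.
local function run_with_quota(f, max_instructions)
  local function on_quota()
    -- Raised inside whatever Lua code is running when the count is reached.
    error("instruction quota exceeded")
  end
  debug.sethook(on_quota, "", max_instructions)  -- fire every N VM instructions
  local ok, err = pcall(f)
  debug.sethook()                                 -- clear the hook again
  return ok, err
end
```

Known gaps: the hook only fires between Lua VM instructions, so a single long-running C call (a huge `string.rep`, for instance) is not interrupted, and if `pcall` is exposed inside the sandbox the untrusted code can catch the error and carry on until the next firing. Memory is not limited at all here, which is why the separate process with ulimit is still the first step.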
One method that seems to be problematic is loadstring(). According to the
wiki page on sand-boxing below, loadstring will return a function in the
global environment? In my case, I intend to allow code to be loaded from
a mongodb gridfs, which comes back as a string, necessitating the use of
loadstring.

    loadstring -- UNSAFE. See load. Even this:

        local oldloadstring = loadstring
        local function safeloadstring(s, chunkname)
          local f, message = oldloadstring(s, chunkname)
          if not f then
            return f, message
          end
          setfenv(f, getfenv(2))
          return f
        end

    isn't safe. For example, pcall(safeloadstring, some_script) will load
    some_script in global environment. --SergeyRozhenko

✈ Matt
On Mar 24, 2011, at 15:34 , joshua simmons wrote:
> Sandboxing Lua is relatively easy to do. First step is to run untrusted
> code in a separate environment in a separate process with ulimit or
> similar watching memory and CPU usage. Then you need to limit your
> environment, removing everything dangerous and everything that can be
> used to break out. In this case we have a nice interop system already in
> zmq.
>
> I'll try to give you more specific info later as this is typed on my
> phone and I'm getting annoyed at it already :)
Loadstring is fine, you just don't want it available in your sandbox.

These things are not unsafe to use in sandboxing, they're unsafe to be
accessible from within the sandbox, since you can break out with them.

On Fri, Mar 25, 2011 at 9:42 AM, Matt Towers <matt@ziplinegames.com> wrote:
> One method that seems to be problematic is loadstring(). According to the
> wiki page on sand-boxing below, loadstring will return a function in the
> global environment? In my case, I intend to allow code to be loaded from a
> mongodb gridfs, which comes back as a string, necessitating the use of
> loadstring.
>
>   - loadstring -- UNSAFE. See load. Even this:
>
>         local oldloadstring = loadstring
>         local function safeloadstring(s, chunkname)
>           local f, message = oldloadstring(s, chunkname)
>           if not f then
>             return f, message
>           end
>           setfenv(f, getfenv(2))
>           return f
>         end
>
>     isn't safe. For example, pcall(safeloadstring, some_script) will load
>     some_script in global environment.
>     --SergeyRozhenko <http://lua-users.org/wiki/SergeyRozhenko>
>
> ✈ Matt
If I can't use loadstring within the sandbox, is there some way similar
to the "safeloadstring" below that I could provide within the sandbox?

✈ Matt

On Mar 24, 2011, at 15:51 , joshua simmons wrote:
> Loadstring is fine, you just don't want it available in your sandbox.
>
> These things are not unsafe to use in sandboxing, they're unsafe to be
> accessible from within the sandbox, since you can break out with them.
>
> On Fri, Mar 25, 2011 at 9:42 AM, Matt Towers <matt@ziplinegames.com> wrote:
>> One method that seems to be problematic is loadstring(). According to
>> the wiki page on sand-boxing below, loadstring will return a function
>> in the global environment? In my case, I intend to allow code to be
>> loaded from a mongodb gridfs, which comes back as a string,
>> necessitating the use of loadstring.
>>
>> loadstring -- UNSAFE. See load. Even this:
>>
>>     local oldloadstring = loadstring
>>     local function safeloadstring(s, chunkname)
>>       local f, message = oldloadstring(s, chunkname)
>>       if not f then
>>         return f, message
>>       end
>>       setfenv(f, getfenv(2))
>>       return f
>>     end
>>
>> isn't safe. For example, pcall(safeloadstring, some_script) will load
>> some_script in global environment. --SergeyRozhenko
>>
>> ✈ Matt
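One possible shape for such a function, offered as a sketch rather than a vetted answer. The wiki's version is fragile because `getfenv(2)` depends on who the caller happens to be (hence the `pcall` problem). The variant below instead captures the sandbox table once as an upvalue, so every chunk it compiles is bound to that table no matter how it is called. `make_safe_loadstring` and the tiny `env` table are made-up names for illustration, and the `load(..., env)` branch is only there for Lua 5.2 or later.

```lua
-- Build a loadstring replacement that always binds new chunks to `env`.
local function make_safe_loadstring(env)
  return function(code, chunkname)
    -- Source text only: precompiled bytecode starts with ESC (byte 27).
    if type(code) ~= "string" or code:byte(1) == 27 then
      return nil, "only source text is accepted"
    end
    chunkname = chunkname or "=(sandboxed)"
    if setfenv then                               -- Lua 5.1 / LuaJIT
      local f, err = loadstring(code, chunkname)
      if not f then return nil, err end
      return setfenv(f, env)                      -- fixed env, not getfenv(2)
    end
    return load(code, chunkname, "t", env)        -- Lua 5.2+
  end
end

-- Example sandbox table; a real one would carry the full whitelist.
local env = { tostring = tostring, pcall = pcall }
env.loadstring = make_safe_loadstring(env)
```

Code running inside `env` can then call `loadstring` freely, including through `pcall`, and the resulting functions still only see what `env` contains.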
On Thu, Mar 24, 2011 at 11:31 AM, Loic d'Anterroches <loic@ceondo.com> wrote:
> Anyway, interesting problem and I am very interested with what you come
> up with.

I've also thought about how to make M2 work in a shared hosting
environment. I think the solution that involves the least pain would be
to have M2 exposed to the users' processes via unix sockets, either by
speaking to M2 directly, or by using/writing a zmq hub that proxies
between TCP to M2 and unix sockets to the users.

This would keep user communication on the filesystem and bypass the
troubles that would come with allowing them to speak to M2 over the
network.

-Isaac