[PHP-DEV] [Discussion] Sandbox API

Sand Box: A first class API that allows unit testing of code with mocks
and stubs of other classes or functions, without the need to modify the
class under test.

This is an initial idea of how a Sand Box API could work:

$oSandbox = new SPLSandBox();

$oSandbox->MockFunction('\mocks\fopen','\fopen');
$oSandbox->MockFunction('\mocks\fread','\fread');
$oSandbox->MockFunction('\mocks\fwrite','\fwrite');
$oSandbox->MockFunction('\mocks\fclose','\fclose');

$oFileManager = $oSandbox->GetInstance('MyFileManager');

$oFileManager->WriteFile('/path/to/file.txt');

Let's break down what's happening:

The call to `new SPLSandBox();` creates a sandbox instance.

When we then call `$oSandbox->MockFunction()`, we specify the name of a
function that's generally available to the PHP application, and also
specify the name of that function as it should appear to code running
inside the sandbox.

In this case, we have told the sandbox that any call to \fopen, \fread,
\fwrite, or \fclose by sandboxed code should be remapped to
\mocks\fopen, \mocks\fread, \mocks\fwrite, and \mocks\fclose.

The code in the sandbox is completely unaware of this and remains
unchanged.

The line:

$oFileManager = $oSandbox->GetInstance('MyFileManager');

Creates a **sandboxed** instance of the MyFileManager class.

$oFileManager is exactly the same as it normally is, but any calls it
makes to a function will use the Mock function instead, if defined by
`MockFunction()`, or will simply fail if no mock has been loaded into
the sandbox.

We then tell the sandboxed instance of the MyFileManager class to write
out a file.

The `MyFileManager::WriteFile()` method calls `\fopen`, `\fwrite`,
and `\fclose`.

But, unknown to MyFileManager, the sandbox has remapped those calls to
the mock functions we specified instead.

If you know how a chroot jail works on Linux, this will seem very
familiar. Pass in all the dependencies the class under test needs,
remapped to stubs, then get a sandboxed instance of the class to test
with.
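
To make the remapping idea concrete, here is a minimal userland sketch. It is not the proposed SPLSandBox API: `FunctionMap`, `mock()` and `call()` are hypothetical names, and the code under test would have to call through the map explicitly, whereas the proposal would have the engine do the remapping transparently. It assumes PHP 8.0 or later.

```php
<?php
// Hypothetical userland stand-in for the remapping table; not the SPLSandBox API.
final class FunctionMap
{
    /** @var array<string, callable> sandboxed name => replacement */
    private array $map = [];

    // Same argument order as the proposed MockFunction(): replacement first.
    public function mock(callable $replacement, string $sandboxedName): void
    {
        $this->map[ltrim($sandboxedName, '\\')] = $replacement;
    }

    public function call(string $name, mixed ...$args): mixed
    {
        $key = ltrim($name, '\\');
        if (!isset($this->map[$key])) {
            // Mirrors the proposal: a function with no mock simply fails.
            throw new RuntimeException("Function {$key}() is not available in the sandbox");
        }
        return ($this->map[$key])(...$args);
    }
}

$map = new FunctionMap();
$written = [];

// Stub fopen: hand back a fake handle instead of touching the disk.
$map->mock(function (string $path, string $mode) {
    return "handle:$path";
}, '\fopen');

// Stub fwrite: record the data and report the byte count.
$map->mock(function (string $handle, string $data) use (&$written) {
    $written[] = $data;
    return strlen($data);
}, '\fwrite');

$handle = $map->call('\fopen', '/path/to/file.txt', 'w');
$bytes  = $map->call('\fwrite', $handle, 'hello');
```

Calling `$map->call('\fclose', $handle)` here would throw, because no mock for `\fclose` was registered.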

Sandbox: Security

A SandBox has two use cases:

1. Unit Testing of code with mocks or stubs, and also, allowing testing
with different environments.

2. The secure running of 3rd party code inside a 1st party application.

For the second use case, I will use a fictional blogging software
called "Hot Blog" as the example.

Hot Blog is a very popular Open Source blogging platform. Hot Blog
allows third party developers to write plugins.

While many Hot Blog plugin developers have the best of intentions, some
of them are novice coders that make security mistakes.

So let's talk about how Hot Blog could benefit from using the new
SandBox API.

By default, a SandBox instance is a blank slate. There's nothing inside
of it, unless the SandBox is told to have something in it.

That means that sandboxed code that tries to read $_SESSION will find
an empty array. Same with $_SERVER, $_POST, and $_GET.

That's by default. This allows the code that controls the sandbox to
create custom access to application level resources.

Let's say that Hot Blog wants plugin developers to be able to access
certain $_POST variables, but only *after* Hot Blog has checked the
strings for multi-byte attack vulnerabilities.

To do this, Hot Blog creates a class called PluginAPI with a
GetCleanPost method. This lets sandboxed plugins get $_POST data
without being able to bypass Hot Blog's mandatory security check.
(Remember, $_POST is empty inside the sandbox).

The code looks like this:

$oSandbox = new SPLSandBox();
$oSandbox->MockClass('\HotBlog\PluginAPI','\HotBlog\PluginAPI');
$oUserPlugin = $oSandbox->GetInstance('BobsMagicPlugin');
$oUserPlugin->Run();

Because "Bob" has written his plugin as a Hot Blog plugin and knows
that Hot Blog's rules require him to use
\HotBlog\PluginAPI::GetCleanPost() to access a $_POST variable, he
calls that instead of using $_POST.

Now, Hot Blog can impose mandatory security checks on incoming data
making their application more secure.
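
As a rough illustration of what the outer application might hand to plugins, here is a sketch of a `PluginAPI` class. The constructor, the instance-method form of `GetCleanPost()`, and the specific check (rejecting strings that are not valid UTF-8) are all assumptions made for the example; the original post does not specify them. It assumes PHP 8.0 or later.

```php
<?php
namespace HotBlog;

// Hypothetical sketch; the constructor and the validation rule are assumptions.
final class PluginAPI
{
    /** @param array<string, mixed> $post raw POST data captured by the outer application */
    public function __construct(private array $post) {}

    public function GetCleanPost(string $key): ?string
    {
        $value = $this->post[$key] ?? null;
        // Reject missing or non-string input, and strings that are not valid UTF-8.
        // preg_match() with the /u flag fails on malformed UTF-8 subjects.
        if (!is_string($value) || preg_match('//u', $value) !== 1) {
            return null;
        }
        return $value;
    }
}

// The outer application would pass in $_POST; literal data is used here.
$api = new PluginAPI(['title' => 'Hello', 'bad' => "\xC3\x28"]);
$title = $api->GetCleanPost('title');   // valid UTF-8
$bad   = $api->GetCleanPost('bad');     // malformed byte sequence
```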

Next, let's talk about includes. By default, if sandboxed code tries to
include or require a file, a SandBoxAccessViolation is thrown.

Letting sandboxed code include whatever it wants defeats the point of a
sandbox, at least for security use cases.

Of course, includes are useful, and plugins may need them. But the
outer application should be able to control that access.

Enter SPLSandBox::RegisterIncludeHandler().

RegisterIncludeHandler accepts a callable.

The callable's signature is:

(int $IncludeType, string $FileName, string $FilePath)

Where:

$IncludeType is:
0 - require
1 - require_once
2 - include
3 - include_once

$FileName is the file without the path, and $FilePath is the path with
trailing `/`.

If the sandbox should allow includes, the sandbox should have an
Include Handler registered.

The SandBox API calls the include handler, if defined, when sandboxed
code tries to include or require files.

Let's set up a function so our plugin authors can include files, but
only from their own plugin directory:

// Sandbox setup for includes:

$oSandbox = new SPLSandBox();
$oSandbox->RegisterIncludeHandler('HotBlogInclude');

$oUserPlugin = $oSandbox->GetInstance('BobsMagicPlugin');
$oUserPlugin->Run();

// Include Handler:

function HotBlogInclude($Type, $FileName, $FilePath){

   // Both variables are defined by the outer application.
   global $oSandbox, $PluginDirectory;

   if(file_exists($PluginDirectory.$FileName)){
      $oSandbox->Include($PluginDirectory.$FileName);
      return 0;
   }
   return 1; // error!
}

In the above example, $FilePath contained the path that Bob requested
with his include statement. But we ignored it! Bob is only allowed to
include from his plugin's own directory, so we see if the file is in
$PluginDirectory instead.

If the file is in Bob's directory, we include it *into* the sandbox
with SPLSandBox::Include(), making it available to Bob's code, but
keeping the main application code clean of any registrations the
include may cause.
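
The directory check inside such a handler can be tried out in today's PHP. The sketch below only resolves and validates the path; it does not include anything, since SPLSandBox::Include() does not exist yet. `ResolvePluginInclude()` is a hypothetical helper, and the extra `realpath()` containment check is an addition of mine rather than part of the proposal. It assumes PHP 8.0 or later.

```php
<?php
// Hypothetical helper: resolve a requested include to a file inside one directory.
function ResolvePluginInclude(string $PluginDirectory, string $FileName): ?string
{
    // basename() drops any directory part, so "../../etc/passwd" becomes "passwd".
    $Candidate = rtrim($PluginDirectory, '/') . '/' . basename($FileName);
    $Real = realpath($Candidate);
    $Base = realpath($PluginDirectory);
    if ($Real === false || $Base === false) {
        return null; // file or directory does not exist
    }
    // After symlink resolution the file must still sit inside the plugin directory.
    if (!str_starts_with($Real, $Base . DIRECTORY_SEPARATOR) || !is_file($Real)) {
        return null;
    }
    return $Real;
}

// Demo with a throwaway plugin directory.
$Dir = sys_get_temp_dir() . '/hotblog_plugin_' . uniqid();
mkdir($Dir);
file_put_contents($Dir . '/helper.php', '<?php // plugin helper');

$Allowed = ResolvePluginInclude($Dir, 'helper.php');        // path inside $Dir
$Escaped = ResolvePluginInclude($Dir, '../../etc/passwd');  // null

// Clean up the demo files.
unlink($Dir . '/helper.php');
rmdir($Dir);
```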

** Back to Unit Testing **

For the Unit Testing use case, however, certain code under test may
normally read from $_GET, and that shouldn't change under test.

In this next example, we are running a unit test on a FrontController
class, and we want to see if it works with many different URL
structures.

Normally, the web server will map example.com/a/b/c to $_GET vars, so
the FrontController class expects something like:

$_GET = [
   'a' => 'Forum',
   'b' => 'Post',
   'c' => '123'
];

Let's make sure our FrontController is doing everything right with a
battery of tests:

$aControllerTests = [
   ['Forum','Post','123'],
   ['Blog','Post','123'],
   ['Article','acb'],
   ['Cart','Product','723'],
   ['Cart','Category','Jeans']
];

$aTestResults = [];
foreach($aControllerTests as $TestID => $GetVars){

   $oSandbox = new SPLSandBox();
   $oSandbox->MockGlobal('$_GET',$GetVars);
   $oController = $oSandbox->GetInstance('FrontController');
   $aTestResults[$TestID] = $oController->Init();
   $oSandbox->Destroy();
}

SPLSandBox::MockGlobal() lets us set global variables (including super
globals) inside the sandbox.

Now, $aTestResults contains the results of each test, run with separate
$_GET parameters.

With this structure, you could get **every** valid URL from a database
and run a unit test with custom $_GET params on your FrontController to
make sure everything works.
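
One detail the loop above leaves open: the test cases are positional lists, while the FrontController expects the keys 'a', 'b' and 'c'. A small helper could bridge the two before the array is handed to MockGlobal(). `SegmentsToGet()` is a hypothetical name, and the assumption that keys are assigned in URL order is mine.

```php
<?php
// Hypothetical helper: turn ['Forum','Post','123'] into
// ['a' => 'Forum', 'b' => 'Post', 'c' => '123'].
function SegmentsToGet(array $Segments): array
{
    $Keys = ['a', 'b', 'c'];
    // Ignore extra segments, and use only as many keys as there are segments,
    // so that two-part URLs such as /Article/acb still work.
    $Segments = array_slice(array_values($Segments), 0, count($Keys));
    return array_combine(array_slice($Keys, 0, count($Segments)), $Segments);
}

$Full  = SegmentsToGet(['Forum', 'Post', '123']);
$Short = SegmentsToGet(['Article', 'acb']);
```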

On Tue, Aug 6, 2024, at 10:41, Nick Lockheart wrote:

Hey Nick,

This looks quite valuable, and I assume auto loading would work just like normal? Register an autoloader that will eventually require the file and call this function?

It would be nice to provide a simplified api as well, maybe “CopyCurrentEnvironment()” or something? In most cases, it is easier/faster to find things to remove vs. adding everything on every plugin/request every time.

In saying that, it would be great if there was an api for “sharing” a base-sandbox pool via shm (or similar to a pool) so that the base vm doesn’t need to be recreated potentially hundreds of times per request.

— Rob

On Aug 6, 2024, at 2:09 AM, Nick Lockheart <lists@ageofdream.com> wrote:

Sand Box: A first class API that allows unit testing of code with mocks
and stubs of other classes or functions, without the need to modify the
class under test.

This is an initial idea of how a Sand Box API could work:

$oSandbox = new SPLSandBox();

$oSandbox->MockFunction('\mocks\fopen','\fopen');
$oSandbox->MockFunction('\mocks\fread','\fread');
$oSandbox->MockFunction('\mocks\fwrite','\fwrite');
$oSandbox->MockFunction('\mocks\fclose','\fclose');

$oFileManager = $oSandbox->GetInstance('MyFileManager');

$oFileManager->WriteFile('/path/to/file.txt');

On the surface, this sounds like a good idea.

This is already possible to do in userland, with a few edge-cases. The edge-cases are:

1. Anything that uses `self::class` or similar will break if it expects `MyFileManager` to be returned instead of something like `MyFileManager_4k2x8j`.

2. Anything that expects a static variable in `MyFileManager` to have a pre-existing value set by an earlier use of `MyFileManager` will not work as expected. However, expecting that would be against testing best practices so I don't see this as a real concern given your use-case.

OTOH, a userland implementation will also not be very performant when compared to a potential PHP core implementation, making it less than ideal for testing.

However, doing a userland implementation would be a good proof-of-concept, allow others to try it, allow others to contribute to the exact syntax and semantics, and finally a userland implementation could reveal any potential hidden issues in the design before moving on to a proper implementation in C for PHP core.

-Mike

P.S. If you are unfamiliar with how to implement this in userland, you can use the same techniques I used in my proof-of-concept for Userland Packages (GitHub: mikeschinkel/userland-packages). If that link is not enough and you instead want to ask specific questions about how to implement in PHP, feel free to contact me off-list.

> This looks quite valuable, and I assume auto loading would work just
> like normal? Register an autoloader that will eventually require the
> file and call this function?
>
> It would be nice to provide a simplified api as well, maybe
> “CopyCurrentEnvironment()” or something? In most cases, it is
> easier/faster to find things to remove vs. adding everything on every
> plugin/request every time.
>
> In saying that, it would be great if there was an api for “sharing” a
> base-sandbox pool via shm (or similar to a pool) so that the base vm
> doesn’t need to be recreated potentially hundreds of times per
> request.

I didn't want to be too overwhelming on the first post, but since it
seems the feedback is positive, here's a more complete list of what I
think should be included:

// Passthroughs:

// Make all user and built-in global functions
// available inside the sandbox:
SPLSandBox::PassGlobalFunctions();

// Make all built-in (but not user) functions
// available inside the sandbox:
SPLSandBox::PassBuiltInFunctions();

// Make all built-in (but not user) functions
// available inside the sandbox, EXCEPT blacklisted functions:
SPLSandBox::PassBuiltInFunctionsExcept(['eval','exit']);

(assuming exit becomes a function).

// Allow only specific functions to be called (whitelist method):
$aWhiteList = ['array_key_exists','in_array'];
SPLSandBox::PassFunctions($aWhiteList);

// Allow specific classes to be used by sandbox code:
$aClassList = ['\MyAPP\PluginAPI'];
SPLSandBox::PassClasses($aClassList);

// Allow specific constants to be seen by sandbox code:
SPLSandBox::PassConstants(['\DB_USERNAME','\DB_PASSWORD']);
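
For a feel of what the "all built-ins except" list would contain, the set can be computed in userland today. This is only a sketch of the bookkeeping, not of enforcement: `AllowedBuiltIns()` is a hypothetical helper, and an actual sandbox would need engine support to apply the list. Note that `eval` is a language construct rather than a function, so it never appears in the function table and would have to be blocked by other means.

```php
<?php
// Hypothetical helper: list the built-in functions a sandbox would expose.
function AllowedBuiltIns(array $Except = []): array
{
    // 'internal' holds the names of all built-in functions, in lowercase.
    $BuiltIns = get_defined_functions()['internal'];
    $Except = array_map('strtolower', $Except);
    return array_values(array_diff($BuiltIns, $Except));
}

$Allowed = AllowedBuiltIns(['eval', 'exit', 'exec']);

$HasStrlen = in_array('strlen', $Allowed, true); // still available
$HasExec   = in_array('exec', $Allowed, true);   // filtered out
```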

// Language Construct Callbacks:

The callbacks allow the outer code to control and monitor certain
language features of the sandboxed code during execution.

// Called when the sandbox code tries to include or require something:
SPLSandBox::RegisterIncludeHandler();

// Includes a file into the sandbox:
SPLSandBox::Include('path/to/file.php');

// Your sandbox autoloader logic could be incorporated here:
SPLSandBox::RegisterAutoLoadHandler();

// But, for unit testing with mocks and stubs,
// it might be better to use:
SPLSandBox::RegisterNewHandler();

The NewHandler callback is called every time sandboxed code tries to
instantiate an object with `new`.

// Example: Override what `new` returns to code running in the sandbox:
function MyNewHandler(string $ClassName, array $aConstructorArgs){

   if($ClassName === '\DateTime'){
      return new FakeDate();
   }
   return new $ClassName(...$aConstructorArgs);
}

// Every time a sandboxed class calls a method, call this first:
SPLSandBox::RegisterMethodCallHandler();

Useful for unit testing to monitor if the tested class is calling the
methods it should be calling. Ignores visibility rules.

Could also allow for infinite recursion detection from the outside.

// The companion for static method calls, gets called
// every time a method is called on a class statically:
SPLSandBox::RegisterStaticMethodCallHandler();

// Each time a sandboxed loop iterates, call this first:
// Allows the outer code to put limit breaks on the sandboxed code.
SPLSandBox::RegisterLoopHandler();

The callback takes the type of loop, and the variables that make up the
loop ($i for for(), $Key => $value for foreach(), etc)
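
As a sketch of the kind of limit break a loop handler could impose, here is a userland guard that throws after a fixed number of calls. `MakeLoopGuard()` is a hypothetical name, and in this sketch the loop has to call the guard itself; under the proposal the engine would invoke the registered handler on every iteration.

```php
<?php
// Hypothetical guard: a callable that throws once it has been invoked too often.
function MakeLoopGuard(int $MaxIterations): callable
{
    $Count = 0;
    return function () use (&$Count, $MaxIterations): void {
        if (++$Count > $MaxIterations) {
            throw new RuntimeException("Loop limit of {$MaxIterations} exceeded");
        }
    };
}

$Guard = MakeLoopGuard(1000);
$Iterations = 0;
$Message = null;

try {
    while (true) {      // runaway loop standing in for sandboxed code
        $Guard();       // the engine would make this call under the proposal
        $Iterations++;
    }
} catch (RuntimeException $e) {
    $Message = $e->getMessage();
}
```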

// If the sandboxed code calls echo, print, or
// causes any output to occur (i.e. outside of <?php tags):
SPLSandBox::RegisterEchoHandler();

Could be used to make sure templates behave as desired, but perhaps
even more useful, it lets you *fail* unit tests if any output occurs
from a test that shouldn't produce output.

i.e., catch echo statements used in testing, or whitespace after a
closing ?> tag.
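
Something close to this check is already possible with output buffering, which may help when comparing the proposal against current practice. The sketch below uses only existing functions (`ob_start()`, `ob_get_clean()`); `RunAndCaptureOutput()` is a hypothetical helper name. Unlike the proposed handler, it cannot report which file and line produced the output.

```php
<?php
// Hypothetical helper: run a callable and capture anything it prints.
function RunAndCaptureOutput(callable $Code): array
{
    ob_start();
    try {
        $Result = $Code();
    } finally {
        // Always close the buffer, even if $Code throws.
        $Output = ob_get_clean();
    }
    return ['result' => $Result, 'output' => $Output];
}

$Quiet = RunAndCaptureOutput(fn () => 2 + 2);
$Noisy = RunAndCaptureOutput(function () {
    echo 'debug: got here';   // leftover debugging output
    return 2 + 2;
});

// A test runner could fail any test whose 'output' is not an empty string.
$ShouldFail = $Noisy['output'] !== '';
```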

// If the sandbox code tries to use `exit` or `die`,
// call this function instead:
SPLSandBox::RegisterExitHandler();

You'll probably want to destroy the sandbox from the outside (see
below), rather than letting sandboxed code halt the test framework or
main application.

// If sandboxed code throws, it should *not*
// be a throw in the outer application space.
// Every exception throw triggers this callback,
// even if there is a catch block:
SPLSandBox::RegisterExceptionHandler();

// When a catch block runs, invoke this callback first:
SPLSandBox::RegisterCatchHandler();

Allows unit tests to make sure that exceptions are handled correctly.

// For non-Exceptions (warning, notice, deprecated, fatal,
// and yes, maybe even parse because we're in a sandbox):
SPLSandBox::RegisterErrorHandler();

// Inside your Handlers, you may want to know the file and line
// that triggered the callback.
SPLSandBox::GetCurrentLine();
SPLSandBox::GetFileName();
SPLSandBox::GetClassName();
SPLSandBox::GetFunctionName();

For example, if you want to catch any echo left behind from debugging,
you might also want to find the line and file where the statement is
located to remove it. The above methods would be usable inside any of
the callbacks.

// Inside your Handlers, abort execution:
SPLSandBox::Stop();

// Resource Limits, hopefully self explanatory:
SPLSandBox::SetMemoryLimit();
SPLSandBox::SetExecutionTimeLimit();

// Mocks, Stubs:

// Put a function from the outer application
// into the sandbox as the specified name:
SPLSandBox::MockFunction('\mocks\fopen','\fopen');

// Put a class from the outer application
// into the sandbox as the specified name:
SPLSandBox::MockClass('\Mocks\FakeTime','\DateTime');

// Set global variables (and super globals) inside the sandbox:
SPLSandBox::MockGlobal('$_GET',$aGetVars);

// Set global constants inside the sandbox:
SPLSandBox::MockConstant('MY_CONSTANT',$Value);

// Invocation

// You can run procedural code to set up your test environment.
// Runs the array of code lines in the sandbox context.
// Single quotes stop the outer code from interpolating $a, $b, and $c:
$aProceduralCode = [
   '$a = 1;',
   '$b = 2;',
   '$c = DoSomething($a, $b);'
];

SPLSandBox::Procedure($aProceduralCode);

// You can get a pointer to an object instantiated in the sandbox:
$oClass = SPLSandBox::GetInstance('ClassName');

// And use it like you normally would:
$oClass->DoSomething();

This class runs entirely in the sandbox.

// You clean up the resource with:
SPLSandBox::Destroy();

Hey Nick,

Looking forward to the RFC!

On Tue, Aug 6, 2024, at 19:28, Nick Lockheart wrote:

> // Passthroughs:
>
> // Make all user and built-in global functions
> // available inside the sandbox:
> SPLSandBox::PassGlobalFunctions();

Bike shed: maybe have PassGlobals() instead? Or rather, PassNamespace and have \ be a valid namespace.

> // Make all built-in (but not user) functions
> // available inside the sandbox:
> SPLSandBox::PassBuiltInFunctions();
>
> // Make all built-in (but not user) functions
> // available inside the sandbox, EXCEPT blacklisted functions:
> SPLSandBox::PassBuiltInFunctionsExcept(['eval','exit']);
>
> (assuming exit becomes a function).

I feel like exit (function or not) should just return from the sandbox and shouldn’t be disable-able. For example, a plugin might detect a valid etag, set the status to 304, and exit.

> // Allow only specific functions to be called (whitelist method):
> $aWhiteList = ['array_key_exists','in_array'];
> SPLSandBox::PassFunctions($aWhiteList);

Might be a good idea to combine these two? Allow passing a whitelist AND a blacklist? Are these supposed to be static or on an instance of a sandbox?

> // Allow specific classes to be used by sandbox code:
> $aClassList = ['\MyAPP\PluginAPI'];
> SPLSandBox::PassClasses($aClassList);
>
> // Allow specific constants to be seen by sandbox code:
> SPLSandBox::PassConstants(['\DB_USERNAME','\DB_PASSWORD']);
>
> // Language Construct Callbacks:
>
> The callbacks allow the outer code to control and monitor certain
> language features of the sandboxed code during execution.
>
> // Called when the sandbox code tries to include or require something:
> SPLSandBox::RegisterIncludeHandler();

Does including a file from outside the sandbox (next call) call this handler?

> // Includes a file into the sandbox:
> SPLSandBox::Include('path/to/file.php');
>
> // Your sandbox autoloader logic could be incorporated here:
> SPLSandBox::RegisterAutoLoadHandler();
>
> // But, for unit testing with mocks and stubs,
> // it might be better to use:
> SPLSandBox::RegisterNewHandler();
>
> The NewHandler callback is called every time sandboxed code tries to
> instantiate an object with `new`.

Why not use the current autoload logic?

> // Example: Override what `new` returns to code running in the sandbox:
> function MyNewHandler(string $ClassName, array $aConstructorArgs){
>
>    if($ClassName === '\DateTime'){
>       return new FakeDate();
>    }
>    return new $ClassName(...$aConstructorArgs);
> }
>
> // Every time a sandboxed class calls a method, call this first:
> SPLSandBox::RegisterMethodCallHandler();
>
> Useful for unit testing to monitor if the tested class is calling the
> methods it should be calling. Ignores visibility rules.
>
> Could also allow for infinite recursion detection from the outside.

I think this is handled automatically now.

Looks pretty exciting and useful! Since you’re imagining it being a part of SPL, why not implement this in its own extension? It looks like the pecl process is pretty convoluted to get an extension listed there, but many popular extensions live outside of pecl too.

— Rob

On 06/08/2024 10:41, Nick Lockheart wrote:

Sandbox: Security

A SandBox has two use cases:

1. Unit Testing of code with mocks or stubs, and also, allowing testing
with different environments.

2. The secure running of 3rd party code inside a 1st party application.

The use-case of securely running 3rd party code inside your application is impossible at this moment, and will still be impossible after a sandbox API is introduced.
The reason is that the PHP interpreter as it is today is not memory safe. It is relatively easy to cause memory corruption by only using PHP code by abusing things like custom error handlers set from userland. This in turn can be used to gain arbitrary read/write primitives which has been shown to circumvent disable_functions & open_basedir, and some PoCs can even run arbitrary commands. It would be doable to extend these tricks to circumvent a sandboxing API.
As such, a sandboxing API for securely executing 3rd party code is only possible after the interpreter has become memory safe.
Although some work has been done in PHP 8.3 to plug many of these memory safety bugs in the VM, much more work remains and would likely require complicated changes.
I therefore propose focusing only on the mocking functionality of your proposal for now, until the interpreter is memory safe.
For the same reason, I would not call it "sandbox".

Introducing a sandbox API for security also opens up a can of worms for the security policy.
Right now we are assuming an attacker model of a remote attacker, and that the code running on your server is trusted.
But that would change when an official sandbox API is introduced.

Kind regards
Niels

On Tue, Aug 6, 2024, at 20:59, Niels Dossche wrote:

Hey Niels,

I find this assertion kind of scary from a shared hosting perspective or even from a 3v4l kind of perspective. How do these services protect themselves if php is inherently insecure?

— Rob

On 06/08/2024 21:05, Rob Landers wrote:

Hey Niels,

I find this assertion kind of scary from a shared hosting perspective or even from a 3v4l kind of perspective. How do these services protect themselves if php is inherently insecure?

— Rob

Hi Rob

I'm not a sysadmin guy or anything like that, so I don't know what shared hosting stacks look like in practice.
But containers, chroot jails, seccomp-bpf, ... can offer protection. And you should be doing those things anyway (as a matter of defense-in-depth) if you're offering servers running untrusted code.

Kind regards
Niels

On Tue, 2024-08-06 at 20:51 +0200, Rob Landers wrote:

Hey Nick,

Looking forward to the RFC!

On Tue, Aug 6, 2024, at 19:28, Nick Lockheart wrote:
> >
> > This looks quite valuable, and I assume auto loading would work just
> > like normal? Register an autoloader that will eventually require the
> > file and call this function?
> >
> > It would be nice to provide a simplified api as well, maybe
> > “CopyCurrentEnvironment()” or something? In most cases, it is
> > easier/faster to find things to remove vs. adding everything on every
> > plugin/request every time.
> >
> > In saying that, it would be great if there was an api for “sharing” a
> > base-sandbox pool via shm (or similar to a pool) so that the base vm
> > doesn’t need to be recreated potentially hundreds of times per
> > request.
> >
>
> I didn't want to be too overwhelming on the first post, but since it
> seems the feedback is positive, here's a more complete list of what I
> think should be included:
>
>
> // Passthroughs:
>
> // Make all user and built-in global functions
> // available inside the sandbox:
> SPLSandBox::PassGlobalFunctions();

Bike shed: maybe have PassGlobals() instead? Or rather, PassNamespace
and have \ be a valid namespace.

That's not a bad suggestion, but I was concerned that there isn't a way
to grab everything in a namespace. I suppose if the class is already
registered then a "pass namespace" would work for everything registered
at the time of the call, but wouldn't work with global auto load after
the call.

>
> // Make all built-in (but not user) functions
> // available inside the sandbox:
> SPLSandBox::PassBuiltInFunctions();
>
>
> // Make all built-in (but not user) functions
> // available inside the sandbox, EXCEPT blacklisted functions:
> SPLSandBox::PassBuiltInFunctionsExcept(['eval','exit']);
>
> (assuming exit becomes a function).

I feel like exit (function or not) should just return from the
sandbox and shouldn’t be disable-able. For example, a plugin might
detect a valid etag, set a 304 status, and exit.

>
>
> // Allow only specific functions to be called (whitelist method):
> $aWhiteList = ['array_key_exists','in_array'];
> SPLSandBox::PassFunctions($aWhiteList);

Might be a good idea to combine these two? Allow passing a whitelist
AND a blacklist? Are these supposed to be static or on an instance of
a sandbox?

I'm using static notation, but the intent is that they are methods
called on an instance.

Re: combining, it's more of a style preference. I like naming things:

PassBuiltInFunctionsExcept()

for example, because you can read the code without comments and
understand what it is doing, without documentation.

Any time you can name things so the code reads like a sentence is a win
in my book.
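
As a rough illustration of what an "except" style pass list could resolve
to, here is a minimal userland sketch. It only computes a list of names and
does not sandbox anything. The helper name BuiltInFunctionsExcept() is
hypothetical and not part of the proposal; get_defined_functions() is
existing PHP.

```php
<?php
// Hypothetical helper: returns the names of all built-in functions
// except those in $aBlackList. It only builds a list of names.
function BuiltInFunctionsExcept(array $aBlackList): array
{
    // get_defined_functions() reports internal function names in lowercase.
    $aInternal = get_defined_functions()['internal'];
    $aBlocked  = array_map('strtolower', $aBlackList);

    return array_values(array_diff($aInternal, $aBlocked));
}

$aAllowed = BuiltInFunctionsExcept(['fopen', 'unlink']);
```

The resulting array is the kind of whitelist that could then be handed to
something like PassFunctions(), assuming that method takes a plain list of
names as shown in the thread.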

>
> // Allow specific classes to be used by sandbox code:
> $aClassList = ['\MyAPP\PluginAPI'];
> SPLSandBox::PassClasses($aClassList);
>
>
> // Allow specific constants to be seen by sandbox code:
> SPLSandBox::PassConstants(['\DB_USERNAME','\DB_PASSWORD']);
>
>
>
> // Language Construct Callbacks:
>
> The callbacks allow the outer code to control and monitor certain
> language features of the sandboxed code during execution.
>
> // Called when the sandbox code tries to include or require something:
> SPLSandBox::RegisterIncludeHandler();

Does including a file from outside the sandbox (next call) call this
handler?

No, the callback monitors what the sandbox code is trying to do and can
allow, deny, or modify it.

>
> // Includes a file into the sandbox:
> SPLSandBox::Include('path/to/file.php');
>
> // Your sandbox autoloader logic could be incorporated here:
> SPLSandBox::RegisterAutoLoadHandler();
>
>
> // But, for unit testing with mocks and stubs,
> // it might be better to use:
> SPLSandBox::RegisterNewHandler();
>
> The NewHandler callback is called every time sandboxed code tries to
> instantiate an object with `new`.

Why not use the current autoload logic?

For unit testing purposes, you can intercept calls to `new` from the
code under test and return a mock object.

For example, if your class assembles SQL queries and sends them to a
PDO object, you can give the SQL class a dummy PDO to test that it's
building queries correctly, without needing a database available in
your build environment while the test is being run.

This tool is optional, like most of the callbacks, so you just
implement the ones you need. If you don't define a NewHandler, then the
sandbox code runs the normal `new` behavior, with or without a sandbox
auto loader.
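
To make the dummy database idea concrete, here is a minimal sketch that
emulates the interception in plain PHP by calling the handler directly.
Nothing here uses the proposed API: RecordingConnection is a hypothetical
stand-in class, and the direct calls to MyNewHandler() only imitate what a
sandbox might do when sandboxed code runs `new`.

```php
<?php
// Hypothetical stand-in for the database object; it records queries
// instead of sending them anywhere.
final class RecordingConnection
{
    public array $aQueries = [];

    public function query(string $Sql): bool
    {
        $this->aQueries[] = $Sql;
        return true;
    }
}

// Same signature as the handler example in this thread.
function MyNewHandler(string $ClassName, array $aConstructorArgs): object
{
    // Accept the class name with or without a leading backslash.
    if (ltrim($ClassName, '\\') === 'PDO') {
        return new RecordingConnection();
    }
    // Everything else is constructed normally, unpacking the arguments.
    return new $ClassName(...$aConstructorArgs);
}

// Emulates sandboxed code running `new \PDO('sqlite::memory:')`.
$oDb = MyNewHandler('\PDO', ['sqlite::memory:']);
$oDb->query('SELECT 1');

// Emulates sandboxed code running `new \ArrayObject([1, 2, 3])`.
$oList = MyNewHandler('\ArrayObject', [[1, 2, 3]]);
```

A test can then inspect $oDb->aQueries to check the SQL that was built,
without a database being available.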

> // Example: Override what `new` returns to code running in the sandbox:
> function MyNewHandler(string $ClassName, array $aConstructorArgs){
>
>     if($ClassName === '\DateTime'){
>         return new FakeDate();
>     }
>     // Unpack the argument array into the constructor call:
>     return new $ClassName(...$aConstructorArgs);
> }
>
>
> // Every time a sandboxed class calls a method, call this first:
> SPLSandBox::RegisterMethodCallHandler();
>
> Useful for unit testing to monitor if the tested class is calling the
> methods it should be calling. Ignores visibility rules.
>
> Could also allow for infinite recursion detection from the outside.

I think this is handled automatically now.

>
>
> // The companion for static method calls, gets called
> // every time a method is called on a class statically:
> SPLSandBox::RegisterStaticMethodCallHandler();
>
>
>
> // Each time a sandboxed loop iterates, call this first:
> // Allows the outer code to put limit breaks on the sandboxed code.
> SPLSandBox::RegisterLoopHandler();
>
> The callback takes the type of loop, and the variables that make up the
> loop ($i for for(), $Key => $value for foreach(), etc.)
>
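
A loop handler used as a limit break might look roughly like the sketch
below. The handler signature (loop type plus an array of loop variables) is
an assumption based on the description above, MakeLoopHandler() and
LoopLimitExceeded are hypothetical names, and the handler is called by hand
inside the loop because no engine hook exists today.

```php
<?php
// Hypothetical exception used to break out of a runaway loop.
final class LoopLimitExceeded extends RuntimeException {}

// Builds a handler that throws once it has been called more than
// $MaxIterations times.
function MakeLoopHandler(int $MaxIterations): callable
{
    $Count = 0;

    return function (string $LoopType, array $aLoopVars) use (&$Count, $MaxIterations): void {
        if (++$Count > $MaxIterations) {
            throw new LoopLimitExceeded("Loop limit reached in {$LoopType} loop");
        }
    };
}

$fnLoopHandler = MakeLoopHandler(1000);
$Stopped = false;

try {
    // Endless loop standing in for misbehaving sandboxed code.
    for ($i = 0; ; $i++) {
        $fnLoopHandler('for', ['i' => $i]);
    }
} catch (LoopLimitExceeded $e) {
    $Stopped = true;
}
```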
>
> // If the sandboxed code calls echo, print, or
> // causes any output to occur (i.e. outside of <?php tags):
> SPLSandBox::RegisterEchoHandler();
>
> Could be used to make sure templates behave as desired, but perhaps
> even more useful, it lets you *fail* unit tests if any output occurs
> from a test that shouldn't produce output.
>
> E.g. catch echo statements used in testing, or whitespace after a
> closing ?> tag.
>
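
Part of this can be approximated today with output buffering. The sketch
below is not the proposed echo handler: it only captures output after the
fact and cannot report the file and line that produced it.
RunAndCaptureOutput() is a hypothetical helper; ob_start() and
ob_get_clean() are existing PHP functions.

```php
<?php
// Hypothetical helper: runs $fnTest and returns whatever it printed.
function RunAndCaptureOutput(callable $fnTest): string
{
    ob_start();
    try {
        $fnTest();
    } finally {
        // Always close the buffer, even if the test throws.
        $Output = (string) ob_get_clean();
    }
    return $Output;
}

// A test that produces no output.
$Quiet = RunAndCaptureOutput(function (): void {
    strlen('abc');
});

// A test with a stray echo left behind from debugging.
$Noisy = RunAndCaptureOutput(function (): void {
    echo 'debug';
});
```

A test runner could fail any test whose captured string is not empty.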
>
> // If the sandbox code tries to use `exit` or `die`,
> // call this function instead:
> SPLSandBox::RegisterExitHandler();
>
> You'll probably want to destroy the sandbox from the outside (see
> below), rather than letting sandboxed code halt the test framework or
> main application.
>
>
> // If sandboxed code throws, it should *not*
> // be a throw in the outer application space.
> // Every exception throw triggers this callback,
> // even if there is a catch block:
> SPLSandBox::RegisterExceptionHandler();
>
> // When a catch block runs, invoke this callback first:
> SPLSandBox::RegisterCatchHandler();
>
> Allows unit tests to make sure that exceptions are handled correctly.
>
>
> // For non-Exceptions (warning, notice, deprecated, fatal,
> // and yes, maybe even parse because we're in a sandbox):
> SPLSandBox::RegisterErrorHandler();
>
>
>
> // Inside your Handlers, you may want to know the file and line
> // that triggered the callback.
> SPLSandBox::GetCurrentLine();
> SPLSandBox::GetFileName();
> SPLSandBox::GetClassName();
> SPLSandBox::GetFunctionName();
>
> For example, if you want to catch any echo left behind from debugging,
> you might also want to find the line and file where the statement is
> located to remove it. The above methods would be usable inside any of
> the callbacks.
>
>
> // Inside your Handlers, abort execution:
> SPLSandBox::Stop();

>
>
> // Resource Limits, hopefully self explanatory:
> SPLSandBox::SetMemoryLimit();
> SPLSandBox::SetExecutionTimeLimit();
>
>
> // Mocks, Stubs:
>
> // Put a function from the outer application
> // into the sandbox as the specified name:
> SPLSandBox::MockFunction('\mocks\fopen','\fopen');
>
>
> // Put a class from the outer application
> // into the sandbox as the specified name:
> SPLSandBox::MockClass('\Mocks\FakeTime','\DateTime');
>
>
> // Set global variables (and super globals) inside the sandbox:
> SPLSandBox::MockGlobal('$_GET',$aGetVars);
>
>
> // Set global constants inside the sandbox:
> SPLSandBox::MockConstant('MY_CONSTANT',$Value);
>
>
>
> // Invocation
>
> // You can run procedural code to setup your test environment.
> // Runs the array of code lines in the sandbox context:
> // Single quotes keep the outer script from interpolating $a and $b:
> $aProceduralCode = [
>     '$a = 1;',
>     '$b = 2;',
>     '$c = DoSomething($a, $b);'
> ];
>
> SPLSandBox::Procedure($aProceduralCode);
>
>
> // You can get a pointer to an object instantiated in the sandbox:
> $oClass = SPLSandBox::GetInstance('ClassName');
>
> // And use it like you normally would:
> $oClass->DoSomething();
>
> This class runs entirely in the sandbox.
>
>
> // You cleanup the resource with:
> SPLSandBox::Destroy();
>

Looks pretty exciting and useful! Since you’re imagining it being a
part of SPL, why not implement this in its own extension? It looks
like the pecl process is pretty convoluted to get an extension listed
there, but many popular extensions live outside of pecl too.

— Rob

I think this would need to be a team effort.

Some of this requires doing things that I'm not sure how to do. For
example, I don't know how to make a userland callback get invoked when
`new` is called and have it inject a replacement object back into the
calling code.

It seems like this would need to be part of core.

On the other hand, this could be the defining feature of whatever PHP
version it was included with.

A security framework for third party extensions and a first class unit
testing framework would benefit almost everyone using PHP.

On Wed, Aug 7, 2024, 2:11 AM Rob Landers rob@bottled.codes wrote:

I find this assertion kind of scary from a shared hosting perspective or even from a 3v4l kind of perspective. How do these services protect themselves if php is inherently insecure?

PHP is not inherently insecure. Not even remotely; quite the opposite.

Shared hosting is.

This issue is not specific to PHP; almost all languages out there will have the same memory (or other) challenges.

Crypto APIs or similar features requiring a high level of data safety use various techniques to mitigate it (zeroing after use, decrypting memory only on demand, etc.).

A bit off topic, but with the solutions out there for VPS, etc., shared hosting for anything requiring data safety should be avoided like the plague.

About this feature: at first glance, it looks like an advanced, complex version of safe mode/open_basedir, with additional features. I have never had to mock core functions for testing; if it were needed, I would suspect a design issue.

But I may be wrong, that would not be a first :)

cheers,
Pierre

On 06.08.2024 at 20:59, Niels Dossche wrote:

On 06/08/2024 10:41, Nick Lockheart wrote:

Sandbox: Security

A SandBox has two use cases:

1. Unit Testing of code with mocks or stubs, and also, allowing testing
with different environments.

2. The secure running of 3rd party code inside a 1st party application.

The use-case of securely running 3rd party code inside your application is impossible at this moment, and will still be impossible after a sandbox API is introduced.
The reason is that the PHP interpreter as it is today is not memory safe. It is relatively easy to cause memory corruption by only using PHP code by abusing things like custom error handlers set from userland. This in turn can be used to gain arbitrary read/write primitives which has been shown to circumvent disable_functions & open_basedir, and some PoCs can even run arbitrary commands. It would be doable to extend these tricks to circumvent a sandboxing API.
As such, a sandboxing API for securely executing 3rd party code is only possible after the interpreter has become memory safe.
Although some work has been done in PHP 8.3 to plug many of these memory safety bugs in the VM, much more work remains and would likely require complicated changes.
So therefore I propose to only focus on the mocking functionality of your proposal for now, until the time comes that the interpreter is memory safe.
I would therefore also not call it "sandbox".

I concur. The old PECL runkit package did provide a "sandbox" feature,
but that has not been ported to the PECL runkit7 package, possibly for
exactly these reasons.

Introducing a sandbox API for security also opens up a can of worms for the security policy.
Right now we are assuming an attacker model of a remote attacker, and that the code running on your server is trusted.
But that would change when an official sandbox API is introduced.

Christoph

>
> Introducing a sandbox API for security also opens up a can of worms
> for the security policy. Right now we are assuming an attacker
> model of a remote attacker, and that the code running on your
> server is trusted. But that would change when an official sandbox
> API is introduced.
>
> Kind regards
> Niels
>
Hey Niels,

I find this assertion kind of scary from a shared hosting perspective
or even from a 3v4l kind of perspective. How do these services
protect themselves if php is inherently insecure?

— Rob

So I was thinking about this a bit more and I thought, what if instead
of adding a sandbox as a feature of PHP, what if PHP *was* the sandbox.

So consider this:

What if the PHP engine added a C API that lets C/C++ programs not only
spin up and run PHP, but those C/C++ programs could also control and
monitor the execution of the PHP environment from the outside.

That would essentially make every instance of PHP a sandbox.

But now, we would be able to control, monitor, and override certain
behavior from the outside while script execution runs.

This gives us a foundation to do two different things.

** Thing 1: A PHP extension for the PHP C API **

PHP's C API could then be controlled by a PHP script by using a PHP
extension that uses the PHP API. This would allow PHP scripts to spin
up and control instances of PHP, running in their own execution
context.

This meets the use case of Unit Testing scripts and secure execution of
third party plugins by PHP applications.

** Thing 2: C/C++ programs could use PHP as a library **

With a comprehensive C API to PHP, we could build C/C++ applications
that can use PHP scripts as a plugin.

Let's consider a use case for a social media site or ecommerce platform
that is high traffic and written in PHP.

What if we moved the front controller logic to a C application that was
built as an Nginx module?

Nginx modules can be statically linked, so now our front controller,
written in C, would be native inside Nginx, usable in location blocks.

Thanks to the C API for PHP, our front-controller-as-Nginx-plugin can
directly invoke parts of the PHP application as needed, and inject
resources directly into the PHP environment.

This isn't the same as a FastCGI pass. This moves the routing, session
setup, and other redundant code into a native C application that's
actually part of the whole app.

So you could route requests, setup sessions, and serve cached pages as
fast as Nginx can process a request (about 1ms).

When the C land front controller needs to invoke PHP, it invokes PHP
through PHP's C API that lets the front controller pass data directly
into the PHP instance.

For example, the C front controller application can register a database
class inside the PHP environment that's already primed with an open
database connection (saves 20ms).

Because the running PHP application is actually talking back and forth
with the C land front controller, the C land front controller can
remember things (including session details) for faster responses.

This hybrid approach gives you the best of both worlds: Very fast
response times for most requests because the front controller and
sessions and other resources are handled in native C land, inside the
web server, while also giving you the flexibility, ease of use, and
rapid prototyping of PHP.

On Wed, Aug 7, 2024, 7:13 PM Nick Lockheart <lists@ageofdream.com> wrote:

So I was thinking about this a bit more and I thought, what if instead
of adding a sandbox as a feature of PHP, what if PHP was the sandbox.

So consider this:

What if the PHP engine added a C API that lets C/C++ programs not only
spin up and run PHP, but those C/C++ programs could also control and
monitor the execution of the PHP environment from the outside.

Something similar is done in things like FrankenPHP (Go/Caddy/own SAPI) or NativePHP (desktop app, AFAIR Rust/Tauri).

Not the same goal, same starting point.

But I would stay away from replacing, or improving, OS security with my own things. Totally possible, but it is the kind of can of worms I don’t look forward to opening :)

On Aug 6, 2024, at 3:09 AM, Nick Lockheart <lists@ageofdream.com> wrote:

Sand Box: A first class API that allows unit testing of code with mocks
and stubs of other classes or functions, without the need to modify the
class under test.

This honestly feels like it's going to be a repeat of safe_mode.

What might be better is exposing OS things like Capsicum or unveil when
possible. Capabilities are probably the right way to deal with this kind
of thing, after all.