[PHP-DEV] Study on unsafe extract() usage

Hey everyone,

as part of our security research, we studied the attack surface of the extract() function when being called with user-controlled input. Calling extract($_GET) allows an attacker to overwrite all variables in the current scope, which could lead to vulnerabilities. We analyzed the use of extract() in 28325 open source projects on GitHub looking for cases where extract() is called with user-controlled input. extract() was called 4923 times and 146 of those calls involved user-input. The majority of findings were vulnerable due to being able to overwrite superglobals.

We would like to share the findings and start a discussion on if and how the risks of unsafe extract() usage could be reduced without breaking backwards compatibility. The vulnerable extract() calls could be exploited primarily by overwriting superglobals who's state is expected to be immutable by the user like $_SESSION or $_SERVER or empty like $_POST on a GET request. Similar to how changing $GLOBALS through extract() is already forbidden, we propose to block overwriting all superglobals through extract(). This would prevent most of the vulnerabilities found in the dataset and we cannot think of a valid use case for allowing this behavior. Here is where we rely on your feedback and experience.

The semantics of every EXTR_ flag need to be documented, considered, and understood properly as invalid usage could still leave the application vulnerable. Calling extract() in the global scope is particularly risky and the warning in the documentation could be expanded to describe the risks in more depth. Using EXTR_IF_EXISTS in the global scope for example would still allow overwriting superglobals instead of the documented intended use case of defining a list of valid variables to extract beforehand.

Since empty prefix strings are allowed, even EXTR_PREFIX_ALL would allow to overwrite superglobals since SESSION would become <empty prefix>_SESSION -> _SESSION. We propose to disallow the use of empty prefix strings in extract() to prevent this. No such code pattern was found in the dataset, but we wanted to mention it since an empty prefix string looks like a bug when using the EXTR_PREFIX_* flags.

You can find more details in our paper [0], especially the case studies of vulnerable usage scenarios and possible exploits in chapter 5. We propose further possible mitigations in section 6.4 that wouldn't fit in this email. If the proposed mitigations are not feasible, we hope the raw study results could be of use for further discussion on their own.

We are looking forward to your feedback and any discussion on this topic. The proposed changes aim to improve the security of PHP applications in the hopes to avoid breaking existing code.

Kind regards
Jannik Hartung

Institute for Application Security (IAS)
Technische Universität Braunschweig

[0] https://www.ias.tu-bs.de/publications/extract-woot-2025.pdf

This would prevent most of the

vulnerabilities found in the dataset and we cannot think of a valid use

case for allowing this behavior

But not all. This function is dangerous on its own. Attack vectors are 99% through superglobals but could come from other sources too. However, this function is handy when used correctly.

I agree with everything stated in the email. Using extract() on superglobals is definitely an incorrect usage and should be forbidden. If it’s something that we can do then we should do it as soon as possible even if it means breaking some poorly written code.

Empty prefix should be a bug and as such I recommend adding an error for this in PHP 8.5 without deprecation or RFC.

One more thing that would improve security is to change the default flag to EXTR_SKIP. It would be a major BC though so we could probably only do it in PHP 9.

Hey Kamil,

···

On 23.7.2025 21:01:14, Kamil Tekiela wrote:

This would prevent most of the

vulnerabilities found in the dataset and we cannot think of a valid use

case for allowing this behavior

But not all. This function is dangerous on its own. Attack vectors are 99% through superglobals but could come from other sources too. However, this function is handy when used correctly.

I agree with everything stated in the email. Using extract() on superglobals is definitely an incorrect usage and should be forbidden. If it’s something that we can do then we should do it as soon as possible even if it means breaking some poorly written code.

Empty prefix should be a bug and as such I recommend adding an error for this in PHP 8.5 without deprecation or RFC.

One more thing that would improve security is to change the default flag to EXTR_SKIP. It would be a major BC though so we could probably only do it in PHP 9.

I can agree on making EXTR_SKIP the default.

I do not agree that empty prefix is a bug though. That’s a common operation for templating for example.

Regarding extract on superglobals, I have some legacy code from PHP 3 times, which still lives on with the following lines of code:

// Emulate register_globals on
extract($_POST, EXTR_SKIP);
extract($_GET, EXTR_SKIP);

Yeah, not the most awesome. But it works. Obviously, you can work around that with foreach ($_GET as $key => $val) $$key = $val;. So, I’m not strictly against warning here, but it doesn’t necessarily mean the code is incorrect. Just not necessarily to todays standards.

Bob