With many complex functions, the most natural way to write them is with a recursive implementation. However, a recursive implementation can be slow in Python because recursive calls can be expensive. Is there a better way? Yes! We can remove the recursion and manage an explicit stack ourselves.
In this blog post, we will explain how the recursion to explicit stack optimization works, share an example of such an optimization merged into Pydantic, and discuss how we can automatically discover such optimizations for a codebase using Codeflash.

Being a dynamic language, in Python a function call can be expensive. Taking a bird’s eye view, this is because the following steps occur while calling a function:
Similarly while returning from a function, the following steps are involved:
An effective way of speeding up such functions is to remove these recursive function calls and manage the stack ourselves explicitly. Let's see how this optimization works by looking at some Pydantic code, exploring how we can find such optimizations on our codebases.
Pydantic is one of the most popular and important libraries in Python when it comes to defining data classes that self-validate. Since regular objects in Python don’t check for the validity of data types and values, Pydantic’s validating data classes have become an important component of building reliable Python software.
Internally, at a very high level, Pydantic does 2 things. The first step is to parse the given Pydantic dataclass definitions and convert them to a schema that can be validated later on. This code is written in Python. The second step involves using the schema definitions and validating the new Pydantic objects in runtime against that schema. This happens in Rust.
This was a recursive code in Pydantic schema code which collected the string associated $ref key in a deeply nested object. This is a very recursive code.
We used Codeflash to optimize this function. Codeflash works by using LLMs to generate possible optimizations for the code and then testing them for correctness and speedup. Here is the Pull Request that Codeflash created.
This is the optimized code that Codeflash found:
The main optimization is to introduce a stack variable and replace all recursive calls by adding the function argument to the stack. Moreover, for lists, instead of iterating through the list and calling a function recursively like this:
The optimization directly extends the stack like this:
This approach prevents several recursive calls in one statement!
Compared with a benchmark that Codeflash generated, this new stack-based approach was 34% faster than the recursive approach! This optimization was merged into Pydantic mainline code, speeding up projects wherever Pydantic is used!
You can re-create the same results by running codeflash —-file pydantic/json_schema.py —- function _get_all_json_refs on a local fork of Pydantic codebase before the optimization was merged in. If you are feeling adventurous, you can also optimize the whole of Pydantic’s codebase by running codeflash --all. Feel free to share the results with me or open a PR on the Pydantic project itself with the results you find!
Converting every recursive function to an explicit-stack approach may or may not result in performance optimization. Explicit stack may be a good approach for optimization if the function:
If the above conditions are not met then such an optimization may not be a good idea. I’d suggest you try this category of optimization for yourself to see if it makes your code faster. Of course, an easier approach would be to just let Codeflash do all the hard work to figure out the best way to optimize your code, including doing the kinds of optimizations discussed in this post.
We saw how this type of optimization significantly sped up a real Pydantic function. I encourage you to see if any such optimizations exist in your Python codebase by scanning it with Codeflash. Codeflash can also find all the other possible optimization types for your codebase. Getting started is super quick by running pip install codeflash on your project’s environment and then running codeflash init . More instructions to get started are here in our documentation.
Let’s make the world’s code faster. Happy optimizing 🚀
Join our newsletter and stay updated with the latest in performance optimization automation.


Join our newsletter and stay updated with fresh insights and exclusive content.
