pyinstrument - Man Page
pyinstrument 4.6.2
Pyinstrument
Pyinstrument is a Python profiler. A profiler is a tool to help you optimize your code - make it faster. To get the biggest speed increase you should focus on the slowest part of your program. Pyinstrument helps you find it!
☕️ Not sure where to start? Check out this video tutorial from calmcode.io!
User Guide
Installation
pip install pyinstrument
Pyinstrument supports Python 3.7+.
Profile a Python script
Call Pyinstrument directly from the command line. Instead of writing python script.py, type pyinstrument script.py. Your script will run as normal, and at the end (or when you press ^C), Pyinstrument will output a colored summary showing where most of the time was spent.
Here are the options you can use:
Usage: pyinstrument [options] scriptfile [arg] ... Options: --version show program's version number and exit -h, --help show this help message and exit --load-prev=ID instead of running a script, load a previous report -m MODULE_NAME run library module as a script, like 'python -m module' --from-path (POSIX only) instead of the working directory, look for scriptfile in the PATH environment variable -o OUTFILE, --outfile=OUTFILE save to <outfile> -r RENDERER, --renderer=RENDERER how the report should be rendered. One of: 'text', 'html', 'json', 'speedscope', or python import path to a renderer class -t, --timeline render as a timeline - preserve ordering and don't condense repeated calls --hide=EXPR glob-style pattern matching the file paths whose frames to hide. Defaults to '*/lib/*'. --hide-regex=REGEX regex matching the file paths whose frames to hide. Useful if --hide doesn't give enough control. --show=EXPR glob-style pattern matching the file paths whose frames to show, regardless of --hide or --hide-regex. For example, use --show '*/<library>/*' to show frames within a library that would otherwise be hidden. --show-regex=REGEX regex matching the file paths whose frames to always show. Useful if --show doesn't give enough control. --show-all show everything --unicode (text renderer only) force unicode text output --no-unicode (text renderer only) force ascii text output --color (text renderer only) force ansi color text output --no-color (text renderer only) force no color text output
Protip: -r html will give you a interactive profile report as HTML - you can really explore this way!
Profile a Python CLI command
For profiling an installed Python script via the "console_script" entry point, call Pyinstrument directly from the command line with the --from-path flag. Instead of writing cli-script, type pyinstrument --from-path cli-script. Your script will run as normal, and at the end (or when you press ^C), Pyinstrument will output a colored summary showing where most of the time was spent.
Profile a specific chunk of code
Pyinstrument also has a Python API. Just surround your code with Pyinstrument, like this:
from pyinstrument import Profiler profiler = Profiler() profiler.start() # code you want to profile profiler.stop() profiler.print()
If you get "No samples were recorded." because your code executed in under 1ms, hooray! If you still want to instrument the code, set an interval value smaller than the default 0.001 (1 millisecond) like this:
profiler = Profiler(interval=0.0001) ...
Experiment with the interval value to see different depths, but keep in mind that smaller intervals could affect the performance overhead of profiling.
Protip: To explore the profile in a web browser, use profiler.open_in_browser(). To save this HTML for later, use profiler.output_html().
Profile code in Jupyter/IPython
Via IPython magics, you can profile a line or a cell in IPython or Jupyter.
Example:
%load_ext pyinstrument
%%pyinstrument import time def a(): b() c() def b(): d() def c(): d() def d(): e() def e(): time.sleep(1) a()
To customize options, see %%pyinstrument??.
Profile a web request in Django
To profile Django web requests, add pyinstrument.middleware.ProfilerMiddleware to MIDDLEWARE in your settings.py.
Once installed, add ?profile to the end of a request URL to activate the profiler. Your request will run as normal, but instead of getting the response, you'll get pyinstrument's analysis of the request in a web page.
If you're writing an API, it's not easy to change the URL when you want to profile something. In this case, add PYINSTRUMENT_PROFILE_DIR = 'profiles' to your settings.py. Pyinstrument will profile every request and save the HTML output to the folder profiles in your working directory.
If you want to show the profiling page depending on the request you can define PYINSTRUMENT_SHOW_CALLBACK as dotted path to a function used for determining whether the page should show or not. You can provide your own function callback(request) which returns True or False in your settings.py.
def custom_show_pyinstrument(request): return request.user.is_superuser PYINSTRUMENT_SHOW_CALLBACK = "%s.custom_show_pyinstrument" % __name__
You can configure the profile output type using setting's variable PYINSTRUMENT_PROFILE_DIR_RENDERER. Default value is pyinstrument.renderers.HTMLRenderer. The supported renderers are pyinstrument.renderers.JSONRenderer, pyinstrument.renderers.HTMLRenderer, pyinstrument.renderers.SpeedscopeRenderer.
Profile a web request in Flask
A simple setup to profile a Flask application is the following:
from flask import Flask, g, make_response, request app = Flask(__name__) @app.before_request def before_request(): if "profile" in request.args: g.profiler = Profiler() g.profiler.start() @app.after_request def after_request(response): if not hasattr(g, "profiler"): return response g.profiler.stop() output_html = g.profiler.output_html() return make_response(output_html)
This will check for the ?profile query param on each request and if found, it starts profiling. After each request where the profiler was running it creates the html output and returns that instead of the actual response.
Profile a web request in FastAPI
To profile call stacks in FastAPI, you can write a middleware extension for pyinstrument.
Create an async function and decorate with app.middleware('http') where app is the name of your FastAPI application instance.
Make sure you configure a setting to only make this available when required.
from pyinstrument import Profiler PROFILING = True # Set this from a settings model if PROFILING: @app.middleware("http") async def profile_request(request: Request, call_next): profiling = request.query_params.get("profile", False) if profiling: profiler = Profiler(interval=settings.profiling_interval, async_mode="enabled") profiler.start() await call_next(request) profiler.stop() return HTMLResponse(profiler.output_html()) else: return await call_next(request)
To invoke, make any request to your application with the GET parameter profile=1 and it will print the HTML result from pyinstrument.
Profile a web request in Falcon
For profile call stacks in Falcon, you can write a middleware extension using pyinstrument.
Create a middleware class and start the profiler at process_request and stop it at process_response. The middleware can be added to the app.
Make sure you configure a setting to only make this available when required.
from pyinstrument import Profiler import falcon class ProfilerMiddleware: def __init__(self, interval=0.01): self.profiler = Profiler(interval=interval) def process_request(self, req, resp): self.profiler.start() def process_response(self, req, resp, resource, req_succeeded): self.profiler.stop() self.profiler.open_in_browser() PROFILING = True # Set this from a settings model app = falcon.App() if PROFILING: app.add_middleware(ProfilerMiddleware())
To invoke, make any request to your application and it launch a new window printing the HTML result from pyinstrument.
Profile Pytest tests
Pyinstrument can be invoked via the command-line to run pytest, giving you a consolidated report for the test suite.
pyinstrument -m pytest [pytest-args...]
Or, to instrument specific tests, create and auto-use fixture in conftest.py in your test folder:
from pathlib import Path import pytest from pyinstrument import Profiler TESTS_ROOT = Path.cwd() @pytest.fixture(autouse=True) def auto_profile(request): PROFILE_ROOT = (TESTS_ROOT / ".profiles") # Turn profiling on profiler = Profiler() profiler.start() yield # Run test profiler.stop() PROFILE_ROOT.mkdir(exist_ok=True) results_file = PROFILE_ROOT / f"{request.node.name}.html" profiler.write_html(results_file)
This will generate a HTML file for each test node in your test suite inside the .profiles directory.
Profile something else?
I'd love to have more ways to profile using Pyinstrument - e.g. other web frameworks. PRs are encouraged!
How It Works
Pyinstrument interrupts the program every 1ms[1] and records the entire stack at that point. It does this using a C extension and PyEval_SetProfile, but only taking readings every 1ms. Check out this blog post for more info.
You might be surprised at how few samples make up a report, but don't worry, it won't decrease accuracy. The default interval of 1ms is a lower bound for recording a stackframe, but if there is a long time spent in a single function call, it will be recorded at the end of that call. So effectively those samples were 'bunched up' and recorded at the end.
Statistical profiling (not tracing)
Pyinstrument is a statistical profiler - it doesn't track every function call that your program makes. Instead, it's recording the call stack every 1ms.
That gives some advantages over other profilers. Firstly, statistical profilers are much lower-overhead than tracing profilers.
Django template render × 4000 | Overhead | |
Base | ████████████████ 0.33s | |
pyinstrument | ████████████████████ 0.43s | 30% |
cProfile | █████████████████████████████ 0.61s | 84% |
profile | ██████████████████████████████████...██ 6.79s | 2057% |
But low overhead is also important because it can distort the results. When using a tracing profiler, code that makes a lot of Python function calls invokes the profiler a lot, making it slower. This distorts the results, and might lead you to optimise the wrong part of your program!
Full-stack recording
The standard Python profilers profile and cProfile show you a big list of functions, ordered by the time spent in each function. This is great, but it can be difficult to interpret why those functions are getting called. It's more helpful to know why those functions are called, and which parts of user code were involved.
For example, let's say I want to figure out why a web request in Django is slow. If I use cProfile, I might get this:
151940 function calls (147672 primitive calls) in 1.696 seconds Ordered by: cumulative time ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 1.696 1.696 profile:0(<code object <module> at 0x1053d6a30, file "./manage.py", line 2>) 1 0.001 0.001 1.693 1.693 manage.py:2(<module>) 1 0.000 0.000 1.586 1.586 __init__.py:394(execute_from_command_line) 1 0.000 0.000 1.586 1.586 __init__.py:350(execute) 1 0.000 0.000 1.142 1.142 __init__.py:254(fetch_command) 43 0.013 0.000 1.124 0.026 __init__.py:1(<module>) 388 0.008 0.000 1.062 0.003 re.py:226(_compile) 158 0.005 0.000 1.048 0.007 sre_compile.py:496(compile) 1 0.001 0.001 1.042 1.042 __init__.py:78(get_commands) 153 0.001 0.000 1.036 0.007 re.py:188(compile) 106/102 0.001 0.000 1.030 0.010 __init__.py:52(__getattr__) 1 0.000 0.000 1.029 1.029 __init__.py:31(_setup) 1 0.000 0.000 1.021 1.021 __init__.py:57(_configure_logging) 2 0.002 0.001 1.011 0.505 log.py:1(<module>)
It's often hard to understand how your own code relates to these traces.
Pyinstrument records the entire stack, so tracking expensive calls is much easier. It also hides library frames by default, letting you focus on your app/module is affecting performance.
_ ._ __/__ _ _ _ _ _/_ Recorded: 14:53:35 Samples: 131 /_//_/// /_\ / //_// / //_'/ // Duration: 3.131 CPU time: 0.195 / _/ v3.0.0b3 Program: examples/django_example/manage.py runserver --nothreading --noreload 3.131 <module> manage.py:2 └─ 3.118 execute_from_command_line django/core/management/__init__.py:378 [473 frames hidden] django, socketserver, selectors, wsgi... 2.836 select selectors.py:365 0.126 _get_response django/core/handlers/base.py:96 └─ 0.126 hello_world django_example/views.py:4
'Wall-clock' time (not CPU time)
Pyinstrument records duration using 'wall-clock' time. When you're writing a program that downloads data, reads files, and talks to databases, all that time is included in the tracked time by pyinstrument.
That's really important when debugging performance problems, since Python is often used as a 'glue' language between other services. The problem might not be in your program, but you should still be able to find why it's slow.
Async profiling
pyinstrument can profile async programs that use async and await. This async support works by tracking the 'context' of execution, as provided by the built-in contextvars module.
When you start a Profiler with the async_mode enabled or strict (not disabled), that Profiler is attached to the current async context.
When profiling, pyinstrument keeps an eye on the context. When execution exits the context, it captures the await stack that caused the context to exit. Any time spent outside the context is attributed to the that halted execution of the await.
Async contexts are inherited, so tasks started when a profiler is active are also profiled.
[image: Async context inheritance] [image]
pyinstrument supports async mode with Asyncio and Trio, other async/await frameworks should work as long as they use contextvars.
Greenlet doesn't use async and await, and alters the Python stack during execution, so is not fully supported. However, because greenlet also supports contextvars, we can limit profiling to one green thread, using strict mode. In strict mode, whenever your green thread is halted the time will be tracked in an <out-of-context> frame. Alternatively, if you want to see what's happening when your green thread is halted, you can use async_mode='disabled' - just be aware that readouts might be misleading if multiple tasks are running concurrently.
----
- [1]
Or, your configured interval.
API Reference
Command line interface
pyinstrument works just like python, on the command line, so you can call your scripts like pyinstrument script.py or pyinstrument -m my_module.
When your script ends, or when you kill it with ctrl-c, pyinstrument will print a profile report to the console.
- System Message: ERROR/6 (/builddir/build/BUILD/pyinstrument-4.6.2-build/pyinstrument-4.6.2/docs/reference.md:, line 12)
Command ['pyinstrument', '--help'] failed: [Errno 2] No such file or directory: 'pyinstrument'
Python API
The Python API is also available, for calling pyinstrument directly from Python and writing integrations with with other tools.
The Profiler object
- class pyinstrument.Profiler(interval=0.001, async_mode='enabled')
The profiler - this is the main way to use pyinstrument.
Note the profiling will not start until start() is called.
- Parameters
- interval (float) -- See interval.
- async_mode (AsyncMode) -- See async_mode.
- property interval: float
The minimum time, in seconds, between each stack sample. This translates into the resolution of the sampling.
- property async_mode: str
Configures how this Profiler tracks time in a program that uses async/await.
- enabled
When this profiler sees an await, time is logged in the function that awaited, rather than observing other coroutines or the event loop.
- disabled
This profiler doesn't attempt to track await. In a program that uses async/await, this will interleave other coroutines and event loop machinery in the profile. Use this option if async support is causing issues in your use case, or if you want to run multiple profilers at once.
- strict
Instructs the profiler to only profile the current async context. Frames that are observed in an other context are ignored, tracked instead as <out-of-context>.
- property last_session: Session | None
The previous session recorded by the Profiler.
- start(caller_frame=None)
Instructs the profiler to start - to begin observing the program's execution and recording frames.
The normal way to invoke start() is with a new instance, but you can restart a Profiler that was previously running, too. The sessions are combined.
- Parameters
caller_frame (FrameType | None) --
Set this to override the default behaviour of treating the caller of start() as the 'start_call_stack' - the instigator of the profile. Most renderers will trim the 'root' from the call stack up to this frame, to present a simpler output.
You might want to set this to inspect.currentframe().f_back if you are writing a library that wraps pyinstrument.
- stop()
Stops the profiler observing, and sets last_session to the captured session.
- Returns
The captured session.
- Return type
Session
- property is_running
Returns True if this profiler is running - i.e. observing the program execution.
- reset()
Resets the Profiler, clearing the last_session.
- __enter__()
Context manager support.
Profilers can be used in with blocks! See this example:
with Profiler() as p: # your code here... do_some_work() # profiling has ended. let's print the output. p.print()
- print(file=sys.stdout, *, unicode=None, color=None, show_all=False, timeline=False)
Print the captured profile to the console.
- Parameters
- file (IO[str]) -- the IO stream to write to. Could be a file descriptor or sys.stdout, sys.stderr. Defaults to sys.stdout.
- unicode (bool | None) -- Override unicode support detection.
- color (bool | None) -- Override ANSI color support detection.
- show_all (bool) -- Sets the show_all parameter on the renderer.
- timeline (bool) -- Sets the timeline parameter on the renderer.
- output_text(unicode=False, color=False, show_all=False, timeline=False)
Return the profile output as text, as rendered by ConsoleRenderer
- output_html(timeline=False, show_all=False)
Return the profile output as HTML, as rendered by HTMLRenderer
- write_html(path, timeline=False, show_all=False)
Writes the profile output as HTML to a file, as rendered by HTMLRenderer
- open_in_browser(timeline=False)
Opens the last profile session in your web browser.
- output(renderer)
Returns the last profile session, as rendered by renderer.
- Parameters
renderer (Renderer) -- The renderer to use.
Sessions
- class pyinstrument.session.Session
Represents a profile session, contains the data collected during a profile session.
- static load(filename)
Load a previously saved session from disk.
- Parameters
filename (PathOrStr) -- The path to load from.
- Return type
Session
- save(filename)
Saves a Session object to disk, in a JSON format.
- Parameters
filename (PathOrStr) -- The path to save to. Using the .pyisession extension is recommended.
- static combine(session1, session2)
Combines two Session objects.
Sessions that are joined in this way probably shouldn't be interpreted as timelines, because the samples are simply concatenated. But aggregate views (the default) of this data will work.
- Return type
Session
- root_frame(trim_stem=True)
Parses the internal frame records and returns a tree of Frame objects. This object can be rendered using a Renderer object.
- Return type
A Frame object, or None if the session is empty.
Renderers
Renderers transform a tree of Frame objects into some form of output.
Rendering has two steps:
- First, the renderer will 'preprocess' the Frame tree, applying each processor in the processor property, in turn.
- The resulting tree is rendered into the desired format.
Therefore, rendering can be customised by changing the processors property. For example, you can disable time-aggregation (making the profile into a timeline) by removing aggregate_repeated_calls().
- class pyinstrument.renderers.FrameRenderer(show_all=False, timeline=False, processor_options=None)
An abstract base class for renderers that process Frame objects using processor functions. Provides a common interface to manipulate the processors before rendering.
- Parameters
- show_all (bool) -- Don't hide library frames - show everything that pyinstrument captures.
- timeline (bool) -- Instead of aggregating time, leave the samples in chronological order.
- processor_options (dict[str, Any]) -- A dictionary of processor options.
- processors: List[Callable[[...], Frame | None]]
Processors installed on this renderer. This property is defined on the base class to provide a common way for users to add and manipulate them before calling render().
- processor_options: dict[str, Any]
Dictionary containing processor options, passed to each processor.
- default_processors()
Return a list of processors that this renderer uses by default.
- render(session)
Return a string that contains the rendered form of frame.
- class pyinstrument.renderers.ConsoleRenderer(unicode=False, color=False, flat=False, time='seconds', **kwargs)
Produces text-based output, suitable for text files or ANSI-compatible consoles.
- Parameters
- unicode (bool) -- Use unicode, like box-drawing characters in the output.
- color (bool) -- Enable color support, using ANSI color sequences.
- flat (bool) -- Display a flat profile instead of a call graph.
- time (LiteralStr['seconds', 'percent_of_total']) -- How to display the duration of each frame - 'seconds' or 'percent_of_total'
- class pyinstrument.renderers.HTMLRenderer(**kwargs)
Renders a rich, interactive web page, as a string of HTML.
- Parameters
- show_all -- Don't hide library frames - show everything that pyinstrument captures.
- timeline -- Instead of aggregating time, leave the samples in chronological order.
- processor_options -- A dictionary of processor options.
- class pyinstrument.renderers.JSONRenderer(**kwargs)
Outputs a tree of JSON, containing processed frames.
- Parameters
- show_all -- Don't hide library frames - show everything that pyinstrument captures.
- timeline -- Instead of aggregating time, leave the samples in chronological order.
- processor_options -- A dictionary of processor options.
- class pyinstrument.renderers.SpeedscopeRenderer(**kwargs)
Outputs a tree of JSON conforming to the speedscope schema documented at
wiki: https://github.com/jlfwong/speedscope/wiki/Importing-from-custom-sources schema: https://www.speedscope.app/file-format-schema.json spec: https://github.com/jlfwong/speedscope/blob/main/src/lib/file-format-spec.ts example: https://github.com/jlfwong/speedscope/blob/main/sample/profiles/speedscope/0.0.1/simple.speedscope.json
- Parameters
- show_all -- Don't hide library frames - show everything that pyinstrument captures.
- timeline -- Instead of aggregating time, leave the samples in chronological order.
- processor_options -- A dictionary of processor options.
Processors
Processors are functions that take a Frame object, and mutate the tree to perform some task.
They can mutate the tree in-place, but also can change the root frame, they should always be called like:
frame = processor(frame, options=...)
- pyinstrument.processors.remove_importlib(frame, options)
Removes <frozen importlib._bootstrap frames that clutter the output.
- pyinstrument.processors.remove_tracebackhide(frame, options)
Removes frames that have set a local __tracebackhide__ (e.g. __tracebackhide__ = True), to hide them from the output.
- pyinstrument.processors.aggregate_repeated_calls(frame, options)
Converts a timeline into a time-aggregate summary.
Adds together calls along the same call stack, so that repeated calls appear as the same frame. Removes time-linearity - frames are sorted according to total time spent.
Useful for outputs that display a summary of execution (e.g. text and html outputs)
- pyinstrument.processors.group_library_frames_processor(frame, options)
Groups frames that should be hidden into FrameGroup objects, according to hide_regex and show_regex in the options dict, as applied to the file path of the source code of the frame. If both match, 'show' has precedence. Options:
- hide_regex
regular expression, which if matches the file path, hides the frame in a frame group.
- show_regex
regular expression, which if matches the file path, ensures the frame is not hidden
Single frames are not grouped, there must be at least two frames in a group.
- pyinstrument.processors.merge_consecutive_self_time(frame, options, recursive=True)
Combines consecutive 'self time' frames.
- pyinstrument.processors.remove_unnecessary_self_time_nodes(frame, options)
When a frame has only one child, and that is a self-time frame, remove that node and move the time to parent, since it's unnecessary - it clutters the output and offers no additional information.
- pyinstrument.processors.remove_irrelevant_nodes(frame, options, total_time=None)
Remove nodes that represent less than e.g. 1% of the output. Options:
- filter_threshold
sets the minimum duration of a frame to be included in the output. Default: 0.01.
- pyinstrument.processors.remove_first_pyinstrument_frames_processor(frame, options)
The first few frames when using the command line are the __main__ of pyinstrument, the eval, and the 'runpy' module. I want to remove that from the output.
Internals notes
Frames are recorded by the Profiler in a time-linear fashion. While profiling, the profiler builds a list of frame stacks, with the frames having in format:
function_name <null> filename <null> function_line_number
When profiling is complete, this list is turned into a tree structure of Frame objects. This tree contains all the information as gathered by the profiler, suitable for a flame render.
Frame objects, the call tree, and processors
The frames are assembled to a call tree by the profiler session. The time-linearity is retained at this stage.
Before rendering, the call tree is then fed through a sequence of 'processors' to transform the tree for output.
The most interesting is aggregate_repeated_calls, which combines different instances of function calls into the same frame. This is intuitive as a summary of where time was spent during execution.
The rest of the processors focus on removing or hiding irrelevant Frames from the output.
Self time frames vs. frame.self_time
Self time nodes exist to record time spent in a node, but not in its children. But normal frame objects can have self_time too. Why? frame.self_time is used to store the self_time of any nodes that were removed during processing.
Indices and Tables
- Index
- Search Page
Author
Joe Rickerby
Copyright
2024, Joe Rickerby