Donald Knuth once wrote these unfortunate words:
> We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Profiling is a technique for measuring how long each part of a program takes to run and how many times each procedure is invoked. Put simply, profiling is a way to identify performance problems in an application.
Why do you need a profiler? It's all about improving the performance of your code. It is hard to find out where a complex program spends its time just by reading the code. A profiler shows you where the hot code is and helps you decide what to do to speed it up. Optimizing your code without first measuring where the time goes is a wasted effort.
Profiling helps us answer questions such as:
- Where does the application spend its time?
- How much memory does the program use?
- How much time is spent in each function?
- How much time is spent on each statement?
- Which parts of the code need to be optimized?
Python ships with a couple of profilers in the standard library: profile
and cProfile.
cProfile is a handy tool and is recommended for most users, so we will focus on cProfile
here.
Profiling code with cProfile
cProfile is a very handy tool for profiling Python code. We just need to import cProfile and use it. Let's start with a very simple example.
import cProfile
cProfile.run('1+1')
First we import cProfile and profile the expression 1+1 by calling cProfile.run(). It gives us:
3 function calls in 0.000 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 0.000 0.000 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
The first line shows that 3 function calls were made and the run took 0.000 seconds. The next line, Ordered by: standard name, indicates that the text string in the far right column was used to sort the output. According to the docs, the column headings are:
- ncalls - the number of calls made
- tottime - the total time spent in the given function (excluding time spent in subfunctions)
- percall - the quotient of tottime divided by ncalls
- cumtime - the cumulative time spent in this function and all subfunctions
- percall - the quotient of cumtime divided by primitive calls. A primitive call is one that was not induced via recursion.
- filename:lineno(function) - the location and name of each function
This example runs in well under a millisecond, so there is nothing to measure. Let's look at another example:
#!/usr/bin/python3
import time

def main():
    sum = 0
    for i in range(5):
        sum += sleep_func(i)
    return sum

def sleep_func(sleep_time):
    time.sleep(sleep_time)
    return sleep_time

if __name__ == '__main__':
    print(main())
When the program executes, it runs the main function. Inside main there is a loop that calls the sleep_func
function five times and finally returns the sum. sleep_func calls time.sleep,
which pauses the program. Let's profile this and see what the results look like:
python3 -m cProfile profiling.py
results:
15 function calls in 10.007 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
5 0.000 0.000 10.007 2.001 cprofiling.py:12(sleep_func)
1 0.000 0.000 10.007 10.007 cprofiling.py:3(<module>)
1 0.000 0.000 10.007 10.007 cprofiling.py:5(main)
1 0.000 0.000 10.007 10.007 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {built-in method builtins.print}
5 10.007 2.001 10.007 2.001 {built-in method time.sleep}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Now we can see that our program takes 10.007 seconds and makes 15 function calls, with all the details listed. In the details we can see that sleep_func
is the most frequently called of our own functions, that it calls time.sleep,
and that it accounts for practically all of the running time. So we have identified sleep_func
as the culprit taking most of the time in our program, and we know that the best way to speed it up is to remove the time.sleep
call.
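If you only want to profile one section of a program rather than a whole script, cProfile can also be used programmatically. The sketch below (with a much shorter sleep so it runs quickly) collects stats for just the loop between enable() and disable():

```python
import cProfile
import time

def sleep_func(sleep_time):
    time.sleep(sleep_time)
    return sleep_time

profiler = cProfile.Profile()
profiler.enable()   # start collecting profiling data
total = sum(sleep_func(0.01) for _ in range(3))
profiler.disable()  # stop collecting profiling data

# Print the report for just that section, sorted by cumulative time
profiler.print_stats('cumulative')
```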
In these results it is hard to spot at a glance which function is called most often or takes the most total time. If the result is sorted by ncalls or tottime we can find it easily. Luckily the -s
argument exists. Just run this:
python3 -m cProfile -s calls profiling.py
15 function calls in 10.011 seconds
Ordered by: call count
ncalls tottime percall cumtime percall filename:lineno(function)
5 0.000 0.000 10.010 2.002 cprofiling.py:12(sleep_func)
5 10.010 2.002 10.010 2.002 {built-in method time.sleep}
1 0.000 0.000 10.011 10.011 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {built-in method builtins.print}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
1 0.000 0.000 10.011 10.011 cprofiling.py:5(main)
1 0.000 0.000 10.011 10.011 cprofiling.py:3(<module>)
Now we get a result sorted by call count. If you want to sort by total time instead, just replace calls
with tottime.
Now, if we want to save the results to a file, we pass the -o
argument followed by an output file name, like this:
python3 -m cProfile -s tottime -o output.txt profiling.py
Unfortunately the output file is binary, so you can't read it directly. To read the data we need the pstats
module:
import pstats
p = pstats.Stats('output.txt')
p.strip_dirs().sort_stats(-1).print_stats()
This prints a report like the ones we have seen previously (sort_stats(-1) sorts by standard name). To learn more about pstats, see its documentation in the Python standard library.
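pstats can do more than a plain dump; for instance, you can sort the saved stats and trim the report. The snippet below generates its own stats file first so it is self-contained (normally output.txt would come from the -o command above):

```python
import cProfile
import pstats

# Create a stats file so the example is self-contained
# (normally produced with: python3 -m cProfile -o output.txt profiling.py)
cProfile.run('sum(range(1000))', 'output.txt')

p = pstats.Stats('output.txt')
# Show only the ten entries with the highest total time
p.strip_dirs().sort_stats('tottime').print_stats(10)
```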
So the cProfile module helps us identify where the hot code is, so we can take steps to speed it up.
There are also several third-party modules that are good for profiling, such as line_profiler, memory_profiler, and interactive continuous profilers.