Donald Knuth once wrote these unfortunate words:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Profiling is a technique for measuring the execution times and invocation counts of the procedures in a program. Put simply, profiling is a way to identify performance issues in applications.

Why do you need a profiler? It’s all about improving the performance of your code. It is hard to tell where a complex program spends its time just by looking at the code. A profiler shows you where the hot code is and what you can do to speed it up. Optimizing your code without measuring where it spends its time is a useless exercise.

Profiling helps us answer questions such as:

  • Where is the application spending its time?
  • How much memory is the program using?
  • How much time is spent in each function?
  • How much time is spent on each statement?
  • Which parts of the code need to be optimized?

Python has a couple of profilers built into the standard library: profile and cProfile. cProfile is a handy tool and is recommended for most users, so we will focus on it here.

Profiling code with cProfile

cProfile is a very handy tool for profiling Python code. We just need to import cProfile and use it. Here we will go through a very simple example.

import cProfile
cProfile.run('1+1')

First we import cProfile, then profile the expression 1+1 by calling cProfile.run(). It gives us:

3 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Here we see two numbers. The first line shows that 3 functions were called and that it took 0.000 seconds. The next line, Ordered by: standard name, indicates that the text string in the far right column was used to sort the output. According to the docs, the column headings are:

  • ncalls - the number of calls made
  • tottime - the total time spent in the given function (excluding time spent in subfunctions)
  • percall - the quotient of tottime divided by ncalls
  • cumtime - the cumulative time spent in this function and all subfunctions
  • percall - the quotient of cumtime divided by primitive calls. A primitive call is one that was not induced via recursion.
  • filename:lineno(function) - the respective data of each function

This example runs in 0 seconds, so there is nothing to measure. Let’s see another example:

#!/usr/bin/python3

import time

def main():
	total = 0
	for i in range(5):
		total += sleep_func(i)
	return total

def sleep_func(sleep_time):
	time.sleep(sleep_time)
	return sleep_time

if __name__ == '__main__':
	print(main())

When the program executes, it runs the main function. Inside it there is a loop that calls the sleep_func function 5 times and finally returns the sum. sleep_func calls time.sleep, which pauses the program. Let’s profile this and see what results we get:

python3 -m cProfile cprofiling.py

results:

	 15 function calls in 10.007 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
	5    0.000    0.000   10.007    2.001 cprofiling.py:12(sleep_func)
	1    0.000    0.000   10.007   10.007 cprofiling.py:3(<module>)
	1    0.000    0.000   10.007   10.007 cprofiling.py:5(main)
	1    0.000    0.000   10.007   10.007 {built-in method builtins.exec}
	1    0.000    0.000    0.000    0.000 {built-in method builtins.print}
	5   10.007    2.001   10.007    2.001 {built-in method time.sleep}
	1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Now we can see that our program takes 10.007 seconds and calls 15 functions, and all the details are listed. In the details we can see that sleep_func is our most-called function, and that the time.sleep calls it makes account for nearly all of the runtime. So we have identified sleep_func as the culprit taking most of the time in our program, and we know that the best way to speed up the function is to remove the time.sleep call.

In these results it is hard to see at a glance which function is called the most or takes the most total time. If the result is sorted by ncalls or tottime we can find it easily. Luckily the -s argument exists for exactly this. Just run:

python3 -m cProfile -s calls cprofiling.py

   15 function calls in 10.011 seconds

Ordered by: call count

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     5    0.000    0.000   10.010    2.002 cprofiling.py:12(sleep_func)
     5   10.010    2.002   10.010    2.002 {built-in method time.sleep}
     1    0.000    0.000   10.011   10.011 {built-in method builtins.exec}
     1    0.000    0.000    0.000    0.000 {built-in method builtins.print}
     1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     1    0.000    0.000   10.011   10.011 cprofiling.py:5(main)
     1    0.000    0.000   10.011   10.011 cprofiling.py:3(<module>)

Now we get a result sorted by call count. If you want to sort by total time, just replace calls with tottime.
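The same sorting is also available from Python code: cProfile.run() and cProfile.runctx() accept a sort argument that takes the same keys as the -s flag. A minimal sketch, where build_list is just a made-up workload for illustration:

```python
import cProfile

def build_list():
    # A deliberately wasteful workload: repeated string concatenation.
    s = ""
    for i in range(1000):
        s += str(i)
    return s

# sort accepts the same keys as -s: "calls", "tottime", "cumtime", ...
cProfile.runctx("build_list()", globals(), globals(), sort="tottime")
```

This prints the same table as before, ordered by total time per function.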

Now, if we want to save the results to a file, pass the -o argument and an output file name, like:

python3 -m cProfile -s tottime -o output.txt cprofiling.py

Unfortunately the output file is binary, so you can’t read it directly. To read the data we need the pstats module:

import pstats

# Load the binary dump and print it sorted by standard name (-1)
p = pstats.Stats('output.txt')
p.strip_dirs().sort_stats(-1).print_stats()

This gives the same kind of result we have seen previously. See the pstats documentation to learn more.
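pstats can do more than replay the raw table: sort_stats() accepts the same sort keys as -s, and print_stats() can limit the output to the top entries. A small self-contained sketch, where work and the file name profile_demo.prof are made-up for illustration:

```python
import cProfile
import pstats

def work():
    return sum(i * i for i in range(10000))

# Collect a profile and dump it to a file, as "-o" does on the command line.
profiler = cProfile.Profile()
profiler.runcall(work)
profiler.dump_stats("profile_demo.prof")

# Load the dump, strip directory prefixes, sort by cumulative time,
# and print only the five most expensive entries.
stats = pstats.Stats("profile_demo.prof")
stats.strip_dirs().sort_stats("cumulative").print_stats(5)
```

Limiting the printout this way is handy once real programs produce hundreds of rows.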

The cProfile module thus helps us identify where the hot code is so we can take steps to speed it up.
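Beyond the command line, cProfile.Profile also lets you profile just one section of a program by calling enable() and disable() around it; the collected data can then be fed straight to pstats. A minimal sketch, where setup and hot_loop are hypothetical names:

```python
import cProfile
import pstats

def setup():
    return list(range(1000))

def hot_loop(data):
    return sum(x * x for x in data)

data = setup()            # runs outside the profiler, so it is not measured
profiler = cProfile.Profile()
profiler.enable()
result = hot_loop(data)   # only this call is measured
profiler.disable()

# A Stats object can be built directly from the Profile instance.
pstats.Stats(profiler).sort_stats("tottime").print_stats()
```

This is useful when module import or setup costs would otherwise drown out the code you actually care about.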

There are also several third-party modules that are good for profiling, such as line_profiler, memory_profiler, and interactive continuous profilers.

PROFILING RESOURCES