Updated May 11, 2020: Adjust language and structure and make sure the code is compatible with CPython 3.8.2.
Recently, I came across the question, which method to sort a list is more efficient: Using Python’s built-in
sorted() function or relying on the
To answer this question I started a little investigation described in this article.
You can find the repository I’m referring to on GitHub.
The starting point is a Python list containing 1.000.000 random numbers (integers) built using the
import random arr = [random.randint(0, 50) for r in range(1_000_000)]
The generated numbers are in the range from 0 (inclusive) to 50 (inclusive).
Let’s have a look at the memory consumption of both functions first.
Therefore, we are using the built-in resource module  to track the maximum memory usage.
As the resource module enables us to track the memory usage of a single thread, we are running the sorting of our list in a separate thread.
You can use the
FunctionSniffingClass included in the repository to do so.
Let’s have a closer look at our Python script:
# memory_measurement/main.py import random import resource import sys import time from sniffing import FunctionSniffingClass def list_sort(arr): return arr.sort() def sorted_builtin(arr): return sorted(arr) if __name__ == "__main__": if len(sys.argv) != 2: sys.exit("Please run: python (sort|sorted)") elif sys.argv == "sorted": func = sorted_builtin elif sys.argv == "sort": func = list_sort else: sys.exit("Please run: python (sort|sorted)") # Lib Testing Code arr = [random.randint(0, 50) for r in range(1_000_000)] mythread = FunctionSniffingClass(func, arr) mythread.start() used_mem = 0 max_memory = 0 memory_usage_refresh = .005 # Seconds while(1): time.sleep(memory_usage_refresh) used_mem = (resource.getrusage(resource.RUSAGE_SELF).ru_maxrss) if used_mem > max_memory: max_memory = used_mem # Check to see if the function call is complete if mythread.isShutdown(): # Uncomment if yu want to see the results # print(mythread.results) break; print("\nMAX Memory Usage:", round(max_memory / (2 ** 20), 3), "MB")
We create two wrapper functions for the built-in ones to be able to pass them as arguments to the
We could pass the built-in sorted function directly to the
FunctionSniffingClass, but we want the same chances for both, so we create a wrapper function for it, too.
Furthermore, some simple command-line argument parsing is implemented to be able to use it as simple as possible from the command-line.
Curious how both built-ins perform? Let’s see!
$ python memory_measurement/main.py sort Calling the Target Function... Function Call Complete MAX Memory Usage: 23.371 MB $ python memory_measurement/main.py sorted Calling the Target Function... Function Call Complete MAX Memory Usage: 30.879 MB
As you can see, the
sorted() function consumed around 32% more memory than the
This was predictable as the latter on modifies the list in-place, whereas the first ones is always creating a separate list.
To be able to measure the execution time of both wrapper functions, we make use of the third-party boxx module .
The following snippet shows you how we can make use of its
timeit() function to measure the execution time of both functions.
# speed/main.py import random from boxx import timeit def list_sort(arr): return arr.sort() def sorted_builtin(arr): return sorted(arr) def main(): arr = [random.randint(0, 50) for r in range(1_000_000)] with timeit(name="sorted(list)"): sorted_builtin(arr) with timeit(name="list.sort()"): list_sort(arr) if __name__ == "__main__": main()
Note: Be sure to run the
sorted_builtin()function first as the
list.sort()method sorts the list just in-place, so the
sorted()function would not have to sort anything!
Running the above snippet prints the following output:
$ python main.py "sorted(list)" spend time: 0.1104379 "list.sort()" spend time: 0.0956471
As you can see, the
list.sort() method is slightly faster than the
Why is this the case?
Let’s disassemble both functions and see, whether we can conclude the answer based on the bytecode:
>>> import dis >>> dis.dis(list_sort) 12 0 LOAD_FAST 0 (arr) 2 LOAD_METHOD 0 (sort) 4 CALL_METHOD 0 6 RETURN_VALUE >>> dis.dis(sorted_builtin) 16 0 LOAD_GLOBAL 0 (sorted) 2 LOAD_FAST 0 (arr) 4 CALL_FUNCTION 1 6 RETURN_VALUE
Both functions bytecode is pretty much the same.
The only difference is that the
list_sort() function first loads the list and the method (sort) followed by calling the method on the list without any arguments.
In contrast to that, the
sorted_builtin() function first loads the built-in
sorted() function, followed by loading the list and calling the loaded function with the list as argument.
Furthermore, both use the same sorting algorithm: Timsort . So if both are using the same sorting algorithm and the bytecode of both is pretty much the same. Why are the timing results different?
My guess is that as
list.sort() can work with a known size and swap elements within that size,
sorted() has to work with an unknown size.
Therefore, the new list created by
sorted() needs to be resized if not enough memory is left when appending a new element.
And this takes time!
Having a look at the CPython source code, we find the following comment about resizing list objects:
The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, …
- CPython: Objects/listobject.c
If we bring back to mind that we are dealing with a list of size 1.000.000, we can see: That is a lot of resizing!
Unfortunately, this is the best answer we get, when asking why
list.sort() is 13% faster than
However, my guess is wrong. As Nick Coghlan, one of the CPython core developers, stated on Twitter, the size of the resulting list is known. Basically, the following is happening:
new_array = arr.copy() arr.sort()
This theory is not correct - sorted knows how big the input is in this case, so it can preallocate the output. What it can't avoid is the extra data copying required to make a whole new list - if you measure "arr2 = arr.copy(); arr2.sort()" it should be comparable to sorted(arr).— Nick Coghlan (@ncoghlan_dev) 10. April 2019
He also states, that it is not really obvious if you do not know that it is there and look explicitly for it in the implementation.
The resizing idea was a decent guess though - even when you go read the source code, the preallocation trick is buried way down in the list.extend implementation, and hence is easy to miss if you don't already know it is there :)— Nick Coghlan (@ncoghlan_dev) 10. April 2019
The implementation results in the execution time difference as creating a copy of the list takes some time.
Before wrapping up this article, let’s have a look at what the official Python documentation says about this topic.
You can also use the list.sort() method. It modifies the list in-place (and returns None to avoid confusion). Usually it’s less convenient than sorted() - but if you don’t need the original list, it’s slightly more efficient.
- Sorting HOW TO
As you can see, the official documentation states what we have already proven:
list.sort() is slightly more efficient.
Furthermore, it tells us, that
sorted() is usually more convenient.
Another question that may arise is, whether both sorting techniques are stable. Fortunately, the docs have an answer to that:
Sorts are guaranteed to be stable. That means that when multiple records have the same key, their original order is preserved.
- Sorting HOW TO
This is also true, when using the reverse parameter or applying the reversed function twice.
The previous investigations showed us that
list.sort() is slightly faster than
sorted() and consumes around 24% less memory.
However, keep in mind that
list.sort() is only implemented for lists, whereas
sorted() accepts any iterable.
Furthermore, if you use
list.sort(), you will lose your original list.
I hope you enjoyed reading the article. Make sure to share it with your friends and colleagues. If you have not already, consider following me on Twitter, where I am @DahlitzF or subscribing to my newsletter so you won’t miss a future article. Stay curious and keep coding!