What I Learned at Work this Week: Python pow()

Mike Diaz
6 min read · Jan 10, 2021


Photo credit: Skitterphoto on Pexels

I’m almost at the end of Pierian Data’s Udemy course on Python (which I would recommend to any beginner looking to get into the language). In one of the final modules, the course goes over a series of useful, though less frequently used, methods for numbers, strings, sets, lists, and dictionaries. They’re all so easy to use and intuitive that they’re barely worth writing a blog post about, but I did find one interesting tidbit: the course mentioned that the pow() function offers a speed advantage when used with a third argument.

pow()

Pow is short for “power” or “power of” — it can be used to calculate the result of exponent operations. Here’s a simple example:

pow(2,4)
# 16

Python, like many other programming languages, can use the ** operator to calculate exponents. So there’s nothing that would stop us from running this instead to get the same result:

2**4
# 16

But as I mentioned before, pow() has a third, optional argument: a modulus. When we provide it, pow() returns the remainder left over when the result of the exponentiation is divided by that modulus:

pow(2,4,5)
# 1
2**4 % 5
# 1

2 to the power of 4 is 16, as we previously established, and 16 mod 5 equates to 1, because 5 goes into 16 three times and leaves a remainder of 1. This is still a pretty simple calculation, so you might question why a special method needs to exist to run it. We find the answer once we look into the efficiency of the two techniques.
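Before we do, here’s a quick sanity check (with a few small values I picked arbitrarily) that the three-argument form always matches the two-step version:

# compare pow(base, exp, mod) against (base ** exp) % mod for a handful of values
for base in range(1, 6):
    for exp in range(1, 6):
        assert pow(base, exp, 7) == base**exp % 7
print('all results match')
# all results match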

timeit

One of Python’s many user-friendly advantages is its variety of conveniently importable libraries. The timeit library lets us measure exactly how long it takes for our logic to run, which is especially useful for testing efficiency. All we have to do is import it and invoke it by passing a function or a calculation (as a string) into the timeit function! Here’s an example from the Python docs:

from timeit import timeit

timeit('"-".join(str(n) for n in range(100))', number=10000)
# 0.18679064999923867

timeit takes up to five arguments, the first of which is a statement. The statement can either be a callable (like a function) or a string describing the logic we’re looking to time. In this case, the example is joining the string versions of the numbers 0 through 99 (that’s what for n in range(100) gives us) with a dash character (-).

The example also provides a number argument, which determines how many times the logic is run in our test. The function returns the total time, in seconds, that it took to run the statement that many times, so a larger sample size smooths out random variation. The default is 1,000,000, but the example only uses 10,000 (which is probably plenty). We can see that running this statement 10,000 times took just under 0.19 seconds in total.
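We can see that total-time behavior for ourselves: cutting number in half (an arbitrary choice) should cut the reported figure roughly in half as well:

from timeit import timeit

print(timeit('"-".join(str(n) for n in range(100))', number=10000))
print(timeit('"-".join(str(n) for n in range(100))', number=5000))
# the second figure should come out at roughly half of the first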

We can try this with a function instead of a statement in string form. Let’s take a look at the relative speed of pow() versus typing out ** and %:

from timeit import timeit

base = 2
exponent = 4
mod = 5

def testA():
    pow(base, exponent, mod)

def testB():
    base**exponent % mod

print('testA: ', timeit(testA))
print('testB: ', timeit(testB))
# testA: 0.669285089999903
# testB: 0.3033970689994021

We set our three values to variables so that it’s easy to reference them in our functions and potentially change them if we’d like to. We then define two functions, one which uses pow() and the other which does the manual calculation. We print the results of timeit for both by simply passing each function as timeit’s lone argument. And we see that pow is…much slower?

But I thought the point of this whole thing was that pow is more efficient than writing out the plain formula, so what gives? The key here is that pow() gains an advantage once the numbers start to get larger:

from timeit import timeit

base = 200
exponent = 400
mod = 5

def testA():
    pow(base, exponent, mod)

def testB():
    base**exponent % mod

print('testA: ', timeit(testA))
print('testB: ', timeit(testB))
# testA: 0.6807437240004219
# testB: 3.78076794899971

This time, pow() took about 0.01 seconds longer than it previously did, but testB took more than 3 extra seconds. Why is this?

Under the hood

When we run 2 ** 4, we’ve got a pretty good idea of what Python is doing in the background. It probably looks something like this:

2 * 2 => 4
4 * 2 => 8
8 * 2 => 16

And so if we change that to 200 ** 400…

200 * 200 => 40000
40000 * 200 => 8000000
8000000 * 200 => 1600000000
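Here’s a rough sketch of that repeated multiplication as a loop (my own naive version, not what Python actually does internally):

def naive_power(base, exponent):
    # one multiplication per unit of the exponent, with the running result
    # getting larger at every step
    result = 1
    for _ in range(exponent):
        result = result * base
    return result

print(naive_power(2, 4))
# 16
print(len(str(naive_power(200, 400))))
# 921 (the full result is a 921-digit number)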

It stands to reason that if our exponent is larger, we’re going to run more calculations, which will take more time. It’s also important to remember that larger numbers also require more memory/time to handle, so increasing our base also adds stress to our processor. To illustrate:

from timeit import timeit

base = 2000
exponent = 400
mod = 5

def testA():
    pow(base, exponent, mod)

def testB():
    base**exponent % mod

print('testA: ', timeit(testA))
print('testB: ', timeit(testB))
# testA: 0.6852057929991133
# testB: 6.890102186000149

One thing is clear: pow() is doing something very different under the hood. It’s either running fewer calculations, finding a way to work with smaller numbers, or both.

Modular exponentiation

Wikipedia provides a helpful explanation of different, more time- and space-efficient means of finding the remainder of an exponent expression (though what’s really helpful is having a math degree). Of the methods described, I best understood the first one, which conserves space by avoiding calculations that produce increasingly large numbers. It starts with this “identity,” a mathematical term meaning an equation that is true no matter what values are chosen:

(a ⋅ b) mod m = [(a mod m) ⋅ (b mod m)] mod m
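Just to convince ourselves, we can check the identity in Python with a couple of values I picked arbitrarily:

a, b, m = 123, 456, 7
print((a * b) % m)
# 4
print(((a % m) * (b % m)) % m)
# 4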

We can think of a and b as the two values from each step of an exponent calculation. So if we’re doing 2**4 % 5:

# a is 2 and b is 2
2 * 2 => 4
# now a is 4 but b remains 2
4 * 2 => 8
# a is 8…

To take advantage of this identity, we break our exponent calculation into individual steps and run mod m at each of them. BUT in order to keep the values small, we carry a running value (Wikipedia calls it c, starting at 1) and replace it with (c * b) mod m at every step, where b is the base. That means that, as we move up the exponent ladder, our running value won’t continue to grow exponentially, but will rather loop back to a smaller number once it’s larger than m. Let’s look at Wikipedia’s example, which calculates 4¹³ % 497 (e′ is more or less a counter that tracks which step we’re on):
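Wikipedia lays those steps out in a table; here’s a bit of Python that reproduces the same walk-through:

base, exponent, mod = 4, 13, 497

c = 1
for e_prime in range(1, exponent + 1):
    c = (base * c) % mod
    print("e' =", e_prime, " c =", c)
# the final iteration prints c = 445, which matches pow(4, 13, 497)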

At each step we take our new running value, multiply it by b, the base, and run mod again. I’ll admit that I still don’t fully understand why this works, but fortunately Wikipedia also provides pseudocode that I was able to translate into Python. Here’s where I ended up:

def testC():
    # start the running value at 1 and take one step per unit of the exponent
    c = 1
    e_prime = 0
    while e_prime < exponent:
        e_prime += 1
        # multiply by the base and immediately reduce mod m to keep c small
        c = (base * c) % mod
    return c
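A quick way to check the translation (reusing the base, exponent, and mod variables defined above) is to print all three approaches side by side:

print(testC(), pow(base, exponent, mod), base**exponent % mod)
# all three print the same value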

And when I ran the logic, I got the same answer as pow() and as my standard equation. I do have to dispute one part of the Wikipedia article, though:

Keeping the numbers smaller requires additional modular reduction operations, but the reduced size makes each operation faster, saving time (as well as memory) overall.

Here’s what I got when I ran timeit on all three functions:

testA: 0.6071091160047217
testB: 2.043397512003139
testC: 30.34461369500059

So if you were thinking “all those extra operations look costly,” you were right! I guess I wasn’t able to crack exactly what’s going on under the hood of pow().

The Magical World of Math

If you’re great at math, I highly recommend checking out that Wikipedia article and seeing if you can understand how the pow function might be providing such an efficient modular exponentiation calculation. If you’re more like me, it’s easy to get discouraged, especially if you work alongside people with degrees in Computer Science or Mathematics. So I implore you, dear reader, not to question your ability or potential if certain parts of programming are challenging or seem impossible. I’m grateful that this field has evolved in a way that is more accessible to people from different backgrounds and I also count it as a win that we learned a new function (pow) and experimented with a new library (timeit). I set a goal to write about how this function is so efficient and I didn’t quite get there. But I’m not a failure, because I learned something about Python this weekend — and that’s success.

