Understanding UnboundLocalError in Python

If you're closely following the Python tag on StackOverflow, you'll notice that the same question comes up at least once a week. The question goes like this:

x = 10
def foo():
    x += 1
    print(x)
foo()

Why, when run, does this result in the following error?

Traceback (most recent call last):
  File "unboundlocalerror.py", line 8, in <module>
    foo()
  File "unboundlocalerror.py", line 4, in foo
    x += 1
UnboundLocalError: local variable 'x' referenced before assignment

There are a few variations on this question, with the same core hiding underneath. Here's one:

lst = [1, 2, 3]

def foo():
    lst.append(5)   # OK
    #lst += [5]     # ERROR here

foo()
print(lst)

Running the lst.append(5) statement successfully appends 5 to the list. However, replace it with lst += [5], and it raises UnboundLocalError, although at first sight it should accomplish the same thing.
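To see the two behaviours side by side, here is a minimal sketch that runs both variants and catches the exception instead of crashing. The function names append_item and extend_in_place are made up for illustration, and the exact wording of the error message varies between Python versions.

```python
lst = [1, 2, 3]

def append_item():
    # Only calls a method on the global list; the name `lst` is never assigned.
    lst.append(5)

def extend_in_place():
    # Augmented assignment counts as an assignment to `lst`,
    # so `lst` is treated as local to this function.
    lst += [5]

append_item()
print(lst)  # [1, 2, 3, 5]

try:
    extend_in_place()
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)

print(lst)  # still [1, 2, 3, 5]
```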

Although this exact question is answered in Python's official FAQ, I decided to write this article with the intent of giving a deeper explanation. It will start with a basic FAQ-level answer, which should satisfy anyone who only wants to know how to "solve the damn problem and move on". Then, I will dive deeper, looking at the formal definition of Python to understand what's going on. Finally, I'll take a look at what happens behind the scenes in the implementation of CPython to cause this behavior.

The simple answer

As mentioned above, this problem is covered in the Python FAQ. For completeness, I want to explain it here as well, quoting the FAQ when necessary.

Let's take the first code snippet again:

x = 10
def foo():
    x += 1
    print(x)
foo()

So where does the exception come from? Quoting the FAQ:

This is because when you make an assignment to a variable in a scope, that variable becomes local to that scope and shadows any similarly named variable in the outer scope.

But x += 1 is similar to x = x + 1, so it should first read x, perform the addition and then assign back to x. As mentioned in the quote above, Python considers x a variable local to foo, so we have a problem - a variable is read (referenced) before it's been assigned. Python raises the UnboundLocalError exception in this case[1].
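The rule can be checked with a small sketch. The three function names below are hypothetical; the point is only that reading a global is fine, while either form of assignment makes the name local and fails on the first read.

```python
x = 10

def read_only():
    # No assignment to x in this function, so x resolves to the global.
    return x + 1

def augmented():
    x += 1          # x is local because it is assigned in this function
    return x

def spelled_out():
    x = x + 1       # same binding behaviour as the augmented form
    return x

print(read_only())  # 11

for func in (augmented, spelled_out):
    try:
        func()
    except UnboundLocalError as exc:
        print(func.__name__, "->", exc)
```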

So what do we do about this? The solution is very simple - Python has the global statement just for this purpose:

x = 10
def foo():
    global x
    x += 1
    print(x)
foo()

This prints 11, without any errors. The global statement tells Python that inside foo, x refers to the global variable x, even if it's assigned in foo.

Actually, there is another variation on the question, for which the answer is a bit different. Consider this code:

def external():
    x = 10
    def internal():
        x += 1
        print(x)
    internal()

external()

This kind of code may come up if you're into closures and other techniques that use Python's lexical scoping rules. The error this generates is the familiar UnboundLocalError. However, applying the "global fix":

def external():
    x = 10
    def internal():
        global x
        x += 1
        print(x)
    internal()

external()

Doesn't help - another error is generated: NameError: global name 'x' is not defined. Python is right here - after all, there's no _global_ variable named x, there's only an x in external. It may not be local to internal, but it's not global. So what can you do in this situation? If you're using Python 3, you have the nonlocal keyword. Replacing global with nonlocal in the last snippet makes everything work as expected. nonlocal is a new statement in Python 3, and there is no equivalent in Python 2[2].
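For reference, here is a sketch of both approaches: the nonlocal version for Python 3, and the dict-based workaround mentioned in footnote [2]. The names external_nonlocal, external_dict and state are chosen here for illustration. Because of the nonlocal statement, this file as a whole only compiles on Python 3; the second function on its own also runs on Python 2.

```python
def external_nonlocal():
    x = 10
    def internal():
        nonlocal x      # Python 3 only: x is the variable of external_nonlocal
        x += 1
    internal()
    return x

def external_dict():
    # Workaround without nonlocal: mutate a container instead of rebinding
    # a name, so no bare name is assigned inside internal().
    state = {'x': 10}
    def internal():
        state['x'] += 1
    internal()
    return state['x']

print(external_nonlocal())  # 11
print(external_dict())      # 11
```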

The formal answer

Assignments in Python are used to bind names to values and to modify attributes or items of mutable objects. I could find two places in the Python (2.x) documentation where it's defined how an assignment to a local variable works.

One is section 6.2 "Assignment statements" in the Simple Statements chapter of the language reference:

Assignment of an object to a single target is recursively defined as follows. If the target is an identifier (name):

If the name does not occur in a global statement in the current code block: the name is bound to the object in the current local namespace.

Otherwise: the name is bound to the object in the current global namespace.

Another is section 4.1 "Naming and binding" of the Execution model chapter:

If a name is bound in a block, it is a local variable of that block.

[...]

When a name is used in a code block, it is resolved using the nearest enclosing scope. [...] If the name refers to a local variable that has not been bound, a UnboundLocalError exception is raised.

This is all clear, but still, another small doubt remains. All these rules apply to assignments of the form var = value, which clearly bind var to value. But the code snippets we're having a problem with here use the += assignment. Shouldn't that just modify the bound value, without re-binding it?

Well, no. += and its cousins (-=, *=, etc.) are what Python calls "augmented assignment statements" [emphasis mine]:

An augmented assignment evaluates the target (which, unlike normal assignment statements, cannot be an unpacking) and the expression list, performs the binary operation specific to the type of assignment on the two operands, and assigns the result to the original target. The target is only evaluated once.

An augmented assignment expression like x += 1 can be rewritten as x = x + 1 to achieve a similar, but not exactly equal effect. In the augmented version, x is only evaluated once. Also, when possible, the actual operation is performed in-place, meaning that rather than creating a new object and assigning that to the target, the old object is modified instead.

With the exception of assigning to tuples and multiple targets in a single statement, the assignment done by augmented assignment statements is handled the same way as normal assignments. Similarly, with the exception of the possible in-place behavior, the binary operation performed by augmented assignment is the same as the normal binary operations.

So when earlier I said that x += 1 is _similar to_ x = x + 1, I wasn't telling the whole truth, but it was accurate with respect to binding. Apart from possible optimization, += counts exactly as = when binding is considered. If you think carefully about it, it's unavoidable, because some types Python works with are immutable. Consider strings, for example:

x = "abc"
x += "def"

The first line binds x to the value "abc". The second line doesn't modify the value "abc" to be "abcdef". Strings are immutable in Python. Rather, it creates the new value "abcdef" somewhere in memory, and re-binds x to it. This can be seen clearly when examining the object ID for x before and after the +=:

>>> x = "abc"
>>> id(x)
11173824
>>> x += "def"
>>> id(x)
32831648
>>> x
'abcdef'

Note that some types in Python _are_ mutable. For example, lists can actually be modified in-place:

>>> y = [1, 2]
>>> id(y)
32413376
>>> y += [2, 3]
>>> id(y)
32413376
>>> y
[1, 2, 2, 3]

id(y) didn't change after +=, because the object y referenced was just modified. Still, Python re-bound y to the same object[3].
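The difference between in-place modification and re-binding becomes visible when a second name refers to the same object. The sketch below compares += with the spelled-out form on a list, and then shows a hypothetical user-defined class (Money, invented for this example) whose += returns a new object, which is why the compiler always has to emit the re-binding step.

```python
a = [1, 2]
alias = a
a += [3]              # list += mutates in place and hands back the same list
print(alias)          # [1, 2, 3]
print(a is alias)     # True

b = [1, 2]
alias_b = b
b = b + [3]           # builds a new list and rebinds b
print(alias_b)        # [1, 2]
print(b is alias_b)   # False

class Money(object):
    """Hypothetical value type whose += produces a new object."""
    def __init__(self, amount):
        self.amount = amount

    def __iadd__(self, other):
        # Return a fresh object instead of modifying self.
        return Money(self.amount + other)

m = Money(10)
before = m
m += 5                # m is re-bound to whatever __iadd__ returned
print(m.amount, before.amount)  # 15 10
print(m is before)    # False
```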

The "too much information" answer

This section is of interest only to those curious about the implementation internals of Python itself.

One of the stages in the compilation of Python into bytecode is building the symbol table[4]. An important goal of building the symbol table is for Python to be able to mark the scope of variables it encounters - which variables are local to functions, which are global, which are free (lexically bound) and so on.

When the symbol table code sees a variable is assigned in a function, it marks it as local. Note that it doesn't matter if the assignment was done before usage, after usage, or maybe not actually executed due to a condition in code like this:

x = 10
def foo():
    if something_false_at_runtime:
        x = 20
    print(x)
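A runnable variant of this snippet, assuming something_false_at_runtime is simply a module-level flag set to False, shows that the assignment never has to execute for x to be treated as local:

```python
x = 10
something_false_at_runtime = False

def foo():
    if something_false_at_runtime:
        x = 20      # never runs, but still makes x local to foo
    return x

try:
    foo()
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)

# The compiled function records x among its local variable names.
print(foo.__code__.co_varnames)  # ('x',)
```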

We can use the symtable module to examine the symbol table information gathered on some Python code during compilation:

import symtable

code = '''
x = 10
def foo():
    x += 1
    print(x)
'''

table = symtable.symtable(code, '<string>', 'exec')

foo_namespace = table.lookup('foo').get_namespace()
sym_x = foo_namespace.lookup('x')

print(sym_x.get_name())
print(sym_x.is_local())

This prints:

x
True

So we see that x was marked as local in foo. Marking variables as local turns out to be important for optimization in the bytecode, since the compiler can generate a special instruction for it that's very fast to execute. There's an excellent article here explaining this topic in depth; I'll just focus on the outcome.
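For comparison, the same symtable approach can be applied to three hypothetical functions: one that assigns x, one that declares it global first, and one that only reads it. This is a sketch using the is_global() and is_declared_global() methods of the symbol objects; the function names are invented for the example.

```python
import symtable

code = '''
x = 10
def local_x():
    x += 1
def global_x():
    global x
    x += 1
def read_x():
    return x
'''

table = symtable.symtable(code, '<string>', 'exec')

results = {}
for name in ('local_x', 'global_x', 'read_x'):
    sym = table.lookup(name).get_namespace().lookup('x')
    # (is it global at all?, was it declared with a global statement?)
    results[name] = (sym.is_global(), sym.is_declared_global())
    print(name, results[name])
```

Only local_x treats x as a local; read_x sees it as an implicit global, and global_x as an explicitly declared one.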

The compiler_nameop function in Python/compile.c handles variable name references. To generate the correct opcode, it queries the symbol table function PyST_GetScope. For our x, this returns a bitfield with LOCAL in it. Having seen LOCAL, compiler_nameop generates a LOAD_FAST. We can see this in the disassembly of foo:

35           0 LOAD_FAST                0 (x)
             3 LOAD_CONST               1 (1)
             6 INPLACE_ADD
             7 STORE_FAST               0 (x)

36          10 LOAD_GLOBAL              0 (print)
            13 LOAD_FAST                0 (x)
            16 CALL_FUNCTION            1
            19 POP_TOP
            20 LOAD_CONST               0 (None)
            23 RETURN_VALUE

The first block of instructions shows what x += 1 was compiled to. You will note that already here (before it's actually assigned), LOAD_FAST is used to retrieve the value of x.
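A listing like the one above can be reproduced with the standard dis module. The sketch below collects the opcode names rather than printing a full listing. Exact opcode names and offsets depend on the CPython version: newer releases may show, for example, LOAD_FAST_CHECK in place of LOAD_FAST and BINARY_OP in place of INPLACE_ADD, so treat the output as version-specific.

```python
import dis

def foo():
    x += 1
    return x

# Names of the bytecode instructions foo was compiled to.
opnames = [instr.opname for instr in dis.get_instructions(foo)]
print(opnames)

# x occupies slot 0 of the function's local variables.
print(foo.__code__.co_varnames)  # ('x',)
```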

This LOAD_FAST is the instruction that will cause the UnboundLocalError exception to be raised at runtime, because it is actually executed before any STORE_FAST is done for x. The gory details are in the bytecode interpreter code in Python/ceval.c:

TARGET(LOAD_FAST)
    x = GETLOCAL(oparg);
    if (x != NULL) {
        Py_INCREF(x);
        PUSH(x);
        FAST_DISPATCH();
    }
    format_exc_check_arg(PyExc_UnboundLocalError,
        UNBOUNDLOCAL_ERROR_MSG,
        PyTuple_GetItem(co->co_varnames, oparg));
    break;

Ignoring the macro-fu for the moment, what this basically says is that once LOAD_FAST is seen, the value of x is obtained from an indexed array of objects[5]. If no STORE_FAST was done before, this value is still NULL, the if branch is not taken[6] and the exception is raised.
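The same logic can be mimicked in a few lines of Python. This is only a toy model, not how CPython is implemented: ToyFrame, its methods and the _NULL sentinel are invented here to stand in for the frame's array of local slots and the C NULL pointer.

```python
_NULL = object()  # stand-in for a C NULL pointer in an unassigned slot

class ToyFrame(object):
    """Toy model of a frame with an indexed array of local variables."""

    def __init__(self, varnames):
        self.varnames = varnames
        self.fastlocals = [_NULL] * len(varnames)
        self.stack = []

    def store_fast(self, index):
        # Pop the top of the stack into the local slot.
        self.fastlocals[index] = self.stack.pop()

    def load_fast(self, index):
        value = self.fastlocals[index]
        if value is _NULL:
            # Slot was never stored to: report the name from varnames.
            raise UnboundLocalError(
                "local variable %r referenced before assignment"
                % self.varnames[index])
        self.stack.append(value)

frame = ToyFrame(('x',))
try:
    frame.load_fast(0)      # nothing stored yet
except UnboundLocalError as exc:
    print(exc)

frame.stack.append(1)
frame.store_fast(0)
frame.load_fast(0)          # succeeds now
print(frame.stack)          # [1]
```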

You may wonder why Python waits until runtime to raise this exception, instead of detecting it in the compiler. The reason is this code:

x = 10
def foo():
    if something_true():
        x = 1
    x += 1
    print(x)

Suppose something_true is a function that returns True, possibly due to some user input. In this case, x = 1 binds x locally, so the reference to it in x += 1 is no longer unbound. This code will then run without exceptions. Of course if something_true actually turns out to return False, the exception will be raised. Python has no way to resolve this at compile time, so the error detection is postponed to runtime.
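Both outcomes can be observed from the same function. In this sketch the something_true() call is replaced by a hypothetical flag parameter so that each path can be exercised directly:

```python
def foo(flag):
    if flag:
        x = 1       # binds x only when flag is true
    x += 1          # reads x: fine if bound, UnboundLocalError otherwise
    return x

print(foo(True))    # 2

try:
    foo(False)
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```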

[1]	This is quite useful, if you think about it. In C & C++ you can use the value of an un-initialized variable, which is almost always a bug. Some compilers (with some settings) warn you about this, but in Python it's just a plain error.

[2]	If you're using Python 2 and still need such code to work, the common workaround is the following: if you have data in external which you want to modify in internal, store it inside a dict instead of a stand-alone variable.

[3]	Could this be spared? Due to the dynamic nature of Python, that would be hard to do. At compilation time, when Python is compiled to bytecode, there's no way to know what the real type of the objects is. y in the example above could be some user-defined type with an overloaded += operator which returns a new object, so the Python compiler has to create generic code that re-binds the variable.

[4]	I've written comprehensively on the internals of symbol table construction in Python's compiler (part 1 and part 2).

[5]	GETLOCAL(i) is a macro for (fastlocals[i]).

[6]	Had the if been entered, the exception raising code would not have been reached, since FAST_DISPATCH expands to a goto that takes control elsewhere.
