Accessing Iterator In 'for In' Loop
Solution 1:
It is possible to do what you want to do, as long as you're willing to rely on multiple undocumented internals of your Python interpreter (in my case, CPython 3.7)—but it isn't going to do you any good.
The iterator is not exposed to locals, or anywhere else (not even to a debugger). But as pointed out by Patrick Haugh, you can get at it indirectly, via get_referrers. For example:
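import collections.abc
import gc

# Run this inside the body of a `for ... in seq:` loop.
for ref in gc.get_referrers(seq):
    if isinstance(ref, collections.abc.Iterator):
        break  # ref is an iterator that refers to seq
else:
    raise RuntimeError('Oops')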
Of course if you have two different iterators to the same list, I don't know if there's any way you can decide between them, but let's ignore that problem.
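To make that ambiguity concrete, here is a minimal sketch that uses two explicitly created iterators instead of the hidden loop iterator. The helper name iterators_referring_to is made up for illustration, and the sketch assumes CPython, where list iterators are tracked by the garbage collector:

```python
import collections.abc
import gc


def iterators_referring_to(obj):
    """Return every iterator object that the GC reports as referring to obj."""
    return [
        ref for ref in gc.get_referrers(obj)
        if isinstance(ref, collections.abc.Iterator)
    ]


seq = list(range(10))
first = iter(seq)
second = iter(seq)
next(second)  # advance one of them so the two are in different states

found = iterators_referring_to(seq)
```

Both iterators come back in found, and nothing in the result says which one belongs to which loop.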
Now, what do you do with this? You've got an iterator over seq, and… now what? You can't replace it with something useful, like an itertools.chain(seq, [1, 2, 3]). There's no public API for mutating list, set, etc. iterators, much less arbitrary iterators.
If you happen to know it's a list iterator… well, the CPython 3.x list_iterator does happen to be mutable. The way they're pickled is by creating a fresh iterator with iter on the list and then calling __setstate__ with an index:
>>> print(ref.__reduce__())
(<function iter>, ([0, 1, 2, 3, 4, 5, 6, 7, 8, 9],), 7)
>>> ref.__setstate__(3) # resets the iterator to index 3 instead of 7
>>> ref.__reduce__()[1][0].append(10) # adds another value
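The same behaviour can be checked without any gc tricks, using an iterator you created yourself. This is only a sketch, and it relies on CPython implementation details: __reduce__ and __setstate__ exist on list iterators to support pickling, not as an API for rewinding them.

```python
seq = list(range(10))
it = iter(seq)

# Consume the first seven elements (0 through 6).
for _ in range(7):
    next(it)

func, args, index = it.__reduce__()
same_list = args[0] is seq  # the pickled state points at the original list

it.__setstate__(3)          # rewind the iterator to index 3
rewound = next(it)          # the element at index 3
```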
But this is all kind of silly, because you could get the same effect by just mutating the original list. In fact:
>>> ref.__reduce__()[1][0] is seq
True
So:
lst = list(range(10))
for elem in lst:
    print(elem, end=' ')
    if elem % 2:
        lst.append(elem * 2)
print()
… will print out:
0 1 2 3 4 5 6 7 8 9 2 6 10 14 18
… without having to monkey with the iterator at all.
You can't do the same thing with a set.
Mutating a set while you're in the middle of iterating it will affect the iterator, just as mutating a list will—but what it does is indeterminate. After all, sets have arbitrary order, which is only guaranteed to be consistent as long as you don't add or delete. What happens if you add or delete in the middle? You may get a whole different order, meaning you may end up repeating elements you already iterated, and missing ones you never saw. Python implies that this should be illegal in any implementation, and CPython does actually check it:
s = set(range(10))
for elem in s:
    print(elem, end=' ')
    if elem % 2:
        s.add(elem * 2)
print()
This will just immediately raise:
RuntimeError: Set changed size during iteration
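If you want to see the failure without crashing a script, a small sketch like the following catches the error and reports how far the loop got. The function name mutate_while_iterating is made up for illustration, and the exact message text is a CPython detail:

```python
def mutate_while_iterating():
    """Add to a set during iteration; return what was seen and the error text."""
    s = set(range(10))
    seen = []
    try:
        for elem in s:
            seen.append(elem)
            if elem % 2:
                s.add(elem * 2)  # harmless until this adds a value not yet in s
    except RuntimeError as exc:
        return seen, str(exc)
    return seen, None


seen, message = mutate_while_iterating()
```

The error only appears once an add actually changes the size of the set, so a few elements are seen before the loop dies.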
So, what happens if we use the same trick to go behind Python's back, find the set_iterator, and try to change it?
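import collections.abc
import gc

s = {1, 2, 3}
for elem in s:
    print(elem)
    # Look for an iterator that refers to s (not seq, which isn't defined here).
    for ref in gc.get_referrers(s):
        if isinstance(ref, collections.abc.Iterator):
            break
    else:
        raise RuntimeError('Oops')
    print(ref.__reduce__())  # call the method; don't just print it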
What you'll see in this case will be something like:
2
(<function iter>, ([1, 3],))
1
(<function iter>, ([3],))
3
(<function iter>, ([],))
In other words, when you pickle a set_iterator, it creates a list of the remaining elements, and gives you back instructions to build a new list_iterator out of that list. Mutating that temporary list obviously has no useful effect.
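A quick way to convince yourself of this, again with an iterator you created yourself rather than the hidden one, is sketched below. It assumes CPython's current behaviour for set_iterator.__reduce__, which is an implementation detail:

```python
s = {1, 2, 3}
it = iter(s)
first = next(it)

func, args = it.__reduce__()
remaining = args[0]      # a brand-new list of the elements not yet yielded
remaining.append(99)     # only changes that temporary list

rest = list(it)          # what the real iterator still yields
```

The 99 never shows up in s or in rest, because the list was a throwaway copy.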
What about a tuple? Obviously you can't just mutate the tuple itself, because tuples are immutable. But what about the iterator?
Under the covers, in CPython, tuple_iterator shares the same structure and code as list_iterator (as does the iterator type that you get from calling iter on an "old-style sequence" type that defines __len__ and __getitem__ but not __iter__). So, you can do the exact same trick to get at the iterator, and to __reduce__ it.
But once you do, ref.__reduce__()[1][0] is seq is going to be true again. In other words, it's a tuple, the same tuple you already had, and still immutable.
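Here is the tuple version as a minimal sketch, using an explicit iterator and assuming CPython's tuple_iterator pickling support:

```python
seq = (10, 20, 30)
it = iter(seq)
next(it)                     # consume the first element

func, args, index = it.__reduce__()
same_tuple = args[0] is seq  # the state refers to the original tuple

it.__setstate__(0)           # the position can be rewound...
rewound = next(it)           # ...but the tuple itself still can't be changed
```

So you can move the position around, but there is still no way to add elements.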
Solution 2:
No, it is not possible to access this iterator (unless maybe with the Python C API, but that is just a guess). If you need it, assign it to a variable before the loop.
it = iter(MyObject)
for i in it:
    print(i)
    # do something with it
Keep in mind that manually advancing the iterator can raise a StopIteration exception.
for i in it:
    if check_skip_next_element(i):
        try:
            next(it)
        except StopIteration:
            break
The use of break is debatable. In this case it has the same semantics as continue, because the iterator is already exhausted, but you may just use pass if you want the rest of the loop body to run for the current element.
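As a self-contained sketch of this pattern, the following uses the two-argument form of next, which returns a default instead of raising StopIteration. The function name drop_after_marker and the "skip the element after a marker" rule are made up for illustration:

```python
def drop_after_marker(items, marker):
    """Keep every item, but skip the one that directly follows marker."""
    it = iter(items)
    kept = []
    for item in it:
        kept.append(item)
        if item == marker:
            # Advance the shared iterator by hand; the default value means
            # no StopIteration is raised if marker was the last item.
            next(it, None)
    return kept


result = drop_after_marker([1, 2, 3, 4], 2)
```

Because the for loop and the manual next call share the same iterator object, the loop simply never sees the skipped element.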
Solution 3:
If you want to insert an additional object into a loop mid-iteration in a debugger, you don't need to do it by modifying the iterator. Instead, after the end of the loop, jump to the first line of the loop body, then set the loop variable to the object you want. Here's a PDB example. With the following file:
import pdb

def f():
    pdb.set_trace()
    for i in range(5):
        print(i)
f()
I've recorded a debugging session that inserts a 15 into the loop:
> /tmp/asdf.py(5)f()
-> for i in range(5):
(Pdb) n
> /tmp/asdf.py(6)f()
-> print(i)
(Pdb) n
0
> /tmp/asdf.py(5)f()
-> for i in range(5):
(Pdb) j 6
> /tmp/asdf.py(6)f()
-> print(i)
(Pdb) i = 15
(Pdb) n
15
> /tmp/asdf.py(5)f()
-> for i in range(5):
(Pdb) n
> /tmp/asdf.py(6)f()
-> print(i)
(Pdb) n
1
> /tmp/asdf.py(5)f()
-> for i in range(5):
(Pdb) c
2
3
4
(Due to a PDB bug, you have to jump, then set the loop variable. PDB will lose the change to the loop variable if you jump immediately after setting it.)
Solution 4:
If you are not aware of the pdb debugger in Python, please give it a try. It's one of the most interactive debuggers I have come across.
I am sure we can control the loop iterations manually with pdb, but I'm not sure about altering the list midway. Give it a try.
Solution 5:
To access the iterator of a given object, you can use the iter() built-in function.
>>> it = iter(MyObject)
>>> next(it)