Preallocating Ndarrays
Solution 1:
This isn't just a question of preallocating an array, such as np.empty((100,), dtype=int). It's as much a question about how to collect a large number of lists into one structure, whether that is a list or a numpy array. The comparison with MATLAB cells is enough, in my opinion, to warrant further discussion.
I think you should be using Python lists. They can contain lists or other objects (including arrays) of varying sizes. You can easily append more items (or use extend to add multiple objects). Python has had them forever; MATLAB added cells to approximate that flexibility.
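As a minimal sketch of the list approach described above (the variable names here are made up for illustration), append adds one item at a time and extend adds every item of an iterable:

```python
# A plain list can hold items of different lengths and types.
rows = []
rows.append([1, 2, 3])          # append adds ONE item (here, a list)
rows.append([2, 3])
rows.extend([[4], [5, 6, 7]])   # extend adds each item of the iterable
rows.append("text")             # other object types are fine too

lengths = [len(r) for r in rows]
print(rows)
print(lengths)
```

Nothing has to be sized in advance; the list grows as items arrive.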
NumPy arrays with dtype=object are similar: they are arrays of pointers to objects such as lists. For the most part they are just lists with an array wrapper. You can initialize an array to some large size, and insert/set items.
A = np.empty((10,), dtype=object)
produces an array with 10 elements, each None.
A[0] = [1,2,3]
A[1] = [2,3]
...
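Put together as a runnable sketch (the extra append and the `filled` variable are only there for illustration), the preallocate-then-assign pattern looks roughly like this:

```python
import numpy as np

# Preallocate a 1-D object array; every slot starts out as None.
A = np.empty((10,), dtype=object)

# Each slot holds a reference to an arbitrary Python object.
A[0] = [1, 2, 3]
A[1] = [2, 3]

# The stored lists are still ordinary lists and can grow in place.
A[1].append(4)

# Slots that were never assigned are still None.
filled = [item for item in A if item is not None]
print(filled)
```

The array itself has a fixed length of 10; only the objects it points to can change size.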
You can also concatenate elements to an existing array, but the result is a new one. There is an np.append function, but it is just a cover for concatenate; it should not be confused with the list append.
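A small sketch of that difference, using made-up variable names: np.append hands back a new array and leaves the original alone, while list.append changes the list in place and returns None.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.append(a, 4)            # returns a NEW array; a is unchanged
c = np.concatenate([a, [4]])   # the equivalent concatenate call

lst = [1, 2, 3]
ret = lst.append(4)            # modifies lst in place and returns None

print(a, b, c)
print(lst, ret)
```

Because every np.append call copies the whole array, calling it repeatedly in a loop tends to be much slower than appending to a list.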
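If it must be an array, you can easily construct it from the list at the end. That's what your np.array([[1,2],[1],[2,3,4]]) does. (Recent NumPy releases, 1.24 and later, raise an error for ragged input like this unless you pass dtype=object.)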
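Here is a hedged sketch of converting at the end. One thing to watch: if all the sublists happen to have the same length, np.array builds a 2-D array instead of a 1-D array of lists. The `to_object_array` helper below is hypothetical, not part of NumPy; it simply fills a preallocated object array so the result is always 1-D.

```python
import numpy as np

def to_object_array(items):
    """Hypothetical helper: store each item of `items` in a 1-D object array."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item
    return out

ragged = [[1, 2], [1], [2, 3, 4]]
arr = np.array(ragged, dtype=object)   # ragged input: 1-D array of 3 lists

equal = [[1, 2], [3, 4]]
arr2d = np.array(equal, dtype=object)  # equal lengths: becomes a (2, 2) array
arr1d = to_object_array(equal)         # stays a 1-D array of 2 lists

print(arr.shape, arr2d.shape, arr1d.shape)
```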
Related question: How to add to numpy array entries of different size in a for loop (similar to Matlab's cell arrays)?
On the issue of speed, let's try some simple time tests:
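import numpy as np

def witharray(n):
    # preallocate an object array, then fill each slot
    result = np.empty((n,), dtype=object)
    for i in range(n):
        result[i] = list(range(i))
    return result

def withlist(n):
    # grow a plain list with append
    result = []
    for i in range(n):
        result.append(list(range(i)))
    return result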
which produce
In [111]: withlist(4)
Out[111]: [[], [0], [0, 1], [0, 1, 2]]
In [112]: witharray(4)
Out[112]: array([[], [0], [0, 1], [0, 1, 2]], dtype=object)
In [113]: np.array(withlist(4))
Out[113]: array([[], [0], [0, 1], [0, 1, 2]], dtype=object)
Time tests:
In [108]: timeit withlist(400)
1000 loops, best of 3: 1.87 ms per loop
In [109]: timeit witharray(400)
100 loops, best of 3: 2.13 ms per loop
In [110]: timeit np.array(withlist(400))
100 loops, best of 3: 8.95 ms per loop
Simply constructing a list of lists is fastest. But if the result must be an object-type array, then assigning values to a preallocated empty array is faster than building the list first and converting it with np.array.
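The timings above came from an IPython session. Outside IPython, a comparison along the same lines can be sketched with the standard timeit module. The repeat counts and the `list_then_array` name are arbitrary choices for this sketch, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

def withlist(n):
    result = []
    for i in range(n):
        result.append(list(range(i)))
    return result

def witharray(n):
    result = np.empty((n,), dtype=object)
    for i in range(n):
        result[i] = list(range(i))
    return result

def list_then_array(n):
    # dtype=object is passed explicitly because the sublists are ragged
    return np.array(withlist(n), dtype=object)

for func in (withlist, witharray, list_then_array):
    # best of 3 repeats, 20 calls each; report time per call
    best = min(timeit.repeat(lambda: func(400), number=20, repeat=3)) / 20
    print(f"{func.__name__}: {best * 1e3:.3f} ms per call")
```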