Python: How to efficiently implement large symmetric array with many empty entries -
i using large 2-dimensional square array entries lists of integers. not written in stone, it's w w array, w=15000, , each entry list of integers, random size 0 400. has 2 characteristics:
- it symmetric
- many of entries empty lists, including ones on diagonal
right now, taking advantage of first part , implement pseudocode:
f=[ [ [] in range(w)] j in range(w) ] # initiate blank w x w array in range(w): j in range(i): if (condition): f[i][j] = [a list of integers]
then in end, assign rest of values follows:
for in range(w): j in range(w): f[i][j]=f[i][j] # reference, size of f not increase
i not experienced programmer, , i'm feeling approach not efficient. in particular, feel not taking advantage of fact many entries empty.
can have more efficient list f such doesn't occupy space entries end being empty?
mind still important have "coordinates" x, y assigned correctly.
there variety of ways store sparse matrices, , wikipedia place start. without knowing you're planning these, nobody can tell 1 use.
for example, dictionary of keys design simple build in python:
d = collections.defaultdict(list) in range(w): j in range(w): if condition: d[i, j] = [a list of integers]
that's it. now, retrieve or change value later:
value = d[i, j] d[i, j].append(3)
but if ever want to, say, iterate of non-empty rows in order, of non-empty columns of each row in order, going terribly inefficient. naive solution loop on every , j , d[i, j]
, that's going allocate new empty list fill in every single position. instead sort keys, can slow well. keep around coordinate lists sorted in both orders, alongside dictionary—but @ point, dictionary adding overhead on sticking reference value in each coordinate list.
so, there other alternatives better different uses.
if you're willing use numpy/scipy arrays, scipy.sparse
handles 2d sparse arrays. of course they're normally arrays of numbers, can use dtype=object
of types (although mean "empty" slots have none
instead of []
, , there odd limitations on works in cases). (there's documentation somewhere (sorry, no link…) explains how sparse matrices implemented , how equivalent in pure python/numpy, use create wxwx400 sparse array of ints.)
Comments
Post a Comment