faqts : Computers : Programming : Languages : Python : Snippets : Lists

+ Search
Add Entry AlertManage Folder Edit Entry Add page to http://del.icio.us/
Did You Find This Entry Useful?

4 of 6 people (67%) answered Yes
Recently 3 of 5 people (60%) answered Yes

Entry

List size hints

Jul 5th, 2000 09:59
Nathan Wallace, unknown unknown, Hans Nowak, Snippet 34, Tim Peters


"""
Packages: maths.sieves;basic_datatypes.lists
"""

"""
> Is there any way to give Python RTE hints as the expected size of
> an array or list, or to get the reallocation to occur in larger chunks?

The only "hint" you can give is to allocate the size you want explicitly,
then keep track of the current index yourself; e.g.,
"""

mylistsize = 5
mylist = [None] * mylistsize
i = 0

"""
then manipulate i instead of appending.  Rarely helpful, though.

> C:\TEMP>c:\"program files"\python\python.exe
> Python 1.5.1 (#0, Apr 13 1998, 20:22:04) [MSC 32 bit (Intel)] on win32
> Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
> >>> import array
> >>> a = array.array('h')
> >>> a.append(2)
> >>> a.buffer_info()
> (7936480, 1)
> >>> a.append(3)
> >>> a.buffer_info()
> (7937504, 2)
> >>>

Actually, that says less than you may think <wink>:
"""

#Python 1.5.2b1 (#0, Dec 10 1998, 11:29:56) [MSC 32 bit (Intel)] on win32
import array
a = array.array('h')
adr = {}
for i in range(100000):
   a.append(12)
   adr[a.buffer_info()[0]] = 1
print len(adr)
# 134, sez Tim
# (16, or our system -- PSST)

"""
That is, while appending 100,000 elements one at a time, there were only 134
distinct start-of-buffer addresses.  In general, the bigger an array (or
list) gets, the less frequently reallocations occur.  Can hurt to grow more
than one list at a time frequently, though.

> This problem occurred to me while trying to speed up a generating
> routine that finds primes < 20k.  Almost all of the run time was
> the append().  I found a 3X speed increase by writing to a _file_
> instead of a list/array.  I found that unsettling. ;-)

So do I <frown>.  You should post the code -- something's fishy.  Also
bothered by your use of "list/array" -- they're not the same thing, and
arrays are generally slower than lists (in return for which they consume
less storage).  Finally, if it's taking more than a couple seconds to find
all the primes under 20K, what you really need is a better algorithm <0.9
wink>.  I.e., that it's slow enough that you're bothering to time it at all
is fishy too.

I'll attach a simple sieve with a timer; takes about 1/3 second to generate
the primes < 20,000 on a P5-166 under Win95 and Python 1.5.2b1.  If I
replace the list append with a file write, it's slower -- which is what I
expected.

could-be-you're-just-haunted<wink>-ly y'rs  - tim
"""

def sieve(n):
    if n < 2:
        return []
    candidate = range(n+1)
    primes = [2]
    p = 1
    while 1:
        p = p + 2
        while p <= n and candidate[p] == 0:
            p = p + 2
        if p > n:
            break
        primes.append(p)
        for pmult in xrange(p*p, n+1, 2*p):
            candidate[pmult] = 0
    return primes

def timeit(f):
    from time import clock
    for i in range(3):
        start = clock()
        x = f(20000)
        finish = clock()
        print len(x), finish - start

timeit(sieve)