

Dec 14th, 2001 13:00
Wolfgang Lipp, Michael Chermside,
"""
Here is a script that provides a simple class, cycStr, that acts just
like a string, except that when you iterate over it, you cycle through
its characters, possibly several times. The loop

    for c in cs: print c,

gives the following results for these instances:

    # five elements:
    cs = cycStr( 'ab', 5 )
    # -> a b  a b  a

    # five full cycles:
    cs = cycStr( 'ab', -5 )
    # -> a b  a b  a b  a b  a b

    # one full cycle:
    cs = cycStr( 'ab' )
    # -> a b

A special case is the infinitely long cyclic string, which results from
passing None to the constructor:

    cs = cycStr( 'ab', None )
    # -> a b  a b  a b  a b  a ...

The second argument passed to the constructor, the 'cyclic length',
therefore controls how long the sequence appears to be when it is
iterated over.

The definition of cycStr is very short; the main trick is actually put
into the module function cycle() (for which see below):

    class cycStr( str ):

        def __new__( cls, data, cyclen = NA ):      #(1)
            return str.__new__( cls, data )

        def __init__( self, data, cyclen = NA ):    #(2)
            if cyclen is NA: cyclen = len( data )
            self.cyclen = cyclen

        def __iter__( self ):                       #(3)
            return cycle( str( self ), self.cyclen )

The following points may be noted:

(1) Method __new__() has a signature like __init__(). It returns the
    result of calling __new__() of str and is called *before*
    __init__(); its first argument is not the instance (which doesn't
    exist at this point), but the class cycStr.

(2) NA in __init__() is an instance of an empty class and only serves
    as a magic value to distinguish a missing second argument from an
    explicit None. cyclen, the cyclic length, is detailed below.

(3) __iter__() is called whenever an iterator is wanted from the
    instance. Let cs be an instance of cycStr; then __iter__() will be
    called either implicitly (in a for x in cs situation) or
    explicitly (when iter(cs) is called). The __iter__() function is
    then responsible for returning an iterator; since a generator is a
    kind of iterator and cycle() is a generator, cycle() is a valid
    result here.
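Points (1) and (2) can be tried out in isolation. The following is only a
sketch in present-day Python 3, not the recipe's code: the names _MISSING
and CyclicStr are hypothetical, and the sentinel is a plain object()
rather than an instance of an empty class.

```python
# Python 3 sketch; _MISSING and CyclicStr are hypothetical names.
_MISSING = object()  # unique sentinel: tells "argument omitted" apart from None


class CyclicStr(str):
    def __new__(cls, data, cyclen=_MISSING):
        # str is immutable, so the string value has to be fixed in __new__;
        # the first argument is the class, not an instance.
        self = super().__new__(cls, data)
        # Omitted cyclen means "one full cycle"; an explicit None is kept as is.
        self.cyclen = len(data) if cyclen is _MISSING else cyclen
        return self


a = CyclicStr('ab')
b = CyclicStr('ab', None)
print(a.cyclen, b.cyclen)    # 2 None
print(a == 'ab', a.upper())  # True AB
```

Setting cyclen inside __new__() is a simplification chosen for this
sketch; the recipe itself keeps that step in __init__().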

Now, the interesting part is really the generator function. We want a
generator to iterate over v elements of sequence s; if v is greater than
the length of s, we want to start over from the first element in s
whenever we pass the last one. We start out like this:

    def cycle1( s, v ):
        count = 0
        it = iter( s )
        while v is None or count < v:
            count += 1
            yield it.next()

    s = 'abcdef'
    v = 10
    for e in cycle1( s, v ):
        print e,

This function gives us at most as many elements as there actually are in
s. What we can do, then, is to catch the StopIteration (which is raised
when count exceeds len(s) and went unnoticed, since the for-in loop
silently stops when encountering the exception) and 'rewind' the
iterator, like this:

    def cycle2( s, v ):
        count = 0
        it = iter( s )
        while v is None or count < v:
            try:
                count += 1
                yield it.next()
            except StopIteration:
                it.rewind()

This solution, however, is not possible, since iterators generally do
not have a rewind() method. It would be nice if we could retrieve the
original sequence from the iterator and build a new iterator from the
sequence, but I don't see that this is a possibility. In this concrete
situation, since we know s anyway, it's possible to get away with this:

    def cycle3( s, v ):
        count = 0
        it = iter( s )
        while v is None or count < v:
            try:
                count += 1
                yield it.next()
            except StopIteration:
                it = iter( s )

This gives us 'a b c d e f a b c': almost perfect, except for the one
missing element.
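
The off-by-one in cycle3() is easy to reproduce. The following is a
sketch in present-day Python 3 (next(it) instead of it.next()); the name
cycle3_py3 is hypothetical. Note that cycle1() does not carry over
unchanged: since Python 3.7, a StopIteration escaping from inside a
generator is turned into a RuntimeError instead of ending the loop
silently.

```python
def cycle3_py3(s, v):
    # Python 3 rendering of cycle3(); hypothetical name.
    # count is incremented *before* the fetch, so every restart of the
    # iterator uses up one count without yielding an element.
    count = 0
    it = iter(s)
    while v is None or count < v:
        try:
            count += 1
            yield next(it)
        except StopIteration:
            it = iter(s)


print(' '.join(cycle3_py3('abcdef', 10)))  # a b c d e f a b c
```

Nine elements come out instead of ten, because the seventh pass through
the loop is spent on creating the fresh iterator.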

The solution is either to count only successful calls to method next(),
or to subtract one from count in case of an exception; I think the first
solution is better:

    def cycle4( s, v ):
        count = 0
        it = iter( s )
        while v is None or count < v:
            try:
                R = it.next()
                count += 1
                yield R
            except StopIteration:
                it = iter( s )

This, in fact, gives 'a b c d e f a b c d', ten elements, when called
with

    for e in cycle4( s, 10 )

The actual code used here adds only two small refinements: adding
UnboundLocalError to the exceptions caught means we can have a single
line that defines our local variable it, and the call to cyclen2virlen()
means we can pass the more general concept of a cyclic length instead of
a virtual length to the generator. The transition from the cyclic length
cl of a sequence s to its virtual length vl, and the relationship to the
real length rl (the result of calling len(sequence)) of a cyclic
sequence, is defined as follows:

- If a sequence's cl is negative, then its vl equals minus cl times the
  sequence's rl, resulting in so and so many cycles over the *entire*
  sequence.

- If a sequence's cl is positive or zero, then its vl equals its cl,
  resulting in a cycle over so and so many *elements* (starting over
  from the first element when passing the last (real) element).

- If a sequence's cl is None, then its vl is interpreted as being
  indefinitely large, yielding the same result as a (practically
  impossible) infinitely large positive or negative cl.
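
The three rules above can be sketched compactly in present-day Python 3.
This is not the recipe's implementation: it swaps the try/except loop for
itertools.cycle() and itertools.islice() from the standard library, and
the names virtual_length and cycle_py3 are hypothetical.

```python
from itertools import cycle as endless, islice


def virtual_length(seq, cyclen=-1):
    # Negative cyclic length: that many full cycles over the real length.
    if cyclen is not None and cyclen < 0:
        return -cyclen * len(seq)
    # Zero or positive: that many elements. None: no limit at all.
    return cyclen


def cycle_py3(seq, cyclen=-1):
    # islice() with a stop of None leaves the endless iterator unbounded.
    return islice(endless(seq), virtual_length(seq, cyclen))


print(''.join(cycle_py3('abc', -2)))  # abcabc
print(''.join(cycle_py3('abc', 5)))   # abcab
```

One difference in behaviour is worth knowing: itertools.cycle() over an
empty sequence simply yields nothing, so this sketch terminates even when
a positive or None cyclic length is combined with an empty sequence.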

This is the conversion done by cyclen2virlen():

    cyclen2virlen('abc',-5)   -> 15    # 5 full cycles over 5 * 3 == 15 elements
    cyclen2virlen('abc')      -> 3     # real length of one cycle of 'abc'
    cyclen2virlen('abc',5)    -> 5     # 5 elements
    cyclen2virlen('abc',None) -> None  # symbol for infinitely many cycles of 'abc'

"""

# from __future__ import nested_scopes
from __future__ import generators
# from __future__ import division

def cyclen2virlen( seq, cyclen = -1 ):
    """Compute 'virtual length' (length of a sequence in terms of
    its elements) from 'cyclic length', and return it."""
    if cyclen is not None and cyclen < 0:
        return -cyclen * len( seq )
    return cyclen

def cycle( seq, cyclen = -1 ):
    """Generator that, given a sequence and cyclic length, returns
    the next element from seq, starting over from the sequence
    start when having passed beyond its end, until sequence is
    exhausted (if ever)."""
    # Start count with zero elements;
    # convert cyclic length to virtual length:
    count = 0
    virlen = cyclen2virlen( seq, cyclen )
    while virlen is None or count < virlen:
        try:
            # Try to fetch next element from sequence; if successful,
            # count that element, and yield it:
            R = it.next()
            count += 1
            yield R
        except ( UnboundLocalError, StopIteration ):
            # Create a fresh iterator whenever iterator is nonexistent
            # or we have run off the end of sequence:
            it = iter( seq )

# Helper class for default arguments:
class NA: pass
NA = NA()

class cycStr( str ):
    """Cyclic string class that has +cycle()+ as its iterator.
    Creating an instance and omitting explicit cyclic length
    +cyclen+ is equal to creating it with the real length of
    the sequence as second argument."""

    def __init__( self, data, cyclen = NA ):
        if cyclen is NA: cyclen = len( data )
        self.cyclen = cyclen
        # # Next line necessary???
        # str.__init__( self, data )

    def __new__( cls, data, cyclen = NA ):
        return str.__new__( cls, data )

    def __iter__( self ):
        return cycle( str( self ), self.cyclen )

if __name__ == '__main__':

    def show( instance ):
        """This would be a simple for-c-in-instance loop most of
        the time, but we want to catch infinitely long cyclic
        things here."""
        print '-' * 25
        print '~.cyclen == %s' % instance.cyclen
        it = iter( instance )
        maxcount = len( instance ) * 500
        count = 0
        while 1:
            try:
                assert count < maxcount
                print it.next(),
                count += 1
            except StopIteration:
                break
            except AssertionError:
                print '... and so on and on...'
                break
        print

    show( cycStr( 'ab', 5 ) )
    show( cycStr( 'ab', -5 ) )
    show( cycStr( 'ab' ) )
    show( cycStr( 'ab', None ) )

    s = cycStr( 'foo', None )