pureSalsa20.pySalsa20 in pure Python 2.5
version p3.2
|
Daniel J. Bernstein's Salsa20, is a provisionally well respected stream cipher. Larry Bugbee's pySalsa20 C extension wrapper has been around for a while, but no pure Python version, despite occasional forlorn calls on the net.
I wrote pureSalsa20 for a program that doesn't need raw speed, but which I want to distribute in such a way that users don't have to compile C code or install a binary library.
Raw speed pureSalsa20 has not got. On my old G3 Mac, it can encrypt around 24KB/sec., and this is pureSalsa20's niche: when you need portability but not speed or real security. More about speed below. Meanwhile,
from pureSalsa20 import Salsa20at the top of your program, and either have pureSalsa20.py in the current directory when you run it, or pre-install pureSalsa20 in your Python's standard location with the (Linux or Mac OS) command
sudo python setup.py installpureSalsa20 follows the API of Bugbee's pySalsa20, and I included Bugbee's use notes wholesale in the comments section of pureSalsa20.py. There are also comments there about my implementation and about testSalsa20.py .
If you're a Python geek, you might like to try your hand at the problem by writing three functions--
The crazy (but of course perfectly sane) thing is the sign bit. Python has no unsigned ints. The top bit of a 32-bit int is not 2**31 or 1<<31, but -(1<<31), or -0x80000000. Here's trunc32() from my code:
def trunc32( w ): """ Return the bottom 32 bits of w as a Python int. This may create a long temporarily, but returns an int. """ w = int( ( w & 0x7fffFFFF ) | -( w & 0x80000000 ) ) assert type(w) == int return wtrunc32() creates a couple longs temporarily, but I only use it in non- timing-critical testing, so I didn't avoid "longing". But I did avoid it in these two:
def add32( a, b ): """ Add two 32-bit words discarding carry above 32nd bit, and without creating a Python long. Timing shouldn't vary. """ lo = ( a & 0xFFFF ) + ( b & 0xFFFF ) hi = ( a >> 16 ) + ( b >> 16 ) + ( lo >> 16 ) return ( -(hi & 0x8000) | ( hi & 0x7FFF ) ) << 16 | ( lo & 0xFFFF ) def rot32( w, nLeft ): """ Rotate 32-bit word left by nLeft or right by -nLeft without creating a Python long. Timing depends on nLeft but not on w. """ nLeft &= 31 # which makes nLeft >= 0 if nLeft == 0: return w # Note: now 1 <= nLeft <= 31. # RRRsLLLLLL There are nLeft RRR's, (31-nLeft) LLLLLL's, # => sLLLLLLRRR and one s which becomes the sign bit. RRR = ( ( ( w >> 1 ) & 0x7fffFFFF ) >> ( 31 - nLeft ) ) sLLLLLL = -( (1<<(31-nLeft)) & w ) | (0x7fffFFFF>>nLeft) & w return RRR | ( sLLLLLL << nLeft )(This rot32() deals with nLeft < 0 or nLeft > 31 by taking nLeft mod 32, which is more than my spec above, but it converts the problem to that problem.)
All in all, it's like Homer Simpson juggling nuclear waste in a glove box.... in slow motion.
p3.2 Fixed bug that initialized the output buffer with plaintext! Saner ramping of nreps in speed test. Minor changes and print statements. p3.1 Took timing variability out of add32() and rot32(). Made the internals more like pySalsa20/libsalsa . Put the semicolons back in the main loop! In encryptBytes(), modify a byte array instead of appending. Used subclasses instead of patches in testSalsa20.py . Fixed speed calculation bug. Added 64k-byte messages to speed test to be fair to pySalsa20. p3 First version, intended to parallel pySalsa20 version 3.
--Steve Witham
Up to my home page.