Friday, February 25, 2011

A scientific Python starter

As requested by some, here is a list of important websites, doc-sites and modules for a successful start in Python.

1. Getting Python
If there's one problem with Python, it is that some of the available modules depend on specific versions of other modules. I therefore wholeheartedly recommend installing one of the big distributions that bundle all the modules you need in a single package.
I personally have had good experiences with http://www.enthought.com. They provide academic licenses for free and also offer the 64-bit version for free on personal email request (they did for me at least).
There are other distributions like PythonXY, but I think Enthought is the only one that provides a package for Mac, Windows AND Linux.

2. First steps
The tutorial on http://docs.python.org is really excellent! When I worked through it years ago, I continued the next day with Python GUI programming tutorials and was coding my own graphical user interfaces with Python within 1 day! The clarity of Python enables this, I believe. I have never learned a language that was this easy to pick up (and I have used BASIC, C, C++, FORTRAN77, Java and IDL).

Read the tutorial up to and including section 5 (Data Structures; up to here is a MUST!). After that you can move on to the scientific tutorials for now, but if you feel puzzled sometime later, you really should continue this tutorial at least through section 9 to see what classes are all about and, importantly, how to read files (section 7).

3. A warning for IDL switchers:
Before we go on to Sciency stuff in Python, a warning:

If you do this in IDL:
a = [1,2,3]
b = a

then you have a new array b, copied from a, and you can do what you want with it without affecting a.
This has advantages for ease of use, but it can make your IDL code slow very quickly, because after a while you carry your data around multiple times.
As Python is a real programming language, it tries to be memory efficient, and it does so by avoiding copies unless they are explicitly asked for.
So in Python when you do:

a = [1,2,3]
b = a

then you don't get a copy but a second name for the same list. So if you do b[0]=4, 'a' has changed as well! (Try it out!)
So what do you do if you really want a copy that leaves the original untouched?
Well, you ask for a copy:
b = list(a)   # for a numpy array, use b = a.copy()
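
The behaviour described above can be checked with a few lines. This is only a minimal sketch with plain Python lists; the names c, nested, shallow and deep are made up for the illustration. It also shows one thing the text does not mention: a simple copy is shallow, so for lists that contain other lists copy.deepcopy from the standard library may be what you want.

```python
import copy

a = [1, 2, 3]
b = a              # no copy: b is a second name for the same list
b[0] = 4
print(a)           # a has changed as well
print(b is a)      # both names point to the same object

c = list(a)        # explicit (shallow) copy
c[0] = 99
print(a)           # a is unaffected by changes to c

# A shallow copy still shares the inner lists.
nested = [[1, 2], [3, 4]]
shallow = list(nested)
deep = copy.deepcopy(nested)
shallow[0][0] = 'x'
print(nested)      # the inner list was shared with 'shallow'
print(deep)        # the deep copy is fully independent
```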

4. Numpy and Scipy
Now on to the most important Python modules for scientists, called 'numpy' and 'scipy'.
Scipy uses Numpy so they are closely linked.

The documentation for both can be found here:
http://docs.scipy.org/doc/ or just start browsing around at http://www.scipy.org, it's very interesting.
I recommend starting with the Scipy Reference Guide, the last document linked on the docs page.
Why? Because it has a tutorial for Numpy and Scipy and also introduces you to the plotting module 'matplotlib' at the same time! So it is really worth reading.

Before I leave you alone in your Python adventures (I put all the links together again at the end), one more important comment on a common source of confusion for beginners:
There is an important difference between numpy arrays (which look a lot like lists) and the original Python lists.
Python lists can take anything, so a list like this is possible:

myList = ['aString', 3.1415, (atuple, atuple)]

but the Python interpreter needs some time to deal with all these different things, so if you want efficient arrays that only hold elements of one type, you need numpy arrays.
So one important difference is the type of the elements (many types in Python lists, only one in a numpy array); the other is how they react to mathematical operations.

For Python lists it can be quite handy that it is possible to do:
[3,4]*3
to get
[3,4,3,4,3,4]

For writing text files in certain formats this is quite useful.
But for scientific calculations that doesn't make any sense, of course. That's why numpy arrays do exactly what you expect here:

import numpy as np
a = np.array([3,4])
print(a*3)

and you get
[ 9 12]

What you also saw here is that the np.array function can turn normal Python lists into numpy arrays without problems (that also works the other way around, in case you need it).
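
Here is a small sketch of the two differences described above, assuming numpy is installed and imported as np. The variable names are only for illustration. The tolist() method is one way to go back from a numpy array to a plain Python list.

```python
import numpy as np

py_list = [3, 4]
arr = np.array(py_list)         # list -> numpy array

repeated = py_list * 3          # list: the content is repeated
scaled = arr * 3                # array: every element is multiplied
back = scaled.tolist()          # numpy array -> plain Python list

print(repeated)
print(back)

# A numpy array holds a single element type: mixing an int and a float
# gives an array of floats.
mixed = np.array([1, 2.5])
print(mixed.dtype)
```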

Ok, have fun in Python and don't be shy about asking questions. The Python community is very helpful, I think because everybody is so happy about it. ;)

Here are the promised links of all important Python websites for scientists:

http://www.enthought.com
(to get a full scientific Python environment, that runs exactly the same on Win, Mac and Linux)
http://docs.python.org
(Overview of only the core Python stuff itself, home of the Tutorial !)
http://www.scipy.org
(the home of Scientific Python, very nice to browse through...)
http://docs.scipy.org/doc/
(the docs page of scipy)
http://matplotlib.sourceforge.net/
(The home of the matplotlib Python module, a very powerful plotting library. I always look at the gallery to find what I need, and one gets the example code for each graph! Very helpful)

Maybe one last tip: The enthought environment installs an Examples folder as well, so you get a lot of example code installed on your computer, for the times when you are not online!

Now, enjoy!
I was getting a lot of MDS related errors in my system logfile (visible via the Console.app in Mac).
Like these:

Feb 25 12:42:00 paradigm /System/Library/Frameworks/SecurityFoundation.framework/Versions/A/dotmacfx.app/Contents/MacOS/dotmacfx[89682]: MDS Error: unable to create user DBs in /var/folders/PO/PO-GUydDF4mpVHTQxsRatE+++TI/-Caches-//mds
Feb 25 12:42:02 paradigm /System/Library/Frameworks/SecurityFoundation.framework/Versions/A/dotmacfx.app/Contents/MacOS/dotmacfx[89699]: MDS Error: unable to create user DBs in /var/folders/PO/PO-GUydDF4mpVHTQxsRatE+++TI/-Caches-//mds
Feb 25 12:42:34 paradigm /System/Library/PrivateFrameworks/GoogleContactSync.framework/Versions/A/Resources/gconsync[89761]: MDS Error: unable to create user DBs in /var/folders/PO/PO-GUydDF4mpVHTQxsRatE+++TI/-Caches-//mds
Feb 25 12:55:34 paradigm /System/Library/Frameworks/PubSub.framework/Versions/A/Resources/PubSubAgent.app/Contents/MacOS/PubSubAgent[91445]: MDS Error: unable to create user DBs in /var/folders/PO/PO-GUydDF4mpVHTQxsRatE+++TI/-Caches-//mds
Feb 25 12:59:30 paradigm /Applications/Safari.app/Contents/Safari Webpage Preview Fetcher[91954]: MDS Error: unable to create user DBs in /var/folders/PO/PO-GUydDF4mpVHTQxsRatE+++TI/-Caches-//mds

and many more.
I thought that can't be good for performance, so I dug around a bit and found that the caches of the kernel modules can become a bit problematic after security updates from Apple.
But there is a very simple remedy: boot in safe mode, because then the caches are automatically deleted and rebuilt at the next boot.
To do that, hold down the SHIFT key while rebooting. Once you are fully booted into safe mode, just restart again and you will see (if you are as sensitive as I am to the snappiness of a GUI) that things move a tad snappier now, like when the Mac was still fresh. ;)

Good luck!

Find out on which computer you are 'python-ing'

Often, data is stored at different locations, depending on which computer the code is running on.
If the computers run different operating systems, one can tell them apart with the sys module like this:


import sys
if sys.platform == 'darwin':
    base = '/Users/aye/Data/hirise/'
else:
    base = '/processed_data/'
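
Since sys.platform only reports the operating system, two machines running the same OS look identical to it. A possible variation, not part of the original snippet, is to look at the host name with socket.gethostname() first and fall back to the platform check. The host names in the table and the function name data_base are hypothetical, so adapt them to your own machines.

```python
import socket
import sys

# Hypothetical host names mapped to their data directories.
HOST_DIRS = {
    'my-laptop': '/Users/aye/Data/hirise/',
    'cluster-node': '/processed_data/',
}


def data_base(hostname=None, platform=None):
    """Return the data directory for this machine.

    Both arguments default to the values of the running system;
    they can be passed explicitly for testing.
    """
    hostname = hostname or socket.gethostname()
    platform = platform or sys.platform
    short_name = hostname.split('.')[0]   # drop any domain part
    if short_name in HOST_DIRS:
        return HOST_DIRS[short_name]
    # Unknown host: fall back to the operating system check.
    if platform == 'darwin':
        return '/Users/aye/Data/hirise/'
    return '/processed_data/'


base = data_base()
print(base)
```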

Thursday, February 24, 2011

Great Python resource!

This site digs through many useful parts of the Python standard library.

http://www.doughellmann.com/PyMOTW/contents.html

Python lists as function parameters

I often stumble over something like this:
A function needs 2 parameters, like
myfunction(a,b)
and I have them, but inside a list: myList = [a,b].
Sure, I could write
myfunction(myList[0],myList[1]),
but where's the famous Python beauty in this, right?
Then I realised that this is what the * syntax for argument unpacking is for, which I had never used before.
Damn, all those typing hours lost when I typed out the explicit unpacking of lists!!!
Now I can just do
myfunction(*myList)
and all is beautiful again! ;)
Thank you, Python
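
A short sketch of the unpacking trick, with a made-up body for myfunction so that the result can be checked. It also shows two related things not mentioned above: a dictionary can be unpacked into keyword arguments with **, and a function can collect any number of positional arguments with *args. The names myDict and collect are only for illustration.

```python
def myfunction(a, b):
    # Made-up body: the order of the arguments matters here.
    return a - b


myList = [10, 3]
from_list = myfunction(*myList)     # same as myfunction(10, 3)

myDict = {'b': 10, 'a': 3}
from_dict = myfunction(**myDict)    # same as myfunction(a=3, b=10)


def collect(*args):
    # Inside the function, args is a tuple of all positional arguments.
    return args


collected = collect(*myList)

print(from_list)
print(from_dict)
print(collected)
```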

Tuesday, February 15, 2011

The German "service desert", now in Switzerland too!

It's a pity that the German service desert has already migrated to Switzerland:
We ordered a credit card at the PostFinance counter.
The card arrives with the wrong name, so we go back to the counter to get it changed:
"No, you have to sort that out with the credit card company now, we have no influence over that at all."
Then why am I paying for a PostFinance credit card?
Of course nobody pointed out when we signed the contract that any problems would have to be solved somewhere else. Really sad...