Thursday, November 14, 2013

Fernando Perez: An ambitious experiment in Data Science takes off:...

Today, during a White House OSTP event combining government, academia and industry, the Gordon and Betty Moore Foundation and the Alfred P...

Wednesday, September 4, 2013

Those funny offset numbers in matplotlib

If you are a heavy matplotlib user, you are bound to have seen the funny offset numbers in the top left of the plot window:

They are obviously there to help the viewer focus on the level where the numbers are really changing, removing the area where there's no change happening.

But I claim that, due to pattern recognition, there are quite a few cases where this confuses more than it helps. In this example I (and the people on my team) are used to seeing 5-digit numbers, and it takes quite some time to figure out that these are indeed 5-digit numbers.

Therefore I researched how to switch this behavior off.


First, one imports the ScalarFormatter class from the matplotlib.ticker module:

from matplotlib.ticker import ScalarFormatter
Then, one creates a formatter object with the use of offset numbers switched off:

y_formatter = ScalarFormatter(useOffset=False)

Finally, you apply it to an axes object that you either receive via the fig.add_subplot() command, via plt.gca() (short for "get current axes") or you catch it when it is being returned after a plot command:

ax.yaxis.set_major_formatter(y_formatter)
There you go, hope this helps someone.
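Put together, the three steps can be sketched as a short standalone script. The sample data, the use of plt.subplots() to obtain the axes, and the non-interactive Agg backend are assumptions made for this illustration and are not part of the original recipe.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter

# Hypothetical sample data: 5-digit values that differ only in the decimals,
# the kind of data that typically triggers the offset notation.
x = [0, 1, 2, 3, 4]
y = [10000.1, 10000.2, 10000.3, 10000.4, 10000.5]

fig, ax = plt.subplots()
ax.plot(x, y)

# Formatter with the offset switched off, applied to the y axis.
y_formatter = ScalarFormatter(useOffset=False)
ax.yaxis.set_major_formatter(y_formatter)

fig.canvas.draw()  # force the tick labels to be computed
```

The same formatter can be applied to ax.xaxis if the x values show the offset as well.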

Here is the Stack Overflow question that helped me find the solution.

Update (2013-10-20) :

An easier way is to catch the axis object from the plot command and apply the following command:
ax.ticklabel_format(useOffset=False)
I opened a GitHub issue to have this included in matplotlib, which has already been answered with a solution, so this will be configurable in the future, yay!

Update 2, same day:
Weird, I thought I had the above shortcut working at some point, but now it doesn't. If anyone knows under which circumstances this works and when it does not, please comment.
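One circumstance in which the shortcut is known to fail: ticklabel_format only works when the axis uses a ScalarFormatter, and matplotlib documents that it raises an AttributeError otherwise. Whether this is what happened in the case above is an assumption; the sketch below only demonstrates the difference, using a FuncFormatter as a stand-in for any non-default formatter. It is also worth checking what the plot command returns: plt.plot returns a list of line objects rather than an axes, whereas some wrappers, for example the pandas plot method, return the axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

data = [10000.1, 10000.2, 10000.3]  # hypothetical sample data

# Case 1: default formatter (a ScalarFormatter), the shortcut works.
fig1, ax1 = plt.subplots()
ax1.plot(data)
ax1.ticklabel_format(useOffset=False)

# Case 2: a different formatter is installed, the shortcut raises.
fig2, ax2 = plt.subplots()
ax2.plot(data)
ax2.yaxis.set_major_formatter(FuncFormatter(lambda value, pos: "%.1f" % value))
try:
    ax2.ticklabel_format(axis="y", useOffset=False)
    raised = False
except AttributeError as err:
    raised = True
    print("ticklabel_format failed:", err)
```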

Wednesday, June 5, 2013

polyfit

A follow-up to the previous post.

Polynomial fitting is also very easy with numpy's polyfit and poly1d.


In [196]: x = range(100)
In [197]: y = randn(100)
In [198]: plot(x,y)
Out[198]: []
Here I am asking polyfit to fit me a 2nd degree polynomial.
In [199]: polyfit(x,y,2)
Out[199]: array([-0.00018313,  0.01669275, -0.09621319])
The polyfit function returns the polynomial coefficients as an array, highest power first.
If I want to use them directly as a fit function, I just wrap them in a new polynomial object:
In [200]: fitfunc = poly1d(polyfit(x,y,2))
In [201]: plot(x,fitfunc(x))
Out[201]: []
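Outside of a pylab session the names above need explicit imports. A minimal standalone sketch of the same steps follows; the seeded random generator, the Agg backend, and the variable names are assumptions for the sake of a reproducible example, so the coefficients will differ from the ones shown above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(0)  # fixed seed, an assumption for reproducibility
x = np.arange(100)
y = rng.randn(100)

coeffs = np.polyfit(x, y, 2)    # 2nd degree: three coefficients, highest power first
fitfunc = np.poly1d(coeffs)     # wrap them in a callable polynomial

fig, ax = plt.subplots()
ax.plot(x, y)                   # the noisy data
ax.plot(x, fitfunc(x))          # the fitted curve on top
```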
The plot can be saved like this
In [202]: savefig('/Users/maye/Desktop/blog_polyfit.png')
and looks like this:


Friday, May 31, 2013

Polynomials with Python

Seriously, can it be any easier? ;)

If you are not in a pylab session, import the module like this:
In [148]: from numpy import poly1d
In a pylab session no import is needed, because poly1d is already available.
Now let's get a polynomial for the coefficients of [3,2,1] (always in decreasing order!):
In [149]: p = poly1d([3,2,1])
Printing it provides a semi-analytical printout:
In [150]: print(p)
   2
3 x + 2 x + 1
Applying new x values to it is easy, because the poly1d object is a function:
In [152]: newx = linspace(0,10,10)
In [153]: p(newx)
Out[153]:
array([   1.        ,    6.92592593,   20.25925926,   41.        ,
         69.14814815,  104.7037037 ,  147.66666667,  198.03703704,
        255.81481481,  321.        ])
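What calling p(x) computes can be sketched in plain Python with Horner's scheme, which walks through the coefficients in the same decreasing order. The function name horner is hypothetical, and this is not claimed to be how numpy implements poly1d; it merely reproduces the arithmetic.

```python
def horner(coeffs, x):
    """Evaluate a polynomial whose coefficients are given highest power first."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

# Same polynomial as above: 3 x**2 + 2 x + 1
values = [horner([3, 2, 1], x) for x in (0.0, 1.0, 10.0)]
print(values)  # [1.0, 6.0, 321.0]
```

The first and last values match the first and last entries of the array above, since linspace(0,10,10) starts at 0 and ends at 10.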
Lots of other things are possible with this object. IPython's object inspection makes it easy to discover them:
In [154]: p.
p.coeffs    p.deriv     p.integ     p.order     p.variable

In [155]: p.deriv()
Out[155]: poly1d([6, 2])
In [156]: pderiv = p.deriv()
In [157]: print(pderiv)
6 x + 2
Roots of this polynomial can be determined either with the roots function, which is imported in a pylab session (or importable with from numpy import roots), or via the r attribute of the polynomial object:
In [158]: roots(p)
Out[158]: array([-0.33333333+0.47140452j, -0.33333333-0.47140452j])
In [159]: p.r
Out[159]: array([-0.33333333+0.47140452j, -0.33333333-0.47140452j])
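The other attributes listed by the tab completion can be tried out the same way. The following script is a small sketch of a few of them outside of a pylab session, with variable names chosen here for illustration.

```python
import numpy as np

p = np.poly1d([3, 2, 1])

print(p.order)          # degree of the polynomial
pinteg = p.integ()      # antiderivative, integration constant 0 by default
print(pinteg.coeffs)    # coefficients of x**3 + x**2 + x

# Differentiating the antiderivative gives back the original polynomial.
roundtrip = pinteg.deriv()

# Both ways of getting the roots agree.
same_roots = np.allclose(np.sort_complex(np.roots(p)), np.sort_complex(p.r))
```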




PS: One of these days I really have to find out how to do code highlighting in Blogger, or, preferably, go all the way and do IPython notebook posts.