Okay, here is a book I've been trying to read for almost two months now, and I've barely made it halfway through. Python for Data Analysis has left me with mixed feelings. Of all the O'Reilly books I own, this is probably the one in the worst shape.
It seems that the author (Wes McKinney, the author of the Pandas library -- a great guy, no doubt, who obviously possesses extensive technical knowledge) did not spend much time trying to make the book more digestible. The Pandas library is great, kudos to him! The book, however, seems to be pretty half-assed work.
At first I was unsure whether my difficulty reading it was mostly my own failing. But then I talked to some friends who had also read some of it, and they agreed that the book is a very dry read.
Maybe I am trying to read the book in a suboptimal way (from cover to cover), but the effort needed to grasp each little subject is making me cringe.
I buy books about technologies because I want a better way to learn them than having to go through all the documentation. However, reading this book feels much harder than following the online documentation of the tools it covers.
Now, the online docs for Pandas have syntax highlighting, hyperlinks and even a 10-minute tutorial. Maybe that says more about the merits of the online docs than about the shortcomings of the book, I'm not sure.
It seems that the book's content was written as a bunch of IPython notebooks, and then everything was later concatenated into one big document in an order that sort of made sense (losing the good looks along the way). Its examples are a bunch of throwaway code, and they are not presented in a way that makes the reader care about them -- and believe me, I've been trying.
I guess when you are an expert in a subject, it can be hard to remember what it is like to be a beginner again. However, that's precisely the exercise that would make for a great book. I urge all tech authors to read and apply the principles explained in the Crash Course in Learning Theory. This would surely give us all better books.
The book is not all bad. I've certainly learned something from it. The effort to read all of it is just not worth it, though. For anyone who wants to learn more about NumPy, SciPy and Pandas, my recommendation is to use the online documentation. Thank you, Zé Ricardo, for revising this text.