2011-01-30

Book: Programming Collective Intelligence by Toby Segaran

Book: Programming Collective Intelligence by Toby Segaran (2007, 368 pages)
First I should say that I have always been interested in machine learning. I believe that a lot of information can emerge from this technology, so I am always eager to learn about statistics, open data and machine learning. I would like to have more time to dedicate to this field.
So about this book: well, first, the examples are in Python. Obviously, I am more of a "Java guy", if that makes any sense. But I am glad that the subject forced me into reading Python code. Now I understand better why Python is used in domains like biology, genetics and data manipulation. Python is really not only about indentation! It is great at manipulating data structures: multi-dimensional arrays, maps ... Short and powerful. But even if Python is great, I felt that there could have been more diagrams and pictures, just to give some relief from certain code-intensive sections, especially those dealing with text parsing and word counting.
A last word about the code: the focus is not on optimizing code, but there is advice and discussion on which algorithm suits a specific use case. Still, I would be eager to read another volume on the subject, especially one about concurrency, Scala and GridGain. And even more algorithms! There is a lot left to cover, especially time series, streams ...
The use cases are well chosen and interesting, and they allow the author to introduce each algorithm and its limitations as he moves on to another use case which requires another algorithm. The algorithms are "classical", but he covers a wide range, from the Bayesian filter to SVM and even genetic programming. He also avoids the "recipe collection" approach: he outlines the constant principles of optimisation, for example.
The subtitle "Building Smart Web 2.0 Applications" is very limiting. But maybe having "Web 2.0" in the title is required to sell a decent number of books. The range of applications and domains covered is far larger. Incidentally, the author works at a biology company!
It is rare that I read a book, feel like it covers a lot, and still want to know even more! Obviously he makes the subject interesting. There is a lot of data available just waiting to be mined. I followed the recent "Strata" thread from O'Reilly with great expectations for "Data Journalism".
