Data science for data-driven startups

Tag

data mining

About evaluation

When deploying a model, one very important thing is to monitor the results. Does it work as you expected? I’m not talking about pre-production […]

Machine learning vs simulation

Lately, I have been thinking about the difference between machine learning and simulation (for prediction). Machine learning uses historical inputs and outputs to predict subsequent outputs. […]

INFORMS Data Mining Contest Part 1

A new data mining contest is available here. The functional domain is medical; more precisely, there are two tasks. First, we need to predict if […]

Book review : Programming Collective Intelligence

Programming Collective Intelligence is a great book. It covers most of the existing data mining algorithms and presents many applications for them. It covers clustering […]

How to : What to do when your model fails?

Sometimes (well, most of the time), using your favorite data mining methods and the most obvious attributes is not good enough. What to do then? […]

Data mining tools

When it comes to data mining, the tool you use is very important. It seems that people use many software packages (see How many software packages […]

A Twitter users segmentation

Now it’s time to create some clusters from our Twitter data. In this post, we focus only on biographical tags and we use the old […]

When is a token a tag?

After some initial statistics about the Twitter dataset, I try to go further. In this post, I’ve discussed how to extract tokens from Twitter, more […]

Book review : Competing on analytics

Competing on Analytics: The New Science of Winning is an interesting book, even if it doesn’t explain anything about how to do real analytics. […]

Book review : Collective Intelligence in Action

The last book I read was Collective Intelligence in Action by Satnam Alag (Manning). It covers data mining from a Web 2.0 perspective. Data […]