Data science for data-driven startups

Supercharged Excel for startup analytics with PowerBI

You think Excel is not suitable for analytics? Let me convince you that Excel can be your best tool with PowerBI.

Read more...

Apache Spark on Windows without winutils.exe

To use Spark on Windows, you need to install winutils.exe and change some environment variables. Here is a nice fix.

Read more...

PostgreSQL for data science : pros and cons

Is PostgreSQL a good companion for a data scientist at a startup? At which maturity stage should it be used? Let’s find out!

Read more...

Hadoop landscape review 2013

I’ve spent some time lately digging into the Hadoop ecosystem, both through a product survey and some hands-on work. Here are some remarks about […]

Read more...

Data Manipulation Part 2 : ETL

My last post discussed SQL queries. Sometimes, however, data comes from different databases. In such cases, it is no longer possible to use SQL. […]

Read more...

Data Manipulation Part 1 : SQL

Data manipulation is a big part of a data mining process. Some authors claim it can take 80% of a data mining project. I could […]

Read more...

Data mining tools

When it comes to data mining, the tool you use is very important. It seems that people use many software packages (see How many software packages […]

Read more...

Using MySQL as a Data Warehouse

PS : This post is quite old now and isn’t relevant anymore. MySQL 5.6 introduced hash join, which basically makes it more suitable for a data […]

Read more...