Data science for data-driven startups


Hadoop is dead thanks to EMC, long live Hadoop

Today EMC announced the launch of Pivotal HD, their new version of Greenplum HD. Most of the underlying details are not known and neither the […]

Read more...

Price deal analysis

Price deals are a common way to boost sales. Steam has turned them into an art. There is always a lot of stuff to buy at […]

Read more...

Data Brewery - Manage your data warehouse

Open-source ETL tool for data warehouses: agile and SQL-based

Find more...

The call for a Modular Data Warehouse

In these days of intense focus on the Big Data movement, it seems that nobody needs a data warehouse anymore, just a huge cloud of […]

Read more...

Fast creation of surrogate keys in Greenplum

Usually we use a sequence to generate unique identifiers for surrogate keys. A sequence is simply a database object that returns a number every time you […]

Read more...

Big data and mobile BI: new hype but same old issue

For the end of 2011, many around the blogosphere are forecasting what will be hyped next year. I often read that big data and […]

Read more...

Book review: Marketing calculator

Measuring and managing return on marketing investment: that’s the promise of the book by Guy R. Powell. A famous quote in marketing is: Half […]

Read more...

Data Manipulation Part 2: ETL

My last post discussed SQL queries. Sometimes, however, data comes from different databases. In such cases, it is no longer possible to use SQL. […]

Read more...

Data Manipulation Part 1: SQL

Data manipulation is a big part of a data mining process. Some authors claim it can take 80% of a data mining project. I could […]

Read more...

About evaluation

When deploying a model, one very important thing is to monitor the results. Does it work as you expected? I’m not talking about pre-production […]

Read more...

Machine learning vs simulation

Lately, I have been thinking about the difference between machine learning and simulation (for prediction). Machine learning uses historical inputs and outputs to predict subsequent outputs. […]

Read more...
