
9 Data Mining Challenges from Data Scientists Like You

1. Poor quality data

• Dirty data
• Missing values
• Inadequate data size
• Poor representation in data sampling
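A quick audit can surface several of these problems before any modeling starts. The sketch below is only an illustration, not a prescribed method: the function name `quality_report`, the sample records, and the `min_rows` threshold of 30 are all assumptions made up for this example. It counts missing values per field, exact duplicate rows, and flags a dataset that looks too small.

```python
def quality_report(rows, fields, min_rows=30):
    """Summarize basic data quality issues in a list of dict records.

    min_rows is an arbitrary illustrative threshold, not a standard.
    """
    missing = {field: 0 for field in fields}
    seen = set()
    duplicates = 0
    for row in rows:
        for field in fields:
            value = row.get(field)
            # Treat None and blank strings as missing values.
            if value is None or (isinstance(value, str) and value.strip() == ""):
                missing[field] += 1
        # A row is a duplicate if the same field values were already seen.
        key = tuple(row.get(field) for field in fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {
        "rows": len(rows),
        "missing": missing,
        "duplicates": duplicates,
        "too_small": len(rows) < min_rows,
    }


# Hypothetical sample records, for illustration only.
sample = [
    {"age": 34, "city": "Oslo"},
    {"age": None, "city": "Oslo"},
    {"age": 34, "city": "Oslo"},
    {"age": 51, "city": " "},
]
report = quality_report(sample, ["age", "city"])
print(report)
```

Poor representation in sampling is harder to detect automatically; it usually means comparing the sample's distribution against what is known about the population.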

2. Lack of understanding

Lack of understanding, and limited diffusion, of data mining techniques in academic arenas

3. Lack of good literature

Lack of good literature on important data mining topics and techniques

4. (Academic) access to commercial-grade software

Academic institutions have trouble accessing commercial-grade software at a reasonable cost.

5. Data variety

Trying to accommodate data that comes from different sources and in a variety of forms (images, geo data, text, social, numeric, etc.).

6. Data velocity

Online machine learning requires models to be constantly updated with new, incoming data.
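To make the idea of constant updating concrete, here is a minimal sketch of an online learner, assuming a one-feature linear model trained by stochastic gradient descent. The class name `OnlineLinearModel`, the learning rate, and the synthetic stream (points on the line y = 2x + 1) are all illustrative assumptions, not something taken from a particular tool. The key point is that each observation adjusts the model once and is then discarded, so no full retraining is needed.

```python
import itertools


class OnlineLinearModel:
    """One-feature linear model, y ~ w * x + b, updated one observation at a time."""

    def __init__(self, learning_rate=0.1):
        self.w = 0.0
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def update(self, x, y):
        # One stochastic gradient step on the squared error of this observation.
        error = self.predict(x) - y
        self.w -= self.learning_rate * error * x
        self.b -= self.learning_rate * error


# Synthetic stream for illustration: noise-free points on y = 2x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
stream = ((x, 2.0 * x + 1.0) for x in itertools.cycle(xs))

model = OnlineLinearModel()
for x, y in itertools.islice(stream, 10_000):
    model.update(x, y)  # the model changes as each new point arrives

print(round(model.w, 3), round(model.b, 3))
```

In practice incoming data is noisy and may drift over time, so the learning rate and the choice of model would need tuning for the stream at hand.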

7. Dealing with huge datasets

Dealing with huge datasets, or 'Big Data,' that require distributed approaches.

8. Coming up with the right question

Coming up with the right question or problem.

"More data beats the better algorithm, but smarter questions beat more data." - Gregory Piatetsky

9. Remaining objective and allowing the data to lead you, not the opposite

Preconceived notions can be dangerous, but luckily it is in our power to resist them...

Interested in other articles on data mining topics and techniques?

Check out the Salford Systems blog:
