Machine Learning with Signal Processing Techniques

47 comments. Posted in Classification, Machine Learning, scikit-learn, Stochastic signal analysis

Introduction: Stochastic Signal Analysis is a field of science concerned with the processing, modification and analysis of (stochastic) signals. Anyone with a background in Physics or Engineering knows to some degree about signal analysis techniques, what these techniques are and how they can be used to analyze, model and classify signals. Data Scientists coming from […]

The Perceptron

10 comments. Posted in Classification, Machine Learning

1. Introduction: Most tasks in Machine Learning can be reduced to classification tasks. For example, we have a medical dataset and we want to classify who has diabetes (positive class) and who doesn’t (negative class). We have a dataset from the financial world and want to know which customers will default on their credit (positive […]

Regression, Logistic Regression and Maximum Entropy

4 comments. Posted in Classification, Machine Learning, Sentiment Analytics

Update: The Python code for Logistic Regression can be forked/cloned from my Git repository. It is also available on PyPI. The relevant information in the blog posts about Linear and Logistic Regression is also available as a Jupyter Notebook on my Git repository. 1. Introduction: One of the most important tasks in Machine Learning is the Classification task […]

Sentiment Analysis with the Naive Bayes Classifier

13 comments. Posted in Machine Learning, Sentiment Analytics

From the introductory blog post we know that the Naive Bayes Classifier is based on the bag-of-words model. With the bag-of-words model we check which words of the text document appear in a positive-words list or a negative-words list. If a word appears in the positive-words list, the total score of the text is increased by 1; if it appears in the negative-words list, the score is decreased by 1. […]
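
The word-list scoring described in this excerpt can be sketched in a few lines of Python. This is only a minimal illustration, not the code from the post: the two word lists and the function name `bag_of_words_score` are hypothetical, and the sketch assumes that a word from the negative list lowers the score by 1.

```python
import re

# Hypothetical, tiny word lists for illustration only;
# a real sentiment lexicon would be far larger.
POSITIVE_WORDS = {"good", "great", "excellent", "love"}
NEGATIVE_WORDS = {"bad", "boring", "terrible", "hate"}


def bag_of_words_score(text):
    """Score a text: +1 for each positive word, -1 for each negative word."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for token in tokens:
        if token in POSITIVE_WORDS:
            score += 1
        elif token in NEGATIVE_WORDS:
            score -= 1  # assumption: "vice versa" means subtracting 1
    return score


if __name__ == "__main__":
    print(bag_of_words_score("A great book, I love it"))
    print(bag_of_words_score("Boring and terrible"))
```

A positive total suggests a positive text and a negative total a negative one; words outside both lists leave the score unchanged.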

Sentiment Analysis with bag-of-words

11 comments. Posted in Machine Learning, Sentiment Analytics

Update: the dataset containing the Amazon.com book reviews has been added to the UCI Machine Learning repository. Introduction: In my previous post I explained the theory behind three of the most popular Text Classification methods (Naive Bayes, Maximum Entropy and Support Vector Machines) and said that I would use these Classifiers for the automatic […]

Visualizing Data

Posted in Visualizations

We all know that visualizing data is an important part of Data Science. If it is done wrong, it can be boring and fail to grab the attention of the readers, or even worse, convey the wrong message. If it is done correctly, it can intrigue even the most indifferent reader (some people can even turn Data Visualizations into […]

Text Classification and Sentiment Analysis

11 comments. Posted in Machine Learning, Sentiment Analytics

Introduction: Natural Language Processing (NLP) is a vast area of Computer Science that is concerned with the interaction between Computers and Human Language[1]. Within NLP many tasks are – or can be reformulated as – classification tasks. In classification tasks we are trying to produce a classification function which can give the correlation between a […]

Collecting Data from Twitter

36 comments. Posted in Data Mining

Update: The Python code for this TwitterScraper can be forked/cloned from my Git repository. For most people, the most interesting part of the previous post will be the final results. But for those who would like to try something similar, or who are curious about the technical part, I will explain the […]
