Post-Election Analysis: Twitter vs traditional polls

As promised, here is the post-election analysis.
Although my predicted vote share for AKP was much closer to the actual result than most of the traditional polls, my predicted value for MHP was far off, which makes my overall prediction error larger than that of most conventional polls (see the table below).

Source	AKP	CHP	MHP	HDP	Others	Prediction error
Election results	49.4	25.4	11.9	10.7	2.5	
My prediction	47.3	22.4	18.8	11.68	0	3.1
Traditional polls:
Andy-Ar	43.7	27.1	14.0	13.0	2.2	2.42
Konda	41.7	27.9	14.2	13.8	2.3	3.16
A&G	47.2	25.3	13.5	12.2	1.8	1.22
Gezici	43	26.1	14.9	12.2	3.8	2.58
Metropoll	43.3	25.9	14.8	13.4	2.6	2.46
ORC	43.3	27.4	14	12.2	3.1	2.46
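
The post does not spell out how the "prediction error" column is computed, but the values in the table are consistent with the mean absolute error over the five columns (AKP, CHP, MHP, HDP, Others), expressed in percentage points. The following is a minimal sketch under that assumption; the function and variable names are illustrative and not taken from the original analysis.

```python
# Vote shares in percent, copied from the table above.
# Column order: AKP, CHP, MHP, HDP, Others.
ACTUAL = [49.4, 25.4, 11.9, 10.7, 2.5]

PREDICTIONS = {
    "My prediction": [47.3, 22.4, 18.8, 11.68, 0.0],
    "Andy-Ar": [43.7, 27.1, 14.0, 13.0, 2.2],
    "Konda": [41.7, 27.9, 14.2, 13.8, 2.3],
    "A&G": [47.2, 25.3, 13.5, 12.2, 1.8],
    "Gezici": [43.0, 26.1, 14.9, 12.2, 3.8],
    "Metropoll": [43.3, 25.9, 14.8, 13.4, 2.6],
    "ORC": [43.3, 27.4, 14.0, 12.2, 3.1],
}


def mean_absolute_error(predicted, actual):
    """Average absolute difference per column, in percentage points."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def error_table(predictions, actual):
    """Map each source name to its error, rounded to two decimals."""
    return {
        name: round(mean_absolute_error(values, actual), 2)
        for name, values in predictions.items()
    }


if __name__ == "__main__":
    # Print the sources from smallest to largest error.
    ranked = sorted(error_table(PREDICTIONS, ACTUAL).items(), key=lambda kv: kv[1])
    for name, err in ranked:
        print(f"{name:15s} {err:.2f}")
```

Under this reading, the MHP column alone contributes 6.9 of the 15.48 points of total absolute error in the Twitter-based prediction, which is why a single badly estimated party is enough to push the overall error above that of most polls.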

To be honest, I have to conclude that the results of this research do not point towards a clear victory for Twitter Data Analytics, although they are not a clear loss either.
On the bright side, this research was done with a few Amazon EC2 instances at a total cost of about three dollars, while the cost of traditional polls was in the range of a few million (put mildly). For those who are interested, this is an interesting article about the current state of the polling industry.

I still believe that the content of Twitter can be representative of an electorate and political sentiment can be modeled from Twitter messages effectively. However, it is clear that further research is needed and challenges lie ahead.

At the moment I cannot give a clear answer to the question of why there is such a large discrepancy between the predicted and actual result for MHP. I hope to provide a better explanation later on, but for now I can already tell you that this discrepancy is partly caused by ‘Ahmet Kaya’.

Two MHP politicians named ‘Ahmet Kaya’ were participating in the elections (one for the province of Diyarbakir and one for the province of Erzincan). The problem is that Ahmet Kaya was also the name of a very famous Turkish singer (who happened to be born on 28 October, only a few days before the election).
Of course, the Twitter Data Collector is not smart enough to distinguish between Ahmet Kaya the politician and Ahmet Kaya the singer. Since I did not check the content of the Tweets, or go through the dictionary containing the names of the ~550 politicians in great detail, MHP got thousands of Tweets more than it should have.
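
One possible guard against this kind of name collision is to mark dictionary entries that are known to clash with a famous namesake, and to count a Tweet for such a name only when it also contains some political context. The sketch below is a hypothetical illustration and not the collector used for this research: the `AMBIGUOUS_NAMES` set, the context keywords, and the sample Tweets in the usage lines are all made up for the example.

```python
import re

# Hypothetical excerpt of the politician dictionary: lowercase name -> party.
POLITICIANS = {
    "ahmet kaya": "MHP",
}

# Names known to collide with a famous namesake.
# Assumption: this set is maintained by hand after inspecting the dictionary.
AMBIGUOUS_NAMES = {"ahmet kaya"}

# Illustrative context words (party name plus Turkish words for election,
# member of parliament, candidate, vote). A real list would need tuning.
POLITICAL_CONTEXT = {"mhp", "seçim", "secim", "milletvekili", "aday", "oy"}


def attribute_party(tweet):
    """Return the party a Tweet is counted for, or None if it is skipped."""
    text = tweet.lower()
    words = set(re.findall(r"\w+", text))
    for name, party in POLITICIANS.items():
        if name not in text:
            continue
        # Ambiguous names only count when political context is present.
        if name in AMBIGUOUS_NAMES and not (words & POLITICAL_CONTEXT):
            continue
        return party
    return None


if __name__ == "__main__":
    print(attribute_party("Ahmet Kaya MHP adayi olarak Erzincan'da konustu"))
    print(attribute_party("Ahmet Kaya'nin sarkilari hala dinleniyor"))
```

A keyword filter like this is crude: it will drop genuine political Tweets that happen to contain none of the context words. A manual check of the names that attract unusually many Tweets would probably catch the same problem with less machinery.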

In later posts I will go into the more technical side of how to collect data from Twitter, for those interested in doing Twitter Data Analytics.

4 thoughts on “Post-Election Analysis: Twitter vs traditional polls”

Luis Fernando MARQUES ROSA wrote:

November 4, 2015 at 9:05 am

Such a small prediction error for 3 dollars… really not bad.
Congratulations, Ahmet. Very nice job.

Bert wrote:

November 4, 2015 at 10:42 am

And another point on the positive side: these predictions were made without actively interviewing people in Turkey. These kinds of predictions can be made from anywhere at any time.

aucan wrote:

November 7, 2015 at 9:05 am

Congratulations, a very successful prediction.

  ataspinar wrote:

  November 7, 2015 at 1:39 pm

  Thank you.
