Introduction
For the past year, Trump’s tweeting habits have been of great interest to people around the world. People in all fields and industries look to his tweets to understand the machinations of President Trump and the inner workings of the current administration. For this project, we were interested in exploring the relationship between Trump’s tweets and the stock market. While numerous news outlets have drawn this connection, we wanted to quantify it and delve further into the relationship. Additionally, we hoped to gain further insight into Trump’s tweeting patterns, possibly to predict future trends that could be applied to the stock market as well as to broader contexts.
Hypothesis
Our three primary hypotheses are as follows:
1. Trump’s tweeting patterns will confirm popular media depictions including his criticism of the Democratic party and his conservative-leaning perspectives.
2. The sentiment of Trump’s tweets will be relatively erratic, reflective of the variable nature of rhetoric and ideas disseminated by the Trump administration.
3. The stock market will be responsive to Trump’s tweets or, at the very least, to the happenings of his administration.
S&P500 Analysis
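# Load the S&P 500 constituent list; the `state` column holds each company's headquarters state
spxcomp <- read.csv("/Users/kimberlyyan/Documents/BROWN/spxcomp.csv")
# Count companies per headquarters state (read.csv already returns a data frame)
state_count <- dplyr::count(spxcomp, state)
# state_choropleth() expects columns named "region" and "value"
colnames(state_count) <- c("region", "value")
state_count$region <- as.character(state_count$region)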
Data Visualization 5: S&P 500 Company Headquarter Locations
library(choroplethr)  # provides state_choropleth()
# Normalize state names to lowercase with no surrounding whitespace so they match the map's region names
state_count$region <- trimws(tolower(state_count$region))
# num_colors = 1 draws a continuous color scale instead of discrete buckets
state_choropleth(state_count, num_colors = 1, title = "Headquarter Concentration of S&P 500 Companies", legend = "Headquarters")
This visualization helps us better understand the composition of the companies that make up the S&P 500. From this choropleth, we can see that the highest concentration of S&P 500 companies is headquartered in the Northeast and in California. This is consistent with the direction of the stock market and the types of companies that are thriving in today’s economy, such as technology companies and financial institutions. These are also areas with larger concentrations of metropolitan populations. The black states indicate that zero S&P 500 companies are headquartered in those states.
Data Visualization 6: Graphing S&P500 Close Price
library(ggplot2)
spx <- read.csv("/Users/kimberlyyan/Documents/BROWN/spx.csv")
# Drop rows 63 through 68
spx <- spx[-c(63:68), ]
# Despite its name, Adj.Close holds the daily percent change in the close price (0 for the first day)
spx$Adj.Close <- c(0, diff(spx$Close) / head(spx$Close, -1) * 100)
spx_date <- spx$Date
# V1 = trading-day index, spx_close = close price, V3 = daily percent change
spx_close <- data.frame(V1 = seq_len(nrow(spx)), spx_close = spx$Close, V3 = spx$Adj.Close)
ggplot(data = spx_close) +
  geom_line(aes(x = V1, y = spx_close)) +
  ggtitle("S&P 500 Close Price in days after Jan 1, 2017") +
  labs(y = "Close Price ($)", x = "Days after Jan 1, 2017")
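# For each trading day, find the row of date_v_score with the same date (0 where no date matches)
datem <- match(as.character(spx_date), as.character(date_v_score$date), nomatch = 0)
# Start from the close prices, then overwrite matched days with the average tweet sentiment;
# days without a match therefore keep the close price rather than a sentiment score
trump_score <- spx$Close
matched <- datem != 0
trump_score[matched] <- date_v_score$avg_score[datem[matched]]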
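The strength of the relationship between daily sentiment and daily market movement can be summarized with a correlation coefficient. The following is a minimal sketch, not the computation used in this project: it assumes Pearson correlation, the helper names pct_change and sentiment_return_cor are hypothetical, and the close and score vectors are made-up illustrative values standing in for spx$Close and trump_score.

```r
# Sketch only: `close` and `score` are made-up illustrative vectors,
# standing in for spx$Close and trump_score from the analysis above.

pct_change <- function(close) {
  # Percent change from the previous observation; the first day has no predecessor
  c(NA_real_, diff(close) / head(close, -1) * 100)
}

sentiment_return_cor <- function(close, score) {
  # Pearson correlation, skipping days where either value is missing
  cor(pct_change(close), score, use = "complete.obs")
}

close <- c(100, 102, 101, 103, 104)  # hypothetical closing prices
score <- c(0.1, 0.4, NA, 0.3, 0.2)   # hypothetical daily average sentiment (NA = no tweets matched)

returns <- pct_change(close)
r <- sentiment_return_cor(close, score)
```

Marking unmatched days as NA and passing use = "complete.obs" keeps days without tweets out of the correlation, so they cannot distort the result.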
Conclusion and Next Steps
After thoroughly exploring Trump tweet sentiment data, we believe that the visualizations are consistent with and validate our three hypotheses stated above:
1. The content of Trump’s tweets is consistent with media depictions of him, and it is relatively monotonous, focusing on a few ideas revolving around the same topics of “fake news” and “Russia” and, in many cases, self-praise.
2. Trump’s tweeting patterns are extremely erratic, mirroring his variable moods and unpredictable shifts from praise to criticism.
3. The stock market is responsive to Trump’s administration. Though the correlation between the sentiment of his tweets and the stock market is not especially strong, the large change in stock price around the time of his inauguration suggests that Wall Street does pay attention to Trump’s tweets or, at the very least, to the happenings of his administration.
These results are extremely relevant and paint a cohesive picture while showing a clear application of sentiment analysis. Moving forward, it would be enlightening to lengthen the time frame of the data to capture any longer-term trends. In addition, the erratic nature of Trump’s sentiments could be compared against the sentiment of tweets from President Obama’s Twitter account during his presidency, which would serve as a ‘control’. This would either further support or refute the variability of Trump’s tweeting. Furthermore, it would be an interesting experiment to use the median or modal daily sentiment in place of the average daily sentiment and see whether the resulting histograms differ.
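The comparison of mean and median daily sentiment suggested above could be set up along the following lines. This is a sketch under stated assumptions: the tweets data frame, its date and score columns, and the scores themselves are hypothetical placeholders, not the project's data.

```r
# Sketch only: `tweets` is a made-up data frame with hypothetical column names and values
tweets <- data.frame(
  date  = as.Date(c("2017-01-20", "2017-01-20", "2017-01-20",
                    "2017-01-21", "2017-01-21")),
  score = c(-2, 0, 5, 1, 3)
)

# One row per day, with the mean and the median of that day's tweet scores
daily_mean   <- aggregate(score ~ date, data = tweets, FUN = mean)
daily_median <- aggregate(score ~ date, data = tweets, FUN = median)

# Join the two summaries side by side as score_mean and score_median
daily <- merge(daily_mean, daily_median, by = "date",
               suffixes = c("_mean", "_median"))
```

Calling hist(daily$score_mean) and hist(daily$score_median) on the real data would then show whether the two summaries produce noticeably different distributions.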
This was a small application of sentiment analysis, but it shows the potential value of evaluating the sentiment of bodies of text. Text sentiment analysis is a crucial step toward eventually being able to predict human sentiment in written and spoken language, and it is a worthy field of exploration that can help us gain a better understanding of human thought processes and sociological interactions.