Stock Trading with Microblog Sentiments

Authors: Joseph Watts, Nick Anderson, Joseph Mehr, Connor Asbill
Client: Saurabh Chakravarty
Professor: Edward Fox
Virginia Tech - Blacksburg, VA 24061
CS 4624: Multimedia, Hypertext, and Information Access, Spring 2017
May 2, 2017

Project Overview / Goal
- Implement multiple trading strategies
- Maximize profit over the course of one year
- Data sources:
  - Stock Twits (Tweets): Provided
  - Yahoo/Google Finance (Daily stock price data): Found
  - Wharton Research Center Data Services (Intraday stock price data): Found

Trading Simulation Software
[Diagram labels]
- Stock Twits
- Sentiment Analysis: returns positive or negative sentiment value
- Stock Prices: from Yahoo/Google Finance
- Trading Strategy: computes buy/sell based on sentiment/strategy
- Virtual Portfolio
- Uses Hadoop/HBase
- Written in Scala

Plan
We chose 11 stocks to watch: AAPL, FB, GILD, KNDI, MNKD, NQ, PLUG, QQQ, SPY, TSLA, VRNG
Set up the following strategies:
- Baseline
- S&P 500 (buy and hold the S&P 500 index)
- Moving Average
- Moving Average with Sentiment
- Selection by Sentiment (One Stock): n = 1
- Selection by Sentiment: n = 3
- Selection by Sentiment (All Stocks): n = 11

Trading Strategies
(Strategy: Stocks/Portfolio; Decision-Making)
- CrowdIQ Strategy: 1 portfolio for each of the 11 stocks; based on bullish/bearish sentiment
- Moving Average: 1 portfolio for each of the 11 stocks; based on 5 and 10 day price trends
- Moving Average with Sentiment: 1 portfolio for each of the 11 stocks; based on 5 and 10 day price trends and sentiment
- Selection by Sentiment: 1 portfolio shared by the 11 stocks; based on bullish/bearish sentiment
- Buy and Hold: only uses 1 stock (S&P 500); control: buys once at the start, then holds for the entire year

[Chart: dollar value by month; vertical axis $1 million to $7 million; horizontal axis Jan to Dec, tick labels appearing twice. Series: SelectionBySentiment(AllStocks), SelectionBySentiment, SelectionBySentiment(OneStock), Baseline, IndexFund, MovingAverageWithSentiment, MovingAverage]

[Chart: dollar value by month; vertical axis $200,000 to $1.4 million; horizontal axis Jan to Dec. Series: IndexFund, SelectionBySentiment(AllStocks), MovingAverage, MovingAverageWithSentiment, Baseline, SelectionBySentiment, SelectionBySentiment(OneStock)]
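
The slides name the strategies but do not spell out their exact rules. The following is a minimal Python sketch under stated assumptions, not the project's Scala implementation: it assumes "5 and 10 day price trends" means comparing a 5-day simple average of closing prices with a 10-day one, that sentiment is a score where positive means bullish and negative means bearish, that the sentiment variant trades only when sentiment agrees with the price trend, and that Selection by Sentiment picks the n highest-scoring tickers. The function names moving_average_signal and select_by_sentiment are hypothetical.

```python
from statistics import mean


def moving_average_signal(closes, sentiment=None, short=5, long=10):
    """Return "buy", "sell", or "hold" for the most recent day.

    closes: daily closing prices, oldest first.
    sentiment: optional score (positive = bullish, negative = bearish);
               None means the plain Moving Average variant.
    """
    if len(closes) < long:
        return "hold"  # not enough history for the long average

    short_avg = mean(closes[-short:])
    long_avg = mean(closes[-long:])
    if short_avg > long_avg:
        trend = "buy"
    elif short_avg < long_avg:
        trend = "sell"
    else:
        trend = "hold"

    if sentiment is None:
        return trend

    # Assumed combination rule: act only when sentiment agrees with the trend.
    if trend == "buy" and sentiment > 0:
        return "buy"
    if trend == "sell" and sentiment < 0:
        return "sell"
    return "hold"


def select_by_sentiment(scores, n):
    """Return the n tickers with the highest sentiment score.

    scores: dict mapping ticker -> sentiment score.
    Ties are broken alphabetically by ticker so the result is deterministic.
    """
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [ticker for ticker, _ in ranked[:n]]
```

With n = 1, n = 3, and n = 11, select_by_sentiment would correspond to the One Stock, default, and All Stocks variants listed in the plan, again assuming that is how n is used.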

Issues
Unreliable data caused our day trading strategies to perform unreasonably well, so those results have been omitted.

2014-01-06T09:42:00Z VRNG 3.15
2014-01-06T09:43:00Z VRNG 3.12
2014-01-06T09:44:00Z VRNG 3.11
2014-01-06T09:45:00Z VRNG 3.11
2014-01-06T09:46:00Z VRNG 3.12
2014-01-06T09:47:00Z VRNG 1.09
2014-01-06T09:48:00Z VRNG 3.11
2014-01-06T09:49:00Z VRNG 3.13
2014-01-06T09:50:00Z VRNG 3.13

Future work
- Use accurate source of high-resolution bid/ask quotes for day trading
- Obtain data for 2013 and 2016, testing on each
  - Will help to explain the difference between our 2014 and 2015 results
- Test with live data (and integration with real trading platforms)
- Implement slippage models in simulation software which factor in trading volume
- More robust sentiment analysis with advanced text normalization techniques

Acknowledgements
- Saurabh Chakravarty - Client
- Eric Williamson - Created sentiment analysis
- Dr. Weiguo Fan - StockTwits Data
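
The VRNG quotes shown under Issues contain a single one-minute price (1.09 at 09:47) that is far from its neighbours, which a simulated day trader can exploit unrealistically. As a rough sketch of one way such ticks could be screened out before simulation, the Python below flags any price that deviates from the median of its neighbouring prices by more than a relative threshold. This is an illustration only and not something the slides describe: the function name flag_spikes, the window of 2 neighbours on each side, and the 20% threshold are all assumptions.

```python
from statistics import median


def flag_spikes(prices, window=2, max_rel_dev=0.2):
    """Return indices of prices that look like isolated bad ticks.

    A price is flagged when it differs from the median of up to `window`
    neighbours on each side by more than `max_rel_dev` (relative deviation).
    Both parameters are illustrative assumptions.
    """
    flagged = []
    for i, price in enumerate(prices):
        lo = max(0, i - window)
        hi = min(len(prices), i + window + 1)
        neighbours = prices[lo:i] + prices[i + 1:hi]
        if not neighbours:
            continue  # a single-element series has nothing to compare against
        ref = median(neighbours)
        if abs(price - ref) / ref > max_rel_dev:
            flagged.append(i)
    return flagged


# One-minute VRNG prices from the Issues slide, 09:42 through 09:50.
vrng = [3.15, 3.12, 3.11, 3.11, 3.12, 1.09, 3.11, 3.13, 3.13]
print(flag_spikes(vrng))  # prints [5], the 09:47 quote of 1.09
```

Using the median of neighbours rather than their mean keeps the reference price from being pulled toward the bad tick itself when checking the adjacent minutes.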
