Interesting Twitter Data

binarypoet · December 17, 2017, 6:04pm

It's probably fairly well known, but it was new to me, so I'm sharing. Going through my Twitter settings, I stumbled across the option to download your personal tweet archive. Twitter puts together an archive with a nice HTML interface for browsing the tweets. More interesting is that each month's worth of tweets is a very parseable JS file (e.g. 2008_05.js) with entries like this:

{
  "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
  "entities": {
    "user_mentions": [],
    "media": [],
    "hashtags": [],
    "urls": []
  },
  "geo": {},
  "id_str": "820985823",
  "text": "break from ontology class...nothing new so far :(",
  "id": 820985823,
  "created_at": "2008-05-27 00:00:00 +0000",
  "user": {
    "name": "James Birchfield",
    "screen_name": "birchsport",
    "protected": false,
    "id_str": "14919086",
    "profile_image_url_https": "https://pbs.twimg.com/profile_images/942449706436911105/5-DiYQxG_normal.jpg",
    "id": 14919086,
    "verified": false
  }
}
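For anyone who wants to load one of these files in Python, here is a minimal sketch. It assumes the file is a JavaScript assignment (something like `Grailbird.data.tweets_2008_05 = [...]`, which is from memory, so check your own archive) whose right-hand side is a plain JSON array. The function names `parse_archive_js` and `tweet_time` are just illustrative, and the sample string is a cut-down version of the entry above.

```python
import json
from datetime import datetime


def parse_archive_js(js_text):
    """Parse the text of one archive .js file into a list of tweet dicts.

    Assumption: the file is a JavaScript assignment whose right-hand side
    is a plain JSON array, so everything outside the outermost [ ... ]
    can be dropped before handing the rest to the JSON parser.
    """
    start = js_text.index("[")       # raises ValueError if there is no array
    end = js_text.rindex("]") + 1    # also drops any trailing semicolon
    return json.loads(js_text[start:end])


def tweet_time(tweet):
    """Convert created_at (e.g. '2008-05-27 00:00:00 +0000') to a datetime."""
    return datetime.strptime(tweet["created_at"], "%Y-%m-%d %H:%M:%S %z")


# Hypothetical file contents: the variable name in the prefix is an assumption.
sample = '''Grailbird.data.tweets_2008_05 = [ {
  "text": "break from ontology class...nothing new so far :(",
  "id_str": "820985823",
  "created_at": "2008-05-27 00:00:00 +0000"
} ]'''

tweets = parse_archive_js(sample)
for t in tweets:
    print(tweet_time(t).date(), t["text"])
```

For a real archive you would read the file's text (for example with `open(path, encoding="utf-8").read()`) and pass it to `parse_archive_js` in place of `sample`.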

jeremy · December 17, 2017, 6:32pm

Here are the steps, for those interested: https://help.twitter.com/en/managing-your-account/how-to-download-your-twitter-archive

binarypoet · December 17, 2017, 8:31pm

TY @jeremy for sharing the link…that would have been useful for me to do in the first place.

groverpr · December 18, 2017, 2:51am

Here's another guide to extracting tweets and doing sentiment analysis on them. It uses the tweepy wrapper around the Twitter API.

https://github.com/parrt/msan692/blob/master/hw/sentiment.md

# Twitter sentiment analysis

The goal of this project is to learn how to pull twitter data, using the [tweepy](http://www.tweepy.org/) wrapper around the twitter API, and how to perform simple sentiment analysis using the [vaderSentiment](https://github.com/cjhutto/vaderSentiment) library.  The tweepy library hides all of the complexity necessary to handshake with Twitter's server for a secure connection.

As you did in the recommendation engine project, you will also produce a web server running at AWS to display the most recent 100 tweets from a given user and the list of users followed by a given user. For example, in response to URL `/the_antlr_guy` (`http://localhost/the_antlr_guy` when tested on your laptop), your web server should respond with a tweet list color-coded by sentiment, using a red to green gradient:

<img src=figures/parrt-tweets.png width=800>

As another example URL `/realdonaldtrump` yields:

<img src=figures/trump-tweets.png width=750>
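One possible way to produce a red to green gradient like the one in these screenshots is to interpolate linearly on a sentiment score. The sketch below assumes the score is VADER's `compound` value, which ranges from -1 (most negative) to +1 (most positive). The linear mapping and the helper names `sentiment_to_hex` and `tweet_html` are illustrative choices, not necessarily what the assignment intends.

```python
import html


def sentiment_to_hex(compound):
    """Map a score in [-1, 1] to a hex colour: -1 is pure red, +1 is pure green."""
    c = max(-1.0, min(1.0, compound))   # clamp out-of-range scores
    frac = (c + 1.0) / 2.0              # rescale to [0, 1]
    red = round(255 * (1.0 - frac))
    green = round(255 * frac)
    return "#{:02x}{:02x}00".format(red, green)


def tweet_html(text, compound):
    """Render one tweet as a list item coloured by its sentiment score."""
    colour = sentiment_to_hex(compound)
    return '<li style="color: {}">{}</li>'.format(colour, html.escape(text))


print(sentiment_to_hex(-1.0))  # #ff0000
print(sentiment_to_hex(1.0))   # #00ff00
print(tweet_html("nothing new so far :(", -0.4))
```

Escaping the tweet text with `html.escape` keeps characters such as `<` and `&` in a tweet from being interpreted as markup.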

Next you will create a page responding to URLs, such as `/following/the_antlr_guy`, that displays the list of users followed by a given user:

<img src=figures/parrt-follows.png width=320>

Or:

<img src=figures/trump-follows.png width=350>

This file has been truncated.