Analytical Tools

These tools can be used after scraping data from Reddit. Both analyze the frequencies of words found in submission titles and bodies, or in comments, within JSON scrape data.

There are a few ways you can quickly get the correct filepath to the scrape file:

  • Drag and drop the file into the terminal.
  • Partially type the path and rely on tab completion support to finish the full path for you.

Running either tool creates the analytics directory inside the date directory that holds the scrape data. For example, if you run the frequencies generator on February 16th for scrape data that was captured on February 14th, analytics will be created in the February 14th directory. Command history will still be written to the February 16th urs.log.

The frequencies or wordclouds sub-directory is created in analytics, depending on which tool is run. Its layout mirrors the directories in which the original scrape files reside. For example, if you run the frequencies generator on a Subreddit scrape, the directory structure will look like this:

analytics/
└── frequencies
    └── subreddits
        └── SUBREDDIT_SCRAPE.json
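
The layout above can be sketched as a small path calculation. This is only an illustration of the rule described here, not URS's own code: the function name analytics_path, its parameters, and the date directory name used in the example are all hypothetical.

```python
from pathlib import PurePosixPath

# Tool sub-directories named in this section.
TOOLS = ("frequencies", "wordclouds")


def analytics_path(scrape_file, date_dir, tool):
    """Return where an analytics file would land for a given scrape file.

    Hypothetical helper: it places the file under date_dir/analytics/tool
    and mirrors the scrape file's location relative to date_dir.
    """
    if tool not in TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")

    scrape_file = PurePosixPath(scrape_file)
    date_dir = PurePosixPath(date_dir)

    # Raises ValueError if the scrape file is not inside the date directory.
    relative = scrape_file.relative_to(date_dir)

    # The file name is kept unchanged, matching the original scrape file.
    return date_dir / "analytics" / tool / relative


# The date directory name below is an assumed example.
result = analytics_path(
    "scrapes/02-14-2021/subreddits/SUBREDDIT_SCRAPE.json",
    "scrapes/02-14-2021",
    "frequencies",
)
print(result)
# scrapes/02-14-2021/analytics/frequencies/subreddits/SUBREDDIT_SCRAPE.json
```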

A shortened export path is displayed once URS has completed exporting the data, informing you where the file is saved within the scrapes directory. You can open urs.log to view the full path.

Target Fields

The data varies depending on the scraper, so these tools target different fields for each type of scrape data:

Scrape Data           Targets
Subreddit             selftext, title
Redditor              selftext, title, body
Submission Comments   body
Livestream            selftext and title, or body

For Subreddit scrapes, data is pulled from the selftext and title fields for each submission (submission title and body).

For Redditor scrapes, data is pulled from all three fields because both submission and comment data are returned. The selftext and title fields are targeted for submissions, and the body field is targeted for comments.

For submission comments scrapes, data is only pulled from the body field of each comment.

For livestream scrapes, comments or submissions may be included depending on user settings. The selftext and title fields are targeted for submissions, and the body field is targeted for comments.
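
The field selection described above can be illustrated with a short sketch. It assumes each record has already been loaded from the scrape file and labeled as a submission or a comment. The (kind, data) record shape, the names TARGET_FIELDS and count_words, and the whitespace tokenization are assumptions made for this example and may not match how URS stores or processes its data.

```python
from collections import Counter

# Fields targeted per record kind, as described in this section.
TARGET_FIELDS = {
    "submission": ("selftext", "title"),
    "comment": ("body",),
}


def count_words(records):
    """Count words in the targeted fields of (kind, data) pairs.

    Hypothetical helper: kind is "submission" or "comment" and data is
    a dict of fields. Words are split on whitespace, which is a
    simplification chosen for this sketch.
    """
    counts = Counter()
    for kind, data in records:
        for field in TARGET_FIELDS[kind]:
            # Missing or empty fields contribute no words.
            text = data.get(field) or ""
            counts.update(text.split())
    return counts


# Made-up records for illustration only.
records = [
    ("submission", {"title": "hello world", "selftext": "hello"}),
    ("comment", {"body": "world again", "title": "ignored"}),
]
counts = count_words(records)
print(counts["hello"], counts["world"], counts["again"])  # 2 2 1
```

A Redditor or livestream scrape that mixes submissions and comments would pass both kinds of records through the same function, while a Subreddit scrape would contain only submissions and a submission comments scrape only comments.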

File Names

File names are identical to those of the original scrape files, which makes it easier to tell which analytical file corresponds to which scrape.