THESE MUST BE INSTALLED FOR YOUR PRACTICAL
| Name | Download link | Notes |
|---|---|---|
| Gephi | https://gephi.org/users/download/ | You must install this for your practical |
| OpenRefine | http://openrefine.org/download.html | You should already have this installed |
| Sublime Text | https://www.sublimetext.com/ | You should already have this installed |
| CSV Editor | https://github.com/ScriptSmith/csveditor/ | Instructions for use are on that page. On a Mac, you need to right-click the program to open it the first time |
| Reaper >= v0.1.7 | https://github.com/ScriptSmith/reaper/releases | Must be a version greater than or equal to v0.1.7 |
These are the only functions you should attempt to graph in Gephi
Click a blue link to jump to that section of the page
- Facebook
  - Post's comments
  - Page's posts' comments
  - Group's posts' comments
- Twitter
  - Search's tweets
  - Hashtag's tweets
- Reddit
  - Thread's comments
  - Search's threads' comments
  - Subreddit's threads' comments
- YouTube (Not currently available)
  - Video's comments
  - Search's videos' comments
  - Channel's videos' comments
The following are instructions for scraping from a source in Reaper, editing the files it extracts and viewing them in Gephi
Post's comments
Tick the box to include information from the original post
In the original post's fields, make sure From is ticked
In the comment's fields, make sure Parent is ticked
Save it as a CSV
Open it in OpenRefine
Rename from.name to Source
Add a column named Target based on original_post.from.name
Expression: if(isNonBlank(row.cells["parent.from.name"]), row.cells["parent.from.name"].value, value)
Export from OpenRefine
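For readers who want to check what the OpenRefine steps above produce, here is a minimal Python sketch of the same logic. It is not part of Reaper or OpenRefine. It assumes the CSV has the columns from.name, parent.from.name and original_post.from.name; the function name add_source_target and the sample names are made up for illustration.

```python
import csv
import io


def add_source_target(rows):
    """Yield copies of the rows with Source and Target columns.

    Source is the renamed from.name column. Target is the parent
    commenter when there is one, otherwise the original post's author,
    mirroring the if(isNonBlank(...)) expression.
    """
    for row in rows:
        parent = row.get("parent.from.name") or ""
        out = dict(row)
        out["Source"] = out.pop("from.name", "")
        out["Target"] = parent if parent else row.get("original_post.from.name", "")
        yield out


# Made-up sample data standing in for a Reaper export.
sample = io.StringIO(
    "from.name,parent.from.name,original_post.from.name\n"
    "Alice,,Page Owner\n"
    "Bob,Alice,Page Owner\n"
)
edges = list(add_source_target(csv.DictReader(sample)))
for edge in edges:
    print(edge["Source"], "->", edge["Target"])
```

A top-level comment points at the original poster, while a reply points at the commenter it replies to.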
Page's posts' comments
Tick the box to include information from the original post
In the original post's fields, make sure From is ticked
In the comment's fields, make sure Parent is ticked
Select posts as the post type if you only want posts from the page itself, or select feed as the post type if you want posts from others as well
Save it as a CSV
Open it in OpenRefine
Rename from.name to Source
Add a column named Target based on original_post.from.name
Expression: if(isNonBlank(row.cells["parent.from.name"]), row.cells["parent.from.name"].value, value)
Export from OpenRefine
Group's posts' comments
Tick the box to include information from the original post
In the original post's fields, make sure From is ticked
In the comment's fields, make sure Parent is ticked
Save it as a CSV
Open it in OpenRefine
Rename from.name to Source
Add a column named Target based on original_post.from.name
Expression: if(isNonBlank(row.cells["parent.from.name"]), row.cells["parent.from.name"].value, value)
Export from OpenRefine
Search's tweets
Make sure that your search topic is recent and trending. You may want to select recent as your Result type
Save it as a CSV
Open it in CSV Editor
Select the columns you need for the graph, at minimum user.screen_name and retweeted_status.user.screen_name
Save it from CSV Editor as a new file
Wait until New Row Count is the same as Old Row Count to confirm it has finished saving
Open the new file in Sublime Text
Rename user.screen_name to Source
Rename retweeted_status.user.screen_name to Target
Save the file
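The select-and-rename steps above can also be sketched in Python, which may help when a file is too large to edit comfortably by hand. This is only an illustration, not how CSV Editor works internally. The id and text columns in the sample are assumptions about what an export might contain, and to_edge_rows is a hypothetical helper.

```python
import csv
import io

# Original column name -> name Gephi expects in an edges table.
RENAMES = {
    "user.screen_name": "Source",
    "retweeted_status.user.screen_name": "Target",
}


def to_edge_rows(reader):
    """Keep only the two edge columns, under their new names."""
    for row in reader:
        yield {new: row.get(old, "") for old, new in RENAMES.items()}


# Made-up sample: one retweet and one original tweet.
sample = io.StringIO(
    "id,user.screen_name,retweeted_status.user.screen_name,text\n"
    "1,alice,bob,RT example\n"
    "2,carol,,original tweet\n"
)
edge_rows = list(to_edge_rows(csv.DictReader(sample)))

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["Source", "Target"], lineterminator="\n")
writer.writeheader()
writer.writerows(edge_rows)
edge_csv = out.getvalue()
print(edge_csv)
```

Note that a tweet which is not a retweet has no retweeted_status.user.screen_name, so its Target is empty.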
Hashtag's tweets
Make sure that your hashtag is recent and trending. You may want to select recent as your Result type
Save it as a CSV
Open it in CSV Editor
Select the columns you need for the graph, at minimum user.screen_name and retweeted_status.user.screen_name
Save it from CSV Editor as a new file
Wait until New Row Count is the same as Old Row Count to confirm it has finished saving
Open the new file in Sublime Text
Rename user.screen_name to Source
Rename retweeted_status.user.screen_name to Target
Save the file
Thread's comments
Check the box that says Include parent
Note that when it is checked, Reaper can download at most 500 comments per thread
Save it as a CSV
Open it in Sublime Text
Rename data.author to Source
Rename parent.data.author to Target
Save the file
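Renaming the two columns only touches the header row of the CSV. A minimal Python sketch of that edit is shown here, assuming the header contains data.author and parent.data.author; rename_header is a hypothetical helper and the data.body column in the sample is an assumption.

```python
import csv
import io

RENAMES = {"data.author": "Source", "parent.data.author": "Target"}


def rename_header(csv_text, renames=RENAMES):
    """Return csv_text with whole header names replaced; data rows are untouched."""
    header_line, newline, body = csv_text.partition("\n")
    columns = next(csv.reader([header_line.rstrip("\r")]))
    # Match complete column names only, so "data.author" never
    # rewrites part of "parent.data.author".
    renamed = [renames.get(name, name) for name in columns]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerow(renamed)
    return out.getvalue().rstrip("\n") + newline + body


# Made-up sample standing in for a Reaper export.
sample = (
    "data.author,data.body,parent.data.author\n"
    "alice,first reply,bob\n"
    "carol,second reply,alice\n"
)
renamed_csv = rename_header(sample)
print(renamed_csv.splitlines()[0])
```

The same caution applies when editing by hand: a plain find-and-replace of data.author would also change the text inside parent.data.author, so rename parent.data.author first or edit the header names individually.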
Search's threads' comments
Check the box that says Include parent
Note that when it is checked, Reaper can download at most 500 comments per thread
Save it as a CSV
Open it in Sublime Text
Rename data.author to Source
Rename parent.data.author to Target
Save the file
Subreddit's threads' comments
Check the box that says Include parent
Note that when it is checked, Reaper can download at most 500 comments per thread
Save it as a CSV
Open it in Sublime Text
Rename data.author to Source
Rename parent.data.author to Target
Save the file
YouTube: Not currently available
When first setting up Gephi, make sure your plugins are up-to-date
Tools -> Plugins
Click the Check for Updates button and follow the process to install the updates for your plugins
Import data by going to File -> Import spreadsheet
Select the CSV file you want to import and import it as an Edges Table
If the warning Found row(s) with empty Source and/or Target columns appears and it won't let you click Next>, make sure you've updated your plugins
Click Next> and then Finish
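If you are curious how many rows in a file actually have an empty Source or Target, the following Python sketch counts them. It is an optional check and not a replacement for updating your plugins. The helper name split_incomplete_edges and the sample rows are made up.

```python
import csv
import io


def split_incomplete_edges(reader):
    """Separate rows that have both Source and Target from rows missing either."""
    complete, incomplete = [], []
    for row in reader:
        source = (row.get("Source") or "").strip()
        target = (row.get("Target") or "").strip()
        if source and target:
            complete.append(row)
        else:
            incomplete.append(row)
    return complete, incomplete


# Made-up sample: one full edge, one without a Target, one without a Source.
sample = io.StringIO("Source,Target\nalice,bob\ncarol,\n,dave\n")
complete, incomplete = split_incomplete_edges(csv.DictReader(sample))
print(len(complete), "complete,", len(incomplete), "incomplete")
```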
There are 3 viewing modes: Overview, Data Laboratory and Preview
In Data Laboratory, select Copy data to another column and select Id as the column to copy from
Then select Label as the column to copy to
Now in the Overview, when you click the button to add labels (the black T at the bottom), you can see the node's name
Choose the Force Atlas 2 Layout
Press the Run button to run the Layout
Press the Stop button to stop it when the graph stops moving significantly
Use the Expansion and Contraction layouts to spread nodes apart or pull them together. Set the scale factor above 1 to expand, or between 0 and 1 to contract
Click the spyglass on the left toolbar to see the entire graph
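To picture what the scale factor in the Expansion and Contraction layouts means, here is a small Python illustration. It is a simplified sketch and not Gephi's implementation; in particular, scaling around the average node position is an assumption, and scale_layout is a hypothetical function.

```python
def scale_layout(positions, factor):
    """Scale node positions about their average position.

    positions maps a node name to an (x, y) tuple and must not be empty.
    A factor above 1 spreads nodes apart; a factor between 0 and 1
    pulls them together.
    """
    if factor <= 0:
        raise ValueError("scale factor must be greater than 0")
    count = len(positions)
    cx = sum(x for x, _ in positions.values()) / count
    cy = sum(y for _, y in positions.values()) / count
    return {
        node: (cx + (x - cx) * factor, cy + (y - cy) * factor)
        for node, (x, y) in positions.items()
    }


# Made-up coordinates for three nodes.
positions = {"a": (0.0, 0.0), "b": (2.0, 0.0), "c": (1.0, 3.0)}
expanded = scale_layout(positions, 2.0)
contracted = scale_layout(positions, 0.5)
print(expanded)
print(contracted)
```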
Gephi doesn't allow for parallel edges, so it merges those edges into a single edge.
If you want to visualize the frequency (how often nodes are connecting) of edges between nodes, you need to include a weight
- In OpenRefine / CSV Editor, remove all the columns other than Source and Target. We do this because we need to remove the unique identifiers for particular edges, which prevent edges from merging
- Add a new column based on the Source column
- Call it Weight, and set the value of the expression to just be 1
- Export the CSV
Now when you view the network, parallel edges will be merged so that their weight is increased according to the number of parallel edges
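The effect of the weighting can be previewed outside Gephi. The Python sketch below takes a different route from the steps above: instead of giving every row a Weight of 1 and letting Gephi merge them, it counts the parallel edges up front and writes one row per Source and Target pair. The result should be equivalent as long as Gephi sums the weights of merged edges, which is an assumption here. weighted_edges is a hypothetical helper and the sample data is made up.

```python
import csv
import io
from collections import Counter


def weighted_edges(reader):
    """Collapse repeated (Source, Target) pairs into one row with a Weight count."""
    counts = Counter((row["Source"], row["Target"]) for row in reader)
    return [
        {"Source": source, "Target": target, "Weight": count}
        for (source, target), count in counts.items()
    ]


# Made-up sample: alice -> bob appears three times, carol -> bob once.
sample = io.StringIO(
    "Source,Target\n"
    "alice,bob\n"
    "alice,bob\n"
    "carol,bob\n"
    "alice,bob\n"
)
edges = weighted_edges(csv.DictReader(sample))
for edge in edges:
    print(edge)
```

Direction matters in this sketch: alice to bob and bob to alice are counted as different edges.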
See the quick-start guide to see what analysis you can do in Gephi