A brief tutorial on using Gephi for basic analysis of texts.

Gephi is a free and open-source platform for data visualization and exploration. It supports interaction with and analysis of graph data, and includes tools for shaping, colouring, and manipulating structures so that the patterns within them are easier to see. According to the developers, the platform can be used for exploratory data analysis, link analysis, social network analysis, biological network analysis, and poster creation.

The purpose of this tutorial is to teach you to use Gephi to visualize various aspects of texts. You will learn first to create a dataset, second to import it into Gephi, and finally to perform some basic manipulations. The text used in the samples for creating and importing the dataset is The Brothers, an ancient comedy by Terence, in the translation from Terence: The Comedies, translated by Peter Brown.


Nodes: The vertices of a graph, which represent specific objects. Examples: events or characters.

Edges: The links between the nodes of a graph. Examples: character interactions, events, or themes.

NOTE: There are several sample datasets on the Gephi wiki.

Step One: Detailing the Data


Above is an example of a Nodes sheet. The only required column is the Id column, as the Edges sheet refers to it, but if you plan to write about the graph, it is highly recommended that you add more context through the Label column.
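If you would rather generate the Nodes sheet than type it by hand, a minimal Python sketch might look like the following. The character list is a small sample from The Brothers, and the numeric Ids and the function name `nodes_csv` are assumptions made for illustration; any unique Ids would work.

```python
import csv
import io

# Sample characters from The Brothers; the numeric Ids assigned below are arbitrary.
characters = ["Micio", "Demea", "Aeschinus", "Ctesipho", "Syrus"]

def nodes_csv(labels):
    """Return CSV text with the Id and Label columns of a Nodes sheet."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Id", "Label"])
    for node_id, label in enumerate(labels, start=1):
        writer.writerow([node_id, label])
    return buffer.getvalue()

nodes_text = nodes_csv(characters)
print(nodes_text)
```

Writing `nodes_text` to a file such as `nodes.csv` (a hypothetical file name) gives you a sheet that can be imported in Step Two.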


This is an example of the character interactions sheet before it is turned into the Edges sheet below.




As we are only recording basic character interactions, you can simply fill out Type with ‘undirected’ and Weight with ‘1’. However, if you want to go into more depth about which character is speaking and which is being spoken to, you can change the Type to ‘directed’. Or, if you want to include, for example, how many lines are spoken in an interaction, you can change the Weight, which will let you see how many lines are said between two characters. When imported into Gephi, this sheet is used to create connections between the Ids of the Nodes created from the Nodes .csv file.
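The conversion from an interactions list to an Edges sheet can also be sketched in Python. Everything below is illustrative: the Ids, the interaction pairs, and the function name `edges_csv` are assumptions, not a transcription of the play. With `weighted=True` the Weight is the number of listed exchanges between two characters; a count of lines spoken could be substituted.

```python
import csv
import io
from collections import Counter

# Hypothetical Ids, matching whatever your Nodes sheet uses.
node_ids = {"Micio": 1, "Demea": 2, "Aeschinus": 3, "Syrus": 4}

# Illustrative (speaker, addressee) pairs only.
interactions = [
    ("Micio", "Demea"),
    ("Demea", "Micio"),
    ("Micio", "Aeschinus"),
    ("Syrus", "Demea"),
]

def edges_csv(pairs, ids, weighted=False):
    """Return CSV text with Source, Target, Type and Weight columns."""
    counts = Counter()
    for speaker, addressee in pairs:
        # Undirected: sort the Ids so A-B and B-A count as the same edge.
        key = tuple(sorted((ids[speaker], ids[addressee])))
        counts[key] += 1
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for (source, target), count in sorted(counts.items()):
        writer.writerow([source, target, "undirected", count if weighted else 1])
    return buffer.getvalue()

edges_text = edges_csv(interactions, node_ids)
weighted_text = edges_csv(interactions, node_ids, weighted=True)
print(edges_text)
```

For a ‘directed’ sheet you would skip the sorting step so that speaker and addressee keep their order, and write ‘directed’ in the Type column.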

Step Two: Importing the Data into Gephi



Make sure that the dropdown menus on your screen match the ones above (especially the ‘Import as:’ one), and then click ‘Next’. On the next page, you may specify which columns you would like to import; the other setting, ‘Time Representation’, will not be used in this tutorial. Click ‘Finish’ and then, when prompted in the new window, have the Nodes imported to the existing workspace if your current workspace is empty (or to a new workspace if it is not). This window will also tell you your number of Nodes and Edges, and a dropdown menu lets you specify whether the graph type should be directed, undirected, or mixed (a combination of directed and undirected). Once you have read over the information and made certain that it is accurate, click ‘Ok’ and you will see a table of your Nodes.


Click ‘Next’, and select which columns you wish to import on the next page. The ‘Source’ and ‘Target’ columns are required, but aside from those, you can choose which other columns you wish to see. Click ‘Finish’ and you should be prompted to decide whether you want the Edges to be put in a new workspace or in the one that you just created for your Nodes. Select the ‘Append to existing workspace’ option, which will add the Edges sheet to the Nodes sheet.

Step Three: Using Gephi



NOTE: You can click and drag Nodes to various places on your graph.


The appearance of both nodes and edges can be changed with this tab. We are going to make the colour of each node depend on its degree (that is, how many nodes it is connected to). Click the ‘Ranking’ tab, which brings up a dropdown menu where you choose the attribute that the colour will be based on (you can also change the size of the nodes, the colour of the text, and the size of the text by choosing between the images in the same row as the ‘Nodes’ and ‘Edges’ tabs). Choose the ‘Degree’ option from the ‘—Choose an attribute’ menu. The ‘Ranking’ tab should now show a colour slider, which runs from the lowest value on the left to the highest on the right. You may choose a preset colour palette from the little icon beside the slider, or you can add colours (click on an empty space on the slider) and alter colours (double-click on the little arrows on the slider) yourself. Once you have decided on your colour palette, click ‘Apply’ to see the nodes change colour to reflect their degrees.
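To make the idea of a colour ranking concrete, here is a minimal sketch of how a degree could be mapped onto a two-colour slider. It assumes simple linear interpolation in RGB, which may differ from what Gephi does internally; the degrees, the colours, and the function name `degree_to_colour` are all hypothetical.

```python
def degree_to_colour(degree, low, high, start_rgb, end_rgb):
    """Linearly interpolate between two RGB colours according to degree."""
    fraction = (degree - low) / (high - low) if high > low else 0.0
    return tuple(
        round(start + fraction * (end - start))
        for start, end in zip(start_rgb, end_rgb)
    )

# Hypothetical degrees, for illustration only.
node_degrees = {"Micio": 1, "Demea": 3, "Syrus": 5}
low = min(node_degrees.values())
high = max(node_degrees.values())

# Left end of the slider is light grey, right end is blue (assumed palette).
colours = {
    name: degree_to_colour(degree, low, high, (200, 200, 200), (0, 0, 200))
    for name, degree in node_degrees.items()
}
print(colours)
```

The node with the lowest degree receives the left-hand colour, the node with the highest degree receives the right-hand colour, and everything else falls in between.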

NOTE: If you wish to see the degrees in the Data Table, go to the Statistics window (it should be on the right side of the screen) and run ‘Average Degree’. This opens a window with details of the Degree Distribution and adds a ‘Degree’ column to your Data Table, where you may view the statistics at your leisure.
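If you want to check the Degree column by hand, the calculation for an undirected graph can be sketched as follows. The edge list is a made-up example, and this illustrates the idea rather than Gephi's own implementation.

```python
from collections import Counter

# Hypothetical undirected edges as (Source, Target) Id pairs.
edges = [(1, 2), (1, 3), (2, 4)]

def degrees(edge_list):
    """Count how many edges touch each node."""
    degree = Counter()
    for source, target in edge_list:
        degree[source] += 1
        degree[target] += 1
    return dict(degree)

node_degrees = degrees(edges)
average_degree = sum(node_degrees.values()) / len(node_degrees)
print(node_degrees, average_degree)
```

Nodes that appear in the Nodes sheet but in no edge would have a degree of 0; this sketch only sees nodes that occur in the edge list.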


Select ‘Degree’ from the dropdown menu, decide the minimum and maximum sizes of the nodes, and then press ‘Apply’ for your Nodes to be resized.

NOTE: If any nodes overlap, you may go back to the Layout tab and select ‘Adjust by Sizes’ to remove the overlap.


From there, you can use the slider at the bottom of the panel to adjust the number of nodes shown, based on their edges.
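The effect of such a slider can be approximated in a few lines of Python. This sketch assumes the filter keeps nodes whose degree falls inside a chosen range, along with the edges between the kept nodes; the edge list and the function name `filter_by_degree` are hypothetical, and Gephi's own filter may behave differently in its details.

```python
from collections import Counter

# Hypothetical undirected edges as (Source, Target) Id pairs.
edges = [(1, 2), (1, 3), (2, 4), (1, 4)]

def filter_by_degree(edge_list, minimum, maximum):
    """Keep nodes with minimum <= degree <= maximum and the edges between them."""
    degree = Counter()
    for source, target in edge_list:
        degree[source] += 1
        degree[target] += 1
    kept_nodes = {node for node, d in degree.items() if minimum <= d <= maximum}
    kept_edges = [
        (source, target)
        for source, target in edge_list
        if source in kept_nodes and target in kept_nodes
    ]
    return kept_nodes, kept_edges

kept_nodes, kept_edges = filter_by_degree(edges, 2, 3)
print(kept_nodes, kept_edges)
```

Here node 3 has only one edge, so raising the minimum to 2 hides it together with its edge.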


To view the graph’s most recent manipulations, click the ‘Refresh’ button at the bottom.


‘Preview ratio’ allows you to view a partial graph.

NOTE: Curved edges do not show arrows indicating the direction of the graph. To turn curved edges off, go down to the ‘Edges’ section under Settings and turn off the ‘Curved’ option.

NOTE: SVG files can be edited further in programs such as Inkscape and Adobe Illustrator.

Make sure to save your project, and thank you for reading!

This tutorial is brought to you by the Brock University Digital Scholarship Lab. For more information on the DSL check out our website at or you can e-mail us at

