Introduction to Text Analysis with Voyant Tools

Introduction to Voyant

Voyant Tools is an open-source, web-based text analysis tool. It was designed with the digital humanities in mind, with the goal of making reading and interpretation easier for its users. Given plain text, HTML, XML, PDF, RTF, or MS Word files, Voyant can generate word clouds, display word frequencies and collocations, and perform other text mining functions. Researchers have used Voyant Tools to analyze texts in a variety of contexts, including literature, language teaching, healthcare, and system architecture.


Single Text Analysis

I am currently looking for a job and would like to write a resume and cover letter that get shortlisted for the interview process. Outstanding cover letters and resumes can turn a middle-of-the-road candidate into a top contender, so it is vital to write them in a way that gets them selected by recruiters. Recruiters use sourcing tools to shortlist resumes and cover letters for the interview process. These tools aim to shortlist candidates through smarter job matching: the goal is to create a pool of quality candidates whose skill sets are as close as possible to the criteria in the job description.

I will use Voyant to analyze the text of a job posting, identify its important words, and use them to craft my resume and cover letter. Please download the job posting description for this module by clicking HERE.
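The idea behind this exercise can also be sketched outside Voyant. The following Python example is only an illustration under stated assumptions: the posting and resume strings are made-up stand-ins (not the module's job posting), the stop-word list is a tiny hypothetical one, and the helper names `top_terms` and `missing_terms` are invented for this sketch. It counts the most frequent terms in a posting and lists those that do not yet appear in a resume.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; Voyant's own list is far longer).
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in", "with",
              "for", "is", "are", "we", "you", "our", "on"}

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def top_terms(text, n=5):
    """Return the n most frequent non-stop-word terms with their counts."""
    counts = Counter(t for t in tokenize(text) if t not in STOP_WORDS)
    return counts.most_common(n)

def missing_terms(posting, resume, n=5):
    """Return frequent posting terms that do not occur in the resume."""
    resume_words = set(tokenize(resume))
    return [term for term, _ in top_terms(posting, n) if term not in resume_words]

# Made-up sample texts, used only to demonstrate the functions.
posting = ("We are looking for a data analyst with Python and SQL skills. "
           "The analyst will build dashboards and report on data quality.")
resume = "Experienced analyst skilled in Python, Excel, and data visualization."

print(top_terms(posting, 2))
print(missing_terms(posting, resume, 20))
```

A short stop-word list like this one lets filler words such as "looking" or "will" through, which is one reason a tool with a fuller list, such as Voyant, tends to give cleaner results.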


Getting Started

Please watch the video below for a quick review of getting started with Voyant Tools.


Understanding Voyant Dashboard

Please click on the pink info icons on the image below to learn more about the different tools on the Voyant dashboard.


Learning Different Tools

Voyant offers several categories of additional tools: corpus tools, document tools, visualization tools, grid tools, and others. To change the tool shown in a panel, click the “Windows” button in the top bar of that tool. This reveals a drop-down menu from which you can select a different tool to replace the current one.

How to open different tools

Some tools overlap in function, so for the purpose of this module we have left out the repetitive ones. To learn more about the various tools, please refer to the sections below:

Corpus Tools

Document Tools

Visualization Tools


Multiple Text Analysis

When you read many novels or poems, how do you analyze them? What kind of information do you look for, and how do you decide what it means? Voyant can also be used to analyze multiple texts at once. Multiple text analysis generally involves detecting patterns, such as word frequencies or associative links between words, and comparing them across two or more texts. Its most common use is distant reading, which refers to the use of computational methods to analyze literary texts without actually reading the whole document. We will perform a multiple text analysis on two books taken from Project Gutenberg, a library of over 60,000 free eBooks.
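To make the idea of comparing word frequencies across texts concrete, here is a minimal Python sketch. It is an illustration only, not a description of how Voyant computes its figures: the two sample sentences are invented stand-ins rather than quotations from the novels, and the function names are hypothetical. Because texts differ in length, the sketch reports occurrences per 1,000 words instead of raw counts.

```python
import re

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def relative_frequency(text, term, per=1000):
    """Occurrences of `term` per `per` words, so texts of different lengths are comparable."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return tokens.count(term.lower()) / len(tokens) * per

def compare(texts, term):
    """Map each document title to the relative frequency of `term` in that document."""
    return {title: relative_frequency(body, term) for title, body in texts.items()}

# Invented stand-in sentences (not quotations), eight words each.
texts = {
    "sample_a": "Alice said the Queen said nothing more today",
    "sample_b": "Elizabeth said that Darcy was proud and silent",
}

print(compare(texts, "said"))
```

With full novels loaded in place of the sample strings, the same comparison would show how much more heavily one text relies on a given word than the other.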

Part One:

Please open Voyant Tools. Once it is open, copy and paste the links below for the books Alice's Adventures in Wonderland and Pride and Prejudice into the ADD TEXT section.

https://raw.githubusercontent.com/DagaGargi/Introduction-to-Text-Analysis-with-Voyant-Tools/master/Alice%E2%80%99s%20Adventures%20in%20Wonderland%2C%20by%20Lewis%20Carroll.txt

https://raw.githubusercontent.com/DagaGargi/Introduction-to-Text-Analysis-with-Voyant-Tools/master/Pride%20and%20Prejudice%2C%20by%20Jane%20Austen.txt

Please make sure to put each link on a separate line.

Picture displaying how to add links in Voyant

Part Two:

You will see a Voyant dashboard that gives the analysis for both texts.

The summary tab provides a comparative analysis of the novels on the basis of document length, vocabulary density, average words per sentence, readability index, most frequent words in the corpus, and distinctive words.

Picture displaying the summary tab
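Two of the measures in the summary can be approximated by hand. The Python sketch below assumes that vocabulary density means the number of unique word forms (types) divided by the total number of words, and it uses a deliberately naive sentence splitter; Voyant's own tokenization and sentence detection may differ, so the numbers will not match exactly. The function names and the sample sentence are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def split_sentences(text):
    """Naive sentence split on '.', '!' and '?' (an assumption, not Voyant's method)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]

def summary_stats(text):
    """Compute word count, type count, vocabulary density and average sentence length."""
    tokens = tokenize(text)
    types = set(tokens)
    sentences = split_sentences(text)
    return {
        "words": len(tokens),
        "types": len(types),
        "vocabulary_density": len(types) / len(tokens) if tokens else 0.0,
        "avg_words_per_sentence": len(tokens) / len(sentences) if sentences else 0.0,
    }

# Made-up sample text: 8 words, 6 unique forms, 2 sentences.
stats = summary_stats("The cat sat down. The cat ran away!")
print(stats)
```

A higher vocabulary density suggests that a text repeats itself less. Longer texts tend to score lower simply because common words recur, so compare densities with the document lengths in mind.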

When you click on the document tab, it reveals the words, types, ratios, and words per sentence for both novels.

Picture displaying the document tab

Part Three:

Under Cirrus, you will see the most frequent words used in the corpus. Cirrus combines both books and displays a word cloud that visualizes their most frequently used words. Hovering over a word displays the number of times that word appears in the corpus. Moving the bottom-left slider, labeled “Terms”, adjusts the number of words shown.

Picture displaying the Cirrus

In the summary tab, you will see a section, “Most frequent words in the corpus”, which lists the same words as Cirrus.
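The counts behind a word cloud can be sketched in a few lines of Python. This is an illustration under assumptions, not Voyant's implementation: the two documents are invented stand-ins for the novels, the stop-word list is a tiny hypothetical one, and the `terms` parameter only mimics the role of the “Terms” slider by limiting how many words are returned.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; Voyant's own list is far longer).
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "in"}

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def corpus_counts(documents):
    """Add up term counts across every document in the corpus."""
    total = Counter()
    for body in documents.values():
        total.update(t for t in tokenize(body) if t not in STOP_WORDS)
    return total

def cloud_terms(documents, terms=3):
    """Return the `terms` most frequent words, mimicking the Terms slider."""
    return corpus_counts(documents).most_common(terms)

# Invented stand-in documents, used only to demonstrate the functions.
documents = {
    "doc_a": "The queen and the rabbit met the queen",
    "doc_b": "The rabbit wrote a letter to the queen",
}

print(cloud_terms(documents, terms=2))
```

Because the counts are summed over the whole corpus, a word that dominates one long document can dominate the cloud even if it is rare in the other, which is worth remembering when reading a combined word cloud.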

Further Learning

To analyze your books further, please try the other tools that Voyant offers. Different tools surface different findings, which you can draw on to interpret the texts.


This workshop is brought to you by the Brock University Digital Scholarship Lab. For a listing of our upcoming workshops, go to Experience BU if you are a Brock affiliate, or to our Eventbrite page if you are an external attendee. For additional inquiries, contact DSL@Brocku.ca.