R Practice: Building Interdisciplinary Skillsets to Understand Environmental Attitudes (Part I: Word Clouds)
Technical Learning Objective: In this module, students will learn how to prepare a .txt file for text analysis in R and how to make a word cloud from the resulting textual data.
Word Cloud Analysis of Silent Spring
Note: This module focuses on preparing a .txt file for text analysis. Please refer to the accompanying module 'Sentiment Analysis' to see how the product of this preparation is used.
When we think of coding, we often associate it heavily with STEM. However, coding is used in a wide range of fields, including the humanities. Often, we need to think critically about how to analyze large amounts of text without performing a close reading of them. Several methods in R allow us to summarize these large amounts of textual data into meaningful patterns.
A corpus is the term used for a set of text files of interest in an analysis. For this particular module, we will focus on the famous book 'Silent Spring' by Rachel Carson. This book sparked a massive environmental movement - including the creation of the Environmental Protection Agency in 1970 and the banning of DDT (an insecticide that wreaked havoc on natural environments, particularly on birds). The impact of this one book is a testament to the power of individuals in conservation. Here, we will prepare the text for a "sentiment analysis" of Silent Spring. Sentiment analysis is used to determine the "tone" of a corpus: whether it is positive, negative, or neutral.
Figure: Rachel Carson. Photo by the U.S. Fish and Wildlife Service, licensed under CC-BY.
This module will focus first on the often overlooked, but critical, stage of any analysis: data cleaning and preparation. Because we want to identify sentiment in our corpus as accurately as possible, we have to account for the factors that influence the sentiment of a text. As you work through this module, pay attention to the comments and to how each line of code in our text preparation supports an objective and precise corpus analysis.
The analysis below is based on guidance from an article by Mhatre et al. (2021).
Loading Packages and Data
Before we get to work, we have to install and load the appropriate packages for this type of analysis. Because these are specialized packages, we need to install most of them before loading them; this stage might take a minute! The more common packages, which are already built into LibreTexts, have been commented out of the install.packages commands with a #. When you run R on your own computer, you only need to install packages once (but you do need to load them each time).
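A minimal sketch of this step is shown below. The exact package list depends on your workflow; here we assume the tm text-mining package and the wordcloud package (with RColorBrewer for color palettes), following Mhatre et al. (2021).

```r
# Specialized packages: install these once before first use
install.packages("tm")         # text-mining framework (corpus handling, cleaning)
install.packages("wordcloud")  # word cloud visualization

# Common package, already available here, so its installation is commented out
# install.packages("RColorBrewer")

# Load the packages (this must be done in every session)
library(tm)
library(wordcloud)
library(RColorBrewer)
```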
Now that our packages are ready, let's load our data.
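A sketch of loading the text, assuming the book has been saved as a plain-text file (the filename silent_spring.txt is a placeholder for your own file):

```r
# Read the book in as a character vector, one element per line of the file
text <- readLines("silent_spring.txt", encoding = "UTF-8")

# Wrap the raw text in a tm corpus object so we can clean it in the next step
docs <- Corpus(VectorSource(text))
```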
Cleaning and Preparing Our Data
After we load our .txt file, we have to make sure we eliminate anything that is not meaningful text. We will go line by line and note what needs to be removed in order to create a proper corpus.
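A sketch of a typical cleaning pipeline with tm is below; each transformation removes a feature (case, numbers, punctuation, stopwords, extra whitespace) that would otherwise distort word counts:

```r
# Convert everything to lower case so "Spring" and "spring" are counted together
docs <- tm_map(docs, content_transformer(tolower))

# Strip numbers and punctuation -- neither carries meaning for a word count
docs <- tm_map(docs, removeNumbers)
docs <- tm_map(docs, removePunctuation)

# Remove common English stopwords ("the", "and", "or", ...)
docs <- tm_map(docs, removeWords, stopwords("english"))

# Collapse the extra whitespace left behind by the removals
docs <- tm_map(docs, stripWhitespace)
```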
Note: You will get a warning message, but that is only because the corpus is being altered. If you see these messages, you are on the right track.
All of these preparations can be reused for many other purposes; for example, we can generate a word cloud or compute statistical word-association values.
Now that we understand how the preparation of a text corpus works, try analyzing a corpus from after the publication of Silent Spring. We challenge you to come up with 1-3 custom stopwords and rerun your analysis to see whether anything changes (a sketch of adding custom stopwords follows below). This kind of text preprocessing can feed many different word analyses and visualizations. The most popular visualization, for example, is the word cloud. Without the elimination of common English stopwords during preprocessing, words like 'the', 'and', and 'or' would be the most frequent in the word cloud.
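Custom stopwords can be removed the same way as the built-in English list. The words below are purely hypothetical examples; substitute whichever words dominate your corpus without informing your question:

```r
# Remove corpus-specific stopwords (hypothetical examples)
docs <- tm_map(docs, removeWords, c("chapter", "also", "upon"))
```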
Thanks to the preprocessing, the most frequent words are now meaningful content words.
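One way to compute and inspect those frequencies with tm is sketched below (the object names tdm, word_freqs, and df are our own choices):

```r
# Build a term-document matrix and rank words by total frequency
tdm <- TermDocumentMatrix(docs)
word_freqs <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)
df <- data.frame(word = names(word_freqs), freq = word_freqs)

# Show the ten most frequent words in the corpus
head(df, 10)
```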
Above, we see the most common words; now we will visualize them in a word cloud.
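A sketch of the word cloud itself, using the wordcloud package; the thresholds and palette here are illustrative choices, not fixed requirements:

```r
set.seed(1234)  # fix the random seed so the cloud layout is reproducible
wordcloud(words = df$word, freq = df$freq,
          min.freq = 5,           # drop very rare words
          max.words = 100,        # cap the number of words drawn
          random.order = FALSE,   # place the most frequent words in the center
          colors = brewer.pal(8, "Dark2"))  # color words by frequency band
```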
While you may be wondering what this has to do explicitly with ecology, consider the interdisciplinary nature of Environmental Studies. The humanities intersect with people's ecological perspectives, which would traditionally be analyzed through close reading and large-scale analysis of individual texts. With coding, we can still achieve those results while making the analysis of many forms of data widely available and accessible. Think about the possibilities of doing this for other disciplines and data forms.
Food for Thought
What themes and ideas can you draw from the word cloud? How can these be applied to broader studies in ecology?
References:
Carson, R. (1962). Silent Spring. Crest Book.
Mhatre, S., Sampaio, J., Torres, D., & Abhishek, K. (2021, September 15). Text mining and sentiment analysis: Analysis with R. Simple Talk. Retrieved November 18, 2022, from https://www.red-gate.com/simple-talk...alysis-with-r/