- webMAUS: an automatic speech aligner with a wide range of capabilities (e.g. an API and support for many languages)
- TreeForm: a Java program with great options for drawing syntactic trees
- RStudio: a popular IDE for R
- An R Markdown template for producing pleasant HTML output
- Praat: the primary tool for analyzing sound recordings for linguistic purposes
- FFmpeg: the finest media conversion tool no money can buy
- The British National Corpus (BNC) [1994 version] [2014 version]
- A large (1994: ~88M written + ~10.5M spoken words; 2014: ~11.5M spoken words) balanced corpus of British English from the 1990s and 2010s, respectively
- The Corpus of Contemporary American English (COCA)
- A large (~560M words) balanced corpus of American English from between 1990 and 2017
- The Parse and Query (PaQu) web service
- A web service for parsing Dutch text and querying syntactically annotated Dutch corpora
- The Google Ngram viewer
- The raw data is available here. Note that the data is quite messy and needs a lot of preprocessing before it can be used.
- An Introduction to Bootstrap Methods with Applications to R by Michael Chernick and Robert LaBudde
- How to do Linguistics with R by Natalia Levshina
- Learning Python by Mark Lutz
- Trask’s Historical Linguistics by Robert Millar
- Sapiens by Yuval Harari
- The Silk Roads by Peter Frankopan
- The Lord of the Rings by J. R. R. Tolkien
- Planet of Adventure by Jack Vance
- The Name of the Wind & The Wise Man’s Fear by Patrick Rothfuss