Online Corpus
| PolyU Language Bank | Over 36 million words of multilingual, multi-genre corpora | free |
| RCPCE Profession-specific Corpora | A large collection of texts used in different professions in Hong Kong | free |
| A Query to Internet Corpora (Leeds U) | Regularly updated general-purpose online corpora in different languages | |
| British National Corpus (1980-1993) | A standard English corpus often used as a reference corpus. | |
| British Academic Written English corpus (BAWE) | A 6-million-word collection of student essays in different disciplines | |
| Business Letter Corpus | A corpus with different English letters | |
| BYU Corpora | A collection of mega-corpora, including the BNC and NOW (News on the Web, from 2010 to yesterday) | |
| The Corpus of Contemporary American English (COCA, 1990-present) | Representative of modern American English | |
| Time Magazine (1923-2006) | A corpus for diachronic language study | free |
| GloWbE (Global Web-Based English) | 1.9 billion words of English used in 20 countries | free |
| MICASE | Transcripts of a wide range of spoken academic texts from the University of Michigan. | free |
| The Oxford Text Archive | The Archive develops, collects, catalogues and preserves a variety of electronic literary and linguistic resources | free |
| WebCorp | Allows corpus-type searches of documents in English on the Internet. | free |
| CQP Web for Language Corpora | A collection of corpora created by the Language and Multimodal Analysis Lab (LAMAL), Department of English, The Hong Kong Polytechnic University | free |
| Fashion Communication Corpus (FCC) | A 1-million-word collection of texts obtained from fashion magazines, literature, journals, websites, etc. | free |
| Enron email corpus | Enron email data sets compiled at UC Berkeley | free |
| Corpora maintained by Geoffrey Sampson | A collection of different texts | |
Parallel Corpus
| Bilingual Parallel Corpora of Chinese Classics | Parallel texts of Chinese classic novels and government documents | |
| English-Chinese parallel concordancer | A collection of novels, fables and essays | free |
Text Archive
| Project Gutenberg | The pioneering project designed to make out-of-copyright texts available electronically | free |
| Internet Archive | The Internet Archive Text Archive contains a wide range of fiction, popular books, children’s books, historical texts and academic books. | free |
| Internet Archive: Wayback Machine | The Wayback Machine is a digital archive of the World Wide Web and other information on the Internet. You can check the Wayback Machine for archives of a website. | free |
Word Cloud
| Voyant Tools | Creates word clouds based on word frequency | free |
| Wordle | Wordle is a tool for generating “word clouds” from text that you provide. | free |
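
Word clouds of the kind listed above size each word by how often it occurs. The following is a minimal sketch of that frequency step only, assuming simple lowercase tokenisation on letters and apostrophes; the function name `word_frequencies` is hypothetical, and this is not how Voyant Tools or Wordle are actually implemented.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=10):
    """Return the top_n most frequent lowercase word tokens in text."""
    # Assumption: a token is a run of letters, optionally with one internal apostrophe.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(tokens).most_common(top_n)

sample = "The cat sat on the mat. The mat was flat."
print(word_frequencies(sample, top_n=2))  # [('the', 3), ('mat', 2)]
```

In practice a stop-word list would usually be applied first, so that function words such as "the" do not dominate the cloud.
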
Corpus Tools
| AntConc | A freeware concordance program for Windows. Please visit Laurence Anthony’s Website for the complete list of software. | free |
| AntCorGen | A freeware discipline-specific corpus creation tool. | free |
| ConcGram 1.0 | ConcGram 1.0 is a corpus linguistics software package which is specifically designed to find all the co-occurrences of words in a text or corpus irrespective of variation. | |
| ConcGramCore | ConcGramCore is an open-source corpus linguistics software package for corpus linguists to find all the co-occurrences of words in a text or corpus irrespective of variation. The software is in continuous development. | free |
| ParaConc | A bilingual or multilingual concordancer that can be used in contrastive analyses and translation studies | free trial |
| WordSmith Tools | Concordancing, word lists, key words | |
| Leximancer | Lexical analysis | free trial |
| WMatrix | In addition to frequency lists and concordances, WMatrix extends the keywords method to key grammatical categories and key semantic domains. | free trial |
| Sketch Engine | Sketch Engine can provide a one-page summary of the word’s grammatical and collocational behavior, showing the word’s collocates categorised by grammatical relations. | |
| ATLAS.ti (7) | For qualitative data analysis and discourse analysis | free trial |
| NVivo (10) | For qualitative data analysis and discourse analysis | |
| kfNgram | kfNgram makes n-gram indices of any text(s) you give it, similar to the Cluster function in WordSmith Tools. | free |
| The IMS Open Corpus Workbench | | free |
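
Two operations recur across the tools in this table: n-gram counting (as described for kfNgram) and keyword-in-context (KWIC) concordancing. The sketch below illustrates both on a pre-tokenised text. The function names `ngram_counts` and `kwic` are hypothetical, matching is exact and case-sensitive, and none of this reflects the internals of any listed program.

```python
from collections import Counter

def ngram_counts(tokens, n=2):
    """Count contiguous n-grams (as tuples) in a list of tokens."""
    # zip over n shifted copies of the token list yields each n-gram once.
    return Counter(zip(*(tokens[i:] for i in range(n))))

def kwic(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for each hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines

tokens = "to be or not to be".split()
print(ngram_counts(tokens, n=2).most_common(1))  # [(('to', 'be'), 2)]
print(kwic(tokens, "be", window=2))
# [('to', 'be', 'or not'), ('not to', 'be', '')]
```
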
Lexical Analysers
| The Ultimate Research Assistant | Lexical semantic thematic analysis of web documents | free |
Taggers
| CLAWS | Word class (part-of-speech) tagger | free |
| Stanford Log-linear Part-of-Speech Tagger | Software for POS tagging | free |
| Stanford CoreNLP online engine | Online interface to the Stanford CoreNLP software. See the Stanford CoreNLP website for more information about the package. | free |
| GUM | The Georgetown University Multilayer Corpus | free |
Phonetic Analysis
| Praat | Praat (the Dutch word for “talk”) is a free scientific software program for the analysis of speech in phonetics. | free |
| EMU (The Emu Speech Database System) | EMU is a collection of software tools for the creation, manipulation and analysis of speech databases. | free |
| WaveSurfer | WaveSurfer is an Open Source tool for sound visualization and manipulation. | free |
| Speech Analyzer | Speech Analyzer is a computer program for acoustic analysis of speech sounds. | free |
Development Workbench
| KPML | Workbench for developing grammatical descriptions and defining computational grammars | free |
| TermBase | Database for developing and storing terminologies | free |
Descriptive Resources
| WordNet | A lexical database organizing nouns, verbs, adjectives and adverbs into synonym sets, each representing one underlying lexical concept. | free |
| FrameNet | A lexical database containing around 1,200 semantic frames, 13,000 lexical units and over 190,000 example sentences. | free |
Statistical Tools
| SPSS | A widely used suite of advanced statistical and analytical tools. | |
| R Project | A free package for statistical computing and graphics | free |
| GNU PSPP | A free program for statistical analysis. It is a free (as in freedom) replacement for the proprietary program SPSS, and appears very similar to it with a few exceptions. | free |
| Sample Size Calculator | An online calculator to find out the sample size based on the set confidence level and confidence interval. Useful for quantitative research sampling. | free |
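
For readers who want to see what a sample size calculator of this kind computes, here is a minimal sketch using Cochran's formula with an optional finite population correction. This is an assumption about the method, not a description of the linked calculator, which may differ. The function name `sample_size` and its parameters are hypothetical; "confidence interval" is taken to mean the margin of error (for example, 0.05 for plus or minus 5%), and the proportion defaults to the most conservative value of 0.5.

```python
import math
from statistics import NormalDist

def sample_size(confidence=0.95, margin=0.05, population=None, p=0.5):
    """Estimate the required sample size for a proportion (Cochran's formula)."""
    # Two-sided z-score for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction for a known population size.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

print(sample_size(0.95, 0.05))                   # 385
print(sample_size(0.95, 0.05, population=1000))  # 278
```

Rounding up is a deliberate choice here, so that the returned sample is never smaller than the formula requires.
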