Computational Linguistics
1. Introduction to Computational Linguistics
2. Mathematical and Computational Foundations
3. Foundational Concepts in Linguistics
4. Computational Phonology and Morphology
5. Syntax and Parsing
6. Computational Semantics
7. Pragmatics and Discourse
8. Corpus Linguistics and Data
9. Statistical and Machine Learning Methods
10. Advanced Topics and Applications
11. Evaluation Methodologies
12. Current Challenges and Future Directions
8. Corpus Linguistics and Data
8.1. Corpus Design and Construction
8.1.1. Corpus Types
8.1.1.1. Reference Corpora
8.1.1.2. Monitor Corpora
8.1.1.3. Specialized Corpora
8.1.1.4. Parallel Corpora
8.1.2. Sampling and Representativeness
8.1.2.1. Population Definition
8.1.2.2. Sampling Strategies
8.1.2.3. Size Considerations
8.1.3. Web as Corpus
8.1.3.1. Web Crawling Techniques
8.1.3.2. Data Quality Issues
8.1.3.3. Ethical Considerations
8.1.4. Multilingual Corpora
8.1.4.1. Parallel Text Alignment
8.1.4.2. Comparable Corpora
8.1.4.3. Cross-Lingual Resources
8.2. Corpus Annotation
8.2.1. Annotation Frameworks
8.2.1.1. Annotation Guidelines
8.2.1.2. Quality Control
8.2.1.3. Interoperability Standards
8.2.2. Morphosyntactic Annotation
8.2.2.1. Part-of-Speech Tagging
8.2.2.2. Morphological Analysis
8.2.2.3. Lemmatization
8.2.3. Syntactic Annotation
8.2.3.1. Phrase Structure Treebanks
8.2.3.2. Dependency Treebanks
8.2.3.3. Universal Dependencies
8.2.4. Semantic Annotation
8.2.4.1. Word Sense Annotation
8.2.4.2. Semantic Role Labeling
8.2.4.3. Named Entity Annotation
8.2.5. Discourse Annotation
8.2.5.1. Coreference Annotation
8.2.5.2. Discourse Relations
8.2.5.3. Dialogue Act Annotation
8.3. Annotation Quality and Agreement
8.3.1. Inter-Annotator Agreement
8.3.1.1. Percent Agreement
8.3.1.2. Cohen's Kappa
8.3.1.3. Krippendorff's Alpha
8.3.2. Annotation Consistency
8.3.2.1. Guidelines Refinement
8.3.2.2. Annotator Training
8.3.2.3. Adjudication Processes
8.3.3. Crowdsourcing Annotation
8.3.3.1. Quality Control Mechanisms
8.3.3.2. Aggregation Methods
8.3.3.3. Cost-Benefit Analysis
8.4. Corpus Analysis Methods
8.4.1. Frequency Analysis
8.4.1.1. Word Frequency Distributions
8.4.1.2. Zipf's Law
8.4.1.3. Frequency Lists and Rankings
8.4.2. Collocation Analysis
8.4.2.1. Statistical Measures
8.4.2.2. Mutual Information
8.4.2.3. Log-Likelihood Ratio
8.4.3. Concordance Analysis
8.4.3.1. KWIC Concordances
8.4.3.2. Pattern Identification
8.4.3.3. Linguistic Variation
8.4.4. Comparative Analysis
8.4.4.1. Corpus Comparison
8.4.4.2. Diachronic Analysis
8.4.4.3. Cross-Linguistic Studies