Machine Learning with Scikit-Learn
Working with Text Data
Text Preprocessing
    Text Cleaning
        Removing HTML Tags
        Handling Special Characters
        Unicode Normalization
    Tokenization
        Word Tokenization
        Sentence Tokenization
    Text Normalization
        Lowercasing
        Stemming
        Lemmatization
    Stop Word Removal
        Standard Stop Words
        Custom Stop Words
Feature Extraction from Text
    Bag of Words Model
        CountVectorizer
        Vocabulary Building
        Token Patterns
        N-grams
        Binary Occurrence
    TF-IDF Vectorization
        Term Frequency
        Inverse Document Frequency
        TfidfVectorizer
        TfidfTransformer
    Hashing Vectorizer
        Memory Efficiency
        Hash Collisions
    Character-level Features
        Character N-grams
Text Classification
    Document Classification Pipeline
    Feature Selection for Text
    Handling Large Vocabularies
    Sparse Matrix Operations
Text Clustering
    Document Clustering
    Topic Modeling Preparation
Advanced Text Processing
    Handling Multiple Languages
    Custom Tokenizers
    Feature Hashing