Web Scraping
1. Fundamentals of Web Scraping
2. Core Web Technologies for Scraping
3. The Web Scraping Process
4. Essential Tools and Libraries
5. Data Extraction Techniques
6. Handling Common Scraping Challenges
7. Advanced Scraping Techniques
8. Data Storage and Post-Processing
9. Project Management and Best Practices
Core Web Technologies for Scraping
HyperText Transfer Protocol
The Request-Response Model
Client-Server Architecture
Lifecycle of a Request
Connection Management
HTTP Methods
GET Requests
Retrieving Resources
Query Parameters
Caching Considerations
POST Requests
Submitting Data
Form Data Encoding
Request Body Formats
Other HTTP Methods
PUT Method
DELETE Method
PATCH Method
HTTP Headers
User-Agent Header
Customizing User-Agent Strings
Implications for Scraping
Browser Fingerprinting
Referer Header
Tracking Navigation Paths
Security Implications
Cookie Management
Session Management
Persistent vs. Session Cookies
Cookie Security Attributes
Accept and Content-Type Headers
Content Negotiation
MIME Types
HTTP Status Codes
Success Codes
200 OK
201 Created
204 No Content
Redirection Codes
301 Moved Permanently
302 Found
304 Not Modified
Client Error Codes
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
429 Too Many Requests
Server Error Codes
500 Internal Server Error
502 Bad Gateway
503 Service Unavailable
HyperText Markup Language
Document Object Model
Tree Structure
Nodes and Elements
DOM Manipulation
HTML Tags and Attributes
Common Structural Tags
Content Tags
Form Elements
Attribute Usage
Document Structure
HTML Document Declaration
Head Section Elements
Body Section Organization
Semantic HTML
Semantic Elements
Accessibility Considerations
HTML Parsing Challenges
Malformed HTML
Browser Compatibility
Cascading Style Sheets
CSS Selectors
Basic Selectors
Attribute Selectors
Pseudo-classes
Combinator Selectors
CSS Impact on Scraping
Hidden Elements
Dynamic Classes and IDs
CSS-Generated Content
Responsive Design Considerations
Media Queries
Mobile vs. Desktop Layouts