MediaCrawler automates Chinese social media scraping
MediaCrawler is an open-source Python framework that uses Playwright-based browser automation to scrape content and comments from major Chinese social media platforms. By driving a real browser and simulating user interactions, it avoids having to reverse-engineer each platform's security checks and request-signing mechanisms.
Traditional scraping projects tend to break whenever a platform updates its security or changes its signature scheme. MediaCrawler's browser-automation-first design makes it more resilient on heavily protected platforms.
* Playwright integration bypasses complex JS signature and anti-crawling defenses by running logic directly in a real browser context.
* Support for a wide range of popular Chinese social platforms makes it a valuable tool for localized market research, sentiment analysis, and academic study.
* The repository's rapid rise to over 50,000 stars underscores the massive demand for reliable data access to notoriously locked-down Chinese platforms.
* The non-commercial licensing model limits its use in enterprise environments, though the creator offers a Pro version with features like multi-account/proxy handling and API access.
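
The first bullet's idea, running the signing logic inside a real browser instead of re-implementing it, can be sketched roughly as follows. This is a minimal illustration using Playwright's Python API, not MediaCrawler's actual code. The demo page, the `window.sign` function, and the `sign` query parameter are all hypothetical stand-ins for whatever a real platform ships.

```python
from urllib.parse import urlencode

# Stand-in page: a real platform would load its own (usually obfuscated)
# signing script. `window.sign` is a hypothetical name for this sketch.
DEMO_PAGE = """
<script>
  window.sign = (query) => {
    let h = 0;
    for (const ch of query) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
    return h.toString(16);
  };
</script>
"""


def signed_query(params: dict, signature: str) -> str:
    """Append a browser-computed signature to the request parameters."""
    return urlencode({**params, "sign": signature})


def sign_in_browser(params: dict) -> str:
    """Let the page's own JavaScript sign the query inside headless Chromium."""
    # Imported lazily so signed_query() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.set_content(DEMO_PAGE)
        # The scraper never needs to know how the signature is computed;
        # it only calls the function the page already exposes.
        signature = page.evaluate("q => window.sign(q)", urlencode(params))
        browser.close()
    return signed_query(params, signature)
```

Calling `sign_in_browser({"keyword": "example", "page": 1})` requires Playwright and a Chromium build (`pip install playwright`, then `playwright install chromium`). The point of the design is that a change to the page's signing script needs no change on the Python side, as long as the entry point stays the same.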
DISCOVERED: 2026-06-27
PUBLISHED: 2026-06-27