AIBit-Discover Open Source Projects
Web Scraping & Data: Scraping Libraries

March 1, 2026

Scrapling: Ultimate Python Web Scraping Framework

Discover Scrapling, an adaptive web scraping framework that handles everything from single requests to full-scale crawls. It can bypass Cloudflare Turnstile, uses smart element tracking that survives website changes, and scales with concurrent spiders that support pause and resume. It also offers stealth modes, proxy rotation, AI integration via an MCP server, and performance that outperforms Scrapy/Parsel. Install it with pip and start scraping in minutes!

  • Jul 9, 2025

    Crawlee: Powering Reliable Web Scraping with Node.js

    Discover Crawlee, the powerful Node.js library for web scraping and browser automation. Learn how this open-source tool helps developers build robust and reliable crawlers with features like proxy rotation, bot protection evasion, and support for Puppeteer and Playwright. Whether you're extracting data for AI, LLMs, or general data collection, Crawlee streamlines the process. Explore its capabilities and find out how to get started with installation and basic usage. Ideal for JavaScript and TypeScript developers looking to enhance their data extraction workflows and ensure their crawlers operate efficiently and undetected.

  • Jun 29, 2025

    Crawlee-Python: The Ultimate Web Scraping Library

    Discover Crawlee-Python, a robust and reliable web scraping and browser automation library. Ideal for data extraction for AI, LLMs, RAG, and GPTs, Crawlee handles everything from downloading various file types to working with BeautifulSoup, Playwright, and raw HTTP. It supports both headful and headless modes, offering proxy rotation and advanced features for building resilient crawlers. This library simplifies complex scraping tasks, ensuring your projects are efficient and effective. Learn how Crawlee revolutionizes web data collection and automation for developers.

  • Jun 29, 2025

    Crawl4AI: The Open-Source LLM-Friendly Web Crawler

    Discover Crawl4AI, the trending open-source web crawler engineered for Large Language Models (LLMs) and AI agents. This powerful tool offers lightning-fast, AI-ready data extraction, enabling developers to build robust RAG applications and data pipelines. Learn about its key features, including intelligent Markdown generation, structured data extraction, flexible browser control, and easy Docker deployment. Ideal for anyone looking to democratize data access and empower AI models with high-quality, real-time web content.

  • Jun 22, 2025

    WaterCrawl: Transform Web Content into LLM-Ready Data

    Discover WaterCrawl, a powerful open-source web application designed to crawl web pages and extract relevant data, making it ready for integration with Large Language Models (LLMs). Built with Python, Django, Scrapy, and Celery, WaterCrawl offers advanced web crawling, multi-language support, and asynchronous processing. It provides comprehensive API access, client SDKs (Python, Node.js, Go, PHP), and integrations with platforms like Dify and n8n. Whether you're a developer building data pipelines for AI or an organization needing robust web scraping tools, WaterCrawl offers a self-hosted, customizable solution. Learn how to get started quickly with Docker or contribute to its ongoing development.

