Wiseflow is an agile information mining tool that extracts concise messages from various sources such as websites, WeChat official accounts, social platforms, etc. It automatically categorizes and …

Python 3,842 616 Updated Sep 4, 2024

opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Python 11,513 864 Updated Sep 23, 2024

Menghuan1918 / pdfdeal

A Python wrapper for the Doc2X API that also comes with local PDF processing (to improve PDF recall in RAG).

Python 177 10 Updated Sep 12, 2024

microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

TypeScript 65,837 3,583 Updated Sep 23, 2024

scholarly-python-package / scholarly

Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!

Python 1,357 298 Updated Jul 3, 2024

adbar / trafilatura

Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML

Python 3,478 253 Updated Sep 10, 2024

stardust95 / NewsFeeds

A news aggregation website using Node.js as the server and MongoDB as the storage backend, including a simple recommendation system that suggests news based on user behavior.

HTML 24 11 Updated Sep 7, 2020

stanford-oval / storm

An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

Python 10,788 1,000 Updated Sep 23, 2024

automagica / automagica

AI-powered Smart Robotic Process Automation 🤖

Python 3,000 476 Updated Oct 13, 2020

aisingapore / TagUI

Free RPA tool by AI Singapore

JavaScript 5,585 580 Updated Sep 6, 2024

tebelorg / RPA-Python

Python package for doing RPA

Python 4,856 663 Updated Sep 5, 2024

robotframework / robotframework

Generic automation framework for acceptance testing and RPA

Python 9,682 2,323 Updated Sep 23, 2024

Skyvern-AI / skyvern

Automate browser-based workflows with LLMs and Computer Vision

Python 5,823 425 Updated Sep 23, 2024

LLM-Red-Team / deepseek-free-api

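🚀 Reverse-engineered API for the DeepSeek-V2 large model, for free-use testing [strength: GPT-4 alternative]. Supports high-speed streaming output, multi-turn conversation, zero-configuration deployment, and multiple tokens.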

TypeScript 309 101 Updated Jun 20, 2024

dylanhogg / llmgraph

Create knowledge graphs with LLMs

Jupyter Notebook 287 20 Updated Aug 31, 2024

RVC-Boss / GPT-SoVITS

One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)

Python 32,938 3,788 Updated Sep 17, 2024

Alir3z4 / html2text

Convert HTML to Markdown-formatted text.

Python 1,802 273 Updated Jul 25, 2024

py-bin / dianping_textmining

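Text mining of Dianping reviews: a complete mining project covering review data crawling, data cleaning and database loading, data analysis, and review sentiment analysis.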

Jupyter Notebook 711 166 Updated Oct 9, 2018

Sniper970119 / dianping_spider

Dianping crawler (can crawl the whole site; solves dynamic font encryption without OCR). Continuously updated.

Python 895 157 Updated Aug 29, 2024

olonok69 / LLM_Notebooks

Notebooks and code about generative AI, LLMs, MLOps, NLP, CV, and graph databases

Jupyter Notebook 41 8 Updated Sep 20, 2024

frankkramer-lab / miseval

a metric library for Medical Image Segmentation EVALuation

Jupyter Notebook 107 22 Updated Aug 12, 2022

6drf21e / ChatTTS_Speaker

ChatTTS 2K speaker stability scores 🥇, categorized by gender and age 👧, with online audio preview 🔈

Python 463 24 Updated Jul 2, 2024

6drf21e / ChatTTS_colab

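🚀 One-click deployment (including an offline bundle)! Based on ChatTTS, with support for streaming output, random voice sampling, long audio generation, and multi-role reading. Simple and easy to use, with no complicated installation.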

Python 1,920 240 Updated Jul 2, 2024

Jingnan-Jia / segmentation_metrics

A package to compute medical segmentation metrics.

Python 115 12 Updated Jul 16, 2024

markmap / markmap

Build mindmaps with plain text

TypeScript 8,406 592 Updated Sep 23, 2024

modelscope / KAN-TTS

KAN-TTS is a speech-synthesis training framework. Please try the demos we have posted at https://modelscope.cn/models?page=1&tasks=text-to-speech

Python 483 78 Updated Dec 28, 2023

PeterH0323 / Streamer-Sales

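Streamer-Sales (销冠): an LLM for live-stream sales hosts 🛒🎁 that, given a product's features, presents the product in a way that stimulates the user's desire to buy. 🚀⭐ Includes a detailed data generation workflow ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital human generation 🦸, an agent that queries the web for real-time information 🌐, ASR (speech-to-text) 🎙️, a front end built on the Vue ecosystem 🍍, FastAPI …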

Python 2,358 341 Updated Sep 14, 2024

Raspberry Pi

Natural language processing

a6225301

Starred repositories

Machine learning

Awesome Lists

Deep learning