强制停止跟踪 all_i_need 目录
This commit is contained in:
116
readme.md
116
readme.md
@@ -1,116 +0,0 @@
|
||||
# B站关注清理工具 - Scripts 版
|
||||
|
||||
> 一键命令运行全流程:`python source/scripts/run_pipeline.py`
|
||||
|
||||
python source/scripts/run_pipeline.py --input-json source/resources/export_uids_test5.json
|
||||
|
||||
本工具包含7个步骤的完整流水线:
|
||||
|
||||
1. 抓取视频标题
|
||||
2. 分批AI分析
|
||||
3. 生成保留关注报告
|
||||
4. 生成取关UID列表
|
||||
5. 按首字母排序
|
||||
6. 提取分组信息
|
||||
7. 删除最近10条标题
|
||||
|
||||
## 快速开始
|
||||
|
||||
```powershell
|
||||
# 完整流程(推荐)
|
||||
python source/scripts/run_pipeline.py
|
||||
|
||||
# 速度优先
|
||||
python source/scripts/run_pipeline.py --workers 8 --batch-size 30 --sleep-seconds 0
|
||||
|
||||
# 试跑30个UP
|
||||
python source/scripts/run_pipeline.py --max-ups 30
|
||||
|
||||
# 跳过抓取,使用已有标题报告
|
||||
python source/scripts/run_pipeline.py --skip-fetch
|
||||
|
||||
# 跳过分析,仅生成产物
|
||||
python source/scripts/run_pipeline.py --skip-analyze
|
||||
|
||||
# 跳过排序/分组/删除
|
||||
python source/scripts/run_pipeline.py --skip-sort --skip-group --skip-remove
|
||||
```
|
||||
|
||||
## 输出文件
|
||||
|
||||
| 文件 | 说明 |
|
||||
|------|------|
|
||||
| `source/output/reports/1_up_titles_report.md` | 标题抓取报告 |
|
||||
| `source/output/reports/2_up_analysis_full_auto.md` | AI分析报告(完整) |
|
||||
| `source/output/reports/3_up_keep_follow_only.md` | 保留关注报告 |
|
||||
| `source/output/uids/4_unfollow_mids_list.txt` | 取关UID列表 |
|
||||
| `source/output/reports/5_sorted_up_analysis.md` | 按首字母排序报告 |
|
||||
| `source/output/reports/6_group_info.md` | 提取分组信息报告 |
|
||||
| `source/output/reports/7_no_titles.md` | 最终报告(删除最近10条) |
|
||||
|
||||
## 常用参数
|
||||
|
||||
| 参数 | 默认值 | 说明 |
|
||||
|------|--------|------|
|
||||
| `--workers` | 6 | 并发请求数 |
|
||||
| `--batch-size` | 20 | 每批分析条数 |
|
||||
| `--max-ups` | 0(全部) | 限制处理UP数量 |
|
||||
| `--split-size` | 100 | UID拆分大小 |
|
||||
| `--sleep-seconds` | 0 | 任务间隔秒数 |
|
||||
|
||||
### 跳过参数
|
||||
|
||||
| 参数 | 说明 |
|
||||
|------|------|
|
||||
| `--skip-fetch` | 跳过抓取阶段 |
|
||||
| `--skip-analyze` | 跳过分析阶段 |
|
||||
| `--skip-sort` | 跳过排序阶段 |
|
||||
| `--skip-group` | 跳过提取分组阶段 |
|
||||
| `--skip-remove` | 跳过删除最近10条阶段 |
|
||||
|
||||
## 分步执行
|
||||
|
||||
### 步骤1:抓取标题
|
||||
```powershell
|
||||
python source/scripts/analyze_up_content.py --skip-ai
|
||||
```
|
||||
|
||||
### 步骤2:分批AI分析
|
||||
```powershell
|
||||
python source/scripts/batch_ai_summary_from_report.py --run-all-batches
|
||||
```
|
||||
|
||||
### 步骤3:生成保留关注报告
|
||||
```powershell
|
||||
python source/scripts/extract_keep_follow_doc.py
|
||||
```
|
||||
|
||||
### 步骤4:生成取关UID
|
||||
```powershell
|
||||
python source/scripts/extract_unfollow_list.py --format mid-only --split-size 100
|
||||
```
|
||||
|
||||
### 步骤5:按首字母排序
|
||||
```powershell
|
||||
python source/scripts/sort_up_main.py
|
||||
```
|
||||
|
||||
### 步骤6:提取分组信息
|
||||
```powershell
|
||||
python source/scripts/extract_group_info.py
|
||||
```
|
||||
|
||||
### 步骤7:删除最近10条标题
|
||||
```powershell
|
||||
python source/scripts/remove_10content.py
|
||||
```
|
||||
|
||||
## 先配置API
|
||||
|
||||
编辑 [source/scripts/analyze_up_content.py](source/scripts/analyze_up_content.py) 顶部配置:
|
||||
|
||||
```python
|
||||
VOLCENGINE_API_KEY = "你的火山引擎API Key"
|
||||
VOLCENGINE_MODEL = "deepseek-v3-1-terminus"
|
||||
VOLCENGINE_BASE_URL = "https://ark.cn-beijing.volces.com/api/v3"
|
||||
```
|
||||
Reference in New Issue
Block a user