Grounded-SAM combines Grounding DINO and SAM to perform text-guided automatic segmentation; paired with RAM/Tag2Text, it can also generate the labels fully automatically.
Environment setup
export AM_I_DOCKER=False
export BUILD_WITH_CUDA=True
export CUDA_HOME=/path/to/cuda-11.3/
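Before building the CUDA extensions, it can help to confirm that the exported variables look sane. The following is a minimal sketch, not part of Grounded-SAM: the function name `check_build_env` is hypothetical, and the check that `nvcc` lives under `$CUDA_HOME/bin` assumes a standard CUDA toolkit layout.

```python
import os
from pathlib import Path


def check_build_env(env=None):
    """Return a list of human-readable problems with the build environment.

    Hypothetical helper: it only inspects the variables exported above.
    """
    env = os.environ if env is None else env
    problems = []

    if env.get("BUILD_WITH_CUDA") != "True":
        problems.append("BUILD_WITH_CUDA is not set to True")

    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
    elif not (Path(cuda_home) / "bin" / "nvcc").exists():
        # Assumption: a standard toolkit install ships nvcc in <CUDA_HOME>/bin.
        problems.append(f"no nvcc found under {Path(cuda_home) / 'bin'}")

    return problems


if __name__ == "__main__":
    for problem in check_build_env():
        print("warning:", problem)
```

An empty list means the three exports are at least consistent with each other; it does not guarantee that the toolkit version matches the installed PyTorch build.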
# Install the core components (run from the root of the Grounded-SAM repository)
python -m pip install -e segment_anything
pip install --no-build-isolation -e GroundingDINO
pip install --upgrade "diffusers[torch]"
# Install RAM & Tag2Text
git clone https://github.com/xinyu1205/recognize-anything.git
pip install -r ./recognize-anything/requirements.txt
pip install -e ./recognize-anything/
# Optional dependencies
pip install opencv-python pycocotools matplotlib onnxruntime onnx ipykernel
Automatic labeling pipeline
Download the pretrained weights:
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
wget https://huggingface.co/spaces/xinyu1205/Tag2Text/resolve/main/ram_swin_large_14m.pth
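An interrupted download can leave a missing or empty checkpoint file, which typically only surfaces later as a model-loading error. A minimal sketch of a pre-flight check follows; the helper name `missing_checkpoints` is hypothetical, and it assumes the three files were saved under the names used in the `wget` commands above.

```python
from pathlib import Path

# File names exactly as downloaded by the wget commands above.
CHECKPOINTS = (
    "sam_vit_h_4b8939.pth",
    "groundingdino_swint_ogc.pth",
    "ram_swin_large_14m.pth",
)


def missing_checkpoints(directory="."):
    """Return the checkpoint names that are absent or empty in `directory`."""
    root = Path(directory)
    missing = []
    for name in CHECKPOINTS:
        path = root / name
        # A zero-byte file usually indicates an interrupted download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_checkpoints():
        print("missing or empty:", name)
```

This only checks presence and size; it does not verify file integrity.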
Run automatic labeling with RAM:
python automatic_label_ram_demo.py \
  --config GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py \
  --ram_checkpoint ram_swin_large_14m.pth \
  --grounded_checkpoint groundingdino_swint_ogc.pth \
  --sam_checkpoint sam_vit_h_4b8939.pth \
  --input_image assets/demo9.jpg \
  --output_dir "outputs" \
  --box_threshold 0.25 \
  --text_threshold 0.2 \
  --iou_threshold 0.5 \
  --device "cuda"
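The command above labels a single image. To label a whole folder, one option is to invoke the script once per image. The sketch below is an assumption-laden wrapper, not something the repository provides: `build_command` and `label_directory` are hypothetical names, the flags are copied verbatim from the command above, and writing each image's results to its own subdirectory is a precaution in case the script reuses output file names.

```python
import subprocess
import sys
from pathlib import Path


def build_command(image, output_dir):
    """Build the argv list for one run, using the flags shown above."""
    return [
        sys.executable, "automatic_label_ram_demo.py",
        "--config", "GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py",
        "--ram_checkpoint", "ram_swin_large_14m.pth",
        "--grounded_checkpoint", "groundingdino_swint_ogc.pth",
        "--sam_checkpoint", "sam_vit_h_4b8939.pth",
        "--input_image", str(image),
        "--output_dir", str(output_dir),
        "--box_threshold", "0.25",
        "--text_threshold", "0.2",
        "--iou_threshold", "0.5",
        "--device", "cuda",
    ]


def label_directory(image_dir, output_root="outputs", dry_run=False):
    """Run the demo once per .jpg in `image_dir`; return the commands used."""
    commands = []
    for image in sorted(Path(image_dir).glob("*.jpg")):
        # One output subdirectory per image, named after the file stem.
        cmd = build_command(image, Path(output_root) / image.stem)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises if the script fails
    return commands
```

Calling `label_directory("assets", dry_run=True)` returns the commands without running them. Because every invocation reloads all three models, this approach is simple but slow for large folders.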
Workflow
- RAM/Tag2Text generates tags for the image
- Grounding DINO detects bounding boxes from those tags
- SAM generates a segmentation mask for each bounding box
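The three stages above can be sketched as a simple chain of function calls. This is an illustration of the data flow only, not the code in `automatic_label_ram_demo.py`: `auto_label` and the three callables are hypothetical stand-ins for the real models, and joining the tags with commas into one text prompt is an assumption about how the tags are handed to the detector.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def auto_label(
    image: Any,
    tagger: Callable[[Any], Sequence[str]],
    detector: Callable[[Any, str], Sequence[Tuple[Box, str]]],
    segmenter: Callable[[Any, Box], Any],
) -> List[Dict[str, Any]]:
    """Chain tagging, box detection, and mask generation for one image."""
    # Stage 1: the tagger (RAM/Tag2Text in the real pipeline) proposes tags.
    tags = tagger(image)
    if not tags:
        return []

    # Stage 2: the detector (Grounding DINO) turns the tags into boxes.
    # Assumption: tags are joined into a single comma-separated prompt.
    prompt = ", ".join(tags)
    detections = detector(image, prompt)

    # Stage 3: the segmenter (SAM) produces one mask per detected box.
    results = []
    for box, phrase in detections:
        results.append({"label": phrase, "box": box, "mask": segmenter(image, box)})
    return results


# Stand-in models so the sketch runs without any weights.
def fake_tagger(image):
    return ["dog", "ball"]


def fake_detector(image, prompt):
    return [((0.0, 0.0, 10.0, 10.0), "dog"), ((5.0, 5.0, 8.0, 8.0), "ball")]


def fake_segmenter(image, box):
    return f"mask for {box}"


results = auto_label("image placeholder", fake_tagger, fake_detector, fake_segmenter)
```

Keeping each stage behind a plain callable makes it easy to swap Tag2Text for RAM, or to skip the tagger and pass hand-written labels to the detector.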