Personal Blog of Zhang Zhiming (张芷铭)

📅 2026-02-26

#ai #deep-learning #machine-learning

Grounded-SAM combines Grounding DINO with SAM, and uses RAM or Tag2Text to label images automatically.

Environment setup

export AM_I_DOCKER=False
export BUILD_WITH_CUDA=True
export CUDA_HOME=/path/to/cuda-11.3/
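
Before building, it can help to confirm that CUDA_HOME really points at a CUDA toolkit root. The snippet below is a minimal sketch of such a check: `check_cuda_home` is a hypothetical helper name, not something shipped with Grounded-SAM, and it only assumes that a toolkit root contains an executable `bin/nvcc`.

```shell
# Hypothetical helper (not part of Grounded-SAM): succeed only if the given
# directory looks like a CUDA toolkit root, i.e. it has an executable bin/nvcc.
check_cuda_home() {
  cuda_home="${1%/}"   # drop one trailing slash, if present
  if [ -x "${cuda_home}/bin/nvcc" ]; then
    echo "ok: found ${cuda_home}/bin/nvcc"
  else
    echo "error: no executable nvcc under '${cuda_home}/bin'" >&2
    return 1
  fi
}

# Usage: check_cuda_home "$CUDA_HOME"
```

If the check fails, the exported path is probably wrong, and fixing it before running the install commands should be faster than debugging a failed extension build.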

Install dependencies

# Segment Anything
python -m pip install -e segment_anything

# Grounding DINO
pip install --no-build-isolation -e GroundingDINO

# RAM & Tag2Text
git clone https://github.com/xinyu1205/recognize-anything.git
pip install -e ./recognize-anything/

Download weights

wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
wget https://huggingface.co/spaces/xinyu1205/Tag2Text/resolve/main/ram_swin_large_14m.pth
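
Large downloads sometimes get interrupted, so a quick check that all three checkpoints are present and non-empty may save a confusing error later. This is only a sketch: `missing_weights` is a hypothetical helper, and it verifies file presence and size, not file integrity.

```shell
# Hypothetical helper: print every expected checkpoint that is missing or empty
# in the given directory (defaults to the current directory).
missing_weights() {
  weights_dir="${1:-.}"
  for f in sam_vit_h_4b8939.pth groundingdino_swint_ogc.pth ram_swin_large_14m.pth; do
    # -s is true only for files that exist and have a size greater than zero
    [ -s "${weights_dir}/${f}" ] || echo "$f"
  done
}

# Usage: missing_weights .    (prints nothing when all three files are in place)
```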

Automatic labeling workflow

Using RAM:

python automatic_label_ram_demo.py \
  --config GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py \
  --ram_checkpoint ram_swin_large_14m.pth \
  --grounded_checkpoint groundingdino_swint_ogc.pth \
  --sam_checkpoint sam_vit_h_4b8939.pth \
  --input_image assets/demo9.jpg \
  --output_dir "outputs" \
  --box_threshold 0.25
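
The command above labels a single image. To label a whole folder, one option is to wrap it in a loop, as in the sketch below. `label_dir` and the `RUNNER` variable are hypothetical names introduced here for illustration. Each image gets its own output subdirectory, on the assumption that the demo writes fixed file names into `--output_dir` and would otherwise overwrite earlier results.

```shell
# Hypothetical wrapper (not part of the repository): run the RAM demo once per
# .jpg file in a directory, giving every image its own output subdirectory.
# Set RUNNER=echo to print the commands instead of running them.
label_dir() {
  in_dir="${1%/}"
  out_root="${2:-outputs}"
  for img in "$in_dir"/*.jpg; do
    [ -e "$img" ] || continue            # the glob matched no files
    out_dir="$out_root/$(basename "$img" .jpg)"
    mkdir -p "$out_dir"                  # created even when RUNNER=echo
    ${RUNNER:-} python automatic_label_ram_demo.py \
      --config GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py \
      --ram_checkpoint ram_swin_large_14m.pth \
      --grounded_checkpoint groundingdino_swint_ogc.pth \
      --sam_checkpoint sam_vit_h_4b8939.pth \
      --input_image "$img" \
      --output_dir "$out_dir" \
      --box_threshold 0.25
  done
}

# Usage: label_dir assets outputs
```

Running it once with `RUNNER=echo` first is a cheap way to review the generated commands before spending GPU time on them.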

Using Tag2Text:

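# Same arguments as the RAM command; only the script and the tagging checkpoint differ.
# tag2text_swin_14m.pth is not among the weights downloaded above.
python automatic_label_tag2text_demo.py \
  --config GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py \
  --tag2text_checkpoint tag2text_swin_14m.pth --grounded_checkpoint groundingdino_swint_ogc.pth \
  --sam_checkpoint sam_vit_h_4b8939.pth --input_image assets/demo9.jpg --output_dir "outputs" --box_threshold 0.25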

How the pipeline works

RAM/Tag2Text generates tags for the image
Grounding DINO localizes the bounding boxes of the tagged objects
SAM generates the segmentation masks
