Skip to main content Link Menu Expand (external link) Copy Copied

花朵分类:从 PaddleLabel 到 PaddleClas

PaddleLabel 标注数据+PaddleClas 训练预测=快速完成一次花朵分类的任务


1. 数据准备

  • 首先使用PaddleLabel对自制的花朵数据集进行标注,其次使用Split Dataset功能分割数据集,最后导出数据集
  • PaddleLabel导出后的内容全部放到自己的建立的文件夹下,例如flower_clas_dataset,其目录结构如下:
├── flower_clas_dataset
│   ├── image
│   │   ├── flower1.jpg
│   │   ├── flower2.jpg
│   │   ├── ...
│   ├── labels.txt
│   ├── test_list.txt
│   ├── train_list.txt
│   ├── val_list.txt

2. 训练

2.1 安装必备的库

2.1.1 安装 paddlepaddle

# 您的机器安装的是 CUDA9 或 CUDA10,请运行以下命令安装
# pip install paddlepaddle-gpu -i https://mirror.baidu.com/pypi/simple
# 您的机器是CPU,请运行以下命令安装
pip install paddlepaddle

2.1.2 安装 paddleclas 以及依赖项

git clone https://gitee.com/paddlepaddle/PaddleClas.git -b release/2.2
cd PaddleClas
pip install -r requirements.txt
python setup.py install

2.2 准备自制的花朵分类数据集

cd ./PaddleClas/dataset/
mkdir flower_clas_dataset
cd ../../
cp -r ./flower_clas_dataset/* ./PaddleClas/dataset/flower_clas_dataset

2.3 修改配置文件

PaddleClas/ppcls/configs/quick_start/new_user/ShuffleNetV2_x0_25.yaml

# global configs
Global:
  checkpoints: null
  pretrained_model: null
  output_dir: ./output/
  device: cpu
  save_interval: 20
  eval_during_train: True
  eval_interval: 10
  epochs: 100
  print_batch_step: 10
  use_visualdl: True
  # used for static mode and model export
  image_shape: [3, 224, 224]
  save_inference_dir: ./inference

# model architecture
Arch:
  name: ShuffleNetV2_x0_25
  class_num: 3

# loss function config for traing/eval process
Loss:
  Train:
    - CELoss:
        weight: 1.0
  Eval:
    - CELoss:
        weight: 1.0


Optimizer:
  name: Momentum
  momentum: 0.9
  lr:
    name: Cosine
    learning_rate: 0.0125
    warmup_epoch: 5
  regularizer:
    name: 'L2'
    coeff: 0.00001


# data loader for train and eval
DataLoader:
  Train:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/
      cls_label_path: ./dataset/train_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - RandCropImage:
            size: 224
        - RandFlipImage:
            flip_code: 1
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''

    sampler:
      name: DistributedBatchSampler
      batch_size: 16
      drop_last: False
      shuffle: True
    loader:
      num_workers: 0
      use_shared_memory: True

  Eval:
    dataset:
      name: ImageNetDataset
      image_root: ./dataset/
      cls_label_path: ./dataset/val_list.txt
      transform_ops:
        - DecodeImage:
            to_rgb: True
            channel_first: False
        - ResizeImage:
            resize_short: 256
        - CropImage:
            size: 224
        - NormalizeImage:
            scale: 1.0/255.0
            mean: [0.485, 0.456, 0.406]
            std: [0.229, 0.224, 0.225]
            order: ''
    sampler:
      name: DistributedBatchSampler
      batch_size: 32
      drop_last: False
      shuffle: False
    loader:
      num_workers: 0
      use_shared_memory: True

Infer:
  infer_imgs: dataset/predict_demo.jpg
  batch_size: 10
  transforms:
    - DecodeImage:
        to_rgb: True
        channel_first: False
    - ResizeImage:
        resize_short: 256
    - CropImage:
        size: 224
    - NormalizeImage:
        scale: 1.0/255.0
        mean: [0.485, 0.456, 0.406]
        std: [0.229, 0.224, 0.225]
        order: ''
    - ToCHWImage:
  PostProcess:
    name: Topk
    topk: 3

Metric:
  Train:
    - TopkAcc:
        topk: [1, 3]
  Eval:
    - TopkAcc:
        topk: [1, 3]

2.4 添加类别映射文件

PaddleClas/ppcls/configs/quick_start/new_user/label.txt

sunflower
rose
dandelion

2.5 开始训练

export CUDA_VISIBLE_DEVICES=0
# 开始训练
python PaddleClas/tools/train.py -c ./PaddleClas/ppcls/configs/quick_start/new_user/ShuffleNetV2_x0_25.yaml

3. 模型评估

3.1 评估

python PaddleClas/tools/eval.py -c ./PaddleClas/ppcls/configs/quick_start/new_user/ShuffleNetV2_x0_25.yaml

3.2 预测

python3 PaddleClas/tools/infer.py \
    -c ./PaddleClas/ppcls/configs/quick_start/new_user/ShuffleNetV2_x0_25.yaml \
    -o Infer.infer_imgs=dataset/predict_demo.jpg \
    -o Global.pretrained_model=output/ShuffleNetV2_x0_25/latest

预测的样例图片是:

预测的结果是:

{‘class_ids’: [0, 1, 2], ‘scores’: [0.89812, 0.09476, 0.00712], ‘file_name’: ‘dataset/predict_demo.jpg’, ‘label_names’: []} 也就是说 0 的概率最大,为 0.89812,0 对应的结果是向日葵,也就是说结果是向日葵,预测无误。

AI Studio 第三方教程推荐

快速体验演示案例