【Image Classification】A Hands-On Guide to EfficientNet (PyTorch)

Table of Contents

Summary

New Project

Import the required libraries

Set global parameters

Image preprocessing

Read data

Set up the model

Set up training and validation

Test

Complete code


Summary

EfficientNet is a classification model proposed by Google in 2019. Since it was proposed, it has appeared frequently on major competition platforms and has become a go-to model for top-ranking solutions. The figure below shows the network structure of the EfficientNet-B0 model.

As the diagram shows, the network is built from MBConv blocks, whose structure is as follows:

Here k denotes the kernel size of the depthwise convolution. The block first applies a 1×1 convolution that expands the number of channels by 4×, then a depthwise 3×3 convolution, then an SE (squeeze-and-excitation) module, and then a 1×1 convolution that projects the channels back to the input size. Finally, the result is added to the block's input through a residual connection.
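As a reference only, below is a minimal, simplified MBConv sketch in PyTorch. The class names (MBConv, SqueezeExcite) and the default expand_ratio=4 are illustrative and simply follow the description above; the official EfficientNet implementation additionally uses drop-connect, omits the expansion convolution when the expand ratio is 1, and handles padding and SE sizing slightly differently.

import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Simplified SE module: global pooling -> two 1x1 convs -> channel-wise scaling."""
    def __init__(self, channels, reduced):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(channels, reduced, 1), nn.SiLU(),
            nn.Conv2d(reduced, channels, 1), nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(self.pool(x))

class MBConv(nn.Module):
    """Minimal MBConv sketch: 1x1 expand -> depthwise kxk -> SE -> 1x1 project -> residual."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, expand_ratio=4):
        super().__init__()
        mid = in_ch * expand_ratio
        self.use_residual = (stride == 1 and in_ch == out_ch)
        self.expand = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1, bias=False), nn.BatchNorm2d(mid), nn.SiLU())
        self.depthwise = nn.Sequential(
            nn.Conv2d(mid, mid, kernel_size, stride, kernel_size // 2, groups=mid, bias=False),
            nn.BatchNorm2d(mid), nn.SiLU())
        self.se = SqueezeExcite(mid, max(1, in_ch // 4))
        self.project = nn.Sequential(
            nn.Conv2d(mid, out_ch, 1, bias=False), nn.BatchNorm2d(out_ch))

    def forward(self, x):
        out = self.project(self.se(self.depthwise(self.expand(x))))
        return x + out if self.use_residual else out

For example, MBConv(32, 32) maps an (N, 32, H, W) tensor to the same shape and uses the residual connection, while MBConv(32, 64, stride=2) downsamples and skips the residual.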

This article only briefly introduces the EfficientNet architecture; the focus is on hands-on practice. Next, I will show how to use EfficientNet to implement cat-vs-dog classification. Since the loss function used here is CrossEntropyLoss, the same code supports multi-class classification simply by changing the number of categories.

New Project

Create a new image classification project, put the dataset in the data folder, and implement a custom data-loading class in the dataset folder. This time I will not use the default dataset-loading approach, which would be too simple to be instructive. Then create train.py and test.py.

Create train.py in the root directory of the project, and then write the training code in it.

Import the required libraries

First check whether the efficientnet_pytorch library is installed. If it is not, run pip install efficientnet_pytorch to install it, then import it along with the other required libraries.

import torch.optim as optim
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
from dataset.dataset import DogCat
from torch.autograd import Variable
from efficientnet_pytorch import EfficientNet  # pip install efficientnet_pytorch


Set global parameters

Set the batch size, learning rate, and number of epochs, and check whether a CUDA device is available; if not, fall back to the CPU.

# Set global parameters
modellr = 1e-4
BATCH_SIZE = 64
EPOCHS = 20
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


Image preprocessing


When doing image preprocessing, define the transforms for the training set and the validation set separately. Besides resizing and normalization, the training transform can also include augmentations such as rotation and random erasing, while the validation set needs no augmentation. Also, do not apply augmentation blindly: unreasonable augmentation can easily hurt the model and may even prevent the loss from converging. (A sketch of a training transform with augmentation is shown after the code below.)

# Image preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
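For illustration, here is one possible training transform with light augmentation. These particular operations and parameters are only an example, not a recommendation, and should be tuned for your own data; note that RandomErasing operates on tensors, so it must come after ToTensor.

transform_aug = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),  # random horizontal flip
    transforms.RandomRotation(10),      # random rotation within ±10 degrees
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
    transforms.RandomErasing(p=0.25),   # random erasing applied to the tensor
])

If you use it, pass transform_aug instead of transform when constructing the training dataset, and keep transform_test unchanged for validation.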


Read data

Dataset address: https://pan.baidu.com/s/1ZM8vDWEzgscJMnBrZfvQGw (extraction code: 48c3)
Download it, unzip it, and place it in the data folder. The data directory is as follows:
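A sketch of the expected layout (the file names are inferred from how dataset.py parses them below; the exact names depend on the downloaded archive):

data/
└── train/
    ├── cat.0.jpg
    ├── cat.1.jpg
    ├── ...
    ├── dog.0.jpg
    ├── dog.1.jpg
    └── ...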

Then create __init__.py and dataset.py under the dataset folder, and write the following code in dataset.py:

# coding:utf8
import os
from PIL import Image
from torch.utils import data
from torchvision import transforms as T
from sklearn.model_selection import train_test_split


class DogCat(data.Dataset):

    def __init__(self, root, transforms=None, train=True, test=False):
        """
        Main goal: collect the paths of all images and split them into
        training, validation, and test sets.
        """
        self.test = test
        self.transforms = transforms
        imgs = [os.path.join(root, img) for img in os.listdir(root)]
        if self.test:
            imgs = sorted(imgs, key=lambda x: int(x.split('.')[-2].split('/')[-1]))
        else:
            imgs = sorted(imgs, key=lambda x: int(x.split('.')[-2]))

        if self.test:
            self.imgs = imgs
        else:
            trainval_files, val_files = train_test_split(imgs, test_size=0.3, random_state=42)
            if train:
                self.imgs = trainval_files
            else:
                self.imgs = val_files

    def __getitem__(self, index):
        """
        Return the data of one image at a time.
        """
        img_path = self.imgs[index]
        if self.test:
            label = -1
        else:
            label = 1 if 'dog' in img_path.split('/')[-1] else 0
        data = Image.open(img_path)
        data = self.transforms(data)
        return data, label

    def __len__(self):
        return len(self.imgs)

Then call DogCat in train.py to read the data:

dataset_train = DogCat('data/train', transforms=transform, train=True)
dataset_test = DogCat("data/train", transforms=transform_test, train=False)
# Read the data
print(dataset_train.imgs)

# Load the data
train_loader = torch.utils.data.DataLoader(dataset_train, batch_size=BATCH_SIZE, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset_test, batch_size=BATCH_SIZE, shuffle=False)


Set up the model

Use CrossEntropyLoss as the loss function and efficientnet-b3 as the model. Replace the final fully connected layer so that the number of output categories is 2, then move the model to DEVICE. The optimizer is Adam.

# Instantiate the model and move it to the GPU
criterion = nn.CrossEntropyLoss()
model_ft = EfficientNet.from_pretrained('efficientnet-b3')
num_ftrs = model_ft._fc.in_features
model_ft._fc = nn.Linear(num_ftrs, 2)
model_ft.to(DEVICE)
# Use the simple and effective Adam optimizer with a low learning rate
optimizer = optim.Adam(model_ft.parameters(), lr=modellr)


def adjust_learning_rate(optimizer, epoch):
    """Decay the learning rate by a factor of 10 every 50 epochs."""
    modellrnew = modellr * (0.1 ** (epoch // 50))
    print("lr:", modellrnew)
    for param_group in optimizer.param_groups:
        param_group['lr'] = modellrnew
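As a side note, the same step decay could also be expressed with PyTorch's built-in StepLR scheduler instead of the manual adjust_learning_rate helper above; a minimal sketch:

# Multiply the learning rate by 0.1 every 50 epochs
# (roughly equivalent to adjust_learning_rate above)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.1)
# In the training loop, call scheduler.step() once per epoch
# instead of calling adjust_learning_rate(optimizer, epoch).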

Set up training and validation

# Define the training process
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    sum_loss = 0
    total_num = len(train_loader.dataset)
    print(total_num, len(train_loader))
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = Variable(data).to(device), Variable(target).to(device)
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print_loss = loss.data.item()
        sum_loss += print_loss
        if (batch_idx + 1) % 50 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, (batch_idx + 1) * len(data), len(train_loader.dataset),
                100. * (batch_idx + 1) / len(train_loader), loss.item()))
    ave_loss = sum_loss / len(train_loader)
    print('epoch:{},loss:{}'.format(epoch, ave_loss))


# Validation process
def val(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    total_num = len(test_loader.dataset)
    print(total_num, len(test_loader))
    with torch.no_grad():
        for data, target in test_loader:
            data, target = Variable(data).to(device), Variable(target).to(device)
            output = model(data)
            loss = criterion(output, target)
            _, pred = torch.max(output.data, 1)
            correct += torch.sum(pred == target)
            print_loss = loss.data.item()
            test_loss += print_loss
        correct = correct.data.item()
        acc = correct / total_num
        avgloss = test_loss / len(test_loader)
        print('\nVal set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            avgloss, correct, len(test_loader.dataset), 100 * acc))


# Training loop
for epoch in range(1, EPOCHS + 1):
    adjust_learning_rate(optimizer, epoch)
    train(model_ft, DEVICE, train_loader, optimizer, epoch)
    val(model_ft, DEVICE, test_loader)
torch.save(model_ft, 'model.pth')


After completing the above code, you can click Run to start training, as shown below:

Since we use a pretrained model, convergence is very fast.

Test

I will introduce two commonly used testing methods. The first is general: load the test images manually and run the predictions yourself. The steps are as follows.

The directory where the test set is stored is as follows:
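A sketch of the assumed test layout (the second testing method below sorts files by the integer in the file name, so numeric names such as 1.jpg, 2.jpg, ... are assumed):

data/
└── test/
    ├── 1.jpg
    ├── 2.jpg
    └── ...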

The first step is to define the categories. Their order must match the order used during training; do not change it! During training, cat is category 0 and dog is category 1, so I define classes as ('cat', 'dog').

The second step is to define the transforms, which are the same as for the validation set; do not apply augmentation.

The third step is to load the model and move it to DEVICE.

The fourth step is to read each image and predict its category. Note that the images are read with PIL's Image; do not use cv2, as the transforms do not support it.

import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
from PIL import Image
from torch.autograd import Variable
import os

classes = ('cat', 'dog')
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model.pth")
model.eval()
model.to(DEVICE)

path = 'data/test/'
testList = os.listdir(path)
for file in testList:
    img = Image.open(path + file)
    img = transform_test(img)
    img.unsqueeze_(0)
    img = Variable(img).to(DEVICE)
    out = model(img)
    # Predict
    _, pred = torch.max(out.data, 1)
    print('Image Name:{},predict:{}'.format(file, classes[pred.data.item()]))

The result:

The second method uses the dataset.py we just defined to load the test set. The code is shown below:

import torch.utils.data.distributed
import torchvision.transforms as transforms
from dataset.dataset import DogCat
from torch.autograd import Variable

classes = ('cat', 'dog')
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model.pth")
model.eval()
model.to(DEVICE)

dataset_test = DogCat('data/test/', transform_test, test=True)
print(len(dataset_test))
# Predict the label for each file in the folder
for index in range(len(dataset_test)):
    item = dataset_test[index]
    img, label = item
    img.unsqueeze_(0)
    data = Variable(img).to(DEVICE)
    output = model(data)
    _, pred = torch.max(output.data, 1)
    print('Image Name:{},predict:{}'.format(dataset_test.imgs[index], classes[pred.data.item()]))

Complete code

train.py

import torch.optim as optim
import torch
import torch.nn as nn
import torch.nn.parallel
import torch.optim
import torch.utils.data
import torch.utils.data.distributed
import torchvision.transforms as transforms
from dataset.dataset import DogCat
from torch.autograd import Variable
from efficientnet_pytorch import EfficientNet  # pip install efficientnet_pytorch

# Set global parameters
modellr = 1e-4
BATCH_SIZE = 32
EPOCHS = 10
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Image preprocessing
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])
dataset_train = DogCat('data/train', transforms=transform, train=True)
dataset_test = DogCat("data/train", transforms=transform_test, train=False)
# Read the data
print(dataset_train.imgs)

# Load the data
train_loader = torch.utils.data.DataLoader(dataset_train, batch_size=BATCH_SIZE, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset_test, batch_size=BATCH_SIZE, shuffle=False)

# Instantiate the model and move it to the GPU
criterion = nn.CrossEntropyLoss()
model_ft = EfficientNet.from_pretrained('efficientnet-b3')
num_ftrs = model_ft._fc.in_features
model_ft._fc = nn.Linear(num_ftrs, 2)
model_ft.to(DEVICE)
# Use the simple and effective Adam optimizer with a low learning rate
optimizer = optim.Adam(model_ft.parameters(), lr=modellr)


def adjust_learning_rate(optimizer, epoch):
    """Decay the learning rate by a factor of 10 every 50 epochs."""
    modellrnew = modellr * (0.1 ** (epoch // 50))
    print("lr:", modellrnew)
    for param_group in optimizer.param_groups:
        param_group['lr'] = modellrnew


# Define the training process
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    sum_loss = 0
    total_num = len(train_loader.dataset)
    print(total_num, len(train_loader))
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = Variable(data).to(device), Variable(target).to(device)
        output = model(data)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        print_loss = loss.data.item()
        sum_loss += print_loss
        if (batch_idx + 1) % 50 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, (batch_idx + 1) * len(data), len(train_loader.dataset),
                100. * (batch_idx + 1) / len(train_loader), loss.item()))
    ave_loss = sum_loss / len(train_loader)
    print('epoch:{},loss:{}'.format(epoch, ave_loss))


# Validation process
def val(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    total_num = len(test_loader.dataset)
    print(total_num, len(test_loader))
    with torch.no_grad():
        for data, target in test_loader:
            data, target = Variable(data).to(device), Variable(target).to(device)
            output = model(data)
            loss = criterion(output, target)
            _, pred = torch.max(output.data, 1)
            correct += torch.sum(pred == target)
            print_loss = loss.data.item()
            test_loss += print_loss
        correct = correct.data.item()
        acc = correct / total_num
        avgloss = test_loss / len(test_loader)
        print('\nVal set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
            avgloss, correct, len(test_loader.dataset), 100 * acc))


# Training loop
for epoch in range(1, EPOCHS + 1):
    adjust_learning_rate(optimizer, epoch)
    train(model_ft, DEVICE, train_loader, optimizer, epoch)
    val(model_ft, DEVICE, test_loader)
torch.save(model_ft, 'model.pth')

test1.py

import torch.utils.data.distributed
import torchvision.transforms as transforms
import torchvision.datasets as datasets
from PIL import Image
from torch.autograd import Variable
import os

classes = ('cat', 'dog')
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model.pth")
model.eval()
model.to(DEVICE)

path = 'data/test/'
testList = os.listdir(path)
for file in testList:
    img = Image.open(path + file)
    img = transform_test(img)
    img.unsqueeze_(0)
    img = Variable(img).to(DEVICE)
    out = model(img)
    # Predict
    _, pred = torch.max(out.data, 1)
    print('Image Name:{},predict:{}'.format(file, classes[pred.data.item()]))

test2.py

import torch.utils.data.distributed
import torchvision.transforms as transforms
from dataset.dataset import DogCat
from torch.autograd import Variable

classes = ('cat', 'dog')
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])
])

DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = torch.load("model.pth")
model.eval()
model.to(DEVICE)

dataset_test = DogCat('data/test/', transform_test, test=True)
print(len(dataset_test))
# Predict the label for each file in the folder
for index in range(len(dataset_test)):
    item = dataset_test[index]
    img, label = item
    img.unsqueeze_(0)
    data = Variable(img).to(DEVICE)
    output = model(data)
    _, pred = torch.max(output.data, 1)
    print('Image Name:{},predict:{}'.format(dataset_test.imgs[index], classes[pred.data.item()]))
