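Image classification task: identifying bee species

Question: can a machine identify a bee as a honey bee or a bumble bee?

These bees differ in behavior and appearance, but given the variety of backgrounds, positions, and image resolutions, telling them apart can be a challenge for a machine.

Being able to identify bee species from images is a task that will ultimately let researchers collect field data more quickly and effectively. Pollinating bees play a critical role in both ecology and agriculture, and diseases such as colony collapse disorder threaten these species. Identifying different species of bees in the wild means we can better understand the prevalence and growth of these important insects.

This article walks through loading and processing images, then building a model that automatically distinguishes honey bees from bumble bees.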

# Used to change filepaths

from pathlib import Path

# We set up matplotlib, pandas, and the display function

%matplotlib inline

import matplotlib.pyplot as plt

from IPython.display import display

import pandas as pd

# import numpy to use in this cell

import numpy as np

# import Image from PIL so we can use it later

from PIL import Image

# generate test_data

a=1

b=1

test_data =np.random.beta(a, b, size = (100,100,3))

# display the test_data

plt.imshow(test_data)

plt.show()

Now we'll load an image, display it in the notebook, and print its dimensions.

# open the image

img = Image.open('datasets/bee_1.jpg')

# Get the image size

img_size = img.size

print("The image size is: {}".format(img_size))

img

The image size is: (100, 100)

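Pillow has many common image manipulation tasks built in. The operations it provides include:

resizing

cropping

rotating

flipping

converting to grayscale (or other color modes)

Often, these operations are part of a pipeline that turns a small number of images into many more images to create training data for machine learning algorithms. This technique is called data augmentation, and it is common in image classification.

We'll try some of these operations and look at the results.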

# Crop the image to 25, 25, 75, 75

img_cropped = img.crop((25,25,75,75))

display(img_cropped)

# rotate the image by 45 degrees

img_rotated = img.rotate(45)

display(img_rotated)

# flip the image left to right

img_flipped = img.transpose(Image.FLIP_LEFT_RIGHT)

display(img_flipped)

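Most image formats have three color "channels": red, green, and blue (some images also have a fourth channel called "alpha" that controls transparency). For each pixel in an image, there is a value for every channel.

The data is represented as a three-dimensional matrix. The width of the matrix is the width of the image, the height of the matrix is the height of the image, and the depth of the matrix is the number of channels. As we saw, our image is 100 pixels high and 100 pixels wide, so the underlying data is a matrix with dimensions 100x100x3.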
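As a small aside on axis order, here is a minimal sketch that uses a made-up, non-square blank image rather than one of the bee photos. It suggests how Pillow's size relates to NumPy's shape: Pillow reports (width, height), while the array is laid out as (height, width, channels). The names demo_img and demo_arr are illustrative only.

```python
import numpy as np
from PIL import Image

# hypothetical 120x80 blank RGB image (width=120, height=80), not a bee image
demo_img = Image.new("RGB", (120, 80))

# Pillow reports size as (width, height)
print(demo_img.size)

# NumPy stores rows first, so the shape is (height, width, channels)
demo_arr = np.array(demo_img)
print(demo_arr.shape)
```

For the square 100x100 bee images the distinction is invisible, but it matters as soon as width and height differ.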

# Turn our image object into a NumPy array

img_data = np.array(img)

# get the shape of the resulting array

img_data_shape = img_data.shape

print("Our NumPy array has the shape: {}".format(img_data_shape))

# plot the data with `imshow`

plt.imshow(img_data)

plt.show()

# plot the red channel

plt.imshow(img_data[:, :, 0], cmap=plt.cm.Reds_r)

plt.show()

# plot the green channel

plt.imshow(img_data[:, :, 1], cmap=plt.cm.Greens_r)

plt.show()

# plot the blue channel

plt.imshow(img_data[:, :, 2], cmap=plt.cm.Blues_r)

plt.show()

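Color channels can tell us more about an image. A picture of the ocean will be more blue, whereas a picture of a field will be more green. This kind of information can be useful when building models or examining the differences between images.

We'll plot the kernel density estimate for each color channel on the same plot so that we can see how they differ.

In this plot, a shape further to the right means more of that color, whereas a shape further to the left means less of that color.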
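A rough numeric companion to the density plot is the mean value of each channel. The sketch below uses a tiny made-up array (not a bee image) in which blue dominates; a channel with a larger mean is one whose density would sit further to the right. The names demo and channel_means are illustrative assumptions.

```python
import numpy as np

# hypothetical 4x4 RGB image that is mostly blue (illustrative data only)
demo = np.zeros((4, 4, 3), dtype=np.uint8)
demo[:, :, 0] = 30    # a little red
demo[:, :, 2] = 200   # strong blue everywhere

# flatten the pixels to rows of (r, g, b) and average each column
channel_means = demo.reshape(-1, 3).mean(axis=0)
for name, mean in zip(['r', 'g', 'b'], channel_means):
    print(name, mean)
```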

def plot_kde(channel, color):
    """ Plots a kernel density estimate for the given data.

    `channel` must be a 2d array
    `color` must be a color string, e.g. 'r', 'g', or 'b'
    """
    data = channel.flatten()
    return pd.Series(data).plot.density(c=color)

# create the list of channels
r = img_data[:, :, 0]
g = img_data[:, :, 1]
b = img_data[:, :, 2]

channels = ['r', 'g', 'b']

def plot_rgb(image_data):
    # use enumerate to loop over colors and indexes
    for i, color in enumerate(channels):
        # index the array that was passed in, not the global img_data
        plot_kde(image_data[:, :, i], color)
    plt.show()

plot_rgb(img_data)

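Now we'll look at two different images and some of the differences between them. The first image is of a honey bee, and the second is of a bumble bee.

First, let's look at the honey bee.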

# load bee_12.jpg as honey

honey = Image.open('datasets/bee_12.jpg')

# display the honey bee image

display(honey)

# NumPy array of the honey bee image data

honey_data = np.array(honey)

# plot the rgb densities for the honey bee image

plot_rgb(honey_data)

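Now let's look at the bumble bee.

When you compare these images, it is clear how different the colors are. The honey bee image above, with a blue flower, has a strong peak on the right-hand side of the blue channel.

The bumble bee image, which has a lot of yellow in both the bee and the background, has almost perfect overlap between the red and green channels (which together make yellow).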

# load bee_3.jpg as bumble

bumble = Image.open('datasets/bee_3.jpg')

# display the bumble bee image

display(bumble)

# NumPy array of the bumble bee image data

bumble_data = np.array(bumble)

# plot the rgb densities for the bumble bee image

plot_rgb(bumble_data)

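While color information is sometimes useful, at other times it can be distracting. In this example we are looking at bees, and the bees themselves are very similar in color. On the other hand, the bees are often on flowers of different colors. Because flower color could distract from telling honey bees and bumble bees apart, let's convert these images to grayscale.

Because we change the number of color "channels", the shape of the array changes too. It is also interesting to see how the KDE of the grayscale version compares with the RGB version above.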
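The shape change can be checked on a stand-in image. This is a minimal sketch that assumes a made-up solid-color RGB image instead of a bee photo; it converts the image with Pillow's convert("L") and compares the array shapes before and after. The names color_img and gray_img are illustrative.

```python
import numpy as np
from PIL import Image

# hypothetical solid-color RGB image standing in for a bee photo
color_img = Image.new("RGB", (100, 100), color=(200, 180, 40))

# "L" is Pillow's 8-bit grayscale mode
gray_img = color_img.convert("L")

# the color array has a channel axis; the grayscale array does not
color_shape = np.array(color_img).shape
gray_shape = np.array(gray_img).shape
print(color_shape, gray_shape)
```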

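Now we'll make a couple of changes to the Image object from Pillow and save the results. We'll flip the image left to right, just as we did with the color version.

Then we'll change the NumPy version of the data by clipping it. Using the np.maximum function, we can take any number in the array smaller than 100 and replace it with 100. Because this narrows the range of values, and imshow stretches that narrower range across the whole colormap, the plotted image appears to have higher contrast. We'll then convert the array back to an Image and save the result.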
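The effect of np.maximum is easiest to see on a few numbers. A minimal sketch with made-up pixel values (the names pixels and clipped are illustrative):

```python
import numpy as np

# hypothetical grayscale pixel values
pixels = np.array([[0, 50], [100, 200]], dtype=np.uint8)

# values below 100 are raised to 100; values at or above 100 are unchanged
clipped = np.maximum(pixels, 100)
print(clipped)
```

The result keeps the uint8 dtype, which is what Image.fromarray expects for a grayscale image.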

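# convert the honey bee image to grayscale and to a NumPy array
# (assumption: honey_bw and honey_bw_arr are used below but not defined in the text,
# so they are taken to be the grayscale image and its array)

honey_bw = honey.convert("L")

honey_bw_arr = np.array(honey_bw)

# flip the image left-right with transpose

honey_bw_flip = honey_bw.transpose(Image.FLIP_LEFT_RIGHT)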

# show the flipped image

display(honey_bw_flip)

# save the flipped image

honey_bw_flip.save('saved_images/bw_flipped.jpg')

# create higher contrast by reducing range

honey_hc_arr = np.maximum(honey_bw_arr, 100)

# show the higher contrast version

plt.imshow(honey_hc_arr, cmap=plt.cm.gray)

# convert the NumPy array of high contrast to an Image

honey_bw_hc = Image.fromarray(honey_hc_arr)

# save the high contrast version

honey_bw_hc.save("saved_images/bw_hc.jpg")

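Now it's time to create an image processing pipeline. We have all the tools in our toolbox to load images, transform them, and save the results.

In this pipeline we will do the following:

load the image with Image.open and create paths to save our images to

convert the image to grayscale

save the grayscale image

rotate, crop, and zoom in on the image and save the new image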

image_paths = ['datasets/bee_1.jpg', 'datasets/bee_12.jpg', 'datasets/bee_2.jpg', 'datasets/bee_3.jpg']

def process_image(path):
    img = Image.open(path)

    # create paths to save files to
    bw_path = "saved_images/bw_{}.jpg".format(path.stem)
    rcz_path = "saved_images/rcz_{}.jpg".format(path.stem)

    print("Creating grayscale version of {} and saving to {}.".format(path, bw_path))
    bw = img.convert("L")
    bw.save(bw_path)

    print("Creating rotated, cropped, and zoomed version of {} and saving to {}.".format(path, rcz_path))
    rcz = img.rotate(45).crop((25,25,75,75)).resize((100,100))
    rcz.save(rcz_path)

# for loop over image paths
for img_path in image_paths:
    process_image(Path(img_path))

Creating grayscale version of datasets/bee_1.jpg and saving to saved_images/bw_bee_1.jpg.

Creating rotated, cropped, and zoomed version of datasets/bee_1.jpg and saving to saved_images/rcz_bee_1.jpg.

Creating grayscale version of datasets/bee_12.jpg and saving to saved_images/bw_bee_12.jpg.

Creating rotated, cropped, and zoomed version of datasets/bee_12.jpg and saving to saved_images/rcz_bee_12.jpg.

Creating grayscale version of datasets/bee_2.jpg and saving to saved_images/bw_bee_2.jpg.

Creating rotated, cropped, and zoomed version of datasets/bee_2.jpg and saving to saved_images/rcz_bee_2.jpg.

Creating grayscale version of datasets/bee_3.jpg and saving to saved_images/bw_bee_3.jpg.

Creating rotated, cropped, and zoomed version of datasets/bee_3.jpg and saving to saved_images/rcz_bee_3.jpg.

Now it's time to build our model. Let's import the additional libraries.

from skimage.feature import hog

from skimage.color import rgb2gray

from sklearn.preprocessing import StandardScaler

from sklearn.decomposition import PCA

# import train_test_split from sklearn's model selection module

from sklearn.model_selection import train_test_split

# import SVC from sklearn's svm module

from sklearn.svm import SVC

# import accuracy_score from sklearn's metrics module

from sklearn.metrics import roc_curve, auc, accuracy_score

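We'll load the labels.csv file into a DataFrame called labels, where the index is the image name (for example, an index of 1036 refers to an image named 1036.jpg) and the genus column tells us the bee type. genus takes the value 0.0 (Apis, or honey bee) or 1.0 (Bombus, or bumble bee).

The function get_image converts an index value from the DataFrame into the file path where the image is located, opens the image using the Image object in Pillow, and then returns the image as a NumPy array.

We'll use this function to load the sixth Apis image and then the sixth Bombus image in the DataFrame.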
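The row selection described here combines a boolean mask with positional indexing on the index. The sketch below uses a small made-up table in place of labels.csv: the image ids are invented, and the layout (ids as index, a genus column) is assumed from the description above.

```python
import pandas as pd

# hypothetical labels table; ids are invented for illustration
demo_labels = pd.DataFrame(
    {"genus": [0.0, 1.0, 0.0, 1.0, 0.0]},
    index=[520, 3800, 3289, 2695, 4922],
)

# the boolean mask keeps only rows where genus is 0.0 (Apis)
apis_only = demo_labels[demo_labels.genus == 0.0]

# .index[n] is positional: index[1] is the id of the second Apis row
second_apis_id = apis_only.index[1]
print(second_apis_id)
```

In the same way, index[5] picks the sixth matching row, because positions start at zero.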

# load the labels using pandas

labels = pd.read_csv("datasets/labels.csv", index_col=0)

# show the first five rows of the dataframe using head

display(labels.head(5))

def get_image(row_id, root="datasets/"):
    """
    Converts an image number into the file path where the image is located,
    opens the image, and returns the image as a numpy array.
    """
    filename = "{}.jpg".format(row_id)
    # build the path with pathlib (Path was imported at the top)
    file_path = Path(root) / filename
    img = Image.open(file_path)
    return np.array(img)

# subset the dataframe to just Apis (genus is 0.0) get the value of the sixth item in the index

apis_row = labels[labels.genus == 0.0].index[5]

# show the corresponding image of an Apis

plt.imshow(get_image(apis_row))

plt.show()

# subset the dataframe to just Bombus (genus is 1.0) get the value of the sixth item in the index

bombus_row = labels[labels.genus == 1.0].index[5]

# show the corresponding image of a Bombus

plt.imshow(get_image(bombus_row))

plt.show()

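scikit-image has many image processing functions built in, such as converting an image to grayscale. The rgb2gray function computes the luminance of an RGB image using the following formula:

Y = 0.2125 R + 0.7154 G + 0.0721 B

Image data is represented as a matrix, where the depth is the number of channels. An RGB image has three channels (red, green, and blue), whereas the returned grayscale image has only one channel.

Accordingly, the original color image has dimensions 100x100x3, but after calling rgb2gray the resulting grayscale image has a single channel: the channel axis is dropped and the returned array has shape 100x100.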
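The luminance formula is a weighted sum over the channel axis, which can be sketched with plain NumPy. This is an illustration on a made-up 2x2 image with float values in [0, 1], not a call into scikit-image; the names weights, rgb, and luminance are assumptions.

```python
import numpy as np

# weights from the luminance formula above
weights = np.array([0.2125, 0.7154, 0.0721])

# hypothetical 2x2 RGB image: pure red, pure green, pure blue, and white
rgb = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]])

# weighted sum over the last (channel) axis collapses 2x2x3 to 2x2
luminance = rgb @ weights
print(luminance.shape)
print(luminance)
```

Because the three weights sum to 1, a white pixel maps to a luminance of 1.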

# load a bombus image using our get_image function and bombus_row from the previous cell

bombus = get_image(bombus_row)

# print the shape of the bombus image

print('Color bombus image has shape: ', bombus.shape)

# convert the bombus image to grayscale

gray_bombus = rgb2gray(bombus)

# show the grayscale image

plt.imshow(gray_bombus, cmap=plt.cm.gray)

# grayscale bombus image only has one channel

print('Grayscale bombus image has shape: ', gray_bombus.shape)

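Now we need to turn these images into something a machine learning algorithm can understand. Traditional computer vision techniques rely on mathematical transforms to turn images into useful features. For example, you may want to detect the edges of objects in an image, increase the contrast, or filter out particular colors.

We have a matrix of pixel values, but on their own those values do not contain enough interesting information for most algorithms. We need to help the algorithms along by picking out some of the salient features for them using the histogram of oriented gradients (HOG) descriptor. The idea behind HOG is that an object's shape within an image can be inferred from its edges, and one way to identify edges is to look at the direction of intensity gradients (that is, changes in luminance).

An image is divided in a grid fashion into cells, and for the pixels within each cell, a histogram of gradient directions is compiled. To improve invariance to highlights and shadows in a picture, cells are block normalized, meaning an intensity value is calculated for a larger region of an image called a block and used to contrast normalize all cell-level histograms within each block. The HOG feature vector for the image is the concatenation of these cell-level histograms.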
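To make the cell-level step concrete, here is a simplified sketch of one cell's orientation histogram in plain NumPy. It is an illustration of the idea only: it skips block normalization and is not scikit-image's implementation. The function name cell_orientation_histogram, the 9 bins, and the ramp-shaped test cell are all assumptions made for the example.

```python
import numpy as np

def cell_orientation_histogram(cell, n_bins=9):
    """Histogram of unsigned gradient directions, weighted by gradient magnitude."""
    # np.gradient returns the derivative along rows (y) first, then columns (x)
    gy, gx = np.gradient(cell.astype(float))
    magnitude = np.hypot(gx, gy)
    # fold directions into [0, 180) so opposite directions share a bin
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180
    hist, _ = np.histogram(orientation, bins=n_bins, range=(0, 180),
                           weights=magnitude)
    return hist

# hypothetical 16x16 cell whose brightness rises steadily from left to right
ramp_cell = np.tile(np.arange(16), (16, 1))
ramp_hist = cell_orientation_histogram(ramp_cell)
print(ramp_hist)
```

Every pixel in the ramp has the same horizontal gradient, so all of the weight lands in the first bin.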

# run HOG using our grayscale bombus image
hog_features, hog_image = hog(gray_bombus,
                              visualize=True,
                              block_norm='L2-Hys',
                              pixels_per_cell=(16, 16))

# show our hog_image with a gray colormap
plt.imshow(hog_image, cmap=plt.cm.gray)

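We want to provide our model with the raw pixel values from our images as well as the HOG features we just calculated. To do this, we'll write a function called create_features that combines these two sets of features by flattening the three-dimensional array into a one-dimensional (flat) array.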

def create_features(img):
    # flatten three channel color image
    color_features = img.flatten()
    # convert image to grayscale
    gray_image = rgb2gray(img)
    # get HOG features from grayscale image
    hog_features = hog(gray_image, block_norm='L2-Hys', pixels_per_cell=(16, 16))
    # combine color and hog features into a single array
    flat_features = np.hstack([color_features, hog_features])
    return flat_features

bombus_features = create_features(bombus)

# print shape of bombus_features

print('Bombus features have shape: ', bombus_features.shape)

Bombus features have shape: (31296,)

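Above we generated a flattened features array for the bombus image. Now it's time to loop over all of our images. We'll create features for each image and then stack the flattened feature arrays into one big matrix that we can pass to our model.

In the create_feature_matrix function, we'll do the following:

load an image

generate a row of features using the create_features function above

stack the rows into a feature matrix

In the resulting feature matrix, rows correspond to images and columns to features.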

def create_feature_matrix(label_dataframe):
    features_list = []

    for img_id in label_dataframe.index:
        # load image
        img = get_image(img_id)
        # get features for image
        image_features = create_features(img)
        features_list.append(image_features)

    # convert list of arrays into a matrix
    feature_matrix = np.array(features_list)
    return feature_matrix

# run create_feature_matrix on our dataframe of images

feature_matrix = create_feature_matrix(labels)

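Now we need to split our data into training and test sets. We'll use 70% of the images as training data and test our model on the remaining 30%. Scikit-learn's train_test_split function makes this easy.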

# split the data into training and test sets

X_train, X_test, y_train, y_test = train_test_split(feature_matrix,
                                                    labels.genus.values,
                                                    test_size=.3,
                                                    random_state=1234123)

# look at the distribution of labels in the train set

pd.Series(y_train).value_counts()

0.0 175

1.0 175

dtype: int64

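We need to split the data before scaling it to avoid data leakage, where our model gains information about the test set.

Now that the data has been split, we can fit a StandardScaler to our training features and use that fit to transform both sets of data.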
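The fit-on-train, transform-both pattern can be sketched with NumPy alone. This is an illustration of the idea rather than scikit-learn's code: it uses made-up random data, and it standardizes with the population standard deviation (NumPy's default), which is also what StandardScaler uses. All variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical feature matrices: 8 training rows and 4 test rows, 3 features each
train = rng.normal(loc=5.0, scale=2.0, size=(8, 3))
test = rng.normal(loc=5.0, scale=2.0, size=(4, 3))

# statistics come from the training rows only, so nothing leaks from the test set
mean = train.mean(axis=0)
std = train.std(axis=0)

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std   # reuses the training statistics

print(train_scaled.mean(axis=0))
print(test_scaled.mean(axis=0))
```

The scaled training columns have mean 0 and standard deviation 1 by construction; the test columns generally do not, and that is expected.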

# get shape of our training features

print('Training features matrix shape is: ', X_train.shape)

# define standard scaler

ss = StandardScaler()

# fit the scaler and transform the training features

train_stand = ss.fit_transform(X_train)

# transform the test features

test_stand = ss.transform(X_test)

# look at the new shape of the standardized feature matrices

print('Standardized training features matrix shape is: ', train_stand.shape)

print('Standardized test features matrix shape is: ', test_stand.shape)

Training features matrix shape is: (350, 31296)

Standardized training features matrix shape is: (350, 31296)

Standardized test features matrix shape is: (150, 31296)
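The 31,296 columns reported above can be accounted for by hand. The sketch below works through the arithmetic, assuming the HOG call uses pixels_per_cell=(16, 16) as in create_features together with 3x3 cells per block and 9 orientations (taken here to be the scikit-image defaults, since the code does not set them).

```python
# assumed settings: 100x100 images, 16x16-pixel cells,
# 3x3 cells per block and 9 orientation bins (assumed library defaults)
image_side = 100
cell_side = 16
cells_per_block_side = 3
orientations = 9

# whole cells that fit along one side of the image
cells_per_side = image_side // cell_side

# blocks slide one cell at a time across the grid of cells
blocks_per_side = cells_per_side - cells_per_block_side + 1

# each block contributes one histogram per cell it contains
hog_length = blocks_per_side ** 2 * cells_per_block_side ** 2 * orientations

# raw pixel features: one value per pixel per channel
color_length = image_side * image_side * 3

total_features = color_length + hog_length
print(hog_length, color_length, total_features)
```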

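We have over 31,000 features for each image but only 500 images in total. To use an SVM, our model of choice, we also need to reduce the number of features with principal component analysis (PCA). We'll keep 350 principal components. This means that the transformed versions of our feature matrices train_stand and test_stand will have only 350 columns instead of the original 31,296.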

# Instantiate a PCA object with 350 components

pca = PCA(n_components=350)

# use fit_transform on our standardized training features

X_train = pca.fit_transform(train_stand)

# use transform on our standardized test features

X_test = pca.transform(test_stand)

# look at new shape

print('Training features matrix is: ', X_train.shape)

print('Test features matrix is: ', X_test.shape)

Training features matrix is: (350, 350)

Test features matrix is: (150, 350)
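The projection performed above can be sketched with NumPy's SVD. This is a minimal illustration of the idea behind PCA on made-up random data, not scikit-learn's implementation: directions are learned from the training rows, and new rows are projected using the same mean and directions. The sizes and names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical data: 20 training rows with 50 features, reduced to 5 components
data = rng.normal(size=(20, 50))
n_components = 5

# center the data with the training mean
mean = data.mean(axis=0)
centered = data - mean

# rows of vt are principal directions, ordered by decreasing singular value
_, _, vt = np.linalg.svd(centered, full_matrices=False)
components = vt[:n_components]

# project the training rows onto the kept directions
reduced = centered @ components.T

# new rows reuse the training mean and directions
new_rows = rng.normal(size=(3, 50))
new_reduced = (new_rows - mean) @ components.T

print(reduced.shape, new_reduced.shape)
```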

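It's finally time to build our model! We'll use a support vector machine (SVM), a type of supervised machine learning model used for regression, classification, and outlier detection.

Since we have a classification task (honey bee or bumble bee), we'll use a support vector classifier (SVC), a type of SVM. We'll use accuracy as the metric to evaluate performance.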
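Accuracy is simply the fraction of predictions that match the true labels. A minimal sketch with made-up labels (the names y_true and y_guess are illustrative):

```python
import numpy as np

# hypothetical true labels and predictions (0.0 = Apis, 1.0 = Bombus)
y_true = np.array([0.0, 1.0, 1.0, 0.0, 1.0])
y_guess = np.array([0.0, 1.0, 0.0, 0.0, 0.0])

# the comparison gives True/False per image; the mean is the share that match
demo_accuracy = np.mean(y_true == y_guess)
print(demo_accuracy)
```

Three of the five made-up predictions are correct, so the accuracy is 0.6.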

# define support vector classifier

svm = SVC(kernel='linear', probability=True, random_state=42)

# fit model

svm.fit(X_train, y_train)

# generate predictions

y_pred = svm.predict(X_test)

# calculate accuracy

accuracy = accuracy_score(y_test, y_pred)

print('Model accuracy is: ', accuracy)

Model accuracy is: 0.68

Now we'll use svm.predict_proba to get the probability that each class is the true label.

# predict probabilities for X_test using predict_proba

probabilities = svm.predict_proba(X_test)

# select the probabilities for label 1.0

y_proba = probabilities[:, 1]

# calculate false positive rate and true positive rate at different thresholds

false_positive_rate, true_positive_rate, thresholds = roc_curve(y_test, y_proba, pos_label=1)

# calculate AUC

roc_auc = auc(false_positive_rate, true_positive_rate)

plt.title('Receiver Operating Characteristic')

# plot the false positive rate on the x axis and the true positive rate on the y axis

roc_plot = plt.plot(false_positive_rate,
                    true_positive_rate,
                    label='AUC = {:0.2f}'.format(roc_auc))

plt.legend(loc=0)

plt.plot([0,1], [0,1], ls='--')

plt.ylabel('True Positive Rate')

plt.xlabel('False Positive Rate');
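Each point on an ROC curve like the one plotted above corresponds to one probability threshold. The sketch below computes a single such point with NumPy on made-up labels and scores; roc_point is a hypothetical helper written for this illustration, not a scikit-learn function.

```python
import numpy as np

def roc_point(y_true, scores, threshold):
    """Return (false positive rate, true positive rate) at one threshold."""
    predicted_positive = scores >= threshold
    actual_positive = y_true == 1
    tpr = np.sum(predicted_positive & actual_positive) / np.sum(actual_positive)
    fpr = np.sum(predicted_positive & ~actual_positive) / np.sum(~actual_positive)
    return fpr, tpr

# hypothetical labels and predicted probabilities for class 1
y_demo = np.array([0, 0, 1, 1])
scores_demo = np.array([0.1, 0.4, 0.35, 0.8])

strict_point = roc_point(y_demo, scores_demo, 0.5)
loose_point = roc_point(y_demo, scores_demo, 0.3)
print(strict_point, loose_point)
```

Lowering the threshold moves the point up and to the right: more true positives are caught, at the cost of more false positives.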

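We now have a fully trained computer vision model that can be used to distinguish honey bees from bumble bees in images. The next step in the data science process is to explore improving the model by using more data, adding new features, and trying different approaches!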
