TensorFlow Handwritten Digit Recognition: CNN

A model built as an MLP reaches about 96% accuracy. To raise accuracy further, we use the CNN (Convolutional Neural Network) proposed by Yann LeCun.

Introduction to CNN

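A CNN applies convolution to turn one image into multiple images. The network is divided into two parts:

Convolution and downsampling, which extract image features

The image goes through the first convolution, the first downsampling, the second convolution, and the second downsampling to extract its features.
Fully connected neural network

After the features are extracted, they are reshaped into a 1-D vector and fed into a neural network made up of a flatten layer, a hidden layer, and an output layer.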
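The two-stage pipeline above can be traced as plain shape arithmetic. The following is a minimal sketch, not part of the original code: the helper names `same_output_size` and `trace_shapes` are hypothetical, and the filter counts (16 and 36) and the 2x2 pooling stride are taken from the TensorFlow code later in this post. It assumes 'SAME' padding, where the output size is the input size divided by the stride, rounded up.

```python
import math

def same_output_size(size, stride):
    # With 'SAME' padding, output size = ceil(input size / stride)
    return math.ceil(size / stride)

def trace_shapes(size=28, conv_filters=(16, 36), pool_stride=2):
    # Returns the (name, height, width, channels) of each stage
    # and the length of the flattened vector fed to the dense layers
    shapes = [("input", size, size, 1)]
    for n, filters in enumerate(conv_filters, start=1):
        size = same_output_size(size, 1)            # convolution, stride 1
        shapes.append(("conv%d" % n, size, size, filters))
        size = same_output_size(size, pool_stride)  # 2x2 max pooling
        shapes.append(("pool%d" % n, size, size, filters))
    flat_length = size * size * conv_filters[-1]
    return shapes, flat_length

shapes, flat_length = trace_shapes()
for name, h, w, c in shapes:
    print(name, (h, w, c))
print("flattened length:", flat_length)
```

The final stage is 36 feature maps of 7x7, which is where the flattened length of 1764 used by the fully connected part comes from.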

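The pooling layer performs downsampling. Its advantages:

Fewer data points to process: less time is needed for the subsequent computation
Smaller positional differences: handwritten digits appear at different positions, which affects recognition; shrinking the image makes these positional differences smaller
Fewer parameters and less computation: this helps control overfitting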
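To make the downsampling concrete, here is a small NumPy sketch of 2x2 max pooling with stride 2. It is an illustration only, not the TensorFlow op used later: the function name `max_pool_2x2_np` is hypothetical, and it assumes a single 2-D image with even height and width.

```python
import numpy as np

def max_pool_2x2_np(image):
    # Split the image into non-overlapping 2x2 blocks
    # and keep the largest value of each block
    h, w = image.shape
    blocks = image.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

image = np.arange(16).reshape(4, 4)
pooled = max_pool_2x2_np(image)
print(image)
print(pooled)  # each entry is the maximum of one 2x2 block

# A 28x28 image shrinks to 14x14, a quarter of the data points
print(max_pool_2x2_np(np.zeros((28, 28))).shape)
```

Because only the maximum of each block survives, shifting a stroke by one pixel inside a block leaves the pooled output unchanged, which is the position tolerance described above.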

TensorFlow CNN

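import tensorflow as tf
import numpy as np

# This script uses graph-mode (TF 1.x style) sessions and placeholders.
# Assumption: on TensorFlow 2.x, eager execution must be turned off first:
# tf.compat.v1.disable_eager_execution()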

# STEP 1: load the data
mnist = tf.keras.datasets.mnist
# Tuple of Numpy arrays: (x_train, y_train), (x_test, y_test)

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Reshape each 28x28 2-D training image into a 1-D array, then cast to float32
# Each image becomes an array of 784 floats
# The training and test sets contain 60000 and 10000 images respectively
# x_train_2D is a 2-D array of shape [60000, 28*28]
x_train_2D = x_train.reshape(60000, 28*28).astype('float32')
x_test_2D = x_test.reshape(10000, 28*28).astype('float32')
print('x_train_2D.shape=', x_train_2D.shape)
# x_train_2D.shape=(60000, 784)

# Normalize the pixel values (0~255); the simplest way is to divide by 255
# x_train_norm is the normalized result, with every value between 0 and 1
x_train_norm = x_train_2D/255
x_test_norm = x_test_2D/255

# One-hot encode the labels. For example, the digit 7 becomes array([0., 0., 0., 0., 0., 0., 0., 1., 0., 0.], dtype=float32), i.e. the value at index 7 is 1

y_train_one_hot_tf=tf.one_hot(y_train,10)
y_test_one_hot_tf=tf.one_hot(y_test,10)

y_train_one_hot = None
y_test_one_hot = None
with tf.compat.v1.Session() as sess:
    init = tf.compat.v1.global_variables_initializer()
    sess.run(init)
    y_train_one_hot = sess.run(y_train_one_hot_tf)
    y_test_one_hot = sess.run(y_test_one_hot_tf)

# Split x_train and y_train into a training part and a validation part
x_train_norm_data = x_train_norm[0:50000]
x_train_norm_validation = x_train_norm[50000:60000]

y_train_one_hot_data = y_train_one_hot[0:50000]
y_train_one_hot_validation = y_train_one_hot[50000:60000]


### Build the model

# First define some shared helper functions
def weight(shape):
    return tf.Variable(tf.random.truncated_normal(shape, stddev=0.1),
                       name ='W')
# Bias tensor: build a constant first, then wrap it in a Variable
def bias(shape):
    return tf.Variable(tf.constant(0.1, shape=shape)
                       , name = 'b')
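# Convolution; it acts like an image filter
#  x is the input image and must be a 4-D tensor
#  W is the filter weight; the filter weights are generated randomly later
#  strides is the filter step, set to [1,1,1,1]; the format is [1, stride, stride, 1], so the filter moves 1 step at a time, left to right and top to bottom
#  padding is 'SAME': the border is zero-padded before the operation, so the output image has the same size as the input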
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1],
                        padding='SAME')

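# Pooling layer: downsamples the image
#  x is the input image and must be a 4-D tensor
#  ksize is the pooling window size, set to [1,2,2,1]; the format is [1, height, width, 1], i.e. a window of height 2 and width 2
#  strides is the pooling window step, set to [1,2,2,1]; the format is [1, stride, stride, 1], so the window moves 2 steps at a time, left to right and top to bottom
#  A 28x28 image shrinks to 14x14 after max-pool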
def max_pool_2x2(x):
    return tf.nn.max_pool2d(x, ksize=[1,2,2,1],
                          strides=[1,2,2,1],
                          padding='SAME')


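# Input layer
with tf.name_scope('Input_Layer'):
    # The placeholder receives the images
    x = tf.compat.v1.placeholder("float",shape=[None, 784],name="x")
    # Each image in x is a flat 784-element vector; reshape x into a 4-D tensor
    # 1st dimension is -1, because the number of images fed through the placeholder during training is not fixed
    # 2nd and 3rd dimensions are 28, 28, because each image is 28x28
    # 4th dimension is 1 for a single-channel image; a color image would use 3 (RGB)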
    x_image = tf.reshape(x, [-1, 28, 28, 1])

# CNN Layer 1
# Extracts features; the convolution produces 16 images, still 28x28
with tf.name_scope('C1_Conv'):
    # The filter weight is 5x5
    # Single channel, so the 3rd dimension is 1
    # 16 output images, so the 4th dimension is 16
    W1 = weight([5,5,1,16])

    # 16 images are produced, so the bias shape is 16
    b1 = bias([16])

    # Convolution
    Conv1=conv2d(x_image, W1)+ b1
    # ReLU activation function
    C1_Conv = tf.nn.relu(Conv1 )

# The pooling layer downsamples the images from 28x28 to 14x14; there are still 16 images
with tf.name_scope('C1_Pool'):
    C1_Pool = max_pool_2x2(C1_Conv)

# CNN Layer 2
# The second convolution turns the 16 images into 36 images; convolution does not change the image size, which stays 14x14
with tf.name_scope('C2_Conv'):
    # The filter weight is 5x5
    # 3rd dimension is 16, because convolution layer 1 outputs 16 images
    # 4th dimension is 36, because the 16 images are turned into 36
    W2 = weight([5,5,16,36])
    # 36 images are produced, so the bias shape is 36
    b2 = bias([36])
    Conv2=conv2d(C1_Pool, W2)+ b2
    # relu sets negative values to 0
    C2_Conv = tf.nn.relu(Conv2)

# Pooling layer 2 downsamples the images from 14x14 to 7x7; there are still 36 images
with tf.name_scope('C2_Pool'):
    C2_Pool = max_pool_2x2(C2_Conv)

# Fully Connected Layer
# Flatten layer: turns the 36 7x7 images into a 1-D vector of length 36x7x7 = 1764, i.e. 1764 floats, as the input data
with tf.name_scope('D_Flat'):
    D_Flat = tf.reshape(C2_Pool, [-1, 1764])

with tf.name_scope('D_Hidden_Layer'):
    W3= weight([1764, 128])
    b3= bias([128])
    D_Hidden = tf.nn.relu(
                  tf.matmul(D_Flat, W3)+b3)

    # tf.nn.dropout takes `rate` instead of `keep_prob` (rate = 1 - keep_prob)
    # D_Hidden_Dropout= tf.nn.dropout(D_Hidden, keep_prob=0.8)
    # Note: this dropout is part of the graph, so it stays active during validation and testing too
    D_Hidden_Dropout= tf.nn.dropout(D_Hidden, rate = 0.2)

# Output layer, 10 neurons
#  y_predict = softmax(D_Hidden_Dropout * W4 + b4)
with tf.name_scope('Output_Layer'):
    # The previous layer D_Hidden has 128 neurons, so the 1st dimension is 128
    W4 = weight([128,10])
    b4 = bias([10])
    y_predict= tf.nn.softmax(
                 tf.matmul(D_Hidden_Dropout, W4)+b4)


### Set up the optimization step for training
# Train the model with the backpropagation algorithm
with tf.name_scope("optimizer"):

    y_label = tf.compat.v1.placeholder("float", shape=[None, 10],
                              name="y_label")

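    # Note: y_predict is already a softmax output, and
    # softmax_cross_entropy_with_logits applies softmax again,
    # so this loss cannot fall below about 1.46
    loss_function = tf.reduce_mean(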
                      tf.nn.softmax_cross_entropy_with_logits
                         (logits=y_predict ,
                          labels=y_label))

    optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=0.0001) \
                    .minimize(loss_function)


### Set up model evaluation
with tf.name_scope("evaluate_model"):
    correct_prediction = tf.equal(tf.argmax(y_predict, 1),
                                  tf.argmax(y_label, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))


### Train the model

trainEpochs = 30
batchSize = 100
totalBatchs = int(len(x_train_norm_data)/batchSize)
epoch_list=[];accuracy_list=[];loss_list=[];
from time import time

with tf.compat.v1.Session() as sess:
    startTime=time()

    sess.run(tf.compat.v1.global_variables_initializer())

    for epoch in range(trainEpochs):
        for i in range(totalBatchs):
            # batch_x, batch_y = mnist.train.next_batch(batchSize)
            batch_x = x_train_norm_data[i*batchSize:(i+1)*batchSize]
            batch_y = y_train_one_hot_data[i*batchSize:(i+1)*batchSize]

            sess.run(optimizer,feed_dict={x: batch_x,
                                          y_label: batch_y})

        loss,acc = sess.run([loss_function,accuracy],
                            feed_dict={x: x_train_norm_validation,
                                       y_label: y_train_one_hot_validation})

        epoch_list.append(epoch)
        loss_list.append(loss)
        accuracy_list.append(acc)

        print("Train Epoch:", '%02d' % (epoch+1), "Loss=","{:.9f}".format(loss)," Accuracy=",acc)

    duration =time()-startTime
    print("Train Finished takes:",duration)

    ## Evaluate model accuracy
    print("Accuracy:",
      sess.run(accuracy,feed_dict={x: x_test_norm,
                                   y_label:y_test_one_hot}))
    # First 5000 images
    print("Accuracy:",
      sess.run(accuracy,feed_dict={x: x_test_norm[:5000],
                                   y_label: y_test_one_hot[:5000]}))
    # Last 5000 images
    print("Accuracy:",
      sess.run(accuracy,feed_dict={x: x_test_norm[5000:],
                                   y_label: y_test_one_hot[5000:]}))

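    ## Predicted probabilities
    # Stored under a new name so the y_predict tensor is not overwritten
    y_predict_prob=sess.run(y_predict,
                   feed_dict={x: x_test_norm[:5000]})

    ## Predicted labels
    prediction_result=sess.run(tf.argmax(y_predict,1),
                           feed_dict={x: x_test_norm ,
                                      y_label: y_test_one_hot})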

    ## Save the model
    saver = tf.compat.v1.train.Saver()
    save_path = saver.save(sess, "saveModel/CNN_model1")
    print("Model saved in file: %s" % save_path)
    merged = tf.compat.v1.summary.merge_all()
    # The computation graph can be visualized with TensorBoard
    train_writer = tf.compat.v1.summary.FileWriter('log/CNN',sess.graph)


# Plot the loss and accuracy line charts with matplotlib
import matplotlib.pyplot as plt

fig = plt.gcf()
# fig.set_size_inches(4,2)
plt.plot(epoch_list, loss_list, label = 'loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['loss'], loc='upper left')
plt.savefig('loss.png')


# Start a new figure so the accuracy curve is not drawn on top of the loss curve
fig = plt.figure()
# fig.set_size_inches(4,2)
plt.plot(epoch_list, accuracy_list,label="accuracy" )

plt.ylim(0.8,1)
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['accuracy'], loc='upper right')
plt.savefig('accuracy.png')

############
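# View multiple images along with their labels
import matplotlib.pyplot as plt
def plot_images_labels_prediction(images,labels,prediction,idx,filename, num=10):
    # Use a new figure so the images are not drawn over the earlier charts
    fig = plt.figure()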
    fig.set_size_inches(12, 14)
    if num>25: num=25
    for i in range(0, num):
        ax=plt.subplot(5,5, 1+i)

        # Reshape the 784 values of each image to 28x28
        ax.imshow(np.reshape(images[idx],(28, 28)), cmap='binary')

        # Convert the one-hot label to a digit
        title= "label=" +str(np.argmax(labels[idx]))
        if len(prediction)>0:
            title+=",predict="+str(prediction[idx])

        ax.set_title(title,fontsize=10)
        ax.set_xticks([]);ax.set_yticks([])
        idx+=1
    plt.savefig(filename)


plot_images_labels_prediction(x_test_norm,
                              y_test_one_hot,
                              prediction_result,0, "result.png", num=10)

# Find wrong predictions
for i in range(400):
    if prediction_result[i]!=np.argmax(y_test_one_hot[i]):
        print("i="+str(i)+
              "   label=",np.argmax(y_test_one_hot[i]),
              "predict=",prediction_result[i])

Train Epoch: 01 Loss= 1.604377151  Accuracy= 0.8872
Train Epoch: 02 Loss= 1.547111511  Accuracy= 0.9281
Train Epoch: 03 Loss= 1.525221825  Accuracy= 0.9447
Train Epoch: 04 Loss= 1.516423583  Accuracy= 0.9511
Train Epoch: 05 Loss= 1.507740974  Accuracy= 0.9584
Train Epoch: 06 Loss= 1.503444791  Accuracy= 0.9636
Train Epoch: 07 Loss= 1.496760130  Accuracy= 0.9683
Train Epoch: 08 Loss= 1.494633555  Accuracy= 0.9712
Train Epoch: 09 Loss= 1.492025375  Accuracy= 0.9724
Train Epoch: 10 Loss= 1.491448402  Accuracy= 0.9735
Train Epoch: 11 Loss= 1.488568783  Accuracy= 0.9751
Train Epoch: 12 Loss= 1.488826513  Accuracy= 0.9745
Train Epoch: 13 Loss= 1.485750437  Accuracy= 0.9778
Train Epoch: 14 Loss= 1.484605789  Accuracy= 0.9798
Train Epoch: 15 Loss= 1.483879209  Accuracy= 0.9788
Train Epoch: 16 Loss= 1.482506037  Accuracy= 0.9808
Train Epoch: 17 Loss= 1.482969046  Accuracy= 0.9796
Train Epoch: 18 Loss= 1.481315017  Accuracy= 0.9811
Train Epoch: 19 Loss= 1.480247617  Accuracy= 0.983
Train Epoch: 20 Loss= 1.480669379  Accuracy= 0.9817
Train Epoch: 21 Loss= 1.480412602  Accuracy= 0.9824
Train Epoch: 22 Loss= 1.479805708  Accuracy= 0.983
Train Epoch: 23 Loss= 1.479858279  Accuracy= 0.9827
Train Epoch: 24 Loss= 1.479218960  Accuracy= 0.9834
Train Epoch: 25 Loss= 1.479144573  Accuracy= 0.9829
Train Epoch: 26 Loss= 1.478820801  Accuracy= 0.9838
Train Epoch: 27 Loss= 1.477338433  Accuracy= 0.9857
Train Epoch: 28 Loss= 1.478171706  Accuracy= 0.9847
Train Epoch: 29 Loss= 1.477008104  Accuracy= 0.9856
Train Epoch: 30 Loss= 1.477438688  Accuracy= 0.9845
Train Finished takes: 1763.7836382389069
Accuracy: 0.988
Accuracy: 0.9814
Accuracy: 0.9928

i=18   label= 3 predict= 5
i=290   label= 8 predict= 4
i=321   label= 2 predict= 7
i=359   label= 9 predict= 8
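
The validation loss in the log levels off near 1.47 rather than approaching 0, even though accuracy exceeds 98%. One likely explanation is that the loss function receives `y_predict`, which is already a softmax output, and then applies softmax a second time. The NumPy sketch below is an illustration under that assumption, with hypothetical helper names `softmax` and `cross_entropy`; it computes the loss for a perfect prediction after a second softmax.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D vector
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(probs, label_index):
    # Cross entropy against a one-hot label: -log of the true class probability
    return -np.log(probs[label_index])

# A perfect first softmax output for the digit 7
perfect = np.zeros(10)
perfect[7] = 1.0

# The second softmax squeezes it to e / (e + 9) for the true class
squeezed = softmax(perfect)
loss_floor = cross_entropy(squeezed, 7)

print("true-class probability after second softmax:", squeezed[7])
print("lowest reachable loss:", loss_floor)
```

The floor is roughly 1.46, which is consistent with the losses of about 1.48 reported above. Passing the pre-softmax values (`tf.matmul(D_Hidden_Dropout, W4)+b4`) as `logits` would be the usual way to avoid this, although the numbers in the log would then change.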

cctg

2020/10/26
