Validation loss and accuracy remain constant

I am trying to implement this paper on a set of medical images, using Keras. The network essentially consists of 4 convolution and max-pooling layers, followed by a fully connected layer and a softmax classifier.

As far as I know, I have followed the architecture described in the paper. However, the validation loss and accuracy stay flat throughout training. The accuracy seems to be fixed at ~57.5%.

Any help on where I might be going wrong would be greatly appreciated.

My code:

from keras.models import Sequential
from keras.layers import Activation, Dropout, Dense, Flatten  
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
from PIL import Image
import numpy as np
from sklearn.utils import shuffle
from sklearn.cross_validation import train_test_split
import theano
import os
import glob as glob
import cv2
from matplotlib import pyplot as plt

nb_classes = 2
img_rows, img_cols = 100,100
img_channels = 3


#################### DATA DIRECTORY SETTING######################

data = '/home/raghuram/Desktop/data'
os.chdir(data)
file_list = os.listdir(data)
##################################################################

## Test lines
#I = cv2.imread(file_list[1000])
#print np.shape(I)
####
non_responder_file_list = glob.glob('0_*FLAIR_*.png')
responder_file_list = glob.glob('1_*FLAIR_*.png')
print len(non_responder_file_list),len(responder_file_list)

labels = np.ones((len(file_list)),dtype = int)
labels[0:len(non_responder_file_list)] = 0
immatrix = np.array([np.array(cv2.imread(data+'/'+image)).flatten() for image in file_list])
#img = immatrix[1000].reshape(100,100,3)
#plt.imshow(img,cmap = 'gray')


data,Label = shuffle(immatrix,labels, random_state=2)
train_data = [data,Label]
X,y = (train_data[0],train_data[1])
# Also need to look at how to preserve spatial extent in the conv network
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=4)
X_train = X_train.reshape(X_train.shape[0], 3, img_rows, img_cols)
X_test = X_test.reshape(X_test.shape[0], 3, img_rows, img_cols)
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

X_train /= 255
X_test /= 255

Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

model = Sequential()

## First conv layer and its activation followed by the max-pool layer#
model.add(Convolution2D(16,5,5, border_mode = 'valid', subsample = (1,1), init = 'glorot_normal',input_shape = (3,100,100))) # Glorot normal is similar to Xavier initialization
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (2,2),strides = None))
# Output is 48x48

print 'First layer setup'
###########################Second conv layer#################################
model.add(Convolution2D(32,3,3,border_mode = 'same', subsample = (1,1),init = 'glorot_normal'))
model.add(Activation('relu'))
model.add(Dropout(0.6))
model.add(MaxPooling2D(pool_size = (2,2),strides = None))
#############################################################################

print ' Second layer setup'
# Output is 24x24

##########################Third conv layer###################################
model.add(Convolution2D(64,3,3, border_mode = 'same', subsample = (1,1), init = 'glorot_normal'))
model.add(Activation('relu'))
model.add(Dropout(0.6))
model.add(MaxPooling2D(pool_size = (2,2),strides = None))
#############################################################################
# Output is 12x12

print ' Third layer setup'
###############################Fourth conv layer#############################
model.add(Convolution2D(128,3,3, border_mode = 'same', subsample = (1,1), init = 'glorot_normal'))
model.add(Activation('relu'))
model.add(Dropout(0.6))
model.add(MaxPooling2D(pool_size = (2,2),strides = None))
############################################################################# 

print 'Fourth layer setup'

# Output is 6x6x128
# Create the FC layer of size 128x6x6#
model.add(Flatten()) 
model.add(Dense(2,init = 'glorot_normal',input_dim = 128*6*6))
model.add(Dropout(0.6))
model.add(Activation('softmax'))

print 'Setting up fully connected layer'
print 'Now compiling the network'
sgd = SGD(lr=0.01, decay=1e-4, momentum=0.6, nesterov=True)
model.compile(loss = 'mse',optimizer = 'sgd', metrics=['accuracy'])

# Fit the network to the data#
print 'Network setup successfully. Now fitting the network to the data'
model.fit(X_train,Y_train,batch_size = 100, nb_epoch = 20, validation_split = None,verbose = 1)
print 'Testing'
loss,accuracy = model.evaluate(X_test,Y_test,batch_size = 32,verbose = 1)
print "Test fraction correct (Accuracy) = {:.2f}".format(accuracy)

— Raghuram

Is the training loss going down?

— Jan van der Vegt

No, the training loss is also constant throughout.

— Raghuram

You haven't set any validation data or validation_split in your fit call, so what would it validate on? Or did you mean test?

— Jan van der Vegt

That is left over from experimenting. I had validation_split = 0.2 before setting it to None, and I tried that as well.

— Raghuram

Can you fit a single batch many times over to see whether you can get the training loss down?

— Jan van der Vegt
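
The single-batch check suggested above can be sketched without any deep learning framework. The following is a minimal illustration, not the code from the question: it assumes Python 3 and NumPy, uses a hypothetical linear softmax classifier in place of the CNN, and trains on made-up random data (`X_batch`, `Y_batch`). The point is only the procedure: keep one batch fixed, train on it repeatedly, and confirm the loss falls. If it does not fall even on one batch, the problem is more likely in the model, loss, or labels than in the amount of data.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def overfit_one_batch(X, Y, steps=500, lr=0.5):
    """Train a linear softmax classifier on one fixed batch; return the loss history."""
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(X.shape[1], Y.shape[1]))
    b = np.zeros(Y.shape[1])
    losses = []
    for _ in range(steps):
        P = softmax(X @ W + b)
        # Mean categorical cross-entropy over the batch
        losses.append(float(-np.mean(np.sum(Y * np.log(P + 1e-12), axis=1))))
        G = (P - Y) / X.shape[0]  # gradient of the mean loss w.r.t. the logits
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return losses

# Hypothetical stand-in batch: 100 samples, 20 features, 2 classes
rng = np.random.default_rng(1)
X_batch = rng.normal(size=(100, 20))
labels = (X_batch[:, 0] + 0.5 * X_batch[:, 1] > 0).astype(int)
Y_batch = np.eye(2)[labels]  # one-hot targets

losses = overfit_one_batch(X_batch, Y_batch)
print("first loss: %.4f, last loss: %.4f" % (losses[0], losses[-1]))
```

With the Keras model from the question, the equivalent would be to call fit many times on a fixed slice such as the first 100 training samples and watch the reported loss.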

It looks like you are using MSE as the loss function. From a quick look at the paper, they appear to use NLL (cross-entropy). MSE is considered sensitive to class imbalance, among other problems, and that may be the cause of what you are seeing. I would try training with the categorical cross-entropy loss in your case. Also, a learning rate of 0.01 seems too large; I would experiment with it and try 0.001 or even 0.0001.

— koltun
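
One commonly cited reason to prefer cross-entropy over MSE on a softmax output is how the gradient behaves when the network is confidently wrong. This is a related point to the one made in the answer above, not the same one. The sketch below assumes Python 3 and NumPy, and the logits `z` and target `y` are made-up values. It compares the gradient of each loss with respect to the logits for a single two-class example.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def grad_logits_cross_entropy(z, y):
    # d/dz of -sum(y * log(softmax(z))) simplifies to p - y
    return softmax(z) - y

def grad_logits_mse(z, y):
    # d/dz of mean((softmax(z) - y)**2), chained through the softmax Jacobian
    p = softmax(z)
    dL_dp = 2.0 * (p - y) / p.size
    jacobian = np.diag(p) - np.outer(p, p)  # dp_i / dz_j (symmetric)
    return jacobian @ dL_dp

z = np.array([5.0, -5.0])  # logits that confidently predict class 0
y = np.array([0.0, 1.0])   # but the true class is 1

g_ce = grad_logits_cross_entropy(z, y)
g_mse = grad_logits_mse(z, y)
print("cross-entropy gradient:", g_ce)
print("MSE gradient:          ", g_mse)
```

For this saturated, wrong prediction the cross-entropy gradient stays close to 1 in magnitude, while the MSE gradient is several orders of magnitude smaller, so updates under MSE can stall. Separately, the compile call in the question passes the string 'sgd' rather than the configured sgd object, so changes to that object's learning rate would have no effect until the object itself is passed.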

Although I am a bit late here, I want to add my two cents, since this helped me solve a similar problem recently. What came to my rescue was scaling the features into the (0,1) range, in addition to using the categorical cross-entropy loss. It is worth noting, though, that feature scaling only helps if the features are measured on different scales and differ widely in variance (by orders of magnitude) from one another, as was the case for me. Scaling can also be really useful if one uses hinge loss, since max-margin classifiers are generally sensitive to the distances between feature values. Hope this helps some future visitors!

— Saurav--
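
The feature scaling described above can be sketched as per-column min-max scaling. This is a minimal illustration assuming Python 3 and NumPy. The helper `min_max_scale` and the `features` array are hypothetical names and values chosen for the example, with two columns that differ by orders of magnitude.

```python
import numpy as np

def min_max_scale(X, eps=1e-12):
    """Rescale each column of X to the [0, 1] range."""
    X = np.asarray(X, dtype=np.float64)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    # eps guards against division by zero for constant columns
    return (X - col_min) / np.maximum(col_range, eps)

features = np.array([[1.0, 2000.0],
                     [2.0, 5000.0],
                     [3.0, 8000.0]])
scaled = min_max_scale(features)
print(scaled)
```

In practice the minimum and range would usually be computed on the training set only and reused for the test set. Note that the code in the question already divides the pixel values by 255, so its inputs are in the [0, 1] range to begin with.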