When I first came across the idea of applying the Logarithmic Number System (LNS) to deep learning, I was intrigued but also skeptical. Like most of us, I had always worked with Floating-Point (FP) arithmetic, the standard for numerical computation in deep learning. FP strikes a good balance between precision and range, but it comes with trade-offs: higher memory usage, computational complexity, and greater power consumption. So I decided to run an experiment and see for myself: how does LNS compare with FP when training a simple fully connected multi-layer perceptron (MLP) on MNIST?
LNS represents numbers on a logarithmic scale, turning multiplications into additions, which can be computationally cheaper on certain hardware architectures. That efficiency comes at a cost in accuracy, especially for addition and subtraction, which are more involved in LNS. Still, the potential benefits, a smaller memory footprint, faster computation, and lower power consumption, were enough to make me want to try it.
Floating-point arithmetic is the default numerical representation in deep learning frameworks such as PyTorch and TensorFlow. An FP number consists of a sign bit, an exponent, and a mantissa (significand).
FP32 (single precision) is the usual choice in deep learning, offering a balance between numerical accuracy and computational efficiency. More compact formats such as FP16 and BF16 are gaining popularity for speeding up training.
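To make that layout concrete, here is a small standalone sketch (not part of the experiment) that unpacks the three fields of an FP32 value using Python's standard `struct` module:

```python
import struct

def fp32_fields(value):
    # Reinterpret the 32 bits of an IEEE-754 single-precision float
    bits = struct.unpack('>I', struct.pack('>f', value))[0]
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8-bit biased exponent
    mantissa = bits & 0x7FFFFF        # 23-bit fraction
    return sign, exponent, mantissa

# 6.5 = +1.625 * 2**2, so the biased exponent is 127 + 2 = 129
print(fp32_fields(6.5))  # (0, 129, 5242880)
```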
LNS is an alternative numerical representation in which a number is stored as its logarithm: `x = log_b(y)`, where `b` is the base of the logarithm. LNS offers several advantages: multiplication and division reduce to addition and subtraction of the stored logarithms, and the representation can lower memory and power requirements on suitable hardware.
However, addition and subtraction in LNS require approximations, which leads to a loss of accuracy.
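As a quick, self-contained illustration (a plain-Python sketch, separate from the experiment code): multiplication is exact in the log domain, while addition needs a nonlinear correction term that real LNS hardware typically approximates with tables or interpolation.

```python
import math

x, y = 8.0, 32.0
lx, ly = math.log2(x), math.log2(y)   # 3.0 and 5.0

# Multiplication: log2(x * y) = lx + ly, exactly
product = 2 ** (lx + ly)              # 256.0 == 8.0 * 32.0

# Addition: log2(x + y) = max(lx, ly) + log2(1 + 2**-(|lx - ly|)), the hard part of LNS
lsum = max(lx, ly) + math.log2(1 + 2 ** (-abs(lx - ly)))
total = 2 ** lsum                     # ~40.0 == 8.0 + 32.0
print(product, total)
```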
To explore LNS further, I implemented the basic arithmetic operations, addition, subtraction, multiplication, and division, directly on the LNS internal representation.
```python
import torch
import numpy as np
import xlns as xl  # Assuming xlns module is installed and provides xlnsnp

# Function to convert floating-point numbers to xlns internal representation
def float_to_internal(arr):
    xlns_data = xl.xlnsnp(arr)
    return xlns_data.nd

# Function to convert xlns internal representation back to floating-point numbers
def internal_to_float(internal_data):
    original_numbers = []
    for value in internal_data:
        x = value // 2
        s = value % 2
        # Use x and s to create xlns object
        xlns_value = xl.xlns(0)
        xlns_value.x = x
        xlns_value.s = s
        original_numbers.append(float(xlns_value))
    return original_numbers

# Function to perform LNS addition using internal representation
def lns_add_internal(x, y):
    max_part = torch.maximum(x, y)
    diff = torch.abs(x - y)
    adjust_term = torch.log1p(torch.exp(-diff))
    return max_part + adjust_term

# Function to perform LNS subtraction using internal representation
def lns_sub_internal(x, y):
    return lns_add_internal(x, -y)

# Function to perform LNS multiplication using internal representation
def lns_mul_internal(x, y):
    return x + y

# Function to perform LNS division using internal representation
def lns_div_internal(x, y):
    return x - y

# Input floating-point arrays
x_float = [2.0, 3.0]
y_float = [-1.0, 0.0]

# Convert floating-point arrays to xlns internal representation
x_internal = float_to_internal(x_float)
y_internal = float_to_internal(y_float)

# Create tensors from the internal representation
tensor_x_nd = torch.tensor(x_internal, dtype=torch.int64)
tensor_y_nd = torch.tensor(y_internal, dtype=torch.int64)

# Perform the toy LNS addition on the internal representation
result_add_internal = lns_add_internal(tensor_x_nd, tensor_y_nd)

# Perform the toy LNS subtraction on the internal representation
result_sub_internal = lns_sub_internal(tensor_x_nd, tensor_y_nd)

# Perform the toy LNS multiplication on the internal representation
result_mul_internal = lns_mul_internal(tensor_x_nd, tensor_y_nd)

# Perform the toy LNS division on the internal representation
result_div_internal = lns_div_internal(tensor_x_nd, tensor_y_nd)

# Convert the internal results back to original floating-point values
result_add_float = internal_to_float(result_add_internal.numpy())
result_sub_float = internal_to_float(result_sub_internal.numpy())
result_mul_float = internal_to_float(result_mul_internal.numpy())
result_div_float = internal_to_float(result_div_internal.numpy())

# Convert the results back to PyTorch tensors
result_add_tensor = torch.tensor(result_add_float, dtype=torch.float32)
result_sub_tensor = torch.tensor(result_sub_float, dtype=torch.float32)
result_mul_tensor = torch.tensor(result_mul_float, dtype=torch.float32)
result_div_tensor = torch.tensor(result_div_float, dtype=torch.float32)

# Print results
print("Input x:", x_float)
print("Input y:", y_float)
print("Addition Result:", result_add_float)
print("Addition Result Tensor:", result_add_tensor)
print("Subtraction Result:", result_sub_float)
print("Subtraction Result Tensor:", result_sub_tensor)
print("Multiplication Result:", result_mul_float)
print("Multiplication Result Tensor:", result_mul_tensor)
print("Division Result:", result_div_float)
print("Division Result Tensor:", result_div_tensor)
```
Here is a summary of my experimental implementation of the Logarithmic Number System (LNS).
In LNS, numbers are represented as logarithms, which turns multiplication and division into addition and subtraction. Implementing this with PyTorch raises some specific challenges, because PyTorch tensors use floating-point representations internally. This imposes several requirements, described below.
The first step is converting floating-point numbers into the LNS internal representation.
```python
import torch
import numpy as np
import xlns as xl  # External LNS library

def float_to_internal(arr):
    xlns_data = xl.xlnsnp(arr)
    return xlns_data.nd

# Convert floating-point arrays to xlns internal representation
x_float = np.array([2.0, 3.0])
y_float = np.array([-1.0, 0.0])
x_internal = float_to_internal(x_float)
y_internal = float_to_internal(y_float)

# Create tensors from the internal representation
tensor_x_nd = torch.tensor(x_internal, dtype=torch.int64)
tensor_y_nd = torch.tensor(y_internal, dtype=torch.int64)
```
Using `dtype=torch.int64` is essential here, because the xlns internal representation packs the log value and a sign flag into a single integer (as the decoding in `internal_to_float` shows); storing it in a floating-point tensor could round and corrupt that encoding.
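For illustration only, here is a minimal sketch of the packing scheme implied by `internal_to_float` above (a log field above a sign flag in the lowest bit). The exact scaling of the log field is internal to `xlns` and is assumed here, not taken from its documentation:

```python
# Hypothetical packing that mirrors the decode logic (value // 2 and value % 2)
def pack(log_field: int, sign_flag: int) -> int:
    return 2 * log_field + sign_flag

def unpack(packed: int):
    return packed // 2, packed % 2

packed = pack(123456, 1)
print(unpack(packed))  # (123456, 1): exact round trip because everything stays integer
# Storing `packed` in a float32 tensor could round large values and silently
# corrupt both fields; int64 preserves the encoding exactly.
```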
```python
def lns_mul_internal(x, y):
    return x + y
```
Multiplication in LNS becomes addition: if `a = log(x)` and `b = log(y)`, then `log(x*y) = log(x) + log(y)`.
```python
def lns_div_internal(x, y):
    return x - y
```
Division becomes subtraction: `log(x/y) = log(x) - log(y)`.

```python
def lns_add_internal(x, y):
    max_part = torch.maximum(x, y)
    diff = torch.abs(x - y)
    adjust_term = torch.log1p(torch.exp(-diff))
    return max_part + adjust_term
```
Addition is more complex and numerically sensitive, because

`log(x + y) = log(max(x,y)) + log(1 + exp(log(min(x,y)) - log(max(x,y))))`.

The implementation uses `log1p` instead of `log(1 + x)` for better numerical stability. Decoding the packed integers back to floats separates the log field from the sign flag:

```python
def internal_to_float(internal_data):
    for value in internal_data:
        x = value // 2  # Integer division extracts the log field
        s = value % 2   # Integer modulo extracts the sign flag
```
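A quick standalone comparison (separate from the code above) shows why `log1p` is preferred for the adjustment term when its argument is tiny:

```python
import math

tiny = 1e-16
print(math.log(1 + tiny))  # 0.0, because 1 + 1e-16 rounds to exactly 1.0 in double precision
print(math.log1p(tiny))    # 1e-16, log1p keeps the precision of the small argument
```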
The conversion pipeline keeps the LNS domain and the floating-point domain cleanly separated:

```python
# Convert results back to float and tensor
result_add_float = internal_to_float(result_add_internal.numpy())
result_add_tensor = torch.tensor(result_add_float, dtype=torch.float32)
```
Several optimizations could further improve the performance of this pipeline.
The current implementation makes some practical trade-offs.
The pipeline can be traced end to end with the example values [2.0, 3.0] and [-1.0, 0.0]: convert them to the internal representation, apply the LNS operations, and convert the results back to floating point.
This implementation bridges the gap between PyTorch's floating-point tensor system and LNS arithmetic while preserving numerical stability and reasonable accuracy.
I trained a fully connected MLP on the MNIST dataset using both the FP and LNS representations. The model architecture was deliberately simple: a 784-unit input layer (28x28 flattened images), one hidden layer with 100 neurons, and a 10-unit output layer.
Implementing the LNS version required stepping outside the usual PyTorch workflow. Unlike FP, which PyTorch supports natively, there are no built-in LNS operations. I found a GitHub project called `xlns` that implements logarithmic number representations and arithmetic, which makes it possible to use LNS in neural networks.
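Before the full training script, here is a minimal sketch of how `xlns` values behave. It uses only calls that appear later in this post (`xl.xlnssetF`, the `xl.xlns` scalar type, `float()` conversion); the printed values are approximate and depend on the chosen precision F, so treat this as an illustration rather than a reference for the library's API.

```python
import xlns as xl

xl.xlnssetF(10)       # set the LNS fractional precision, as the training script does

a = xl.xlns(3.0)      # scalars stored internally in a log-domain form
b = xl.xlns(4.0)

print(float(a * b))   # ~12.0: multiplication reduces to adding the stored logs
print(float(a + b))   # ~7.0: addition goes through the LNS approximation
```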
We start with a standard FP-based fully connected MLP implemented in PyTorch:
```python
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import time  # For calculating elapsed time

# Define the multi-layer perceptron (MLP) model with one hidden layer
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        # Input: 28*28 features; Hidden layer: 100 neurons; Output layer: 10 neurons
        self.fc1 = nn.Linear(28 * 28, 100)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(100, 10)
        self.logsoftmax = nn.LogSoftmax(dim=1)  # For stable outputs with NLLLoss

    def forward(self, x):
        # Flatten image: (batch_size, 1, 28, 28) -> (batch_size, 784)
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return self.logsoftmax(x)

def train_and_validate(num_epochs=5, batch_size=64, learning_rate=0.01, split_ratio=0.8):
    # Set the device to GPU if available
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Training on device: {device}")

    # Transformation for MNIST: convert to tensor and normalize
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,))
    ])

    # Load the MNIST training dataset
    train_dataset_full = torchvision.datasets.MNIST(
        root='./data', train=True, transform=transform, download=True
    )

    # Split the dataset into training and validation sets
    n_total = len(train_dataset_full)
    n_train = int(split_ratio * n_total)
    n_val = n_total - n_train
    train_dataset, val_dataset = torch.utils.data.random_split(train_dataset_full, [n_train, n_val])

    # Create DataLoaders for training and validation datasets
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

    # Initialize the model, loss function, and optimizer; move model to device
    model = MLP().to(device)
    criterion = nn.NLLLoss()
    optimizer = optim.SGD(model.parameters(), lr=learning_rate)

    # Lists to store training and validation accuracies for each epoch
    train_accuracies = []
    val_accuracies = []

    # Record the start time for measuring elapsed time
    start_time = time.time()

    # Training loop
    for epoch in range(num_epochs):
        model.train()
        running_loss = 0.0
        correct_train = 0
        total_train = 0
        for inputs, labels in train_loader:
            # Move inputs and labels to device
            inputs, labels = inputs.to(device), labels.to(device)

            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            # Compute running loss and training accuracy
            running_loss += loss.item() * inputs.size(0)
            _, predicted = torch.max(outputs.data, 1)
            total_train += labels.size(0)
            correct_train += (predicted == labels).sum().item()

        train_accuracy = 100.0 * correct_train / total_train
        train_accuracies.append(train_accuracy)

        # Evaluate on validation set
        model.eval()
        correct_val = 0
        total_val = 0
        with torch.no_grad():
            for inputs, labels in val_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                _, predicted = torch.max(outputs.data, 1)
                total_val += labels.size(0)
                correct_val += (predicted == labels).sum().item()

        val_accuracy = 100.0 * correct_val / total_val
        val_accuracies.append(val_accuracy)

        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/total_train:.4f}, "
              f"Train Acc: {train_accuracy:.2f}%, Val Acc: {val_accuracy:.2f}%")

    # Calculate elapsed time
    elapsed_time = time.time() - start_time
    print(f"Training completed in {elapsed_time:.2f} seconds.")

    # Show sample predictions from the validation set
    show_predictions(model, val_loader, device)

    # Optional: plot training and validation accuracies
    epochs_arr = np.arange(1, num_epochs + 1)
    plt.figure(figsize=(10, 6))
    plt.plot(epochs_arr, train_accuracies, label='Training Accuracy', marker='o')
    plt.plot(epochs_arr, val_accuracies, label='Validation Accuracy', marker='x')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy (%)')
    plt.title('Training and Validation Accuracies')
    plt.legend()
    plt.grid(True)
    plt.savefig('pytorch_accuracy.png')
    plt.show()

def show_predictions(model, data_loader, device, num_images=6):
    """
    Displays a few sample images from the data_loader along with the model's predictions.
    """
    model.eval()
    images_shown = 0
    plt.figure(figsize=(12, 8))

    # Get one batch of images from the validation dataset
    for inputs, labels in data_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        with torch.no_grad():
            outputs = model(inputs)
            _, predicted = torch.max(outputs, 1)

        # Loop through the batch and plot images
        for i in range(inputs.size(0)):
            if images_shown >= num_images:
                break
            # Move the image to cpu and convert to numpy for plotting
            img = inputs[i].cpu().squeeze()
            plt.subplot(2, num_images // 2, images_shown + 1)
            plt.imshow(img, cmap='gray')
            plt.title(f"Pred: {predicted[i].item()}")
            plt.axis('off')
            images_shown += 1

        if images_shown >= num_images:
            break

    plt.suptitle("Sample Predictions from the Validation Set")
    plt.tight_layout()
    plt.show()

if __name__ == '__main__':
    train_and_validate(num_epochs=5, batch_size=64, learning_rate=0.01, split_ratio=0.8)
```
This implementation follows the traditional deep learning pipeline, in which multiplications and additions are performed with FP arithmetic.
Below is a detailed breakdown of the PyTorch implementation of the multi-layer perceptron (MLP) for the MNIST dataset.
```python
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 100)   # First fully connected layer
        self.relu = nn.ReLU()                # Activation function
        self.fc2 = nn.Linear(100, 10)        # Output layer
        self.logsoftmax = nn.LogSoftmax(dim=1)
```
```python
def forward(self, x):
    x = x.view(x.size(0), -1)  # Flatten: (batch_size, 1, 28, 28) -> (batch_size, 784)
    x = self.fc1(x)            # First layer
    x = self.relu(x)           # Activation
    x = self.fc2(x)            # Output layer
    return self.logsoftmax(x)  # Final activation
```
```python
def train_and_validate(num_epochs=5, batch_size=64, learning_rate=0.01, split_ratio=0.8):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Data preprocessing
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,))  # Normalize to [-1, 1]
    ])
```
Key components:

- GPU support through device selection
- Data normalization for better training
- Configurable hyperparameters
```python
train_dataset_full = torchvision.datasets.MNIST(
    root='./data', train=True, transform=transform, download=True
)

# Split into train/validation
n_train = int(split_ratio * n_total)
train_dataset, val_dataset = torch.utils.data.random_split(train_dataset_full, [n_train, n_val])
```
- Downloads the MNIST dataset if it is not already available
- Splits the data into training (80%) and validation (20%) sets
```python
for epoch in range(num_epochs):
    model.train()
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()              # Clear gradients
        outputs = model(inputs)            # Forward pass
        loss = criterion(outputs, labels)  # Calculate loss
        loss.backward()                    # Backward pass
        optimizer.step()                   # Update weights
```
The classic training procedure:

- Zero the gradients
- Forward pass
- Loss computation
- Backward pass
- Weight update
```python
model.eval()
with torch.no_grad():
    for inputs, labels in val_loader:
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total_val += labels.size(0)
        correct_val += (predicted == labels).sum().item()
```
Key features:

- Model set to evaluation mode
- No gradient computation needed
- Accuracy calculation
```python
def show_predictions(model, data_loader, device, num_images=6):
    model.eval()
    plt.figure(figsize=(12, 8))
    # Display predictions vs actual labels
```
- Displays sample predictions from the validation set
- Useful for a qualitative assessment
```python
# Training metrics
train_accuracies.append(train_accuracy)
val_accuracies.append(val_accuracy)

# Plot learning curves
plt.plot(epochs_arr, train_accuracies, label='Training Accuracy')
plt.plot(epochs_arr, val_accuracies, label='Validation Accuracy')
```
- Tracks training and validation accuracy
- Plots learning curves
- Measures training time
This provides a solid baseline for comparison with the LNS-based implementation, since it covers all the standard components of a deep learning pipeline using conventional floating-point arithmetic.
For LNS, we need the `xlns` library. Unlike FP, LNS replaces multiplication-heavy operations with additions in the logarithmic domain. PyTorch does not support this natively, so the LNS operations have to be applied manually where needed.
```python
import numpy as np
import matplotlib.pyplot as plt
import os
import time
import argparse
import xlns as xl
from tensorflow.keras.datasets import mnist  # Use Keras's MNIST loader

# If you are using fractional normalized LNS, make sure the following are uncommented
import xlnsconf.xlnsudFracnorm
xlnsconf.xlnsudFracnorm.ilog2 = xlnsconf.xlnsudFracnorm.ipallog2
xlnsconf.xlnsudFracnorm.ipow2 = xlnsconf.xlnsudFracnorm.ipalpow2

# Set global parameter in xlns
xl.xlnssetF(10)

def softmax(inp):
    max_vals = inp.max(axis=1)
    max_vals = xl.reshape(max_vals, (xl.size(max_vals), 1))
    u = xl.exp(inp - max_vals)
    v = u.sum(axis=1)
    v = v.reshape((xl.size(v), 1))
    u = u / v
    return u

def main(main_params):
    print("arbitrary base np LNS. Also xl.hstack, xl routines in softmax")
    print("testing new softmax and * instead of @ for delta")
    print("works with type " + main_params['type'])

    is_training = bool(main_params['is_training'])
    leaking_coeff = float(main_params['leaking_coeff'])
    batchsize = int(main_params['minibatch_size'])
    lr = float(main_params['learning_rate'])
    num_epoch = int(main_params['num_epoch'])
    _lambda = float(main_params['lambda'])
    ones = np.array(list(np.ones((batchsize, 1))))

    if is_training:
        # Load the MNIST dataset from Keras
        (x_train, y_train), (x_test, y_test) = mnist.load_data()

        # Normalize images to [0, 1]
        x_train = x_train.astype(np.float64) / 255.0
        x_test = x_test.astype(np.float64) / 255.0

        # One-hot encode the labels (assume 10 classes for MNIST digits 0-9)
        num_classes = 10
        y_train = np.eye(num_classes)[y_train]
        y_test = np.eye(num_classes)[y_test]

        # Flatten the images from (28, 28) to (784,)
        x_train = x_train.reshape(x_train.shape[0], -1)
        x_test = x_test.reshape(x_test.shape[0], -1)

        # Use a portion of the training data for validation (the 'split' index)
        split = int(main_params['split'])
        x_val = x_train[split:]
        y_val = y_train[split:]
        x_train = x_train[:split]
        y_train = y_train[:split]

        # If available, load pretrained weights; otherwise, initialize new random weights.
        if os.path.isfile("./weightin.npz"):
            print("using ./weightin.npz")
            randfile = np.load("./weightin.npz", "r")
            W1 = randfile["W1"]
            W2 = randfile["W2"]
            randfile.close()
        else:
            print("using new random weights")
            # Note: The input layer now has 785 neurons (784 features + 1 bias).
            W1 = np.array(list(np.random.normal(0, 0.1, (785, 100))))
            # The first hidden layer has 100 neurons; add bias so 101
            W2 = np.array(list(np.random.normal(0, 0.1, (101, 10))))
            np.savez_compressed("./weightout.npz", W1=W1, W2=W2)

        delta_W1 = np.array(list(np.zeros(W1.shape)))
        delta_W2 = np.array(list(np.zeros(W2.shape)))

        # Convert weights to desired type (xlns variants or float)
        if main_params['type'] == 'xlnsnp':
            lnsW1 = xl.xlnsnp(np.array(xl.xlnscopy(list(W1))))
            lnsW2 = xl.xlnsnp(np.array(xl.xlnscopy(list(W2))))
            lnsones = xl.xlnsnp(np.array(xl.xlnscopy(list(np.ones((batchsize, 1))))))
            lnsdelta_W1 = xl.xlnsnp(np.array(xl.xlnscopy(list(np.zeros(W1.shape)))))
            lnsdelta_W2 = xl.xlnsnp(np.array(xl.xlnscopy(list(np.zeros(W2.shape)))))
        elif main_params['type'] == 'xlnsnpv':
            lnsW1 = xl.xlnsnpv(np.array(xl.xlnscopy(list(W1))), 6)
            lnsW2 = xl.xlnsnpv(np.array(xl.xlnscopy(list(W2))), 6)
            lnsones = xl.xlnsnpv(np.array(xl.xlnscopy(list(np.ones((batchsize, 1))))))
            lnsdelta_W1 = xl.xlnsnpv(np.array(xl.xlnscopy(list(np.zeros(W1.shape)))))
            lnsdelta_W2 = xl.xlnsnpv(np.array(xl.xlnscopy(list(np.zeros(W2.shape)))))
        elif main_params['type'] == 'xlnsnpb':
            lnsW1 = xl.xlnsnpb(np.array(xl.xlnscopy(list(W1))), 2**2**-6)
            lnsW2 = xl.xlnsnpb(np.array(xl.xlnscopy(list(W2))), 2**2**-6)
            lnsones = xl.xlnsnpb(np.array(xl.xlnscopy(list(np.ones((batchsize, 1))))), 2**2**-xl.xlnsF)
            lnsdelta_W1 = xl.xlnsnpb(np.array(xl.xlnscopy(list(np.zeros(W1.shape)))), 2**2**-xl.xlnsF)
            lnsdelta_W2 = xl.xlnsnpb(np.array(xl.xlnscopy(list(np.zeros(W2.shape)))), 2**2**-xl.xlnsF)
        elif main_params['type'] == 'xlns':
            lnsW1 = np.array(xl.xlnscopy(list(W1)))
            lnsW2 = np.array(xl.xlnscopy(list(W2)))
            lnsones = np.array(xl.xlnscopy(list(np.ones((batchsize, 1)))))
            lnsdelta_W1 = np.array(xl.xlnscopy(list(np.zeros(W1.shape))))
            lnsdelta_W2 = np.array(xl.xlnscopy(list(np.zeros(W2.shape))))
        elif main_params['type'] == 'xlnsud':
            lnsW1 = np.array(xl.xlnscopy(list(W1), xl.xlnsud))
            lnsW2 = np.array(xl.xlnscopy(list(W2), xl.xlnsud))
            lnsones = np.array(xl.xlnscopy(list(np.ones((batchsize, 1))), xl.xlnsud))
            lnsdelta_W1 = np.array(xl.xlnscopy(list(np.zeros(W1.shape)), xl.xlnsud))
            lnsdelta_W2 = np.array(xl.xlnscopy(list(np.zeros(W2.shape)), xl.xlnsud))
        elif main_params['type'] == 'xlnsv':
            lnsW1 = np.array(xl.xlnscopy(list(W1), xl.xlnsv, 6))
            lnsW2 = np.array(xl.xlnscopy(list(W2), xl.xlnsv, 6))
            lnsones = np.array(xl.xlnscopy(list(np.ones((batchsize, 1))), xl.xlnsv))
            lnsdelta_W1 = np.array(xl.xlnscopy(list(np.zeros(W1.shape)), xl.xlnsv))
            lnsdelta_W2 = np.array(xl.xlnscopy(list(np.zeros(W2.shape)), xl.xlnsv))
        elif main_params['type'] == 'xlnsb':
            lnsW1 = np.array(xl.xlnscopy(list(W1), xl.xlnsb, 2**2**-6))
            lnsW2 = np.array(xl.xlnscopy(list(W2), xl.xlnsb, 2**2**-6))
            lnsones = np.array(xl.xlnscopy(list(np.ones((batchsize, 1))), xl.xlnsb, 2**2**-xl.xlnsF))
            lnsdelta_W1 = np.array(xl.xlnscopy(list(np.zeros(W1.shape)), xl.xlnsb, 2**2**-xl.xlnsF))
            lnsdelta_W2 = np.array(xl.xlnscopy(list(np.zeros(W2.shape)), xl.xlnsb, 2**2**-xl.xlnsF))
        elif main_params['type'] == 'float':
            lnsW1 = np.array(list(W1))
            lnsW2 = np.array(list(W2))
            lnsones = np.array(list(np.ones((batchsize, 1))))
            lnsdelta_W1 = np.array(list(np.zeros(W1.shape)))
            lnsdelta_W2 = np.array(list(np.zeros(W2.shape)))

        performance = {}
        performance['lnsacc_train'] = np.zeros(num_epoch)
        performance['lnsacc_val'] = np.zeros(num_epoch)
        start_time = time.process_time()

        # Training loop
        for epoch in range(num_epoch):
            print('At Epoch %d:' % (1 + epoch))
            # Loop through training batches
            for mbatch in range(int(split / batchsize)):
                start = mbatch * batchsize
                x = np.array(x_train[start:(start + batchsize)])
                y = np.array(y_train[start:(start + batchsize)])
                # At this point, each x is already flattened (batchsize x 784)
                # Conversion based on type
                if main_params['type'] == 'xlnsnp':
                    lnsx = xl.xlnsnp(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))))
                    lnsy = xl.xlnsnp(np.array(xl.xlnscopy(np.array(y, dtype=np.float64))))
                elif main_params['type'] == 'xlnsnpv':
                    lnsx = xl.xlnsnpv(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))))
                    lnsy = xl.xlnsnpv(np.array(xl.xlnscopy(np.array(y, dtype=np.float64))))
                elif main_params['type'] == 'xlnsnpb':
                    lnsx = xl.xlnsnpb(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))), 2**2**-xl.xlnsF)
                    lnsy = xl.xlnsnpb(np.array(xl.xlnscopy(np.array(y, dtype=np.float64))), 2**2**-xl.xlnsF)
                elif main_params['type'] == 'xlns':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64)))
                    lnsy = np.array(xl.xlnscopy(np.array(y, dtype=np.float64)))
                elif main_params['type'] == 'xlnsud':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsud))
                    lnsy = np.array(xl.xlnscopy(np.array(y, dtype=np.float64), xl.xlnsud))
                elif main_params['type'] == 'xlnsv':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv))
                    lnsy = np.array(xl.xlnscopy(np.array(y, dtype=np.float64), xl.xlnsv))
                elif main_params['type'] == 'xlnsb':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv, 2**2**-xl.xlnsF))
                    lnsy = np.array(xl.xlnscopy(np.array(y, dtype=np.float64), xl.xlnsv, 2**2**-xl.xlnsF))
                elif main_params['type'] == 'float':
                    lnsx = np.array(x, dtype=np.float64)
                    lnsy = np.array(y, dtype=np.float64)

                # Concatenate the bias "ones" with input features for the first layer
                lnss1 = xl.hstack((lnsones, lnsx)) @ lnsW1
                lnsmask = (lnss1 > 0) + (leaking_coeff * (lnss1 < 0))
                lnsa1 = lnss1 * lnsmask
                lnss2 = xl.hstack((lnsones, lnsa1)) @ lnsW2
                lnsa2 = softmax(lnss2)
                lnsgrad_s2 = (lnsa2 - lnsy) / batchsize
                lnsgrad_a1 = lnsgrad_s2 @ xl.transpose(lnsW2[1:])
                lnsdelta_W2 = xl.transpose(xl.hstack((lnsones, lnsa1))) * lnsgrad_s2
                lnsgrad_s1 = lnsmask * lnsgrad_a1
                lnsdelta_W1 = xl.transpose(xl.hstack((lnsones, lnsx))) * lnsgrad_s1
                lnsW2 -= (lr * (lnsdelta_W2 + (_lambda * lnsW2)))
                lnsW1 -= (lr * (lnsdelta_W1 + (_lambda * lnsW1)))

            print('#= ', split, ' batch=', batchsize, ' lr=', lr)

            lnscorrect_count = 0
            # Evaluate accuracy on training set
            for mbatch in range(int(split / batchsize)):
                start = mbatch * batchsize
                x = x_train[start:(start + batchsize)]
                y = y_train[start:(start + batchsize)]
                if main_params['type'] == 'xlnsnp':
                    lnsx = xl.xlnsnp(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))))
                elif main_params['type'] == 'xlnsnpv':
                    lnsx = xl.xlnsnpv(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))))
                elif main_params['type'] == 'xlnsnpb':
                    lnsx = xl.xlnsnpb(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))), 2**2**-xl.xlnsF)
                elif main_params['type'] == 'xlns':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64)))
                elif main_params['type'] == 'xlnsud':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsud))
                elif main_params['type'] == 'xlnsv':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv))
                elif main_params['type'] == 'xlnsb':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv, 2**2**-xl.xlnsF))
                elif main_params['type'] == 'float':
                    lnsx = np.array(x, dtype=np.float64)

                lnss1 = xl.hstack((lnsones, lnsx)) @ lnsW1
                lnsmask = (lnss1 > 0) + (leaking_coeff * (lnss1 < 0))
                lnsa1 = lnss1 * lnsmask
                lnss2 = xl.hstack((lnsones, lnsa1)) @ lnsW2
                lnscorrect_count += np.sum(np.argmax(y, axis=1) == xl.argmax(lnss2, axis=1))

            lnsaccuracy = lnscorrect_count / split
            print("train-set accuracy at epoch %d: %f" % (1 + epoch, lnsaccuracy))
            performance['lnsacc_train'][epoch] = 100 * lnsaccuracy

            lnscorrect_count = 0
            # Evaluate on the validation set
            for mbatch in range(int(split / batchsize)):
                start = mbatch * batchsize
                x = x_val[start:(start + batchsize)]
                y = y_val[start:(start + batchsize)]
                if main_params['type'] == 'xlnsnp':
                    lnsx = xl.xlnsnp(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))))
                elif main_params['type'] == 'xlnsnpv':
                    lnsx = xl.xlnsnpv(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))))
                elif main_params['type'] == 'xlnsnpb':
                    lnsx = xl.xlnsnpb(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))), 2**2**-xl.xlnsF)
                elif main_params['type'] == 'xlns':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64)))
                elif main_params['type'] == 'xlnsud':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsud))
                elif main_params['type'] == 'xlnsv':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv))
                elif main_params['type'] == 'xlnsb':
                    lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv, 2**2**-xl.xlnsF))
                elif main_params['type'] == 'float':
                    lnsx = np.array(x, dtype=np.float64)

                lnss1 = xl.hstack((lnsones, lnsx)) @ lnsW1
                lnsmask = (lnss1 > 0) + (leaking_coeff * (lnss1 < 0))
                lnsa1 = lnss1 * lnsmask
                lnss2 = xl.hstack((lnsones, lnsa1)) @ lnsW2
                lnscorrect_count += np.sum(np.argmax(y, axis=1) == xl.argmax(lnss2, axis=1))

            lnsaccuracy = lnscorrect_count / split
            print("Val-set accuracy at epoch %d: %f" % (1 + epoch, lnsaccuracy))
            performance['lnsacc_val'][epoch] = 100 * lnsaccuracy

        print("elapsed time=" + str(time.process_time() - start_time))

        fig = plt.figure(figsize=(16, 9))
        ax = fig.add_subplot(111)
        x_axis = range(1, 1 + performance['lnsacc_train'].size)
        ax.plot(x_axis, performance['lnsacc_train'], 'y')
        ax.plot(x_axis, performance['lnsacc_val'], 'm')
        ax.set_xlabel('Number of Epochs')
        ax.set_ylabel('Accuracy')
        plt.suptitle(main_params['type'] + ' ' + str(split) + ' Validation and Training MNIST Accuracies F=' + str(xl.xlnsF), fontsize=14)
        ax.legend(['train', 'validation'])
        plt.grid(which='both', axis='both', linestyle='-.')
        plt.savefig('genericaccuracy.png')
        plt.show()

        # Now, show predictions on a few test images
        num_examples = 5  # Number of test images to display
        selected_indices = np.arange(num_examples)  # choose the first few images for demo
        x_sample = x_test[selected_indices]
        y_sample = y_test[selected_indices]

        # For prediction, create a bias vector matching the sample size
        ones_sample = np.ones((x_sample.shape[0], 1))
        z1_sample = np.hstack((ones_sample, x_sample)) @ lnsW1
        mask_sample = (z1_sample > 0) + (leaking_coeff * (z1_sample < 0))
        a1_sample = z1_sample * mask_sample
        z2_sample = np.hstack((ones_sample, a1_sample)) @ lnsW2
        pred_probs = softmax(z2_sample)
        predictions = np.argmax(pred_probs, axis=1)
        true_labels = np.argmax(y_sample, axis=1)

        # Plot each test image along with its prediction and true label
        plt.figure(figsize=(10, 2))
        for i in range(num_examples):
            plt.subplot(1, num_examples, i + 1)
            # Reshape the flattened image back to 28x28 for display
            plt.imshow(x_sample[i].reshape(28, 28), cmap='gray')
            plt.title(f"Pred: {predictions[i]}\nTrue: {true_labels[i]}")
            plt.axis('off')
        plt.tight_layout()
        plt.show()

if __name__ == '__main__':
    # In a Kaggle notebook, set parameters manually using a dictionary.
    main_params = {
        'is_training': True,
        'split': 50,
        'learning_rate': 0.01,
        'lambda': 0.000,
        'minibatch_size': 1,
        'num_epoch': 5,
        'leaking_coeff': 0.0078125,
        'type': 'float'
    }
    main(main_params)
```
Let me walk through this code, which implements a Logarithmic Number System (LNS) multi-layer perceptron (MLP) for MNIST digit classification, section by section.
- The code uses the `xlns` library for logarithmic number system operations
- It offers several LNS variants (xlnsnp, xlnsnpv, xlnsud, and others) with different precision and performance trade-offs
- The MNIST dataset is loaded through Keras
```python
def softmax(inp):
    max_vals = inp.max(axis=1)
    max_vals = xl.reshape(max_vals, (xl.size(max_vals), 1))
    u = xl.exp(inp - max_vals)
    v = u.sum(axis=1)
    v = v.reshape((xl.size(v), 1))
    u = u / v
    return u
```
This is a numerically stable softmax implementation adapted to LNS operations: subtracting the row-wise maximum before exponentiating prevents overflow without changing the result.
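A quick NumPy-only sketch (separate from the LNS code) shows why the max-subtraction trick is used:

```python
import numpy as np

def naive_softmax(z):
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def stable_softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row-wise maximum first
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Subtracting the max does not change the result...
logits = np.array([[2.0, 1.0, 0.1]])
print(np.allclose(naive_softmax(logits), stable_softmax(logits)))  # True

# ...but it prevents overflow for large logits
big = np.array([[1000.0, 0.0]])
print(stable_softmax(big))  # [[1. 0.]]
print(naive_softmax(big))   # [[nan  0.]] plus an overflow warning
```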
- Input layer: 784 neurons (28x28 flattened MNIST images) + 1 bias = 785
- Hidden layer: 100 neurons + 1 bias = 101
- Output layer: 10 neurons (one per digit)
- Weights are loaded from a file ("./weightin.npz") if present, otherwise initialized randomly
- Random weights are drawn from a normal distribution with mean = 0 and std = 0.1
- Different LNS variants require different initialization calls (xlnsnp, xlnsnpv, and so on)
```python
for epoch in range(num_epoch):
    for mbatch in range(int(split / batchsize)):
        # Forward pass
        lnss1 = xl.hstack((lnsones, lnsx)) @ lnsW1
        lnsmask = (lnss1 > 0) + (leaking_coeff * (lnss1 < 0))
        lnsa1 = lnss1 * lnsmask
        lnss2 = xl.hstack((lnsones, lnsa1)) @ lnsW2
        lnsa2 = softmax(lnss2)

        # Backward pass
        lnsgrad_s2 = (lnsa2 - lnsy) / batchsize
        lnsgrad_a1 = lnsgrad_s2 @ xl.transpose(lnsW2[1:])
        lnsdelta_W2 = xl.transpose(xl.hstack((lnsones, lnsa1))) * lnsgrad_s2
        lnsgrad_s1 = lnsmask * lnsgrad_a1
        lnsdelta_W1 = xl.transpose(xl.hstack((lnsones, lnsx))) * lnsgrad_s1
```
Key aspects of the training loop:

- Uses a leaky ReLU activation, controlled by `leaking_coeff` (see the short sketch after this list)
- Implements standard backpropagation, but with LNS operations
- Includes L2 regularization (the `lambda` parameter)
- Updates the weights by gradient descent with learning rate `lr`
- Tracks training and validation accuracy
- Plots learning curves of accuracy over the epochs
- Shows sample predictions on test images
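The leaky ReLU in the script is built from comparison masks rather than a library call. Here is a small NumPy-only sketch of that trick with the same coefficient:

```python
import numpy as np

leaking_coeff = 0.0078125   # the same 1/128 used in main_params
s1 = np.array([-2.0, -0.5, 0.0, 1.5])

# (s1 > 0) contributes 1 for positive entries; leaking_coeff * (s1 < 0) handles negatives
mask = (s1 > 0) + (leaking_coeff * (s1 < 0))
a1 = s1 * mask

print(mask)  # [0.0078125 0.0078125 0.        1.       ]
print(a1)    # [-0.015625   -0.00390625  0.          1.5       ]
```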
```python
main_params = {
    'is_training': True,
    'split': 50,
    'learning_rate': 0.01,
    'lambda': 0.000,
    'minibatch_size': 1,
    'num_epoch': 5,
    'leaking_coeff': 0.0078125,
    'type': 'float'
}
```
- Uses mini-batch gradient descent (default batch size = 1)
- Splits off a validation set to support early stopping
- Sets the leaky ReLU coefficient to 0.0078125 (that is, 2^-7)
The key novelty here is the use of LNS arithmetic, which replaces multiplications with additions in the log domain and can offer better computational efficiency on suitable hardware implementations. The code supports several LNS variants, allowing different precision and efficiency trade-offs.
```
Training on device: cuda
Epoch [1/5], Loss: 0.8540, Train Acc: 79.60%, Val Acc: 88.22%
Epoch [2/5], Loss: 0.3917, Train Acc: 88.97%, Val Acc: 89.92%
Epoch [3/5], Loss: 0.3380, Train Acc: 90.29%, Val Acc: 90.60%
Epoch [4/5], Loss: 0.3104, Train Acc: 90.96%, Val Acc: 91.12%
Epoch [5/5], Loss: 0.2901, Train Acc: 91.60%, Val Acc: 91.62%
Training completed in 57.76 seconds.
```
```
At Epoch 1:
train-set accuracy at epoch 1: 52.00%
Val-set accuracy at epoch 1: 24.00%
At Epoch 2:
train-set accuracy at epoch 2: 74.00%
Val-set accuracy at epoch 2: 40.00%
At Epoch 3:
train-set accuracy at epoch 3: 86.00%
Val-set accuracy at epoch 3: 58.00%
At Epoch 4:
train-set accuracy at epoch 4: 94.00%
Val-set accuracy at epoch 4: 70.00%
At Epoch 5:
train-set accuracy at epoch 5: 96.00%
Val-set accuracy at epoch 5: 68.00%
elapsed time = 0.35 seconds.
```
| Aspect | Floating-Point (FP) | Logarithmic Number System (LNS) |
|---|---|---|
| Training time | 57.76 s | 0.35 s |
| Train accuracy | 91.60% | 96.00% |
| Validation accuracy | 91.62% | 68.00% |
| Precision | Higher | Lower (approximation errors) |
| Memory efficiency | Higher usage | Smaller memory footprint |
| Handling of multiplication | Native multiplication | Addition-based simplification |
The trade-offs between the Logarithmic Number System (LNS) and floating-point (FP) arithmetic make an interesting case study in hardware-software co-design for neural networks. LNS offers real advantages in certain areas: dramatically shorter training time, a smaller memory footprint, and lower power consumption on suitable hardware.
However, the accuracy challenges are significant: in this experiment, the approximations required for log-domain addition and subtraction cost a large share of validation accuracy (68.00% versus 91.62% for FP).
Several promising approaches could broaden the applicability of LNS.
To integrate LNS into deep learning frameworks, several directions are worth exploring.
The deep learning community stands to benefit greatly from integrating these capabilities into mainstream frameworks, enabling more efficient, lower-power, and faster neural networks.
What do you think about the balance between numerical accuracy and computational efficiency? Have you come across use cases where LNS would be especially useful?

Let me know your thoughts.