Deep Learning Runs on Floating-Point Arithmetic. What If That's a Mistake?

by @abhiyanampally_kob9nse8


2025/02/11 · 40m read

Too Long; Didn't Read

The Logarithmic Number System (LNS) is an alternative numerical representation to Floating-Point (FP) arithmetic. LNS represents numbers on a logarithmic scale, turning multiplications into additions, which can be cheaper to compute on certain hardware architectures. However, addition and subtraction in LNS require approximations, which leads to reduced precision. We use LNS to train a simple fully connected Multi-Layer Perceptron on MNIST.

When I first came across the idea of using the Logarithmic Number System (LNS) in deep learning, I was intrigued but also skeptical. Like most of us, I had always worked with Floating-Point (FP) arithmetic, the standard for numerical computation in deep learning. FP offers a good balance between precision and range, but it comes with trade-offs: higher memory usage, more complex computation, and greater power consumption. So I decided to experiment and see for myself: how does LNS compare to FP when training a simple fully connected Multi-Layer Perceptron (MLP) on MNIST?

Why Consider LNS?

LNS represents numbers on a logarithmic scale, turning multiplications into additions, which can be cheaper to compute on certain hardware architectures. That efficiency comes at the cost of precision, especially for addition and subtraction, which are much more complicated in LNS. Still, the potential benefits (reduced memory footprint, faster computation, and lower power consumption) made me curious enough to try it.

Background: Floating-Point vs. Logarithmic Number Systems

Floating-Point (FP) Representation

Floating-Point numbers are the standard numerical representation in most deep learning frameworks, such as PyTorch and TensorFlow. FP numbers have:


  • A sign bit (indicating a positive or negative value).
  • An exponent (the scaling factor).
  • A mantissa (significand) (the precision of the number).


FP32 (single precision) is the common choice in deep learning, offering a balance between numerical precision and computational efficiency. More compact formats such as FP16 and BF16 are gaining popularity for speeding up training.
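
To get a feel for what these formats cost in practice, the short snippet below (my own illustration, using only standard PyTorch) prints the per-element storage and the rough numeric properties of FP32, FP16, and BF16:

import torch

# Compare storage cost and numeric properties of common FP formats.
for dtype in (torch.float32, torch.float16, torch.bfloat16):
    info = torch.finfo(dtype)
    bytes_per_elem = torch.tensor([], dtype=dtype).element_size()
    print(f"{str(dtype):15s} {bytes_per_elem} bytes/element, "
          f"eps={info.eps:.1e}, max={info.max:.2e}")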

Logarithmic Number System (LNS)

LNS is an alternative numerical representation in which numbers are stored as logarithms: [ x = \log_b (y) ] where ( b ) is the logarithm base. LNS has several advantages:


  • Multiplication is simplified to addition: ( x_1 \times x_2 = b^{(\log_b x_1 + \log_b x_2)} )
  • Division is simplified to subtraction: ( x_1 / x_2 = b^{(\log_b x_1 - \log_b x_2)} )
  • Exponential growth functions become linear


However, addition and subtraction in LNS require approximations, which leads to reduced precision.
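
As a quick sanity check of the identities above (my own illustration, independent of the xlns library), plain NumPy in a base-2 log domain gives:

import numpy as np

x, y = 2.0, 3.0
lx, ly = np.log2(x), np.log2(y)   # move to the log domain

print(2 ** (lx + ly), x * y)      # multiplication becomes addition: 6.0 6.0
print(2 ** (lx - ly), x / y)      # division becomes subtraction: 0.666... 0.666...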

LNS Arithmetic Operations

To explore LNS further, I implemented the basic arithmetic operations (addition, subtraction, multiplication, and division) using LNS internal representations.


import torch
import numpy as np
import xlns as xl  # Assuming xlns module is installed and provides xlnsnp

# Function to convert floating-point numbers to xlns internal representation
def float_to_internal(arr):
    xlns_data = xl.xlnsnp(arr)
    return xlns_data.nd

# Function to convert xlns internal representation back to floating-point numbers
def internal_to_float(internal_data):
    original_numbers = []
    for value in internal_data:
        x = value // 2
        s = value % 2
        # Use x and s to create xlns object
        xlns_value = xl.xlns(0)
        xlns_value.x = x
        xlns_value.s = s
        original_numbers.append(float(xlns_value))
    return original_numbers

# Function to perform LNS addition using internal representation
def lns_add_internal(x, y):
    max_part = torch.maximum(x, y)
    diff = torch.abs(x - y)
    adjust_term = torch.log1p(torch.exp(-diff))
    return max_part + adjust_term

# Function to perform LNS subtraction using internal representation
def lns_sub_internal(x, y):
    return lns_add_internal(x, -y)

# Function to perform LNS multiplication using internal representation
def lns_mul_internal(x, y):
    return x + y

# Function to perform LNS division using internal representation
def lns_div_internal(x, y):
    return x - y

# Input floating-point arrays
x_float = [2.0, 3.0]
y_float = [-1.0, 0.0]

# Convert floating-point arrays to xlns internal representation
x_internal = float_to_internal(x_float)
y_internal = float_to_internal(y_float)

# Create tensors from the internal representation
tensor_x_nd = torch.tensor(x_internal, dtype=torch.int64)
tensor_y_nd = torch.tensor(y_internal, dtype=torch.int64)

# Perform the toy LNS addition on the internal representation
result_add_internal = lns_add_internal(tensor_x_nd, tensor_y_nd)

# Perform the toy LNS subtraction on the internal representation
result_sub_internal = lns_sub_internal(tensor_x_nd, tensor_y_nd)

# Perform the toy LNS multiplication on the internal representation
result_mul_internal = lns_mul_internal(tensor_x_nd, tensor_y_nd)

# Perform the toy LNS division on the internal representation
result_div_internal = lns_div_internal(tensor_x_nd, tensor_y_nd)

# Convert the internal results back to original floating-point values
result_add_float = internal_to_float(result_add_internal.numpy())
result_sub_float = internal_to_float(result_sub_internal.numpy())
result_mul_float = internal_to_float(result_mul_internal.numpy())
result_div_float = internal_to_float(result_div_internal.numpy())

# Convert the results back to PyTorch tensors
result_add_tensor = torch.tensor(result_add_float, dtype=torch.float32)
result_sub_tensor = torch.tensor(result_sub_float, dtype=torch.float32)
result_mul_tensor = torch.tensor(result_mul_float, dtype=torch.float32)
result_div_tensor = torch.tensor(result_div_float, dtype=torch.float32)

# Print results
print("Input x:", x_float)
print("Input y:", y_float)
print("Addition Result:", result_add_float)
print("Addition Result Tensor:", result_add_tensor)
print("Subtraction Result:", result_sub_float)
print("Subtraction Result Tensor:", result_sub_tensor)
print("Multiplication Result:", result_mul_float)
print("Multiplication Result Tensor:", result_mul_tensor)
print("Division Result:", result_div_float)
print("Division Result Tensor:", result_div_tensor)


Here is a breakdown of my experimental implementation of the Logarithmic Number System (LNS).

1. The Core Idea of LNS and the Challenges in PyTorch

In LNS, numbers are represented as logarithms, which turns multiplication and division into addition and subtraction. However, implementing this in PyTorch poses specific challenges because PyTorch tensors use floating-point representations internally. This creates several requirements:


  • Maintain the logarithmic representation across computations.
  • Ensure numerical stability.
  • Handle conversions carefully.
  • Manage the internal representation using two components (a minimal packing sketch follows this list):
    • x : the logarithmic value.
    • s : the sign bit (0 or 1).
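
Based on how internal_to_float in the listing above decodes values (x = value // 2, s = value % 2), a minimal sketch of this packing scheme could look like the following. The pack_lns/unpack_lns names and the 10 fractional bits (mirroring the xl.xlnssetF(10) setting used later) are my own assumptions; the exact xlns encoding may differ, and zero would need special handling:

import math

F = 10  # assumed number of fractional bits in the fixed-point log value

def pack_lns(value: float) -> int:
    """Pack log2|value| (scaled to an integer) plus a sign bit in the LSB."""
    sign = 0 if value >= 0 else 1
    log_part = int(round(math.log2(abs(value)) * 2**F))
    return log_part * 2 + sign

def unpack_lns(packed: int) -> float:
    log_part, sign = packed // 2, packed % 2
    magnitude = 2.0 ** (log_part / 2**F)
    return -magnitude if sign else magnitude

print(unpack_lns(pack_lns(3.0)))    # ≈ 3.0, up to quantization error
print(unpack_lns(pack_lns(-0.25)))  # ≈ -0.25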

2. Internal Representation and Conversion

The first step is converting floating-point numbers into their internal LNS representation.

import torch
import numpy as np
import xlns as xl  # External LNS library

def float_to_internal(arr):
    xlns_data = xl.xlnsnp(arr)
    return xlns_data.nd

# Convert floating-point arrays to xlns internal representation
x_float = np.array([2.0, 3.0])
y_float = np.array([-1.0, 0.0])
x_internal = float_to_internal(x_float)
y_internal = float_to_internal(y_float)

# Create tensors from the internal representation
tensor_x_nd = torch.tensor(x_internal, dtype=torch.int64)
tensor_y_nd = torch.tensor(y_internal, dtype=torch.int64)


Using dtype=torch.int64 matters because it:

  • Preserves the exact LNS internal representation without floating-point rounding errors.
  • Packs both the logarithmic value and the sign bit into a single integer.
  • Prevents unintended floating-point operations from corrupting the LNS representation.

3. Core Arithmetic Operations

a) Multiplication

 def lns_mul_internal(x, y): return x + y

Multiplication in LNS becomes addition:

  • If a = log(x) and b = log(y), then log(x×y) = log(x) + log(y).

b) Division

 def lns_div_internal(x, y): return x - y

Division becomes subtraction:

  • log(x/y) = log(x) - log(y).

c) Addition

def lns_add_internal(x, y):
    max_part = torch.maximum(x, y)
    diff = torch.abs(x - y)
    adjust_term = torch.log1p(torch.exp(-diff))
    return max_part + adjust_term


Addition is more complex and numerically sensitive because:

  • It involves exponential and logarithmic operations.
  • A naive floating-point implementation could overflow or underflow.
  • It uses the identity log(x + y) = log(max(x,y)) + log(1 + exp(log(min(x,y)) - log(max(x,y)))), verified numerically below.
  • It uses log1p instead of log(1 + x) for better numerical stability.
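
To make the identity concrete, a small check (my own illustration) shows that the log1p form reproduces log(x + y) for the sample values used earlier:

import math

x, y = 2.0, 3.0
lx, ly = math.log(x), math.log(y)

direct = math.log(x + y)
via_identity = max(lx, ly) + math.log1p(math.exp(-abs(lx - ly)))

print(direct, via_identity)   # both ≈ 1.6094, i.e. log(5)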

4. Type Safety and Conversion Management

def internal_to_float(internal_data):
    for value in internal_data:
        x = value // 2  # Integer division
        s = value % 2   # Integer modulo


The conversion pipeline maintains a clear separation of concerns:

  1. Convert from float → internal LNS representation (integers).
  2. Perform LNS operations using integer arithmetic.
  3. Convert back to floating point only when needed.
# Convert results back to float and tensor
result_add_float = internal_to_float(result_add_internal.numpy())
result_add_tensor = torch.tensor(result_add_float, dtype=torch.float32)

5. Key Benefits and Limitations

Benefits

  • Multiplication and division are simplified to addition and subtraction.
  • Wide dynamic range compared with fixed-point arithmetic.
  • Potentially very efficient for certain applications.

Limitations

  • Addition and subtraction are more complex operations.
  • Conversion overhead between floating point and LNS.
  • Requires special handling of zero and negative numbers.
  • PyTorch tensor compatibility requires careful type management.

6. Optimization Opportunities

To improve performance, one could:

  1. Implement a custom PyTorch autograd function for LNS operations (a minimal sketch follows this list).
  2. Design a custom tensor type that natively supports LNS.
  3. Use CUDA kernels for efficient LNS operations on GPU.
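
As a rough sketch of option 1 (my own illustration, not part of the experiments in this article), an LNS-domain multiplication can be wrapped in a custom torch.autograd.Function. The tensors are assumed to already hold log-domain values, and the class name LNSMul is hypothetical:

import torch

class LNSMul(torch.autograd.Function):
    """Multiply two positive numbers given by their logarithms.

    In the log domain multiplication is addition, and the derivative of
    (a + b) with respect to each input is 1, so backward simply passes
    the upstream gradient through.
    """

    @staticmethod
    def forward(ctx, log_a, log_b):
        return log_a + log_b           # log(x * y) = log x + log y

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, grad_output

# Usage: multiply 2.0 and 3.0 entirely in the log domain.
log_a = torch.log(torch.tensor(2.0))
log_b = torch.log(torch.tensor(3.0))
print(torch.exp(LNSMul.apply(log_a, log_b)))   # ≈ 6.0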


The current implementation makes practical trade-offs:

  • It prioritizes clarity and maintainability over maximum performance.
  • It uses PyTorch's existing tensor infrastructure while preserving LNS precision.
  • It maintains numerical stability through careful type management.
  • It minimizes conversions between representations.

7. Example Data Flow

The following steps show the complete pipeline using the sample values [2.0, 3.0] and [-1.0, 0.0]:

  1. Convert the input floats to the internal LNS representation.
  2. Create integer tensors holding the LNS values.
  3. Perform the arithmetic operations in the LNS domain.
  4. Convert the results back to floating point.
  5. Create final PyTorch tensors for further processing.


This implementation successfully bridges the gap between PyTorch's floating-point tensor processing and LNS arithmetic while preserving numerical stability and accuracy.


Training a Fully Connected MLP on the MNIST Digit Dataset with FP and LNS

Experimental Setup

I trained a fully connected MLP on the MNIST dataset using both the FP and LNS representations. The model architecture was simple:

  • Input layer: 784 neurons (flattened 28x28 images)
  • Hidden layers: two layers with 256 and 128 neurons, ReLU activations
  • Output layer: 10 neurons (one per digit, with softmax)
  • Loss function: cross-entropy
  • Optimizer: Adam


For the LNS implementation, I had to step outside my usual PyTorch workflow. Unlike FP, which PyTorch supports natively, PyTorch provides no built-in LNS operations. I found a GitHub project called xlns, which implements logarithmic number representations and arithmetic, making it possible to use LNS inside neural networks.

Floating-Point MLP in PyTorch

We start by implementing a fully connected, FP-based MLP using PyTorch:

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
import time  # For calculating elapsed time

# Define the multi-layer perceptron (MLP) model with one hidden layer
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        # Input: 28*28 features; Hidden layer: 100 neurons; Output layer: 10 neurons
        self.fc1 = nn.Linear(28 * 28, 100)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(100, 10)
        self.logsoftmax = nn.LogSoftmax(dim=1)  # For stable outputs with NLLLoss

    def forward(self, x):
        # Flatten image: (batch_size, 1, 28, 28) -> (batch_size, 784)
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return self.logsoftmax(x)

def train_and_validate(num_epochs=5, batch_size=64, learning_rate=0.01, split_ratio=0.8):
    # Set the device to GPU if available
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Training on device: {device}")

    # Transformation for MNIST: convert to tensor and normalize
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,))
    ])

    # Load the MNIST training dataset
    train_dataset_full = torchvision.datasets.MNIST(
        root='./data', train=True, transform=transform, download=True
    )

    # Split the dataset into training and validation sets
    n_total = len(train_dataset_full)
    n_train = int(split_ratio * n_total)
    n_val = n_total - n_train
    train_dataset, val_dataset = torch.utils.data.random_split(train_dataset_full, [n_train, n_val])

    # Create DataLoaders for training and validation datasets
    train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

    # Initialize the model, loss function, and optimizer; move model to device
    model = MLP().to(device)
    criterion = nn.NLLLoss()
    optimizer = optim.SGD(model.parameters(), lr=learning_rate)

    # Lists to store training and validation accuracies for each epoch
    train_accuracies = []
    val_accuracies = []

    # Record the start time for measuring elapsed time
    start_time = time.time()

    # Training loop
    for epoch in range(num_epochs):
        model.train()
        running_loss = 0.0
        correct_train = 0
        total_train = 0
        for inputs, labels in train_loader:
            # Move inputs and labels to device
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            # Compute running loss and training accuracy
            running_loss += loss.item() * inputs.size(0)
            _, predicted = torch.max(outputs.data, 1)
            total_train += labels.size(0)
            correct_train += (predicted == labels).sum().item()

        train_accuracy = 100.0 * correct_train / total_train
        train_accuracies.append(train_accuracy)

        # Evaluate on validation set
        model.eval()
        correct_val = 0
        total_val = 0
        with torch.no_grad():
            for inputs, labels in val_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                _, predicted = torch.max(outputs.data, 1)
                total_val += labels.size(0)
                correct_val += (predicted == labels).sum().item()
        val_accuracy = 100.0 * correct_val / total_val
        val_accuracies.append(val_accuracy)

        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/total_train:.4f}, "
              f"Train Acc: {train_accuracy:.2f}%, Val Acc: {val_accuracy:.2f}%")

    # Calculate elapsed time
    elapsed_time = time.time() - start_time
    print(f"Training completed in {elapsed_time:.2f} seconds.")

    # Show sample predictions from the validation set
    show_predictions(model, val_loader, device)

    # Optional: plot training and validation accuracies
    epochs_arr = np.arange(1, num_epochs + 1)
    plt.figure(figsize=(10, 6))
    plt.plot(epochs_arr, train_accuracies, label='Training Accuracy', marker='o')
    plt.plot(epochs_arr, val_accuracies, label='Validation Accuracy', marker='x')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy (%)')
    plt.title('Training and Validation Accuracies')
    plt.legend()
    plt.grid(True)
    plt.savefig('pytorch_accuracy.png')
    plt.show()

def show_predictions(model, data_loader, device, num_images=6):
    """
    Displays a few sample images from the data_loader along with the model's predictions.
    """
    model.eval()
    images_shown = 0
    plt.figure(figsize=(12, 8))
    # Get one batch of images from the validation dataset
    for inputs, labels in data_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        with torch.no_grad():
            outputs = model(inputs)
            _, predicted = torch.max(outputs, 1)
        # Loop through the batch and plot images
        for i in range(inputs.size(0)):
            if images_shown >= num_images:
                break
            # Move the image to cpu and convert to numpy for plotting
            img = inputs[i].cpu().squeeze()
            plt.subplot(2, num_images // 2, images_shown + 1)
            plt.imshow(img, cmap='gray')
            plt.title(f"Pred: {predicted[i].item()}")
            plt.axis('off')
            images_shown += 1
        if images_shown >= num_images:
            break
    plt.suptitle("Sample Predictions from the Validation Set")
    plt.tight_layout()
    plt.show()

if __name__ == '__main__':
    train_and_validate(num_epochs=5, batch_size=64, learning_rate=0.01, split_ratio=0.8)


This implementation follows a standard deep learning pipeline in which multiplications and additions are handled with FP arithmetic.


Here is a complete walkthrough of this PyTorch implementation of a Multi-Layer Perceptron (MLP) for the MNIST dataset.

  1. Model Architecture (MLP Class):
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 100)      # First fully connected layer
        self.relu = nn.ReLU()                   # Activation function
        self.fc2 = nn.Linear(100, 10)           # Output layer
        self.logsoftmax = nn.LogSoftmax(dim=1)
  2. Forward Pass:
def forward(self, x):
    x = x.view(x.size(0), -1)   # Flatten: (batch_size, 1, 28, 28) -> (batch_size, 784)
    x = self.fc1(x)             # First layer
    x = self.relu(x)            # Activation
    x = self.fc2(x)             # Output layer
    return self.logsoftmax(x)   # Final activation
  3. Training Setup:
def train_and_validate(num_epochs=5, batch_size=64, learning_rate=0.01, split_ratio=0.8):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    # Data preprocessing
    transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5,), (0.5,))  # Normalize to [-1, 1]
    ])

Key features:

  • GPU support through device selection
  • Data normalization for better training
  • Configurable hyperparameters


  4. Dataset Management:
train_dataset_full = torchvision.datasets.MNIST(
    root='./data', train=True, transform=transform, download=True
)

# Split into train/validation
n_train = int(split_ratio * n_total)
train_dataset, val_dataset = torch.utils.data.random_split(train_dataset_full, [n_train, n_val])
  • Downloads the MNIST dataset if it is not already present
  • Splits the data into training (80%) and validation (20%) sets


  5. Training Loop:
for epoch in range(num_epochs):
    model.train()
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()               # Clear gradients
        outputs = model(inputs)             # Forward pass
        loss = criterion(outputs, labels)   # Calculate loss
        loss.backward()                     # Backward pass
        optimizer.step()                    # Update weights

The classic training procedure:

  • Zero the gradients
  • Forward pass
  • Loss computation
  • Backward pass
  • Weight updates


  6. Validation:
model.eval()
with torch.no_grad():
    for inputs, labels in val_loader:
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total_val += labels.size(0)
        correct_val += (predicted == labels).sum().item()

Key aspects:

  • The model is switched to evaluation mode
  • No gradient computation is needed
  • Accuracy is computed


  7. Visualization:
def show_predictions(model, data_loader, device, num_images=6):
    model.eval()
    plt.figure(figsize=(12, 8))
    # Display predictions vs actual labels
  • Shows sample predictions from the validation set
  • Useful for qualitative evaluation


  8. Performance Tracking:
# Training metrics
train_accuracies.append(train_accuracy)
val_accuracies.append(val_accuracy)

# Plot learning curves
plt.plot(epochs_arr, train_accuracies, label='Training Accuracy')
plt.plot(epochs_arr, val_accuracies, label='Validation Accuracy')
  • Tracks training and validation accuracy
  • Plots learning curves
  • Measures training time


This provides a solid baseline for comparison with LNS-based implementations, since it covers all the standard components of a deep learning pipeline using conventional floating-point arithmetic.

Logarithmic Number System (LNS) MLP

For LNS, we need the xlns library. Unlike FP, LNS replaces expensive multiplications with additions in the logarithmic domain. However, PyTorch does not support this natively, so we have to apply the LNS operations manually where appropriate.
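
As a tiny illustration of the library's scalar type before diving into the full script (based on how xl.xlns objects are used below; the exact behavior may vary across xlns versions):

import xlns as xl

a = xl.xlns(2.0)      # stored internally as a (log value, sign) pair
b = xl.xlns(3.0)
print(float(a * b))   # ≈ 6.0; the product is computed by adding the log parts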

 import numpy as np import matplotlib.pyplot as plt import os import time import argparse import xlns as xl from tensorflow.keras.datasets import mnist # Use Keras's MNIST loader # If you are using fractional normalized LNS, make sure the following are uncommented import xlnsconf.xlnsudFracnorm xlnsconf.xlnsudFracnorm.ilog2 = xlnsconf.xlnsudFracnorm.ipallog2 xlnsconf.xlnsudFracnorm.ipow2 = xlnsconf.xlnsudFracnorm.ipalpow2 # Set global parameter in xlns xl.xlnssetF(10) def softmax(inp): max_vals = inp.max(axis=1) max_vals = xl.reshape(max_vals, (xl.size(max_vals), 1)) u = xl.exp(inp - max_vals) v = u.sum(axis=1) v = v.reshape((xl.size(v), 1)) u = u / v return u def main(main_params): print("arbitrary base np LNS. Also xl.hstack, xl routines in softmax") print("testing new softmax and * instead of @ for delta") print("works with type " + main_params['type']) is_training = bool(main_params['is_training']) leaking_coeff = float(main_params['leaking_coeff']) batchsize = int(main_params['minibatch_size']) lr = float(main_params['learning_rate']) num_epoch = int(main_params['num_epoch']) _lambda = float(main_params['lambda']) ones = np.array(list(np.ones((batchsize, 1)))) if is_training: # Load the MNIST dataset from Keras (x_train, y_train), (x_test, y_test) = mnist.load_data() # Normalize images to [0, 1] x_train = x_train.astype(np.float64) / 255.0 x_test = x_test.astype(np.float64) / 255.0 # One-hot encode the labels (assume 10 classes for MNIST digits 0-9) num_classes = 10 y_train = np.eye(num_classes)[y_train] y_test = np.eye(num_classes)[y_test] # Flatten the images from (28, 28) to (784,) x_train = x_train.reshape(x_train.shape[0], -1) x_test = x_test.reshape(x_test.shape[0], -1) # Use a portion of the training data for validation (the 'split' index) split = int(main_params['split']) x_val = x_train[split:] y_val = y_train[split:] x_train = x_train[:split] y_train = y_train[:split] # If available, load pretrained weights; otherwise, initialize new random weights. if os.path.isfile("./weightin.npz"): print("using ./weightin.npz") randfile = np.load("./weightin.npz", "r") W1 = randfile["W1"] W2 = randfile["W2"] randfile.close() else: print("using new random weights") # Note: The input layer now has 785 neurons (784 features + 1 bias). 
W1 = np.array(list(np.random.normal(0, 0.1, (785, 100)))) # The first hidden layer has 100 neurons; add bias so 101 W2 = np.array(list(np.random.normal(0, 0.1, (101, 10)))) np.savez_compressed("./weightout.npz", W1=W1, W2=W2) delta_W1 = np.array(list(np.zeros(W1.shape))) delta_W2 = np.array(list(np.zeros(W2.shape))) # Convert weights to desired type (xlns variants or float) if main_params['type'] == 'xlnsnp': lnsW1 = xl.xlnsnp(np.array(xl.xlnscopy(list(W1)))) lnsW2 = xl.xlnsnp(np.array(xl.xlnscopy(list(W2)))) lnsones = xl.xlnsnp(np.array(xl.xlnscopy(list(np.ones((batchsize, 1)))))) lnsdelta_W1 = xl.xlnsnp(np.array(xl.xlnscopy(list(np.zeros(W1.shape))))) lnsdelta_W2 = xl.xlnsnp(np.array(xl.xlnscopy(list(np.zeros(W2.shape))))) elif main_params['type'] == 'xlnsnpv': lnsW1 = xl.xlnsnpv(np.array(xl.xlnscopy(list(W1))), 6) lnsW2 = xl.xlnsnpv(np.array(xl.xlnscopy(list(W2))), 6) lnsones = xl.xlnsnpv(np.array(xl.xlnscopy(list(np.ones((batchsize, 1)))))) lnsdelta_W1 = xl.xlnsnpv(np.array(xl.xlnscopy(list(np.zeros(W1.shape))))) lnsdelta_W2 = xl.xlnsnpv(np.array(xl.xlnscopy(list(np.zeros(W2.shape))))) elif main_params['type'] == 'xlnsnpb': lnsW1 = xl.xlnsnpb(np.array(xl.xlnscopy(list(W1))), 2**2**-6) lnsW2 = xl.xlnsnpb(np.array(xl.xlnscopy(list(W2))), 2**2**-6) lnsones = xl.xlnsnpb(np.array(xl.xlnscopy(list(np.ones((batchsize, 1))))), 2**2**-xl.xlnsF) lnsdelta_W1 = xl.xlnsnpb(np.array(xl.xlnscopy(list(np.zeros(W1.shape)))), 2**2**-xl.xlnsF) lnsdelta_W2 = xl.xlnsnpb(np.array(xl.xlnscopy(list(np.zeros(W2.shape)))), 2**2**-xl.xlnsF) elif main_params['type'] == 'xlns': lnsW1 = np.array(xl.xlnscopy(list(W1))) lnsW2 = np.array(xl.xlnscopy(list(W2))) lnsones = np.array(xl.xlnscopy(list(np.ones((batchsize, 1))))) lnsdelta_W1 = np.array(xl.xlnscopy(list(np.zeros(W1.shape)))) lnsdelta_W2 = np.array(xl.xlnscopy(list(np.zeros(W2.shape)))) elif main_params['type'] == 'xlnsud': lnsW1 = np.array(xl.xlnscopy(list(W1), xl.xlnsud)) lnsW2 = np.array(xl.xlnscopy(list(W2), xl.xlnsud)) lnsones = np.array(xl.xlnscopy(list(np.ones((batchsize, 1))), xl.xlnsud)) lnsdelta_W1 = np.array(xl.xlnscopy(list(np.zeros(W1.shape)), xl.xlnsud)) lnsdelta_W2 = np.array(xl.xlnscopy(list(np.zeros(W2.shape)), xl.xlnsud)) elif main_params['type'] == 'xlnsv': lnsW1 = np.array(xl.xlnscopy(list(W1), xl.xlnsv, 6)) lnsW2 = np.array(xl.xlnscopy(list(W2), xl.xlnsv, 6)) lnsones = np.array(xl.xlnscopy(list(np.ones((batchsize, 1))), xl.xlnsv)) lnsdelta_W1 = np.array(xl.xlnscopy(list(np.zeros(W1.shape)), xl.xlnsv)) lnsdelta_W2 = np.array(xl.xlnscopy(list(np.zeros(W2.shape)), xl.xlnsv)) elif main_params['type'] == 'xlnsb': lnsW1 = np.array(xl.xlnscopy(list(W1), xl.xlnsb, 2**2**-6)) lnsW2 = np.array(xl.xlnscopy(list(W2), xl.xlnsb, 2**2**-6)) lnsones = np.array(xl.xlnscopy(list(np.ones((batchsize, 1))), xl.xlnsb, 2**2**-xl.xlnsF)) lnsdelta_W1 = np.array(xl.xlnscopy(list(np.zeros(W1.shape)), xl.xlnsb, 2**2**-xl.xlnsF)) lnsdelta_W2 = np.array(xl.xlnscopy(list(np.zeros(W2.shape)), xl.xlnsb, 2**2**-xl.xlnsF)) elif main_params['type'] == 'float': lnsW1 = np.array(list(W1)) lnsW2 = np.array(list(W2)) lnsones = np.array(list(np.ones((batchsize, 1)))) lnsdelta_W1 = np.array(list(np.zeros(W1.shape))) lnsdelta_W2 = np.array(list(np.zeros(W2.shape))) performance = {} performance['lnsacc_train'] = np.zeros(num_epoch) performance['lnsacc_val'] = np.zeros(num_epoch) start_time = time.process_time() # Training loop for epoch in range(num_epoch): print('At Epoch %d:' % (1 + epoch)) # Loop through training batches for mbatch in range(int(split / batchsize)): start = mbatch 
* batchsize x = np.array(x_train[start:(start + batchsize)]) y = np.array(y_train[start:(start + batchsize)]) # At this point, each x is already flattened (batchsize x 784) # Conversion based on type if main_params['type'] == 'xlnsnp': lnsx = xl.xlnsnp(np.array(xl.xlnscopy(np.array(x, dtype=np.float64)))) lnsy = xl.xlnsnp(np.array(xl.xlnscopy(np.array(y, dtype=np.float64)))) elif main_params['type'] == 'xlnsnpv': lnsx = xl.xlnsnpv(np.array(xl.xlnscopy(np.array(x, dtype=np.float64)))) lnsy = xl.xlnsnpv(np.array(xl.xlnscopy(np.array(y, dtype=np.float64)))) elif main_params['type'] == 'xlnsnpb': lnsx = xl.xlnsnpb(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))), 2**2**-xl.xlnsF) lnsy = xl.xlnsnpb(np.array(xl.xlnscopy(np.array(y, dtype=np.float64))), 2**2**-xl.xlnsF) elif main_params['type'] == 'xlns': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64))) lnsy = np.array(xl.xlnscopy(np.array(y, dtype=np.float64))) elif main_params['type'] == 'xlnsud': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsud)) lnsy = np.array(xl.xlnscopy(np.array(y, dtype=np.float64), xl.xlnsud)) elif main_params['type'] == 'xlnsv': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv)) lnsy = np.array(xl.xlnscopy(np.array(y, dtype=np.float64), xl.xlnsv)) elif main_params['type'] == 'xlnsb': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv, 2**2**-xl.xlnsF)) lnsy = np.array(xl.xlnscopy(np.array(y, dtype=np.float64), xl.xlnsv, 2**2**-xl.xlnsF)) elif main_params['type'] == 'float': lnsx = np.array(x, dtype=np.float64) lnsy = np.array(y, dtype=np.float64) # Concatenate the bias "ones" with input features for the first layer lnss1 = xl.hstack((lnsones, lnsx)) @ lnsW1 lnsmask = (lnss1 > 0) + (leaking_coeff * (lnss1 < 0)) lnsa1 = lnss1 * lnsmask lnss2 = xl.hstack((lnsones, lnsa1)) @ lnsW2 lnsa2 = softmax(lnss2) lnsgrad_s2 = (lnsa2 - lnsy) / batchsize lnsgrad_a1 = lnsgrad_s2 @ xl.transpose(lnsW2[1:]) lnsdelta_W2 = xl.transpose(xl.hstack((lnsones, lnsa1))) * lnsgrad_s2 lnsgrad_s1 = lnsmask * lnsgrad_a1 lnsdelta_W1 = xl.transpose(xl.hstack((lnsones, lnsx))) * lnsgrad_s1 lnsW2 -= (lr * (lnsdelta_W2 + (_lambda * lnsW2))) lnsW1 -= (lr * (lnsdelta_W1 + (_lambda * lnsW1))) print('#= ', split, ' batch=', batchsize, ' lr=', lr) lnscorrect_count = 0 # Evaluate accuracy on training set for mbatch in range(int(split / batchsize)): start = mbatch * batchsize x = x_train[start:(start + batchsize)] y = y_train[start:(start + batchsize)] if main_params['type'] == 'xlnsnp': lnsx = xl.xlnsnp(np.array(xl.xlnscopy(np.array(x, dtype=np.float64)))) elif main_params['type'] == 'xlnsnpv': lnsx = xl.xlnsnpv(np.array(xl.xlnscopy(np.array(x, dtype=np.float64)))) elif main_params['type'] == 'xlnsnpb': lnsx = xl.xlnsnpb(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))), 2**2**-xl.xlnsF) elif main_params['type'] == 'xlns': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64))) elif main_params['type'] == 'xlnsud': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsud)) elif main_params['type'] == 'xlnsv': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv)) elif main_params['type'] == 'xlnsb': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv, 2**2**-xl.xlnsF)) elif main_params['type'] == 'float': lnsx = np.array(x, dtype=np.float64) lnss1 = xl.hstack((lnsones, lnsx)) @ lnsW1 lnsmask = (lnss1 > 0) + (leaking_coeff * (lnss1 < 0)) lnsa1 = lnss1 * lnsmask lnss2 = xl.hstack((lnsones, lnsa1)) @ lnsW2 lnscorrect_count += np.sum(np.argmax(y, 
axis=1) == xl.argmax(lnss2, axis=1)) lnsaccuracy = lnscorrect_count / split print("train-set accuracy at epoch %d: %f" % (1 + epoch, lnsaccuracy)) performance['lnsacc_train'][epoch] = 100 * lnsaccuracy lnscorrect_count = 0 # Evaluate on the validation set for mbatch in range(int(split / batchsize)): start = mbatch * batchsize x = x_val[start:(start + batchsize)] y = y_val[start:(start + batchsize)] if main_params['type'] == 'xlnsnp': lnsx = xl.xlnsnp(np.array(xl.xlnscopy(np.array(x, dtype=np.float64)))) elif main_params['type'] == 'xlnsnpv': lnsx = xl.xlnsnpv(np.array(xl.xlnscopy(np.array(x, dtype=np.float64)))) elif main_params['type'] == 'xlnsnpb': lnsx = xl.xlnsnpb(np.array(xl.xlnscopy(np.array(x, dtype=np.float64))), 2**2**-xl.xlnsF) elif main_params['type'] == 'xlns': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64))) elif main_params['type'] == 'xlnsud': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsud)) elif main_params['type'] == 'xlnsv': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv)) elif main_params['type'] == 'xlnsb': lnsx = np.array(xl.xlnscopy(np.array(x, dtype=np.float64), xl.xlnsv, 2**2**-xl.xlnsF)) elif main_params['type'] == 'float': lnsx = np.array(x, dtype=np.float64) lnss1 = xl.hstack((lnsones, lnsx)) @ lnsW1 lnsmask = (lnss1 > 0) + (leaking_coeff * (lnss1 < 0)) lnsa1 = lnss1 * lnsmask lnss2 = xl.hstack((lnsones, lnsa1)) @ lnsW2 lnscorrect_count += np.sum(np.argmax(y, axis=1) == xl.argmax(lnss2, axis=1)) lnsaccuracy = lnscorrect_count / split print("Val-set accuracy at epoch %d: %f" % (1 + epoch, lnsaccuracy)) performance['lnsacc_val'][epoch] = 100 * lnsaccuracy print("elapsed time=" + str(time.process_time() - start_time)) fig = plt.figure(figsize=(16, 9)) ax = fig.add_subplot(111) x_axis = range(1, 1 + performance['lnsacc_train'].size) ax.plot(x_axis, performance['lnsacc_train'], 'y') ax.plot(x_axis, performance['lnsacc_val'], 'm') ax.set_xlabel('Number of Epochs') ax.set_ylabel('Accuracy') plt.suptitle(main_params['type'] + ' ' + str(split) + ' Validation and Training MNIST Accuracies F=' + str(xl.xlnsF), fontsize=14) ax.legend(['train', 'validation']) plt.grid(which='both', axis='both', linestyle='-.') plt.savefig('genericaccuracy.png') plt.show() # Now, show predictions on a few test images num_examples = 5 # Number of test images to display selected_indices = np.arange(num_examples) # choose the first few images for demo x_sample = x_test[selected_indices] y_sample = y_test[selected_indices] # For prediction, create a bias vector matching the sample size ones_sample = np.ones((x_sample.shape[0], 1)) z1_sample = np.hstack((ones_sample, x_sample)) @ lnsW1 mask_sample = (z1_sample > 0) + (leaking_coeff * (z1_sample < 0)) a1_sample = z1_sample * mask_sample z2_sample = np.hstack((ones_sample, a1_sample)) @ lnsW2 pred_probs = softmax(z2_sample) predictions = np.argmax(pred_probs, axis=1) true_labels = np.argmax(y_sample, axis=1) # Plot each test image along with its prediction and true label plt.figure(figsize=(10, 2)) for i in range(num_examples): plt.subplot(1, num_examples, i + 1) # Reshape the flattened image back to 28x28 for display plt.imshow(x_sample[i].reshape(28, 28), cmap='gray') plt.title(f"Pred: {predictions[i]}\nTrue: {true_labels[i]}") plt.axis('off') plt.tight_layout() plt.show() if __name__ == '__main__': # In a Kaggle notebook, set parameters manually using a dictionary. 
main_params = { 'is_training': True, 'split': 50, 'learning_rate': 0.01, 'lambda': 0.000, 'minibatch_size': 1, 'num_epoch': 5, 'leaking_coeff': 0.0078125, 'type': 'float' } main(main_params)


I will walk you through this code, which implements a Logarithmic Number System (LNS) Multi-Layer Perceptron (MLP) for MNIST digit classification. Let me break it down into its key components:


  1. Setup and External Dependencies:
  • The code uses the xlns library for logarithmic number system operations
  • It supports multiple LNS variants (xlnsnp, xlnsnpv, xlnsud, etc.) for different precision/performance trade-offs
  • The MNIST dataset is loaded through Keras


  2. Core Functions:
def softmax(inp):
    max_vals = inp.max(axis=1)
    max_vals = xl.reshape(max_vals, (xl.size(max_vals), 1))
    u = xl.exp(inp - max_vals)
    v = u.sum(axis=1)
    v = v.reshape((xl.size(v), 1))
    u = u / v
    return u

This is a numerically stable softmax implementation adapted for LNS operations.


  3. Network Architecture:
  • Input layer: 784 neurons (flattened 28x28 MNIST images) + 1 bias = 785
  • Hidden layer: 100 neurons + 1 bias = 101
  • Output layer: 10 neurons (one per digit)


  4. Weight Initialization:
  • Weights are loaded from a file ("weightin.npz") or initialized randomly
  • Random weights use a normal distribution with mean = 0 and std = 0.1
  • The different LNS variants require different initialization paths (xlnsnp, xlnsnpv, etc.)


  5. Training Loop:
for epoch in range(num_epoch):
    for mbatch in range(int(split / batchsize)):
        # Forward pass
        lnss1 = xl.hstack((lnsones, lnsx)) @ lnsW1
        lnsmask = (lnss1 > 0) + (leaking_coeff * (lnss1 < 0))
        lnsa1 = lnss1 * lnsmask
        lnss2 = xl.hstack((lnsones, lnsa1)) @ lnsW2
        lnsa2 = softmax(lnss2)

        # Backward pass
        lnsgrad_s2 = (lnsa2 - lnsy) / batchsize
        lnsgrad_a1 = lnsgrad_s2 @ xl.transpose(lnsW2[1:])
        lnsdelta_W2 = xl.transpose(xl.hstack((lnsones, lnsa1))) * lnsgrad_s2
        lnsgrad_s1 = lnsmask * lnsgrad_a1
        lnsdelta_W1 = xl.transpose(xl.hstack((lnsones, lnsx))) * lnsgrad_s1


Key aspects of the training:

  • Uses a leaky ReLU activation (controlled by leaking_coeff)
  • Implements standard backpropagation, but with LNS operations
  • Includes L2 regularization (the lambda parameter)
  • Updates the weights with gradient descent at learning rate 'lr' (the exact update, taken from the full listing, is shown after this list)
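
The corresponding weight update from the full listing above combines the gradient step with the L2 term:

# Gradient descent step with L2 regularization (taken from the full listing)
lnsW2 -= (lr * (lnsdelta_W2 + (_lambda * lnsW2)))
lnsW1 -= (lr * (lnsdelta_W1 + (_lambda * lnsW1)))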


  6. Evaluation:
  • Tracks both training and validation accuracy
  • Plots learning curves showing accuracy over the epochs
  • Shows sample predictions on test images


  7. Hyperparameters (a hypothetical alternative configuration is sketched after this item):
main_params = {
    'is_training': True,
    'split': 50,
    'learning_rate': 0.01,
    'lambda': 0.000,
    'minibatch_size': 1,
    'num_epoch': 5,
    'leaking_coeff': 0.0078125,
    'type': 'float'
}
  • Uses mini-batch gradient descent (default batch size = 1)
  • Implements early stopping via the validation split
  • Leaky ReLU coefficient set to 0.0078125
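
For instance, to rerun the experiment with one of the true LNS representations instead of the float baseline, the configuration could be changed along these lines (hypothetical values; 'split' controls how many training samples are used):

# Hypothetical configuration: train on 500 samples with the xlnsnp variant.
main_params = {
    'is_training': True,
    'split': 500,
    'learning_rate': 0.01,
    'lambda': 0.000,
    'minibatch_size': 1,
    'num_epoch': 5,
    'leaking_coeff': 0.0078125,
    'type': 'xlnsnp'
}
main(main_params)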


  8. Visualization:
  • Generates plots showing training and validation accuracy
  • Shows sample test images with predictions and true labels
  • Saves the accuracy plot as 'genericaccuracy.png'


The key innovation here is the use of LNS arithmetic, which replaces multiplications with additions in the log domain, potentially offering better computational efficiency for certain hardware implementations. The code supports multiple LNS variants, allowing different precision/performance trade-offs.

Baseline Performance Comparison

Floating-Point Model Performance

Training on device: cuda
Epoch [1/5], Loss: 0.8540, Train Acc: 79.60%, Val Acc: 88.22%
Epoch [2/5], Loss: 0.3917, Train Acc: 88.97%, Val Acc: 89.92%
Epoch [3/5], Loss: 0.3380, Train Acc: 90.29%, Val Acc: 90.60%
Epoch [4/5], Loss: 0.3104, Train Acc: 90.96%, Val Acc: 91.12%
Epoch [5/5], Loss: 0.2901, Train Acc: 91.60%, Val Acc: 91.62%
Training completed in 57.76 seconds.

Sample predictions from the FP-based MLP model

Training and validation curves for the FP-based MLP model


Logarithmic Number System Model Performance

At Epoch 1:
train-set accuracy at epoch 1: 52.00%
Val-set accuracy at epoch 1: 24.00%
At Epoch 2:
train-set accuracy at epoch 2: 74.00%
Val-set accuracy at epoch 2: 40.00%
At Epoch 3:
train-set accuracy at epoch 3: 86.00%
Val-set accuracy at epoch 3: 58.00%
At Epoch 4:
train-set accuracy at epoch 4: 94.00%
Val-set accuracy at epoch 4: 70.00%
At Epoch 5:
train-set accuracy at epoch 5: 96.00%
Val-set accuracy at epoch 5: 68.00%
elapsed time = 0.35 seconds

Sample predictions from the LNS-based MLP model

Training and validation curves for the LNS-based MLP model


FP vs. LNS: Key Comparisons

Aspect | Floating-Point (FP) | Logarithmic Number System (LNS)
Training Time | 57.76 s | 0.35 s
Train Accuracy | 91.60% | 96.00%
Validation Accuracy | 91.62% | 68.00%
Precision | High | Lower (approximation errors)
Memory Efficiency | Higher usage | Lower memory footprint
Multiplication Handling | Native multiplication | Addition-based simplification

Conclusion

The trade-offs between the Logarithmic Number System (LNS) and Floating-Point (FP) arithmetic present an interesting case study in hardware-software co-design for neural networks. LNS offers significant advantages in certain areas:

Training Speed

  • Replaces multiplication with addition in the log domain
  • Reduces complex operations to simpler arithmetic
  • Especially effective for matrix multiplication in neural networks
  • Can achieve 2-3x speedups in some implementations

Memory Benefits

  • Typically requires fewer bits to represent numbers
  • Can compress weights and activations more efficiently
  • Reduces memory bandwidth requirements
  • Lower power consumption for memory access


However, the precision challenges are significant:

  • Loss of precision when accumulating small values
  • Difficulty representing numbers very close to zero
  • Potential instability in gradient computations
  • May require careful hyperparameter tuning

Future Directions

Several promising approaches could make LNS more practical:

1. Layer-Specific Arithmetic

  • Use FP for sensitive components (such as the final classification layer)
  • Use LNS in compute-heavy hidden layers
  • Switch dynamically based on numerical requirements

2. Precision-Adaptive Computing

  • Start training with FP for stability
  • Gradually transition to LNS as the weights converge
  • Keep critical paths in higher precision

3. Hardware Co-Design

  • Custom accelerators with both FP and LNS units
  • Smart scheduling between number formats
  • Specialized memory hierarchies for each format

4. Algorithmic Innovations

  • New activation functions optimized for LNS
  • Modified optimizers that maintain stability
  • Hybrid number representations

Potential PyTorch Support

To integrate LNS into deep learning frameworks, the following could be explored:

1. Custom Autograd Functions

  • Implement LNS operations as custom autograd functions
  • Keep gradient computations in the log domain
  • Provide efficient CUDA kernels for acceleration

2. Number Type Extensions

  • Add native LNS tensor types
  • Implement the basic operations (+, -, *, /) in the log domain
  • Provide conversion utilities to and from floating point

3. Layer Adaptations

  • Create LNS versions of common layers (Linear, Conv2d); a rough sketch follows this list
  • Optimize the backward pass for LNS arithmetic
  • Support mixed-precision training
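
As a rough sketch of what an LNS-flavored layer could look like (my own illustration; it only demonstrates log-domain weight storage, and the matrix multiply itself still runs in floating point, unlike a true LNS kernel):

import torch
import torch.nn as nn

class LogStoredLinear(nn.Module):
    """Linear layer whose weights are stored as (sign, log2|w|) pairs."""

    def __init__(self, in_features, out_features, eps=1e-12):
        super().__init__()
        w = torch.randn(out_features, in_features) * 0.1
        self.sign = nn.Parameter(torch.sign(w), requires_grad=False)
        self.log_mag = nn.Parameter(torch.log2(w.abs() + eps))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        weight = self.sign * torch.exp2(self.log_mag)  # decode to FP on the fly
        return nn.functional.linear(x, weight, self.bias)

layer = LogStoredLinear(784, 100)
print(layer(torch.randn(8, 784)).shape)   # torch.Size([8, 100])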


The deep learning community could benefit greatly from integrating these capabilities into mainstream frameworks, enabling more efficient, lower-power, and faster neural networks.


What are your thoughts on the balance between numerical precision and computational efficiency? Have you come across specific use cases where LNS could be especially beneficial?


Let me know your thoughts on this.

References


[1] G. Alsuhli, et al., "Number Systems for Deep Neural Network Architectures: A Survey," arXiv:2307.05035, 2023.

[2] M. Arnold, E. Chester, et al., "Training neural nets using only an approximate table-free LNS ALU," 31st International Conference on Application-specific Systems, Architectures and Processors, IEEE, 2020.

[3] O. Kosheleva, et al., "Logarithmic Number System Is Optimal for AI Computations: Theoretical Explanation of Empirical Success," Paper.

[4] D. Miyashita, et al., "Convolutional Neural Networks using Logarithmic Data Representation," arXiv:1603.01025, Mar 2016.

[5] J. Zhao, et al., "LNS-Madam: Low-Precision Training in Logarithmic Number System Using Multiplicative Weight Update," IEEE Transactions on Computers, vol. 71, no. 12, pp. 3179-3190, Dec. 2022.