This repository implements an experimental approach to training neural networks that does not rely solely on traditional gradient-based methods. We explore a search-based learning mechanism, in which individual parameters are optimized by directly minimizing the loss function, and compare it with conventional gradient-based techniques to evaluate the effectiveness of each approach.
Neural networks are commonly trained using gradient-based methods like backpropagation, which adjust the network's parameters to minimize a loss function. However, this approach has certain limitations:
- Gradient vanishing/exploding issues
- Sensitivity to local minima
- Need for differentiable loss functions
To address these challenges, we explore a search-based method that exhaustively searches over parameter ranges to minimize the loss without relying on gradients. The repository includes:
- A search-based model that optimizes each parameter through a brute-force search.
- A gradient-based model for traditional optimization comparison.
- A hybrid model that combines the benefits of both approaches.
The SearchModel is built on top of the BaseModel. Instead of relying on gradient-based optimization, the model searches over defined ranges of weights and biases to find the values that minimize the loss function.
- Parameter Search: For each weight and bias, the model searches within the defined range for the value that yields the lowest loss (a minimal sketch of this loop follows the list).
- Early Stopping: The search stops early once the loss converges below a given tolerance level.
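To make the mechanism concrete, here is a minimal, self-contained sketch of a per-parameter brute-force search on a toy linear model. It is illustrative only and does not reproduce the repository's actual SearchModel API; the variable names and the MSE loss are assumptions.

```python
import numpy as np

# Toy setting: a single linear map y_hat = X @ w, trained by brute-force
# search over each weight, one at a time (illustrative, not the repo's API).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))
y = X @ rng.normal(size=10)

def mse(w):
    return float(np.mean((X @ w - y) ** 2))

w = np.zeros(10)
candidates = np.linspace(-5, 5, 100)        # weight_range=[-5, 5], range_size=100
tolerance = 1e-6

for iteration in range(30):                 # max_iterations=30
    for i in range(w.size):                 # search each parameter independently
        losses = [mse(np.concatenate([w[:i], [c], w[i+1:]])) for c in candidates]
        w[i] = candidates[int(np.argmin(losses))]
    if mse(w) < tolerance:                  # early stopping on convergence
        break
```

Note that each sweep costs roughly (number of parameters × range size) loss evaluations, which is why the range sizes and iteration counts are kept small. In the repository itself, the search model is constructed and run as follows: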
search_model = SearchModel(input_size=10, hidden_sizes=[5, 4], output_size=1,
weight_range=[-5, 5], bias_range=[-5, 5],
weight_range_size=100, bias_range_size=100)
history_search = search_model.search(X, y, max_iterations=30)
The GradientModel uses traditional gradient-based optimization (Adam optimizer) to update the network parameters. This serves as a baseline for comparison with the search-based approach.
gradient_model = GradientModel(input_size=10, hidden_sizes=[5, 4], output_size=1)
history_grad = gradient_model.train_model(X, y, num_epochs=30, learning_rate=0.1)
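For reference, a gradient-trained MLP of the same shape could look roughly like the following. This is a hedged sketch assuming a PyTorch-style implementation; the repository's GradientModel may differ in its internals.

```python
import torch
import torch.nn as nn

# Illustrative sketch of an MLP trained with the Adam optimizer;
# layer sizes mirror input_size=10, hidden_sizes=[5, 4], output_size=1.
model = nn.Sequential(
    nn.Linear(10, 5), nn.ReLU(),
    nn.Linear(5, 4), nn.ReLU(),
    nn.Linear(4, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

X = torch.randn(50, 10)   # placeholder data for the sketch
y = torch.randn(50, 1)

history = []
for epoch in range(30):                      # num_epochs=30
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()                          # backpropagate gradients
    optimizer.step()                         # Adam parameter update
    history.append(loss.item())
```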
The HybridModel combines both search-based and gradient-based techniques. It first performs a parameter search and then refines the results using gradient descent.
hybrid_model = HybridModel(input_size=10, hidden_sizes=[5, 4], output_size=1)
history_hybrid = hybrid_model.search_and_train(X, y, num_iterations=30, learning_rate=0.1)
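The two-stage idea can be sketched as follows: a coarse per-parameter search picks a starting point, and gradient descent then refines it. This is an assumption-laden illustration, not the repository's exact HybridModel implementation.

```python
import torch
import torch.nn as nn

# Illustrative two-stage sketch: coarse search, then Adam refinement.
model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(),
                      nn.Linear(5, 4), nn.ReLU(),
                      nn.Linear(4, 1))
loss_fn = nn.MSELoss()
X, y = torch.randn(50, 10), torch.randn(50, 1)   # placeholder data

# Stage 1: coarse search over a few candidate values for every parameter.
candidates = torch.linspace(-5, 5, 11)
with torch.no_grad():
    for param in model.parameters():
        flat = param.view(-1)
        for i in range(flat.numel()):
            best_val = flat[i].item()
            best_loss = loss_fn(model(X), y).item()
            for c in candidates:
                flat[i] = c
                loss = loss_fn(model(X), y).item()
                if loss < best_loss:
                    best_val, best_loss = c.item(), loss
            flat[i] = best_val               # keep the best candidate found

# Stage 2: gradient refinement from the searched starting point.
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
for _ in range(30):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```

The design intent is that the search stage supplies a reasonable initialization even when the loss surface is awkward for gradients, and the gradient stage then converges quickly from that point.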
To run the models and visualize the results, execute the comparison.ipynb notebook. It loads a dataset, trains the models, and displays the training history and final loss values for each model.
jupyter notebook comparison.ipynb
Each model's training history is visualized to show how the loss decreases over iterations. The bar chart below compares the final loss across the different models (a plotting sketch follows the list):
- Search Model: Optimizes through brute-force parameter search.
- Gradient Model: Traditional backpropagation with Adam optimizer.
- Hybrid Model: Combines the search and gradient-based approaches.
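A minimal plotting sketch of these two views is shown below, reusing the history variables from the earlier snippets and assuming each history is a plain sequence of loss values; the notebook's actual plotting code may differ.

```python
import matplotlib.pyplot as plt

# Illustrative plotting sketch; comparison.ipynb may format its figures differently.
histories = {
    "Search Model": history_search,
    "Gradient Model": history_grad,
    "Hybrid Model": history_hybrid,
}

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Loss over iterations for each model.
for name, history in histories.items():
    ax1.plot(history, label=name)
ax1.set_xlabel("Iteration")
ax1.set_ylabel("Loss")
ax1.legend()

# Final loss comparison as a bar chart.
ax2.bar(list(histories.keys()), [h[-1] for h in histories.values()])
ax2.set_ylabel("Final loss")

plt.tight_layout()
plt.show()
```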
This research demonstrates an experimental approach to neural network optimization that does not rely solely on gradients. The search-based model offers an alternative mechanism, particularly useful when gradient methods face challenges, such as non-differentiable functions or poor convergence.
Overall, this work opens up new possibilities for training neural networks, especially in cases where traditional methods struggle. Future research could explore more efficient search algorithms, integrate additional optimization techniques, or apply this approach to more complex neural network architectures and real-world tasks.