Rubix/ML TruncatedSVD() made PHP crash without any message #38

Closed
ahmaruff opened this issue Jul 4, 2023 · 2 comments

ahmaruff commented Jul 4, 2023

This is a copy of an issue from Rubix/ML: RubixML/ML#303.
I'm posting it here because I don't know whether the problem lies in Rubix/ML or Rubix/tensor.

Hello, I'm trying to create a chatbot using the LSA algorithm.
I'm new to machine learning, so I don't know whether this is correct.

Since I'm using the LSA algorithm, I need to call TruncatedSVD(),
but this script makes the PHP dev server crash without any message.
When I remove it from the Pipeline, everything works.
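A crash with no message often means the PHP process itself died rather than threw an exception, although that is only an assumption here. One way to check is to run the suspect code in a child process and inspect how it ended. The sketch below uses PHP's built-in `proc_open` and `proc_get_status`; the helper name `runIsolated` and the sample snippet are hypothetical and not part of Rubix ML or Tensor.

```php
<?php
/**
 * Run a PHP snippet in a child process and report how it ended.
 * Hypothetical helper for illustration; not part of Rubix ML or Tensor.
 */
function runIsolated(string $code): array
{
    // Merge the child's stderr into stdout so a single pipe captures everything.
    $descriptors = [1 => ['pipe', 'w'], 2 => ['redirect', 1]];

    $process = proc_open([PHP_BINARY, '-r', $code], $descriptors, $pipes);

    if (!is_resource($process)) {
        throw new RuntimeException('Could not start the child PHP process.');
    }

    // Reading until EOF returns once the child has closed its output.
    $output = stream_get_contents($pipes[1]);
    fclose($pipes[1]);

    // Poll until the child has exited; the first non-running status carries the exit code.
    do {
        $status = proc_get_status($process);
        if ($status['running']) {
            usleep(10000);
        }
    } while ($status['running']);

    proc_close($process);

    return [
        'output'   => $output,
        'exitCode' => $status['exitcode'],
        'signaled' => $status['signaled'], // true if a signal (such as a segfault) killed the child
        'signal'   => $status['termsig'],
    ];
}

// Stand-in snippet; replace it with code that bootstraps the app and trains the pipeline.
$result = runIsolated('echo "before"; exit(3);');
print_r($result);
```

On POSIX systems, a `signaled` value of `true` would point to a crash inside native code, while a non-zero `exitCode` with output would point to an ordinary PHP error.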

My machine:
Debian 12
AMD A9
RAM 4GB
PHP 8.2.7
Laravel 9

I installed the tensor extension manually using this fork/PR #36 (since PHP 8.2 is not fully supported yet).
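Because the extension was built by hand, it can help to confirm what PHP actually loaded. The sketch below gathers that information with built-in functions only. It assumes the extension registers itself under the name `tensor`; the helper name `tensorEnvironment` is hypothetical.

```php
<?php
/**
 * Collect environment details that are useful in a bug report.
 * Hypothetical helper; 'tensor' is the extension name assumed here.
 */
function tensorEnvironment(): array
{
    $loaded = extension_loaded('tensor');

    return [
        'php'            => PHP_VERSION,
        'os'             => PHP_OS_FAMILY,
        'tensor_loaded'  => $loaded,
        // Only ask for the version when the extension is actually loaded.
        'tensor_version' => $loaded ? phpversion('tensor') : null,
        'memory_limit'   => ini_get('memory_limit'),
    ];
}

print_r(tensorEnvironment());
```

Running this under both the CLI and the dev server would show whether the two use the same build of the extension.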

Here is my code:

<?php
namespace App;
use Illuminate\Support\Facades\Storage;
use Rubix\ML\Classifiers\KDNeighbors;
use Rubix\ML\Datasets\Labeled;
use Rubix\ML\Graph\Trees\BallTree;
use Rubix\ML\Kernels\Distance\Cosine;
use Rubix\ML\Loggers\Screen;
use Rubix\ML\PersistentModel;
use Rubix\ML\Pipeline;
use Rubix\ML\Transformers\TextNormalizer;
use Rubix\ML\Transformers\TfIdfTransformer;
use Rubix\ML\Transformers\TruncatedSVD;
use Rubix\ML\Transformers\WordCountVectorizer;
use Rubix\ML\Persisters\Filesystem;

class Chatbot
{
    private $estimator;
    private $persister;
    private $logger;

    public function __construct() {
        $this->logger = new Screen();
        $modelPath = Storage::path('dataset/lsa.model');
        $this->persister = new Filesystem($modelPath);
        if(is_file($modelPath)){
            $this->estimator = $this->persister->load();
            $this->logger->info('model loaded');
        } else {
            $this->train();
        }
    }
    // Builds the dataset, trains the LSA pipeline, and saves the model to disk.
    public function train() {
        $start = microtime(true);
        $this->logger->info('start load dataset');
        $samples =[
            ['Cara mengganti password', 'Bagaimana cara mengganti kata sandi wifi', 'cara ganti sandi'],
            ['cara ganti nama wifi', 'bagaimana cara mengganti nama wifi', 'gimana caranya ganti SSID'],
            ['cara bayar tagihan', 'cara membayar wifi', 'nomor rekening pembayaran'],
            ['wifi tidak stabil', 'wifi kadang terkoneksi kadang tidak', 'internet bermasalah'],
            ['indikator warna merah hidup', 'indikator los hidup', 'ada lampu warna merah menyala'],
            ['internet lemot', 'jaringan lambat', 'cara upgrade paket']
        ];

        $labels = [
            "Untuk mengganti password, atau kata sandi caranya adalah dengan masuk Google Chrome lalu Ketik 192.168.0.1 untuk Username sama password untuk login adalah user setelah masuk Klik menu Network, Klik bagian Wlan, Klik bagian Security lalu Setelah diganti klik submit",
            "Untuk mengganti nama WiFi caranya adalah dengan masuk Google Chrome lalu Ketik 192.168.0.1 untuk Username sama password untuk login adalah user setelah masuk Klik menu Network Klik bagian Wlan Klik bagian SSID Setting lalu Setelah diganti klik submit",
            "untuk pembayaran wi-fi bisa melalui transfer dengan no rekening MANDIRI : 185000416082",
            "jika internet wifi lambat atau lemot, silakan coba dikurangi jumlah penggunanya agar pemakaianya stabil dan lancar, atau upgrade pake wifi",
            "apabila indikator LOS berwarna merah menyala, kemungkinan terdapat permasalahan pada device atau terdapat kabel yang terputus sehingga diperlukan pengecekan langsung di lokasi Silakan hubungi admin melalui layanan Pengaduan Masalah.",
            "jika internet wifi lambat atau lemot, silakan coba dikurangi jumlah penggunanya agar pemakaianya stabil dan lancar, atau upgrade pake wifi",
        ];

        $dataset = new Labeled($samples, $labels);
        $nearest = 5;

        $this->estimator = new PersistentModel(
            new Pipeline([
                new TextNormalizer(),
                new WordCountVectorizer(1000,1, 0.8),
                new TfIdfTransformer(),
                new TruncatedSVD(10), // Crashes here; removing this transformer makes everything work.
            ], new KDNeighbors($nearest,true,new BallTree($nearest, new Cosine()))),
            $this->persister
        );

        $this->estimator->train($dataset);

        $this->estimator->save();

        dump((memory_get_peak_usage()/1024/1024)." MB\n");
        $time = microtime(true) - $start;
        dump("Time: ".$time." s\n");
    }

}

andrewdalpino commented Jul 18, 2023

Hi @ahmaruff, thanks for posting this here. To carry over some information from the initial bug report: I believe there is an issue with the call to LAPACK/OpenBLAS's SVD routine. That would be a great place to start for anyone who has time to look into the issue.

@andrewdalpino
Member

I believe this is fixed now. I was able to run SVD on the Iris dataset with the new Tensor 3.0.5 extension, which has an updated Zephir library.
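A quick way to repeat this kind of check locally is a small smoke test that runs `TruncatedSVD` outside the full pipeline. The sketch below is an illustration, not the test the maintainer ran. It uses made-up numeric samples instead of the Iris data and assumes Composer's autoloader has already been loaded. The helper name `svdSmokeTest` is hypothetical. If Rubix ML is installed but the tensor extension is missing, the transformer is expected to throw rather than return.

```php
<?php
// Assumes Composer's autoloader was loaded beforehand, e.g. require 'vendor/autoload.php';

/**
 * Hypothetical smoke test: reduce a tiny 4x3 matrix to 2 dimensions.
 * Returns the transformed samples, or null when Rubix ML is not installed.
 */
function svdSmokeTest(): ?array
{
    $svdClass = \Rubix\ML\Transformers\TruncatedSVD::class;
    $datasetClass = \Rubix\ML\Datasets\Unlabeled::class;

    if (!class_exists($svdClass) || !class_exists($datasetClass)) {
        return null; // Rubix ML is not available, so there is nothing to test.
    }

    // Made-up numeric samples, not the Iris data mentioned above.
    $dataset = new $datasetClass([
        [1.0, 2.0, 3.0],
        [4.0, 5.0, 6.5],
        [7.0, 8.5, 9.0],
        [2.0, 0.5, 1.0],
    ]);

    // apply() fits an unfitted transformer on the dataset, then transforms it in place.
    $dataset->apply(new $svdClass(2));

    return $dataset->samples();
}

$reduced = svdSmokeTest();
var_dump($reduced === null ? 'Rubix ML not loaded' : count($reduced));
```

If this script finishes and prints a sample count, the SVD call completed without crashing, which narrows any remaining problem to the rest of the pipeline.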
