johnstonskj/rml-core


Racket Machine Learning - Core

Install with raco pkg install rml-core. Released under the MIT License.

This package is part of an expected set of packages implementing machine learning capabilities for Racket. The core of this package is the management of data sets, which are assumed to be used for training and testing machine learning capabilities. This package does not assume anything about such capabilities, and uses an expansive notion of machine learning that should cover statistical inference, tree and decision matrix models, as well as deep learning approaches.

This module deals with two opaque structure types, data-set and data-set-field. These are not available to clients directly, although certain accessors are exported by this module. Conceptually, a data-set is a table of data whose columns are fields. Each field is either a feature, which represents a property of an instance, or a classifier (label), which is used to train and match instances.

See the rml-knn (not quite there yet) repository for an example capability built upon this package.

Modules

  • data - manages data sets: loads them from CSV and JSON files, saves and loads snapshots, and manages partitions and statistics.
  • individual - manages individuals when classifying against a data set.
  • classify - describes a contract for classifier functions and a set of higher-order cross-classifiers over data sets.
  • results - provides a confusion matrix that records the results of classification as a mapping from true to predicted values.
  • not-implemented - a convenience for raising exn:fail:unsupported exceptions.
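
To make the "mapping from true to predicted values" idea concrete, here is a minimal sketch of a confusion matrix built from plain Racket hash tables. It does not use the results module or reflect how that module is implemented; the names make-confusion, record! and count-of are hypothetical and exist only for this illustration.

```racket
#lang racket/base

;; Hypothetical sketch, not the rml results API:
;; a confusion matrix as a nested mutable hash,
;; true label -> predicted label -> count.
(define (make-confusion)
  (make-hash))

;; Record one classification outcome.
(define (record! matrix true-label predicted-label)
  ;; Fetch the row for the true label, creating it on first use.
  (define row (hash-ref! matrix true-label make-hash))
  (hash-update! row predicted-label add1 0))

;; Look up how often true-label was predicted as predicted-label.
(define (count-of matrix true-label predicted-label)
  (hash-ref (hash-ref matrix true-label (hash)) predicted-label 0))

(define cm (make-confusion))
(record! cm "setosa" "setosa")
(record! cm "setosa" "versicolor")
(record! cm "virginica" "virginica")
(record! cm "setosa" "setosa")

(displayln (count-of cm "setosa" "setosa"))
(displayln (count-of cm "setosa" "versicolor"))
```

Entries where the true and predicted labels match are correct classifications; every other entry counts a misclassification of the true label as the predicted one.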

Example

The following example loads a sample data set and displays some useful information before writing a snapshot to the current output port.

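(require rml/data
         rml/results) ; assumed home of make-result-matrix (the results module)

;; Load the Iris training data from CSV, mapping each column index to a field.
(define dataset
  (load-data-set "iris_training_data2.csv"
                 'csv
                 (list
                   (make-feature "sepal-length" #:index 0)
                   (make-feature "sepal-width" #:index 1)
                   (make-feature "petal-length" #:index 2)
                   (make-feature "petal-width" #:index 3)
                   (make-classifier "classification" #:index 4))))

;; Display the classifier values found in the data set.
(displayln (classifier-product dataset))
(newline)

;; Display summary statistics for a single feature.
(displayln (feature-statistics dataset "sepal-width"))
(newline)

;; Write a snapshot of the data set to the current output port.
(write-snapshot dataset (current-output-port))
(newline)

;; Build a result matrix for the data set and print it row by row.
(for ([row (result-matrix-formatted (make-result-matrix dataset))])
  (displayln row))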

feature-statistics returns an instance of the statistics struct from math/statistics.
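
As a rough illustration of what can be done with such a value, the sketch below builds a statistics struct directly with math/statistics and reads a few summary values from it. The sample numbers are made up for the example, and building the struct by hand stands in for calling feature-statistics on a real data set.

```racket
#lang racket/base
(require math/statistics)

;; Made-up sample values, standing in for one feature column.
(define sepal-widths '(3.5 3.0 3.2 3.1 3.6))

;; Fold the samples into a statistics struct, starting from the empty one.
(define stats (update-statistics* empty-statistics sepal-widths))

;; Struct accessors: number of samples and observed range.
(displayln (statistics-count stats))
(displayln (statistics-min stats))
(displayln (statistics-max stats))

;; Derived values computed from the running statistics.
(displayln (statistics-mean stats))
(displayln (statistics-stddev stats))
```

The same accessors and functions should apply to the value returned by feature-statistics, since it is the same struct type.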
