(WIP) A straightforward tokenization library for seamless text processing.

trag1c/crossandra-rs


crossandra-rs

crossandra-rs is a work-in-progress ⚠️, straightforward tokenization library for seamless text processing. It is a simplified Rust implementation of the Python Crossandra library.

Usage

Add this to your Cargo.toml:

[dependencies]
crossandra = "0.0.1"

Import and use like this:

use crossandra::{Tokenizer, common};

fn main() {
    let word_finder = Tokenizer::default()
        .with_patterns(vec![common::WORD.clone()])
        .expect("built-in pattern should be safe");

    let text = "Hello, world!";

    for token in word_finder.tokenize(text).flatten() {
        println!("{:?}", token);
    }
    // Token { name: "word", value: "Hello", position: 0 }
    // Token { name: "word", value: "world", position: 7 }
}
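
To make the output above easier to read, here is a minimal standard-library sketch of what a "word" tokenizer does conceptually: it scans the text, groups consecutive word characters, and records the byte offset where each group starts. This is not how crossandra is implemented. The `Token` struct and the `find_words` function below are hypothetical stand-ins that only mirror the fields shown in the example output, and treating a word as a run of alphanumeric characters is an assumption; the crate's `common::WORD` pattern may match differently.

```rust
/// Hypothetical stand-in mirroring the fields in the README output.
/// This is NOT the crate's actual `Token` type.
#[derive(Debug, PartialEq)]
struct Token {
    name: String,
    value: String,
    position: usize,
}

/// Collects runs of alphanumeric characters as "word" tokens.
/// `position` is the byte offset of the first character of each run.
/// (Assumption: a "word" is a run of alphanumeric characters.)
fn find_words(text: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut start: Option<usize> = None;

    for (i, ch) in text.char_indices() {
        if ch.is_alphanumeric() {
            // Remember where the current run began.
            if start.is_none() {
                start = Some(i);
            }
        } else if let Some(s) = start.take() {
            // A non-word character ends the current run.
            tokens.push(Token {
                name: "word".to_string(),
                value: text[s..i].to_string(),
                position: s,
            });
        }
    }

    // Flush a run that extends to the end of the input.
    if let Some(s) = start {
        tokens.push(Token {
            name: "word".to_string(),
            value: text[s..].to_string(),
            position: s,
        });
    }

    tokens
}

fn main() {
    for token in find_words("Hello, world!") {
        println!("{:?}", token);
    }
    // Token { name: "word", value: "Hello", position: 0 }
    // Token { name: "word", value: "world", position: 7 }
}
```

The positions 0 and 7 are the byte offsets of "Hello" and "world" in the input, which is how the `position` values in the example above can be read.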

Documentation

The documentation is available at docs.rs/crossandra.

Acknowledgements

Huge thanks to @Maneren for his invaluable guidance in developing this library 🫶

License

crossandra-rs is licensed under the MIT License.
© trag1c, 2024
