Litt

Literature tool for searching all PDFs in a directory.

tl;dr

Installing:

wget https://github.com/georgbuechner/litt/releases/download/v1.0.1/litt-ubuntu-latest 
chmod +x litt-ubuntu-latest
cp -f litt-ubuntu-latest /usr/local/bin/litt

(Replace version and platform accordingly)

Usage:

litt <index-name> -i <path-to-documents>  # create new index <index-name>
litt <index-name> <search-term>  # search for <search-term> in <index-name>

Litt

Literature tool for searching all PDFs in a directory.

Installation

Requirements

The command-line tool pdftotext should be available on your system.

Also, we advise to install zathura, a very lightweight pdf reader. If zathura is availible we can open the selected pdf (litt <num>) on the selected page and with the search term highlighted.

Pre-built binaries.

There are pre-built binaries available for Windows and Linux. Simply download the binary (see: Releases) and you are ready to go. It is advised to add litt to path.

Linux

This would be a way to go for Linux:

wget https://github.com/georgbuechner/litt/releases/download/v1.0.1/litt-ubuntu-latest 
chmod +x litt-ubuntu-latest
cp -f litt-ubuntu-latest /usr/local/bin/litt

wget downloads the file (make sure to change the version-number to the latest version).
chmod +x grants permission to execute litt. (This must be run as sudo user)
cp -f litt-ubuntu-latest /usr/local/bin/litt makes litt available system wide under the name litt.

Verify by running litt --version. It should show something like:

litt 1.0.1

Windows

Honestly, I don't really know. After downloading the windows binary, litt should be added to path. This guide gives some explanation on how to do that: https://windowsloop.com/how-to-add-to-windows-path/

MacOS

Sadly there are no binarys available for MacOS. For installation see Compile from source.

Compile from source

First you should install Rust/Cargo: https://www.rust-lang.org/tools/install

Then clone the Github repository (in Windows we suggest using git-bash)

git clone https://github.com/georgbuechner/litt.git

Then run:

cargo build --release

Finally, make the release availible system-wide. In Linux f.e.: cp -f target/release/litt /usr/local/bin.

Usage

Create a new `litt`-index

This is how you create a new index:

litt <index-name> -i <path-to-documents>

Assuming you have some documents stored at Documets/Literature/books/ which you would like to index, you can do this as following:

litt books -i Documets/Literature/books/

NOTE:

the index-name can be any name. It need not match with the directory name.
any relative path is automatically changed to an absolute path (e.i. Docuemts/Literature/books/ to /home/<user>/Docuemts/Literature/books/)

Updating, reloading, deleting an existing index

To see all existing indices, type:

litt -l 
> Currently available indices:
 - ("books", "/home/<user>/Documents/Literature/books/")
 - ("papers", "/home/<user>/Documents/papers/")
 - ("notes", "/home/<user>/Documents/notes/")

You can then update an existing index: litt books -u which is usually very fast, but might not track all changes made to existing documents and will never track deleted documents. Use litt books --reload to fully reload the index. This might take a while.

To delete an index, type: litt books -r

Searching

In general you search like this:

litt <index-name> <search-term>

If your search term is more than one word, you should add quotations: litt <index-name> '<term1 term2 ...>'

Use --offset and --limit to show more results. (Default shows the top ten results. --offset 10 shows the first 10 to 20 results. --offset 10 --limit 50 shows the first 10 to 60 results).

Use litt <num> to open a document (num refers to the number in brackets, f.e. - [1] p. XXX: ...)

NOTE (open on wrong page): Possibly the searched term was not found by zathura since it breaks line, e.i:

my- 
stifiziert

Try to search for a substring to then find the term on the page.

/my

Exact matching

You can search for multiple words, the following will give the same result

litt books "Tulpen Rosen" 
litt books "Tulpen OR Rosen"

And show all documents (pages) which contain the term Tuplen or Rosen. This

litt books "Tulpen AND Rosen"

will only show documents (pages) which contain both the term Tulpen and the term Rosen.

You may also combine:

litt books "(Tulpen AND Rosen) OR Narzisse"

You can also search for fixed phrases:

litt books '"Tulpen Narzisse"'

Or:

litt books '"Tulpen Narzisse"~1'

which will also match f.e. Tulpen wie Narzisse.

Finally, you can find partial matches with:

litt books '"Tulpen Narz"*'

A detailed listing of possible queries and also limitations can be found on the tantivy page: https://docs.rs/tantivy/latest/tantivy/query/struct.QueryParser.html

Fuzzy Matching

Fuzzy matching can be helpful to find partial matches on single words (e.i. match nazis when searching for nazi). But also to correct typos or bad scans (e.i. find nacis when searching for nazis). This can be done by using the fuzzy flag:

litt books nazis --fuzzy

You can also specify the distance the search and matched term may have (default=2):

litt books nazis --fuzzy --distance 2

You may also search for multiple words:

litt books 'Tulp Narz' --fuzzy

Note:

working with phrases (litt books '"Tulpen Narzisse"~1') or AND/OR does not work with fuzzy search
In some cases no preview can be shown when using fuzzy search, we're working to improve this!
fuzzy matching only works on the body, not the title.

Acknowledgements

Dependencies

We explicitly want to thank all the great developers who help write and maintain the awesome libraries makeing litt possible:

tantivy the mind blowing full-text search engine library
clap the beautiful command line argument parser.
serde-json for serializing and deserializing JSONs
shellexpand for making our lives in a cross-platform world easier :)
uuid how would we (uniquely) identify each other without you?
colored for making our output more colorful (even though it really isn't)
walkdir for helping us gather all your documents
pdftotext which is amazingly good at doing its job)
rayon for parallelizing indexing and making it ~10 times faster!
levenshtein-rs for allowing us to show atleast some previews for fuzzy search

We also want to clarify, that not all dependencies use the same license as we do:

name	license
clap	Apache-2.0, MIT
rayon	Apache-2.0, MIT
serde json	Apache-2.0, MIT
shellexpand	Apache-2.0, MIT
uuid	Apache-2.0, MIT
tantivy	MIT
walkdir	MIT
colored	MPL-2.0

Name		Name	Last commit message	Last commit date
Latest commit History 124 Commits
.github/workflows		.github/workflows
images		images
index		index
litt		litt
resources		resources
scripts/search_two_words		scripts/search_two_words
search		search
shared		shared
.gitignore		.gitignore
Cargo.toml		Cargo.toml
README.md		README.md
makefile		makefile

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Litt

tl;dr

Table of contents

Litt

Installation

Requirements

Pre-built binaries.

Linux

Windows

MacOS

Compile from source

Usage

Create a new `litt`-index

Updating, reloading, deleting an existing index

Searching

Exact matching

Fuzzy Matching

Acknowledgements

Dependencies

About

Releases 3

Packages

Contributors 2

Languages

georgbuechner/litt

Folders and files

Latest commit

History

Repository files navigation

Litt

tl;dr

Table of contents

Litt

Installation

Requirements

Pre-built binaries.

Linux

Windows

MacOS

Compile from source

Usage

Create a new litt-index

Updating, reloading, deleting an existing index

Searching

Exact matching

Fuzzy Matching

Acknowledgements

Dependencies

About

Resources

Stars

Watchers

Forks

Releases 3

Packages 0

Contributors 2

Languages

Create a new `litt`-index

Packages