User scripts, bug fixes, docker image
Showing 21 changed files with 488 additions and 62 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
FROM ubuntu:19.10 | ||
LABEL maintainer="simon987 <me@simon987.net>" | ||
|
||
RUN apt update | ||
RUN apt install -y libglib2.0-0 libcurl4 libmagic1 libharfbuzz-bin libopenjp2-7 | ||
|
||
ADD sist2 /root/sist2 | ||
|
||
ENTRYPOINT ["/root/sist2"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
cp ../sist2 . | ||
|
||
version=$(./sist2 --version) | ||
|
||
echo "Version ${version}" | ||
docker build . -t simon987/sist2:${version} -t simon987/sist2:latest | ||
docker push simon987/sist2:${version} | ||
docker push simon987/sist2:latest |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -80,6 +80,9 @@ | |
"analyzer": "my_nGram" | ||
} | ||
} | ||
}, | ||
"tag": { | ||
"type": "keyword" | ||
} | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,117 @@ | ||
## User scripts | ||
|
||
*This document is under construction; a more in-depth guide is coming soon.* | ||
|
||
During the `index` step, you can use the `--script-file <script>` option to | ||
modify documents or add user tags. This option is mainly used to | ||
implement automatic tagging based on file attributes. | ||
|
||
The scripting language used | ||
([Painless Scripting Language](https://www.elastic.co/guide/en/elasticsearch/painless/7.4/index.html)) | ||
is very similar to Java, but you should be able to write user scripts | ||
without any programming experience if you're somewhat familiar with | ||
regular expressions. | ||
|
||
This is the base structure of the documents we're working with: | ||
```json | ||
{ | ||
"_id": "e171405c-fdb5-4feb-bb32-82637bc32084", | ||
"_index": "sist2", | ||
"_type": "_doc", | ||
"_source": { | ||
"index": "206b3050-e821-421a-891d-12fcf6c2db0d", | ||
"mime": "application/json", | ||
"size": 1799, | ||
"mtime": 1545443685, | ||
"extension": "md", | ||
"name": "README", | ||
"path": "sist2/scripting", | ||
"content": "..." | ||
} | ||
} | ||
``` | ||
|
||
**Example script** | ||
|
||
This script checks whether the `genre` attribute exists; if it does, | ||
it adds the `genre.<genre>` tag: | ||
```Java | ||
ArrayList tags = ctx._source.tag = new ArrayList(); | ||
|
||
if (ctx._source?.genre != null) { | ||
    tags.add("genre." + ctx._source.genre.toLowerCase()); | ||
} | ||
``` | ||
|
||
You can use `.` to create a hierarchical tag tree: | ||
|
||
![scripting/genre_example](genre_example.png) | ||
|
||
|
||
To use regular expressions, add this line to `/etc/elasticsearch/elasticsearch.yml`: | ||
```yaml | ||
script.painless.regex.enabled: true | ||
``` | ||
Or, if you're using Docker, add `-e "script.painless.regex.enabled=true"`. | ||
|
||
### Examples | ||
|
||
If `(20XX)` is in the file name, add the `year.<year>` tag: | ||
```Java | ||
ArrayList tags = ctx._source.tag = new ArrayList(); | ||
Matcher m = /[\(\.+](20[0-9]{2})[\)\.+]/.matcher(ctx._source.name); | ||
if (m.find()) { | ||
    tags.add("year." + m.group(1)); | ||
} | ||
``` | ||
|
||
Use the default *Calibre* folder structure to infer the author: | ||
```Java | ||
ArrayList tags = ctx._source.tag = new ArrayList(); | ||
// We expect the book path to look like this: | ||
// /path/to/Calibre Library/Author/Title/Title - Author.pdf | ||
if (ctx._source.name.contains("-") && ctx._source.extension == "pdf") { | ||
    String[] names = ctx._source.name.splitOnToken('-'); | ||
    tags.add("author." + names[1].strip()); | ||
} | ||
``` | ||
|
||
If the file name matches the pattern `AAAA-000 fName1 lName1, <fName2 lName2>...`, add the `actress.<actress>` and | ||
`studio.<studio>` tags: | ||
```Java | ||
ArrayList tags = ctx._source.tag = new ArrayList(); | ||
Matcher m = /([A-Z]{4})-[0-9]{3} (.*)/.matcher(ctx._source.name); | ||
if (m.find()) { | ||
    tags.add("studio." + m.group(1)); | ||
    // Take the matched group (.*), and add a tag for | ||
    // each name, separated by comma | ||
    for (String name : m.group(2).splitOnToken(',')) { | ||
        // trim() drops the space that follows each comma | ||
        tags.add("actress." + name.trim()); | ||
    } | ||
} | ||
``` | ||
|
||
Add the name of the last folder (`/path/to/<studio>/file.mp4`) as a `studio.<studio>` tag: | ||
```Java | ||
ArrayList tags = ctx._source.tag = new ArrayList(); | ||
if (ctx._source.path != "") { | ||
    String[] names = ctx._source.path.splitOnToken('/'); | ||
    tags.add("studio." + names[names.length-1]); | ||
} | ||
``` |
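
The year and last-folder examples in the user scripts document can be tried outside Elasticsearch. What follows is a minimal plain-Java sketch of that logic, not code from this commit. The class `TagSketch`, its method names, and the sample inputs are hypothetical. `String.split("/")` stands in for Painless's `splitOnToken('/')`, and the two may treat edge cases such as trailing separators differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that mirrors two of the Painless examples in plain Java.
public class TagSketch {
    // Same pattern as the year example: a 20XX year delimited by '(', ')', '.' or '+'.
    private static final Pattern YEAR =
            Pattern.compile("[\\(\\.+](20[0-9]{2})[\\)\\.+]");

    // Returns ["year.<year>"] if the name contains a delimited 20XX year, else an empty list.
    public static List<String> yearTags(String name) {
        List<String> tags = new ArrayList<>();
        Matcher m = YEAR.matcher(name);
        if (m.find()) {
            tags.add("year." + m.group(1));
        }
        return tags;
    }

    // Returns ["studio.<last folder>"] for a non-empty path, else an empty list.
    public static List<String> studioTags(String path) {
        List<String> tags = new ArrayList<>();
        if (!path.isEmpty()) {
            // Assumption: split("/") approximates Painless's splitOnToken('/').
            String[] names = path.split("/");
            tags.add("studio." + names[names.length - 1]);
        }
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(yearTags("Some.Movie.(2019).1080p"));
        System.out.println(studioTags("path/to/SomeStudio"));
    }
}
```

Checking a pattern against a few real file names this way can catch mistakes before re-running the `index` step with `--script-file`.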