Workflows, resources, jobs, oh my.
# Step 1) Getting Started
What does everything I'm seeing mean?
Welcome to Rodan! This section will detail everything you see on the landing page once you have made an account. As of 2024-02-22, make sure you're using the rodan2.simssa.ca URL for general use, learning, and so on.
The first thing you'll see is the following: ![[Rodan Documentation - Landing page (logged in).png]]
As you can see, I currently have an existing project. To make a new project (or your first one), click the 'Create new Project' button in the top right corner. This will automatically add an untitled project to your screen. Double-click the row, and you'll be taken to your **project page**.
Your project page is where the bulk of your resources, notes, and workflows will be found, but there are a couple of quirks to be aware of. You can rename your project in the top right corner, in the 'project details' box, and leave a description if you like; I frequently also use this as a 'note to self' space to keep track of run projects, failures, and things to remember for next time.
**I strongly recommend keeping a notes tab or some other separate space to keep track of what you're doing, what you did, and what you're doing next. Eventually your window will fill up enormously and it's easy for the screen to get very busy. Label everything clearly.**
![[Rodan documentation - new porject landing page.png]]
![[Rodan documentation - new untitle project first look.png]]
![[Rodan documentation - project landing page project title and description.png]]
# 1a) What am I seeing?
You should see four main tabs- "Workflow runs", "Run jobs", "Workflows", and "Resources".
**Workflow runs**: As the name implies, every workflow you've run will be listed here. Workflows are the umbrellas which encompass jobs- jobs occur within a workflow. Workflow runs will not always show the results of the jobs run.
**Run jobs**: Jobs run within a workflow will show here. If there is a download associated with the job or workflow you ran, this is where you'll see it. Similarly, if a job fails, or if you want to see the details of the run jobs which occurred within a workflow, click the relevant line here and you'll see the information in the far right column, in a bottom box labelled 'Job details'.
> NOTE: After you have run several workflows, it may take a moment for information to populate here. Give it a few seconds, and if nothing is showing up after a minute, refresh the page.
**Workflows**: Here is a list of the workflows you've created.
**Resources**: The items you're using as inputs for your workflows and jobs! Run jobs can populate further resources, such as a ZIP file. Similarly, manuscript image files that you would like to OMR will be added here.
### Importing Resources
To import resources, navigate to this tab and select 'upload resources'. A dialog box will pop up allowing you to select the file(s) you would like to upload. You can upload multiple files at once, including zip files.
Once you've selected all the files you would like to use, click open, and they will begin uploading.
NOTE: if you're just beginning, you don't need to upload all of the manuscript images you would like to use. Start small, become effective on your smaller set, and proceed from there. For the purposes of this walkthrough, I'd recommend starting with up to 5 images, though a sample of 1 is perfectly fine.
Once all of your resources are uploaded, it's time to make your first workflow. For the purposes of preparing an image for the OMR process, this means Background removal, and preparing an image for Pixel.js.
# 1b) What do these mean, and how do I navigate them?
**Workflows**
Workflows are the principal thing you will be making and maneuvering in Rodan, so it's helpful to know exactly what they are and how to manipulate them.
When you create a workflow within a project, you're doing exactly what the label implies: designing the building blocks along which your work, and the machine's, will flow to a defined end goal (the training, classification, and transcription of a music manuscript image).
> **Creating a workflow**
> "New Workflow" will create a blank row with "untitled" as its name. Double click on the row.
> Right click anywhere in the graph field, select 'Edit name/Description'. Name your workflow something that will distinguish it from later versions, and will be easily findable by you at a later date.
> I recommend something like "[Project name]-Learning Rodan-[Stage name, ex: background removal]-[Your Name]", or just "[Project Name]-[Stage]-[Your initials]". In the description you can add dates and notes to yourself.
To search for a specific title, navigate to the small icon near the top right corner and click it: a cursor will appear, and you can search by typing.
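If you'd like your workflow names to stay consistent across a project, the short pattern above can be sketched in a few lines of Python. This is only a convenience for building the text you then type into Rodan: `workflow_name` and `workflow_description` are hypothetical helpers of my own, not part of Rodan.

```python
from datetime import date


def workflow_name(project: str, stage: str, initials: str) -> str:
    """Join the parts in the suggested order: project, stage, initials."""
    parts = [project.strip(), stage.strip(), initials.strip()]
    if not all(parts):
        raise ValueError("project, stage, and initials are all required")
    return "-".join(parts)


def workflow_description(note: str, on: date) -> str:
    """Prefix a note-to-self with an ISO date so versions are easy to tell apart."""
    return f"{on.isoformat()}: {note}"


print(workflow_name("MS 073", "Background removal", "AB"))
print(workflow_description("first attempt, one image", date(2024, 2, 22)))
```

Putting the stage in the middle means that later versions of the same project sort together when you search for them.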
**Jobs**
Jobs exist inside workflows: they are the building blocks of the larger task your workflow is addressing.
>Inside the workflow field, right click and click 'add job'. Which jobs you add will depend on what stage of your process you are in.
> To search for a specific title, navigate to the small icon near the top right corner and click it: a cursor will appear, and you can search for jobs by typing.
**Resources**
Naming your resources something clear, such as "MS 73", is good, but you are eventually going to have hundreds of these. Ideally you want something that identifies not only the manuscript, but its page and any other relevant information. "MS 073 fol.1r" is clearer, and if searched will immediately take you to the relevant page you are looking for.
Generated resources will have an automatic title. You can edit this title, but try to keep some of the initial title- for example, at a later stage it is very helpful to immediately know if the resource is a ZIP of layers or not.
Generated: "Pixel_JS - ZIP"
Edited: "MS 0073 fol.x-x - Pixel_JS - ZIP"
If you are hesitant to edit the title, no worries: you can add LABELS. This is similarly an area where you can add relevant manuscript, project, page, or folio information.
```ad-tip
Use label names you will remember.
```
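The resource naming pattern above ("MS 0073 fol.x-x - Pixel_JS - ZIP") can be sketched as a small Python helper, in case you want to prepare a batch of names before renaming things in Rodan. `resource_name` and its parameters are my own hypothetical names; the zero-padding width is an assumption, so match whatever your project already uses.

```python
def resource_name(shelfmark: int, folio: str, generated_title: str = "", width: int = 4) -> str:
    """Build a name such as 'MS 0073 fol.1r - Pixel_JS - ZIP'.

    The generated title is kept at the end so you can still tell
    at a glance what kind of resource it is.
    """
    base = f"MS {shelfmark:0{width}d} fol.{folio}"
    return f"{base} - {generated_title}" if generated_title else base


print(resource_name(73, "1r"))
print(resource_name(73, "1r", "Pixel_JS - ZIP"))
```

Zero-padding the shelfmark keeps "MS 0073" and "MS 0730" from sorting next to each other in a long resource list.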
# Step 2) Background Removal, Pixel.js
Create a new workflow and label it as you would like [best practices].
Import the following jobs: PNG (RGB), BACKGROUND REMOVAL, and PIXEL_JS.
> **PNG(RGB)** is one of the resources you've uploaded- the image you would like to manipulate.
> **BACKGROUND REMOVAL** is the process which will strip the background of your image. [This job](https://github.com/DDMAL/background_removal/tree/release) classifies pixels of a manuscript into two categories: foreground and background, and removes the background. Foreground includes objects such as text, neumes, and staff, and background contains all pixels that do not belong to the foreground.
> **PIXEL_JS** is the interactive window into which all of the information that has just been processed will be channelled- it is here you will select and separate out the relevant areas that you will use to teach your model. Depending on what you would like to separate out, and what's present in your image, this at its most basic includes: notes (including clef), stafflines, text, and an empty layer.
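
To make the foreground/background idea concrete, here is a toy Python sketch. It is deliberately NOT how the background removal job works internally (see the linked repository for that): it swaps in a plain fixed brightness threshold, and the function names, the threshold of 128, and the tiny 2x2 "image" are all assumptions for illustration.

```python
def split_foreground(gray, threshold=128):
    """Return a mask: True for dark (ink-like) pixels, False for background.

    `gray` is a list of rows of 0-255 brightness values.
    A fixed threshold is a simplification used only for this sketch.
    """
    return [[px < threshold for px in row] for row in gray]


def remove_background(gray, mask, fill=255):
    """Keep foreground pixels and paint everything else with `fill` (white)."""
    return [
        [px if keep else fill for px, keep in zip(row, mask_row)]
        for row, mask_row in zip(gray, mask)
    ]


page = [[30, 200],
        [220, 90]]
mask = split_foreground(page)
print(mask)
print(remove_background(page, mask))
```

The point is only the two-category split: every pixel ends up either kept as foreground (text, neumes, staff) or discarded as background.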
##### Nodes per job
```ad-attention
title: How to add or assign a node
When in doubt in Rodan, right click. From adding jobs to assigning resources, right click will often pull up a relevant menu. In this case, right click on the red (or green) node, where a list of what you can do will pop up ("Assign Resources"). Right clicking on the job itself (such as "Pixel_js") will allow you to select "ports". Here you will be able to add inputs at the top of the menu, then scroll down to outputs, and at the bottom will display how many of each you have assigned.
```
![[Rodan documentation - right click a job.png]]
![[Rodan documentation - editing a job - input types.png]]
>PNG needs one import node and one export (the import node will be the resource you want processed, and you can assign by right clicking). The export square should automatically pop up. You will want to drag the export node to the top node of BACKGROUND REMOVAL as well as PIXEL_JS.
>BACKGROUND REMOVAL, you will notice, has two export nodes- essentially "RGB_PNG" and "empty layer". You may need to right click and 'add ports'.
```ad-note
title: TIP
icon: info
It can take Rodan a moment to reflect what you've clicked. After clicking in the relevant boxes, click out and refresh to make sure the ports are there, even if the work box doesn't show that you've selected another port. It's very easy to accidentally have a dozen ports from repeated clicks.
```
>You will need to add input nodes to the PIXEL_JS job. Assign as many nodes + 1 (n +1) as you will need layers. Generally this will mean **four** total nodes, given you will need: an empty layer, a staff layer, a notes layer, and a text layer.
![[RODAN - Images for background removal and pixel.js.png]]
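The n + 1 rule for PIXEL_JS input nodes can be written out as a quick Python check. I'm reading the "+ 1" as the empty layer from the list above, which is an assumption on my part; `pixel_js_input_nodes` is a hypothetical helper, not anything Rodan provides.

```python
def pixel_js_input_nodes(annotated_layers):
    """Return how many input nodes to add: one per annotated layer, plus one.

    Duplicate names are ignored so a repeated entry doesn't inflate the count.
    """
    unique_layers = list(dict.fromkeys(annotated_layers))
    return len(unique_layers) + 1


print(pixel_js_input_nodes(["staff", "notes", "text"]))
```

If the number you count on the job in Rodan is higher than this, repeated clicks have probably added extra ports.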
You can now run the job. It may take some time, so don't worry. We do recommend refreshing the page every now and then, as Rodan won't always immediately reflect completed jobs/workflow runs- click into and keep an eye on the 'run jobs' tab- you should have a row labelled with 'pixel_js' with a 'processing' status and a 'false' availability. When it's complete, you'll see the row look like this:
![[RODAN - Step 1 'waiting for input' label.png]]
At some point, in your 'workflow runs' window, the status will be labelled as 'finished', and it will be labelled 'waiting for input'.
Double-clicking on the row will pop up a new window in your screen (or update what you have there), showing a row at the bottom under 'resources'. Clicking this new row will add a new section to the column on the right hand side of your screen, at the bottom of which will be several buttons, among them 'DOWNLOAD' and 'VIEW'. You can view to see what everything looks like, but you'll need to download them for the next step.
This will open a new window on your computer- pixel.js!
**NOTE: pixel.js is big. I highly recommend using a desktop if possible, or otherwise being very patient.**
# Step 3) Pixel.js
A popup window should have now appeared. If it hasn't, ensure you have allowed popups from Rodan2.simssa and try again.
In the new popup you should see the image you uploaded, and at the top of the screen you should see the title of your project or file. NOTE: If this title is still "CDN Salzinnes", don't be alarmed, this is an ongoing feature/bug we're aware of. If this is still up in May, please go to our GitHub page and comment [here](https://github.com/DDMAL/Pixel.js/issues/268) that this is still present ("also have this issue" "still present" "bump").
At the top of your screen you will see three boxes- one which probably looks like a black box, one which looks like a bunch of sliders, and one which (FOLLOW UP ON THIS), as well as a zoom slider. By clicking the black square, you've entered editing mode. A tutorial explaining how to navigate and use pixel will pop up.
```ad-bug
title: the toolbox
To access the toolbox, scroll from the thin gap between the image viewer and the edge of the browser. It may be a thin white or grey line. This is how you will scroll down. Alternatively, enable 'view scrollbars' in your browser and manually scroll down to the bottom, where you should see the toolbox (if you are in editing mode).
```
Double click into each relevant layer to rename it. You are welcome to assign whatever color you like to whichever layer so long as you **remain consistent**. Personally I assign: yellow (staff), red (notes), blue (text).
#### Getting to work: Pixel.js
The first thing you should do is SELECT THE AREA YOU WILL WORK ON. **YOU DO NOT NEED TO WORK THE ENTIRE MANUSCRIPT IMAGE. NOT ONLY WILL THIS TAKE YOU FOREVER IT WILL TAKE FOREVER TO PROCESS. LOVE THYSELF.**
Having highlighted the relevant area, you should now see a vague color overlay of the area. This is your boundary.
```ad-warning
title: lag
Highlighting/brushing areas will have periods of exceptional lag if you are using a computer with insufficient memory. If you experience lag, move in small increments until you become more familiar. It is doable, but slow.
```
##### Highlighting stafflines
Click the layer you have assigned as your staff color, select the line tool, and get to work. For pages where the majority of the line is straight, I recommend making use of the keyboard shortcut to establish a line.
This is an area where, if you have a slow computer or limited memory, you will encounter extreme lag.
##### Highlighting notes
Select the 'notes' layer.
The square tool is your friend. Use the square tool to highlight the majority of the note, and then fill in the rest with the brush tool.
TIP: move small, move slow!
```ad-info
title: Count a clef and a custos (the symbol at the end of the line) as a note.
```
##### Highlighting words
With the exception of large initials (the big fancy letters starting chants), you will often have to use the brush tool. For rows of single downstroke letters (i's, m's, n's, l's, s's - which will often look like f's- and so on) you can use the square tool to highlight the majority of the letter.
```ad-quote
title: SLOW IS SMOOTH, SMOOTH IS FAST.
```
This process will take some time. The slower you move and more attention to detail you can devote, the better your sample will be for training.
#### Submitting to Rodan
When you're done, give everything a once over from a zoomed out perspective to make sure everything in your selected box is covered. Once you're happy, scroll to the bottom of the page, where you will see multiple options for getting your files.
You can submit to Rodan, which will automatically add the resources. If you select this option, DO NOT CLOSE PIXEL. Instead, change your window ("command" + "~" on Mac, "control" + "~" on Windows) to view your Rodan project page, navigate to resources, and refresh if necessary. At this time (2024-02-22) you will not receive a notification upon a successful export to Rodan- give it a moment and check.
You can also download the layers as individual files and as a ZIP file. I recommend this as a means to be certain. You can then upload these files as resources to your project.
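If you went the download route and want to confirm what's inside the ZIP before uploading it, a minimal Python sketch using the standard `zipfile` module can list the entries without extracting anything. The helper name `list_zip_entries` and the example path in the comment are hypothetical; I'm not assuming anything about what the files inside are called.

```python
import zipfile


def list_zip_entries(path_or_file):
    """Return (name, size_in_bytes) for each file in the archive.

    Only the archive's index is read, so nothing is unpacked to disk.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()
        ]


# Example use (the path is a placeholder for wherever your download landed):
# for name, size in list_zip_entries("my_pixel_download.zip"):
#     print(f"{size:>12,d}  {name}")
```

Counting the entries is a quick way to check that every layer you annotated actually made it into the download.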
Congratulations- you've annotated and sorted your first image!
```ad-tip
The more of these you annotate and then save, the more examples you have for the training phase. I've been generally successful with reliable results by annotating 5-8 images.
```
# Step 4) Training
You should now have several items in the resources window, among them a zip file. Look, don't open- it's huge. Next step is **training**.
The new workflow will use the **Training model for Patchwise Analysis of Music Document, *Training*** job (note: this is distinct from "Training model for Patchwise Analysis of Music Document, *Classifying*", so make sure to double check you have the right one). This job is also known as `Paco Trainer`. #PacoTrainer
### So- what is this? Why are we doing it?
The *Paco Trainer* (what's within this job) generates neural network models to classify pixels into OMR-relevant layers. It uses source images (manuscripts) and annotated layers (staff, neumes, background, text, …) as the input and generates models to automatically classify pixels of a manuscript into one of the annotated layers.
### Inputs and outputs for the trainer.
New workflow, label per [best practices].
You'll need to assign the resource generated from the previous step, which should appear both as individual files and as the zipped version. I recommend assigning the zip file.
![[Rodan documentation - Patchwise training workflow.png]]
Make sure as you're assigning your ports that the number of model outputs matches the number of layers. So, if during the pixel.js generating stage you wanted to produce 4 layers, make sure you have 4 MODEL LAYERS. **CRUCIAL: two model layers ALREADY EXIST- if you have 4 outputs, you only need to add two extra models**
![[Rodan documentation - patchwise training model ports.png]]
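The model-port arithmetic above trips people up, so here it is as a tiny Python sketch. The default of two pre-existing model outputs comes straight from the note above; `extra_model_ports` is a hypothetical helper for checking your count, not a Rodan function.

```python
EXISTING_MODEL_PORTS = 2  # per the note above: two model outputs already exist


def extra_model_ports(layer_count, existing=EXISTING_MODEL_PORTS):
    """Return how many model outputs you still need to add by hand."""
    if layer_count < 0:
        raise ValueError("layer_count cannot be negative")
    return max(0, layer_count - existing)


print(extra_model_ports(4))
```

The number of ZIP files you assign does not enter into this: only the number of layers matters.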
If your first runs fail, click into the run jobs tab, click the relevant row which failed so that the 'run job details' window pops up (should only take one click on the relevant row), and scroll to the very bottom of the 'error details' box. This should detail exactly what went wrong, and you'll be able to go back, fix it, and run it again. See below:
![[Rodan Documentation - Patchwise FAILURE.png]]![[Rodan Documentation - Patchwise WHY FAILURE.png]]
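If you copy the contents of the 'error details' box into your notes, the useful part is usually the very last line. A minimal Python sketch for pulling it out follows. The sample text is entirely made up for illustration and is not real Rodan output; `last_error_line` is a hypothetical helper.

```python
def last_error_line(error_details: str) -> str:
    """Return the last non-empty line of a pasted error log, or '' if there is none."""
    lines = [line.strip() for line in error_details.splitlines() if line.strip()]
    return lines[-1] if lines else ""


# Made-up example text, standing in for whatever you pasted from 'error details'.
sample = """Traceback (most recent call last):
  File "job.py", line 10, in run
ValueError: example message only

"""
print(last_error_line(sample))
```

Keeping that one line next to the workflow name in your notes makes it much easier to spot a repeat failure later.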
Be patient. When run, this will take a long time to process. Refresh occasionally, but go make lunch/a cup of tea or two. The more zipped files you assigned the longer this will take (hours). Plan accordingly!
Eventually, you will see the following:
![[Rodan Documentation - Patchwise complete SUCCESS.png]]
A sign something is wrong: not only will the job finish quickly, but the status will report 'failed'. Follow the steps mentioned above to locate the reason for failure, then navigate back to workflows and run the job again (likely you have too few or too many models! Remember that you have **one more than you think you do**).
#### As a note:
```ad-info
title: Multiple inputs
If you have multiple pages which you trained, separated, and marked, you can upload multiple ZIP files (produced by the previous Pixel.js download stage) in the 'add resource' stage of Patchwise(training), selecting multiple resources and moving them over. You do **NOT** need to add more model outputs- you only need as many models as you have layers.
```
# Step 5) Classifier from models
## fix this title to distinguish from symbolic classifying better/more clearly
This step uses the [`Training model of Patchwise Analysis of Music Document, Classifying`](https://ddmal.music.mcgill.ca/e2e-omr-documentation/overview/document-analysis.html#paco-classifier) job (`Paco Classifier` for short). It uses the trained models from `Paco Trainer` to classify pixels of an input manuscript into layers.
The inputs for this job are the manuscript that we're going to classify and the models generated by `Paco Trainer`. The outputs contain a log file and `rgba` images corresponding to each predicted layer's pixels. See [`Training model of Patchwise Analysis of Music Document, Classifying`](https://ddmal.music.mcgill.ca/e2e-omr-documentation/overview/document-analysis.html#paco-classifier) for details.
![[Rodan documentation - Patchwise, classifying image.png]]
Right click to add a job and search for 'classifying'; this job should pop right up. Click add, then close out. When it first appears, the top ports will be red until we assign resources.
>The first port will be for IMAGES. This will be the initial image(s) you submitted; select it again here (this is also why clear labeling of your resources is important- hunting items down, particularly when you have hundreds, will be tedious). Close out.
>The second port will be for MODELS. The models generated in the last stage are assigned here. Unlike the last step, where you needed to account for a background model and a log in your count, this is exactly as the label says- select the models (probably 3, aligning with the staff-notes-text process).
> The final port is for your BACKGROUND MODEL. Be careful to distinguish it from the other models produced and added to the second port.
## Step-by-step
> Inputs
1) Assign the image you're using as your resource for Port 1
2) Assign the models for Port 2 - NOTE: do not assign background model here
3) Assign the background model for Port 3
>Outputs
1) Port 1 will be Layer 1
2) Port 2 will be Background
3) Port 3: should be 'log file' (which is a log of the training process)
1) right click > ports > outputs > log
> Run
1) Right-click, run.
2) Wait.
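
Before you hit run, the input assignments above can be sanity-checked on paper, or with a small Python sketch like the one below. It only compares resource names you type in yourself: `check_classifier_inputs` and the example resource names are hypothetical, and nothing here talks to Rodan.

```python
def check_classifier_inputs(image, models, background_model):
    """Return a list of problems with the planned port assignments (empty if none)."""
    problems = []
    if not image:
        problems.append("Port 1 needs an image")
    if not models:
        problems.append("Port 2 needs at least one model")
    if not background_model:
        problems.append("Port 3 needs the background model")
    elif background_model in models:
        problems.append("the background model must not be assigned to Port 2")
    return problems


# Hypothetical resource names, for illustration only.
plan = check_classifier_inputs(
    image="MS 0073 fol.1r",
    models=["Model - staff", "Model - notes", "Model - text"],
    background_model="Model - background",
)
print(plan or "looks fine")
```

The most common slip this catches is the background model ending up in the MODELS port alongside the others.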
# Step 6) Classifying: Symbol Classification
## What is this?
Once the layers of a manuscript folio have been generated using the [Fast Pixelwise Classifier](https://ddmal.music.mcgill.ca/e2e-omr-documentation/overview/document-analysis#fast-pixelwise-classifier) and the models from the [Patchwise Trainer](https://ddmal.music.mcgill.ca/e2e-omr-documentation/overview/document-analysis#patchwise-trainer), it is time to classify the music symbols. While the processing of text and staff lines is straightforward, the music symbols layer contains various neumes, clefs, and custodes that require additional effort to differentiate. For this, the Interactive Classifier can be used.
### Interactive classifier
Setting this up requires several jobs and the adding of resources and some ports. Below is an image of what jobs you'll need along with what they do.
![[Rodan documentation - Interactive Classifier preparation (unlabelled).png]]
![[Rodan documentation - Interactive classifier preparation (labelled).png]]
This is the necessary preceding step before the active classifying stage, which is similar to the pixel.js stage in that it will open a separate browser for you to work on labelling the notes with their corresponding label.
Once all of the necessary resources and ports are added, run the job. When the job is complete, it will prompt you to open/interact with the Gamera graphical interface, designed to train a music symbol model. This will teach the model what the particular iterations of symbols used in your manuscript look like (for example, while all puncta look similar to the human eye, the variances between them make them distinct to a computer- we need to teach the machine what an MS73 punctum and all its possible little variations mean- namely that they're all puncta, etc).
### Using the Gamera-based interactive classifier (music symbols)
This link: https://github.com/DDMAL/Interactive-Classifier/wiki/How-to-Use has a lot of great resources, as well as all other information you might need. Because it's the github page for the tool, if you run into any issues- visual, UI, or processing- please immediately check for an existing issue (and bump it if it's significant/breaking), or open a new one.
When you're done labelling, the final stage is **FINALIZING**, which will generate 1-2 Gamera files: Training Data and Classified Glyphs, and a text file - Class Names.
Training Data contains all the glyphs that you manually classified, plus the original (optional) imported training glyphs. Classified Glyphs contains all of the page glyphs, both manually and automatically classified. Class Names contains all the classes and subclasses in the class tree.
Now that this interactive stage is complete, we can share our results with the non-interactive Gamera classifier, which will allow for much faster future processing.
### Non-interactive
(TBF)
# Music Reconstruction and Encoding
This stage has multiple steps, each of which requires the resources and information generated from the interactive classifier step. Before beginning, I highly recommend checking your resource names and labels and making sure you're able to distinguish between them/easily identify what each file contains. Onwards!
## Miyao Staff Finding
The Miyao Staff Finding job uses the staff lines layer (remember highlighting them in pixel? That layer- for a reminder, see the [document analysis step](https://ddmal.music.mcgill.ca/e2e-omr-documentation/tutorial/document-analysis)) and determines the characteristics of the staff/staves. It requires a specific black-and-white cleaned-up input image, much like the CC Analysis job previously.
Here, however, after despeckling the black-and-white image it should be dilated using the Dilate job to improve the results of the staff finding job. The steps to run the Miyao Staff Finding job are presented in the image below.
![[Pasted image 20240228144224.png]]
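To see why dilating helps staff finding, here is a toy Python sketch of binary dilation on a tiny black-and-white grid. It is an illustration of the general technique only: the 3x3 neighbourhood is my assumption, and this is not the implementation behind Rodan's Dilate job.

```python
def dilate(mask):
    """3x3 binary dilation: a pixel becomes 1 if it or any of its 8 neighbours is 1."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            out[y][x] = int(any(
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ))
    return out


# A staff line with a one-pixel gap in the middle (1 = ink).
broken_line = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
print(dilate(broken_line)[1])
```

The gap in the middle row is filled after dilation, which is the effect you want for faded or broken staff lines. The trade-off is that lines also get thicker.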
When you add the 'png rgb' job, you can right click and edit the name to add the [staff lines] distinction to remind yourself. Make sure you have the relevant image at hand.
## Heuristic Pitch Finding
## Text Alignment
# General Tips or Notes
- Once you have created a workflow, by clicking the 'workflow' label on the top ribbon of your screen in rodan, you will see the option to import other workflows. This can be a helpful way to see what other projects have done, even if you don't end up using them.
# notes to self/ideas
- add 'rodan for dummies' to READ.ME?
- problem at root here is that we lose previous versions. Where can we put this that it will be consistently/easily found or migrated with later workers/versions?
- make a general 'in lab' version, as well as a 'for newbies' version?
- next the 'step by step' option underneath the 'for DDMAL' instructions?
# Best practices
- [ ] What do we want to recommend as best practice for internal lab learning?
- [ ] Do we want to recommend a format for general users?
Friends don't let friends incorrectly label their sources.
Rodan sees all. Rodan judges. Don't get judged by Rodan- label your stuff.
![[Rodan is judging you (kaiju).png]]