ScanSort 1.81 27.08.99 sturedman@hotmail.com
Homepage: http://www.geocities.com/SouthBeach/Pier/3193/
--------------------------------------------------------------------------------
What is it ?
--------------------------------------------------------------------------------
Frustrated with the existing tools for managing scan collections, I decided to write
my own. The features are:
EASY handling: command line driven, no GUI
automagically sort new files to where they belong, even if they are renamed (YEAH!)
supports long filenames, Win32 only
process ALL your collections in one run, FAST
search directory trees for pics (e.g. complete CDs)
CRC-Checking, takes CSVs with or without CRCs
generate report files (like MTCM, or widely configurable)
span collections over multiple volumes
automatically repair some corrupt files
generate descript.ion files for ACDSee
support for trading (match reports against collections,
copy files to send to a directory or pack them into zipfiles)
create, update, verify, manage CSV files
create model based collections (search all pics for a specific girl)
You need collection descriptions in MTCM-CSV-format, that is
name,size,CRC32,optional description
MALPPR01,219416,1431ab7b,Lisa Matthews
You can get them e.g. at my homepage:
http://www.geocities.com/SouthBeach/Pier/3193/
--------------------------------------------------------------------------------
Why you need it
--------------------------------------------------------------------------------
The typical scan collector's "work day" goes like this:
1) get new pictures (newsgroups, email trading, web, ftp ...)
2) try to identify them:
a) look at pic 1 in the browser (search for small icons in the corners)
- oh, could be an AV-scan
b) look at the filename (damn - renamed to girl0317.jpg)
c) try to identify it by comparing the filesize with the values in the csv-file
d) rename it to the correct name (e.g. "AV_Rachel_Jean_Marteen_NU97_02.jpg")
3) move it to your collection directory (if it exists you have to check if the
file you already have is good or bad)
4) repeat 2-3 for all other pics
5) run your favorite collection manager
6) rename/erase files reported as bad or extra, run it again ...
YOUR workday will be:
1) get new pics (can't automate this), dumping EVERYTHING in one directory (e.g. "new")
2) run scansort; watch it moving new scans to their target directories and deleting
files you already have
3) (optionally) weed through the remainders (no more correct collection scans there)
Maybe you'll even find time to LOOK AT THE PICS ! :-)
--------------------------------------------------------------------------------
Quick Start
--------------------------------------------------------------------------------
To get started take the following example:
CSV-files are in c:\csv
copy some scans and some other stuff to c:\new, rename some of them (all for testing)
make directory c:\scans
cd \scans
scansort -s* -dcc:\csv \new
ScanSort reads all CSVs in c:\csv.
Then it starts looking for jpg-files in \new. Each picture is compared
against the database by size and CRC. If it matches an entry it is copied with the
correct name to the target directory (which is created if it doesn't exist),
but only if there is not already a file with the correct size.
Try and see !
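
The matching step described above (compare each picture against the database by size and CRC, regardless of its current name) can be sketched roughly as follows. This is only an illustration, not ScanSort's actual code: the function names and the in-memory index layout are assumptions, and the CRC is taken to be the standard CRC32 as computed by zlib.

```python
import zlib


def crc32_hex(data: bytes) -> str:
    """Return the CRC32 of data as 8 lowercase hex digits, as used in the CSVs."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")


def build_index(entries):
    """Build a lookup table from (collection, name, size, crc) tuples.

    The key is (size, crc), so a file can be found even if it was renamed.
    """
    index = {}
    for collection, name, size, crc in entries:
        index[(size, crc.lower())] = (collection, name)
    return index


def identify(data: bytes, index):
    """Return (collection, correct_name) for the file content, or None if unknown."""
    return index.get((len(data), crc32_hex(data)))
```

A file renamed to girl0317.jpg would still be found this way, because its name never enters the lookup.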
SOME useful commandline switches (there are LOTS more):
-m move files, that means delete source files. Source files are also deleted
if the target already exists, so it can report more files deleted than moved.
-ra create report files for all collections with at least one file
with -ra no source paths need to be specified
-Kr clean up your CSVs, removing all those obsolete ones and renaming them to their
correct names
Be careful - all switches are case sensitive, -m and -M have different meanings !
To remove any junk from your collections just use your collection tree as source
and an empty directory as target. Run ScanSort with -m . Then all good pics
are moved to the new location and only files with wrong size/crc remain.
Don't be afraid of those lots of switches - most are just for fine-tuning
and not needed in the beginning.
--------------------------------------------------------------------------------
Configuration File
--------------------------------------------------------------------------------
Usually you don't want to check your pics against all existing CSVs, only against
those you actually collect. And you don't want to type the same switches every time.
So you need a config file.
The configuration file tells ScanSort which of your CSV-files it should use.
It is a text file with one collection name per line, like this:
# This is a comment
-ra # you can put commandline switches at the beginning of the config file
-dcc:\csv # read CSV-files from c:\csv
-dpc:\scans # path for the collection
Eroscan
Skunkmaster
Scanmaster
Weatherby weatherbyscp wbyscp
Yeah, I have done it. I've CHANGED the format for 1.7. Don't stone me, guys (and gals ?)...
The old format (using the CSV name) was a real nuisance with different prefixes
(MTCM, McBluna, or nothing), slight spelling differences (WeatherbySCP, weatherby-SCP,
weatherby_scp) or several versions of the CSV name (Felines, Echoscan-Felines).
O.K., now you type in the name of the collections, just as you like it. I know this
takes some time, but if a collection isn't worth adding its name to the config file,
then it shouldn't be worth collecting.
The name in the config file will be the name used for reports and for the target
path. If you want to use a different one you can still do it with -p:
Harli_SWA
Harli_SWA_index -pHarli_SWA\index
L-Port -pd:\L-Port
You see, you can use both relative and absolute paths. You can also put several
collections into the same path.
You can also use different report names with -r:
Wscan_SWA -r # no reports for this one
ZorroScans_SDC -rZorros.txt
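
To make the config file layout concrete, here is a minimal sketch of how such a file could be read. It is an assumption-laden illustration, not ScanSort's parser: the function name and the dictionary keys are made up, names in quotes and the -rI switch are not handled, and every line starting with a switch is simply collected as a global switch.

```python
def parse_config(text):
    """Split a config file into global switches and collection entries."""
    switches = []     # lines that start with a switch, e.g. -ra or -dcc:\csv
    collections = []  # one dict per collection line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        tokens = line.split()
        if tokens[0].startswith("-"):
            switches.extend(tokens)
            continue
        entry = {"name": tokens[0], "alternates": [], "path": None, "report": None}
        for tok in tokens[1:]:
            if tok.startswith("-p"):
                entry["path"] = tok[2:]      # -pPATH: different target path
            elif tok.startswith("-r"):
                entry["report"] = tok[2:]    # -rNAME: report name, "" means no report
            else:
                entry["alternates"].append(tok)  # alternate CSV names
        collections.append(entry)
    return switches, collections
```

With this sketch, the line "Weatherby weatherbyscp wbyscp" yields the collection name Weatherby with two alternate names, and "Wscan_SWA -r" yields an empty report name.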
How does Scansort find the CSVs for the collections ? It always examines ALL of your
CSVs (which are usually all in one directory, but you can also use several CSV directories
with several -dc switches). It removes the prefixes "MTCM_" and "McBluna_", the suffix
"_(FINAL)", the text "scan" or "scans" and the number at the end. Then it takes what
remains, removes every character that is not a letter or a number (such as -_'°) and compares
what remains against the collection names. If several CSVs match the one with the highest
count (number at the end) is taken. Scansort relies on this number to be correct !
If the CSV is still named differently you can add several alternate names (like wbyscp
above).
Got it ? Now - if your collection is named "Light&Magic_HQ", which CSV won't be
recognized ?
a) MTCM_Light_&_MagicHQ.csv
b) Light_and_Magic_HQ_28.csv
c) McBluna_lightmagichq_28.csv
You got it, b). If you get non matching CSVs regularly you can beat them easily with
alternate names:
Light&Magic_HQ light-and-magic-hq
You may use upper-lower-casing, underscores, dashes, whatever for the alternate names-
all of these will be ignored.
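
The name matching rules above can be sketched like this. It is a rough approximation under several assumptions: the helper names are invented, only the prefixes "MTCM_" and "McBluna_" and the suffix "_(FINAL)" are handled, the trailing number is split off before anything else, and collection names are normalized the same way as CSV names.

```python
import re

PREFIXES = ("mtcm_", "mcbluna_")
SUFFIX = "_(final)"


def split_count(stem):
    """Split a CSV name into (name without trailing number, number or -1)."""
    m = re.search(r"[_\-\s]*(\d+)$", stem)
    if m:
        return stem[:m.start()], int(m.group(1))
    return stem, -1


def normalize(name):
    """Reduce a name to lowercase letters and digits for comparison."""
    s = name.lower()
    for prefix in PREFIXES:
        if s.startswith(prefix):
            s = s[len(prefix):]
    if s.endswith(SUFFIX):
        s = s[:-len(SUFFIX)]
    s = re.sub(r"scans?", "", s)  # drop the text "scan" or "scans"
    return "".join(ch for ch in s if ch.isalnum())


def pick_csv(names, csv_files):
    """Return the matching CSV with the highest count, or None.

    names is the collection name followed by any alternate names.
    """
    wanted = {normalize(n) for n in names}
    best = None
    for filename in csv_files:
        stem = filename[:-4] if filename.lower().endswith(".csv") else filename
        stem, count = split_count(stem)
        if normalize(stem) in wanted and (best is None or count > best[0]):
            best = (count, filename)
    return best[1] if best else None
```

For the quiz above, "Light&Magic_HQ" reduces to "lightmagichq", which matches a) and c) but not b), whose name reduces to "lightandmagichq". Adding the alternate name light-and-magic-hq makes b) match as well.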
If you are really, really, 100% sure you WANT spaces in the names of your collections
you can do this as well (although it sucks imho):
"Light & Magic HQ"
Prefixes and suffixes: the current trend is not to use prefixes at all any more. However
you can specify your own prefixes and suffixes now:
-pFoyle
-PMTCM -PMcBluna -PHI -P_SNF
The difference is that prefixes with -P are removed from the CSVs when using -K
(kill/cleanup), while -p prefixes will be left alone (except when using -Kr). The prefixes
-pMTCM -pMcBluna -pfinished -pfinal -pongoing (and their modifications)
are built-in, you don't need to specify these.
The switches are the same for prefixes and suffixes, use -P_SNF to remove the SNF from
CC_BigCenterfold_Renamed_SnF_548.csv, but leave SnF_Open_318.csv alone.
Be careful:
1) -P will remove prefixes and suffixes from ALL your CSVs (not only from those you are using)
2) -Kr will remove all prefixes and suffixes (even those given with -p) from the CSVs used
in your config file. This has caused some confusion.
The config file can have any name and could be placed in the same directory
as the CSV-files. Or, you can place it anywhere you like and specify the
CSV-directory with -dc (commandline or configfile).
You can create multiple config files of course (but use only one per run).
You can use wildcards for the collection name:
Harli* will select all Harli collections.
I do NOT suggest this however. When you use wildcards the actual collection
names have to be determined according to the CSV names AGAIN:
if Harli_SWA_40 gets replaced by Harli-SWA_42 it will create a new folder
like before (yuck). So don't use Wildcards.
The wildcard * will select ALL collections that were not selected before.
So you can put all other collection scans somewhere using
* -pMiscStuff
at the end of the config file.
If you have lots of collections and want a quick start you can run
scansort -s -dcc:\csv -xu
and then open the scansort.log in your text editor. There is a section in it
with all recognized collection names which can be pasted into your config file.
-xu writes to the logfile all collections for which there is a CSV but which aren't
used in the config file. That way you can easily add new ones to the config file
as they turn up.
You can (and should) put switches at the beginning of your config file, but NOT
after the first collection. However you can adjust the collection path and report path
(-dp, -dr) between the collection names to spread large collections over several
disks.
I have included 2 sample config files (sample1,2.txt) in the Zipfile.
sample1 can be a good template for your use, sample2 is an example of how
"professionals" can easily spread their collection over several disks.
You see, I've removed a few options that were there before because I found
them rather unnecessary. If I get enough complaints (well - friendly suggestions ;-) )
I may extend the format of the config file, but it will stay compatible from
now on.
--------------------------------------------------------------------------------
CSV-Files
--------------------------------------------------------------------------------
You need collection descriptions in MTCM-CSV-format, that is
name,size,CRC32,optional description
MALPPR01,219416,1431ab7b,Lisa Matthews
You can get them e.g. on my homepage.
Several people have asked for support for CSVs without CRCs
("Checker-CSVs"), so I have added it. You can even mix
entries with and without CRCs in one file !
I still suggest you check if you can't find a CSV WITH CRC.
Of course, files without CRC cannot be detected if they are renamed
(except upper/lowercase). There are lots of same-sized files in
the collection database.
Even CSVs with entries with zero length are now accepted
(only the correct entries), though you should forget that junk.
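
As an illustration of the CSV format, the following sketch parses one line into its fields. It is not ScanSort's reader: the names are made up, and it assumes that an entry without CRC either has no third field or an empty one, which the text above does not spell out. Entries with zero length are skipped, as described.

```python
import re
from typing import NamedTuple, Optional

_CRC = re.compile(r"[0-9a-fA-F]{8}")


class Entry(NamedTuple):
    name: str
    size: int
    crc: Optional[str]  # None for entries without CRC ("Checker-CSVs")
    description: str


def parse_line(line: str) -> Optional[Entry]:
    """Parse 'name,size,CRC32,optional description'; return None for unusable lines."""
    parts = line.strip().split(",", 3)
    if len(parts) < 2:
        return None
    name = parts[0].strip()
    try:
        size = int(parts[1])
    except ValueError:
        return None
    if not name or size <= 0:  # zero-length entries are ignored
        return None
    crc_field = parts[2].strip() if len(parts) > 2 else ""
    crc = crc_field.lower() if _CRC.fullmatch(crc_field) else None
    description = parts[3].strip() if len(parts) > 3 else ""
    return Entry(name, size, crc, description)
```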
--------------------------------------------------------------------------------
E-CSV-Files (extended CSV)
--------------------------------------------------------------------------------
There are a bunch of CSVs around with a new format:
img0018.jpg,129908,b8f1da32,\Denna\,
img0019.jpg,93470,d688bd6c,\Denna\,
img0020.jpg,147257,f4dbf161,\Denna\,
img0001.jpg,138347,51341f96,\Hellen
img0002.jpg,160207,477c0c43,\Hellen\,
img0001.jpg,258827,7504ede5,\Ingrid\,
img0002.jpg,246678,29a214a6,\Ingrid\,
You see, the comment is replaced by a sub path, and (often) many pics carry the
same name. Now many people collect these pics, and so I decided to support it
(instead of just raving about that nonsense :-[ ). This means the pics go
into a path named like the collection (as always) and, within it, into the subfolder
specified in the CSV. Reports work as well.
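
How the target path of an E-CSV entry could be derived is sketched below. The function name and the collection name "Demo" in the usage note are hypothetical; the sketch only assumes what is stated above, namely that the fourth field is a sub path and that anything after it is ignored.

```python
from pathlib import PureWindowsPath


def ecsv_target(collection: str, line: str) -> str:
    """Return collection\\subpath\\name for one E-CSV line."""
    name, _size, _crc, rest = line.strip().split(",", 3)
    # The sub path is the fourth field; extra comments after it are ignored.
    subpath = rest.split(",", 1)[0].strip("\\")
    return str(PureWindowsPath(collection, subpath, name))
```

For a collection called Demo, the first sample line above would end up as Demo\Denna\img0018.jpg, so the many files named img0001.jpg no longer collide.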
There are some limitations though:
- trading only possible as zips
- E-CSVs can be created with -CE, but not updated
- extra comments AFTER the path are ignored
- if you don't have the correct CSV those img???? pics are treated as bad
(because there are always pics with same name and different size).
I suggest you use -B to leave bad files alone (NOT -b !!!)
There's also a new switch -xb8, made especially for this problem, which leaves
alone all bad pics named with 8 or fewer characters.
--------------------------------------------------------------------------------
Running ScanSort
--------------------------------------------------------------------------------
I suggest you put ScanSort somewhere in your path (e.g. into windows\command).
Now open a Dos-Box and type
scansort
- it will show the command line syntax and all the switches it knows. Don't worry,
you won't need many of them for daily work ...
Besides the switches you have to supply the name and path of the config file (which should
be located in the same directory as your CSV-files) and the SOURCE PATHS (the directories
you want ScanSort to search through for new pics).
The TARGET PATH (where the pics are written to) is the actual directory,
so you have to cd to the place where you want them first.
--------------------------------------------------------------------------------
Examples
--------------------------------------------------------------------------------
(You don't want to read on and start right away ? - O.K.)
Places for files (replace them with your own):
CSV-Files: c:\csv
Config file: c:\csv\all.txt
(You should take your time to setup this using sample1.txt as template. Or you
can create a file with just a * in it.)
Your existing Scans: d:\old (in various subdirectories)
Incoming pictures in: c:\new
Target directory: d:\scans (empty at the beginning)
All examples assume that you have cd'd to the target directory or set the target path
using -dp in the config file.
First-time cleanup of your collection:
scansort -m c:\csv\all.txt \old
This moves all files from your old collections to the new place, sorting everything
to the correct places and renaming the files to the correct names from the CSV-files.
Only files with correct size and CRC are moved, everything else remains in the old
place.
Moving incoming files to their places:
scansort -m c:\csv\all.txt c:\new
Files you don't have are moved, those you already have are deleted. Source files
are only deleted if the copy was successful of course (someone asked me this).
Unknown files stay where they are.
Create MTCM/Colver-style reports:
scansort -rM c:\csv\all.txt
Reports are written to the target directory (unless specified otherwise in the
config file). Only collections with at least 1 file are reported.
You can specify a report directory with -drDIRNAME
Create Scansort-style reports:
scansort -ra c:\csv\all.txt
Make a list "have.txt" of all files currently in your collection
scansort -rH c:\csv\all.txt (-rHv to add the comments)
Import this list if you have moved out some of your pics (e.g. on CD)
scansort -hhave.txt c:\csv\all.txt [other options]
Match your files against someone else's report (creates ask.txt and offer.txt)
scansort c:\csv\all.txt -tao report.txt
(No reason to STOP reading now. You know: If all else fails, read the docs !)
--------------------------------------------------------------------------------
Commandline switches
--------------------------------------------------------------------------------
Commandline switches start with '-' or '/'. You can't join multiple switches
after one switch character, and switches must be separated by blanks:
-mv Wrong (v is ignored)
-m-v Wrong (v is ignored)
-m -v Correct
(Yeah, that's not standard, I know. It's because the switch parsing is completely
hand-made. Another thing that sucks is that -dc c:\csv causes an error - you
have to use -dcc:\csv . But people have got used to it, and nobody has complained
ever, so I've always been after more pressing features. :-) )
Be careful - all switches are case sensitive, -m and -M have different meanings !
-m move files (erase source files)
All files you already have in your collection are deleted from the
source path. (Those you don't have are copied and then deleted -
if the copying was successful.)
-u no uppercase for DOS-Filenames
By default, filenames with 8 or fewer characters (DOS-Names) are
converted to uppercase. This switch turns this feature off.
-_ Spaces in picture names suck and are thus converted to '_' by default.
Same story with umlauts and special characters - Kâtâ becomes Kata.
You can turn this feature off with this switch.
-v more messages (rather obsolete meanwhile...)
-a check all files (regardless of extension)
Useful if you got your pics renamed to pic.001, pic.002 or something similar.
Not needed any more for sorting collections with different file types.
-l don't write logfile
By default, a lot of info is written to "scansort.log" in the current path.
-sNAME process single collection NAME (no config file). -s* means "all collections"
If you use a config file with this option (must be before the -s switch in the
command line) just the -s - collections are processed (for speed-up):
scansort config.txt -sScanmaster -sSkunkmaster
-exyz set file extension to xyz (instead of jpg)
So you can sort collections of wavelet or fractal compressed pics in the future ! :)
-b always delete bad files (see below)
-K kill duplicate CSV-files (see below)
-T touch pics (set to current date/time) when moving into collection
-r give help on reports (to keep help on one screen...)
-t give help on trading
-M give help on model collections
--------------------------------------------------------------------------------
Choosing directories
--------------------------------------------------------------------------------
By default, ScanSort searches the CSV-files in the same directory as the
config file and everything else in the current directory (where you have
"cd"d to before running it. BTW: "directory" means the same as "path" or "folder").
Now you can override all of these:
-dcDIR set directory for CSV-files to DIR (instead of path of config file)
You can use several -dc switches if you want to sort your CSVs into
several directories.
-dCDIR this directory is searched for CSVs just like those with -dc, but after the run
all CSVs actually used are moved to the -dC directory (only if reports were
generated). If there are two -dC dirs the first gets the CSVs for incomplete,
the second the CSVs for complete collections. If the completion status changes
the CSV is moved back and forth as requested. (However if a CSV is not used
any more it stays in the -dC path and is not moved away.)
-drDIR set directory for report files to DIR (instead of current path)
You can override this by specifying an absolute path for the report
in the config file.
-dpDIR set collection directory to DIR (instead of current path)
You can override this by specifying an absolute path for the target
in the config file.
The pics are copied to Collectiondir\Collectionname
-dbDIR set directory where bad pictures go to (instead of "BadPictures")
(some more for trading, see below)
Warning: don't put a space between -dc and the name ! (You get a sensible error message now.)
Correct: -drc:\report Wrong: -dr c:\report
-d switches now support the ~ char which stands for your home directory under unix.
You can use this under Windows as well if you type something like set HOME=d:\scans
before running Scansort. Then -dc~\csv will set your CSV dir to d:\scans\csv .
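
The ~ handling can be pictured with the small sketch below. It assumes that only a leading ~ is replaced and that the value comes from the HOME environment variable, as the example above suggests; the function name is invented for illustration.

```python
import os


def expand_home(path, env=None):
    """Replace a leading ~ with the value of HOME, if HOME is set."""
    env = os.environ if env is None else env
    home = env.get("HOME")
    if path.startswith("~") and home:
        return home + path[1:]
    return path
```

With HOME set to d:\scans, expand_home("~\\csv") gives d:\scans\csv, matching the -dc~\csv example.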
--------------------------------------------------------------------------------
Reports
--------------------------------------------------------------------------------
There are two styles of reports:
1) -rM Mastertech-Style (Missing name length description)
2) -rhmiesa ScanSort-Style: separate sections for files you
'h'ave, 'm'iss, 'i'ncorrect and 'e'xtra files and the 's'ummary
- or 'a'll of the above
You cannot use 1) and 2) together.
Modifiers:
-rb brief (no descriptions)
-rc CRC-check all files (SSLLOOWW). Use this only if you think your harddisk could
be corrupt, or to verify a newly burnt CD. Remember: ALL pics were CRC-checked when
they were added to the collection !
-rf freshen: only create a report for a collection for which new pics were added,
or for which the report was missing or older than the CSV.
Needed to keep a web page up to date (to see which reports have changed).
-rE No empty reports: best used with -rmies (everything except the files you have).
Then there won't be reports for complete collections showing just the summary.
-rR recurse collection for report generation, like in versions before 1.61
This is not recommended and only needed if pics were manually moved to subfolders.
-rA report all collections in the config file, even the inactive ones (those of which
you have no pics at all). This also decides if inactive collections go to the summary.
-rn add numbers of have/all to report names (like CSA_239-240.txt )
-rr Don't generate reports. Now, what could this be good for ?
Well, I'm using -rmiesbofE in my config file, but don't want to generate reports
in every run. So I can
a) use -rr in the command line to suppress them for one run
b) use -rr in the config file to suppress them always and -rr in the command line
to turn them back on. (-rH and -rT will turn them on as well)
-rI some comment - add "some comment" to every report. This switch can only be used
in the config file.
Other forms of report :
-rd create "descript.ion" files for use with ACDSee (the BEST viewer available !)
-rD create descript.ion as hidden files
descript.ion is only created if there ARE descriptions in the CSV !
-rH create "have.txt" for multi-volume-spanning (see below)
-rS print summary for every collection in table form on the screen,
not only into the logfile
-ro output the collection summary to the file "summary.txt" in the report path.
(complete collections first, then the rest).
-rT create HTML-table for my homepage. Don't know if anybody else can use this. :-)
It also creates the CSV zips I supply automatically. -rTv ("vacation") omits
the links for the text files to offer only the CSVs.
This option searches for a file "trade_tp.html" in the report directory and creates
a file "trade.html" out of it. It compares line by line:
CHANGE-DATE insert current date/time
COMPLETE-TABLE insert table of complete collections
INCOMPLETE-TABLE insert table of incomplete collections (you guessed that ;-) )
I've included a trade_tp.html for you (based on my trade page). After editing it with
an HTML editor (like Netscape Composer) open it in a text editor and make sure that
each of the tags above is still on a line of its own, as it was before !
-rT also creates a file requests.zip with all your reports and your missing.csv.
-rx export all missing pics into "missing.csv" to get a quick overview
(only missing pics from active collections, that is, collections you have at least
one pic of)
-rX export missings into CSVs (one per collection). Yeah, you've talked me into
this. Now I just hope everybody will be able to tell the difference between
"real" CSVs and requests on the newsgroups... :-|
-rN creates a bunch of filters for Forte Newsagent, both with and without filesizes
as suggested by Eric. -dNpath sets the directory where these filters go to.
Try and see yourself. I don't use Forte Newsagent myself, but was told this
works great.
You can group all report switches (MhiesabcdDHvSoT) after only one -r.
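
The template mechanism of -rT described above (trade_tp.html is compared line by line and the three tags are replaced) could look roughly like this. This is a sketch only: the function name, the date format and the way the tables are passed in are assumptions, and a tag is only replaced when it stands on a line of its own.

```python
from datetime import datetime


def render_trade_page(template, complete_html, incomplete_html, now=None):
    """Fill in trade_tp.html style text and return the trade.html text."""
    now = now or datetime.now()
    replacements = {
        "CHANGE-DATE": now.strftime("%d.%m.%Y %H:%M"),  # assumed date format
        "COMPLETE-TABLE": complete_html,
        "INCOMPLETE-TABLE": incomplete_html,
    }
    out = []
    for line in template.splitlines():
        # Only a line consisting of nothing but the tag is replaced.
        out.append(replacements.get(line.strip(), line))
    return "\n".join(out) + "\n"
```

This also shows why the tags must stay on their own lines: a tag buried inside other HTML on the same line would be left untouched by a line-by-line comparison like this one.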
Hint: ScanSort is a sorter, not just a checker. I suggest you use ONLY ScanSort
to add new files to your collection. Then you will NEVER have any
bad or extra files there !
Because of this, there is usually no need to CRC-check pics for reports.
I only suggest it if you suspect your filesystem to be screwed up...
You can put bad sized files manually into your collection directory,
they will be replaced when the correct file comes up,
but you really shouldn't if you do any trading.
NEVER put files with correct size/wrong CRC there because they won't be
recognized as bad and so won't be replaced when you get the correct file !
(if you want to keep them change the size: copy /b bad.jpg + small.txt bad1.jpg )
--------------------------------------------------------------------------------
Sorting other pics than JPGs
--------------------------------------------------------------------------------
You COULD switch Scansort's default extension from jpg to gif with
-egif
but what you usually need is support for collections with images of multiple
types (jpg and gif, mpg, avi). This works. Scansort checks all filetypes which
appear in any of the used CSVs (yeah, -a is not needed any more for this).
Keep in mind that Scansort is designed and tested for sorting PICS and not
100-MB-movies. Every file checked is kept in memory during the process, so
put enough RAM into your PC for sorting big movies. (BabeTV_SWA works fine
on my computer with 32MB RAM).
--------------------------------------------------------------------------------
The Wastebasket
--------------------------------------------------------------------------------
A nasty bug in version 1.62 could lead to deletion of pics under certain
circumstances. To get rid of such problems once and for all, I changed the
behaviour of file deletion:
Obsolete files are now not deleted, but moved to a certain "Wastebasket"
directory. So if a file should get erroneously deleted, you can always find
it there afterwards and restore it. The only exceptions to this rule are
pics that were MOVED to the collection (that means: successfully copied
to a new place), these don't get a copy in the waste since they are still there.
The Wastebasket folder is "ScanSortWaste" in the current directory;
you can change this with
-dwd:\waste (to d:\waste)
-w will turn off the feature and have the files really deleted again.
I'm quite sure you won't need this feature in the future, but better safe than sorry...
Of course you shouldn't forget to empty this folder every now and then.
Remember, pics there are probably already in your collection, or bad pics with good
ones existing.
If Scansort tries to move a file to the wastebasket (or to the Bad Folder, see below)
and there is already a pic of that name there the new pic gets renamed.
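
Moving a file to the wastebasket without overwriting an existing one can be sketched as follows. How ScanSort actually names the renamed pic is not stated here, so the "_1", "_2" numbering scheme and the function names are purely assumptions for illustration.

```python
import shutil
from pathlib import Path


def unique_target(directory: Path, name: str) -> Path:
    """Return a path in directory that does not exist yet (name, name_1, name_2 ...)."""
    target = directory / name
    stem, suffix = target.stem, target.suffix
    n = 1
    while target.exists():
        target = directory / f"{stem}_{n}{suffix}"
        n += 1
    return target


def move_to_waste(src: Path, waste_dir: Path) -> Path:
    """Move src into the wastebasket directory, renaming it on a name clash."""
    waste_dir.mkdir(parents=True, exist_ok=True)
    target = unique_target(waste_dir, src.name)
    shutil.move(str(src), str(target))
    return target
```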
If anybody knows how to delete files into the Windows Wastebasket please send me
a source code example and I'll be glad to add this feature. I found nothing about
it in the documentation.
--------------------------------------------------------------------------------
Handling of bad files
--------------------------------------------------------------------------------
Bad files are files with a valid name, but wrong size or CRC.
ScanSort puts these into the directory "BadPictures" unless there
is a good version in the collection. If you use -m they are deleted
afterwards.
-b If you use the -b option bad files are never copied and always deleted
(but first checked if they can be repaired, see below).
-b40 only copy bad files if their size is at least 40% of the original size,
otherwise delete (with -m). This was the default behaviour until version 1.61,
and was changed after someone lost a bunch of files with stupid names
(there are lots of collections with pics named "img0001", "img0002", ...).
Use -b57 for 57%, or whatever.
-dbDIR specify another target directory DIR for bad files (WORKS NOW)
-B If you use the option -B bad files are left alone completely (but still
checked if they could be repaired)
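
The rules for bad files above can be summarized in a small decision function. This is a simplified reading of the text, not the real implementation: the function name, parameter names and return values are made up, the repair check is left out, and "discard" here means the file is not copied to the bad directory (whether the source is deleted still depends on -m).

```python
def bad_file_action(actual_size, expected_size, *, min_percent=0,
                    always_delete=False, leave_alone=False,
                    good_copy_exists=False):
    """Decide what to do with a file that has a valid name but wrong size or CRC."""
    if leave_alone:        # -B: leave bad files alone completely
        return "leave"
    if always_delete:      # -b: never copy bad files
        return "discard"
    if good_copy_exists:   # a good version is already in the collection
        return "discard"
    # -bNN: only keep the file if it has at least NN percent of the original size
    if actual_size * 100 < expected_size * min_percent:
        return "discard"
    return "copy"          # goes to the BadPictures directory (or -dbDIR)
```

With -b40, a 50000 byte file for an entry of 219416 bytes is below the 40% limit and is discarded, while a 90000 byte file is kept as a bad copy.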
--------------------------------------------------------------------------------
Handling of extra files
--------------------------------------------------------------------------------
If you follow my suggestions and use ONLY Scansort to copy pics into your
collection there should never be any "extra files" (files not mentioned in the
CSV) there. Well, after downloading new CSVs those extras often appear out of
thin air. The reason is usually that someone fixed a bug or typo in the CSV,
or even reorganized it (moved some pics to a new CSV).
Since the computer should handle at least simple problems automatically, Scansort
tries its identification function on every extra file it finds when generating
a report (only when you use the move option -m). Usually the pic is identified
as a (slightly differently named) member of the current collection and renamed.
This works as well if the pics were moved to a new CSV (if the new CSV is included
in your config file).
If you use the switch -E extra files are always removed, even if they couldn't
be identified. You shouldn't use this by default !
Another nuisance fixed: if you change your CSV supplier you often find that the
pics are named the same, but with different casing (all lowercase/first uppercase).
This looks bad in my opinion, so pics with correct name but different casing
are now renamed according to the CSV automatically.
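The casing fix boils down to a case-insensitive lookup of each file name in the
CSV. A minimal sketch in Python, assuming the names are already available as
plain lists (the function name case_renames is hypothetical, and this is not
ScanSort's actual code):

```python
def case_renames(csv_names, disk_names):
    """List (old_name, new_name) pairs for files whose name differs
    from the CSV entry only in upper/lower case."""
    # map the lowercased name to the spelling used in the CSV
    canonical = {name.lower(): name for name in csv_names}
    renames = []
    for name in disk_names:
        target = canonical.get(name.lower())
        if target is not None and target != name:
            renames.append((name, target))
    return renames
```

Files that are not in the CSV at all are not touched here; those are the
"extra files" described above.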
--------------------------------------------------------------------------------
Multi-Volume spanning of collections
--------------------------------------------------------------------------------
When you collect successfully, you will soon run out of disk space :-(
and start moving files to CD-Rs or Zipdisks. No problem, ScanSort supports
spanned collections !
-hhave1.txt imports a "havelist" from the file "have1.txt".
This list contains size, CRC and name of files you have elsewhere,
which are then registered as "already there". If these files come
up again, they are NOT copied to the target path
and deleted if you use -m.
There are two ways of creating a havelist (which is always named "have.txt"):
1) scansort c:\csv\all.txt -H d:\pics (-Hv to add comments to havelist)
This searches d:\pics (recursive) for pics from the collections in all.txt.
The same search algorithm is used as when checking new pics: all pics are
checked for size and CRC, regardless of their name and path.
This takes a long time, but allows creation of havelists from CDs with
incorrectly named pictures or picture paths. Also you only have to do this ONCE.
2) scansort c:\csv\all.txt -rH (-rHv to add comments to havelist)
This reports only files from your correctly sorted current collection. It only
checks filenames and filesizes and is really FAST.
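To show why method 1) is slow but thorough, here is a minimal sketch of a
search by size and CRC in Python. It is not ScanSort's code. It assumes the
CRC is the common CRC-32 as computed by zlib (the text above only says "CRC"),
and the names file_crc32, find_known_files and the dictionary "known" are
invented for this example.

```python
import os
import zlib

def file_crc32(path, chunk_size=65536):
    """CRC-32 of a file, read in chunks (assumed checksum type)."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

def find_known_files(root, known):
    """Search root recursively; name and path of the files are ignored.

    known maps (size, crc) to the picture name from the CSV.
    Returns a sorted list of (size, crc, csv_name) for every match.
    """
    sizes = {size for size, _crc in known}
    have = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size not in sizes:
                continue  # cheap size test first, CRC only when needed
            crc = file_crc32(path)
            if (size, crc) in known:
                have.append((size, crc, known[(size, crc)]))
    return sorted(have, key=lambda entry: entry[2])
```

Every candidate file has to be read completely to get its CRC, which is where
the time goes. Method 2) skips this step and trusts names and sizes.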
You can easily combine (or convert to verbose) havelists by importing them
with -h:
scansort c:\csv\all.txt -hcd1.txt -hcd2.txt -rHv -dpd:\EMPTY_DIRECTORY
I suggest you create one havelist for each collection volume you have.
(you have to rename them because the created file is always named "have.txt")
Like all options the -h option can be placed at the beginning of your config file !
In Version 1.50 I changed the format of the havelist to standard CSV format.
Old havelists are still accepted. (However this caused Scansort versions before
1.8 to ignore all pics in text havelists which were named only in digits,
like 19990703.jpg. :-( Binary havelists were always fine, and the bug is fixed now.)
If you want to import the havelist into another application like a database
you can use -rHs or -rHsv to 's'uppress the collection comments.
Then the collection name is added for each line between CRC and comment, and formatting
spaces are removed.
The comments in the havelist start with #, so that pics starting with that character
are ignored. This concerns the #SWA_1st_Anniversary collection only for all I know.
Make a binary havelist for this one. (Now that I think of it I could have removed the
stupid char from the filenames. Oh well, you can't swat all flies... ;-) )
If you create a summary file (-ro) there will be a list for each collection
where the pics are located. HD means in the collection on your hard drive,
otherwise the names of the have files are shown. So I suggest you rename
the have files to cd1.txt, cd2.txt, zip1.txt and so on.
If you have several CDs burned importing all the have files can get a bit time
consuming. Therefore I created a new binary havelist format:
-Hb
-rHb will create a binary file have.bin which can be read with -h much faster.
I suggest you create a standard text havelist for each CD (-rH or -rHv),
and a binary one for faster import as well.
You can also convert a text list to binary :
scansort config.txt -hcd1.txt -hb
-hb will convert ALL text havelists read into binary havelists (cd1.txt becomes cd1.bin).
The text havelists are kept of course, but if a binary havelist with the same name exists
already it is overwritten. It doesn't matter what CSVs are used for the run (but at least
one is necessary).
The pics in the havelist are recognized by their size and CRC, they will stay
"have" even when they get renamed in the CSV later. Remember that you may have a
pic under the old name on your CD if you have backed up a partial collection.
If you want to create a new havelist you should usually tell Scansort to ignore
all other havelists in the config file (or the new one will include all the other
ones together with the stuff on the harddisk). You can do this with the switch
-hx ignore all havelists
--------------------------------------------------------------------------------
Trading
--------------------------------------------------------------------------------
"Trading" means contacting other Scancollectors and exchanging pics with them.
There are many trade pages on the Web (e.g. mine :) ) where people list their
collections in the form of report files. You download one of these reports,
send some of the missing pics to the trader and ask for some pics YOU miss
that HE has.
(Many people on IRC state the opinion that rather than trading everybody
should give away everything for free. Well, this may be o.k. for 10 fresh
released pics, but find somebody who will send you 800 without getting
anything back...
Anyway, you can use the features described below of course for "giving"
or "asking" instead of "trading" just as well. )
Now imagine you are trading Scanmasters with somebody. You each have 600 of the
1000 pics in the collection. Now you want to send 100 pics YOU have that
HE hasn't and choose 100 YOU miss that HE has. If you ever have done this you
know that this is an utter pita... This is just dumb work and an ideal job
for a dumb, fast computer !
You need to download the trader's report and save it to a file. Now the fun
begins (for me), because there are as many different report styles as there
are collection checkers (MANY)... I have tried to beat most of them, but
if you stumble upon a report ScanSort doesn't accept mail it to me so I can
try to support it.
There are three main styles of reports:
1) Mastertech (all files alphabetically with 'valid' or 'missing' as first word)
2) ScanSort (different sections for having or missing files)
3) simple (just the names of the missing files)
Also some guys (like me) only upload reports of files they MISS. If you
ever downloaded a 1000-line-report over a slow connection to find out which
two pics the dude was missing you know why...
Switches for trading:
-t help for trading
-dt set trading directory for files generated or copied (default: current)
-ds set source directory for files to be copied (default: collection dir)
(if you have this collection e.g. on CD-ROM)
You can give either the full path including the name of the collection
or just the collection base path. The standard collection dir is still
searched, so you can keep part of your collection on CD and part on hard disk.
You can specify multiple source directories (each with -ds), but if
you have spread your pics over multiple CDs and you have only one drive,
then you're out of luck :-[
(You have to run Scansort once for each CD, which is no problem since
the still missing pics are written to need.txt .)
-ta make list "ask.txt" of files to 'a'sk the trader for
-to make list "offer.txt" of files you can 'o'ffer to the trader
-tm make list "missing.txt" of files 'm'issing in both collections.
Use this to ask a third party for the pics you both need.
-tgNR copy NR files you can offer to the trading directory (e.g. -tg20)
Pics you have and the other misses are chosen at random and copied
so that you can easily move them to a zip-file and 'g'ive them to the trader.
If you don't give a number, or if the number is more than you can offer,
all files are copied.
If the number is 500 or more it is interpreted as kilobytes, not pics.
A file need.txt is written with all the pictures you have not sent yet
so you don't lose track and send pics twice ! Use this file next time
instead of the report.
If need.txt exists already it is saved to need.bak.
If any pics were given the original report is DELETED, so you don't
get confused later which files were given. (The remaining filenames
are written to need.txt.)
-tzNR works only together with -tg. The files you give are not copied but
stored in Zipfiles using InfoZip (zip.exe must be in your path !)
The number tells how many pics will go into each zip.
If the number is 500 or more it means maximum size in kilobytes
per zip.
Files are zipped by alphabet, so the zipfiles may be below the size
you specified.
A logfile "zip.log" is written with the contents of every zipfile.
If you run Scansort several times it always appends to this logfile,
so you should delete it before you start.
Why use InfoZip ("zip.exe") ?
InfoZip is a Unix Zip program which was ported to Win32.
It can be called from the command line (unlike WinZip) and supports
long filenames (unlike PKzip).
You can download it from my homepage.
Compression is turned on by default. To turn it off, type
SET ZIPCMD=-0
(you can put this in your autoexec.bat if you like)
-tA, -tO, -tM, -tG
Same as -ta, -to, -tm, -tg, except that the collection name is appended
to the name of the created text file. With -tGz, the zips are
named "Collectionxx.zip" instead of "givexx.zip"
Options for use together with -tg
-tZname Set the basename of the generated Zips to "name" instead of "give"
(this is quite obsolete now; use -tGz instead)
-tr choose pics at random (instead of from the beginning of the collection)
-tf fake it (don't copy anything, useful to see how much would be copied)
-tF same, but don't check if files actually exist (makes a difference
when you use multi-volume spanning)
-tw trade 'w'hole collections. You need this if your partner wants a complete
collection from you, and so you have no report. Use -tw and the name
of the collection (e.g. Scanmaster) instead of the name of the report file.
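The number rule shared by -tg and -tz (below 500 means pics, 500 or more means
kilobytes) and the alphabetical filling of the zips can be sketched like this.
It is a minimal Python illustration, not ScanSort's code: parse_limit and
plan_zips are invented names, a kilobyte is assumed to be 1024 bytes, and
"by alphabet" is assumed to ignore case.

```python
def parse_limit(number):
    """Interpret the number after -tg or -tz.

    None -> no limit, below 500 -> count of pics, 500 or more -> kilobytes.
    """
    if number is None:
        return ("all", 0)
    if number >= 500:
        return ("kilobytes", number)
    return ("pics", number)

def plan_zips(files, number):
    """Split files into zip batches in alphabetical order.

    files  : list of (name, size_in_bytes)
    number : the number given after -tz (must not be None here)
    Returns a list of batches, each a list of file names.
    """
    unit, limit = parse_limit(number)
    batches, current, used = [], [], 0
    for name, size in sorted(files, key=lambda f: f[0].lower()):
        cost = size / 1024 if unit == "kilobytes" else 1
        # start a new zip when the next file would exceed the limit
        if current and used + cost > limit:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += cost
    if current:
        batches.append(current)
    return batches
```

Because the files are taken strictly in order, a zip is closed as soon as the
next pic does not fit, which is why zips can end up below the requested size.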
The name of the reportfile from which the information is taken replaces the
source directory in the commandline. You can't sort in any pics in trade mode.
If you want to ask from a different collection than offer, it takes two reports
and two runs of ScanSort.
In versions before 1.8 only reports which include the filelength for each pic after the
name were supported. Now it can handle simple reports (just the missing names) as well.
Of course Scansort can determine the collection then only if the first name is unique.
If you have a report which is not supported mail it to me and I'll see what I can do.
You can speed up trading a lot if you know the collection(s) used:
scansort config.txt -sScanmaster -sSkunkmaster need*.txt -tGz
will only load the two CSVs for Scanmaster and Skunkmaster and leave the other 1000 alone.
I got several mails from people having problems with trading so here are examples.
Still it is an "advanced" feature so you should first get a bit familiar with
ScanSort.
Don't misunderstand the concept - ScanSort doesn't compare TWO reports against
each other but ONE report against your collection. So you have to specify a
configfile and cd to your collectiondir (or give it with -dpDIR), like always.
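In terms of sets, the three lists written by -ta, -to and -tm come out of one
report and your own collection as follows. This is a minimal sketch in Python
with an invented function name (trade_lists); it assumes the report has already
been parsed into the set of names the trader has.

```python
def trade_lists(collection, mine, theirs):
    """Compare ONE report against your collection.

    collection : all picture names in the CSV
    mine       : names you have
    theirs     : names the trader has (taken from the report)
    """
    collection = set(collection)
    mine = set(mine) & collection
    theirs = set(theirs) & collection
    return {
        "ask": sorted(theirs - mine),        # -ta: he has, you miss
        "offer": sorted(mine - theirs),      # -to: you have, he misses
        "missing": sorted(collection - mine - theirs),  # -tm: both miss
    }
```

With a collection of a, b, c, d, where you have a and b and the trader has
b and c, you ask for c, offer a, and d is missing in both collections.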
Example:
file trade.txt (in d:\trade)
#
# CSV dir
-dcd:\gra\csv
# Collection dir
-dpd:\gra\scan
# Collections currently traded
MTCM_Rhabdo_234.csv
MTCM_Scanbyte_421.csv
McBluna_Simulator_730.csv
McBluna_Riptorn_168.csv
# end of file
You want to complete your Rhabdos, your friend "John" his Scanbytes.
John doesn't use ScanSort, so you have to do all the work.
Cd to d:\trade where you have stored the reports rhabdo.txt and scanbyte.txt
which John sent you.
1)
scansort trade.txt rhabdo.txt -ta
creates a file ask.txt with all the Rhabdos John has and you need.
You can send this to John so he can easily select the files for you.
Watch out for the messages ! If the report is ScanSort-style with the header
removed, Scansort could become confused about whether the files are "have" or "miss".
Then you should put the keywords "have" or "miss" into the report
(in the line before the first picture)
2)
scansort trade.txt scanbyte.txt -to
creates a file offer.txt with all the Scanbytes you have and John needs.
3)
scansort trade.txt scanbyte.txt -tg20
- copies the first 20 Scanbytes that John needs to the current dir.
scansort trade.txt need.txt -trg30z5
- copies 30 MORE random Scanbytes that John needs to the current dir in Zipfiles
with 5 pics each (zip.exe must be in path)
(need.txt was generated in the last run and has all files John needs
except the 20 you already copied)
scansort trade.txt scanbyte.txt -tgz1000Zscb
- copies all your Scanbytes that John needs to the current dir in Zipfiles with
less than 1000k each. Files are named: scb01.zip, scb02.zip, ...
If you have moved your Scanbytes to a CD (see multi-volume-spanning):
scansort trade.txt scanbyte.txt -hhave.txt -dse:\Scanbyte -tg500z1000
( you may want to put the -h and the -ds options into the config file trade.txt)
I suggest that you arrange with John how much you want to trade in the next
few days. Then prepare all the Zipfiles needed in advance and send them in
the quantity you decided. When you have run out of them you should send each other
new reports to resync.
If John uses ScanSort too you can just send him a plain report instead of the ask.txt.
--------------------------------------------------------------------------------
Matching whole collections
--------------------------------------------------------------------------------
O.K., you got an eager friend and a fast connection, so you want to go for the
BIG thing - give him ALL files he needs.
1) Ask him to send you all his reports, as individual files (in a zip file probably)
2) scansort all.txt -tg *.txt
This is essentially the same as single collection trading repeated for each
report file *.txt. There are a few differences:
- verbose names are used always for text files or zipfiles
- if you don't zip the pics they go to individual folders for each collection
The number of pics / kilobytes you give is honored now. So you can easily
prepare 40 MB for transfer:
scansort all.txt -tg40000 *.txt
Scansort runs through the reportfiles. If all pics from one report are copied it is
deleted, if not it is replaced by need_NAME.txt. If you have spread the collection
over multiple CDs you have to run ScanSort once for each CD.
--------------------------------------------------------------------------------
Automatic repair of corrupt files
--------------------------------------------------------------------------------
Don't expect too much here !
I have stumbled upon several pics with correct size but wrong CRC. When I examined
them I found out that just the last byte was set to zero. Now, the correct ending
of a JPG is always 0xff 0xd9, so if a file has wrong CRC and the ending is wrong
ScanSort replaces the ending and checks the CRC again.
There are also sometimes files which are a bit longer than they should be
because they got some extra bytes appended at the ending by buggy mail programs.
If they still have the correct name they are identified though and copied
to your collection (without the extra bytes of course).
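Both repairs described above (cutting off appended bytes, restoring the
0xff 0xd9 ending) can be sketched in a few lines of Python. This is only an
illustration, not ScanSort's code: try_repair is an invented name, the whole
file is held in memory, and the CRC is assumed to be CRC-32 as computed by zlib.

```python
import zlib

JPEG_END = b"\xff\xd9"  # a correct JPG always ends with these two bytes

def try_repair(data, expected_size, expected_crc):
    """Return the repaired file contents, or None if nothing helped."""
    def is_good(blob):
        return (len(blob) == expected_size
                and zlib.crc32(blob) & 0xFFFFFFFF == expected_crc)

    if is_good(data):
        return data
    # extra bytes appended at the end: cut back to the expected size
    if len(data) > expected_size:
        data = data[:expected_size]
        if is_good(data):
            return data
    # right size, wrong ending: put the end marker back and check again
    if (len(data) == expected_size and len(data) >= 2
            and not data.endswith(JPEG_END)):
        fixed = data[:-2] + JPEG_END
        if is_good(fixed):
            return fixed
    return None
```

The CRC is checked again after every change, so a file is only accepted when
the repaired contents really match the CSV entry.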
This works only for some few files. Most corrupt files have data missing and can't be