I have a number of scores which are currently awaiting time to process. If any readers have particular requests among these, let me know and I may give them a higher priority. In the absence of any convincing (and polite) requests, I'll continue working at my own pace and to my own judgement.
This has been suggested by a visitor to this page. (My own wishlist is long and impractical.)
In the many hours I've spent in music libraries, I don't recall ever seeing this work. It appears to be available from the British Library, but acquiring it would incur an inter-library loan fee.
I still have scans of these on a HDD somewhere, but as they have since been uploaded from other sources I'm unlikely to do any further work on them.
I started by scanning scores from my personal collection in my local public library. Their low-budget Canon flatbed scanner performed well enough, so when I decided to buy my own scanner I bought a Canon LiDE 210, also because being USB-powered it would be portable. A downside to this is that the low-power illumination requires pages to be absolutely flat on the glass, so it may not be suitable for fragile or historically valuable documents.
By a happy coincidence, this model also works very well with XSane up to 600dpi, though at 1200dpi several narrow black vertical lines appear in XSane's output.
At various times I've used Scangear (windoze), VueScan (windoze) and XSane (Linux) as scanner front-ends. All these have produced excellent results. For further processing I use ImageMagick and Gimp. In the local public library I used their installation of Photoshop.
My scanning method differs from those suggested elsewhere on this site. I scan all music pages in greyscale, never in monochrome. Firstly, the deskew and resampling functions which I sometimes use give much better results with greyscale images. Secondly, having greyscale files means I can tinker with threshold settings in post-processing, rather than leaving this key part of the process to scanner software which is often horribly inadequate. I only convert pages to monochrome after they're straight and at the final resolution. This method also allows me to set a different threshold level for each page if necessary.
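A minimal sketch of the per-page threshold idea, assuming the greyscale pages have already been straightened and resampled. The file names (grey007.png, mono007.png), the page numbers and the threshold values are all hypothetical. The helper only prints the ImageMagick commands, so they can be inspected before being run.

```shell
#!/bin/sh
# Sketch only: file names, page numbers and threshold values are hypothetical.

DEFAULT_THRESHOLD="60%"

# Print the threshold to use for a given page number.
threshold_for_page() {
    case "$1" in
        7)     echo "68%" ;;   # hypothetical: faint printing, so raise the threshold
        12|13) echo "52%" ;;   # hypothetical: show-through, so lower the threshold
        *)     echo "$DEFAULT_THRESHOLD" ;;
    esac
}

# Print (not run) the convert command for one finished greyscale page.
mono_command() {
    page=$(printf '%03d' "$1")
    echo "convert grey${page}.png -threshold $(threshold_for_page "$1") mono${page}.png"
}

for p in 6 7; do
    mono_command "$p"
done
```

Piping the output to sh would run the commands. Keeping the exceptions in one case statement makes it easy to adjust a single page and regenerate it from the greyscale file.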
I also find that by using greyscale, scanning at 600dpi gives more than adequate results. I've only once had to resort to 1200dpi scanning, for a particularly unclear miniature score. In my experience, a 600dpi greyscale page is much easier to work with than 1200dpi monochrome: noise is easier to get rid of, and there are fewer ugly jaggies from the resampling process.
I generally try to recreate in the PDF the (virtual) size of the original, to the nearest millimetre. This is an element of the original document, it doesn't take much extra trouble, and it functions as a kind of checksum on scanning, resampling, and processing. Some digital libraries provide size data; if they don't, I make a guess based on similar documents if any are readily available. Most current PDF printing software can be used to adjust the scaling of the pages to fit your printer.
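The "checksum" can be sketched as a small calculation: convert the pixel dimensions of a finished page back to millimetres at its stated density and compare the result with the measured size of the original. The function name px_to_mm is made up for this illustration.

```shell
#!/bin/sh
# Sketch: convert a pixel dimension to millimetres at a given density,
# rounded to the nearest millimetre (1 inch = 25.4 mm).
px_to_mm() {
    px=$1
    dpi=$2
    # Integer arithmetic for px * 25.4 / dpi, with half-up rounding.
    echo $(( (px * 254 + dpi * 5) / (dpi * 10) ))
}

# A 4960x7016px page at 600dpi should come out as A4 (210 x 297 mm).
echo "$(px_to_mm 4960 600) x $(px_to_mm 7016 600) mm"
```

If the figures disagree with the original by more than a millimetre or so, a resampling factor or density setting somewhere in the chain is probably wrong.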
A number of digital music libraries have adopted a distribution method which, rather than making their scans available per-page or per-document, breaks the scans (which are usually of excellent quality) into small tiles. These tiles are reassembled in small batches when a user requests a page or part of a page.
For example, the Morgan Library's autograph manuscript of Schubert's Die Winterreise was (at its highest resolution) broken into 15,540 tiles no larger than 256x256px.
A later reassembling project downloaded 20,540 tiles and reassembled them into the pages of the orchestral score of Lassen's Symphony in D, while I was out doing the evening shopping.
The BnF's system is rather friendlier: it allows one to fetch the tiles in sizes up to 2048x2048px, which makes the downloading and reassembling processes much faster. The process was the same in both cases though: download the tiles, giving them filenames with row and column data which could be used by a script calling ImageMagick to reassemble the tiles into pages.
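The reassembly step might look something like the sketch below. It is not the script used for the projects above. It assumes tiles have been saved with names like tile_r00_c01.jpg (row, then column, both counted from zero), which is an invented naming scheme. It prints the ImageMagick commands rather than running them. Each row of tiles is joined left to right with +append, then the rows are stacked top to bottom with -append.

```shell
#!/bin/sh
# Sketch only: the tile_rRR_cCC.jpg naming scheme is an assumption.

# Number of tiles along one axis: ceiling of (length / tile size).
tiles_along() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Print (not run) the commands that rebuild one page from ROWS x COLS tiles.
reassemble_commands() {
    rows=$1
    cols=$2
    out=$3
    row_files=""
    r=0
    while [ "$r" -lt "$rows" ]; do
        tiles=""
        c=0
        while [ "$c" -lt "$cols" ]; do
            tiles="$tiles $(printf 'tile_r%02d_c%02d.jpg' "$r" "$c")"
            c=$(( c + 1 ))
        done
        row=$(printf 'row_%02d.png' "$r")
        echo "convert$tiles +append $row"      # join one row left to right
        row_files="$row_files $row"
        r=$(( r + 1 ))
    done
    echo "convert$row_files -append $out"      # stack the rows top to bottom
}

# A hypothetical 500x300px page cut into 256px tiles needs 2 columns and 2 rows.
reassemble_commands "$(tiles_along 300 256)" "$(tiles_along 500 256)" page001.png
```

The intermediate rows are written as PNGs so that the JPEG tiles are not recompressed lossily at each step.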
Here's an example of how to get from a medium-res JPEG to a reasonable quality mono PDF page in a single step:
convert input.jpg -filter lanczos -resize 200% -threshold 60% -monochrome -units PixelsPerInch -density 600 -compress group4 output.pdf
Here's how to create a blank (white) page:
convert -size 6368x8160 canvas:white -density 600 output.png
This is particularly useful for restoring the original size of pages scanned on an under-sized scanner. If the specified geometry is smaller than the input file, the output will be cropped. This example creates a 5338x7040px image with the input file placed in the top left corner.
convert input.tif -background white -extent 5338x7040+0+0 +repage output.png
Alternatively, the -gravity option can be used to place the starting image relative to the new canvas. This example places it in the bottom right corner.
convert input.tif -background white -gravity southeast -extent 5338x7040 +repage output.png
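When the original page size is known in millimetres, the geometry to pass to -extent can be worked out from the scan density. The sketch below shows the arithmetic. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: turn a physical page size into a WIDTHxHEIGHT geometry for -extent.
mm_to_px() {
    mm=$1
    dpi=$2
    # Integer arithmetic for mm * dpi / 25.4, rounded to the nearest pixel.
    echo $(( (mm * dpi * 10 + 127) / 254 ))
}

# Arguments: width in mm, height in mm, density in dpi.
extent_geometry() {
    echo "$(mm_to_px "$1" "$3")x$(mm_to_px "$2" "$3")"
}

# A4 (210 x 297 mm) at 600dpi.
extent_geometry 210 297 600
```

Rounding gives 4961px for the A4 width where truncating gives 4960px. A pixel either way is well under a tenth of a millimetre at 600dpi.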
This example is based on a recent real-life score hack in which the pages were 1500px wide and 1900~2000px tall colour JPEGs, with a watermark at the bottom of each page. This command removes the bottom 160 pixels of the input page, resamples the remainder to 3x its size, deskews the page, converts it to monochrome, crops the page to a suitable size, and saves it with a suitable density. After getting poor results with the cubic resampling filter, I tried the other available filters with a representative page sample and found that the catrom filter gave the best results for later conversion to monochrome. (I think that this filter would probably work well with most colour music scans.) In this case, because page size data was not available, I used a page size and density which gave a reasonable estimate of the original size. Saving the files as PNGs enabled me to hand-edit some particularly obtrusive noise from a few pages before converting them to PDFs and collating them.
Note the doubled percentage symbols for use in a Windows batch file. For *n*x shell scripts, you'd use single percentage symbols.
convert input.jpg -gravity south -chop 0x160 -filter catrom -resize 300%% -deskew 40%% -threshold 70%% -extent 4500x5400-100+0 +repage -density 450 output.png
This bash script takes files with names like ps001.tif, rotates them by 180 degrees if they are odd-numbered, deskews them, crops them to A4 portrait at 600dpi, converts them to monochrome, and saves them with names like pt001.png, while dumping a log of what it does.
#!/bin/bash
LOGFILE="pp1.log"
PAGEFIRST=1
PAGELAST=32
PAGEX=4960 # width of final page
PAGEY=7016 # height of final page
THRESHOLD="50%"
PAGE=$PAGEFIRST
until [ $PAGE -gt $PAGELAST ];do
  remainder=$(( $PAGE % 2 ))
  FILENAMEOLD="ps`printf '%03d' $PAGE`.tif"
  FILENAMENEW="pt`printf '%03d' $PAGE`.png"
  if [ $remainder -eq 0 ]
  then
    # even-numbered page
    COMMAND="convert $FILENAMEOLD -deskew 40% -gravity center -extent ${PAGEX}x${PAGEY} +repage -threshold $THRESHOLD -monochrome $FILENAMENEW"
  else
    # odd-numbered page
    COMMAND="convert $FILENAMEOLD -rotate 180 -deskew 40% -gravity center -extent ${PAGEX}x${PAGEY} +repage -threshold $THRESHOLD -monochrome $FILENAMENEW"
  fi
  echo "$(date +"%Y-%m-%d %H:%M:%S"): $COMMAND"
  echo "$(date +"%Y-%m-%d %H:%M:%S"): $COMMAND" >>$LOGFILE
  #echo $COMMAND
  `$COMMAND`
  let PAGE+=1
done
All of these are free and cross-platform. Many are standard parts of most Linux distros.
For some years now I've used PDF-XChange as a PDF viewer and printer for Windoze. It's closed-source, but far better than Adobe's equivalent bloatware, and with more features than the Foxit Reader. It has some very nice options for scaling pages and printing booklets. I've read that it works well under Linux/Wine, but I have yet to try this myself. Evince (Linux) works well for all of the scanned or vector-engraved music I've thrown at it so far.
VueScan is a highly-tweakable scanner front end (cross-platform, closed-source payware with a watermarking trial mode) which can work with a multitude of different scanners as well as RAW files from digital cameras. Folk doing industrial quantities of scanning, or using film scanners or scanners with poor software, should probably check it out.
Most of my scanning and processing projects were in abeyance for some time due to ill health. I've recently started uploading stuff again, though this will still be in small quantities.
GNU LilyPond.
acmp-4: This user is a professional accompanist.
fl-4: This user is a professional flautist.