More haste less speed ... redoing work
Update ... I have just re-scanned another 20 mags, having missed a line in another place on the page, but what I had also missed is that there is a WHITE line on the back of every scan. While the paper was white it was not immediately obvious, and working back through the scans it does appear to be growing in width. The latest batch of magazines has a greyish colour to the paper, and when one zooms in it is a lot more obvious. Time to take advantage of the warranty and see what Brother say about it. Do I want to rescan all these mags yet again? It is going to irritate me if I don't, but being practical, the real end target of properly OCRed versions will eliminate the paper colouration and hence the problem, so it does not seem necessary to do anything with the existing production. It is just a matter of whether I wait for a response from Brother on whether it will be fixed anyway.
The ledger size magazines from the 1980s have all been processed as well, and the car has been emptied. I have a couple of boxes of ledger size magazines from the 1960s: issues 3224 to 3311, which I think are Volumes 129, 130 and 131. As well as being a little smaller than A3, these have fewer pages, with 11 or 12 sheets to scan, and the resulting file is well clear of the memory limit of the scanner. Since the front page has colour print, I am still doing a colour scan of the whole magazine. I had tried using the grey scale scan to see if it resulted in a smaller file size, but as yet I've not seen any saving. At a later stage it may be worth trying other options where the majority of the magazine is only black and white.
One other thing that has become apparent is that while the files produced by PDF Arranger look like organised A4 sheets, this is created by simply tagging the original images to trim and rotate the displayed page. I've asked a question about this on the PDF Arranger forum to see if my understanding is correct. At some point I would like to be able to output a set of pictures from the magazine, but this is complicated if the base image is not the A4 page I am looking at. A proper process to OCR the magazines and eliminate the raw scans is on the todo list.
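
To make the trim-and-rotate point a bit more concrete, here is a rough Python sketch of how I picture it, and it may well be wrong until the forum confirms. The PDF format lets each page carry a media box (the full sheet), a crop box (the part a viewer shows) and a rotation in degrees. My assumption is that PDF Arranger sets the last two and leaves the scanned image alone. The names `Box`, `displayed_size` and `is_view_only_trim` are made up for illustration and are not part of any library, and the page sizes are approximate points for A3 and A4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A PDF rectangle in points: lower-left (x0, y0) to upper-right (x1, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def width(self) -> float:
        return self.x1 - self.x0

    @property
    def height(self) -> float:
        return self.y1 - self.y0


def displayed_size(crop_box: Box, rotate: int) -> tuple[float, float]:
    """Size a viewer would show: the crop box, sides swapped for 90/270 degrees."""
    if rotate % 90 != 0:
        raise ValueError("PDF page rotation must be a multiple of 90 degrees")
    if rotate % 180 == 90:
        return (crop_box.height, crop_box.width)
    return (crop_box.width, crop_box.height)


def is_view_only_trim(media_box: Box, crop_box: Box, rotate: int) -> bool:
    """True when the page looks different from the sheet that is actually stored."""
    return crop_box != media_box or rotate % 360 != 0


# Hypothetical example: an A3 landscape scan shown as its left-hand A4 half.
scan = Box(0, 0, 1191, 842)      # full scanned sheet (assumed size)
left_half = Box(0, 0, 595.5, 842)  # what the viewer displays

size = displayed_size(left_half, 0)
hidden = is_view_only_trim(scan, left_half, 0)
```

If that picture is right, any tool that pulls the images out of the file will get the full scanned sheet rather than the A4 page on screen, which is exactly the complication with exporting pictures.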