Discussion:
multi-page continuous scanner anyone?
Boris Epstein
2009-06-17 22:02:51 UTC
Hi all,

I am wondering if anybody can recommend a duplex-capable multi-sheet
(automatic) scanner to be used under OpenSuSE Linux. By multi-page I
mean something with a feeder that can scan in a whole stack of paper.

Thanks.

Boris.
--
To unsubscribe, e-mail: opensuse+***@opensuse.org
For additional commands, e-mail: opensuse+***@opensuse.org
Ritchie Fraser
2009-06-17 23:20:00 UTC
Post by Boris Epstein
Hi all,
I am wondering if anybody can recommend a duplex-capable multi-sheet
(automatic) scanner to be used under OpenSuSE Linux. By multi-page I
mean something with a feeder that can scan in a whole stack of paper.
Thanks.
Boris.
Hi Boris

I have an HP5590 and it works quite well under openSUSE 11.1. The automatic
sheet feeder and duplex scanning work too.
--
Kind Regards,

Ritchie
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Ritchie Fraser
Web: http://www.rpfraser.uklinux.net
Registered Linux User #255860
Boris Epstein
2009-06-18 16:53:12 UTC
Hi Boris,
Post by Boris Epstein
Hi all,
I am wondering if anybody can recommend a duplex-capable multi-sheet
(automatic) scanner to be used under OpenSuSE Linux. By multi-page I
mean something with a feeder that can scan in a whole stack of paper.
Thanks.
Boris.
Just last week a weekly news email I get talked about getting a new office
scanner. Below is what they wrote. I looked it up on the 'net and it seems to
work with Linux. Check it out,
Hope this helps,
JIM
##################
A PRODUCT RECOMMENDATION. I needed a new scanner. I asked a few people,
   including a guy I know who owns a business employing 75 people but has
   no filing cabinets, what is a good scanner to computerize receipts and
   such? The response was unanimous: the Fujitsu ScanSnap. It's amazing: a
   contract and a check? Scans the differing page sizes without a hitch,
   scans fronts and backs at the same time, discards any blank pages,
   creates a PDF file, and then does an OCR (optical character
   recognition) pass on the file so that you can search within it. And it
   comes with all the software you need, too. The thing is so amazing that
   if you put a page in upside down, it will usually detect that and flip
   it around for you! And it detects the rare occasions when it misfeeds
   (e.g., pulls more than one page through at a time.) A 5-page scan,
   front AND back, only takes about 15 seconds (plus OCR time; that
   process runs in the background). I'm completely blown away by it, and
   my bookkeeper just loves it, too, since I now know where everything is,
   and can e-mail stuff to her easily. It's a tad pricey ($470), but
   cheaper at Amazon ($420 as of this writing). This thing gets my highest
   recommendation. http://ThisIsTrue.com/d-scansnap
   Ugh: I just clicked the link to check it before sending this out, and
   Amazon has raised their price to $465 since Monday's Premium edition!
   Sheesh. Check Newegg.com, then: it's $409 there right now, though
   that's plus shipping (free on Amazon). Still, that's the best price I
   can find right now.
#################
--
Jim Hatridge
Linux User #88484
Ebay ID: WartHogBulletin
Thanks Jim!

Looks like it only goes up to 600x600 dpi optical, though.

Boris.
Adam Tauno Williams
2009-06-18 18:27:03 UTC
Post by Boris Epstein
recognition) pass on the file so that you can search within it. And it
comes with all the software you need, too. The thing is so amazing that
if you put a page in upside down, it will usually detect that and flip
I see the phrase "all the software you need" and red flags go up. How
much of all this is done by 'the device' and how much is done by
software on a [Windows?] host?

Does this device work with LINUX/openSUSE?

I'm curious because I've looked at several such things and they are
always tethered to a Windows 200x server, usually with M$-SQL as well.
So the cost of the device itself is almost irrelevant.
Post by Boris Epstein
Looks like it only goes up to 600x600 dpi optical, though.
For document archiving, 600x600 is overkill.
Greg Freemyer
2009-06-18 20:36:40 UTC
Post by Adam Tauno Williams
Post by Boris Epstein
Looks like it only goes up to 600x600 dpi optical, though.
For document archive 600x600 is overkill.
Typically 200x200 is used and 300x300 is used for high quality.
Assuming you're coming from normal paper docs.
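
For a rough sense of what these resolutions mean in practice, here is a
minimal Python sketch of the arithmetic. It assumes a US Letter page
(8.5 x 11 inches) and uncompressed storage; the page size, the function
names, and the bit depths are illustrative assumptions, not figures from
this thread, and real archives are compressed well below these sizes.

```python
def page_pixels(width_in, height_in, dpi):
    """Return (width, height) in pixels for a page scanned at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)


def raw_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes, ignoring any per-row padding."""
    w, h = page_pixels(width_in, height_in, dpi)
    return w * h * bits_per_pixel // 8


if __name__ == "__main__":
    # Assumed page: US Letter, 8.5 x 11 inches.
    for dpi in (200, 300, 600):
        w, h = page_pixels(8.5, 11, dpi)
        bw = raw_bytes(8.5, 11, dpi, 1)       # 1-bit black and white
        colour = raw_bytes(8.5, 11, dpi, 24)  # 24-bit colour
        print(f"{dpi} dpi: {w}x{h} px, "
              f"{bw / 1e6:.1f} MB b/w, {colour / 1e6:.1f} MB colour")
```

Doubling the resolution quadruples the pixel count, so 600 dpi costs about
four times the storage of 300 dpi before compression.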

Greg
--
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
Carlos E. R.
2009-06-18 21:51:01 UTC
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Post by Greg Freemyer
Post by Adam Tauno Williams
Post by Boris Epstein
Looks like it only goes up to 600x600 dpi optical, though.
For document archive 600x600 is overkill.
Typically 200x200 is used and 300x300 is used for high quality.
Assuming you're coming from normal paper docs.
If I were scanning my magazine collections, with photos, I would use
600dpi minimum, so that I could print a page later as good as the
original.

Which makes me wonder if it could be possible to scan a page with
different resolutions for text and images, automatically.

Maybe in the future.

Or at least store it differently. Perhaps DjVu... but the available open
tools for creating djvu files are far from optimal.

- --
Cheers,
Carlos E. R.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)

iEYEARECAAYFAko6ttMACgkQtTMYHG2NR9WVjACfcMmXEdPRZ//VAajBk+2u7I3X
pSAAoIwo72ZyDTtLDnFadul1UCOCsuFs
=2LjR
-----END PGP SIGNATURE-----
Randall R Schulz
2009-06-18 22:51:02 UTC
Post by Carlos E. R.
Post by Greg Freemyer
Post by Adam Tauno Williams
Post by Boris Epstein
Looks like it only goes up to 600x600 dpi optical, though.
For document archive 600x600 is overkill.
Typically 200x200 is used and 300x300 is used for high quality.
Assuming you're coming from normal paper docs.
If I were scanning my magazine collections, with photos, I would use
600dpi minimum, so that I could print a page later as good as the
original.
I agree, and 600 dpi won't get you a particularly faithful reproduction.
Phototypesetting equipment realizes 2400 DPI, typically.
Post by Carlos E. R.
Which makes me wonder if it could be possible to scan a page with
different resolutions for text and images, automatically.
Maybe in the future.
Or at least store it differently. Perhaps DjVu... but the available
open tools for creating djvu files are far from optimal.
I'm a little curious what Google and ACM (to name only two) use to
digitize print collections. The results render well and, what's much
more impressive, are OCR-ed quite well, too. ACM's entire digital
library (most of which predates digital originals) is searchable even
when the original had to be scanned and OCR-ed.
Post by Carlos E. R.
--
Cheers,
Carlos E. R.
Randall Schulz
Carlos E. R.
2009-06-20 08:28:07 UTC
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Post by Randall R Schulz
Post by Carlos E. R.
Post by Greg Freemyer
Post by Adam Tauno Williams
Post by Boris Epstein
Looks like it only goes up to 600x600 dpi optical, though.
For document archive 600x600 is overkill.
Typically 200x200 is used and 300x300 is used for high quality.
Assuming you're coming from normal paper docs.
If I were scanning my magazine collections, with photos, I would use
600dpi minimum, so that I could print a page later as good as the
original.
I agree, and 600 dpi won't get you a particularly faithful reproduction.
Phototypesetting equipment realizes 2400 DPI, typically.
600 dpi happens to be my printer resolution, so going further would be
pointless ;-)
Post by Randall R Schulz
Post by Carlos E. R.
Which makes me wonder if it could be possible to scan a page with
different resolutions for text and images, automatically.
Maybe in the future.
Or at least store it differently. Perhaps DjVu... but the available
open tools for creating djvu files are far from optimal.
I'm a little curious what Google and ACM (to name only two) use to
digitize print collections. The results render well and, what's much
more impressive, are OCR-ed quite well, too. ACM's entire digital
library (most of which predates digital originals) is searchable even
when the original had to be scanned and OCR-ed.
Yep. Good OCR for me is almost impossible to achieve, but these big chaps
seem to have it solved.

The DjVu format, by the way, can store B/W for text, color for photos, and
text for the OCR, all in the same file and for each page. In theory, at
least: with the open tools we have, that's almost impossible to get. The
better tools are not open.

It is a very good format for scanned material, but it doesn't seem to
catch on :-?
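
To make the idea of per-page layers concrete, here is a small Python
sketch of the kind of structure described above. It is only a conceptual
model: the class and field names are hypothetical and do not correspond
to the DjVu file format or to any DjVu library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LayeredPage:
    """Conceptual model of one scanned page split into layers (hypothetical)."""
    text_mask: bytes                      # 1-bit image holding the sharp B/W text
    photo_layer: Optional[bytes] = None   # colour image data for photos, if any
    ocr_text: str = ""                    # recognized text, used for searching

    def is_searchable(self) -> bool:
        """True when the page carries a non-empty OCR text layer."""
        return bool(self.ocr_text.strip())

    def contains(self, phrase: str) -> bool:
        """Case-insensitive search in the OCR text layer."""
        return phrase.lower() in self.ocr_text.lower()


# Example: a text-only page with OCR output attached (placeholder bytes).
page = LayeredPage(text_mask=b"\x00\xff", ocr_text="Invoice 2009-06-20")
print(page.is_searchable(), page.contains("invoice"))
```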

- --
Cheers,
Carlos E. R.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)

iEYEARECAAYFAko8naAACgkQtTMYHG2NR9XZjACeP8AKmEtwJDlMP1rsAtitF6aM
sW0AoI7QZhla26P/CbR86Tr5SHVgMTjR
=GHRx
-----END PGP SIGNATURE-----
David C. Rankin
2009-06-28 03:32:03 UTC
Post by Carlos E. R.
Post by Randall R Schulz
I'm a little curious what Google and ACM (to name only two) use to
digitize print collections. The results render well and, what's much
more impressive, are OCR-ed quite well, too. ACM's entire digital
library (most of which predates digital originals) is searchable even
when the original had to be scanned and OCR-ed.
Yep. Good OCR for me is almost impossible to achieve, but these big chaps
seem to have it solved.
The DjVu format, by the way, can store B/W for text, color for photos, and
text for the OCR, all in the same file and for each page. In theory, at
least: with the open tools we have, that's almost impossible to get. The
better tools are not open.
It is a very good format for scanned material, but it doesn't seem to
catch on :-?
Just to add to the OCR discussion, I have had good luck with Tesseract. I use
it as part of our HylaFAX/AvantFAX fax server, which automatically runs OCR on
incoming faxes at our office.
--
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com
Carlos E. R.
2009-06-28 09:19:30 UTC
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



On Saturday, 2009-06-27 at 22:32 -0500, David C. Rankin wrote:

...
Post by David C. Rankin
Just to add to the OCR discussion, I have had good luck with tesseract. I use
it as part of our hylafax/avantfax fax server that automatically does OCR on
incoming faxes at our office....
Interesting.

From Wikipedia, the free encyclopedia

In computer software, Tesseract is a free optical character recognition
engine. It was originally developed as proprietary software at
Hewlett-Packard between 1985 and 1995. After ten years without any
development taking place, Hewlett Packard and UNLV released it as open
source in 2005. Tesseract is currently developed by Google and released
under the Apache License, Version 2.0.[2][3][1]

Tesseract is considered one of the most accurate free software OCR
engines currently available.[3][4]

The current version of Tesseract is 2.03, released April 22, 2008.[5]

...

Tesseract is an OCR engine, and it does not have a graphical user
interface. It runs from the command line, and may be called with the
command:[7]

tesseract image.tif output [options]

Tesseract handles image files in TIFF format (with filename extension
.tif);[7] other file formats need to be converted to TIFF before being
submitted to Tesseract.

Tesseract does not support layout analysis, which means that it cannot
interpret multi-column text, images, or equations, and in these cases
will produce a garbled text output.[3]

http://en.wikipedia.org/wiki/Tesseract_(software)
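
The command form quoted above can be driven from a script. What follows
is a minimal Python sketch, not a tested setup: the helper names and the
directory layout are hypothetical, and it assumes a tesseract binary is
on the PATH and that the inputs are already TIFF files, as the excerpt
requires.

```python
import subprocess
from pathlib import Path


def tesseract_command(image, output_base):
    """Build the argument list for `tesseract image.tif output`."""
    return ["tesseract", str(image), str(output_base)]


def ocr_directory(directory):
    """Run tesseract on every .tif file in `directory`; return the output bases."""
    bases = []
    for tif in sorted(Path(directory).glob("*.tif")):
        base = tif.with_suffix("")  # scans/page1.tif -> scans/page1
        # tesseract adds the .txt extension to the output base itself
        subprocess.run(tesseract_command(tif, base), check=True)
        bases.append(base)
    return bases
```

Calling ocr_directory("scans") would leave a .txt file next to each TIFF;
check=True makes a failed run raise an exception instead of passing
silently.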


You could add how you installed it on openSUSE. Looking on webpin, I
see only unofficial packages:

***@nimrodel:~> webpin tesseract
2 results (2 packages) found for "tesseract" in openSUSE_110
* tesseract-ocr: An OCR engine
- 20080718svn178 [BS::home:/jnweiger]
* tesseract-ocr-devel: Libraries and Header Files to Develop with Tesseract
- 20080718svn178 [BS::home:/jnweiger]
***@nimrodel:~>



Wikipedia also mentions OCRopus, used by Google Book Search, which uses
Tesseract as a plugin:

OCRopus is a free document analysis and OCR system released under the
Apache License, Version 2.0 with a very modular design through the use of
plugins. These plugins allow OCRopus to swap out components easily.

OCRopus is currently developed under the lead of Thomas Breuel from the
German Research Centre for Artificial Intelligence in Kaiserslautern,
Germany and is sponsored by Google. OCRopus is developed for Linux;
however, users have reported success with OCRopus on Mac OS X and an
application called TakOCR[1] has been developed that installs OCRopus on
Mac OS X and provides a simple droplet interface.

It is also CLI only.

- --
Cheers,
Carlos E. R.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)

iEYEARECAAYFAkpHNakACgkQtTMYHG2NR9XtuwCeMEvFr0hfvWdoRsmpsLrfFV/0
L8QAoIDUAZz/j/KPQvPhLuLsnjuXDbDJ
=2Eac
-----END PGP SIGNATURE-----
Adam Tauno Williams
2009-06-28 14:26:16 UTC
Post by David C. Rankin
Just to add to the OCR discussion, I have had good luck with tesseract. I use
it as part of our hylafax/avantfax fax server that automatically does OCR on
incoming faxes at our office....
How about posting your HylaFAX faxrcvd script so others can use it as a
template? Or a link, if you used some site/howto for setting it up.
--
OpenGroupware developer: ***@whitemice.org
<http://whitemiceconsulting.blogspot.com/>
OpenGroupware & Cyrus IMAPd documentation @
<http://docs.opengroupware.org/Members/whitemice/wmogag/file_view>
David C. Rankin
2009-06-29 05:19:43 UTC
Post by Adam Tauno Williams
How about posting your HylaFAX faxrcvd script so others can use it as a
template? Or a link, if you used some site/howto for setting it up.
Sure,

The package I used was AvantFAX. I set up a page that is a short howto:

http://www.3111skyline.com/linux/avantfax.php
--
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com