Class OCR.Options
- java.lang.Object
  - org.sikuli.script.OCR.Options
- All Implemented Interfaces:
java.lang.Cloneable
- Enclosing class:
- OCR
public static class OCR.Options extends java.lang.Object implements java.lang.Cloneable
A container for the options relevant for using OCR on Region or Image.
Use OCR.Options to get a new option set.
Use OCR.globalOptions() to access the global options.
If needed, consult the Tesseract docs.
- See Also:
- Tesseract docs
-
-
Constructor Summary
Constructor | Description
Options() | Create a new Options set from the initial default settings.
-
Method Summary
Modifier and Type | Method | Description
OCR.Options | asChar() | Configure Options to recognize a single character.
OCR.Options | asLine() | Configure Options to recognize a single line.
OCR.Options | asWord() | Configure Options to recognize a single word.
OCR.Options | bestDPI(int dpi) | INTERNAL: (under investigation).
OCR.Options | clone() | Makes a copy of this Options.
java.util.List<java.lang.String> | configs() | Get the current configs.
OCR.Options | configs(java.lang.String... configs) | Set one or more configs file names.
OCR.Options | configs(java.util.List<java.lang.String> configs) | Set a list of configs file names.
java.lang.String | dataPath() | Get the current datapath in this Options.
OCR.Options | dataPath(java.lang.String dataPath) | Set folder for Tesseract to find language and configs files.
OCR.Options | fontSize(int size) | Configure the image optimization.
java.lang.String | language() | Get the current language.
OCR.Options | language(java.lang.String language) | Set the language short string.
int | oem() | Get this OEM.
OCR.Options | oem(int oem) | Set this OEM.
OCR.Options | oem(OCR.OEM oem) | Set this OEM.
int | psm() | Get this PSM.
OCR.Options | psm(int psm) | Set this PSM.
OCR.Options | psm(OCR.PSM psm) | Set this PSM.
OCR.Options | reset() | Resets this Options set to the initial defaults.
OCR.Options | resetPSM() | Sets this PSM to -1.
OCR.Options | resizeInterpolation(org.sikuli.script.Element.Interpolation method) | INTERNAL: (under investigation).
OCR.Options | smallFont() | Convenience: Configure the Option's optimization.
float | textHeight() | Current base for image optimization before OCR.
OCR.Options | textHeight(float height) | Configure image optimization.
java.lang.String | toString() | Current state of this Options as some formatted lines of text.
OCR.Options | userDPI(int dpi) | INTERNAL: (under investigation).
OCR.Options | variable(java.lang.String key, java.lang.String value) | Set a variable for Tesseract.
java.util.Map<java.lang.String,java.lang.String> | variables() | Get the currently stored variables.
-
-
-
Constructor Detail
-
Options
public Options()
Create a new Options set from the initial default settings. For the default settings, see OCR.reset().
-
-
Method Detail
-
clone
public OCR.Options clone()
Makes a copy of this Options.
- Returns:
- new Options as copy
-
reset
public OCR.Options reset()
Resets this Options set to the initial defaults.
- Returns:
- this
- See Also:
OCR.reset()
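The relationship between clone(), reset(), and the setters that return "this Options" can be illustrated with a small stand-in class. This is only a sketch: SketchOptions is hypothetical and not part of SikuliX, it models just two int fields, and its default values are placeholders chosen for the example (the real defaults are described at OCR.reset()).

```java
// Hypothetical stand-in for illustration; NOT the SikuliX OCR.Options class.
final class SketchOptions implements Cloneable {
    // Placeholder defaults for this sketch only (assumption).
    private static final int DEFAULT_OEM = 3;
    private static final int DEFAULT_PSM = 3;

    private int oem = DEFAULT_OEM;
    private int psm = DEFAULT_PSM;

    // Setters return this, so calls can be chained.
    SketchOptions oem(int oem) { this.oem = oem; return this; }
    SketchOptions psm(int psm) { this.psm = psm; return this; }

    int oem() { return oem; }
    int psm() { return psm; }

    // Restores the defaults in place and returns the same instance.
    SketchOptions reset() {
        oem = DEFAULT_OEM;
        psm = DEFAULT_PSM;
        return this;
    }

    // Returns an independent copy; later changes to one do not affect the other.
    @Override
    public SketchOptions clone() {
        try {
            return (SketchOptions) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }

    public static void main(String[] args) {
        SketchOptions base = new SketchOptions().oem(1).psm(7);
        SketchOptions copy = base.clone().psm(10);
        System.out.println(base.psm() + " " + copy.psm()); // prints "7 10"
        base.reset();
        System.out.println(base.psm() + " " + copy.psm()); // prints "3 10"
    }
}
```

The point of the pattern is that a cloned set can be adjusted for one OCR call while the set it was copied from stays untouched.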
-
toString
public java.lang.String toString()
Current state of this Options as some formatted lines of text:
    OCR.Options:
    data = ...some-path.../tessdata
    language(eng) oem(3) psm(3) height(15,1) factor(1,99) dpi(96)
    configs: conf1, conf2, ...
    variables: key:value, ...
- Overrides:
toString in class java.lang.Object
- Returns:
- a text string formatted as shown above
-
oem
public int oem()
Get this OEM.
- Returns:
- oem as int
- See Also:
OCR.OEM
-
oem
public OCR.Options oem(int oem)
Set this OEM.
- Parameters:
oem - as int
- Returns:
- this Options
- See Also:
OCR.OEM
-
oem
public OCR.Options oem(OCR.OEM oem)
Set this OEM.
- Parameters:
oem - as enum constant
- Returns:
- this Options
- See Also:
OCR.OEM
-
psm
public int psm()
Get this PSM.
- Returns:
- psm as int
- See Also:
OCR.PSM
-
psm
public OCR.Options psm(int psm)
Set this PSM.
- Parameters:
psm - as int
- Returns:
- this Options
- See Also:
OCR.PSM
-
psm
public OCR.Options psm(OCR.PSM psm)
Set this PSM.
- Parameters:
psm - as enum constant
- Returns:
- this Options
- See Also:
OCR.PSM
-
resetPSM
public OCR.Options resetPSM()
Sets this PSM to -1. This causes Tess4J not to set the PSM at all.
Only use it if you know what you are doing.
- Returns:
- this Options
-
asLine
public OCR.Options asLine()
Configure Options to recognize a single line.
- Returns:
- this Options
-
asWord
public OCR.Options asWord()
Configure Options to recognize a single word.
- Returns:
- this Options
-
asChar
public OCR.Options asChar()
Configure Options to recognize a single character.
- Returns:
- this Options
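For orientation: Tesseract identifies its page segmentation modes by number, and in its documentation 7 stands for a single text line, 8 for a single word, and 10 for a single character. The sketch below records just those three values together with the -1 "not set" value mentioned for resetPSM(). It is an assumption, not stated on this page, that asLine(), asWord(), and asChar() select exactly these modes; PsmSketch and its Mode enum are hypothetical stand-ins and not the SikuliX OCR.PSM type.

```java
// Hypothetical illustration; NOT the SikuliX OCR.PSM enum.
final class PsmSketch {
    // Tesseract page segmentation mode numbers for the three single-item modes.
    enum Mode {
        SINGLE_LINE(7),
        SINGLE_WORD(8),
        SINGLE_CHAR(10);

        private final int value;

        Mode(int value) { this.value = value; }

        int value() { return value; }
    }

    // The value that resetPSM() is documented to store.
    static final int NOT_SET = -1;

    // Returns a readable label for a numeric PSM value.
    static String describe(int psm) {
        if (psm == NOT_SET) {
            return "not set";
        }
        for (Mode m : Mode.values()) {
            if (m.value() == psm) {
                return m.name();
            }
        }
        return "other(" + psm + ")";
    }

    public static void main(String[] args) {
        for (int psm : new int[] {-1, 3, 7, 8, 10}) {
            System.out.println(psm + " -> " + describe(psm));
        }
    }
}
```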
-
language
public java.lang.String language()
Get the current language.
- Returns:
- the language short string
- See Also:
language(String)
-
language
public OCR.Options language(java.lang.String language)
Set the language short string. It must not be null or empty (see Settings.OcrLanguage for a usable fallback).
According to the Tesseract rules, this is a string of three lowercase letters such as eng, deu, fra, rus, ....
In special cases it might be something like xxx_yyy (chi_sim), xxx_yyyy (deu_frak), or even xxx_yyy_zzzz (chi_tra_vert), but always all lowercase.
Make sure the corresponding ....traineddata file is in the datapath/tessdata folder, at the latest when an OCR feature is used.
- Parameters:
language - the language string
- Returns:
- this Options
- See Also:
- Tesseract language files
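The naming rules described for the language short string can be checked before the value is handed to language(String). The following is a minimal sketch of such a check; LanguageCodeCheck and looksValid are hypothetical names, not part of SikuliX, and the pattern only encodes the shape described above (three lowercase letters, optionally followed by lowercase groups separated by underscores). It does not verify that a matching traineddata file exists.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; NOT part of SikuliX.
final class LanguageCodeCheck {
    // Three lowercase letters, then zero or more "_" + lowercase-letter groups,
    // e.g. eng, chi_sim, deu_frak, chi_tra_vert.
    private static final Pattern SHAPE = Pattern.compile("[a-z]{3}(_[a-z]+)*");

    // True if the string is non-null and has the described shape.
    static boolean looksValid(String language) {
        return language != null && SHAPE.matcher(language).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"eng", "chi_sim", "deu_frak", "chi_tra_vert", "ENG", ""};
        for (String s : samples) {
            System.out.println("'" + s + "' -> " + looksValid(s));
        }
    }
}
```

A check like this only catches malformed strings early; whether OCR actually works still depends on the traineddata file being present at the time of use.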
-
dataPath
public java.lang.String dataPath()
Get the current datapath in this Options. It might be null if no OCR feature has been used so far.
If null, it is evaluated at the time of OCR feature usage to the default SikuliX path or to Settings.OcrDataPath (if set).
- Returns:
- the current Tesseract datapath in this Options
-
dataPath
public OCR.Options dataPath(java.lang.String dataPath)
Set the folder where Tesseract finds language and configs files. The files are expected in the tessdata subfolder (the path may be given without the trailing /tessdata).
Take care that everything is in place at the time of OCR feature usage.
If null, it is evaluated at the time of OCR feature usage to the default SikuliX path or to Settings.OcrDataPath (if set).
- Parameters:
dataPath - the absolute filename string
- Returns:
- this Options
- See Also:
language(String)
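Because the path may be given with or without the trailing /tessdata, and everything has to be in place when OCR is used, a pre-flight check can be useful. The sketch below is one way to write it under stated assumptions: TessdataLocator and its methods are hypothetical and not part of SikuliX, the file name is assumed to be the language string plus ".traineddata", and SikuliX's own resolution logic may differ.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; NOT part of SikuliX.
final class TessdataLocator {
    // Returns the tessdata folder for a datapath given with or without
    // the trailing "tessdata" element.
    static Path tessdataFolder(String dataPath) {
        Path p = Paths.get(dataPath);
        Path last = p.getFileName(); // null for a root path
        boolean alreadyTessdata = last != null && last.toString().equals("tessdata");
        return alreadyTessdata ? p : p.resolve("tessdata");
    }

    // Assumed file name: <language>.traineddata inside the tessdata folder.
    static Path trainedDataFile(String dataPath, String language) {
        return tessdataFolder(dataPath).resolve(language + ".traineddata");
    }

    // True if the expected language file exists as a regular file.
    static boolean isInPlace(String dataPath, String language) {
        return Files.isRegularFile(trainedDataFile(dataPath, language));
    }

    public static void main(String[] args) {
        String dataPath = args.length > 0 ? args[0] : ".";
        System.out.println(trainedDataFile(dataPath, "eng")
                + " present: " + isInPlace(dataPath, "eng"));
    }
}
```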
-
smallFont
public OCR.Options smallFont()
Convenience: Configure the Option's optimization. Might give better results for small fonts with a pixel height less than 12 (font sizes less than 10).
- Returns:
- this Options
-
textHeight
public float textHeight()
Current base for image optimization before OCR.
- Returns:
- value
- See Also:
textHeight(float)
-
textHeight
public OCR.Options textHeight(float height)
Configure image optimization. The value should be the height in pixels (the average, if it varies) of an uppercase X in the image's text.
NOTE: should only be tried in cases where the defaults do not lead to acceptable results.
- Parameters:
height - a number of pixels
- Returns:
- this Options
-
fontSize
public OCR.Options fontSize(int size)
Configure the image optimization. The value should be the font size (the average, if it varies), used as the base for internally calculating the textHeight().
NOTE: should only be tried in cases where the defaults do not lead to acceptable results.
- Parameters:
size - of a font
- Returns:
- this Options
-
resizeInterpolation
public OCR.Options resizeInterpolation(org.sikuli.script.Element.Interpolation method)
INTERNAL: (under investigation). Should not be used; not supported.
See Element.Interpolation for method options.
- Parameters:
method - the interpolation method
- Returns:
- this Options
-
bestDPI
public OCR.Options bestDPI(int dpi)
INTERNAL: (under investigation). Should not be used; not supported.
- Parameters:
dpi - the dpi value
- Returns:
- this Options
-
userDPI
public OCR.Options userDPI(int dpi)
INTERNAL: (under investigation). Should not be used; not supported.
- Parameters:
dpi - 70 .. 2400
- Returns:
- this Options
-
variables
public java.util.Map<java.lang.String,java.lang.String> variables()
- Returns:
- the currently stored variables
- See Also:
variable(java.lang.String, java.lang.String)
-
variable
public OCR.Options variable(java.lang.String key, java.lang.String value)
Set a variable for Tesseract. You should know what you are doing; consult the Tesseract docs.
- Parameters:
key - the key
value - the value
- Returns:
- this Options
- See Also:
- Tesseract docs
-
configs
public java.util.List<java.lang.String> configs()
Get the current configs.
- Returns:
- currently stored names of configs files
-
configs
public OCR.Options configs(java.lang.String... configs)
Set one or more configs file names. You should know what you are doing; consult the Tesseract docs.
- Parameters:
configs - one or more configs filenames
- Returns:
- this Options
- See Also:
- Tesseract docs
-
configs
public OCR.Options configs(java.util.List<java.lang.String> configs)
Set a list of configs file names. You should know what you are doing; consult the Tesseract docs.
- Parameters:
configs - a list of configs filenames
- Returns:
- this Options
- See Also:
- Tesseract docs
-
-