Prokee Module: bws
Latest revision as of 16:01, 1 June 2019
This module provides the second part of a two-part scanning approach:
1. BasicBlockScanner (bbs)
2. BasicWesternScanner (bws)
Tokens
The following types of tokens are recognized:
- Literals
- Operators
- Keywords
- Other words (delimited by operators or by a change of alphabet)
- Level-2 blocks (as provided by BasicBlockScanner)
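
The "other words" rule above (words end at an operator or where the alphabet changes) can be illustrated with a short Python sketch. This is not the bws implementation. The operator set, the two alphabets, and the names `alphabet_of` and `split_words` are all assumptions made for the illustration.

```python
# Hypothetical operator set and alphabets, chosen only for this sketch.
OPERATORS = set("+-*/=<>")

ALPHABETS = {
    "latin": lambda ch: ch.isascii() and ch.isalpha(),
    "digit": lambda ch: ch.isascii() and ch.isdigit(),
}


def alphabet_of(ch):
    """Return the name of the first alphabet containing ch, or None."""
    for name, contains in ALPHABETS.items():
        if contains(ch):
            return name
    return None


def split_words(text):
    """Split text into ("word", ...) and ("operator", ...) tokens.

    A word ends at an operator, at any character outside all alphabets,
    or where the alphabet changes. Characters that are neither operators
    nor members of an alphabet (such as spaces) are dropped.
    """
    tokens = []
    current = ""
    current_alphabet = None
    for ch in text:
        alphabet = alphabet_of(ch)
        if alphabet is None:
            if current:
                tokens.append(("word", current))
                current, current_alphabet = "", None
            if ch in OPERATORS:
                tokens.append(("operator", ch))
            continue
        if current and alphabet != current_alphabet:
            # Change of alphabet: close the running word.
            tokens.append(("word", current))
            current = ""
        current += ch
        current_alphabet = alphabet
    if current:
        tokens.append(("word", current))
    return tokens
```

With these assumed alphabets, `split_words("abc123+x")` yields the words `abc`, `123` and `x`, with the operator `+` between the last two.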
Literals
Operators
Keywords
Other Words
Level-2 Blocks
Languages
Example
[language_German]
->haveSeperators="true";
->alphabet="latin_german";
->alphabet="alphanumeric";
->useDictionary="false";
->dictionary="main_german";

[language_Chinese]
->haveSeperators="false";
->alphabet="hanzi";
->cutat="Symbol";

[language_Japanese]
->haveSeperators="false";
->alphabet="Kanji";
->alphabet="Hiragana";
->alphabet="Katakana";
->cutat="Alphabet";
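
This page does not define what `cutat` does. One plausible reading, offered here only as an assumption, is that `cutat="Symbol"` cuts after every character and `cutat="Alphabet"` cuts wherever the alphabet changes. The Python sketch below illustrates that reading. The function names are hypothetical, and the alphabets are approximated by whole Unicode blocks, which is also an assumption.

```python
def script_of(ch):
    """Classify a character by Unicode block (an approximation of the alphabets)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "Hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "Katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "Kanji"  # CJK Unified Ideographs
    return None


def segment(text, cutat):
    """Segment text without separators, under the assumed cutat semantics."""
    if cutat == "Symbol":
        # Assumed meaning: every character is its own segment.
        return list(text)
    if cutat == "Alphabet":
        # Assumed meaning: start a new segment when the script changes.
        pieces = []
        for ch in text:
            if pieces and script_of(pieces[-1][-1]) == script_of(ch):
                pieces[-1] += ch
            else:
                pieces.append(ch)
        return pieces
    raise ValueError(f"unsupported cutat value: {cutat!r}")
```

Under these assumptions, `segment("東京タワーです", "Alphabet")` gives three segments (Kanji, Katakana, Hiragana), while `segment("日本語", "Symbol")` gives one segment per character.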
Alphabets
Example
[alphabet_hexdigit]
->first="0";
->last="9";
->extra="ABCDEF";
->extra="abcdef";

[alphabet_alphanumeric]
->first="0";
->last="9";
->first="A";
->last="Z";
->first="a";
->last="z";
->extra="_";

[alphabet_Japanese]
->include="Hiragana"; // including another alphabet
->include="Katakana"; // including another alphabet
->include="Kanji"; // including another alphabet
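
As a rough illustration of how such a definition could be turned into a character set, the Python sketch below assumes that each `first` is paired with the next `last` to form an inclusive range, that `extra` adds individual characters, and that `include` merges a previously built alphabet. The page does not confirm these semantics, and `build_alphabet` is a hypothetical helper, not part of bws.

```python
def build_alphabet(entries, known=None):
    """Build a set of characters from (key, value) entries.

    entries: ordered pairs such as ("first", "0"), ("last", "9").
    known:   already built alphabets, used to resolve "include".
    """
    known = known or {}
    chars = set()
    pending_first = None
    for key, value in entries:
        if key == "first":
            pending_first = value
        elif key == "last":
            if pending_first is None:
                raise ValueError("'last' without a preceding 'first'")
            # Assumed: first/last describe an inclusive code point range.
            chars.update(chr(c) for c in range(ord(pending_first), ord(value) + 1))
            pending_first = None
        elif key == "extra":
            chars.update(value)  # each character of the string is added
        elif key == "include":
            chars |= known[value]
        else:
            raise ValueError(f"unknown key: {key!r}")
    return chars


alphanumeric = build_alphabet([
    ("first", "0"), ("last", "9"),
    ("first", "A"), ("last", "Z"),
    ("first", "a"), ("last", "z"),
    ("extra", "_"),
])
```

Here `alphanumeric` contains the ten digits, both Latin letter cases and the underscore, so a membership test is simply `ch in alphanumeric`.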
Global Settings
Example
[settings]
->useAlphabets="true";
->useDictionary="false";
->dictionary="main_german";
->dictionary="main_english";
->cutat="Operator";
->cutat="Alphabet";
->cutat="Symbol";
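
The examples on this page share one surface syntax: a `[section]` header followed by `->key="value";` entries, with optional `//` comments. A minimal Python sketch of a reader for that syntax is shown below. It is inferred from the examples only, `parse_config` is a hypothetical name, and the real Prokee parser may treat edge cases differently.

```python
import re

# Matches either a [section] header or a ->key="value"; entry.
TOKEN = re.compile(r'\[(\w+)\]|->(\w+)="([^"]*)";')


def parse_config(text):
    """Return {section: [(key, value), ...]}, preserving entry order.

    Entries are kept as a list rather than a dict because keys such as
    cutat and dictionary appear several times within one section.
    """
    # Drop // comments; this simple rule would also truncate a value
    # that contains "//", which none of the examples here do.
    text = re.sub(r'//[^\n]*', '', text)
    sections = {}
    current = None
    for match in TOKEN.finditer(text):
        if match.group(1) is not None:
            current = sections.setdefault(match.group(1), [])
        elif current is not None:
            current.append((match.group(2), match.group(3)))
    return sections
```

Applied to the `[settings]` example, this returns a single `settings` section whose entry list contains three `cutat` values in their original order.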