Prokee Module: bws
Revision as of 00:51, 7 May 2019
This module provides the second part of a two-part scanning approach:
- BasicBlockScanner
- BasicWesternScanner (bws)
Tokens
The following types of tokens are recognized:
- Literals
- Operators
- Keywords
- Other words (separated by operators or by a change of alphabet)
- Level-2 blocks (as provided by BasicBlockScanner)
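The rule for other words (a split at every operator or change of alphabet) can be illustrated with a short Python sketch. This is not the module's implementation: the alphabets, the operator set, and the names `alphabet_of` and `split_words` are assumptions made purely for illustration.

```python
# Hypothetical alphabets and operators, chosen only for this illustration.
LATIN = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
DIGITS = set("0123456789")
OPERATORS = set("+-*/=")


def alphabet_of(ch):
    """Return a label for the alphabet a character belongs to, or None."""
    if ch in LATIN:
        return "latin"
    if ch in DIGITS:
        return "digit"
    return None


def split_words(text):
    """Split text into (kind, value) tokens at operators and alphabet changes."""
    tokens = []
    current, current_alpha = "", None
    for ch in text:
        alpha = alphabet_of(ch)
        # A word ends at an operator, at a character outside every
        # alphabet (such as a space), or when the alphabet changes.
        if ch in OPERATORS or alpha is None or alpha != current_alpha:
            if current:
                tokens.append(("word", current))
            current, current_alpha = "", None
        if ch in OPERATORS:
            tokens.append(("operator", ch))
        elif alpha is not None:
            current += ch
            current_alpha = alpha
    if current:
        tokens.append(("word", current))
    return tokens
```

For example, `split_words("abc123+x")` yields the words `abc` and `123` (split by the change from letters to digits), the operator `+`, and the word `x`.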
Literals
Operators
Keywords
Other Words
Level-2 Blocks
Languages
Example
[language_German]
->haveSeperators="true";
->alphabet="latin_german";
->alphabet="alphanumeric";
->useDictionary="false";
->dictionary="main_german";

[language_Chinese]
->haveSeperators="false";
->alphabet="hanzi";
->cutat="Symbol";

[language_Japanese]
->haveSeperators="false";
->alphabet="Kanji";
->alphabet="Hiragana";
->alphabet="Katakana";
->cutat="Alphabet";
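To make the layout of such a configuration concrete, here is a minimal Python sketch that reads it into sections. The syntax rules are inferred only from the examples on this page (a `[section]` header followed by `->key="value";` entries, with `//` comments), and `parse_config` is a hypothetical helper, not part of Prokee. Because keys such as `alphabet` may repeat, each section is kept as an ordered list of pairs rather than a dictionary.

```python
import re

# One pattern that matches either a section header or a key/value entry.
TOKEN_RE = re.compile(r'\[(\w+)\]|->\s*(\w+)\s*=\s*"([^"]*)"\s*;')


def parse_config(text):
    """Parse '[section] ->key="value";' text into {section: [(key, value), ...]}."""
    # Drop '//' comments up to the end of the line.
    # Assumption: '//' never occurs inside a quoted value.
    text = re.sub(r'//[^\n]*', '', text)
    sections = {}
    current = None
    for match in TOKEN_RE.finditer(text):
        if match.group(1) is not None:
            # New section header: later entries belong to it.
            current = sections.setdefault(match.group(1), [])
        elif current is not None:
            # Entries that appear before any header are ignored.
            current.append((match.group(2), match.group(3)))
    return sections
```

Keeping the entries in order preserves the sequence in which alphabets are listed for a language, in case that order matters to the scanner.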
Alphabets
Example
[alphabet_hexdigit]
->first="0";
->last="9";
->extra="ABCDEF";
->extra="abcdef";

[alphabet_alphanumeric]
->first="0";
->last="9";
->first="A";
->last="Z";
->first="a";
->last="z";
->extra="_";

[alphabet_Japanese]
->include="Hiragana"; //including another alphabet
->include="Katakana"; //including another alphabet
->include="Kanji"; //including another alphabet
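One plausible reading of these entries is that each `first`/`last` pair names an inclusive character range, `extra` adds individual characters, and `include` pulls in a previously defined alphabet. The Python sketch below expands an alphabet into a set of characters under that reading. The pairing rule (each `first` closes at the next `last`) and the helper name `build_alphabet` are assumptions, not documented Prokee behavior.

```python
def build_alphabet(entries, known=None):
    """Expand (key, value) pairs into a set of characters.

    entries: ordered pairs such as ("first", "0"), ("last", "9").
    known:   already built alphabets by name, used to resolve "include".
    """
    known = known or {}
    chars = set()
    first = None
    for key, value in entries:
        if key == "first":
            first = value
        elif key == "last" and first is not None:
            # Assumed: the range is inclusive on both ends.
            chars.update(chr(c) for c in range(ord(first), ord(value) + 1))
            first = None
        elif key == "extra":
            chars.update(value)
        elif key == "include":
            # Unknown alphabet names contribute nothing in this sketch.
            chars.update(known.get(value, set()))
    return chars


alphanumeric = build_alphabet([
    ("first", "0"), ("last", "9"),
    ("first", "A"), ("last", "Z"),
    ("first", "a"), ("last", "z"),
    ("extra", "_"),
])
```

Under these assumptions `alphanumeric` contains the ten digits, the 52 Latin letters, and the underscore. An alphabet built from `include` entries, like `alphabet_Japanese`, would be the union of the alphabets passed in through `known`.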
Global Settings
Example
[settings]
->useAlphabets="true";
->useDictionary="false";
->dictionary="main_german";
->dictionary="main_english";
->cutat="Operator";
->cutat="Alphabet";
->cutat="Symbol";