Difference between revisions of "Prokee Module: bws"
Revision as of 02:58, 29 April 2019
This module provides the second part of a two-part scanning approach. The two parts are:
- BasicBlockScanner
- BasicWesternScanner (bws)
Tokens
The following types of tokens are recognized:
- Literals
- Operators
- Keywords
- Other words (separated by operators or by a change of alphabet)
- Level-2 blocks (as provided by BasicBlockScanner)
Literals
Operators
Keywords
Other Words
Level-2 Blocks
Languages
Example
[language_German]
->haveSeperators="true";
->alphabet="latin_german";
->alphabet="alphanumeric";
->useDictionary="false";
->dictionary="main_german";

[language_Chinese]
->haveSeperators="false";
->alphabet="hanzi";
->cutat="Symbol";

[language_Japanese]
->haveSeperators="false";
->alphabet="Kanji";
->alphabet="Hiragana";
->alphabet="Katakana";
->cutat="Alphabet";
Alphabets
Example
[alphabet_hexdigit]
->first="0";
->last="9";
->extra="ABCDEF";
->extra="abcdef";

[alphabet_alphanumeric]
->first="0";
->last="9";
->first="A";
->last="Z";
->first="a";
->last="z";
->extra="_";

[alphabet_Japanese]
->include="Hiragana"; //including another alphabet
->include="Katakana"; //including another alphabet
->include="Kanji"; //including another alphabet
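One way to read such alphabet definitions is sketched below in Python. The semantics are assumptions, not documented behaviour: each first/last pair is taken as an inclusive character range, extra adds individual characters, and include pulls in another alphabet by name. The function `resolve` and the `word` alphabet are hypothetical.

```python
# Alphabet definitions as ordered (key, value) pairs.
# "alphanumeric" follows the example; "word" is a hypothetical alphabet
# that exists only to demonstrate include.
DEFINITIONS = {
    "alphanumeric": [
        ("first", "0"), ("last", "9"),
        ("first", "A"), ("last", "Z"),
        ("first", "a"), ("last", "z"),
        ("extra", "_"),
    ],
    "word": [("include", "alphanumeric"), ("extra", "-")],
}


def resolve(name, definitions, active=()):
    """Return the set of characters that belong to the named alphabet."""
    if name in active:
        raise ValueError("circular include: " + name)
    chars = set()
    first = None
    for key, value in definitions[name]:
        if key == "first":
            first = value
        elif key == "last":
            if first is None:
                raise ValueError("last without a preceding first")
            # Assumed: first/last form an inclusive range of code points.
            chars.update(chr(c) for c in range(ord(first), ord(value) + 1))
            first = None
        elif key == "extra":
            chars.update(value)  # every character of the string
        elif key == "include":
            chars |= resolve(value, definitions, active + (name,))
        else:
            raise ValueError("unknown key: " + key)
    return chars


alphanumeric = resolve("alphanumeric", DEFINITIONS)
print(len(alphanumeric))                   # 10 digits + 52 letters + "_"
print("-" in resolve("word", DEFINITIONS))
```

Resolving to a plain set keeps the membership test trivial. For large alphabets such as Kanji a range-based lookup would be more economical, which this sketch does not attempt.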
Global Settings
Example
[settings]
->useAlphabets="true";
->useDictionary="false";
->dictionary="main_german";
->dictionary="main_english";
->cutat="Operator";
->cutat="Alphabet";
->cutat="Symbol";
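The configuration syntax used in these examples ([section] headers followed by ->key="value"; entries, where a key may occur more than once) can be read with a few lines of Python. This is a rough sketch based only on the examples on this page, not the parser Prokee uses. The function name `parse_config` is hypothetical, and the sketch assumes that values contain neither double quotes nor `//`.

```python
import re

# Either a [section] header or a ->key="value"; entry.
TOKEN = re.compile(r'\[(\w+)\]|->(\w+)="([^"]*)";')


def parse_config(text):
    """Parse the text into {section: {key: [values in order of appearance]}}."""
    # Assumed: "//" starts a comment that runs to the end of the line.
    text = re.sub(r"//[^\n]*", "", text)
    sections = {}
    current = None
    for match in TOKEN.finditer(text):
        section, key, value = match.groups()
        if section is not None:
            current = sections.setdefault(section, {})
        elif current is not None:
            # Repeated keys accumulate instead of overwriting each other.
            current.setdefault(key, []).append(value)
    return sections


example = '''
[settings]
->useAlphabets="true";
->dictionary="main_german";
->dictionary="main_english";
->cutat="Operator"; // a comment
->cutat="Alphabet";
->cutat="Symbol";
'''

config = parse_config(example)
print(config["settings"]["cutat"])
```

Keeping every value in a list preserves repeated keys such as dictionary and cutat, which a plain key-to-value mapping would silently collapse to the last entry.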