Difference between revisions of "Prokee Module: bws"

Revision as of 01:21, 24 May 2019

This module provides the second part of a two-part scanning approach:

  1. BasicBlockScanner (bbs)
  2. BasicWesternScanner (bws)

Tokens

The following types of tokens are recognized:

  • Literals
  • Operators
  • Keywords
  • Other words (separated by operators or by a change of alphabet)
  • Level-2 blocks (as provided by BasicBlockScanner)

Literals

Operators

Keywords

Other Words

Level-2 Blocks

Languages

Example

[language_German]
->haveSeperators="true";
->alphabet="latin_german";
->alphabet="alphanumeric";
->useDictionary="false";
->dictionary="main_german";

[language_Chinese]
->haveSeperators="false";
->alphabet="hanzi";
->cutat="Symbol";

[language_Japanese]
->haveSeperators="false";
->alphabet="Kanji";
->alphabet="Hiragana";
->alphabet="Katakana";
->cutat="Alphabet";
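
The language definitions above use a `[section]` header followed by `->key="value";` entries, and a key such as `alphabet` may occur more than once. A minimal sketch of reading this layout is given below. The grammar is inferred from the examples only, and `parse_config` is a hypothetical helper, not part of the module.

```python
import re

# Grammar inferred from the examples on this page (an assumption).
SECTION_RE = re.compile(r"^\[(?P<name>[^\]]+)\]$")
ENTRY_RE = re.compile(r'^->(?P<key>\w+)="(?P<value>[^"]*)";')


def parse_config(text):
    """Map each section name to an ordered list of (key, value) pairs.

    Pairs are kept in a list instead of a dict so that repeated keys
    and their order are preserved.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = SECTION_RE.match(line)
        if match:
            current = sections.setdefault(match.group("name"), [])
            continue
        match = ENTRY_RE.match(line)
        if match and current is not None:
            # Anything after the closing ';' (e.g. a // comment) is ignored.
            current.append((match.group("key"), match.group("value")))
    return sections


sample = """
[language_Japanese]
->haveSeperators="false";
->alphabet="Kanji";
->alphabet="Hiragana";
->alphabet="Katakana";
->cutat="Alphabet";
"""
japanese = parse_config(sample)["language_Japanese"]
alphabets = [value for key, value in japanese if key == "alphabet"]
```

Keeping the entries as an ordered list is a deliberate choice here, since the examples suggest that repeated keys accumulate rather than overwrite each other.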

Alphabets

Example

[alphabet_hexdigit]
->first="0";
->last="9";
->extra="ABCDEF";
->extra="abcdef";

[alphabet_alphanumeric]
->first="0";
->last="9";
->first="A";
->last="Z";
->first="a";
->last="z";
->extra="_";

[alphabet_Japanese]
->include="Hiragana";//including another alphabet
->include="Katakana";//including another alphabet
->include="Kanji";   //including another alphabet
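
One possible reading of the alphabet definitions above is that each `first`/`last` pair gives an inclusive character range, `extra` lists individual characters, and `include` pulls in another alphabet. The sketch below resolves definitions under that assumption. The helpers `build_alphabet` and `in_alphabet` and the `demo` alphabet are hypothetical and serve only as illustration.

```python
# Definitions as ordered (key, value) pairs, mirroring the config entries.
DEFINITIONS = {
    "alphanumeric": [
        ("first", "0"), ("last", "9"),
        ("first", "A"), ("last", "Z"),
        ("first", "a"), ("last", "z"),
        ("extra", "_"),
    ],
    # Hypothetical alphabet, used only to demonstrate "include".
    "demo": [("include", "alphanumeric"), ("extra", "$")],
}


def build_alphabet(name, definitions, _seen=None):
    """Return (ranges, extras) for an alphabet, following include entries."""
    seen = _seen if _seen is not None else set()
    if name in seen:  # guard against circular includes
        return [], set()
    seen.add(name)
    ranges, extras = [], set()
    pending_first = None
    for key, value in definitions[name]:
        if key == "first":
            pending_first = value
        elif key == "last" and pending_first is not None:
            ranges.append((pending_first, value))  # inclusive range (assumed)
            pending_first = None
        elif key == "extra":
            extras.update(value)  # each character is added individually
        elif key == "include":
            sub_ranges, sub_extras = build_alphabet(value, definitions, seen)
            ranges.extend(sub_ranges)
            extras |= sub_extras
    return ranges, extras


def in_alphabet(ch, alphabet):
    """Check whether a single character belongs to a resolved alphabet."""
    ranges, extras = alphabet
    return ch in extras or any(lo <= ch <= hi for lo, hi in ranges)


alphanumeric = build_alphabet("alphanumeric", DEFINITIONS)
demo = build_alphabet("demo", DEFINITIONS)
```

With these definitions, `in_alphabet("_", alphanumeric)` is true and `in_alphabet("-", alphanumeric)` is false, while `demo` accepts everything in `alphanumeric` plus `$`.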

Global Settings

Example

[settings]
->useAlphabets="true";
->useDictionary="false";
->dictionary="main_german";
->dictionary="main_english";
->cutat="Operator";
->cutat="Alphabet";
->cutat="Symbol";

Implementations

  • bws (version v01): http://www.andreaspollhammer.com/downloads/docu/html/bws_v01.php