Prokee Module: bws

From prokee
Latest revision as of 16:01, 1 June 2019

This module provides the second part of a two-part scanning approach:

  1. BasicBlockScanner (bbs)
  2. BasicWesternScanner (bws)

Tokens

The following types of tokens are recognized:

  • Literals
  • Operators
  • Keywords
  • Other words (separated by operators or change of alphabets)
  • Level-2 blocks (as provided by BasicBlockScanner)

Literals

Operators

Keywords

Other Words

Level-2 Blocks

Languages

Example

[language_German]
->haveSeperators="true";
->alphabet="latin_german";
->alphabet="alphanumeric";
->useDictionary="false";
->dictionary="main_german";

[language_Chinese]
->haveSeperators="false";
->alphabet="hanzi";
->cutat="Symbol";

[language_Japanese]
->haveSeperators="false";
->alphabet="Kanji";
->alphabet="Hiragana";
->alphabet="Katakana";
->cutat="Alphabet";

Alphabets

Example

[alphabet_hexdigit]
->first="0";
->last="9";
->extra="ABCDEF";
->extra="abcdef";

[alphabet_alphanumeric]
->first="0";
->last="9";
->first="A";
->last="Z";
->first="a";
->last="z";
->extra="_";

[alphabet_Japanese]
->include="Hiragana";//includes another alphabet
->include="Katakana";//includes another alphabet
->include="Kanji";   //includes another alphabet
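
One plausible reading of these definitions is sketched below in Python. It assumes that each `first` pairs with the next `last` to form an inclusive character range, that `extra` adds individual characters, and that `include` pulls in another alphabet by name. The page does not confirm these semantics, and the function name, the data layout, and the `word` alphabet are hypothetical.

```python
def build_alphabet(name, definitions, _stack=()):
    """Resolve an alphabet definition into a set of characters.

    definitions maps an alphabet name to an ordered list of (key, value) pairs.
    """
    if name in _stack:
        raise ValueError(f"circular include: {name}")
    chars = set()
    pending_first = None
    for key, value in definitions[name]:
        if key == "first":
            pending_first = value
        elif key == "last":
            if pending_first is None:
                raise ValueError(f"'last' without 'first' in {name}")
            # Assumed semantics: first..last is an inclusive range.
            chars.update(chr(c) for c in range(ord(pending_first), ord(value) + 1))
            pending_first = None
        elif key == "extra":
            chars.update(value)
        elif key == "include":
            chars |= build_alphabet(value, definitions, _stack + (name,))
        else:
            raise ValueError(f"unknown key {key!r} in {name}")
    return chars


DEFINITIONS = {
    "alphanumeric": [
        ("first", "0"), ("last", "9"),
        ("first", "A"), ("last", "Z"),
        ("first", "a"), ("last", "z"),
        ("extra", "_"),
    ],
    # Hypothetical alphabet, used only to illustrate "include".
    "word": [("include", "alphanumeric"), ("extra", "-")],
}

alphanumeric = build_alphabet("alphanumeric", DEFINITIONS)
word = build_alphabet("word", DEFINITIONS)
```

Keeping the entries as an ordered list rather than a dictionary matters here, because the keys `first`, `last`, and `extra` repeat within a single definition.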

Global Settings

Example

[settings]
->useAlphabets="true";
->useDictionary="false";
->dictionary="main_german";
->dictionary="main_english";
->cutat="Operator";
->cutat="Alphabet";
->cutat="Symbol";
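
The examples on this page share one syntax: a `[section]` header followed by `->key="value";` entries, with keys that may repeat and optional `//` comments. The Python sketch below reads that syntax as inferred from the examples alone. It is not the parser used by Prokee, and `parse_config` and the regular expressions are illustrative names.

```python
import re

SECTION = re.compile(r'^\[(?P<name>[^\]]+)\]$')
# Anything after the closing ';' (such as a // comment) is ignored.
ENTRY = re.compile(r'^->(?P<key>\w+)="(?P<value>[^"]*)";')


def parse_config(text):
    """Parse text into {section: [(key, value), ...]}, keeping order and repeats."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue
        match = SECTION.match(line)
        if match:
            current = sections.setdefault(match.group("name"), [])
            continue
        match = ENTRY.match(line)
        if match is None or current is None:
            raise ValueError(f"unrecognized line: {raw!r}")
        current.append((match.group("key"), match.group("value")))
    return sections


SAMPLE = """
[settings]
->useAlphabets="true";
->cutat="Operator";
->cutat="Alphabet"; // repeated keys are kept in order
"""

config = parse_config(SAMPLE)
```

Entries are stored as a list of pairs so that repeated keys such as `cutat` and `dictionary` are all preserved.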

Implementations

  • bws (version v01): http://www.andreaspollhammer.com/lab/downloads/docu/html/bws_v01.php