You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
Ivan Enderlin 774f04e6ab Prepare 3.16.08.15. 3 years ago
Bin Parser: Use the lexer as an iterator. 4 years ago
Documentation Move to PSR-1 and PSR-2. 5 years ago
Exception Quality: Run `hoa devtools:cs`. 3 years ago
Llk Quality: Run `hoa devtools:cs`. 3 years ago
Test Quality: Run `hoa devtools:cs`. 3 years ago
Visitor Quality: Run devtools:cs. 4 years ago
.Mime Fix MIME type format. 7 years ago
.State Add the state of the library. 6 years ago
.gitignore Add a `.gitignore` file. 5 years ago
CHANGELOG.md Prepare 3.16.08.15. 3 years ago
Ll1.php Quality: Run `hoa devtools:cs`. 3 years ago
README.md Documentation: Refresh some phrasings. 3 years ago
composer.json Composer: New stable libraries. 4 years ago

README.md

Hoa

Hoa is a modular, extensible and structured set of PHP libraries. Moreover, Hoa aims at being a bridge between industrial and research worlds.

Hoa\Compiler state

This library allows to manipulate LL(1) and LL(k) compiler compilers. A dedicated grammar description language is provided for the last one: the PP language.

Installation

With Composer, to include this library into your dependencies, you need to require hoa/compiler:

{
    "require": {
        "hoa/compiler": "~3.0"
    }
}

Please, read the website to get more informations about how to install.

Quick usage

As a quick overview, we will look at the PP language and the LL(k) compiler compiler.

The PP language

A grammar is constituted by tokens (the units of a word) and rules (please, see the documentation for an introduction to the language theory). The PP language declares tokens with the following construction:

%token [source_namespace:]name value [-> destination_namespace]

The default namespace is default. The value of a token is represented by a PCRE. We can skip tokens with the %skip construction.

As an example, we will take the simplified grammar of the JSON language. The complete grammar is in the hoa://Library/Json/Grammar.pp file. Thus:

%skip   space          \s
// Scalars.
%token  true           true
%token  false          false
%token  null           null
// Strings.
%token  quote_         "        -> string
%token  string:string  [^"]+
%token  string:_quote  "        -> default
// Objects.
%token  brace_         {
%token _brace          }
// Arrays.
%token  bracket_       \[
%token _bracket        \]
// Rest.
%token  colon          :
%token  comma          ,
%token  number         \d+

value:
    <true> | <false> | <null> | string() | object() | array() | number()

string:
    ::quote_:: <string> ::_quote::

number:
    <number>

#object:
    ::brace_:: pair() ( ::comma:: pair() )* ::_brace::

#pair:
    string() ::colon:: value()

#array:
    ::bracket_:: value() ( ::comma:: value() )* ::_bracket::

We can see the PP constructions:

  • rule() to call a rule;
  • <token> and ::token:: to declare a token;
  • | for a disjunction;
  • (…) to group multiple declarations;
  • e? to say that e is optional;
  • e+ to say that e can appear at least 1 time;
  • e* to say that e can appear 0 or many times;
  • e{x,y} to say that e can appear between x and y times;
  • #node to create a node the AST (resulting tree);
  • token[i] to unify tokens value between them.

Unification is very useful. For example, if we have a token that expresses a quote (simple or double), we could have:

%token  quote   "|'
%token  handle  \w+

string:
    ::quote:: <handle> ::quote::

So, the data "foo" and 'foo' will be valid, but also "foo' and 'foo"! To avoid this, we can add a new constraint on token value by unifying them, thus:

string:
    ::quote[0]:: <handle> ::quote[0]::

All quote[0] for the rule instance must have the same value. Another example is the unification of XML tags name.

LL(k) compiler compiler

The Hoa\Compiler\Llk\Llk class provide helpers to manipulate (load or save) a compiler. The following code will use the previous grammar to create a compiler, and we will parse a JSON string. If the parsing succeed, it will produce an AST (stands for Abstract Syntax Tree) we can visit, for example to dump the AST:

// 1. Load grammar.
$compiler = Hoa\Compiler\Llk\Llk::load(new Hoa\File\Read('Json.pp'));

// 2. Parse a data.
$ast = $compiler->parse('{"foo": true, "bar": [null, 42]}');

// 3. Dump the AST.
$dump = new Hoa\Compiler\Visitor\Dump();
echo $dump->visit($ast);

/**
 * Will output:
 *     >  #object
 *     >  >  #pair
 *     >  >  >  token(string, foo)
 *     >  >  >  token(true, true)
 *     >  >  #pair
 *     >  >  >  token(string, bar)
 *     >  >  >  #array
 *     >  >  >  >  token(null, null)
 *     >  >  >  >  token(number, 42)
 */

Pretty simple.

Compiler in CLI

This library proposes a script to parse and apply a visitor on a data with a specific grammar. Very useful. Moreover, we can use pipe (because Hoa\File\Read —please, see the Hoa\File library— supports 0 as stdin), thus:

$ echo '[1, [1, [2, 3], 5], 8]' | hoa compiler:pp Json.pp 0 --visitor dump
>  #array
>  >  token(number, 1)
>  >  #array
>  >  >  token(number, 1)
>  >  >  #array
>  >  >  >  token(number, 2)
>  >  >  >  token(number, 3)
>  >  >  token(number, 5)
>  >  token(number, 8)

You can apply any visitor classes.

Errors

Errors are well-presented:

$ echo '{"foo" true}' | hoa compiler:pp Json.pp 0 --visitor dump
Uncaught exception (Hoa\Compiler\Exception\UnexpectedToken):
Hoa\Compiler\Llk\Parser::parse(): (0) Unexpected token "true" (true) at line 1
and column 8:
{"foo" true}
       ↑
in hoa://Library/Compiler/Llk/Parser.php at line 1

Samplers

Some algorithms are available to generate data based on a grammar. We will give only one example with the coverage-based generation algorithm that will activate all branches and tokens in the grammar:

$sampler = new Hoa\Compiler\Llk\Sampler\Coverage(
    // Grammar.
    Hoa\Compiler\Llk\Llk::load(new Hoa\File\Read('Json.pp')),
    // Token sampler.
    new Hoa\Regex\Visitor\Isotropic(new Hoa\Math\Sampler\Random())
);

foreach ($sampler as $i => $data) {
    echo $i, ' => ', $data, "\n";
}

/**
 * Will output:
 *     0 => true
 *     1 => { " )o?bz " : null , " %3W) " : [ false , 130 , " 6 " ] }
 *     2 => [ { " ny  " : true } ]
 *     3 => { " Ne;[3 " : [ true , true ] , " th: " : true , " C[8} " : true }
 */

Documentation

Different documentations can be found on the website: http://hoa-project.net/.

License

Hoa is under the New BSD License (BSD-3-Clause). Please, see LICENSE.