HTGEN

The Website compiler and optimizer

Abstract

Websites typically use PHP or .NET to manage dynamic user interactions.

Web designers often use these languages and frameworks to generate web pages dynamically, even when the page does not contain any dynamic elements. This can lead to server overloads.

We believe that websites can be more efficiently implemented by separating the dynamic and static components.

We have developed a tool called HTGEN, a website compiler that addresses the static aspects at compile time, while the dynamic parts are managed by the chosen framework, such as PHP.

HTGEN significantly reduces server load by serving mostly static HTML pages, which are easily cached by the network infrastructure. Additionally, HTGEN transparently minifies the generated HTML, JS, and CSS files. It also supports Markdown syntax for writing web pages.

The website www.clari3d.com is entirely built using this exciting technology.

Traditional Web Design

PHP include

Websites are frequently developed using PHP, either directly or through a CMS that relies on PHP. Components of a web page are included using the include directive. Consequently, each time a client requests a page, PHP includes the components, parses, and executes them:

<html>
  <head>
  </head>
  <body>
    <?php
      include 'header.php';
      include 'content.php';
      include 'footer.php';
    ?>
  </body>
</html>

Loading, parsing, analyzing, and executing PHP code takes a few milliseconds. While this is fast enough to be imperceptible to the end user, the cumulative effect across billions of web pages results in a significant amount of wasted processing time.

The question is: why perform dynamic inclusion when the page content remains unchanged?

Database server

Generally, a website is associated with a database server. On the Internet, the commonly used database server is MySQL.

SQL database servers are advanced software that listen for requests on their input port, analyze them, fetch data from storage, and format the results. The queries are written in SQL, an old-fashioned but powerful language that allows for very complex queries. These servers are highly optimized, using query optimization, caching, and other techniques.

These servers can store billions of records and access them very quickly using efficient indexing algorithms. They are transactional: every change to storage is either committed or rolled back. Additionally, they support data replication.

A CMS generally stores its content in a MySQL server. Each part of a document is stored in tables and retrieved during rendering.

The question remains: why store content in a database when the content generally changes slowly?

There is a movement on the Internet called NoDB website design that advocates for the removal of database servers. Instead, data is stored in the web server's file system. Note that this is feasible for small amounts of data, but database servers cannot be avoided for millions of records.

HTML Syntax

HTML is a relatively simple language, making it easy to write a web page directly in HTML. However, this can become tedious for longer pages.

<html>
  <head>
  </head>
  <body>
   <h1>Main title</h1>
   <p>This is a paragraph</p>
   <p>And here, a new <strong>paragraph</strong></p>
   <ul>
    <li>And here, an item</li>
    <li>Followed by another item</li>
   </ul>
  </body>
</html>

This is why CMSs are very useful: they generally have an online formatted text editor that allows you to change the text online. However, the generated HTML code can be far from optimized...

Formatting tables in HTML can be tedious:

<html>
  <head>
  </head>
  <body>
   <h1>Table example</h1>
   <table>
    <tr>
     <th>head 1</th>
     <th>head 2</th>
    </tr>
    <tr>
     <td>text 11</td>
     <td>text 12</td>
    </tr>
    <tr>
     <td>text 21</td>
     <td>text 22</td>
    </tr>
    <tr>
     <td>text 31</td>
     <td>text 32</td>
    </tr>
   </table>
  </body>
</html>

The Pains

Common website design suffers from several issues:

* Dynamic page rendering is beneficial during website development, but once the site is live, dynamic rendering is often unnecessary. Frequently, the main page, even if it is in HTML, needs to be parsed by PHP.
* Database servers are the bottleneck of websites because all data accesses are done through them.
* HTML syntax is not suitable for long web pages. CMSs simplify text writing, but there is no control over the quality of the generated HTML code.
* Backups and source control are challenging with an SQL database due to the dynamic nature of the structure and data. Additionally, data is not stored in files but on a server.

Our Web Design

This website is designed to address common issues: it uses static compiled pages, NoDB storage, Ajax, and an MVC controller.

PHP is still used for the dynamic parts of the web pages, such as user login management, and it is used in conjunction with Ajax to update only specific parts of the main page.

Static Website Compiler HTGEN

We have developed a tool to compile websites. While there are several website generators available, they generally have a strong impact on the site organization. HTGEN is versatile and does not impose any specific structure on the websites.

### Scheme

HTGEN is written in Scheme, a dialect of Lisp. Scheme expressions can be embedded inside HTML pages using special tags <?scm ... ?>.

The embedded Scheme code is executed during website generation, leaving no trace of the Scheme code in the generated web pages.

One powerful command is @include "file", which includes the content of the specified file in place of the include command. These include commands can be nested.
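
To illustrate what compile-time inclusion means, the following PHP sketch expands nested include markers once, before deployment. It is not HTGEN's implementation (HTGEN evaluates Scheme): the `<!--include ...-->` marker, the `expand_includes` function, and the in-memory `$sources` array are assumptions made to keep the example self-contained. It requires PHP 7.4 or later.

```php
<?php
// Hypothetical sketch: expand include markers once, at build time.
// The "<!--include NAME-->" marker and expand_includes() are invented here;
// HTGEN itself uses the (@include "file") Scheme command.

function expand_includes(string $file, array $sources, int $depth = 0): string
{
    if ($depth > 16) {
        throw new RuntimeException("include nesting too deep in $file");
    }
    if (!array_key_exists($file, $sources)) {
        throw new RuntimeException("missing source: $file");
    }
    // Replace each marker with the recursively expanded content of the file.
    return preg_replace_callback(
        '/<!--include\s+(\S+?)\s*-->/',
        fn(array $m): string => expand_includes($m[1], $sources, $depth + 1),
        $sources[$file]
    );
}

// In-memory stand-in for the source directory.
$sources = [
    'index.html'  => '<body><!--include header.html--><p>Hello</p></body>',
    'header.html' => '<header><!--include logo.html--></header>',
    'logo.html'   => '<img src="logo.png">',
];

$page = expand_includes('index.html', $sources);
echo $page, "\n";
```

Because the expansion runs once at build time, the served page carries no include logic at all.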

Another useful command is @files "search", which returns a list of matching files. This command allows you to obtain a list of files and process them accordingly.

Our index.html file is straightforward:

<?scm
 (@set! 'top-page "_top.html")
 (@set! 'title "Home page of Clari3D")
 (@set! 'description "Clari3D is a nice 3D viewer")
 (@set! 'keywords "andéor, 3D, viewer, cad, step, stl, obj, wavefront")
 (@template "_template.html")
?>

<?scm
(for-each (lambda (name)
            (@include (string-append "_index-" name ".html")))
          '("top" "intro" "functions" "webgl"        "products"
            "features"    "customers" "organization" "support"
            "sitemap"     "formats"   "video"        "news"
            "design"))
?>

The generated file does not contain any PHP calls; therefore, it is sent directly by the web server without any processing, making it very fast.

Output Optimization

We have added a command to generate image tags that automatically creates a low-resolution image to be loaded first, followed by the full-size image. This improves the perceived loading speed.
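
As a rough idea of what such an image command might emit, here is a PHP sketch. The `progressive_img` helper, the `-low` file suffix, and the `data-full` attribute are assumptions, not HTGEN's actual output, and the sketch only builds the tag: producing the low-resolution file and swapping in the full image on the client are left out.

```php
<?php
// Hypothetical helper: emit an <img> that references a small preview first.
// The "data-full" attribute and the "-low" suffix are assumptions.

function progressive_img(string $path, string $alt): string
{
    $info = pathinfo($path);
    $dir  = ($info['dirname'] === '.') ? '' : $info['dirname'] . '/';
    $ext  = isset($info['extension']) ? '.' . $info['extension'] : '';
    $low  = $dir . $info['filename'] . '-low' . $ext;

    return sprintf(
        '<img src="%s" data-full="%s" alt="%s">',
        htmlspecialchars($low),
        htmlspecialchars($path),
        htmlspecialchars($alt)
    );
}

$tag = progressive_img('img/viewer.png', 'Clari3D viewer');
echo $tag, "\n";
```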

Additionally, the generated HTML code is optimized to reduce its size, further increasing loading speed.
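
The kind of size reduction involved can be sketched in a few lines of PHP. This is a deliberately naive minifier written for illustration, not HTGEN's optimizer: `minify_html` is a hypothetical name, and the approach is unsafe for `<pre>`, `<textarea>`, and inline scripts, where whitespace matters.

```php
<?php
// Naive minifier sketch: strips comments and redundant whitespace.
// Not safe for <pre>, <textarea> or inline scripts.

function minify_html(string $html): string
{
    $html = preg_replace('/<!--.*?-->/s', '', $html);  // drop HTML comments
    $html = preg_replace('/>\s+</', '><', $html);       // whitespace between tags
    $html = preg_replace('/\s{2,}/', ' ', $html);       // collapse remaining runs
    return trim($html);
}

$in = "<ul>\n  <li>And here,   an item</li>\n  <!-- note -->\n"
    . "  <li>Followed by another item</li>\n</ul>\n";

$out = minify_html($in);
echo $out, "\n";
```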

Wait and Scan

The compiler can operate in scan mode, in which it monitors the source directory for changes and recompiles only the files that need updating.

For example, the news section of this website consists of a collection of Markdown files in a specific directory. The code that generates the corresponding web page processes these files and generates their HTML counterparts.

If a new file is added, the news section is recompiled, making the changes visible almost immediately. This conversion process is executed only once per new file: generate once, view many times!
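
One common way to decide which files need updating is to compare modification times, as in the PHP sketch below. Whether HTGEN uses timestamps, checksums, or another mechanism is not stated here, so treat `needs_rebuild` and the temporary paths as assumptions.

```php
<?php
// Sketch of a "recompile only what changed" test based on file mtimes.

function needs_rebuild(string $source, string $target): bool
{
    // Rebuild when the target is missing or older than its source.
    return !file_exists($target) || filemtime($source) > filemtime($target);
}

// Demo in a temporary directory (paths are illustrative only).
$dir = sys_get_temp_dir() . '/htgen-demo-' . uniqid();
mkdir($dir);
$src = "$dir/news.md";
$out = "$dir/news.html";

file_put_contents($src, "# News\n");
$before = needs_rebuild($src, $out);   // no output yet, so a build is needed

file_put_contents($out, "<h1>News</h1>\n");
touch($out, time() + 10);              // mark the output as newer than the source
clearstatcache();
$after = needs_rebuild($src, $out);    // output is up to date

var_dump($before, $after);
```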

Markdown

Markdown is a text file format with two main features: it is human-readable and can be easily transformed into HTML. It is widely used by open-source software developers for read-me files and documentation. However, due to its compatibility with HTML, it can also be effectively used for simple web pages.

In Markdown, the example given above is:

# Main title

This is a paragraph

And here, a new **paragraph**

* And here, an item
* Followed by another item

We have integrated a Markdown compiler into HTGEN that converts plain Markdown pages into HTML. It also processes embedded Markdown snippets within HTML code using the <?md ... ?> tag. This ability to blend HTML and Markdown is extremely powerful.

Markdown is particularly effective for creating tables:

#### Table example

| head 1  | head 2  |
|---------|---------|
| text 11 | text 12 |
| text 21 | text 22 |
| text 31 | text 32 |
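
To show how mechanical the translation from a pipe table to HTML is, here is a small PHP sketch. It is an illustration only, not the Markdown compiler integrated into HTGEN: `md_table_to_html` is a hypothetical function that ignores alignment markers and inline formatting.

```php
<?php
// Sketch: convert a simple pipe table (header, separator, rows) to HTML.
// Alignment markers and inline formatting are ignored.

function md_table_to_html(string $md): string
{
    $lines = array_map('trim', explode("\n", $md));
    $lines = array_values(array_filter($lines, fn(string $l): bool => $l !== ''));

    // Split "| a | b |" into ["a", "b"].
    $cells = fn(string $line): array =>
        array_map('trim', explode('|', trim($line, '|')));

    $html = "<table>\n";
    foreach ($lines as $i => $line) {
        if ($i === 1) {
            continue;                    // skip the |---|---| separator row
        }
        $tag = ($i === 0) ? 'th' : 'td';
        $html .= ' <tr>';
        foreach ($cells($line) as $cell) {
            $html .= "<$tag>" . htmlspecialchars($cell) . "</$tag>";
        }
        $html .= "</tr>\n";
    }
    return $html . "</table>\n";
}

echo md_table_to_html("| head 1  | head 2  |\n|---------|---------|\n| text 11 | text 12 |");
```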

Embedded Markdown

One of the standout features of HTGEN is its ability to seamlessly integrate HTML or PHP with Markdown within the same file.

<div class="split">
  <div>
    <?md
    Here we are in the Markdown world

    | This | is    |
    |------|-------|
    | a    | table |

    and this is a new paragraph that can be
    written across several lines.
    ?>
  </div>
  <div>
    <?md
    Of course, since Markdown allows the inclusion of HTML, there is no limit to the integration!
    ?>
  </div>
</div>
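
A preprocessor for such embedded blocks can be sketched as a search-and-replace pass over the page. The PHP below is a minimal illustration under stated assumptions: `tiny_markdown` is a stand-in that only understands paragraphs and `**bold**` (no tables, unlike HTGEN), and `expand_md_tags` is a hypothetical name.

```php
<?php
// Sketch only: tiny_markdown() is a deliberately minimal stand-in that
// handles blank-line separated paragraphs and **bold**, nothing else.

function tiny_markdown(string $md): string
{
    $out = [];
    foreach (preg_split('/\n\s*\n/', trim($md)) as $para) {
        $text  = preg_replace('/\s+/', ' ', trim($para));   // join wrapped lines
        $text  = htmlspecialchars($text, ENT_NOQUOTES);
        $text  = preg_replace('/\*\*(.+?)\*\*/', '<strong>$1</strong>', $text);
        $out[] = "<p>$text</p>";
    }
    return implode("\n", $out);
}

// Replace every embedded Markdown block with its HTML rendering.
function expand_md_tags(string $page): string
{
    return preg_replace_callback(
        '/<\?md\s(.*?)\?>/s',
        fn(array $m): string => tiny_markdown($m[1]),
        $page
    );
}

$page = "<div>\n<?md\nAnd here, a new **paragraph**\n\nwritten across\nseveral lines.\n?>\n</div>";
echo expand_md_tags($page), "\n";
```

The surrounding HTML passes through untouched, which is what makes mixing the two syntaxes in one file practical.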

Custom JavaScript Framework: Causal.js

There are many JavaScript frameworks available, some dedicated to specific tasks and others to more general purposes, such as jQuery, Angular, and Dojo.

While these frameworks are robust and stable, they can be quite large, especially if a website requires more than one framework.

To address this, we developed our own framework, Causal.js, which automates tasks such as Ajax queries and includes a set of user interface widgets. Causal.js is designed to be a small, versatile piece of code with native OS web widgets.

Ajax

Ajax allows querying the web server for a URL and obtaining the result of this query. It enables dynamic updates to parts of a web page without reloading the entire page. Causal.js offers an excellent Ajax interface that simplifies the use of Ajax.

Lightweight MVC Controller: mvc.php

MVC stands for Model-View-Controller, a design pattern widely used in web development, including in PHP. It separates the model (data access), the view (web page rendering), and the controller (which handles requests and coordinates the model and the view).

There are several MVC-based PHP frameworks available, such as Laravel or Symfony. However, these frameworks often impose a rigid structure on the website, limiting the web designer's flexibility.

We developed a specific MVC controller that is lightweight, versatile, and does not impact the site structure.
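
The separation can be illustrated with a very small dispatcher. This PHP sketch does not reproduce mvc.php: the `dispatch` function, the `news` action, and the sample items are all hypothetical and only show how a controller can connect a model to a view without imposing a directory layout.

```php
<?php
// Hypothetical minimal controller: maps an action name to a model call and
// a view. None of these names come from mvc.php.

$model = [
    // Model: returns data only, no markup.
    'news' => fn(): array => ['items' => ['First item', 'Second item']],
];

$view = [
    // View: turns data into HTML, no data access.
    'news' => fn(array $data): string =>
        '<ul>' . implode('', array_map(
            fn(string $i): string => '<li>' . htmlspecialchars($i) . '</li>',
            $data['items']
        )) . '</ul>',
];

// Controller: picks the model and the view for the requested action.
function dispatch(string $action, array $model, array $view): string
{
    if (!isset($model[$action], $view[$action])) {
        return '<p>Not found</p>';
    }
    return $view[$action]($model[$action]());
}

echo dispatch('news', $model, $view), "\n";
```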

Lightweight FSDB File System Database Engine

Website designers often default to using databases for storing data due to the powerful querying capabilities of SQL. However, this approach requires a database server, which can quickly become a bottleneck as all data requests are routed through it, necessitating dynamic page rendering.

For smaller datasets (fewer than 500k records), storing data in the file system can be more efficient. However, file storage lacks the indexing of internal record attributes, which databases provide.

To address this, we developed a custom storage system with interchangeable drivers for both file system and database storage, using an identical API. This system functions as a relational database engine based on the file system, supporting fast indexing, SQL-like queries, and constraints. It is efficient for up to 100,000 records, and we are currently working on a transaction manager.
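
The idea of interchangeable drivers behind one API can be sketched as follows. This is a simplified illustration, not the FSDB engine: the `RecordStore` interface, both driver classes, and `register_user` are hypothetical, and indexing, constraints, and queries are left out.

```php
<?php
// Sketch of "same API, different driver". All names here are hypothetical
// and far simpler than the storage system described above.

interface RecordStore
{
    public function insert(string $table, array $values): array;
    public function all(string $table): array;
}

// Driver 1: one JSON file per table in a directory.
final class JsonFileStore implements RecordStore
{
    private string $dir;

    public function __construct(string $dir)
    {
        $this->dir = $dir;
    }

    public function all(string $table): array
    {
        $file = "{$this->dir}/$table.json";
        return is_file($file) ? json_decode(file_get_contents($file), true) : [];
    }

    public function insert(string $table, array $values): array
    {
        $rows   = $this->all($table);
        $rows[] = $values;
        file_put_contents("{$this->dir}/$table.json", json_encode($rows), LOCK_EX);
        return $values;
    }
}

// Driver 2: plain in-memory arrays, for example for tests.
final class MemoryStore implements RecordStore
{
    private array $tables = [];

    public function all(string $table): array
    {
        return $this->tables[$table] ?? [];
    }

    public function insert(string $table, array $values): array
    {
        $this->tables[$table][] = $values;
        return $values;
    }
}

// Calling code depends only on the interface, so drivers can be swapped.
function register_user(RecordStore $db, string $login): int
{
    $db->insert('users', ['login' => $login]);
    return count($db->all('users'));
}

$dir = sys_get_temp_dir() . '/fsdb-demo-' . uniqid();
mkdir($dir);
$fromFile   = register_user(new JsonFileStore($dir), 'alice');
$fromMemory = register_user(new MemoryStore(), 'alice');
var_dump($fromFile === $fromMemory);
```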

The API is defined by a PHP abstract class (method bodies are omitted in this listing):

<?php
namespace FSDB;

class Transaction {
};

class FileData {
  /*! debug flag */
  public $debug = false;

  /* check a table structure.
   * [attr-name => ['id'      | 'id:table' | 'id:~table' |
   *                'integer' | 'number'   | 'char'      |
   *                'string'  | 'date'     | 'blob'      |
   *                'primary' | 'index'    | 'uniq'      |
   *                'not-null']
   * return string: the error string or NULL.
   */
  protected function check_structure (& $structure);


  // I N T E R F A C E

  /*! FileData constructor.
   * @param location the database location,
   * @param journalized if true, transactions, journals and rollbacks are used,
   * @param transactioned if true, an automatic transaction is created on
   * startup,
   * @param log optional log manager,
   * @return FileData: a new FileData.
   */
  function __construct (string & $location,
                        bool     $journalized   = false,
                        bool     $transactioned = false,
                        ?Log   & $log           = NULL);

  /*! Set an option.
   * delete-blob-files: when blobs are put in the db, the original file is
   *                    deleted
   * lock:              if true, set a lock manager, unlocked accesses
   *                    otherwise,
   * journalized:       if true, transactions, journals and rollbacks are used,
   * read_only:         the whole db is in read only,
   * transactioned:     create an overall transaction,
   * @param name the option name,
   * @param value the option value,
   * @return void.
   */
  function set (string $name, mixed $value): void;

  /*! Get an option.
   * @param name the option name,
   * @return mixed: the value or NULL if not defined.
   */
  function get (string $name): mixed;

  /*! Begin a transaction
   * @param $transaction the optional transaction or NULL,
   * @return ?Transaction: the new transaction or NULL.
   */
  function begin (?Transaction $transaction = NULL): ?Transaction;

  /*! End a transaction
   * @param $transaction the optional transaction or NULL,
   * @return the abstract transaction object or NULL if the transaction
   * has been rolled back.
   */
  function end (?Transaction $transaction = NULL): ?Transaction;

  /*! Rollback a transaction
   * @param transaction the transaction to rollback or NULL for the default
   * one,
   * @param reasons the reasons argument,
   * @return bool: the status.
   */
  function rollback (?Transaction $transaction, mixed $reasons): bool;

  /*! Tell whether a transaction has been rolled back.
   * @param transaction the transaction or NULL,
   * @return boolean.
   */
  function rolledback (?Transaction $transaction = NULL): bool;

  /*! Return the list of the tables.
   * @param transaction the transaction or NULL,
   * @return array: the array of table names or NULL on error.
   */
  function tables (?Transaction $transaction = NULL): ?array;

  /*! Return the size of a table.
   * @param name the table name,
   * @param transaction the transaction or NULL,
   * @return int: the size.
   */
  function size (string $name, ?Transaction $transaction = NULL): int;

  /*! Create a table.
   * @param name the table name,
   * @param structure the table structure,
   * @param $transaction the optional transaction or NULL,
   * @return bool: the status.
   */
  function create (string       $name,
                   array        $structure,
                   ?Transaction $transaction = NULL): bool;

  /*! Test table existence.
   * @param name the table name,
   * @param $transaction the optional transaction or NULL,
   * @return bool: the status.
   */
  function exists (string $name, ?Transaction $transaction = NULL): bool;

  /*! Return the structure of a table.
   * @param table the table name,
   * @param transaction the transaction or NULL,
   * @return structure: the structure or NULL on error.
   */
  function structure (string $table, ?Transaction $transaction = NULL): ?array;

  /*! Rename a table.
   * @param table the table name,
   * @param new_name the new name,
   * @param transaction the transaction or NULL,
   * @return bool: the status.
   */
  function rename (string       $table,
                   string       $new_name,
                   ?Transaction $transaction = NULL): bool;

  /*! Delete a table.
   * @param name the table name,
   * @param $transaction the optional transaction or NULL,
   * @return bool: the status.
   */
  function delete (string $name, ?Transaction $transaction = NULL): bool;

  /*! Insert a new record in a table
   * @param table the table name,
   * @param values the values,
   * @param $transaction the optional transaction or NULL,
   * @return array: the new records or NULL on error.
   */
  function insert (string       $table,
                   array        $values,
                   ?Transaction $transaction = NULL): ?array;

  /*! Remove some records.
   * @param $table the table,
   * @param $records the records to remove, as result of a select,
   * @param $transaction the optional transaction or NULL,
   * @return bool: the status.
   */
  function remove (string       $table,
                   array        $records,
                   ?Transaction $transaction = NULL): bool;

  /*! Update the given records of the table with the array of values.
   * @param table the table,
   * @param records the records, as a result of select,
   * @param values the array of values to update,
   * @param $transaction the optional transaction or NULL,
   * @return bool: the status.
   */
  function update (string       $table,
                   array        $records,
                   array        $values,
                   ?Transaction $transaction = NULL): bool;

  /*! Select the value of the matching records from the table where the
   * attributes equal the given values.
   * @param table the table,
   * @param attributes attribute or array of attributes,
   * @param exprs optional where expression or array of expressions as:
   *   test  := [ '=' | '<>' | '<' | '>' | '<=' | '>=' | 'LIKE' ]
   *   exp   := [ 'ident' test 'value' ]
   *   op    := AND | OR
   *   where := exp
   * @param $transaction the optional transaction or NULL,
   * @return array or NULL: the array of found records or NULL.
   */
  function select (string       $table,
                   mixed        $attributes,
                   mixed        $exprs = NULL,
                   ?Transaction $transaction = NULL): ?array;

  /*! Reindex all the records of the table, if specified, or all the tables.
   * @param table the table to reindex, or NULL to reindex all the tables,
   * @param transaction the transaction or NULL,
   * @return boolean: the status.
   */
  function reindex (?string      $table       = NULL,
                    ?Transaction $transaction = NULL): bool;

  /*! Insert a new attribute in the table.
   * @param transaction the transaction or NULL,
   * @param table the table,
   * @param attribute the attribute to insert,
   * @param specs the attribute specifications,
   * @return bool: the status.
   */
  function insert_attribute (string       $table,
                             array        $attribute,
                             array        $specs,
                             ?Transaction $transaction = NULL): bool;

  /*! Delete an attribute from the table.
   * @param table the table,
   * @param attribute the attribute to delete,
   * @param transaction the transaction or NULL,
   * @return bool: the status.
   */
  function delete_attribute (string       $table,
                             array        $attribute,
                             ?Transaction $transaction=NULL): bool;

  /*! Update an attribute of the table.
   * @param table the table,
   * @param attribute the attribute to update,
   * @param specs the attribute specifications,
   * @param transaction the transaction or NULL,
   * @return bool: the status.
   */
  function update_attribute (string       $table,
                             array        $attribute,
                             array        $specs,
                             ?Transaction $transaction=NULL): bool;

  /*! Rename an attribute of the table.
   * @param table the table,
   * @param attribute the attribute to rename,
   * @param new_name the new name of the attribute,
   * @param transaction the transaction or NULL,
   * @return bool: the status.
   */
  function rename_attribute (string       $table,
                             string       $attribute,
                             string       $new_name,
                             ?Transaction $transaction=NULL): bool;

  /*! Return the actual path of a blob file.
   * @param record the record,
   * @param attribute the blob attribute name,
   * @return string: the blob path.
   */
  function blob_path (array & $record, string $attribute): ?string;

  /* Execute a function in a transaction.
   * @param callback the callback function with the transaction object as
   * argument,
   * @param transaction the transaction or NULL,
   * @return data: return the result of the callback call or false.
   */
  function execute (callable     $callback,
                    ?Transaction $transaction = NULL): mixed;

  /*! Select and return the only record; if there are several records,
   * return NULL.
   * @param table the table,
   * @param attributes string with one attribute or array of attributes,
   * @param exprs optional where expression,
   * @param transaction the transaction or NULL,
   * @return array: the record or NULL.
   */
  function select_record (string       $table,
                          mixed        $attributes,
                          ?array       $exprs       = NULL,
                          ?Transaction $transaction = NULL): ?array;

  /*! Upgrade the table structure according to the given structure.
   * @param table the table,
   * @param structure the structure to upgrade,
   * @param transaction the transaction or NULL,
   * @return status: status in ['created', 'modified', 'unchanged']
   */
  function upgrade (string       $table,
                    array        $structure,
                    ?Transaction $transaction = NULL): string;
}

All other PHP scripts in our system that access data (primarily the `model.php` files in our MVC implementation) use this class. During initialization, either the file driver or the database driver is chosen and instantiated. If we decide to switch from one driver to another, all the code remains unchanged!
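
The where-expression grammar documented for `select` (`[ident, test, value]` with the tests `=`, `<>`, `<`, `>`, `<=`, `>=`, and `LIKE`) can be evaluated against in-memory records as sketched below. This is an illustration, not FSDB code: `matches` and `filter_records` are hypothetical names, and the SQL-style meaning of `%` and `_` in `LIKE` is an assumption.

```php
<?php
// Sketch: evaluate one [ident, test, value] expression against a record.
// matches() and filter_records() are illustrative names, not part of FSDB.

function matches(array $record, array $exp): bool
{
    [$ident, $test, $value] = $exp;
    $field = $record[$ident] ?? null;

    switch ($test) {
        case '=':    return $field == $value;
        case '<>':   return $field != $value;
        case '<':    return $field <  $value;
        case '>':    return $field >  $value;
        case '<=':   return $field <= $value;
        case '>=':   return $field >= $value;
        case 'LIKE':
            // Assumed SQL semantics: % matches any run, _ matches one character.
            $re = strtr(preg_quote($value, '/'), ['%' => '.*', '_' => '.']);
            return preg_match('/^' . $re . '$/i', (string) $field) === 1;
        default:
            throw new InvalidArgumentException("unknown test: $test");
    }
}

function filter_records(array $records, array $exp): array
{
    return array_values(
        array_filter($records, fn(array $r): bool => matches($r, $exp))
    );
}

$users = [
    ['login' => 'alice', 'age' => 31],
    ['login' => 'bob',   'age' => 25],
    ['login' => 'alan',  'age' => 40],
];

print_r(filter_records($users, ['age', '>=', 30]));        // alice, alan
print_r(filter_records($users, ['login', 'LIKE', 'al%'])); // alice, alan
```

A real engine would consult its indexes instead of scanning every record, which is where the indexing support mentioned above comes in.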


## Conclusion

This exploratory work on website compilation has been very exciting because the results **truly transform the lives of web designers**. We have **reduced the loading time by a factor of four** compared to the previous PHP-driven website, even with a one-page design.

Additionally, we have gained more control over **SEO aspects** because every generated page benefits from new compiler features as soon as they are developed.

**Markdown syntax has greatly simplified content management**, making it simple, readable, and manageable. Tables are written almost naturally, and the other formatting syntax is easy to use. The ability to combine Markdown and HTML significantly enhances the power of expression.

The **removal of the SQL engine** has also increased the website's loading speed by eliminating the bottleneck of traditional websites.

HTGEN is not available for download; however, Andéor may distribute it as an open-source package in the future.

## Links

* JavaScript
  * [Ajax](http://www.w3schools.com/ajax/): Programming samples.
  * [Angular](https://angularjs.org): JavaScript framework.
  * [Dojo](https://dojotoolkit.org): JavaScript framework.
  * [jQuery](https://jquery.com): JavaScript framework.

* PHP
  * [Drupal](https://www.drupal.org): CMS.
  * [Laravel](https://laravel.com): PHP framework.
  * [Symfony](https://symfony.com): PHP framework.
  * [WordPress](https://wordpress.com): CMS.

* Website
  * [Google AMP Project](https://www.ampproject.org): A specification
    for static Web pages for mobile.
  * [Hugo](https://gohugo.io): Static Website generator.
  * [Markdown](https://daringfireball.net/projects/markdown/):
    Syntax description.
  * [NoDB](http://stratos.seas.harvard.edu/files/stratos/files/nodb-cacm.pdf):
    NoDB explanations.
  * [Static Site Generators](https://staticsitegenerators.net): The
    definitive listing of Static Site Generators - all 437 of them!
  * [Why Static Website Generators Are The Next Big Thing](https://www.smashingmagazine.com/2015/11/modern-static-website-generators-next-big-thing/).
