PageCentric - A Light-weight PHP Framework
Introduction
PageCentric is a light-weight PHP web application framework intended for use when developing bespoke web applications. It is not intended for user-generated, content-oriented websites such as business websites, blogs, etc.
PageCentric provides functionality in the following key areas:
- A page centric method of defining page structures
- Comprehensive input filtering
- A simple mechanism for interacting with MySQL Stored Procedures
- An account authentication and authorisation system based on SQL Stored Procedures
System Architecture
The PageCentric based web application is written in PHP and executes within a standard Apache Web Server environment. For increased security, the MySQL database should be hosted on a separate virtual machine.
The PHP application communicates with the database only using SQL Stored Procedures. The PHP application authenticates to the database using an account that is only allowed to call SQL Stored Procedures.
All authentication and authorisation is handled by the MySQL database.
The Users_Session_Replace( username, password ) procedure is used to authenticate the PHP session and obtain a session ID.
The session ID is later passed to any SQL Stored Procedures that require authorisation.
Host Configuration
Apache Configuration
Apache requires only a slightly customised version of the configuration file normally found in Debian-based Linux distributions.
<VirtualHost *:80>
    ServerName   www.example.org
    ServerAdmin  webmaster@example.org
    DocumentRoot /local/served/live/EXAMPLE
PageCentric uses the auto_prepend_file directive to prepend the configuration file to the beginning of each PHP file executed.
    <Directory /local/served/live/EXAMPLE/>
        php_value auto_prepend_file "/local/served/live/EXAMPLE/Example/latest/example/configuration/configuration.php"
    </Directory>
PageCentric uses the Apache rewrite module to redirect all requests to a single index.php script. The rewrite condition ensures that this only occurs if the requested URL does not correspond to an existing file. This is critical as it allows the browser to retrieve images, CSS, and JavaScript resources.
    <Directory /local/served/development/JOBSCAST/>
        RewriteEngine On
        RewriteBase /
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteRule /* index.php
    </Directory>
The rest of the Apache configuration file is standard. On production systems, it uses the expires module to configure caching. Access and error logs are also configured.
</VirtualHost>
Application Installation
The PHP application is installed beneath the Apache DocumentRoot. By convention, applications are stored within a directory hierarchy of:
the application name in capitalised camel case;
the version number;
then the application name in lower case.
For example:
    .../EXAMPLE/Example/0.1.0/example/...
    .../EXAMPLE/Example/0.1.0/example/bin/index.php
    .../EXAMPLE/Example/0.1.0/example/configuration/configuration.php
    .../EXAMPLE/Example/0.1.0/example/cronjobs/...
    .../EXAMPLE/Example/0.1.0/example/dep/libpagecentric/
    .../EXAMPLE/Example/0.1.0/example/lib/...
    .../EXAMPLE/Example/0.1.0/example/resources/css/...
    .../EXAMPLE/Example/0.1.0/example/resources/images/...
    .../EXAMPLE/Example/0.1.0/example/resources/javascript/...
    .../EXAMPLE/Example/0.1.0/example/share/articles/...
    .../EXAMPLE/Example/0.1.0/example/share/content/...
    .../EXAMPLE/Example/0.1.0/example/share/sql/...
    .../EXAMPLE/Example/0.1.0/example/share/templates/...
    .../EXAMPLE/Example/0.1.0/example/source/php/...
A symbolic link called 'latest' is created that points to the current version.
.../EXAMPLE/Example/latest → 0.1.0
Links to the resources directory and the main bin/index.php PHP script, both resolved through the latest symbolic link, are created in the DocumentRoot.
    .../EXAMPLE/index.php → Example/latest/example/bin/index.php
    .../EXAMPLE/resources → Example/latest/example/resources
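The layout convention above can be sketched as a small path helper. This is only an illustration: `install_paths` is a hypothetical function, not part of PageCentric, and it assumes a single-word application name so that `ucfirst` is enough to produce the capitalised form.

```php
<?php
// Hypothetical helper: derives the conventional installation paths
// for an application from its name and version.
function install_paths( string $doc_root, string $app, string $version ): array
{
    $camel   = ucfirst( strtolower( $app ) );   // e.g. "Example"
    $lower   = strtolower( $app );              // e.g. "example"
    $release = "$doc_root/$camel/$version/$lower";

    return array(
        "release"   => $release,
        // Each link is a pair: (link location, link target).
        "latest"    => array( "$doc_root/$camel/latest", $version ),
        "index"     => array( "$doc_root/index.php", "$camel/latest/$lower/bin/index.php" ),
        "resources" => array( "$doc_root/resources", "$camel/latest/$lower/resources" ),
    );
}

$paths = install_paths( "/local/served/live/EXAMPLE", "example", "0.1.0" );
echo $paths["release"], "\n";
```

Because the two DocumentRoot links resolve through `latest`, switching versions only requires repointing that one link.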
Dependencies
Library dependencies are stored underneath the dep directory. For example, the libpagecentric library, which provides the PageCentric framework, is located at:
.../EXAMPLE/Example/latest/example/dep/libpagecentric/
In a development environment these would probably be symbolic links to other project locations. During deployment these libraries are copied along with the main application.
Lib
The lib directory contains a PHP file that defines an autoload function for the application/library. The autoload function automatically includes any required dependencies, obviating the need for include statements within source code files.
    <?php

    spl_autoload_register( function( $classname )
    {
        // Map each class name to the file that defines it.
        switch ( $classname )
        {
        case 'File':
            include( "some/php/File.php" );
            break;
        }
    } );
These autoload functions need only be included once in the index.php script after the include path has been suitably configured.
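As a self-contained illustration of the autoload mechanism, the sketch below registers an autoloader driven by a class map and then instantiates a class without any explicit include. The `Greeter` class, the temporary directory, and the `$class_map` array are all made up for the demonstration; a real library would map its own classes to files on the include path.

```php
<?php
// A throwaway class file is generated so the sketch runs on its own.
$dir = sys_get_temp_dir() . "/autoload_sketch_" . getmypid();
if ( !is_dir( $dir ) ) { mkdir( $dir ); }
file_put_contents( "$dir/Greeter.php", "<?php class Greeter { function hello() { return 'hello'; } }" );

// Hypothetical class map: class name => file that defines it.
$class_map = array( "Greeter" => "$dir/Greeter.php" );

spl_autoload_register( function( $classname ) use ( $class_map ) {
    // Only include files for classes this library knows about;
    // any other registered autoloaders handle the rest.
    if ( isset( $class_map[$classname] ) ) {
        include_once( $class_map[$classname] );
    }
} );

// No include statement is needed here: instantiation triggers the autoloader.
$greeter = new Greeter();
echo $greeter->hello(), "\n";
```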
Share
The share directory contains resources the web application uses locally, i.e. they are not retrieved by a client web browser.
Content
The content directory contains web page content that has been broken out into separate content text files. Typically, the subdirectories of the content directory are named to correspond to a specific page, e.g. index corresponds to the home page while about-index corresponds to an 'about' page. Other potential directories include modals, placeholders, and popups.
This content is retrieved by the PHP application by calling:
Content::getHTMFor( directory, filename )
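How such a lookup behaves is not documented here, so the following is only a guess at the general idea: resolve a directory and filename beneath a content root and return the file's contents. `content_html_for`, the content root argument, and the `intro.htm` filename are all hypothetical.

```php
<?php
// Hypothetical stand-in for a content lookup; not the framework's own code.
function content_html_for( string $content_root, string $directory, string $filename ): string
{
    // basename() strips any path components, so "../" cannot escape the content root.
    $path = $content_root . "/" . basename( $directory ) . "/" . basename( $filename );

    // A missing file yields an empty string rather than an error.
    return is_file( $path ) ? file_get_contents( $path ) : "";
}

// Demonstration with a temporary content tree.
$root = sys_get_temp_dir() . "/content_sketch_" . getmypid();
if ( !is_dir( "$root/about-index" ) ) { mkdir( "$root/about-index", 0777, true ); }
file_put_contents( "$root/about-index/intro.htm", "<p>About us</p>" );

$found   = content_html_for( $root, "about-index", "intro.htm" );
$missing = content_html_for( $root, "about-index", "nothing.htm" );
echo $found, "\n";
```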
SQL
This directory contains the SQL files used to initialise the database.
Templates
This directory contains templates for the mailer cronjob.
Source
The source directory contains a php directory, which in turn contains directories that organise the application's source code.
The use of a flat directory structure is recommended.
For a library this should be in the form:
domain.module-name.grouping
For example:
    .../php/myapp.profile.controllers/ (DEPRECATED)
    .../php/myapp.profile.controls/
    .../php/myapp.profile.elements/
    .../php/myapp.profile.forms/
    .../php/myapp.profile.modals/
    .../php/myapp.profile.models/
    .../php/myapp.profile.page/
    .../php/myapp.profile.sidebars/
    .../php/myapp.profile.views/
Note
The myapp.profile.page directory is not spelt '.pages' because macOS treats a directory with that suffix as a Pages document rather than a directory.
In the future, the other directories may lose their plural nature in order to achieve consistency.
Experimental
A finer level of grouping for related objects has also been tried. The bundles keyword has been used to indicate when a directory bundles together components that might normally be spread between different directories.
For example, the following directory might contain controls, forms, and elements:
.../php/myapp.profile.bundles.work_experience/
Object Types
PageCentric distinguishes between different forms of component so that the programmer can reason about them more easily. Often, the key difference between these types of object is how they are initialised and what side-effects they are allowed to cause.
An instance of a subclass of the Page class is the root object of the composition hierarchy. Usually, a single user class called SitePage will inherit from Page. This class implements any aspects that are shared by all pages within a web application. Further classes then inherit from SitePage.
Instances of Controls, Elements, Forms, Modals, Sidebars, and Views are, in turn, instantiated as members of page classes.
Page |-- View(s) |-- Control(s) |-- Model(s) |-- Form(s) |-- View(s) |-- Element(s)
The composition hierarchy is typically similar to that represented above. Control objects are responsible for handling posted actions from the user. A control will use a Model object to store data to the database, or retrieve data from the database.
Note
In the past, Controller objects were used to handle actions that store data into the database. These are currently being phased out (or perhaps repurposed) -- the work now being divided between the control and the model objects.
Page
Typically, in the main index.php PHP script, a page is instantiated, the setTitle( ... ) method is called, then the render() method is called.
When constructed, if the user is logged in, the Page class accesses the MySQL database using a session ID that is stored in a cookie.
Page methods
When render() is called, subclasses are able to call the following methods on Page.
Method | Action/Returns |
addModal( $modal ) | adds modal |
getEmail() | email address of user |
getFamilyName() | family name of user |
getGivenName() | given name of user |
getIDType() | ID type of user |
getUser() | ID of user |
getUserHash() | user hash of user |
getUserType() | same as getIDType() |
getPageId() | page path with '/' replaced with '-' |
getPagePath() | page path |
getRequest( $key ) | value of POST/GET |
getSession() | session object |
getSessionId() | sessionid for database access |
isAdmin() | TRUE if admin |
isAuthenticated() | TRUE if authenticated |
isHomePage() | TRUE if path is / |
isSupportedBrowser() | TRUE if supported browser |
logout() | logout |
setTitle() | set the title |
showModal( $modal_id ) | when page is loaded modal is shown |
Page members
When render() is called, subclasses are able to access the following members of Page.
Member | Description |
request | The array of filtered form inputs |
debug | A debug Printer used for debugging |
Note
The debug printer should only be used before the bodyStart method is called. Debug output is printed between the HTML HEAD section and BODY section as HTML comments.
The Page::render call-graph
In most cases, the render() method shouldn't be overridden. However, render in turn calls other methods that are intended to be overridden by derived classes.
    Page::render()
     |
     |-- redirect( $debug )
     |-- presync( $debug )
     |-- headers( $out )
     |-- doctype( $out )
     |-- html
          |
          |-- htmlStart( $out )
          |-- htmlContent( $out )
          |    |
          |    |-- headStart( $out )
          |    |-- headContent( $out )
          |    |    |
          |    |    |-- title( $out )
          |    |    |-- meta( $out )
          |    |    |-- stylesheets( $out )
          |    |    |-- javascript( $out )
          |    |
          |    |-- headEnd( $out )
          |    |-- sync( $debug )
          |    |-- bodyStart( $out )
          |    |-- bodyContent( $out )
          |    |    |
          |    |    |-- bodyNavigation( $out )
          |    |    |-- bodyBackground( $out )
          |    |    |-- bodyBreadcrumbs( $out )
          |    |    |-- bodyHeader( $out )
          |    |    |-- bodyMiddle( $out )
          |    |    |-- bodyFooter( $out )
          |    |
          |    |-- bodyModals( $out )
          |    |-- bodyPopups( $out )
          |    |-- finalJavascript( $out )
          |    |-- bodyEnd( $out )
          |
          |-- htmlEnd( $out )
Typically, the SitePage class will override the methods called by headContent (in order to include CSS and/or JavaScript), and will also override the bodyNavigation, bodyMiddle, and bodyFooter methods.
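The relationship between render() and its overridable hooks is an instance of the template method pattern. The sketch below shows the idea in isolation; the `Sketch*` classes are hypothetical, include only a handful of the hooks, and collect output in a plain array rather than using the framework's printer object.

```php
<?php
// Reduced stand-in for Page: render() fixes the order, hooks do the work.
class SketchPage
{
    function render()
    {
        $out = array();                 // collected output lines
        $this->headContent( $out );
        $this->bodyNavigation( $out );
        $this->bodyMiddle( $out );
        $this->bodyFooter( $out );
        return implode( "\n", $out );
    }

    // Default hooks render nothing; subclasses override what they need.
    function headContent( &$out )    {}
    function bodyNavigation( &$out ) {}
    function bodyMiddle( &$out )     {}
    function bodyFooter( &$out )     {}
}

// Shared by every page in the site.
class SketchSitePage extends SketchPage
{
    function bodyNavigation( &$out ) { $out[] = "<nav>site navigation</nav>"; }
    function bodyFooter( &$out )     { $out[] = "<footer>site footer</footer>"; }
}

// A specific page only supplies its own middle section.
class SketchHomePage extends SketchSitePage
{
    function bodyMiddle( &$out )     { $out[] = "<main>home content</main>"; }
}

$page = new SketchHomePage();
$html = $page->render();
echo $html, "\n";
```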
A simple SitePage
The constructor of classes extending Page should take no parameters. By convention, the SitePage is expected to define methods that will then be overridden by subclasses so that they may take responsibility for rendering a specific part of the page.
While the SitePage class below is responsible for rendering the navigation and footer, a subclass would be responsible for overriding the renderLeftColumn and renderRightColumn methods.
    <?php

    class SitePage extends Page
    {
        function __construct()
        {
            parent::__construct();

            $this->viewNavigation = new NavigationView();
            $this->viewFooter     = new Footer();
        }

        function bodyNavigation( $out )
        {
            $this->viewNavigation->render( $out );
        }

        function bodyMiddle( $out )
        {
            $out->inprint( "<div class='center span12'>" );
            {
                $out->inprint( "<div class='row'>" );
                {
                    $out->inprint( "<div class='span span8'>" );
                    {
                        $this->renderLeftColumn( $out );
                    }
                    $out->outprint( "</div>" );

                    $out->inprint( "<div class='span span4'>" );
                    {
                        $this->renderRightColumn( $out );
                    }
                    $out->outprint( "</div>" );
                }
                $out->outprint( "</div>" );
            }
            $out->outprint( "</div>" );
        }

        function bodyFooter( $out )
        {
            $this->viewFooter->render( $out );
        }
    }
A subclass would then only need to implement those methods.
    <?php

    class HomePage extends SitePage
    {
        function renderLeftColumn( $out )
        {
            ...
        }

        function renderRightColumn( $out )
        {
            ...
        }
    }
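The examples in this document write output through `$out->inprint( ... )`, `$out->println( ... )`, and `$out->outprint( ... )`. The printer class itself is not shown here, so the sketch below is an assumption about what such an object might do: inprint writes a line and indents what follows, outprint un-indents and then writes a line. `SketchPrinter` and its `toString` method are hypothetical.

```php
<?php
// Hypothetical printer: the real class and its exact behaviour are not shown in this document.
class SketchPrinter
{
    private $depth = 0;
    private $lines = array();

    // Print a line at the current depth.
    function println( $text )
    {
        $this->lines[] = str_repeat( "\t", $this->depth ) . $text;
    }

    // Print an opening line, then indent what follows.
    function inprint( $text )
    {
        $this->println( $text );
        $this->depth++;
    }

    // Un-indent, then print a closing line.
    function outprint( $text )
    {
        $this->depth = max( 0, $this->depth - 1 );
        $this->println( $text );
    }

    function toString()
    {
        return implode( "\n", $this->lines );
    }
}

$out = new SketchPrinter();
$out->inprint( "<div class='row'>" );
{
    $out->println( "<span>left column</span>" );
}
$out->outprint( "</div>" );
echo $out->toString(), "\n";
```

Under this reading, the bare braces between inprint and outprint calls have no effect on execution; they simply make the nesting of the generated markup visible in the source.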
Elements
An Element is constructed by passing an array into its constructor. The array is usually a tuple that has been retrieved from a database. Elements are unable to cause any side-effects.
    <?php

    class MyElement extends Element
    {
        function __construct( $tuple )
        {
            $this->given_name  = array_get( $tuple, "given_name"  );
            $this->family_name = array_get( $tuple, "family_name" );
        }

        function render( $out )
        {
            $out->inprint( "<div data-class='MyElement'>" );
            {
                $out->println( "<span>$this->given_name : $this->family_name</span>" );
            }
            $out->outprint( "</div>" );
        }
    }
Views
A View is constructed by passing a $page reference to its constructor. $page is a reference to the enclosing subclass of Page. The $page->getSessionId() method can be used to retrieve the session ID, which may then be passed to model methods to retrieve information.
By convention, View classes should not call any Model methods that alter the database.
    <?php

    class MyView extends View
    {
        function __construct( $page )
        {
            $sid    = $page->getSessionId();
            $tuples = MyModel::retrieveUsers( $sid, $page->debug );

            $this->elements = array();

            foreach ( $tuples as $tuple )
            {
                $this->elements[] = new MyElement( $tuple );
            }
        }

        function render( $out )
        {
            $out->inprint( "<div data-class='MyView'>" );
            {
                foreach ( $this->elements as $element )
                {
                    $element->render( $out );
                }
            }
            $out->outprint( "</div>" );
        }
    }
Controls
A Control is constructed by passing a $page reference to its constructor. The $page->getSessionId() method can be used to retrieve the session ID, which may then be passed to model methods to retrieve and store information.
By convention, when a control is constructed it first checks whether an action has occurred that requires it to store data into the database. Then it will retrieve any required data from the database, or construct any views, tables, or elements it may require.
Important !!!
Authorisation (access control) is performed by the called MySQL Stored Procedures. Input validation logic performed by control objects is performed for the sake of usability and UI integrity.
    <?php

    class MyControl extends Control
    {
        function __construct( $page )
        {
            $sid  = $page->getSessionId();
            $USER = $page->getUser();

            switch ( $page->getRequest( "action" ) )
            {
            case "users_update":
                $required["email"] = "EMAIL";
                $required["phone"] = "PHONE";

                $iv = new InputValidation( $page->request, $required );

                if ( $iv->validate() )
                {
                    MyModel::updateUser( $sid, $page->request, $page->debug );
                }
                break;

            default:
                $iv = new InputValidation( array(), array() );
            }

            $tuple = MyModel::retrieveUser( $sid, $USER, $page->debug );

            $this->form = new MyForm( $iv, $tuple );
        }

        function render( $out )
        {
            $out->inprint( "<div data-class='MyControl'>" );
            {
                $this->form->render( $out );
            }
            $out->outprint( "</div>" );
        }
    }
Note
The InputValidation object that is created is passed to the form so that it may determine if any warnings need to be shown.
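The InputValidation class is not listed in this document. To make the flow above concrete, here is a minimal sketch of a validator that accepts the same two constructor arguments (the request array and a map of required fields to type names). Everything about it is an assumption: the class name `SketchInputValidation`, the `hasWarning` method, and the specific rules applied for "EMAIL" and "PHONE".

```php
<?php
// Hypothetical validator: names and rules are illustrative only.
class SketchInputValidation
{
    public  $request;
    private $required;

    function __construct( array $request, array $required )
    {
        $this->request  = $request;
        $this->required = $required;
    }

    // TRUE if the named field fails its rule (i.e. a warning should be shown).
    function hasWarning( $key )
    {
        if ( !isset( $this->required[$key] ) ) { return false; }

        $value = isset( $this->request[$key] ) ? $this->request[$key] : "";

        switch ( $this->required[$key] )
        {
        case "EMAIL":
            return false === filter_var( $value, FILTER_VALIDATE_EMAIL );
        case "PHONE":
            // Assumed rule: optional leading '+', then 6 to 20 digits or spaces.
            return 1 !== preg_match( '/^\+?[0-9 ]{6,20}$/', $value );
        default:
            return "" === $value;
        }
    }

    // TRUE only if every required field passes; trivially TRUE with no requirements.
    function validate()
    {
        foreach ( array_keys( $this->required ) as $key )
        {
            if ( $this->hasWarning( $key ) ) { return false; }
        }
        return true;
    }
}

$iv = new SketchInputValidation(
    array( "email" => "not-an-email", "phone" => "+44 20 7946 0000" ),
    array( "email" => "EMAIL", "phone" => "PHONE" )
);
var_dump( $iv->validate() );
```

Such checks only improve the user's experience of the form; as noted above, authorisation remains the responsibility of the stored procedures.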
Models
Note
PageCentric does not recommend any particular way of designing model objects. It is recognised that there is much disagreement and controversy on how database-based information should be represented in object-oriented systems.
For now, however, our model objects have evolved towards the following structure. The model object exposes static methods that correspond to an SQL Stored Procedure defined within the database. There are two types of methods -- update methods change the database, while query methods just interrogate the database.
Usually, update methods cause a REPLACE statement to occur, so the method often needs to pass many values to the stored procedure. As these values are typically drawn from form data, the Page::request member is passed as an argument and the required values are then retrieved using the array_get function.
The following example shows the 'UpdateUser' function of Accounts. See the "Controls" section above for an example of it being called.
    class Accounts extends Model
    {
        ...

        static function UpdateUser( $sid, $request, $debug )
        {
            $success = False;

            $email       = array_get( $request, "email"       );
            $USER        = array_get( $request, "USER"        );
            $given_name  = array_get( $request, "given_name"  );
            $family_name = array_get( $request, "family_name" );

            $sql = "Users_Update( '$sid', '$USER', '$email', '$given_name', '$family_name' )";

            $success = is_array( DBi_callProcedure( DB, $sql, $debug ) );

            return $success;
        }

        ...
    }
array_get and a number of other utility functions are defined within the 'pagecentric.util/HelperFunctions.php' PHP source file. array_get is guaranteed to return an empty string if the passed array does not contain a value corresponding to the key passed.
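The documented behaviour of array_get can be illustrated with a few lines of PHP. This is a sketch of that behaviour only, not the library's source; it is named `array_get_sketch` so that it cannot clash with the real function.

```php
<?php
// Sketch of the documented behaviour; named differently to avoid
// clashing with the library's own array_get.
function array_get_sketch( $array, $key )
{
    // isset() is false for missing keys (and for NULL values), giving "" in both cases.
    return ( is_array( $array ) && isset( $array[$key] ) ) ? $array[$key] : "";
}

$tuple = array( "given_name" => "Ada" );

echo "[", array_get_sketch( $tuple, "given_name"  ), "]\n";
echo "[", array_get_sketch( $tuple, "family_name" ), "]\n";
```

Returning an empty string rather than NULL means the result can be interpolated directly into a form or an SQL call string without further checks.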
DBi_callProcedure, and a similar function DBi_callFunction, wrap the standard PHP mysqli API. Apart from providing a simpler calling mechanism, the values they return are more consistent than those of the standard mysqli functions.
On success, DBi_callProcedure returns a (potentially empty) array of tuples (i.e. an array of arrays); otherwise FALSE is returned. Therefore, if a query's result set is a single tuple, that tuple will be the only member of the returned array.
For procedures that only change the database and don't expect a result set, an empty array is still returned on success; otherwise FALSE is returned.
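Given that return convention, calling code has three cases to distinguish: failure, success with no rows, and success with rows. The sketch below uses literal values in place of a real database call; `first_tuple` is a hypothetical helper, not part of the framework.

```php
<?php
// Hypothetical helper for the documented return convention:
// FALSE on failure, otherwise an array of tuples (possibly empty).
function first_tuple( $result )
{
    if ( !is_array( $result ) ) { return null; }              // the call failed

    return ( count( $result ) > 0 ) ? $result[0] : array();   // no rows => empty tuple
}

// Stand-ins for values DBi_callProcedure might return.
$failed  = false;
$no_rows = array();
$one_row = array( array( "USER" => 7, "email" => "ada@example.org" ) );

var_dump( first_tuple( $failed ) );     // failure is reported as NULL
var_dump( first_tuple( $no_rows ) );    // success, nothing returned
var_dump( first_tuple( $one_row ) );    // success, the single tuple
```

Note that a plain truthiness test is not enough to detect success, because an empty array is also falsy in PHP; this is why the examples in this document use is_array.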
Forms
Typically, forms will be instantiated by passing in an $iv (InputValidation) object and a $tuple containing data for the form. Generally, if the form has just been submitted, the POST values should be available in the $iv->request member. However, whether this is true or not will depend on the enclosing control and the programmer's intention.
    <?php

    class MyForm extends Form
    {
        function __construct( $iv, $tuple )
        {
            $t = $iv->validate() ? $tuple : $iv->request;

            $this->form = $this->CreateForm( $iv, $t );
        }

        function render( $out )
        {
            $out->inprint( "<div data-class='MyForm'>" );
            {
                $out->println( $this->form );
            }
            $out->outprint( "</div>" );
        }

        static function CreateForm( $iv, $tuple )
        {
            $USER  = array_get( $tuple, "USER"  );
            $email = array_get( $tuple, "email" );
            $phone = array_get( $tuple, "phone" );

            $warning_email = $iv->value( "email" ) ? "warning" : "";
            $warning_phone = $iv->value( "phone" ) ? "warning" : "";

            return "
    <form method='post'>
        <div>
            <input type='hidden' name='action' value='users_update'>
            <input type='hidden' name='USER' value='$USER'>
        </div>
        <label class='$warning_email'>
            <tt>Email</tt>
            <input type='text' name='email' value='$email'>
        </label>
        <label class='$warning_phone'>
            <tt>Phone</tt>
            <input type='text' name='phone' value='$phone'>
        </label>
        <div>
            <input type='submit' name='submit' value='Save'>
        </div>
    </form>
    ";
        }
    }
Note
No doubt many people will be offended by the inlining of HTML within a source
file in this manner.
After using a variety of manners to generate forms I have come to the conclusion
that inlining in this manner allows a form to be created with the best size versus
meaning trade-off.
- Saving as an external "template" file would create yet another file that needs to be found and managed.
- Inlining allows simple substitution of variables into the form.
- Creating forms programmatically is initially appealing; however, it is quite hard to understand exactly what might be happening with a form at a later date.
The index.php file
The index.php script needs to perform the following crucial tasks:
- Set the include path appropriately.
- Include the autoload "library" files for the project and any dependencies.
- Call Page::initialise(), which sets up several global variables.
- Instantiate an appropriate page object and render it.
Remember
A PHP configuration is automatically prepended to this file by Apache as
specified in the Apache virtual host configuration file.
The BASE global variable corresponds to the website's document root. The REDIRECT_URL is defined when Page::initialise() is called and corresponds to the page requested.
    <?php

    /*
     *  (1) Setup include path
     */

    set_include_path( BASE . "/dep/libpagecentric/source/php" . ":" . BASE . "/source/php" );

    /*
     *  (2) Include autoload "library" files
     */

    include_once( BASE . "/dep/libpagecentric/lib/libpagecentric.php" );
    include_once( BASE . "/lib/libjobscast.php" );

    /*
     *  (3) Initialise global variables
     */

    Page::initialise();

    /*
     *  (4) Instantiate and render appropriate page
     */

    switch ( REDIRECT_URL )
    {
    case "/":
        $page = new HomePage();
        $page->setTitle( "Home | My Great Web App" );
        break;

    case "/about/":
    case "/help/":
        $page = new ContentPage();
        break;

    case "/profile/":
        $page = new ProfilePage();
        $page->setTitle( "My Profile | My Great Web App" );
        break;

    ...

    default:
        $page = new FourOhFourPage();
    }

    if ( isset( $page ) )
    {
        $page->render();
    }
Security Architecture
Input Filtering
Input is filtered to prevent cross site scripting (XSS) attacks.
PageCentric web applications should access form values from the 'request' array member of the page object. These values are filtered within the object's constructor. Any cookie values are also available from this array.
    class Page
    {
        function __construct()
        {
            ...
            $this->request = Input::FilterInput( $_REQUEST, $this->debug );
            ...
        }

        ...
    }
The FilterInput static method filters each $key => $value pair.
    static function FilterInput( $request, $debug )
    {
        $filtered = array();

        // Iterate over the array that was passed in, not the superglobal.
        foreach ( $request as $key => $val )
        {
            $filtered_key = Input::Filter( $key );
            $filtered_val = Input::Filter( $val );

            $filtered[$filtered_key] = $filtered_val;
        }
        ...
    }
The 'Filter' function first translates any Unicode characters to HTML entities; then translates any special ASCII characters to HTML entities; 'addslashes' is used to preserve any text that appears like an escaped character; and finally the value is passed to DBi_escape, which escapes any characters that should be escaped before being formed into an SQL query.
    static function Filter( $value )
    {
        $value = Input::unidecode( $value );
        $value = htmlentities( $value, ENT_QUOTES, 'UTF-8', false );
        $value = get_magic_quotes_gpc() ? $value : addslashes( $value );
        $value = DBi_escape( $value );

        return $value;
    }
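To see what the middle two steps do to a hostile value, the following reduced sketch applies only htmlentities and addslashes. Input::unidecode and DBi_escape are left out because they depend on the framework and on a database connection, so `filter_sketch` is an illustration of part of the chain rather than a replacement for it.

```php
<?php
// Reduced version of the filter steps. Input::unidecode and DBi_escape are
// omitted because they depend on the framework and a database connection.
function filter_sketch( string $value ): string
{
    // Convert HTML special characters and both quote styles to entities,
    // without double-encoding entities that are already present.
    $value = htmlentities( $value, ENT_QUOTES, 'UTF-8', false );

    // Escape backslashes (quotes are already entities by this point).
    return addslashes( $value );
}

$filtered = filter_sketch( "<script>alert('x')</script>" );
echo $filtered, "\n";   // &lt;script&gt;alert(&#039;x&#039;)&lt;/script&gt;
```

After this step the value contains no angle brackets or quote characters, which is what both the XSS protection and the first SQL injection safeguard rely on.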
Invocation of Stored Procedures
Stored Procedures are invoked by forming an $sql query string, which is then passed to DBi_callProcedure.
        ...
        $sql = "Users_Update( '$sid', '$USER', '$email', '$given_name', '$family_name' )";

        $success = is_array( DBi_callProcedure( DB, $sql, $debug ) );
        ...
    }
SQL injection attacks are guarded against in these ways:
- Firstly, the values of any variables substituted into the $sql string have already been filtered (see above), so any quote characters have been converted into HTML entities.
- Secondly, if a quote is added to one of the arguments (in order to add a malicious value), the procedure call will fail because too many arguments are passed to the procedure.
- Thirdly, arguments used within SQL statements inside a stored procedure are treated as atomic values. If a quote is somehow passed, it will not affect the structure (and meaning) of the SQL statement.
Important !!!
If a stored procedure uses prepared statements, the third safeguard above does not apply. Extreme care must be taken when using prepared statements within SQL Stored Procedures.
Authentication and Authorisation
Session IDs
PageCentric uses a 64-character session ID, which is generated by taking the SHA-256 hash of a random value.
    CREATE FUNCTION generate_salt()
    RETURNS CHAR(64)
    BEGIN
        DECLARE salt CHAR(64);

        SET salt = RAND();
        SET salt = SHA2( salt, 256 );

        return salt;
    END
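For readers more familiar with PHP than MySQL, the shape of the resulting value can be reproduced as follows. This is a PHP-side illustration only: PageCentric generates the value inside the database, and the sketch swaps in PHP's `random_bytes` for RAND() as the source of randomness. `generate_salt_sketch` is a hypothetical name.

```php
<?php
// PHP-side illustration only: PageCentric generates this value inside MySQL.
// random_bytes() is used here in place of RAND() as the source of randomness.
function generate_salt_sketch(): string
{
    // SHA-256 produces 32 bytes, rendered as 64 hexadecimal characters.
    return hash( 'sha256', random_bytes( 32 ) );
}

$sid = generate_salt_sketch();
echo strlen( $sid ), "\n";
```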
Password Hashes
Password hashes are created by taking a 64-character hash of the concatenation of an encryption key, a user-specific generated salt, and the user's password value.
Note
The MySQL DES_ENCRYPT function encrypts the string "EncryptionPassPhrase" using a secret key that is located within the file system. This mitigates brute-force attacks against hashed passwords if the database content is stolen.
    CREATE FUNCTION Users_Compute_Hash( salt CHAR(64), value TEXT )
    RETURNS CHAR(64)
    BEGIN
        DECLARE enckey TEXT;
        DECLARE string TEXT;
        DECLARE hash   CHAR(64);

        SET enckey = SHA2( HEX( DES_ENCRYPT( "EncryptionPassPhrase" ) ), 256 );
        SET string = CONCAT( enckey, salt, value );
        SET hash   = SHA2( string, 256 );

        return hash;
    END
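The structure of that computation (key, then salt, then password, hashed once with SHA-256) can be mirrored in PHP to show how the salt affects the result. This is an illustration of the concatenation order only; `compute_hash_sketch` is hypothetical, and `$enckey` is a placeholder value rather than one derived with DES_ENCRYPT.

```php
<?php
// Mirrors the structure of Users_Compute_Hash in PHP for illustration.
// $enckey is a placeholder; in the stored function it is derived with DES_ENCRYPT.
function compute_hash_sketch( string $enckey, string $salt, string $value ): string
{
    return hash( 'sha256', $enckey . $salt . $value );
}

$enckey = hash( 'sha256', "placeholder-key-material" );
$salt_a = hash( 'sha256', "salt-for-user-a" );
$salt_b = hash( 'sha256', "salt-for-user-b" );

// The same password hashed for two users with different salts.
$hash_a = compute_hash_sketch( $enckey, $salt_a, "correct horse" );
$hash_b = compute_hash_sketch( $enckey, $salt_b, "correct horse" );

var_dump( $hash_a === $hash_b );
```

Because each user has a different salt, two users with the same password end up with different stored hashes, and the key component is never stored in the database alongside them.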
Authorisation
Each time an SQL Stored Procedure that requires authorisation is called, it calls the Users_Authorise_Sessionid procedure. If the session ID has not yet expired, its expiry is extended and the email address, user ID, and ID type of the account associated with the session ID are returned; otherwise the session ID is terminated.
    CREATE PROCEDURE Users_Authorise_Sessionid
    (
            $Sid     CHAR(64),
        OUT $Email   CHAR(99),
        OUT $USER    INT(11),
        OUT $IDType  VARCHAR(20)
    )
    BEGIN
        IF Users_Sessions_Verify( $Sid ) THEN

            CALL Users_Sessions_Extend_Expiry( $Sid );

            SELECT email INTO $Email  FROM users_sessions WHERE sid   = $Sid;
            SELECT USER  INTO $USER   FROM users          WHERE email = $Email;
            SELECT type  INTO $IDType FROM users_uids     WHERE USER  = $USER;

        ELSE

            CALL Users_Sessions_Terminate( $Sid );

        END IF;
    END
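The verify, extend, or terminate logic can be modelled in a few lines of PHP using an in-memory array in place of the users_sessions table. This is a model of the flow only, since the real check happens inside MySQL. The function name, the array layout, and the 30-minute lifetime are all assumptions. The last lines show how a caller might use the returned user ID as a guard.

```php
<?php
// Assumed lifetime; the real value is defined inside the database.
const SESSION_LIFETIME = 1800;

// In-memory stand-in for the users_sessions table, keyed by session ID.
function authorise_session( array &$sessions, string $sid, int $now )
{
    if ( isset( $sessions[$sid] ) && $sessions[$sid]["expiry"] > $now )
    {
        // Still valid: extend the expiry and return the account details.
        $sessions[$sid]["expiry"] = $now + SESSION_LIFETIME;

        return array( "email" => $sessions[$sid]["email"], "USER" => $sessions[$sid]["USER"] );
    }

    // Unknown or expired: terminate the session.
    unset( $sessions[$sid] );

    return null;
}

$now      = 1000000;
$sessions = array(
    "live-sid"  => array( "email" => "ada@example.org", "USER" => 7, "expiry" => $now + 60 ),
    "stale-sid" => array( "email" => "bob@example.org", "USER" => 8, "expiry" => $now - 60 ),
);

$live  = authorise_session( $sessions, "live-sid",  $now );
$stale = authorise_session( $sessions, "stale-sid", $now );

// Guard: act only when the session user is the target user.
$target_user = 7;
$may_update  = ( null !== $live ) && ( $live["USER"] === $target_user );
var_dump( $may_update );
```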
Access Control
Access control decisions are made by using the values returned from the Users_Authorise_Sessionid procedure.
Typically, the user ID associated with the session (@USER) is used as a guard within the SQL statement. Below, if the target user ID ($USER) is the same as the session user ID (@USER) the SQL is performed.
    CREATE PROCEDURE Users_Update_Name
    (
        $Sid          CHAR(64),
        $USER         INT(11),
        $given_name   CHAR(50),
        $family_name  CHAR(50)
    )
    BEGIN
        CALL Users_Authorise_Sessionid( $Sid, @email, @USER, @idtype );

        IF @USER = $USER THEN

            UPDATE users
            SET    given_name=$given_name, family_name=$family_name
            WHERE  USER=$USER;

        END IF;
    END
Local Access Procedures
Some SQL Stored Procedures are only intended for access by local, non-authenticated services. Access to these services is determined by calling the Is_Local_Caller function.
Caution !!!
Such methods would be vulnerable if the database is not run on a separate virtual machine as intended.
    CREATE FUNCTION Is_Local_Caller()
    RETURNS BOOL
    DETERMINISTIC
    BEGIN
        DECLARE $USER TEXT;

        SET $USER = USER();

        return ('public@localhost' = $USER OR 'root@localhost' = $USER);
    END
Transport Security
It is expected that communication between the web server and the client is encrypted using Secure Sockets Layer/Transport Layer Security (SSL/TLS).
Scalability
Scalability is achieved by ensuring that state is maintained within the database.
Any number of identical web servers may be added to the system.
As the need arises slave (read-only) databases may be added to the system.
Potential Issues
Access to Stored Procedures
It is important to ensure that the web server only has access to those stored procedures that are required for its operation.
Efficiency of Database Access
Efforts may be made to make database access more efficient by caching the results of queries during a page load.
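One way such caching might look is a small per-request cache keyed by the procedure call string, as sketched below. Nothing like this is described as existing in PageCentric; `QueryCache` is hypothetical, the callback stands in for a call to DBi_callProcedure, and `Users_Retrieve` is a made-up procedure name.

```php
<?php
// Hypothetical per-request cache keyed by the procedure call string.
class QueryCache
{
    private $results = array();

    // Runs $run( $sql ) the first time a given string is seen; afterwards
    // the stored result is returned without touching the database.
    function get( string $sql, callable $run )
    {
        if ( !array_key_exists( $sql, $this->results ) )
        {
            $this->results[$sql] = $run( $sql );
        }
        return $this->results[$sql];
    }
}

// Stand-in for a database call that counts how often it is invoked.
$calls = 0;
$run   = function( $sql ) use ( &$calls ) {
    $calls++;
    return array( array( "USER" => 7 ) );
};

$cache  = new QueryCache();
$first  = $cache->get( "Users_Retrieve( 'sid', '7' )", $run );
$second = $cache->get( "Users_Retrieve( 'sid', '7' )", $run );

echo $calls, "\n";
```

Only query methods would be suitable for this treatment; update methods must always reach the database, and any cached results for affected data would need to be discarded after an update.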