Backend Architecture

Version / Description / Author / Date
0.1 / Creation of initial document. / Aaron Cottle / 10/18/2007
0.2 / Description of the public API to the backend, and the design of the database. / Aaron Cottle / 10/18/2007
0.2.5 / Converted the document to new documentation standard. / Felipe Serrano / 10/25/2007
1.0 / Updated document to reflect final release. / Aaron Cottle / 12/21/2007
1.1 / Added list of Backend files / M. Freeburg / 12/26/2007

Contents

Introduction

Public API

Data Flow

Backend Object

Content

Entities

Searching

Searching

Tags and Comments

Database Access

Backend Database Tables

Category

Content

ContentProperties

Data

Entity

EntityProperties

IdType

Notifications

PageViews

Permissions

Relation

SystemSettings

Tag

TagProperties

TransactionBuyer

TransactionContents

Backend Files

Access Control

IAccessControl.cs & AccessControl.cs

IAccessControlList.cs & AccessControlList.cs

AccessControlSettingsHelper.cs

IAccessControlObject.cs

RelationsPermissionCalculator.cs

Backend

IBackend.cs, IBackendInternal.cs, & Backend.cs

BackendFactory.cs

DatabaseConnection.cs

GetBackendObjectVisitor.cs

BackendObject

IBackendObject.cs & BackendObject.cs

IBackendObjectCollection.cs

BackendQuery

BackendQuery.cs

IQueryValidator.cs & QueryValidator.cs

IStatementMember.cs & StatementMember.cs

SparqlTranslator.cs

Statement.cs

BackendRelation

IBackendRelation.cs & BackendRelation.cs

Category

ICategory.cs & Category.cs

Collections

IDBDictionary.cs, DBDictionary.cs, & ADBDictionary.cs

IBackendDictionary.cs & BackendDictionary.cs

GroupCollection.cs

Content

IContent.cs & Content.cs

ContentCollection.cs

DB Configuration

CreateTables.sql

DevDB.reg & ProdDB.reg

Entity

IEntity.cs & Entity.cs

EntityCollection.cs

IUser.cs & User.cs

Exceptions

BackendDisposedException.cs

BannedUserException.cs

IllegalAccessException.cs

IncorrectPasswordException.cs

ObjectNotCreatedException.cs

ObjectNotDeletedException.cs

ObjectNotFoundException.cs

ObjectNotInitializedException.cs

UserAlreadyExistsException.cs

UserNotValidatedException.cs

Notifications

Relationships

Tags

ITag.cs & Tag.cs

Transactions

IPaymentDelegate.cs

ITransaction.cs & Transaction.cs

ITransactionMember.cs & TransactionMember.cs

Miscellaneous

AuthenticationSystem.cs

BackendConstants.cs

NpgsqlStream.cs

Introduction

The backend provides a display-independent way to store, retrieve, and search for data. This document broadly describes both the public-facing API and the internal architecture hidden from the outside, including database schemas. For a detailed description of every callable method, consult the code comments. IBackend.cs contains all of the methods that can be called on the backend, along with their documentation.

Public API

A number of public data interfaces are exposed, each representing an object in the system. The top-level interface is IBackendObject, which provides a number of attributes common to all objects in the backend. The IEntity, IUser, and IContent interfaces derive from it, each adding methods specific to entities, users, or content. Another exposed interface, IBackendObjectCollection, not only derives from IBackendObject but also implements ICollection. A collection can therefore be treated as a backend object in its own right while also containing a number of backend objects. In this way a group of, say, users can be treated in exactly the same way as a single object. No concrete implementations of any of these interfaces are exposed.

One adapter interface, IBackend, houses all the methods that can be called on the backend to store or retrieve data. No concrete implementation is exposed, but an additional class, BackendFactory, has a public static method which creates one. All access to the backend must go through the IBackend implementation provided by BackendFactory. IBackend controls all access to database connections and opens a new connection when it is instantiated, so every instance must be enclosed in a “using” block in C# to ensure that it is properly disposed of and that connections are not left open. IBackend also manages transactions, allowing multiple changes to the database to be committed together with a single call to SaveChanges(). Any changes which are not followed by a call to SaveChanges() are rolled back when the backend instance is disposed.
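The connection and transaction lifecycle described above can be sketched as follows. This is a minimal Python illustration, not the C# implementation: the class SketchBackend, its methods, the entity table layout, and the in-memory SQLite database are all assumptions made for the example. Only the commit-on-SaveChanges() and rollback-on-dispose behavior comes from the description. Python's with statement plays the role of the C# “using” block.

```python
import sqlite3


class SketchBackend:
    """Hypothetical stand-in for an IBackend implementation (not the real class)."""

    def __init__(self, connection):
        # The real backend opens its own connection. Here one is passed in so
        # the example can inspect the database after the backend is disposed.
        self._conn = connection

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Disposal: anything not committed by save_changes() is rolled back.
        self._conn.rollback()
        return False

    def create_entity(self, username):
        self._conn.execute("INSERT INTO entity (username) VALUES (?)", (username,))

    def save_changes(self):
        # Counterpart of SaveChanges(): commit all pending changes together.
        self._conn.commit()


def entity_names(connection):
    rows = connection.execute("SELECT username FROM entity ORDER BY username")
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (username TEXT)")
conn.commit()

with SketchBackend(conn) as backend:
    backend.create_entity("matt")
    backend.save_changes()          # committed
    backend.create_entity("brad")   # never saved, so rolled back on dispose

saved = entity_names(conn)
print(saved)
```

In this sketch only the first entity survives, because the second insert was never followed by a save before the block ended.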

Data Flow

Backend Object

There are a number of features common to all objects stored in the backend. To create a new object, parameters are passed into the proper method on IBackend. The backend then allocates a new Guid based upon what type of object is being created, saves the given information to the database, and returns the created object. All mutator methods on a backend object write to the database immediately, but within the backend’s open transaction, so the changes are not committed until SaveChanges() is called on the IBackend.

Each backend object internally retains a reference to the backend which created it in order to change the database when needed. This means that once the backend is disposed (such as when the using block has closed) any backend objects, even if still in scope, cannot change themselves. It’s best to think of a backend object as a reader on a stream: a reader can be set to retrieve information from a stream, but if the stream is closed, the reader becomes worthless. Most backend objects locally cache data values, so it may be possible to read information about a backend object after the backend is closed, but trying to set information will throw an exception.
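The reader-on-a-stream behavior can be illustrated with a small Python sketch. All names here are hypothetical: SketchBackend, SketchBackendObject, and BackendDisposedError are invented for the example. The file list does include a BackendDisposedException.cs, but whether that is the exception thrown in this situation is an assumption.

```python
class BackendDisposedError(Exception):
    """Hypothetical Python analogue of a 'backend disposed' exception."""


class SketchBackend:
    """Hypothetical backend that refuses writes once it has been disposed."""

    def __init__(self):
        self.disposed = False
        self.pending = {}   # changes written during the open transaction

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.disposed = True
        return False

    def write(self, object_id, field, value):
        if self.disposed:
            raise BackendDisposedError("backend has been disposed")
        self.pending[(object_id, field)] = value


class SketchBackendObject:
    """Keeps a reference to its backend and a local cache of its values."""

    def __init__(self, backend, object_id, name):
        self._backend = backend
        self.id = object_id
        self._name = name   # locally cached value

    @property
    def name(self):
        # Reads are served from the cache, so they work after disposal.
        return self._name

    @name.setter
    def name(self, value):
        # Writes go through the backend first, so they fail after disposal.
        self._backend.write(self.id, "name", value)
        self._name = value


with SketchBackend() as backend:
    video = SketchBackendObject(backend, "content-1", "Holiday video")
    video.name = "Holiday video (final)"

cached_name = video.name   # still readable from the local cache
try:
    video.name = "Too late"
    write_failed = False
except BackendDisposedError:
    write_failed = True
```

Because the setter contacts the backend before updating the cache, a failed write leaves the cached value unchanged.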

Backend objects can be retrieved from IBackend by passing a certain amount of uniquely identifying information to the proper method. All backend objects have a Guid which uniquely identifies them, and it is even possible to determine the type of an object given only its Guid.
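One way to recover an object's type from its Guid alone is to record the type when the Guid is allocated, which is consistent with the IdType table described under Backend Database Tables. The following Python sketch assumes that approach; the ObjectType values and the function names are hypothetical, and the real SearchTypes enumeration is not reproduced in this document.

```python
import uuid
from enum import Enum


class ObjectType(Enum):
    # Hypothetical values standing in for the real SearchTypes enumeration.
    ENTITY = 1
    CONTENT = 2
    TAG = 3


id_type = {}   # stand-in for the IdType table: Guid -> object type


def create_object(object_type):
    """Allocate a new Guid and remember which type of object it names."""
    new_id = uuid.uuid4()
    id_type[new_id] = object_type
    return new_id


def type_of(object_id):
    """Determine an object's type given only its Guid."""
    return id_type[object_id]


tag_id = create_object(ObjectType.TAG)
content_id = create_object(ObjectType.CONTENT)
```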

Content

Content objects represent any type of user-uploaded data. Content objects, once created, can produce any number of different data streams which can be written to or read. Each stream can be retrieved by a key. For example, a newly uploaded video will have the “default” stream of the actual video content itself, but it may also have a “thumbnail” stream, which provides a small snapshot of a frame of the video.
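The keyed-stream idea can be sketched in Python as follows. The class SketchContent and its methods are hypothetical, and in-memory byte buffers stand in for the database-backed streams; only the “default” and “thumbnail” keys come from the example above.

```python
import io


class SketchContent:
    """Hypothetical content object holding any number of named data streams."""

    def __init__(self, name):
        self.name = name
        self._streams = {}

    def open_stream(self, key="default"):
        # Create the stream on first use, then rewind it so the caller can
        # read from the beginning or overwrite from the beginning.
        stream = self._streams.setdefault(key, io.BytesIO())
        stream.seek(0)
        return stream

    def stream_keys(self):
        return sorted(self._streams)


video = SketchContent("Holiday video")
video.open_stream("default").write(b"<video bytes>")
video.open_stream("thumbnail").write(b"<jpeg bytes>")

thumbnail = video.open_stream("thumbnail").read()
```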

Entities

Any agent viewing or creating content in the system is classified as an IEntity. The IEntity interface uses three types of uniquely identifying information: the Guid universal to all backend objects, plus an email address and a username. Each entity has various permissions on types of content. For example, the author of a piece of content has permission to view and edit it, and an entity who has purchased the content can view but not edit it. An entity who has not purchased the content cannot view it in full, but could perhaps view a preview of it.

Each entity also has a password associated with it. A supplied password can be verified against the one stored in the database to determine whether the entity can successfully log in. Entities can also be temporarily “banned” from the system for violating terms of use, at the discretion of an administrator, which prevents them from logging in.
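A login check along these lines might look like the Python sketch below. The Entity table describes the stored password as an MD5 hash of the password appended to the user id; the exact byte layout used here (password text followed by the Guid's string form, UTF-8 encoded) is an assumption, as are the function names, the dictionary representation of an entity, and the order in which the ban and the password are checked.

```python
import hashlib
import hmac
import uuid


class BannedUserError(Exception):
    """Hypothetical analogue of BannedUserException."""


class IncorrectPasswordError(Exception):
    """Hypothetical analogue of IncorrectPasswordException."""


def hash_password(password, entity_id):
    # Assumed layout: the entity's Guid is appended to the password before hashing.
    return hashlib.md5((password + str(entity_id)).encode("utf-8")).digest()


def log_in(entity, password):
    """Return True on success; raise if the entity is banned or the password is wrong."""
    if entity["banned"]:
        raise BannedUserError(entity["username"])
    supplied = hash_password(password, entity["id"])
    if not hmac.compare_digest(supplied, entity["password"]):
        raise IncorrectPasswordError(entity["username"])
    return True


entity_id = uuid.uuid4()
entity = {
    "id": entity_id,
    "username": "matt",
    "banned": False,
    "password": hash_password("s3cret", entity_id),
}
ok = log_in(entity, "s3cret")
```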

One entity can also “sponsor” one or more other entities, showing that the sponsor favors the sponsored entity’s content and wishes to promote him or her. The backend provides methods for obtaining both the list of all sponsors of an entity and all entities whom this entity is sponsoring.

Searching

Searching goes through several stages. For a search starting from a user’s typed text, the search module parses and translates the query into a search tree. The relations module then translates the search tree into a series of SPARQL statements, which are bundled together inside a BackendQuery. Backend queries are accepted by IBackend, turned into SQL, and executed, returning all information which matched the query. Backend queries can also be formed manually, without starting from user input.

Searching

Searching pulls information primarily from the Relation table in the database. The Relation table is an RDF-style table built on the “subject-predicate-object” form. In brief, two backend objects form the subject and the object; these can be thought of as nouns. The predicate describes the relation between the two backend objects; this can be thought of as a verb. RDF, which is a W3C standard, is excellent for showing that different types of relationships exist between objects, but searching also needs the ability to store the “strength,” or value, of a relationship. The value can be thought of as an adverb. As an example, the simple sentence “Matt trusts Brad 10%” can thus be stored in the Relation table.

SPARQL is a language designed to query RDF-style databases. We have again extended it slightly to incorporate the notion of the strength of a relationship. SPARQL allows sentences like the one above to be formed, but also allows variables to take the place of some literals, representing unknown information. The pseudo-SPARQL sentence “?truster Trusts Brad > 10%” is a simple query to find all people who trust Brad more than 10%. Joining this to the sentence “?truster MemberOf BradLikers” further requires that every returned result not only trust Brad more than 10% but also be a member of the group “BradLikers.” More of these statements can be bundled together in a single BackendQuery to specify precisely what results are desired. Backend queries are passed into the backend through the IBackend interface.
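The way joined statements narrow a result set can be sketched in Python. This is an in-memory illustration only: the real backend translates queries into SQL, and the Relation and Statement classes below are simplified, hypothetical stand-ins for the real classes. The strength bound is treated as strictly greater than, following the “>” in the example query.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Relation:
    subject: str
    predicate: str
    obj: str
    value: int = 100   # strength of the relationship, in percent


@dataclass(frozen=True)
class Statement:
    subject: str                      # a literal, or a variable such as "?truster"
    predicate: str
    obj: str
    min_value: Optional[int] = None   # exclusive lower bound on strength


def is_variable(term):
    return term.startswith("?")


def match(statement, relation, bindings):
    """Return the extended bindings if the relation satisfies the statement, else None."""
    if statement.predicate != relation.predicate:
        return None
    if statement.min_value is not None and not relation.value > statement.min_value:
        return None
    new_bindings = dict(bindings)
    pairs = ((statement.subject, relation.subject), (statement.obj, relation.obj))
    for term, actual in pairs:
        if is_variable(term):
            # A variable must keep the same value across all joined statements.
            if new_bindings.setdefault(term, actual) != actual:
                return None
        elif term != actual:
            return None
    return new_bindings


def run_query(statements, relations):
    """Join the statements: keep only bindings that satisfy every one of them."""
    results = [{}]
    for statement in statements:
        results = [
            extended
            for bindings in results
            for relation in relations
            if (extended := match(statement, relation, bindings)) is not None
        ]
    return results


relations = [
    Relation("Matt", "Trusts", "Brad", 40),
    Relation("Anna", "Trusts", "Brad", 5),
    Relation("Cleo", "Trusts", "Brad", 80),
    Relation("Matt", "MemberOf", "BradLikers"),
    Relation("Anna", "MemberOf", "BradLikers"),
]
query = [
    Statement("?truster", "Trusts", "Brad", min_value=10),
    Statement("?truster", "MemberOf", "BradLikers"),
]
trusters = sorted(b["?truster"] for b in run_query(query, relations))
```

With this sample data, Anna is excluded by the strength bound and Cleo by the group membership, leaving only Matt.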

Tags and Comments

Tags can be used to provide additional information about a piece of content. Since tags are backend objects, each time a tag is created it is allocated a new Guid, regardless of whether or not the tag’s title has been used before.

The relations between tags and the thing they describe are stored as relations in the relations table in the format “TagId Tags ContentId.” This general method of storing tags allows a wide number of things to be stored in the same fashion, such as comments and messages.
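The tag storage scheme can be sketched in Python as follows. The dictionaries and function names are hypothetical stand-ins for the Tag and Relation tables; the “Tags” predicate and the rule that every tag receives a fresh Guid come from the text above.

```python
import uuid

tags = {}        # stand-in for the Tag table: tag Guid -> title
relations = []   # stand-in for the Relation table: (subject, predicate, object)


def tag_content(title, content_id):
    """Create a tag and relate it to a piece of content as 'TagId Tags ContentId'."""
    # Every tag gets a fresh Guid, even when the title has been used before.
    tag_id = uuid.uuid4()
    tags[tag_id] = title
    relations.append((tag_id, "Tags", content_id))
    return tag_id


def titles_for(content_id):
    """List the titles of all tags attached to a piece of content."""
    return sorted(
        tags[subject]
        for subject, predicate, obj in relations
        if predicate == "Tags" and obj == content_id
    )


video_a = uuid.uuid4()
video_b = uuid.uuid4()
first_funny = tag_content("funny", video_a)
second_funny = tag_content("funny", video_b)
tag_content("cats", video_a)
```

Comments or messages could reuse the same shape with a different predicate, which is the generality the paragraph above refers to.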

Database Access

The database connection, username, and password are stored in the Windows registry. The system has been designed to be agnostic with respect to the type of database used. An adapter class, DatabaseConnection, abstracts all of the implementation-specific details and serves as a factory for connections, commands, and parameters.

Backend Database Tables

There is a visitor on IContent objects which performs different actions based on the content type of the object. Specifically, this visitor is used to determine how to store and retrieve different types of data. While there are some general tables which handle all of the common properties of content, the additional information which is specific to different content types is stored in separate tables. Each content type, in essence, knows how to store and retrieve itself. To add a new content type, add a new entry to the ContentType enum, and add new entries to the extended visitors for the new content type; everything else remains the same.
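The dispatch-by-content-type idea can be sketched in Python. Everything named here is hypothetical: the ContentType members, the visitor classes, and the table names are invented for illustration, since the real enum members and type-specific tables are not listed in this document.

```python
from enum import Enum


class ContentType(Enum):
    # Hypothetical members; the real ContentType enum is not shown here.
    VIDEO = 1
    ARTICLE = 2


class ContentVisitor:
    """Dispatches to one visit method per content type."""

    def visit(self, content):
        handler = getattr(self, "visit_" + content["type"].name.lower())
        return handler(content)


class StorageTableVisitor(ContentVisitor):
    """Picks the type-specific table that holds a content object's extra data."""

    def visit_video(self, content):
        return "VideoDetails"      # hypothetical table name

    def visit_article(self, content):
        return "ArticleDetails"    # hypothetical table name


holiday = {"type": ContentType.VIDEO, "name": "Holiday video"}
essay = {"type": ContentType.ARTICLE, "name": "On trust"}
video_table = StorageTableVisitor().visit(holiday)
article_table = StorageTableVisitor().visit(essay)
```

Under this design, supporting a new content type means adding one enum member and one visit method to each visitor, while the calling code stays unchanged.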

One alternative to the extended visitor pattern would be to have different public sub-interfaces of IContent, one for each type of content. Implementations of these interfaces would contain the content-specific knowledge about how to store and retrieve data. The benefit of this is that a general purpose dictionary of key-value pairs would not be necessary, and different types of data would have methods specifically for retrieving it.

Another alternative to having multiple tables is to have one large table which serves as a list of key-value pairs for all types of content. The benefit of this is that once the backend is written it would never need to be changed, even when adding new types of content. The downside is that the key-value model may be too inflexible for some types of data.

Category

Stores the hierarchy of all categories.

Id / uuid / Guid to uniquely identify this category.
Name / char varying(200) / Name of this category.
CreationDate / timestamp / Time this category was created.
Parent / uuid / The parent category.

Content

Stores all universal properties of content objects.

Id / uuid / Guid to uniquely identify this content object.
Type / smallint / Specification of the type of content.
Name / char varying(200) / The name to display of the content object.
Published / boolean / Flag to determine if this content should be viewable or not.
Description / char varying(4000) / Description of the content object.
CreationDate / timestamp / The time this content was created.
Deleted / boolean / Flag to determine if this content has been deleted.
Price / numeric(8,2) / Price at which this should be sold. 0 if content is free.

ContentProperties

A dictionary of additional non-searchable properties about a piece of content.

Id / uuid / The Id of the piece of content.
Name / char varying(50) / The key of the dictionary.
Value / char varying(200) / The value of the dictionary.

Data

A mapping of PostgreSQL data streams to content objects.

Id / uuid / The Id of the piece of content.
Data / oid / The PostgreSQL data stream id.
VersionKey / char varying(30) / The key to know what version of data this is.

Entity

Stores all universal properties of entities.

Id / uuid / The Id of this entity.
Email / char varying(150) / The email address of this entity. Also used as a login.
Password / bytea / MD5 hash of password appended to user id.
Name / char varying(500) / The display name of this entity.
Validated / boolean / Flag for if this entity has had its real-world identity validated.
Deleted / boolean / Flag for if this entity has been deleted.
CreationDate / timestamp / The time this entity was added to the system.
Username / char varying(32) / The username of this entity.
Banned / boolean / Flag for if this entity has been banned from the system.

EntityProperties

A dictionary of additional non-searchable properties about an entity.

Id / uuid / The Id of the entity.
Name / char varying(50) / The key of the dictionary.
Value / char varying(200) / The value of the dictionary.

IdType

A mapping of what type of object every Guid represents.

Id / uuid / The Id of the object.
Type / integer / The SearchTypes value of the type of object.

Notifications

A listing of all notifications the system has generated.

QueueOrder / serial / The order of the notification.
ObjectNotified / uuid / The object being notified.
Handled / boolean / Flag for whether this notification has been handled.
Notification / bytea / The data of the notification.

PageViews

A list of the number of times each entity has viewed each content.

UserId / uuid / The Id of the entity.
ContentId / uuid / The content being viewed.
ViewCount / integer / The number of times the content has been viewed.
LastView / timestamp / The time the user last viewed this piece of content.

Permissions

Listing of all users and the objects they have permission for.

ObjectId / uuid / The Id of the object the user has permission on.
EntityId / uuid / The Entity who has permission.
Action / integer / The type of permission the entity has.

Relation

The RDF table containing all searchable relations between objects.

Subject / uuid / The Id of the subject.
Predicate / integer / The RelationTypes value of the type of relation.
Object / uuid / The Id of the object.
Value / integer / A measure of the “strength” of the relationship.
CreationDate / timestamp / The time the relation was created.

SystemSettings

A dictionary of system-wide properties.

Name / char varying(25) / The key of the dictionary.
Value / char varying(25) / The value of the dictionary.

Tag

Stores the information about a tag.

Id / uuid / The Id of this tag.
Name / char varying(200) / The title of the tag.
CreationDate / timestamp / The time the tag was created.
TagType / integer / The TagType value of the type of tag it is.
Description / char varying(4000) / A longer description of the tag.

TagProperties

A dictionary of additional non-searchable properties about a tag.

Id / uuid / The Id of the tag.
Name / char varying(50) / The key of the dictionary.
Value / char varying(200) / The value of the dictionary.

TransactionBuyer

A listing of all of the unique information about a single transaction, including the buyer.

TransactionId / uuid / The Id of the transaction.
BuyerId / uuid / The Id of the buyer.
Date / timestamp / The date this transaction occurred.
Delegate / bytea / The action to perform once the payment has gone through.

TransactionContents

A listing of all the pieces of content within a particular transaction.

TransactionId / uuid / The Id of the transaction.
ContentId / uuid / The content item.
Price / numeric(9, 2) / The price this content was sold for in this transaction.

Backend Files

Access Control

IAccessControl.cs & AccessControl.cs

Interface and implementation for querying access permissions of Entities upon BackendObjects.

IAccessControlList.cs & AccessControlList.cs

Interface and implementation to model the set of access permissions upon a single BackendObject.

AccessControlSettingsHelper.cs

Services for changing access permissions upon a BackendObject.

IAccessControlObject.cs

Definition of the interface implemented by BackendObjects which supports access control functionality.

RelationsPermissionCalculator.cs

Service that determines which members (subject and object) of a specific relation type need what permissions in order to conduct various access operations on that relationship.

Backend

IBackend.cs, IBackendInternal.cs, & Backend.cs

IBackend.cs contains the interface provided by the Backend to other modules. IBackendInternal.cs contains extra interface methods available only within the Backend module. Backend.cs contains the combined implementation of these interfaces.

BackendFactory.cs

Provides a safe access point for other modules to create a Backend object for use.

DatabaseConnection.cs

Provides a generic database interface for other Backend classes, to encapsulate and limit the necessary changes for using different database solutions.

GetBackendObjectVisitor.cs

Visitor that creates an appropriate specific object type from a generic BackendObject.

BackendObject

IBackendObject.cs & BackendObject.cs

IBackendObject.cs describes the interface used by other modules to interact with generic Backend objects, and BackendObject.cs contains the implementation of Backend objects.

IBackendObjectCollection.cs

Defines a Collection<BackendObject> interface, which is implemented for User and Content collections in their respective sections.

BackendQuery

BackendQuery.cs

Implementation of a model of a SPARQL-like search query for data modeled by the Backend module. Other modules build a BackendQuery and then send it to the Backend for processing.

IQueryValidator.cs & QueryValidator.cs

Interface and abstract class which provide a standard way to validate and get other information about the various query classes (BackendQuery, Statement, and StatementMember).