
Sunday, May 13, 2012

Hyland - OnBase

General Overview

OnBase is primarily an imaging system (IS), but it also provides hybrid features found in ECMS, BPMS, and BRMS products. Its core strength, however, lies in image management and in the components and tool sets it provides for image manipulation. In this blog I am going to dive into some of the features that OnBase provides. I plan to write another blog on IBM FileNet, the competitor to Hyland OnBase in this space. So let’s get started.

The figure below gives a 10,000-foot, oversimplified view of what an imaging system provides.


Documents (specifically images) enter the imaging system through various routes:
  • Users scanning in documents (point-of-contact scanning)
  • Mail rooms doing batch scanning
  • Other systems pushing electronic images into a central imaging system (OnBase) via an API (Application Programming Interface)
  • Electronic documents (not just images, but Word, PowerPoint, PDF and many more) uploaded into the imaging system
  • Many other options (this list is not complete)
Once the documents are scanned in, they are indexed with keyword values. They can then be retrieved and, assuming the content type is an “image”, image manipulation operations such as annotation, redaction, lighten/darken, rotate/reverse, and zoom-in/zoom-out can be performed on them.
In the following section, I am going to explain how OnBase configuration helps tailor the system to suit your needs. (NOTE: Since I have worked for many years designing software systems from the ground up in quite a few programming languages, I will provide hints throughout this blog about how to design a custom imaging system (IS) if you had to code it yourself. I will tag those comments with “Custom-Code:” to help folks search for ideas.)


Glossary

I will use this section as a placeholder to describe terms that I use within this blog without explaining them, as though they were crystal clear to everyone. I cannot promise to be very detailed; if I have missed any, do point it out to me so I can publish a revision.
  • Folders - Folders within OnBase allow us to arrange documents that are logically related; think of folders in Windows Explorer. Folders are used for various purposes (Workflow and Record Management are just two examples). Which documents belong to a folder is determined by the keywords that are common to the documents’ Document Types and the Folder Type of the folder.
  • Folder Types - This is just the type that defines the Folder. It allows you to define metadata such as user groups and the keywords used to associate documents with Folders of that Folder Type.
  • File Cabinets - A File Cabinet is the highest organizational unit within an OnBase folder structure; a File Cabinet cannot have a parent. It’s just another hierarchical term used within the context of the OnBase folder structure; nothing fancy.
  • Managed components/objects - This is not an OnBase term; I like to refer to configurable items within any product as managed components/objects. So to all the “OOP” guys, I apologize as this might annoy you; technically the meaning is different in the “OOAD” world.
  • Metadata/attributes/properties - I tend to treat these terms as synonyms. Components/objects have metadata/attributes/properties that can change their behavior. Once again I apologize to the “OOP” guys for not making a technical distinction within this blog.
Basic Configuration steps in OnBase 


 
Define Document Type Groups (DTG)/Document Types (DT)
OnBase allows you to define two levels of categories to categorize the documents that are taken into the imaging system. OnBase refers to them as “Document Type Groups (DTG)”, the parent category, and “Document Types (DT)”, the sub-category.
It then allows you to configure attributes (metadata) for the Document Types, which lets you control various aspects such as:
  • Which User Groups/Users (UG/U) have access to the Document Type (security)
  • What the default document content type is (Image, Word, Excel, PowerPoint etc.). You can define any document content type in order to display documents in their native program (like MS Word or MS Excel); OnBase lets you associate the content type much as you associate MIME types in your browser.
  • Which workflows the DT belongs to
  • What the retention period for the DT is
  • Which products and privileges the UG/U have access to
  • Etc.
Define Keyword Type Groups (KTG)/Keyword Types (KT)
Just as a DTG allows you to group DT into a collection of related items, a KTG allows you to group KT into a collection of related items.
A KT defines a keyword field that will be used while indexing documents. Various data field types like date, varchar, numeric etc. are available, as are format masks. You can specify whether a KT is required or optional. Quite a few other configuration options are available for KT.
Once you define KTG/KT you can assign them to DT; that way, when you scan in a document or import an electronic document, you will be asked to enter the keyword fields that will be used to index it. Here is the simple rule to remember when it comes to configuration:
  • A document maps to DTG/DT
  • Index/keyword values map to KTG/KT
 As simple as that.
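To make the mapping concrete, here is a minimal Python sketch of Document Types carrying Keyword Types, with a check that the keyword values entered at indexing time satisfy the required/optional and data type settings. The class names, field names and the sample “Invoice” type are made up for illustration; this is not how OnBase stores its configuration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class KeywordType:
    name: str
    data_type: type          # e.g. str, int, date
    required: bool = False


@dataclass
class DocumentType:
    name: str
    group: str               # the parent Document Type Group
    keyword_types: list[KeywordType] = field(default_factory=list)

    def validate_index(self, values: dict) -> list[str]:
        """Return a list of problems with the supplied keyword values."""
        problems = []
        known = {kt.name: kt for kt in self.keyword_types}
        # Required keywords must be present.
        for kt in self.keyword_types:
            if kt.required and kt.name not in values:
                problems.append(f"missing required keyword: {kt.name}")
        # Supplied keywords must be assigned to this type and well typed.
        for name, value in values.items():
            kt = known.get(name)
            if kt is None:
                problems.append(f"unknown keyword: {name}")
            elif not isinstance(value, kt.data_type):
                problems.append(f"wrong type for keyword: {name}")
        return problems


# Hypothetical sample configuration.
invoice = DocumentType(
    name="Invoice",
    group="Accounts Payable",
    keyword_types=[
        KeywordType("Invoice Number", str, required=True),
        KeywordType("Invoice Date", date),
        KeywordType("Amount", int),
    ],
)
```

Calling `invoice.validate_index({"Invoice Number": "INV-1", "Amount": 100})` returns an empty list, while leaving out the invoice number reports the missing required keyword.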

Define User Groups/Users (UG/U)
Define User Groups and Users so that you can control authorization based on the roles that users are assigned to. Pretty basic; nothing fancy here.

Assign User Groups/Users to (DTG/DT) and configure what product components and operations they can access/perform.
Once you define the UG/U, you can assign them to DTG/DT to configure access rights. While defining the UG, you can also configure the access rights that the UG has to various product components.
Examples of access rights include:

  • What actions the UG/U can perform (document retrieval, document re-index, document delete etc.)
  • What Product access the UG/U have (Workflow access, Record Management access etc.)
Custom-Code:
If you are planning to custom-code an imaging system (IS), then design the categories (DTG/DT) and keyword types (KTG/KT) in your system database and define the relationships between them. Expose a user interface (UI) for defining the categories and keyword types along with their relationships; simple, right? Make this an administrator function that you can then control through UG/U.

With regard to user groups and users, follow the same thought process. Make sure you provide integration with LDAP and custom table-driven authentication/authorization; integration with Active Directory via Kerberos or NTLM might also help.
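As a rough illustration of the authorization side, the sketch below grants actions on a Document Type to user groups and answers “may this user do this?”. It assumes group membership has already been resolved (from LDAP, Active Directory or your own tables); the class name, action names and sample users are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UserGroup:
    name: str
    # document type name -> set of allowed actions
    rights: dict[str, set[str]] = field(default_factory=dict)
    members: set[str] = field(default_factory=set)

    def grant(self, document_type: str, *actions: str) -> None:
        self.rights.setdefault(document_type, set()).update(actions)


def is_allowed(user: str, action: str, document_type: str,
               groups: list[UserGroup]) -> bool:
    """A user may act if any group they belong to grants the action."""
    return any(
        user in g.members and action in g.rights.get(document_type, set())
        for g in groups
    )


# Hypothetical groups and rights.
clerks = UserGroup("AP Clerks", members={"alice"})
clerks.grant("Invoice", "retrieve", "re-index")
admins = UserGroup("Administrators", members={"bob"})
admins.grant("Invoice", "retrieve", "re-index", "delete")
groups = [clerks, admins]
```

With this setup `is_allowed("alice", "delete", "Invoice", groups)` is `False`, while the same check for `"bob"` is `True`.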
 
For image manipulation, all you need to do is purchase some good UI controls (browser plugins if you want to do it over the web) instead of building them from scratch; there are quite a few UI controls on the market that let you perform image manipulations like zoom-in, zoom-out, rotate etc. There you go: you have a custom imaging system.

Logical Architectural components of Hyland




NOTE: The logical architectural diagram above does not show all the server components of Hyland. Hyland server components for the most part scale horizontally as well as vertically, and “yes”, they can run on virtualized servers (no more “hasp”; the “hasp” is a USB hardware key that Hyland used with earlier versions of OnBase to enforce licensing).


Each component in the diagram is listed below, followed by its description.
Web Client
OnBase Unity Client:
This is the Windows application of OnBase. It is deployed over the web using Microsoft ClickOnce technology and has a look and feel similar to the Microsoft Office suite. It’s a hybrid solution: a thick client that is deployed over the web and uses pure HTTP (or HTTPS) traffic to retrieve data from the web server.
OnBase Web Client:
Hyland provides HTML, ActiveX and Java versions of pure web-based clients, depending on what your needs are.
OnBase Reporting Service Client:
This is a Windows application similar to the OnBase Unity Client, used for running OnBase reports. It is web-deployed to the end-user workstation using Microsoft’s ClickOnce technology.
Thick Client
OnBase thick client and OnBase Desktop:
OnBase provides a thick client to access OnBase functionality.
OnBase Configuration Client:
This is a thick client that allows administrators to configure various components/objects of OnBase.
Network Equipment
This is just a placeholder in the architectural diagram for any of the following network equipment:
  • Router
  • Load balancer
  • Switches
The exact network architecture depends on what sits between your servers and the web clients. For example, if the web clients are public users, the network equipment will most likely be a router followed by a load balancer (assuming you have multiple web servers to load balance); maybe you will have a Cisco ASA appliance as well, who knows.
Web Tier
This hosts the Hyland web application component (basically a virtual ASP.NET application), and it can be load balanced. The web server is IIS.
Firewall
You will have a firewall between your tiers for security. If your web tier is in the DMZ, you may also have VLANs configured and much stricter firewall rules; maybe a router in between tiers.
Application Tier
This hosts the Hyland Application Server component (basically a virtual ASP.NET application that exposes a web service), and it can be load balanced as well. The web server is IIS. NOTE: If you do not have a DMZ, you can merge the web and application tiers to save cost.
Data Tier
The data tier has:
Database servers:
Both SQL Server and Oracle databases are supported by Hyland OnBase. Hyland OnBase also supports the High Availability (HA) configuration features provided by both databases.
SAN/File Server:
The database servers only store the metadata of OnBase images (keywords, pointers to images, user account information etc.). The actual images/documents are stored on the SAN/file server in their native format (which is TIFF for images). OnBase uses logical units called Disk Groups for storing images. Disk Groups allow us to configure various storage aspects for documents (images).
Custom-Code: If you want to design your own imaging system, all you have to do is define metadata tables, place the images on SAN/file servers, and use your application layer to access the images, exposing the application functions via an API. Use the web tier to call these application-layer APIs (examples: WCF on Windows or EJB on Java); use ASP.NET, a Windows application, or Java EE (JSP/Servlets) for the user interface.
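Here is a small Python sketch of that split: keyword metadata and a file pointer go into database tables, while the document bytes sit on disk. SQLite and a temporary directory stand in for the real database server and the SAN/file server, and the table and column names are my own invention.

```python
import sqlite3
import tempfile
from pathlib import Path


class ImageStore:
    """Metadata in a database, document bytes on a file share."""

    def __init__(self, root: Path):
        self.root = root
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE document ("
            " id INTEGER PRIMARY KEY, doc_type TEXT, file_path TEXT)"
        )
        self.db.execute(
            "CREATE TABLE keyword (doc_id INTEGER, name TEXT, value TEXT)"
        )

    def store(self, doc_type: str, content: bytes, keywords: dict) -> int:
        cur = self.db.execute(
            "INSERT INTO document (doc_type, file_path) VALUES (?, '')",
            (doc_type,),
        )
        doc_id = cur.lastrowid
        # The bytes go to disk; the database keeps only the pointer.
        path = self.root / f"{doc_id}.bin"
        path.write_bytes(content)
        self.db.execute(
            "UPDATE document SET file_path = ? WHERE id = ?",
            (str(path), doc_id),
        )
        self.db.executemany(
            "INSERT INTO keyword VALUES (?, ?, ?)",
            [(doc_id, k, str(v)) for k, v in keywords.items()],
        )
        return doc_id

    def find(self, name: str, value: str) -> list[bytes]:
        """Retrieve document contents by a keyword name/value pair."""
        rows = self.db.execute(
            "SELECT d.file_path FROM document d"
            " JOIN keyword k ON k.doc_id = d.id"
            " WHERE k.name = ? AND k.value = ? ORDER BY d.id",
            (name, value),
        ).fetchall()
        return [Path(p).read_bytes() for (p,) in rows]


store = ImageStore(Path(tempfile.mkdtemp()))
store.store("Invoice", b"TIFF-bytes-1", {"Invoice Number": "INV-1"})
store.store("Invoice", b"TIFF-bytes-2", {"Invoice Number": "INV-2"})
```

In a real design the `ImageStore` class would live in the application tier and be exposed through whatever API layer you choose; the web tier would never touch the file share directly.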
 
Hybrid Features of OnBase

In this section I am planning to focus on some of the features of OnBase that make it compete with the ECMS, BPMS and BRMS system vendors.



Record Management


A Record is defined within OnBase as a collection of related documents, and guess how you relate them: “yes”, Folders. Refer to my “Glossary” section for what “Folders” and “Folder Type” mean within the OnBase world.
Here are the steps that you will perform for Record Management (RM):
Define folder structure
Refer to my “Glossary” section for what “Folders”, “Folder Type”, and “File Cabinets” mean within the OnBase world. You define the folder structure for one purpose only: to arrange the various documents into structures that can then be managed. Think of your office file cabinet; that’s what you are defining here. As simple as that.
Define Hold Sets
Hold Sets define the reasons why a particular folder within Record Management should be kept on “hold”. When a folder is on “hold”, regular operations like adding or removing a document cannot be performed, and Retention Plans are also put on hold. Defining the reasons lets you tag and track why a folder is being kept on hold.
Define Retention Plans/Retention Plan Sets
Retention Plans let you define what “Actions” should be performed when “a certain time has elapsed” or an “Event” has occurred.
Within Record Management a folder transitions through various statuses; the status values are explained below.

  • Open: Regular operations like adding and removing a document can be performed on the folder. This is the starting status for the folder.
  • Cutoff: This status signifies that the retention period for the folder has started.
  • Close: No operation can be performed when the folder is in closed status, unless you are an administrator and need to correct the contents of the folder for whatever reason.
  • Final Disposition: In this status the record contents can be destroyed, purged, or retained permanently. OnBase has an option to require an approval process before this status and the eventual outcome of the documents.
Retention Plans let you control these transitions. When a folder is first created it is always in the “Open” status, which means users can add or remove documents from that folder. The figure below shows the status transitions that happen within a folder’s lifetime.
 
A Retention Plan Set allows you to assign, at runtime, different Retention Plans to “Folders” of the same “Folder Type” depending on the folder’s value for a specific Keyword Type. Did I mention that Folder Types can be assigned Keyword Types? Seems obvious, right? Remember my Windows folder analogy. You need some way of creating the folder hierarchy branching; an operating system’s file system keeps metadata records for its folders (“inodes” on Unix-like systems, for example). In OnBase, we use Keyword Types to auto-folder the “folders”. And guess how the documents end up in the folders: through the auto-folder Keyword Types defined at the “Document Type” level, using Keyword Types that are common to both the “Folder Type” and the “Document Type”.
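The Retention Plan Set idea can be sketched in a few lines of Python: a lookup from one keyword’s value to a plan, with a fallback. The keyword type “Employee Status”, the plan names and the `default` fallback are assumptions of mine for the example, not OnBase configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionPlan:
    name: str
    retain_years: int


@dataclass
class RetentionPlanSet:
    keyword_type: str                 # the Keyword Type that drives the choice
    plans: dict[str, RetentionPlan]   # keyword value -> plan
    default: RetentionPlan            # assumed fallback when nothing matches

    def plan_for(self, folder_keywords: dict[str, str]) -> RetentionPlan:
        value = folder_keywords.get(self.keyword_type)
        return self.plans.get(value, self.default)


# Hypothetical plans for folders of one Folder Type.
plan_set = RetentionPlanSet(
    keyword_type="Employee Status",
    plans={
        "Active": RetentionPlan("Keep while employed", 99),
        "Terminated": RetentionPlan("Seven years after exit", 7),
    },
    default=RetentionPlan("Standard", 5),
)
```

Two folders of the same Folder Type then get different plans purely because their “Employee Status” keyword values differ.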

Create Event Sets/Events
Record Management allows you to post events to managed folders so that status transitions as well as Retention Plan settings can be changed. An Event Set is just a way to group related events together. You can post events to managed folders manually or use workflow (life-cycle) actions to post them.
Events within the context of Record Management allow us to change the status of the folder or to redefine/override the Retention Plan defined for the managed folder.
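Custom-Code: the status handling can be modelled as a small state machine driven by posted events. The sketch below assumes a simple linear order (Open, Cutoff, Close, Final Disposition), invents the event names, and assumes documents can only be added while the folder is Open; a real product allows more paths than this.

```python
from enum import Enum


class Status(Enum):
    OPEN = "Open"
    CUTOFF = "Cutoff"
    CLOSE = "Close"
    FINAL_DISPOSITION = "Final Disposition"


# Assumed linear order; the event names are hypothetical.
TRANSITIONS = {
    ("cutoff", Status.OPEN): Status.CUTOFF,
    ("close", Status.CUTOFF): Status.CLOSE,
    ("dispose", Status.CLOSE): Status.FINAL_DISPOSITION,
}


class ManagedFolder:
    def __init__(self, name: str):
        self.name = name
        self.status = Status.OPEN      # folders always start as Open
        self.documents: list[str] = []

    def post_event(self, event: str) -> None:
        """Move the folder to the next status if the event is valid now."""
        key = (event, self.status)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not valid in {self.status.value}")
        self.status = TRANSITIONS[key]

    def add_document(self, doc: str) -> None:
        # Assumption for this sketch: only Open folders accept documents.
        if self.status is not Status.OPEN:
            raise PermissionError("documents can only be added to an Open folder")
        self.documents.append(doc)
```

A workflow action or an administrator screen would simply call `post_event` on the folder; the transition table keeps the rules in one place.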

Configure Document Types/User Groups
You need to configure the “Document Type” for auto-foldering and then associate it with the managed Folder Type using a common Keyword Type. All you are doing here is defining the folder structure based on keyword values, so that folders are created automatically from the keyword values of the documents being scanned in or imported. With this configuration in place, documents are placed in the managed folders for Record Management as they enter the imaging system through the various input channels.
NOTE: The actual documents are physically located in one place in the imaging system (SAN/file server); the managed folders only reference pointers to them (this saves disk space). Even in the database server, only the image pointers are kept; the actual images always reside in “Disk Groups”, which are logical units defined in OnBase whose physical location is the SAN/file server.
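Custom-Code: auto-foldering boils down to grouping document pointers by the value of a keyword that the Document Type and the Folder Type share. A minimal Python sketch, with a made-up “Employee ID” keyword and sample documents:

```python
from collections import defaultdict


def auto_folder(documents: list[dict], folder_keyword: str) -> dict[str, list[int]]:
    """Group document ids into folders keyed by a shared keyword value.

    Folders hold only document ids (pointers); the content stays in one place.
    """
    folders: dict[str, list[int]] = defaultdict(list)
    for doc in documents:
        value = doc["keywords"].get(folder_keyword)
        if value is not None:          # documents without the keyword are skipped
            folders[value].append(doc["id"])
    return dict(folders)


# Hypothetical documents of different Document Types.
documents = [
    {"id": 1, "doc_type": "Application", "keywords": {"Employee ID": "E100"}},
    {"id": 2, "doc_type": "Review", "keywords": {"Employee ID": "E100"}},
    {"id": 3, "doc_type": "Application", "keywords": {"Employee ID": "E200"}},
    {"id": 4, "doc_type": "Memo", "keywords": {}},
]
folders = auto_folder(documents, "Employee ID")
```

Documents 1 and 2 land in the same folder even though their Document Types differ, because they share the keyword value; document 4 has no value for the keyword and is not foldered.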

You also need to configure User Groups/Users that need access to Records to ensure security measures are in place.


The diagram below summarizes the steps described so far for configuring Record Management within OnBase.


Custom-Code: You can define your own terms for Folder Type and Folders, and then define metadata at the “Folder Type” level to associate various “Document Types” with the “Folder Type”. You can then define retention plans as configurable components/objects within your custom imaging system and use timer services in Java EJB, or WCF services in C# or VB.NET, to monitor the retention policies defined for various Document Types. In fact, you can use the same timer concept for other background operations that you want to perform within your custom imaging system. And of course, execute these services/components/objects in the application tier, please; remember that you have a multi-tier architecture to design and use for scalability. Who knows, you might compete with Hyland and IBM.

Document Retention

The main purpose of Document Retention is to save disk space and also ensure that documents are appropriately purged for legal reasons.
Document Retention is managed by defining a Document Retention Processor, which is basically a batch job that can either be executed on demand or scheduled to run at defined time intervals.
Document Retention can be of the following types:
  • Static Retention
  • Dynamic Retention
Static Retention allows documents to be purged without any evaluation phase. When the retention batch job runs and finds documents whose Document Types have the static retention type defined, it purges them if their retention period has elapsed.
When Document Retention is defined as “dynamic”, the batch job makes two passes. In the first pass it evaluates the retention; the following three evaluation options are available:
  • A workflow queue is used for evaluation
  • A VB script does the evaluation
  • An external program does the evaluation (remember that OnBase is Microsoft-technology oriented, so guess what external programs it can invoke: DLLs)
Once the evaluation has occurred, the batch job in its second pass purges the documents that the evaluation process has identified as “OK” to purge and whose retention period has elapsed.
Document Retention is defined at the Document Type level; seems logical. Document Retention is a separate module from the Record Management module, although both have retention capabilities; the latter is meant for record-keeping purposes, while the former is used when you truly wish to purge documents for good reasons.
If a document is in a Record Management folder that is not in the “Open” state, the Document Retention (DR) processor excludes the document from its evaluation/purging passes. This seems logical: the other record statuses (Cutoff, Close, Final Disposition) imply that a retention plan is being executed, so the documents should not be touched by any module other than Record Management, which “manages” the folder that contains them.
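Custom-Code: the retention batch job described above can be sketched as a plain function. The `Doc` fields, the day-based retention period and the `evaluate` callback (standing in for the workflow queue, VB script or external program) are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Optional


@dataclass
class Doc:
    doc_id: int
    stored_on: date
    retain_days: int
    dynamic: bool = False                        # False means static retention
    record_folder_status: Optional[str] = None   # None: not in a managed folder


def run_retention(docs: list[Doc], today: date,
                  evaluate: Callable[[Doc], bool]) -> list[int]:
    """Return the ids of documents that would be purged in this run."""
    # Skip documents held in a Record Management folder that is not Open.
    candidates = [d for d in docs
                  if d.record_folder_status in (None, "Open")]
    # Pass 1: dynamic documents must be approved by the evaluator.
    approved = [d for d in candidates if not d.dynamic or evaluate(d)]
    # Pass 2: purge approved documents whose retention period has elapsed.
    return [d.doc_id for d in approved
            if d.stored_on + timedelta(days=d.retain_days) <= today]


today = date(2012, 5, 13)
old = date(2000, 1, 1)
docs = [
    Doc(1, old, 365),                                   # static, expired
    Doc(2, old, 365, dynamic=True),                     # dynamic, approved below
    Doc(3, old, 365, dynamic=True),                     # dynamic, rejected below
    Doc(4, old, 365, record_folder_status="Cutoff"),    # managed by RM
    Doc(5, today, 365),                                 # not yet expired
]
purged = run_retention(docs, today, evaluate=lambda d: d.doc_id != 3)
```

Here only documents 1 and 2 are selected: document 3 fails evaluation, document 4 sits in a folder under Record Management, and document 5 has not reached the end of its retention period. A scheduler in the application tier would call `run_retention` on demand or at a fixed interval.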

Workflow

In my blog titled “ECMS systems, IS systems, BPMS systems and BRMS systems”, I introduced the terms document-centric and process-centric workflows. OnBase provides the former. OnBase calls workflows “life-cycles”. So let’s get started.
The workflow capabilities within OnBase are typically triggered by documents that are scanned in or electronically uploaded; hence I refer to the workflow engine as document-centric.
Workflows are configured using the OnBase Configuration thick client (the thick client used for administering various components and modules of OnBase, including workflow; it’s not a web client and is typically accessible to administrators only).
NOTE: OnBase allows you to configure a lot of metadata for the components/objects that it maintains. I am not planning to mention every piece of metadata that you can configure, as that would result in me merging all the system manuals into my blog (not a great idea); I will, however, give you some examples to give you a sense of what you can configure.

Typically you go through the following steps.
Define and Configure Life Cycle
You start by defining and configuring the life-cycle (just a fancy term for workflow; a life-cycle is simply a managed object within the realm of OnBase with metadata that needs configuration; nothing special). Examples of the metadata that you can configure at the life-cycle level include User Groups and Work Folders.
Define and Configure Queues

Within the life-cycle we then define the queues. The queues act as placeholders for the documents that need to be routed through the life-cycle. You can define as many queues as you want, depending on how many stops you want the document to go through before it exits the life-cycle. Queues can be standard or load-balanced. With load-balanced queues you have multiple options for balancing the workload; for example, you can load balance based on keyword types, keyword values, rules etc. OnBase also allows you to define Rule queues, which are used in conjunction with the Business Rule Engine module of OnBase.
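Custom-Code: a load-balanced queue is essentially a queue that picks an owner for each document as it arrives. The Python sketch below routes by a keyword value when a routing entry exists and otherwise falls back to round robin; the class, the “Region” keyword and the round-robin fallback are assumptions for illustration, not OnBase’s actual balancing options.

```python
from itertools import cycle


class LoadBalancedQueue:
    """Assigns each incoming document to a user as it enters the queue."""

    def __init__(self, users, keyword=None, routing=None):
        self._round_robin = cycle(users)
        self.keyword = keyword            # keyword type used for routing, if any
        self.routing = routing or {}      # keyword value -> user
        self.assigned = {u: [] for u in users}

    def add(self, doc_id: int, keywords: dict) -> str:
        user = self.routing.get(keywords.get(self.keyword))
        if user is None:                  # no keyword match: fall back to round robin
            user = next(self._round_robin)
        self.assigned.setdefault(user, []).append(doc_id)
        return user


# Hypothetical users and routing rule.
queue = LoadBalancedQueue(["alice", "bob"], keyword="Region",
                          routing={"West": "bob"})
```

Documents with `Region` set to `"West"` always go to `bob`; everything else alternates between the two users.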

Define and Configure Work
Within an OnBase life-cycle you can define three types of work:
  • System Work
  • User Work
  • Ad-hoc Task
Work is executed by “Actions”, and the conditions for executing the actions are defined by “Rules”. OnBase also allows you to group “Actions” and “Rules” into a “Task List”.
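Custom-Code: the rule/action pairing can be sketched as a list of steps, each with a condition and a piece of work that runs when the condition holds. The `Step` class, the `move_to` helper, the queue names and the amount threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

Document = dict  # keyword name -> value, plus a "queue" entry


@dataclass
class Step:
    rule: Callable[[Document], bool]      # condition
    action: Callable[[Document], None]    # work to do when the rule passes


def run_task_list(doc: Document, steps: list[Step]) -> None:
    """Evaluate each rule in order and run its action when it holds."""
    for step in steps:
        if step.rule(doc):
            step.action(doc)


def move_to(queue: str) -> Callable[[Document], None]:
    """Build an action that routes the document to the named queue."""
    def action(doc: Document) -> None:
        doc["queue"] = queue
    return action


task_list = [
    Step(rule=lambda d: d["Amount"] > 10_000, action=move_to("Manager Approval")),
    Step(rule=lambda d: d["Amount"] <= 10_000, action=move_to("Auto Approved")),
]

invoice = {"Amount": 25_000, "queue": "Intake"}
run_task_list(invoice, task_list)
```

After the call, the large invoice sits in the “Manager Approval” queue; a small one would be routed to “Auto Approved” by the second step.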

Define Document Types/User Groups/Users
You have to configure which Document Types are associated with the life-cycle and then, of course, associate User Groups/Users with life-cycles, queues and Document Types.

There you go; it’s as simple as that. The figure below illustrates what I have been saying so far.


A few things to remember when it comes to workflow engines that are document-centric:
  • They are document-centric (well, that’s news, right?)
  • They do provide APIs to interact with external systems. OnBase provides various APIs, but they are still document-centric, not process-centric
Remember these two points, which I keep repeating across my blogs on this topic. In a separate blog I will talk about BPMS products like IBM BPM and Oracle BPM, where I will point out the differences to make it clear when to use document-centric workflows versus process/business-centric workflows.

I also plan to write a blog on IBM FileNet, a product that competes with Hyland OnBase, just to give you a flavor of other vendors in this space. To all the advanced users (the techies), here is what it boils down to: if you want to work in an environment that is closer to the Java technology stack (Java, Linux, AIX, Oracle etc.), go for IBM FileNet; if you want an environment that is closer to Microsoft and its technology stack (C#, SQL Server, VB.NET, VBScript, Windows Server etc.), go for Hyland OnBase.

NOTE: Both databases (Oracle, SQL Server) are supported by both products (Hyland OnBase, IBM FileNet); it’s just that you are better off using one technology stack over the other, and that decision depends on your IT department’s Enterprise IT Architecture (EITA) vision and on the staff skill sets you already have. Remember, in the end it’s you who will have to maintain the systems, not the product companies.

EDM Services

The Electronic Document Management module of OnBase allows you to store, change and manage other kinds of documents, particularly Microsoft Word, Excel and PowerPoint files. You define Document Types for the documents and use plug-ins to manipulate them directly from the Microsoft Office suite while maintaining them in a central OnBase imaging repository. By enabling revisions you can also version-control these documents.
Obviously you can initiate workflows based on document imports. I am not planning to elaborate on the EDM Services feature of OnBase, as I prefer to cover the topic in the blog that I will dedicate to Microsoft SharePoint (MOSS). In my mind, Microsoft SharePoint is a better solution when you want to do Enterprise Content Management outside of image content. I prefer an ECMS for managing generic content, not just image content.

API

OnBase has tons and tons of APIs, and that is because the product has gone through a series of transformations to keep pace with Microsoft’s changing architectural and technology focus. Microsoft started with Windows applications (VC++, VB etc.) and then shifted to web-based applications (ASP, ASP.NET). Its middleware shifted from MTS/DCOM/COM+ to .NET Remoting and now WCF. Even within Windows applications there was a transformation from simple Windows Forms to WPF. Within the web world, we started with ASP, then ASP.NET, web services, AJAX, SOAP, SOA and now the cloud (Azure).

OnBase is playing catch-up, so its API libraries are spread all over the map. I hope they start phasing out some of their older API sets as well as the parts of the product tied to those older technologies. That’s easy for me to say, but I understand how difficult it is for a product company to do that.
The following diagram shows the various API options available within OnBase.

The Automation, OnBase Client and Desktop APIs are provided to allow integration with the OnBase thick client or the Desktop client.
The Pop APIs (DocPop, FilterPop, FolderPop, ObjectPop) provide the fastest integration with OnBase with minimal coding. For example, DocPop allows you to retrieve documents via a web URL by passing document retrieval parameters as query strings. Simple and quick integration.
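The query-string style of integration looks roughly like the Python sketch below. The host, path and parameter name are placeholders that I made up to show the shape of such a URL; check the DocPop documentation for your own OnBase version for the real ones.

```python
from urllib.parse import urlencode


def build_retrieval_url(base_url: str, **params) -> str:
    """Append retrieval parameters to a base URL as a query string."""
    return f"{base_url}?{urlencode(params)}"


# Placeholder host, path and parameter name; not taken from Hyland's docs.
url = build_retrieval_url(
    "https://onbase.example.com/docpop/docpop.aspx",
    docid=12345,
)
```

The calling application only has to build a link like this and open it in a browser, which is why this style of integration needs so little code.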
The Hyland.Services API is a web service that accepts SOAP requests and delivers SOAP responses to get information from OnBase.
The Unity API is an object-oriented layer on top of the Hyland.Services API that abstracts the SOAP handshake complexities away from programmers. Other than that, there is no real difference between the Unity API and the Hyland.Services API. I have a suggestion for Hyland: please rename this API to anything other than “Unity API”. It confuses a lot of folks, as Hyland has a client application called “OnBase Unity Client” which has nothing to do with the Unity API.
The following table provides details regarding the programming options available for each API set
API                  Programming option
Automation API       VB Script
OnBase Client API    COM
Pop                  URL
Desktop API          COM
Hyland.Services      COM, .NET or Java
Unity                .NET

Reports
OnBase provides quite a few reports that can be accessed via the Reporting Services Client, which is a hybrid client application that is deployed over the web via ClickOnce technology (similar to the OnBase Unity Client).
OnBase categorizes reports into groups; the following are some of the pre-configured groups within OnBase:
  • Configuration Reports
  • Document Knowledge Transfer Reports
  • Document Management
  • Document Processing
  • Licensing Reports
  • Medical Records Reports
  • Physical Records Management
  • RAC Reports
  • Records Management
  • Scanning Reports
  • Security Reports
  • Storage Reports
  • Transaction Log Reports
  • Usage Reports
  • User Reports
  • Workflow Reports
Each of these groups has multiple predefined canned reports. These predefined canned reports exist for both Oracle and SQL Server databases.

Note: If you want to custom-code a report using SQL, T-SQL, PL/SQL, stored procedures etc., the OnBase Reporting configuration allows you to do that by calling these external database objects through the database drivers; the report layout can then be rendered from the Report Configuration UI provided by OnBase.


Miscellaneous/etc.

Besides what I have said so far, Hyland likes to create business domain specific modules and there are quite a few of them, some of them are listed below.

 
Summary

In this blog I focused on the basic strengths of OnBase and did not spend time on the BAM, BPM, and BRE features of OnBase, because I think there are better products on the market for those, and I will cover them in my other blogs. I also provided tips on how to custom-code an imaging system. “Yes”, the tips were simple, but my intent was to provide pointers, tips and hints to software designers.
Hope you get to use OnBase.