Avoka Blog: Adobe LiveCycle

April 7, 2009

Processes, Orchestrations, Services, and other confusing terminology

Filed under: Components, Designing Processes, LiveCycle, LiveCycle Architecture — htreisman @ 11:47 am

Introduction

There are a number of different terms used in Adobe LiveCycle Process Management, including process, orchestration, workflow, service, and component. These can be confusing.

Part of the reason for the proliferation of terminology is the history of the Process Management product and broader industry trends, and this blog entry tries to provide that perspective. This is my understanding of how these terms fit together, and should not be regarded as a definitive explanation.

If you don’t care about the historical perspective, skip to the end for a summary.

Workflow

The first version of Process Management was called “Adobe Workflow”. And the things you created were called “Workflows”. Even today, many people will generically refer to any combination of boxes and lines that you create in Workbench as a “workflow”.

However, the term workflow was used primarily in the 80’s and 90’s to refer to more traditional image storing and routing applications, and doesn’t really adequately describe the much more useful data-oriented and integration capabilities of LiveCycle. In the 2000’s, the term BPM, or Business Process Management, started to become more widely used – not just as a technology, but also as a way of thinking about and improving your business. Various BPM disciplines evolved, including Six Sigma, Lean, and others. LiveCycle fits in much more closely with BPM than it does with the older imaging-oriented workflow systems, so…

Adobe Workflow became LiveCycle Process Management, and Workflow was replaced with …

Process

A Process is really no different from a workflow; it’s just a term that better reflects the data-orientation and integration capabilities of the LiveCycle platform.

In LiveCycle, a process is almost always associated with a Form, and a series of human interactions that allow people to interact with the form and its data. There are almost always integration steps as well, such as pre-filling the form, writing to a database, adding a digital signature, sending emails, etc. These are also known as “long-lived processes”, because they involve people interacting with the form over a period of days or weeks.

Long-lived processes have a unique process-id, have every step recorded and audited, can be re-tried if any step fails, and can be viewed and managed through the Adminui interface.

Orchestration

In LiveCycle ES, we started seeing an increasing number of processes that are short-lived, and don’t include human steps. An example is the Default Render Process in LiveCycle ES. These processes used to be written as Java code in LiveCycle 7 – in ES, being processes, they have the advantage that end-users can see and modify them very easily. It’s really just visual drag-and-drop programming. You can also easily build your own orchestrations to do something useful – for example, you may want to grab some data from an XML file, populate a form with it, encrypt the form with a password, and email the result to someone. There are no human steps in that, just a series of operations that need to be performed as quickly as possible. You could do this using the APIs, but implementing it as an orchestration is quicker, more reliable, and much easier to change.

These short-lived or “straight-thru” processes do not store an audit trail, run as fast as possible (very close to native Java speeds), and do not have a unique process-id that identifies them. They cannot be managed through Adminui, and if they fail, they simply throw an exception back to the caller, rather than being re-tried through Adminui.

Someone coined the term “Orchestration” to refer to this type of process, and the name has stuck. I’m not sure where the use of this term originated, but it’s used in several other BPM standards and products.

Notes:

  • Adobe do not officially use this term in their documentation (see http://blogs.adobe.com/livecycledocs/2009/02/the_livecycle_terminology_secr.html), but I think it’s a helpful term that identifies a process as being short-lived.
  • You can still use a long-lived process for something that doesn’t involve humans. We often use long-lived processes for processes that generally run very quickly, because of the advantages of having each step audited, and for the ability to re-try a step if anything goes wrong. I would still call this an orchestration, because it doesn’t involve any human steps.

Here is a screenshot from Workbench. This shows a process, its implementation, and its properties. This is a short-lived process, so I would call it an orchestration.

process

Component

A LiveCycle Component is basically a bunch of Java code, packaged up into a jar file, and deployed into the LiveCycle server.

LiveCycle Components are brilliant. I love ’em. They are my insurance policy. I know that there is no problem that I can’t solve in the process world, because if I get stuck, I can write Java code to implement it. Then I can turn it into a LiveCycle component, drop it into LiveCycle, and bingo, I can use it in my processes. And the real beauty of it is that anyone else can use my component too, without needing to understand programming. In LiveCycle, Adobe have created the best implementation of a server-side component model that I’ve ever seen – it’s a bit like those Visual Basic ocx controls that you can buy to build your GUI – but on the server.

The concept of LiveCycle server-side components came from a predecessor of Process Management; they were originally called QPACs – Quick Process Action Components (or QLinks, for the real old-timers).

Each component defines one or more services (see below) – although in reality, most components define only a single service, so the terms component and service are somewhat interchangeable.

A service can define one or more operations. An operation is the thing that actually does the work – like sending an email, encrypting a PDF, or writing a file to disk. Services group common operations together – so the File Utilities component contains a FileUtilsService, which in turn contains operations to read, write, create and delete files. Operations each define a number of input and output parameters.

For Java programmers:

  • Component == .jar file
  • Service == Class
  • Operation == method

Here is a screenshot of a Component within LiveCycle.

components

Avoka has been building components to solve real business problems with LiveCycle for many years, and we turn most of them into re-usable components that anyone can use. We have a large library of components that solve most of the problems we’ve encountered over the last 5 years. So most of the time, you don’t even have to learn how to build components – if it’s not already in the Adobe “box”, then check out our website – there’s a good chance we may already have built it for you.

For more information, see: http://www.avoka.com/avoka/escomponents.shtml, or email info@avoka.com

Service

Another concept that was introduced in LiveCycle ES was that of a “Service Bus” or “Service Oriented Architecture”. LiveCycle is internally implemented as a service bus. In simple terms, any bit of functionality within the LiveCycle server can be exposed to the outside world as a “Service”, and can be invoked in a number of different and useful ways. Once you’ve defined a service, you can define one or more endpoints for that service. So for example, I might have a service that encrypts a PDF – I could add a Web Service endpoint that allows me to invoke that service using Web Services. And I could also add a Watch Folder endpoint that allows me to invoke that service by dropping a file into a folder.

Now things get a little confusing. Services can be implemented as either some Java code, or as a LiveCycle process.

  • If you need to write Java code, you create a component, deploy that, and you get a service.
  • Or you can do it by creating an orchestration (or process): you define your orchestration, activate it, and you get a service.
  • Either way, you get a service. The choice of whether to use an orchestration or Java code depends on the complexity of the service you’re trying to create, and whether it can be created by joining together (or orchestrating) a number of other existing services.

The LiveCycle process engine doesn’t really care whether your service is implemented in Java or as an orchestration – you invoke and manipulate it in exactly the same way. You can either call a service externally (by defining an endpoint), or you can invoke one service from another, simply by dragging the service into your process. This is a very neat way to handle things – why should you care how something is implemented? You should be able to use it the same way regardless.

Note: An orchestration only ever defines a single operation, always called “invoke”, whereas a component can define multiple operations.

This screenshot shows two services, one implemented as a component, and the other as an orchestration.

services1

Summary

  • Process: Something that you build in Workbench, and looks like a flowchart. Can be user-oriented/long-lived, or straight-thru/short-lived.
  • Orchestration: An unofficial but commonly-used term for a straight-thru/short-lived process.
  • Workflow: An older term for a process.
  • Component: A jar file containing some code to do something “useful”, that you deploy to the LiveCycle server. A component contains services, which in turn contain operations. All the Adobe services are built as components, as are all the Avoka add-on services.
  • Service: Something that lives in the LiveCycle server that provides some sort of “service”. Can be implemented either as Java code (as a component) or as a process/orchestration.

March 1, 2009

What do the LiveCycle ‘Process Fields’ really do?

Background

In order to integrate a LiveCycle PDF form into LiveCycle Process Management, you need to embed some special fields into your form.

The LiveCycle documentation does indicate what these fields are for, but doesn’t really explain exactly how they are used and populated.

Process Field Documentation Link

This blog entry explains how it all really works.

The Fields

The Process fields inside Designer look like this:

process-fields-in-designer

The scenarios

When integrating with Process Management, the form will be presented inside Workspace. There are a number of scenarios:

  1. A user is submitting a form within Workspace (either to initiate a process, or as an item in their To Do list)
  2. The user has taken an initialization form offline
  3. The user has taken a form from their inbox offline.

Submitting a form within Workspace

In fact, for this basic scenario, you don’t need the process fields at all. All you really need is a Submit button.

However, if you do use the process fields, you get a few extra capabilities, including user choices, and the option to take the form offline.

The field AWS_SUBMIT is a regular button that contains some JavaScript code. This code, among other things, checks whether you’re running in an online (within Workspace) or offline mode. If you’re running online, it simply sends a “click” event to the real submit button, FSSUBMIT_ (which is a hidden http submit button).
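
In rough terms, the click script does something like this (an illustrative sketch only – the actual Adobe-generated script is considerably more involved, and the isOnline test here is just a stand-in for however it really detects a live Workspace session):

// Illustrative sketch only - not Adobe's actual AWS_SUBMIT script.
if (isOnline) {                          // stand-in for the real online test
    // Online: delegate to the hidden HTTP submit button.
    FSSUBMIT_.execEvent("click");
} else {
    // Offline: submit the data by email, to the address that LiveCycle
    // injected into AWS_MAILTO when the form was taken offline.
    event.target.submitForm({
        cURL: "mailto:" + AWS_MAILTO.rawValue,
        cSubmitAs: "XML"
    });
}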

Apart from the choice fields described below, none of the other fields are necessary – LiveCycle knows exactly who you are, and what you’re doing, because you’re logged into a live session in Workspace. It’s only when you take a form offline that the other fields are necessary.

Giving the user a choice

Sometimes you want to give your users a choice, which will affect the routes that are taken within the LiveCycle process. If you put a bit of text into the AWS_CHOICE text field, such as “Approve,Deny”, the JavaScript code will:

  • Display the Action dropdown (if there are no choices, it will be hidden)
  • Populate the Action dropdown with the values from the AWS_CHOICE field.

It will look something like this:

approvedeny
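
In script terms, the behaviour is roughly this (again an illustrative sketch, not the actual code; AWS_ACTION here stands for the Action dropdown field):

// Populate the Action dropdown from the comma-separated AWS_CHOICE value.
var choices = AWS_CHOICE.rawValue ? AWS_CHOICE.rawValue.split(",") : [];
AWS_ACTION.clearItems();
for (var i = 0; i < choices.length; i++) {
    AWS_ACTION.addItem(choices[i]);
}
// Hide the dropdown entirely when there are no choices.
AWS_ACTION.presence = (choices.length > 0) ? "visible" : "hidden";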

You can put your own comma-separated text into AWS_CHOICE, or you can let LiveCycle do a whole bunch of useful things for you automatically:

  • When you create a User Assign Task step in your workflow, check the checkbox that says “Populate Form with Routes”. LiveCycle will then look at all the route names coming out of your User step, and populate AWS_CHOICE with the route names.
  • When the user submits the form, LiveCycle will interrogate the value of the Action dropdown, and automatically route the process down the selected route.

Note: If you choose not to use this option, you can simply create rules in your routes that interrogate other data within your form to decide which route to follow.

Note: Certain route names are associated with special icons in Workspace. These are “Approve” and “Reject” (or maybe “Deny”, I forget.) You can configure these special icons and their associated routenames from Adminui.

Taking a form offline

Once you take a form offline, it is “disconnected” from Workspace. You can fill it in, and when you click the Submit button, it will be submitted via email, rather than over the web. The AWS_SUBMIT button has logic to determine whether you’re online or offline, and submits either via the web or via email.

If you’re submitting via email, the form needs to know what email address to submit it to. This email address is automatically populated by LiveCycle into the AWS_MAILTO field when you take the form offline. Like all the other fields, there’s no real magic – you can set the field to an email address manually or in code if you want to, but generally it’s easier and safer to let the LiveCycle server take care of this for you.
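
If you ever do need to set it yourself, it’s just an ordinary text field (the address below is obviously a placeholder):

// Normally LiveCycle populates this when the form is taken offline.
AWS_MAILTO.rawValue = "lc-submissions@example.com";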

Note: In order to allow offline submissions, you need to set up an email box, and configure LiveCycle to monitor this email box. That’s the subject of another blog sometime. Email info-at-avoka.com if you’re having trouble.

When you submit your form data via email, LiveCycle no longer knows anything about you, or why you’re submitting the form. It simply knows that an email arrived in an inbox containing some XML as an attachment. So…

When you take the form offline, LiveCycle injects some data into a few fields:

  • When you take an initialization form offline, the LiveCycle server will populate AWS_PROCESSTYPE with the name of the process that should be initiated when you submit the form. When you submit the form via email, LiveCycle will start the named process on your behalf, using the data from the email attachment.
  • When you take a form in your inbox offline, the LiveCycle server will populate AWS_TASKID with the id of the task that this relates to. When you submit the form via email, LiveCycle will complete that task on your behalf.
  • In both cases, LiveCycle will use the email address of the sender to determine who the “completer” was. Warning: make sure you send from an email address associated with your LiveCycle identity – if LiveCycle doesn’t recognize the sender, it will treat the email as spam, and silently discard it. (I haven’t verified this lately, but it used to work this way.)
  • In both cases, the LiveCycle server will populate the AWS_ASSIGNEDID with the internal LiveCycle ID of the user who was originally assigned this form. I have no idea whether this is used for anything, or why it might be useful to either you or LiveCycle.
  • Finally, once you submit via email, Reader will set the AWS_STATUS field from ” ” to “Submitted”. Once this field has the “Submitted” value, you will not be able to submit the form again. This simply prevents duplicate submissions being accidentally emailed.
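
The AWS_STATUS guard is easy to picture in script form (an illustrative sketch, not the actual Reader logic):

// Illustrative duplicate-submission guard - not the actual script.
if (AWS_STATUS.rawValue == "Submitted") {
    xfa.host.messageBox("This form has already been submitted.");
} else {
    // ... submit via email as usual ...
    AWS_STATUS.rawValue = "Submitted";   // block any further submissions
}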

Summary

While there’s quite a bit of logic associated with these fields, both within the form and with the LiveCycle server, you don’t really need to know how it all works. Just drop the fields into your form, and LiveCycle takes care of the rest. It is helpful, however, to know how this works, so that you can tweak the behaviour if necessary.

February 9, 2009

Using the Avoka XQuery Service Component

Filed under: Components, Designing Processes — Jason Hendry @ 8:46 am

LiveCycle has some useful service components for manipulating files, database rows and XML documents. And while XSLT is a powerful and useful language, it can sometimes be a little unwieldy for expressing simple logical problems.

I recently had to perform some simple logical programming on the server side to select one of three text fields on a form. The text fields were comments from users, and the last user to touch the form before exiting the flow had their comments inserted into an email back to the form originator. There are a multitude of ways of handling this problem on the form instead of on the server; however, this problem came to me as a system maintainer, and there were constraints I had to observe.

Avoka’s XQuery component makes it possible to construct strings and XML fragments using the XQuery language. XQuery, as you may know, has a syntactic structure called FLWOR:

For
Let
Where
Order by
Return

This is typified by the example given on W3Schools using the Books XML data sample:

for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title

*Note that there is no explicit let in this example – the for clause binds $x in doc(…)/bookstore/book directly

But what I needed was not an iterative solution (at least, not one that required a for loop), but a way to simply express the conditional logic to select the first non-empty XML text node.

To test the hierarchy of users for the last commenter, I could have used a number of Decision and Set components, but why take 3 steps when you can take 1?

Alternatively, I could have implemented similar logic using the LiveCycle Script component, and while it would have performed in a single workflow step, the script component can be unwieldy when manipulating form values. Do you really want to write …

patExecContext.getProcessDataStringValue("/process_data/status");

… just to test a couple of values?

Step 1: Workflow

There are two possible types of invocation on the XQuery Service Component: Single Document and Merge Document. The Merge Document invocation is intended to be more powerful, so in this example we’ll be considering the Single Document invocation of the XQuery component.

I inserted the XQuery component (Last Commenter) into the workflow, just before the email step that notifies the originator of the last user’s comments. This is where we’ll test the form data variable for the comment fields, and then copy them into another process variable to be inserted in the email.

Inserting the XQuery service object into the workflow

Step 2: Service Configuration Parameters

The XQuery object is fairly simple to set up, only requiring an XQuery statement and an Output location. In the example here, I’m using a variable, emailCreditComments, to capture the comment text, which I’ll re-use in the following email workflow step.

Configuring the Avoka XQuery Service Component

Step 3: XQuery Statement

First, I chose a location in my process variables to act as the root of my XQuery statement. This is a parameter of the statement, rather than of the XQuery configuration, and acts to reduce the size of the document the XQuery statement operates on. Since I’m looking for comment fields, I set the location deep into my XFA form variable structure:

/process_data/form/object/data/xdp/datasets/data/DocumentRoot

In this example, I’m using the root of the XFA data passed into the process as a form variable.

Next, as a matter of style, I set the form comments text into re-usable variables that I can test.

let $sp := //FormData/SeniorPartner/Comments/text()

Then, I constructed a logic statement that allowed for graceful regression without forcing the flow. In most functional programming languages, this style of cascading if() statement is to be avoided since it can become a real headache for maintainers. However, the nature of service components in Livecycle means that we can consider (and re-use!) pieces of functionality independently.

let $sp := //FormData/SeniorPartner/Comments/text()
let $hod := //FormData/HeadOfDepartment/Comments/text()
let $mgr := //FormData/Manager/Comments/text()
return if (string-length($sp) = 0)
	then if(string-length($hod) = 0)
		then $mgr
	else $hod
else $sp

As you can see, the statement itself is quite simple, cascading from each if() statement, testing for validity in the order of the user hierarchy (Senior Partner, Head of Department, Manager). Validity in this case is a test for string length: any test that fails indicates we have a non-empty comment text node. The first test to fail gets its comment text returned via the else clause and inserted into the configured output location.

XQuery Service Configuration Statement

Lastly, since the result of the XQuery statement will be inserted into an email, rather than re-used as an XML data set, we tell the XQuery service component to omit the XML processing instruction. Other options include the ability to ‘pretty print’ the output and to re-use namespaces from the source process definition.

Summary

This example demonstrates a simple, if somewhat unusual, use of the XQuery language implemented using the Avoka XQuery Service Component.

References

http://manly.avoka.com/confluence/display/Public/XQuery+DSC

http://www.w3schools.com/xquery/default.asp

November 8, 2008

Creating a Read-only copy of a form

Filed under: Designing Forms, Designing Processes, LiveCycle — htreisman @ 2:27 pm

Overview

Very often, as part of a process, you’ll want to create a read-only copy of a form, and store it somewhere, or email it to someone. There are various ways of achieving this, each with different trade-offs.

Making the fields read-only

It’s possible to make all the fields in the form read-only, using some JavaScript coding. This is fairly straightforward, and looks something like:

TextField1.access="readOnly"

If you have a few fields on your form, you can just do this for each field. If you have a lot of fields, you’ll probably want to write this as a recursive function, and do all fields in a single function call. This can be quite fiddly, because you need to correctly handle all the different types of form objects – a minimal sketch is shown below. (Avoka can help with this – contact info-at-avoka.com for assistance.)
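
Here’s a minimal sketch of such a function, walking the XFA DOM (it handles only plain fields – a production version needs to deal with exclusion groups, signature fields, and so on):

// Recursively set every field in the form to read-only.
function makeReadOnly(node) {
    if (node.className == "field") {
        node.access = "readOnly";
    }
    for (var i = 0; i < node.nodes.length; i++) {
        makeReadOnly(node.nodes.item(i));
    }
}
makeReadOnly(xfa.form);   // e.g. call this from a button's click event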

Pros

  • Inexpensive

Cons

  • Can be fiddly to program
  • While this makes the form appear to be read only, a clever end-user could still modify it. To modify it, open the form in Acrobat, use Forms/Manage Form Data…/Export Data… to export the data as XML, modify the XML, and then re-import the XML data using Forms/Manage Form Data…/Import Data… The form data will have changed, and nobody is any the wiser.

“Flattening” the form

Adobe have a server product called LiveCycle Output. One of the things that LC Output will do is “flatten” a form – that means that it will be converted from an editable type of PDF document to a more traditional “read-only” document. The fields are no longer editable fields; they are really now just boxes and text drawn onto the page.

Pros

  • This is much more secure than setting fields to read only. The XML data no longer exists, and cannot be modified.
  • No JavaScript programming required to achieve this – simply route your form to LiveCycle Output, and generate the flattened PDF.

Cons

  • It’s still possible to modify the PDF file to change what appears on the form. This is much harder, because the PDF is now in a binary format, but it’s possible.
  • Requires server-side integration
  • Requires purchasing LiveCycle Output (but LC Output does other useful things, so this may really be an overall benefit to your organization).

Digitally Signing the Document

Digitally signing the document doesn’t actually prevent the document from being modified, so needs to be used in conjunction with one of the above techniques. However, it does take a “snapshot” of the document, and if any changes are made, the signature will be invalidated. This enables you to guarantee that the version of the file you saved has not been modified since it was produced.

Pros

  • Guarantees that the PDF has not been modified since it was saved.

Cons

  • If signing is performed directly by the client, it requires Adobe Acrobat (or Reader Extensions)
  • If signing is performed on the server, then you must purchase a copy of LiveCycle Digital Signatures, and integrate this into your process.

Summary

There are several different techniques for making a form read-only. The one to use depends primarily on how willing you are to accept the risk of the document being changed (accidentally or maliciously) after it’s been produced, and how much you are willing to spend to achieve the level of protection you want.

October 20, 2008

Customize User Task Escalation After Assignment

Filed under: Designing Processes — kendyyus @ 10:47 am

Background

LiveCycle Process Management has a number of built-in facilities for managing reminders, escalations, and deadlines for User Tasks. But sometimes the built-in features may be a little limited.

For example, you may want one of the following features:

  • Send the first reminder directly to the assigned user, but for the second reminder, cc that user’s manager. For the third reminder, send it to the user, their manager, and the CEO.
  • Have different text in the first, second and third reminder emails, with each email getting increasingly abusive.
  • Send the first reminder after 5 days, the second reminder after 2 days, the third reminder after 1 day, and subsequent reminders every 10 minutes after that.
  • You may need to check the status of an external system or database prior to deadlining or escalating the task. For example, the project relating to the task may have been cancelled by the customer, and the task is no longer relevant – in this case, we’d like to simply terminate the entire process.

These types of escalation cannot be accommodated by the built-in features, and require some sneaky techniques.

Calling a Sub-process to Handle Custom Task Escalation

One of the possible ways to solve the above scenarios is to call a sub-process to handle the task after the task assignment. We need to kick off the sub-process before we actually allocate the task, because once we hit the User step, the process won’t continue until after the step has been completed.

The idea is:

  • call a sub-process just before the user step, asynchronously (in other words, without waiting for it to complete)
  • pass the parent process id to the sub-process (so that the sub-process can reference the parent process’s tasks)
  • set the first step of the sub-process to wait for a minute or two, to ensure that the parent process has time to allocate the task to the user
  • the sub-process will then have sufficient information to look up the active task that has been assigned

With the task information, the sub-process can be configured according to your business needs to do custom task reminder notifications, escalations, deadlines, and can even terminate the parent process if necessary.

This provides an extremely simple way of achieving almost unlimited functionality in the way that reminders, escalations and deadlines are handled. It also allows you to handle escalations for multiple User steps in a single place – changing the sub-process will change the way it’s handled for all User steps.

This approach involves some of the custom components Avoka has developed to perform those task operations outside of the User step.

Please refer to the link below for the full article on this approach to the described scenario, containing the example process and a sample LCA file for download:

http://avoka.dnsalias.com/confluence/display/Public/Customize+User+Task+Escalation+After+Assignment

August 19, 2008

Using the LiveCycle SQL (JDBC) component – Part I

Filed under: Designing Processes, LiveCycle — htreisman @ 1:51 pm

Overview

The LiveCycle JDBC/SQL component is probably the most useful component in the Foundation category. You can use it to query a database, which can help pre-populate forms, make routing decisions, and more. It’s a very powerful component, but it can get a bit complicated, particularly when you want to pass process variables to your query. In this blog, we investigate the JDBC component in detail.

Datasource name

The datasource name is a way of getting the application server to do all the hard work of connecting to a database. Basically, you configure a pool of data connections to a particular database by configuring the application server. The application server will connect to the database on your behalf, using the connection information you provide, and make the connections available for your use. This is done differently in different application servers. Check out the LiveCycle installation guide or application server documentation for instructions on how to create connection pools.

All you need to provide is the name of the datasource. In most cases, you will be using an “in memory” connection pool, which means that you will need a “java:/” prefix to your pool name.

For example, to connect to the LiveCycle database itself (which we will use in these examples), use:

java:/IDP_DS

Simple query

A simple query, hard-coding the prefix “D”, looks like this:

select oid,id from tb_sc_service_category
where id like 'D%'

This returns the service categories whose id begins with “D”.

Passing a process variable as a parameter – “Embedded Xpath”

If you want to pass a process variable as a parameter to your query, you start by creating a variable, such as “begins_with”, of type String. You can use the “Xpath…” button to replace the “D” with the process variable, inserting it into the query. Your SQL query will look like this:

select oid,id from tb_sc_service_category
where id like '{$ /process_data/@begins_with $}%'

Note that:

  • the single quotes are still there, indicating to the SQL engine that we’re using a string
  • the percent symbol is still there, indicating a SQL wild-card
  • the variable name has been “wrapped” in {$ … $}. This indicates to the process engine that, when the process runs, it should replace the {$…$} expression with the actual value of that variable.

So if you run this process, and set the variable begins_with = “D”, you should get exactly the same results.

However, if you click the Test button, you won’t get any data. This is because when you click the Test button, the process engine is not running, so the {$…$} wrapping is not replaced by anything, and you’re trying to match data that doesn’t exist.

This gets even worse if you use non-string variables, such as:

select oid,id from tb_sc_service_category
where oid ={$ /process_data/@some_id $}

In this case, you will actually get a SQL error:

Exception: You have an error in your SQL syntax;
check the manual that corresponds to your MySQL server
version for the right syntax to use near '' at line 2.

This is because the snippet of text “{$ /process_data/@some_id $}” doesn’t look anything like the integer that the SQL engine is expecting. When the process actually runs, this text will be substituted with the actual value of the some_id variable, such as “1”, and this would work fine – it’s only during testing that you have a problem.

Things would also not work if you were trying to find a category called “Bob’s stuff”. After variable replacement, you would get a SQL statement that looks like this:

select oid,id from tb_sc_service_category
where id like 'Bob's stuff%'

You now have three quote characters in your SQL, and the SQL processor will get confused.

The way to fix all these problems is to use a Parameterized Query.

Passing a process variable as a parameter – “Parameterized”

With a parameterized query, rather than embedding the parameter directly into the query, we use a “?” to indicate the parameter. We can then provide both a test value and a runtime-variable to be used when the query is executed. This is shown below:

Note that:

  • We check the box that says “Use Parameterized Query”
  • We added a single row to the table because we have a single ? in our query. You must have the same number of rows as ?’s.
  • We can specify a test value to be used for testing the query, and a variable name for when the query actually runs in the process.
  • The wild-card character % must be embedded in the test data and the variable value.
  • We no longer need the quote characters in our SQL query, because we’re explicitly setting the type of the parameter to a string. We also don’t include quotes in our test data or variable value. This eliminates the problems related to having quotes in your parameter value.

Avoka’s Added Value

Avoka provides a value-added component that makes SQL queries easier and more powerful.

This component is shown below:

Some of the additional features in Avoka’s component include:

  • Browsing and insertion of tables and column names into your query (shown above)
  • Additional output data formats, including XML, comma-separated values, a CSV file, process variables of type “List”, and insertion of data directly into XFAForm variables.

You can see more examples of usage here:

http://avoka.dnsalias.com/confluence/display/Public/Retrieving+data+from+a+database+and+populating+a+form+with+the+data

http://avoka.dnsalias.com/confluence/display/Public/Retrieving+data+from+a+database+and+saving+as+concatenated+strings+or+lists+in+process+variables

http://avoka.dnsalias.com/confluence/display/Public/Retrieving+data+from+a+database+and+saving+as+CSV+file

You can download a trial version here:

Download trial

Summary

Using parameterized queries simplifies and enhances your ability to define and test your SQL queries. We generally recommend that you use a parameterized query rather than embedding the variable directly into the string.

June 19, 2008

Best practice for pre-populating a Drop Down with data?

Filed under: Designing Forms, Designing Processes, LiveCycle — htreisman @ 1:53 pm

One of our internal consultants recently asked: “What is best practice for populating a Drop Down list in a PDF form with data?”

There are several ways to do it, and there is no “best” or “recommended” way. The suggestions below are roughly in order of complexity.

  1. You can just hard code the values in the form template.
  2. You can populate a hidden field with comma-separated data, and use JavaScript to split the items and add them to the drop down – see the sketch below this list. (If you use a comma separator, make sure that your data doesn’t have any commas in it.)
  3. You can bind a Drop Down directly to an element in your form’s XML, using a feature called Dynamic Properties (in Designer). Then, instead of populating a hidden field, you just populate elements directly into your XML. No JavaScript required, but a bit more work in your XML Schema.
  4. You can have your form call a web service to get the data it needs, and use JavaScript to process the results and populate the Drop Down. You can build the web service any way you like, including as a LiveCycle orchestration. This is often the only way to approach things if the contents of the drop down depend on other data in your form.

For #2 and #3, you can modify the LiveCycle Default Render process to inject the data into your XML.
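
For approach #2, the form-side script might look something like this (a sketch only – HiddenData and DropDownList1 are placeholder field names):

// Split the comma-separated payload and rebuild the dropdown's item list.
var items = (HiddenData.rawValue || "").split(",");
DropDownList1.clearItems();
for (var i = 0; i < items.length; i++) {
    DropDownList1.addItem(items[i]);
}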

The screenshot below shows how to turn on “Dynamic Properties” in Designer.

Dynamic Properties in LiveCycle Designer

If anyone reading this blog needs more information on exactly how to do this, email info-at-avoka.com

April 14, 2008

Limitations of List Variables, Some Workflow Internals Exposed, and a Solution

Filed under: Designing Processes, LiveCycle — htreisman @ 9:30 am

On a recent project, we created a workflow to send an email to each of around 3000 users – the users’ email addresses and names came from a database table.

Our first attempt was to use the SQL component to read the users’ name and email address into list variables (separate lists for name and email address). Once we had the data in a list, it’s a reasonably easy workflow task to iterate over the list:

  • find out the size of the list, using xpath expression
    • /process_data/@count = xxx
  • define an integer counter variable
  • define a simple loop in the workflow, indexing each element in the lists like this:
    • /process_data/email_list[/process_data/@index] and
    • /process_data/name_list[/process_data/@index]
  • Loop with the following condition:
    • /process_data/@index < /process_data/@count

This all worked fine during simple testing, but once we started testing with large numbers of users, we suddenly experienced a massive performance degradation – around 2-3 seconds for every single step in the workflow. We suspected the very large list variables were the culprit, but interestingly, even steps in the workflow that did not access the list variables were running slowly. While this was happening, both CPU and disk were maxing out. What was going on?

A little knowledge of what is really going on under the covers when a workflow runs explains what is happening. When a long-lived workflow runs, every single step of the workflow effectively runs as a separate, asynchronous step, generally within its own transaction. This means that before each step runs, the workflow engine must establish the “process context”, loading it from the database, and after each step, the process context must be persisted back to the database. The process context basically contains all information about the process, including the values of all variables. The process context is loaded/persisted, regardless of whether a particular variable is actually used by a particular step, because the process engine doesn’t actually know whether a particular variable is going to be used or not.

So what is going on is that we have two very large Java List variables that are being loaded into memory from the database and deserialized, and then serialized and stored back into the database, on each and every step – regardless of whether the list is actually needed in that step or not. For two large lists, this accounted for the 2-3 seconds of activity on each step.

How to solve the problem?

Our first approach was to simply change the workflow to a short-lived workflow. Short-lived workflows don’t persist their data; they run more like a regular Java method, simply creating all variables in memory at the beginning of the process, and garbage collecting them at the end – all variables are in memory all the time, and the process context is made available to each of the steps, with no intermediate serialization or storage.

A short-lived workflow did indeed solve the problem, but introduced several new problems, related to error recovery:

  1. If the process failed for any reason (and it did – every now and then, the SendEmail step would fail due to a very overworked Mail Server being too busy to respond), the process threw an exception, but all the data was discarded, and we had no record of how many emails had been sent. It was impossible to restart the stalled process, because short-lived processes don’t stall, they simply throw an exception and they’re done.
  2. We could simply restart the failed process, but then some users would get two emails – not a good look.
  3. The process took several hours to complete, and this exceeded the default transaction timeout on the workflow. LiveCycle assumes that short-lived workflows are transactional by default. This can be fixed by modifying the workflow’s transactional properties, but it felt “wrong”.

Finally, we solved the problem by completely removing the list variables, and adding a status column to the database table. The logic in our long-lived workflow was something like this:

  1. Using the database component, select a row using a query something like:
    “select id, name, email from usertable where status is null”.
    Retrieve only the first matching row into simple variables called “rowid”, “username” and “email”.
  2. Do stuff. In this case, this meant creating a document, and sending an email.
  3. Using the database component, update the previous row using a statement something like:
    “update usertable set status =’sent’ where id = ?”
    and set the parameter to /process_data/@rowid
  4. Loop. The next time around the loop, the query will pick up the second row, because the first row no longer has a null status. Repeat until no row is returned.

This approach solves the entire problem very elegantly:

  • The status column ensures that you always know where you’re up to. It’s persisted in the database, so it is resilient to any type of failure.
  • There are no large collections slowing down the running of the process – performance was almost an order of magnitude faster.
  • If the workflow stalls at any step, you can just restart it, and things will continue where they left off
  • Even if the workflow stalls in a weird way, where restarting it doesn’t actually cause it to continue (we have seen this occasionally), you can simply start another instance of the workflow, and it will pick up the next row in the database. The only danger is that one user may get the same email twice, although since it’s usually the mail server that stalls, not the database query, this is unlikely.

Notes:

  • We’re using a long-lived process, not because we have a user-step in the workflow, but because we want the auditability and reliability that long-lived processes deliver.
  • We believe (although we haven’t verified) that Document variables have some “smarts” that Lists and Maps don’t have. In particular, they are cached in memory between steps. It is therefore probable that a Document variable (perhaps containing the XML data from the query) would not create the loading/storing problems that we experienced with Lists. However, the XML data would still need to be parsed into an XML document each time a row was needed, and for a large data-set, this could introduce delays. Overall, we believe that the above solution is superior.
