ColdFusion 9 PDF Enhancements

Another of the less visible, but still cool, features in ColdFusion 9 is the set of enhancements we’ve made to <cfpdf>. We’ve added the ability to:

  • Add headers/footers to existing PDFs
  • Create PDF packages
  • Selectively optimize PDF size
  • Extract text from PDFs
  • Extract images from PDFs
  • Create high-quality thumbnails

Of these features, my personal favorites are optimization and extraction.


PDFs can do a lot. Consequently, PDF size can swell due to the presence of extra information, metadata, and embedded files. The optimize feature lets you remove specific types of extras to selectively reduce the size of your PDF, while retaining the features that you need. When you use action="optimize", the following options are open to you:

  • noattachments
  • nobookmarks
  • nocomments
  • nofonts
  • nojavascript
  • nolinks
  • nometadata
  • nothumbnails

Code looks like this:
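A minimal sketch of an optimize call, assuming hypothetical file names (report.pdf and report-small.pdf) next to the current template; the "no..." attributes are the options listed above:

```cfml
<!--- Hypothetical paths; point these at your own files. --->
<cfset sourcePdf = expandPath("./report.pdf")>
<cfset smallPdf  = expandPath("./report-small.pdf")>

<!--- Strip every optional extra. Remove an attribute to keep that feature. --->
<cfpdf action="optimize"
       source="#sourcePdf#"
       destination="#smallPdf#"
       overwrite="true"
       noattachments="true"
       nobookmarks="true"
       nocomments="true"
       nofonts="true"
       nojavascript="true"
       nolinks="true"
       nometadata="true"
       nothumbnails="true">
```

Leaving out, say, nobookmarks and nolinks would keep the document navigable while still dropping the rest.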

As you can see, the code is pretty straightforward. I’ve seen reductions of 65-75% on PDF size when using all options.


Yes, you can get at the text or embedded images of a PDF with ColdFusion 9.

Here’s the code to get at the text of a PDF:
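A minimal sketch, assuming a hypothetical report.pdf next to the current template and a result variable I have named pdfText:

```cfml
<!--- Hypothetical source file; pdfText is an illustrative variable name. --->
<cfpdf action="extracttext"
       source="#expandPath('./report.pdf')#"
       type="xml"
       name="pdfText">

<!--- Inspect what came back before relying on its structure. --->
<cfdump var="#pdfText#">
```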

That code will extract the text of a PDF to XML. The structure divides the content into pages, so you can quickly get at content on particular pages, etc.

You have a few options that I’m not showing, though. You can get the content as a plain string. You can restrict extraction to specific pages. You can even get XY coordinates for all of the words in the document.
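A rough sketch of those options, assuming the same hypothetical report.pdf; as I understand the tag, type switches between string and XML output, pages limits the range, and addquads adds the position data, but check the cfpdf reference for the exact behavior in your build:

```cfml
<cfset sourcePdf = expandPath("./report.pdf")> <!--- hypothetical file --->

<!--- Plain string instead of XML, first two pages only. --->
<cfpdf action="extracttext"
       source="#sourcePdf#"
       pages="1-2"
       type="string"
       name="firstPagesText">

<!--- XML output with position information for the extracted text. --->
<cfpdf action="extracttext"
       source="#sourcePdf#"
       type="xml"
       addquads="true"
       name="textWithPositions">
```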

Getting images is similar; you plug in a PDF, and send the images to a directory:
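A minimal sketch, assuming a hypothetical report.pdf and an output directory I have called extracted-images; imageprefix and format are optional:

```cfml
<!--- Hypothetical paths; the directory must exist before extraction. --->
<cfset sourcePdf = expandPath("./report.pdf")>
<cfset imageDir  = expandPath("./extracted-images")>

<cfif not directoryExists(imageDir)>
    <cfdirectory action="create" directory="#imageDir#">
</cfif>

<!--- Write every embedded image to imageDir as PNG files named report*. --->
<cfpdf action="extractimage"
       source="#sourcePdf#"
       destination="#imageDir#"
       imageprefix="report"
       format="png"
       overwrite="true">
```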

You can also set a prefix for the image file names and pick the image format.

As you can see, the engineers added some cool functionality here.

Caching Enhancements in ColdFusion 9

One of the less-mentioned aspects of ColdFusion 9 is the enhanced caching that was added by including Ehcache under the covers.

This opens a number of possibilities including fragment caching:

The time was #DateFormat(Now(), "mmmm d, yyyy")#
#TimeFormat(Now(), "hh:mm:ss tt")#

The time is #DateFormat(Now(), "mmmm d, yyyy")#
#TimeFormat(Now(), "hh:mm:ss tt")#

In this code, the first <cfoutput> will always show the time from the first time it was called. The second <cfoutput> will show the time from the actual time the code is called. The 1 for timespan means that it will cache that value for a day.
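A minimal sketch of that fragment cache, assuming <cfcache> is used with an end tag and its default action; only timespan="1" and the two outputs come from the description above:

```cfml
<!--- Cached fragment: rendered once, then served from cache for one day. --->
<cfcache timespan="1">
    <cfoutput>
        The time was #DateFormat(Now(), "mmmm d, yyyy")#
        #TimeFormat(Now(), "hh:mm:ss tt")#
    </cfoutput>
</cfcache>

<!--- Not cached: evaluated on every request. --->
<cfoutput>
    The time is #DateFormat(Now(), "mmmm d, yyyy")#
    #TimeFormat(Now(), "hh:mm:ss tt")#
</cfoutput>
```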

But, we can also do dependent cached items, where the value of one of the contained variables has an impact on whether or not the cached item needs to be refreshed.

The minutes is #minutesVariable#

The time was #DateFormat(Now(), "mmmm d, yyyy")#
#TimeFormat(Now(), "hh:mm:ss tt")#

In the above code, assume that it is called at 12:03. For the next minute, minutesVariable is going to equal 3. For each call where minutesVariable equals 3, the cache is used. However, when the time rolls over to 12:04, minutesVariable will equal 4. This triggers a refresh of the cache, and the new content is cached for the next minute.
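A rough sketch of a dependent fragment, assuming the dependsOn attribute of <cfcache> takes a list of variable names and that minutesVariable is set from the current minute; both are my reading, so treat this as illustrative:

```cfml
<!--- Assumption: the dependency is the current minute of the hour. --->
<cfset minutesVariable = Minute(Now())>

<!--- The fragment is re-rendered whenever minutesVariable changes. --->
<cfcache timespan="1" dependsOn="minutesVariable">
    <cfoutput>
        The minutes is #minutesVariable#

        The time was #DateFormat(Now(), "mmmm d, yyyy")#
        #TimeFormat(Now(), "hh:mm:ss tt")#
    </cfoutput>
</cfcache>
```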

In addition to fragments, I can also cache objects (or, for ease of understanding, variables). See this code:

FROM Artists

I try to retrieve the query that I’ve given an id of “testQuery” from the cache. If it’s not there, I run the query and cache it.
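A minimal sketch of that get-or-populate pattern using cacheGet() and cachePut(); the datasource name and the SELECT column list are assumptions for illustration, and only the “testQuery” id and the Artists table come from the text:

```cfml
<!--- Try the cache first; cacheGet returns null when the id is missing. --->
<cfset testQuery = cacheGet("testQuery")>

<cfif isNull(testQuery)>
    <!--- Cache miss: run the query (hypothetical datasource and columns). --->
    <cfquery name="testQuery" datasource="cfartgallery">
        SELECT *
        FROM Artists
    </cfquery>

    <!--- Store the result under the same id for later requests. --->
    <cfset cachePut("testQuery", testQuery)>
</cfif>

<cfdump var="#testQuery#">
```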

So those are three fairly straightforward examples of the new caching, but you can do much more with it, including invalidating objects, analyzing cache usage, and more.




ColdFusion 9 and ColdFusion Builder Public Beta

That’s right, as of 12:01 am EDT on Monday, July 13th, ColdFusion 9 (Centaur) and ColdFusion Builder (Bolt) are available for public beta testing.

Please check them out at Adobe Labs.

Also, I’m doing a presentation for the Online ColdFusion Meetup today. Since we’ve released the beta bits, I can now talk about any feature publicly. So come with questions, and expect to see as much as I can fit into two hours.

On ColdFusion ORM and DBAs

Two things come up when I talk about the upcoming ORM features in Centaur:

  • DBAs are going to hate it.
  • It’s going to put DBAs out of work, which will make them hate it.

Let me just say: the first may be inevitable, but the second is quite the opposite.

To start with, there are two ways of working with ColdFusion ORM, your application, and your database:

  • Start with the database and build your objects from it.
  • Start with the objects and have your database built based on them.

When you start from the database and go up, if you have a bad database, there is nothing Hibernate (the underlying ORM technology in Centaur) can do to make it any better. If it is poorly indexed, or improperly normalized, the resulting objects will perform poorly, or be unnecessarily complex.

On the other hand, if you have CF go ahead and create the tables for you, you will only get the basic indices and keys needed to generate relationships: primary keys and foreign keys. You can specify indices and unique constraints, but only if you know where to put them.
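As a rough sketch of what "knowing where to put them" can look like, here is a hypothetical persistent component (Customer.cfc; the entity, table, and index names are invented) that asks for a unique constraint and an index through property attributes. The attribute names reflect my understanding of the ORM property settings, so verify them against your build:

```cfml
// Customer.cfc: hypothetical entity used only for illustration
component persistent="true" table="Customers" {

    // Primary key: created for you because the relationship mapping needs it.
    property name="id" fieldtype="id" generator="native";

    // Unique constraint: only added because we asked for it.
    property name="email" ormtype="string" unique="true";

    // Secondary index: likewise opt-in, and this is where DBA judgment matters.
    property name="lastName" ormtype="string" index="idxCustomerLastName";
}
```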

In both cases you will need the skills of a DBA (either your own, or a dedicated DBA’s) to help you make decisions.

What’s different then? Much like other uses of ColdFusion, it takes the knucklehead rote stuff and makes it easy.

  • No building table creation scripts.
  • No writing rote CRUD scripts.
  • DBA time can now be spent doing cool complex SQL and analysis where they really pack on the value.

How do you convince your DBAs of this? I have a few arguments:

  • ColdFusion ORM uses parameterized and prepared SQL much like cfqueryparam.
  • ColdFusion ORM can be configured to output generated SQL.
  • ColdFusion ORM is based on Hibernate, which was built keeping most database best practices in mind.
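For the second argument, a minimal Application.cfc sketch, assuming a hypothetical application name and datasource; as I understand the ORM settings, logsql turns on output of the generated SQL and dbcreate controls whether ColdFusion builds the tables, but treat the exact keys as something to confirm in the documentation:

```cfml
// Application.cfc: hypothetical application and datasource names
component {
    this.name       = "ormDemo";
    this.datasource = "myDatasource";

    // Turn on ORM for this application.
    this.ormenabled = true;

    this.ormsettings = {
        // "none" leaves an existing schema alone (start from the database);
        // "update" lets ColdFusion create or alter tables from the objects.
        dbcreate = "none",

        // Log the SQL that Hibernate generates so a DBA can review it.
        logsql = true
    };
}
```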

Is this going to convince every DBA? Probably not. But hopefully enough have an open mind to at least give it a shot.