Pulling Content out of Word with ColdFusion 9

I had a 1 + 1 = 2 moment the other day. I was fooling around with ColdFusion’s ability to turn Word docs into PDFs. At first glance it’s pretty simple and straightforward:
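A minimal sketch of that conversion, assuming the srcfile attribute of cfdocument is pointed at a Word file. The file paths are placeholders, not anything from a real project:

```cfml
<!--- Hypothetical paths; point these at your own files --->
<cfset wordFile = ExpandPath("./report.doc")>
<cfset pdfFile = ExpandPath("./report.pdf")>

<!--- Convert the Word document to a PDF on disk --->
<cfdocument format="pdf" srcfile="#wordFile#" filename="#pdfFile#" overwrite="true"></cfdocument>
```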

Word to PDF is nice to have, but as features go, it’s a pretty small bullet point. Don’t get me wrong, you get fidelity to the original, including fonts, layouts, and images. But it’s still just converting a Word document to a PDF.

That is, until you remember that ColdFusion 9 can now pull content out of PDFs. So now you can do this:
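Roughly, the two steps chain together like this. It is a sketch with placeholder paths and a made-up variable name (wordContent), not necessarily the exact code I ran:

```cfml
<!--- Hypothetical paths; point these at your own files --->
<cfset wordFile = ExpandPath("./report.doc")>
<cfset pdfFile = ExpandPath("./report.pdf")>

<!--- Step 1: Word to PDF --->
<cfdocument format="pdf" srcfile="#wordFile#" filename="#pdfFile#" overwrite="true"></cfdocument>

<!--- Step 2: pull the text back out of the generated PDF --->
<cfpdf action="extracttext" source="#pdfFile#" name="wordContent">

<!--- Inspect what came back --->
<cfdump var="#wordContent#">
```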

This yields the content of the original Word document. Now that’s cool.

Upcoming ColdFusion Events

There are a number of community events coming up that are either dedicated to ColdFusion or have ColdFusion tracks. What impresses me about this list is how many of them are either new this year or expanded from previous events. It’s great to see how committed this community is, not just to the technology, but to lending a hand to other members of the community.

 

Date not publicly set yet:

If I missed your event, I apologize. I thought I had all of them. Drop me a line and I’ll add it to the list.

Gartner Report on ColdFusion 8

As Ben and Claude have noted, the Gartner Report on ColdFusion 8 is now publicly available.

It’s an analyst’s report and not a ColdFusion marketing piece. It’s honest, it’s independent, and it’s not there to fluff up ColdFusion. Therefore it has both pros and cons in it. However, I think it’s much more positive than negative.

It also makes me wonder, if Gartner thinks this after seeing ColdFusion 8, what are they going to say after we release ColdFusion 9?

Adobe Max 2009 Sessions

Just wanted to drop you all a quick line about the sessions I’m doing at Adobe Max 2009 this year. They’re both around new features in ColdFusion 9:

Leveraging Exposed Services in ColdFusion Centaur
October 5 at 11:30AM
This session is about the new exposed services or CFaaS we have included in ColdFusion 9. I’ll be talking about how to leverage them in Flex and other languages, and even how to enhance previous versions of ColdFusion with them.

ColdFusion with Microsoft Office, SharePoint, and Exchange
October 5 at 05:00PM
I’ll be talking about how nicely ColdFusion plays with Microsoft technologies. While Exchange integration has been around since ColdFusion 8, with 9 we’ve added the ability to interact with SharePoint and Office documents.

Go and register for them now!

CFUnited 2009

I’ve been remiss in blogging about CFUnited, mostly because I’ve been busy reviewing topics, writing sessions, and brainstorming about CFUnited.

It’s looking to be pretty awesome from the content I’ve been seeing. Today we finally revealed the Adobe sessions:

http://cfunited.com/blog/index.cfm/2009/7/20/Adobe-ColdFusion-9-and-ColdFusion-Builder

What really excites me about the list is the participation of some Adobe people I haven’t seen participating in the ColdFusion community in a while:

And members of the team from India:

And of course the usual suspects:

It’s looking like it’s going to be a great show. I can’t wait.

 

ColdFusion 9 PDF Enhancements

Another of the less visible but still cool features in ColdFusion 9 is the set of enhancements we’ve made to <cfpdf>. We’ve added the ability to:

  • Add headers/footers to existing PDFs
  • Create PDF packages
  • Selectively optimize PDF size
  • Extract text from PDFs
  • Extract images from PDFs
  • Create high quality thumbnails

Of these features, my personal favorites are optimization and extraction.

Optimization

PDFs can do a lot. Consequently, a PDF’s size can swell due to extra information, metadata, and embedded files. The optimize feature lets you remove specific types of extras to selectively reduce the size of your PDF while retaining the features you need. When you use action="optimize", the following options are open to you:

  • noattachments
  • nobookmarks
  • nocomments
  • nofonts
  • nojavascript
  • nolinks
  • nometadata
  • nothumbnails

Code looks like this:
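Something along these lines, with every option from the list above switched on. The source and destination paths are placeholders:

```cfml
<!--- Strip every optional extra from the PDF and write a smaller copy.
      File names are hypothetical. --->
<cfpdf action="optimize"
    source="#ExpandPath('./large.pdf')#"
    destination="#ExpandPath('./small.pdf')#"
    noattachments="true"
    nobookmarks="true"
    nocomments="true"
    nofonts="true"
    nojavascript="true"
    nolinks="true"
    nometadata="true"
    nothumbnails="true"
    overwrite="true">
```

Leave out any attribute for a feature you want to keep; for example, drop nolinks if the links in the document still need to work.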

As you can see, the code is pretty straightforward. I’ve seen reductions of 65-75% on PDF size when using all options.

Extraction

Yes, you can get at the text or embedded images of a PDF with ColdFusion 9.

Here’s the code to get at the text of a PDF:
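A minimal sketch, assuming a placeholder file name and a made-up result variable (pdfXml):

```cfml
<!--- Extract the text of the PDF as XML into a variable --->
<cfpdf action="extracttext"
    source="#ExpandPath('./sample.pdf')#"
    type="xml"
    name="pdfXml">

<!--- Inspect the extracted structure --->
<cfdump var="#pdfXml#">
```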

That code will extract the text of a PDF to XML. The structure divides the content into pages, so you can quickly get at content on particular pages, etc.

You have a few options that I’m not showing though. You can get the content as just a string. You can selectively get page numbers. You can even get XY coordinates for all of the words in the document.
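As a rough sketch of the first two options, assuming the type and pages attributes of the extracttext action; the file name, page range, and variable name are placeholders:

```cfml
<!--- Get pages 1 and 2 as a plain string instead of XML --->
<cfpdf action="extracttext"
    source="#ExpandPath('./sample.pdf')#"
    type="string"
    pages="1-2"
    name="pdfText">

<cfoutput>#HTMLEditFormat(pdfText)#</cfoutput>
```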

Getting images is similar; you plug in a PDF, and send the images to a directory:
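Here is a sketch of the basic form, with a placeholder PDF and output directory:

```cfml
<!--- Hypothetical output directory; create it if it does not exist yet --->
<cfset imageDir = ExpandPath("./extracted-images")>
<cfif NOT DirectoryExists(imageDir)>
    <cfdirectory action="create" directory="#imageDir#">
</cfif>

<!--- Write every embedded image in the PDF to that directory --->
<cfpdf action="extractimage"
    source="#ExpandPath('./sample.pdf')#"
    destination="#imageDir#"
    overwrite="true">
```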

You also have options to prefix the image file names and to pick the image format.

As you can see, the engineers added some cool functionality here.

Caching Enhancements in ColdFusion 9

One of the less mentioned aspects of ColdFusion 9 is the enhanced caching that was added by including ehcache under the covers.

This opens a number of possibilities including fragment caching:

<cfcache timespan="1">
<cfoutput>The time was #DateFormat(Now(), "mmmm d, yyyy")#
#TimeFormat(Now(), "hh:mm:ss tt")#</cfoutput>
</cfcache>

<cfoutput>The time is #DateFormat(Now(), "mmmm d, yyyy")#
#TimeFormat(Now(), "hh:mm:ss tt")#</cfoutput>

In this code, the first <cfoutput> will always show the time from the first time it was called. The second <cfoutput> will show the time from the actual time the code is called. The 1 for timespan means that it will cache that value for a day.

But, we can also do dependent cached items, where the value of one of the contained variables has an impact on whether or not the cached item needs to be refreshed.

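<!--- Assumed setup: minutesVariable holds the current minute --->
<cfset minutesVariable = Minute(Now())>
<cfcache dependsOn="minutesVariable">
<cfoutput>The minutes is #minutesVariable#

The time was #DateFormat(Now(), "mmmm d, yyyy")#
#TimeFormat(Now(), "hh:mm:ss tt")#</cfoutput>
</cfcache>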

In the above code, assume that it is called at 12:03. For the next minute, minutesVariable is going to equal 3. For each call where minutesVariable equals 3, the cache is used. However, when the time rolls over to 12:04, minutesVariable will equal 4. This triggers a refresh of the cache, with the new content being cached for the next minute.

In addition to fragments, I can also cache objects (but for ease of understanding, think of them as variables). See this code:

SELECT *
FROM Artists

I try to retrieve the query that I’ve given an id of “testQuery” from the cache; if it’s not there, I run the query and cache it.
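That get-or-populate pattern can be sketched like this, using CacheGet, IsNull, and CachePut. The datasource name is a placeholder, so swap in one that has an Artists table:

```cfml
<!--- Try the cache first; the variable is null when the id is not cached --->
<cfset testQuery = CacheGet("testQuery")>

<cfif IsNull(testQuery)>
    <!--- Cache miss: run the query (datasource name is hypothetical) --->
    <cfquery name="testQuery" datasource="myDatasource">
        SELECT *
        FROM Artists
    </cfquery>

    <!--- Store the result under the same id for later requests --->
    <cfset CachePut("testQuery", testQuery)>
</cfif>

<cfdump var="#testQuery#">
```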

So those are three fairly straightforward examples of the new caching, but you can do much more with it, including invalidating objects, analyzing cache usage, and more.

TR ORM Generator Code for ColdFusion Builder

I’ve been demoing an extension I made for generating ORM code: CFCs, views, controllers, and services. It stopped working with the public beta of ColdFusion, so I’ve updated it to work with the public beta. I’ve tweaked it in a few other places, but it’s by no means completely tested. If you like it and have ideas, I’m open to allowing other authors in. (We’ll just have to change the name of the tool.)

Please note, this is not an official Adobe product. It’s just code that I’ve been using that I’m willing to share. There is an ORM generator included with ColdFusion Builder; this is just another alternative.

Terry Ryan ORM Jumpstart at RIAForge

Word Wrap in ColdFusion Builder

A number of people have commented on the lack of word or line wrapping in ColdFusion Builder. I whined to Adam Lehman about it, and he set me straight.

You can find it in the preferences:

Preferences -> HTML -> Editors

Under Editors there is a toggle button for General/Advanced. Click “Advanced.”

The option reads “Enable Word Wrap.” It will require a restart of ColdFusion Builder.

 

We’re talking about ways to make this more obvious.