Federal Information Management

Addressing critical issues faced by the U.S. Federal Government in managing its information resources: information architecture, information assurance and security, sharing, search, and others.

Sunday, July 23, 2006

Microsoft, Government, and OpenDocument Format

On July 5, Microsoft announced the creation of a plug-in (officially titled "Open XML Translator") enabling MS Office users to save documents in the XML-based OpenDocument Format (ODF). An article published in GCN on July 17 discusses the potential impact of this move on how government and industry organizations archive and exchange information.

Below are a few interesting quotes from that article that clearly highlight why governments are requiring ODF support in office applications:

-- "Government records should be free of any proprietary software dependencies. We cannot defer to commercial vendors the prerogative for determining the formatting and structure of records that are inherently governmental in nature" /Owen Ambur, chairman of the Federal CIO Council's XML Community of Practice/

-- A document saved in a standards-based format ensures that, even if the original program is no longer available, another program can be written from the specs to view the document.

-- "Governments should not require citizens to use a particular commercial product to view the data they are interested in seeing, and ODF can help increase the number of possible applications available" /Simon Phipps, chief open source officer for Sun Microsystems/
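To make the second point concrete, here is a minimal sketch in Python of "another program written from the specs". It relies on two facts about ODF: a document is a ZIP package, and its body text lives in a file named content.xml, with paragraphs stored as text:p elements. The function names (extract_paragraphs, make_sample_package) and the sample text are made up for illustration, and the sample package is deliberately stripped down rather than a complete, valid .odt file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace URIs defined by the OpenDocument specification.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def extract_paragraphs(package_bytes):
    """Return the text of every <text:p> element found in content.xml."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read("content.xml"))
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


def make_sample_package(paragraphs):
    """Build a stripped-down, illustrative package (NOT a complete .odt)."""
    body = "".join(f"<text:p>{escape(p)}</text:p>" for p in paragraphs)
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" '
        f'xmlns:text="{TEXT_NS}">'
        f"<office:body><office:text>{body}</office:text></office:body>"
        "</office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", content)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_sample_package(["Government records", "should stay readable."])
    for paragraph in extract_paragraphs(sample):
        print(paragraph)
```

Nothing here depends on any office suite: the standard library's ZIP and XML modules are enough, which is exactly the kind of independence from a particular vendor that the quotes argue for.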

Technorati tags: Government, XML

Monday, July 17, 2006

One query - one answer - Part 2

A great example of a search tool returning exact answers has been provided to me by Sylvain Falardeau, the President and CEO of Delphes Technologies International - a Montreal-based developer of advanced, linguistic-based information retrieval and extraction solutions.

As Mr. Falardeau writes, the Delphes technology
"can return an exact answer but only if that is the meaning of the query. For example, a query "John’s building" and a sentence in a document such as "John's building a house" [would be] two different configurations although it is the same suite of characters, i.e., the exact match."

The Government of Quebec has been extensively using Delphes' Intelligent Knowledge Online Self-Service. One of the key components of that application is "Intelligent Advisor" - among other things, it returns the so-called "Strategic Links" pertaining to a search query.

As shown in the screenshot below, when a user enters “address change”, the corresponding “strategic links” include "How to change address" and "A checklist to help you with your move". In addition, the search output contains a summary of relevant results and a "Personal Project Manager".



All this appears to be an excellent example of natural-language-processing search technology helping to build what can truly be considered a citizen-centric government.

Many thanks to Sylvain Falardeau for the information. Also, special thanks to Patrick Cormier for the reference.

Technorati tags: Government

Friday, July 07, 2006

One query - one answer

On June 22, Teragram Corporation (a Cambridge, MA-based "leading provider of multilingual natural language processing technologies") announced the release of Direct Answers for the Enterprise - an enterprise technology designed to extract simple answers to user queries from multiple types of documents.

The Direct Answers technology is based on a perception that most users are not looking for documents - instead, they want precise answers to their questions. The output of standard search engines is a large list of documents that a user needs to sort through to find exactly what he or she is looking for.

Teragram shortcuts this process by grasping the essence of a query and extracting the most relevant information from documents. As a result, the user receives (1) a direct answer to the query and (2) a list of supporting documents for gathering additional information.

How does this technology work? Teragram distinguishes two types of queries:

1. "Information-seeking" queries, i.e., those looking for a specific date, number, city name, time, etc.;

2. Queries looking for links to specific documents or websites (standard search engines mostly handle this type of query).

Using Teragram's advanced linguistics technologies, Direct Answers determines whether a query is indeed "information-seeking". If it is, then Direct Answers processes it to provide an answer.
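The two-way split described above can be sketched in a few lines of Python. This is only an illustration of the routing idea: the cue-word lists, the function names (classify_query, respond), and the caller-supplied lookup functions are all hypothetical, and a crude keyword heuristic stands in for Teragram's linguistic analysis, which the announcement does not describe in detail.

```python
import re

# Hypothetical cue words; a real system would rely on linguistic analysis.
QUESTION_WORDS = {"who", "what", "when", "where", "which", "how"}
NAVIGATION_WORDS = {"homepage", "website", "site", "page", "link", "login"}


def classify_query(query):
    """Label a query 'information-seeking' or 'document-seeking'."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    if not tokens:
        return "document-seeking"
    # Words such as "website" suggest the user wants a link, not a fact.
    if NAVIGATION_WORDS & set(tokens):
        return "document-seeking"
    if tokens[0] in QUESTION_WORDS or query.strip().endswith("?"):
        return "information-seeking"
    return "document-seeking"


def respond(query, find_answer, find_documents):
    """Return a direct answer (when appropriate) plus supporting documents.

    find_answer and find_documents are caller-supplied functions (assumed
    here); they stand in for the extraction and retrieval components.
    """
    documents = find_documents(query)
    if classify_query(query) == "information-seeking":
        return {"answer": find_answer(query), "documents": documents}
    return {"answer": None, "documents": documents}


if __name__ == "__main__":
    result = respond(
        "When is the filing deadline?",
        find_answer=lambda q: "April 15",          # made-up sample answer
        find_documents=lambda q: ["deadlines.pdf"],  # made-up sample document
    )
    print(result)
```

Either way the user still gets the list of documents; the direct answer is simply added on top when the query looks like a question about a specific fact.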

Here are the links to the company's press release on Direct Answers for the Enterprise and a white paper on this technology (a PDF document).

Interestingly, in 2001 Teragram submitted a research-project proposal to the Defense Advanced Research Projects Agency. The proposed project called for the development of "intelligent agent software capable of detecting (intentionally) misleading information from potential threat groups in open sources".

Technorati tags: Federal Government

Monday, July 03, 2006

Happy Independence Day!

All the best to everyone!


Stan Vornovitsky
Federal Information Management