"Latent semantic indexing (LSI) is a promising information retrieval technique for searching and organizing large data collections. LSI finds patterns in unstructured data (documents without descriptors such as keywords or special semantic tagging), and can return relevant results for a query even when there is no keyword match.

"Data collections don't have to be in English, or even in any human language at all. We have had good success in searching protein databases with the technique, as well as chemical mass spectra."

I'm going to have to look this over in some detail later.

Projects: Latent Semantic Indexing
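
To get a feel for the "no keyword match" claim, here is a minimal sketch of LSI as it is usually formulated: a truncated SVD of a term-document matrix, with the query folded into the same low-dimensional space. The quoted project page does not describe its implementation, so everything here is an assumption for illustration: the toy corpus, the helper name `lsi_scores`, the raw word counts (no weighting), and the choice of `k=2` latent dimensions.

```python
import numpy as np

# Toy corpus (made up for illustration). Note that document 1 never
# contains the word "car".
docs = [
    "car engine wheel",
    "automobile engine wheel",
    "protein sequence fold",
    "protein sequence enzyme",
    "protein enzyme fold",
]


def lsi_scores(docs, query, k=2):
    """Cosine similarity of the query to each document in a k-dim latent space."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-document count matrix: rows are terms, columns are documents.
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            A[index[w], j] += 1.0

    # Truncated SVD: keep only the k largest singular triplets.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, s_k = U[:, :k], s[:k]
    docs_k = Vt[:k, :].T  # one row per document in the latent space

    # Fold the query into the same space: q_k = S_k^{-1} U_k^T q
    q = np.zeros(len(vocab))
    for w in query.split():
        if w in index:
            q[index[w]] += 1.0
    q_k = (U_k.T @ q) / s_k

    # Cosine similarity, guarded against division by zero.
    norms = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q_k)
    return docs_k @ q_k / np.maximum(norms, 1e-12)


scores = lsi_scores(docs, "car")
for d, sc in zip(docs, scores):
    print(f"{sc:+.2f}  {d}")
```

In this toy setup, "automobile engine wheel" should score about as high as "car engine wheel" for the query "car", because the two documents share the terms "engine" and "wheel" and so end up in the same latent direction, while the protein documents score near zero. Nothing in the sketch depends on the tokens being English words, which fits the quoted remark about protein databases and mass spectra; a real collection would need a weighting scheme and a much larger `k`.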