The U.S. intelligence community defines DOMEX as "the processing, translation, analysis, and dissemination of collected hard-copy documents and electronic media, which are under the U.S. government's physical control and are not publicly available." That definition goes on to exclude "the handling of documents and media during the collection, initial review, and inventory process." DOMEX is not about being a digital librarian; it's about being a digital detective.
...
This article introduces electronic document and media exploitation from an academic perspective. It presents a model for performing this kind of exploitation and discusses some of the relevant academic research. Properly done, DOMEX goes far beyond recovering documents from hard drives and storing them in searchable archives. Understanding this engineering problem gives insight that will be useful for designing any system that works with large amounts of unstructured, heterogeneous data.