The Cancer Text Information Extraction System (caTIES) focuses on two important challenges in bioinformatics: information extraction from free text and access to tissue.
Specifically, caTIES has three primary goals:
- Extract coded information from free-text Surgical Pathology Reports (SPRs), using controlled terminologies to populate caBIG-compliant data structures.
- Provide researchers with the ability to query, browse and create orders for annotated tissue data and physical material across a network of federated sources. With caTIES, the SPR acts as a locator for tissue resources.
- Pioneer research for distributed text information extraction within the context of caBIG.
- Semantically codes text with concepts from the NCI Metathesaurus.
- Classifies coded concepts as Diagnosis, Procedure or Organ type.
- Uses a built-in negation engine to identify explicit negation in text.
- Supports concept-based searches. Besides returning results that contain the search terms, concept-based searches also return results that contain synonyms of the search terms.
- Supports searching for negated concepts. For example, you can search for 'Negated adenocarcinoma', which returns only reports that contain an explicit negation of adenocarcinoma, i.e. a statement such as "no evidence of adenocarcinoma" or "adenocarcinoma not found" must be present in the report.
- Supports temporal queries. For example, you can query for patients who had no evidence of thyrotroph adenoma less than 3 years prior to being diagnosed with Brain Hemangioma.
- Has a unique graphical query modelling tool to easily formulate complex queries.
- Uses ontology information to improve search results. For example, searching for 'vasculitis' will return reports with all subtypes of vasculitis, even if the word 'vasculitis' is not present in the report.
- Limits searches to certain sections of the report.
- New: supports tagging reports and then searching for reports with specific tags.
- caTIES is open source.
- Can import data from HL7 files.
- De-identifies text to comply with HIPAA regulations using third-party de-identifier programs.
- Has a rich thick client that is accessible online through Java Web Start.
- Uses secure encryption protocols to protect all communications, supports strong passwords, and logs all search activity for auditing purposes.
- Supports Tissue Banking and Honest Broker operations.
- Supports multiple caTIES nodes across organizations, with fully integrated support for collaborative research.
- At the core of caTIES lies the coding engine, which has its origins in the SPIN project. Please see the History page for more information.
- MMTx Tools are also used extensively in the coding engine.
- A variety of open source software has been used in caTIES. For a list see here
- David Carell, from Group Health Co-op., Seattle, has been an avid user of caTIES and has helped with testing and with finding many bugs. He has also contributed to the caTIES website content. Thanks David!
- Sean O'Hollaren, John Blackwood and Mark Goodman from OHSU have created excellent installation documentation for caTIES, which is available on the Installation page.
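
To make the negated-concept search from the feature list more concrete, here is a minimal Python sketch of explicit negation detection. It is not the caTIES negation engine: the trigger phrases, the three-word window, and the names `PRE_NEGATION`, `POST_NEGATION`, `is_negated` and `search_negated` are all assumptions made for illustration, and it matches literal terms rather than coded NCI Metathesaurus concepts.

```python
import re
from typing import Dict, List

# Hypothetical trigger phrases (an assumption for this sketch);
# a real negation engine would use a much larger lexicon.
PRE_NEGATION = ("no evidence of", "negative for", "without")
POST_NEGATION = ("not found", "not seen", "not identified")


def is_negated(sentence: str, concept: str) -> bool:
    """Return True if `concept` is explicitly negated within `sentence`."""
    text = sentence.lower()
    term = re.escape(concept.lower())
    # Trigger before the concept, allowing up to three words in between.
    for trigger in PRE_NEGATION:
        pattern = rf"\b{re.escape(trigger)}\s+(?:\w+\s+){{0,3}}?{term}\b"
        if re.search(pattern, text):
            return True
    # Trigger after the concept, allowing up to three words in between.
    for trigger in POST_NEGATION:
        pattern = rf"\b{term}\s+(?:\w+\s+){{0,3}}?{re.escape(trigger)}\b"
        if re.search(pattern, text):
            return True
    return False


def search_negated(reports: Dict[str, str], concept: str) -> List[str]:
    """Return IDs of reports with at least one explicit negation of `concept`."""
    hits = []
    for report_id, text in reports.items():
        # Naive sentence split so a trigger cannot reach across sentences.
        sentences = re.split(r"[.;\n]+", text)
        if any(is_negated(s, concept) for s in sentences):
            hits.append(report_id)
    return hits


if __name__ == "__main__":
    # Made-up report snippets, not real patient data.
    reports = {
        "r1": "Sections show benign tissue. No evidence of adenocarcinoma.",
        "r2": "Invasive adenocarcinoma is present.",
        "r3": "Adenocarcinoma not found in the specimen.",
    }
    print(search_negated(reports, "adenocarcinoma"))
```

A report that merely mentions the concept, like `r2`, is not returned; only reports with an explicit negation statement qualify, which mirrors the behaviour described for 'Negated adenocarcinoma' above.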