About the project

Specific Aims:

  • Dictionary Creation: Formalize the unstructured process of creating a dataset dictionary using LLMs.
  • Data Harmonization: Develop a semiautomatic process for harmonizing dataset values.
  • Quality Evaluation: Define metrics to evaluate the quality of harmonization and linking processes.

Impact: Our findings will significantly enhance UH's capacity through a mutually beneficial collaboration with CTSA. Our research provides a generalizable framework for the tailored implementation of training, to include:
  • Harmonization Accuracy: Evaluate the conceptual similarity between original dataset values and their harmonized counterparts.
  • Mapping Completeness: Measure the proportion of values that are successfully mapped.
  • Human Intervention Rate: Track the frequency and extent of human involvement during the harmonization process.
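
As a rough illustration of how the three metrics above might be computed, the following is a minimal sketch and not the project's actual implementation. The `Mapping` record, its field names, and the sample values are hypothetical. The `lexical_similarity` function is only a string-overlap stand-in: measuring conceptual similarity would likely need a semantic model (for example, embeddings), which is not shown here. The intervention rate below captures only how often a human edited a mapping, not the extent of the edit.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, List, Optional


@dataclass
class Mapping:
    """Hypothetical record of one harmonization decision (illustrative only)."""
    original: str               # value as it appears in the source dataset
    harmonized: Optional[str]   # target value, or None if no mapping was found
    human_edited: bool = False  # True if a person corrected or supplied the mapping


def lexical_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] based on string overlap, not meaning."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def mapping_completeness(mappings: List[Mapping]) -> float:
    """Proportion of values that received a harmonized counterpart."""
    if not mappings:
        return 0.0
    mapped = sum(1 for m in mappings if m.harmonized is not None)
    return mapped / len(mappings)


def human_intervention_rate(mappings: List[Mapping]) -> float:
    """Proportion of values whose mapping involved a human edit."""
    if not mappings:
        return 0.0
    edited = sum(1 for m in mappings if m.human_edited)
    return edited / len(mappings)


def harmonization_accuracy(
    mappings: List[Mapping],
    similarity: Callable[[str, str], float] = lexical_similarity,
) -> float:
    """Mean similarity between original and harmonized values (mapped values only)."""
    scores = [
        similarity(m.original, m.harmonized)
        for m in mappings
        if m.harmonized is not None
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical sample data for demonstration.
mappings = [
    Mapping("M", "Male"),
    Mapping("female", "Female"),
    Mapping("F", "Female", human_edited=True),
    Mapping("unk", None),
]

print("Mapping completeness:", mapping_completeness(mappings))        # 3 of 4 mapped
print("Human intervention rate:", human_intervention_rate(mappings))  # 1 of 4 edited
print("Harmonization accuracy:", harmonization_accuracy(mappings))
```

Passing a different `similarity` function to `harmonization_accuracy` is one way to swap the lexical stand-in for a semantic measure without changing the other two metrics.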