PicnicHealth uses the OMOP Common Data Model to map clinical concepts to standard ontologies. All clinical tables in our data can be joined to the concept table to access standard codes and vocabularies, which can then be used to identify clinical elements of interest in your analysis.

Lists of conditions found on problem lists, assessment/visit diagnosis lists, and discharge diagnosis lists are captured in the visit_condition_occurrence table. Depending on the date of your dataset, conditions will be coded in one of two sets of ontologies:

  • datasets dated on or after July 1, 2024: SNOMED-CT
  • datasets dated prior to July 1, 2024: ICD-9 and ICD-10
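If you are unsure which case applies to your dataset, one way to check is to count condition rows by vocabulary. The sketch below uses small made-up stand-ins for the concept and visit_condition_occurrence tables (real tables have more columns and rows); only the column names are taken from this vignette.

```r
library(dplyr)

# Hypothetical stand-ins for the dataset's tables; values are illustrative only.
concept <- tibble(
  concept_id = c("c1", "c2", "c3"),
  domain_concept_code = c("73211009", "84229001", "E11.9"),
  vocabulary = c(
    "Systematic Nomenclature of Medicine - Clinical Terms (IHTSDO)",
    "Systematic Nomenclature of Medicine - Clinical Terms (IHTSDO)",
    "International Classification of Diseases, Tenth Revision, Clinical Modification (NCHS)"
  )
)
visit_condition_occurrence <- tibble(
  visit_condition_occurrence_id = c("v1", "v2", "v3"),
  condition_concept_id = c("c1", "c2", "c1")
)

# Count how many condition rows are coded in each vocabulary.
vocab_counts <- visit_condition_occurrence %>%
  inner_join(concept, by = c("condition_concept_id" = "concept_id")) %>%
  count(vocabulary, sort = TRUE)

print(vocab_counts)
```

A single SNOMED row in the result suggests a SNOMED-coded dataset; ICD-9 or ICD-10 rows suggest an ICD-coded one.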

Sometimes, you may want to convert between ICD and SNOMED ontologies in order to accomplish a particular task. For example:

  • You have a SNOMED-coded table, but need to use a list of ICD codes prepared by your research team to find comorbidities to include or exclude patients from an analysis.
  • You have an ICD-coded table, but want to use the SNOMED hierarchy to find descendants of a concept of interest, such as all the conditions subsumed by “diabetes mellitus”.

The vignette below shows you how to translate between the two ontologies using the vocabulary tables provided in your PicnicHealth dataset.

Important note

Conditions which are specifically relevant to the primary clinical population in your dataset are found in the cohort_ tables. Depending on your dataset’s data model, these may include the primary diagnosis, select comorbidities, and/or symptoms. In all datasets, these are coded in SNOMED.

Translating SNOMED to ICD

Working from an existing ICD code list

If you have already identified ICD codes to use in your analysis, you can use the tables provided in your dataset to identify the corresponding SNOMED codes.

[1] Load the PicnicHealth package and the data provided.


data_set = load_data_set(data_path)
list2env(data_set, envir = .GlobalEnv)

[2] Join the visit_condition_occurrence table to the concept table to access the SNOMED code for each condition.

joined_condition_concepts = visit_condition_occurrence %>%
  select(visit_condition_occurrence_id, person_id,
         condition_concept_id, condition_concept_name, section) %>%
  left_join(concept %>% 
              select(concept_id, domain_concept_code, vocabulary, vocabulary_version),
            by = c("condition_concept_id" = "concept_id"))

[3] Join the resulting table to the snomed_icd_mappings table. The domain_concept_code field maps onto the snomed_concept_code field. Note: This table is only present in datasets dated on or after July 1, 2024. If it is not present in your dataset, it may be provided to you as a .csv file in the PicnicHealth study portal; reach out to your project contact to receive access.

mapped_condition_concepts = joined_condition_concepts %>%
  left_join(snomed_icd_mappings,
            by = c("domain_concept_code" = "snomed_concept_code"))

[4] Filter the resulting table by the ICD codes of interest. In this example, we’re looking for instances of fatigue.

fatigue_icd_codes = c('R53.1', 'R53.81', 'R53.82', 'R53.83', '780.7', '780.79')

fatigue_conditions = mapped_condition_concepts %>%
  filter(icd_concept_code %in% fatigue_icd_codes)

head(fatigue_conditions) %>%
  select(visit_condition_occurrence_id,
         snomed_concept_code = domain_concept_code, snomed_concept_name, 
         icd_concept_code, icd_concept_name)
visit_condition_occurrence_id | snomed_concept_code | snomed_concept_name | icd_concept_code | icd_concept_name
6ad80e13-ed21-4512-90dc-627829f92c85 | 84229001 | Fatigue | R53.83 | Other fatigue
ec00d0a6-11a7-4835-9891-39e0067668e5 | 13791008 | Asthenia | R53.1 | Weakness
c4eb52bf-8fe1-4d4c-8f2c-d28c2dc590f7 | 52702003 | Chronic fatigue syndrome | R53.82 | Chronic fatigue, unspecified
298299ef-8ae5-4034-a393-678816d75f28 | 367391008 | Malaise | R53.81 | Other malaise
c803a3e4-1a5b-49dd-bd5b-e35d4c67cb08 | 26544005 | Muscle weakness | M62.81 | Muscle weakness (generalized)
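If the mapping table holds more than one ICD code for a given SNOMED code (an assumption worth checking in your copy of snomed_icd_mappings), the join in step 3 repeats a condition record once per matching ICD code. The sketch below, built on made-up toy tables that borrow only the column names used in this vignette, shows one way to collapse the filtered result back to one row per condition record and one row per patient.

```r
library(dplyr)

# Hypothetical toy tables; the mapping rows are illustrative, not authoritative.
joined_condition_concepts <- tibble(
  visit_condition_occurrence_id = c("v1", "v2", "v3"),
  person_id = c("p1", "p2", "p2"),
  domain_concept_code = c("84229001", "13791008", "38341003")
)
snomed_icd_mappings <- tibble(
  snomed_concept_code = c("84229001", "84229001", "13791008", "38341003"),
  icd_concept_code = c("R53.83", "780.79", "R53.1", "I10")
)
fatigue_icd_codes <- c("R53.1", "R53.83", "780.79")

# Record v1 appears twice after the join because its SNOMED code
# has two ICD codes in the toy mapping table.
fatigue_rows <- joined_condition_concepts %>%
  left_join(snomed_icd_mappings,
            by = c("domain_concept_code" = "snomed_concept_code")) %>%
  filter(icd_concept_code %in% fatigue_icd_codes)

# Collapse to one row per condition record, then one row per patient.
fatigue_records <- fatigue_rows %>%
  distinct(visit_condition_occurrence_id, person_id)
fatigue_patients <- fatigue_records %>%
  distinct(person_id)
```

Counting rows of fatigue_rows directly would overstate the number of fatigue records in this toy example (3 rows for 2 records), which is why the distinct() step comes before any tally.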

Developing novel code lists in SNOMED

If you are developing a novel code list, a tool such as OHDSI’s Athena can help you identify concepts of interest.

SNOMED provides ancestor-descendant relationships at many levels of granularity, enabling researchers to identify conditions of interest without needing to curate exhaustive code lists by hand.

For example, an analysis interested in diabetes mellitus (DM) can start with SNOMED code 73211009 and find all descendants of this concept, in order to identify patients who have DM as a comorbidity.

To do so in PicnicHealth data:

[1] Filter the concept table to rows where domain_concept_code = 73211009.

dm_concept = concept %>%
  filter(domain_concept_code == "73211009" &
           vocabulary == "Systematic Nomenclature of Medicine - Clinical Terms (IHTSDO)")

[2] Inner join this filtered table to the concept_ancestor table (fields to join: concept_id ←→ ancestor_concept_id).

dm_descendants = dm_concept %>%
  inner_join(concept_ancestor,
             by = c("concept_id" = "ancestor_concept_id"))

[3] Join this filtered table to visit_condition_occurrence (fields to join: descendant_concept_id ←→ condition_concept_id) to find all instances where a descendant concept of DM was present in a patient’s medical record.

dm_conditions = dm_descendants %>%
  inner_join(visit_condition_occurrence,
             by = c("descendant_concept_id" = "condition_concept_id")) %>%
  select(visit_condition_occurrence_id, condition_concept_name, 
         section, visit_id)
visit_condition_occurrence_id | condition_concept_name | section | visit_id
3fbe116e-5b00-4781-994f-ab3dec09e2eb | Mononeuropathy due to type 2 diabetes mellitus | Problem list - Reported | 7a705a6d-c374-4e7a-a73c-914e8eb934fc
fb6129b7-424c-4aaf-865a-10834094feaf | Type 2 diabetes mellitus | Evaluation + Plan note | 741bfaa0-9608-415c-8f3f-80ae5726ed73
f6165e73-0034-49d2-9bcb-e5c81a9a5797 | Mild nonproliferative retinopathy due to diabetes mellitus | Evaluation + Plan note | cd1efad8-c2eb-4f73-9899-c4ca2eff4d3d
6e296041-0a81-4c60-a98d-142ae304bce2 | Type 2 diabetes mellitus without complication | Evaluation + Plan note | f3d26a4c-0d37-437c-b478-a79fbf7ed892
7f41830e-7d76-4aec-a942-ff7883a2a0ca | Mild nonproliferative retinopathy due to type 2 diabetes mellitus | Problem list - Reported | c1ad964a-b3cd-43ed-94ec-421594a6381b
dc4e510b-eaf2-442b-b241-836d75fbe15b | Type 1 diabetes mellitus without complication | Evaluation + Plan note | 8a5d4259-879e-40f4-83fb-7fdbc0ef1f10
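To turn the three steps into a patient-level list, the same joins can end in distinct(person_id). The sketch below runs on miniature, made-up tables whose IDs are placeholders. It assumes your concept_ancestor table follows the standard OMOP convention of including a row in which each concept is its own ancestor, so that records coded with the starting concept itself are also returned; confirm this in your dataset.

```r
library(dplyr)

# Hypothetical miniature tables; concept_id values are placeholders.
concept <- tibble(
  concept_id = c("dm", "t2dm", "htn"),
  domain_concept_code = c("73211009", "44054006", "38341003")
)
concept_ancestor <- tibble(
  ancestor_concept_id   = c("dm", "dm",   "t2dm", "htn"),
  descendant_concept_id = c("dm", "t2dm", "t2dm", "htn")  # includes self-rows (assumption)
)
visit_condition_occurrence <- tibble(
  visit_condition_occurrence_id = c("v1", "v2", "v3"),
  person_id = c("p1", "p2", "p3"),
  condition_concept_id = c("t2dm", "htn", "dm")
)

# Start from the DM concept, expand to its descendants, match condition records,
# and keep one entry per patient.
dm_person_ids <- concept %>%
  filter(domain_concept_code == "73211009") %>%
  inner_join(concept_ancestor, by = c("concept_id" = "ancestor_concept_id")) %>%
  inner_join(visit_condition_occurrence,
             by = c("descendant_concept_id" = "condition_concept_id")) %>%
  distinct(person_id) %>%
  pull(person_id)

print(dm_person_ids)
```

In this toy data, p1 is found through a descendant concept and p3 through the self-row; p2 has only an unrelated condition and is excluded.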

Translating ICD to SNOMED

In datasets dated prior to July 1, 2024, conditions in the visit_condition_occurrence table are coded in ICD-9 or ICD-10, the codes commonly documented in the structured portions of U.S. patients’ medical records for billing purposes. This vignette walks through how to use the concept and concept_relationship tables to map from ICD to SNOMED codes.

Important notes:

  • It is difficult to develop a direct translation from ICD to SNOMED. Follow the caveats in the “Maps to value” discussion below, and manually review output to ensure that only conditions of interest are included in your final code list.
  • Only the full OMOP concept relationship table, which is provided separately from clinical datasets, contains the relationships needed to identify synonyms. To convert ICD to SNOMED codes as described below, you must download the full OMOP concept relationship table and load it into R.

[1] Start with a list of ICD-9 or ICD-10 codes of interest. For this example, we will work with the ICD-10 codes for Type 2 diabetes mellitus: “E11” and all corresponding sub-codes (e.g., “E11.1”, “E11.2”). (Note: while this demo uses only ICD-10, we advise including ICD-9 codes in your analysis as well.)

[2] Load the PicnicHealth package and the data provided.


data_set = load_data_set(data_path)
list2env(data_set, envir = .GlobalEnv)

[3] Use the concept table to obtain the concept_id values corresponding to ICD-10 codes of interest. Here, we search the pattern “^E11” to match all codes that begin with this root.

DM_icd_concept_ids = concept %>% 
  filter(vocabulary == paste("International Classification of Diseases,",
                             "Tenth Revision, Clinical Modification (NCHS)") &
         grepl("^E11", domain_concept_code)) %>%
  pull(concept_id)
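Before running a pattern against the concept table, it can help to try it on a handful of codes. The sketch below uses a made-up vector of codes to show what “^E11” matches, how a shorter root over-matches, and one possible ICD-9 pattern. The ICD-9 pattern relies on the ICD-9-CM convention that the fifth digit of a 250.xx code distinguishes diabetes type; treat it as an illustration to verify against your own code list.

```r
# Hypothetical sample of codes used only to exercise the patterns.
codes <- c("E11", "E11.9", "E11.641", "E10.9", "E13.9",
           "250.00", "250.01", "250.02")

# The pattern from step 3: every code that begins with the E11 root.
icd10_t2dm <- codes[grepl("^E11", codes)]

# A shorter root also pulls in E10 (type 1) and E13 (other specified) codes.
too_broad <- codes[grepl("^E1", codes)]

# ICD-9 sketch: 250.x0 and 250.x2 only, since 250.x1 and 250.x3 denote type 1.
icd9_t2dm <- codes[grepl("^250\\.[0-9][02]$", codes)]

print(icd10_t2dm)
print(too_broad)
print(icd9_t2dm)
```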

[4] Identify all records in the concept_relationship table that map the concept_id values identified in step 3 to other concepts. This can be done using the concept_id_1 column in the concept_relationship table. When converting from ICD concepts to SNOMED concepts, we are only interested in relationship values of “Maps to” and “Maps to value”, so we will restrict to those. We will also rename the concept_id_1 and concept_id_2 columns to icd_concept_id and snomed_concept_id, respectively, to keep things organized.

As noted above, only the full OMOP concept relationship table that is provided independently from clinical data sets contains the “Maps to” and “Maps to value” relationships that are necessary for identifying synonyms.

DM_concept_relationships = concept_relationship %>% 
  filter(concept_id_1 %in% DM_icd_concept_ids &
         relationship %in% c("Maps to", "Maps to value")) %>% 
  select(icd_concept_id = concept_id_1, relationship, 
         snomed_concept_id = concept_id_2)
icd_concept_id | relationship | snomed_concept_id
0acfbd0c-4942-5822-b99d-ec1dd711a07a | Maps to | 0877f62f-346f-5684-a3a4-773691622b3f
45982c28-ab08-56ba-a267-a2b0c034b48f | Maps to | 939834b2-65a0-52d9-bc9f-a934a4e0aa62
5a96ed46-83f1-5ff2-9632-75d59467fd85 | Maps to | de305955-7485-5ca3-a33f-280af6a99cf1
2f0122d0-f493-59d4-8d8c-459f7d3f3b50 | Maps to | 262239b0-5396-59b9-8acc-46514feafd37
5a05995a-d531-59bf-9e2b-cc95bf6b1f1c | Maps to | 8dc97a07-7307-5d2e-bb75-586c1b26f989
ed0cb105-234a-5e38-92b0-dc5ab8d87c2a | Maps to | 172c52eb-63c1-5583-9073-190f61ee9dc2

[5] Now that we have our ICD and SNOMED concept_id values, we can join with the concept table twice (once for icd_concept_id and once for snomed_concept_id) to retrieve the corresponding codes.

DM_icd_snomed_map = DM_concept_relationships %>% 
  left_join(concept %>% 
              select(concept_id, icd_code = domain_concept_code),
            by = c("icd_concept_id" = "concept_id")) %>% 
  left_join(concept %>% 
              select(concept_id, snomed_code = domain_concept_code),
            by = c("snomed_concept_id" = "concept_id")) %>% 
  select(icd_concept_id, icd_code, relationship, 
         snomed_concept_id, snomed_code)
icd_concept_id | icd_code | relationship | snomed_concept_id | snomed_code
1395c6fb-c40a-5695-895b-adfa46b41c9a | E11.641 | Maps to | 747bee0c-e815-55b0-91e2-8607a6ad9094 | 719216001
281c58bb-4580-5cd2-ae09-b9d639221023 | E11.52 | Maps to | 4ecbde5a-97e7-53c3-9ba3-57e08977f083 | 421631007
79cbd97e-5f90-53c0-9db3-106eefb3b4d6 | E11.4 | Maps to | 643f4b78-1fdd-5ef4-84c6-27063428f19b | 421326000
4b00f6cd-ee82-59b8-9697-5cfcad429ee5 | E11.40 | Maps to | 643f4b78-1fdd-5ef4-84c6-27063428f19b | 421326000
c519515a-a88c-5f2a-b6ff-da26cafa2663 | E11.49 | Maps to | 643f4b78-1fdd-5ef4-84c6-27063428f19b | 421326000
7acb913f-f445-50ca-9bc5-2d45674034c8 | E11.3523 | Maps to | f0a4502c-3835-59be-bf07-923b3f9771f6 | 232010004

[6] Now that we have our map from our original ICD-10 codes to their corresponding SNOMED codes, we can extract the unique SNOMED codes as a vector and take a look at the first few results.

DM_snomed_codes = unique(DM_icd_snomed_map$snomed_code)
head(DM_snomed_codes, 3)
#> [1] "312912001" "422014003" "421631007"

A note regarding the “Maps to value” relationship

Because ICD and SNOMED are distinct vocabularies, not every concept has a perfect one-to-one counterpart in the other vocabulary. Translating from ICD to SNOMED therefore requires some caution.

The “Maps to” relationship corresponds to either a full equivalence between concepts or, if an equivalent mapping does not exist, a mapping from a more specific ICD code to a more general SNOMED code (e.g., the ICD-10 code for refractory anemia with multi-lineage dysplasia maps to the SNOMED code for refractory anemia). Furthermore, a single ICD code can map to one or more SNOMED codes.
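Because one ICD code can map to several SNOMED codes, and some codes on your list may have no mapping at all, it is worth tallying the mappings before using them. The sketch below works on a made-up table shaped like DM_icd_snomed_map; the rows and the code “E11.XX” are hypothetical and exist only to exercise the checks.

```r
library(dplyr)

# Hypothetical mapping table; rows are illustrative, not real OMOP mappings.
icd_snomed_map <- tibble(
  icd_code = c("E11.9", "E11.40", "E11.40", "E11.49"),
  relationship = "Maps to",
  snomed_code = c("44054006", "421326000", "44054006", "421326000")
)
icd_codes_of_interest <- c("E11.9", "E11.40", "E11.49", "E11.XX")

# Number of distinct SNOMED targets for each ICD code.
mappings_per_icd <- icd_snomed_map %>%
  filter(relationship == "Maps to") %>%
  group_by(icd_code) %>%
  summarise(n_snomed = n_distinct(snomed_code), .groups = "drop")

# ICD codes that fan out to more than one SNOMED code.
multi_mapped <- mappings_per_icd %>%
  filter(n_snomed > 1)

# ICD codes on the list that never appear in the mapping.
unmapped <- setdiff(icd_codes_of_interest, icd_snomed_map$icd_code)

print(multi_mapped)
print(unmapped)
```

Codes that appear in multi_mapped or unmapped are good candidates for the manual review recommended earlier in this vignette.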

The “Maps to value” relationship is a bit different. It is always used in conjunction with a single “Maps to” relationship, and is necessary to preserve the full meaning of certain types of ICD codes. One example of where “Maps to value” relationships arise in the context of ICD-to-SNOMED mapping is for ICD codes corresponding to abnormal levels of tests.

For example, let’s take a look at ICD-10 code R77.0, which represents abnormality of albumin, and obtain its corresponding SNOMED codes.

ab_alb_concept_id = concept %>%
  filter(domain_concept_code == "R77.0") %>%
  select(icd_concept_id = concept_id, 
         icd_code = domain_concept_code, 
         icd_concept_name = concept_name) %>%
  left_join(concept_relationship,
            by = c("icd_concept_id" = "concept_id_1")) %>%
  filter(relationship %in% c("Maps to", "Maps to value")) %>%
  left_join(concept %>% select(concept_id,
                               snomed_code = domain_concept_code,
                               snomed_concept_name = concept_name),
            by = c("concept_id_2" = "concept_id")) %>%
  select(icd_concept_id, icd_code, icd_concept_name, relationship,
         snomed_concept_id = concept_id_2, snomed_code, snomed_concept_name)
icd_concept_id | icd_code | icd_concept_name | relationship | snomed_concept_id | snomed_code | snomed_concept_name
fe594980-4f0d-56d9-aec6-e266519c6834 | R77.0 | Abnormality of albumin | Maps to value | 771f019d-a4cb-5ffd-af79-c262249a9096 | 263654008 | Abnormal
fe594980-4f0d-56d9-aec6-e266519c6834 | R77.0 | Abnormality of albumin | Maps to | e78bfb96-7a40-57b9-b707-cc0f94be8cfe | 26758005 | Albumin measurement

In the mapping above, we see that ICD-10 code R77.0 for abnormality of albumin has two relationships with SNOMED codes:

  • A “Maps to” relationship with SNOMED code 26758005 for albumin measurement; and
  • A “Maps to value” relationship with SNOMED code 263654008 for abnormal.

The presence of a “Maps to value” relationship implies that the concurrent presence of two SNOMED codes (26758005 & 263654008) is necessary to capture the same meaning as a single ICD-10 code (R77.0).
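One way to keep such pairs together, rather than flattening both SNOMED codes into a single list, is to place the “Maps to value” code in its own column next to the “Maps to” code. The sketch below is a minimal illustration on a toy table: the R77.0 rows mirror the output shown above, the E11.9 row is a hypothetical simple mapping added for contrast, and the names paired_map, value_snomed_code, and needs_value are made up for this example.

```r
library(dplyr)

# Toy mapping: R77.0 rows mirror the example output; the E11.9 row is illustrative.
icd_map <- tibble(
  icd_code = c("R77.0", "R77.0", "E11.9"),
  relationship = c("Maps to value", "Maps to", "Maps to"),
  snomed_code = c("263654008", "26758005", "44054006")
)

# Split the two relationship types into separate tables.
maps_to <- icd_map %>%
  filter(relationship == "Maps to") %>%
  select(icd_code, snomed_code)
maps_to_value <- icd_map %>%
  filter(relationship == "Maps to value") %>%
  select(icd_code, value_snomed_code = snomed_code)

# Re-attach the value code as a second column and flag codes that need both.
paired_map <- maps_to %>%
  left_join(maps_to_value, by = "icd_code") %>%
  mutate(needs_value = !is.na(value_snomed_code))

# ICD codes whose meaning depends on a concept/value pair.
needs_review <- paired_map %>%
  filter(needs_value) %>%
  pull(icd_code)

print(paired_map)
```

Codes flagged in needs_review should not be matched on the “Maps to” code alone, since 26758005 by itself only indicates that albumin was measured.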

For more information, please see the official OMOP documentation.