ICD and SNOMED codes for clinical conditions
condition_codes_icd_snomed.Rmd
Background
PicnicHealth uses the OMOP Common Data Model to map clinical concepts to standard ontologies. All clinical tables in our data can be joined to the concept table to access standard codes and vocabularies, which can then be used to identify clinical elements of interest in your analysis.
Lists of conditions found on problem lists, assessment/visit diagnosis lists, and discharge diagnosis lists are captured in the visit_condition_occurrence table. Depending on the date of your dataset, conditions will be coded in one of two sets of ontologies:
- datasets dated on or after July 1, 2024: SNOMED-CT
- datasets dated prior to July 1, 2024: ICD-9 and ICD-10
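If you are unsure which coding system your dataset uses, one option is to count condition rows by vocabulary. The following is a minimal sketch on toy stand-ins for the visit_condition_occurrence and concept tables; the rows and IDs are made up for illustration, and only the column names follow those used in this vignette.

```r
library(dplyr)

# Toy stand-ins for the real tables; all IDs and rows are illustrative.
toy_conditions <- data.frame(
  visit_condition_occurrence_id = c("v1", "v2", "v3"),
  condition_concept_id = c("c1", "c2", "c1"),
  stringsAsFactors = FALSE
)
toy_concept <- data.frame(
  concept_id = c("c1", "c2"),
  vocabulary = rep(
    "Systematic Nomenclature of Medicine - Clinical Terms (IHTSDO)", 2),
  stringsAsFactors = FALSE
)

# Count condition rows per vocabulary to see how the conditions are coded.
vocabulary_counts <- toy_conditions %>%
  left_join(toy_concept,
            by = c("condition_concept_id" = "concept_id")) %>%
  count(vocabulary, name = "n_rows")

print(vocabulary_counts)
```

On real data, replace the toy tables with visit_condition_occurrence and concept.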
Sometimes, you may want to convert between ICD and SNOMED ontologies in order to accomplish a particular task. For example:
- You have a SNOMED-coded table, but need to use a list of ICD codes prepared by your research team to find comorbidities to include or exclude patients from an analysis.
- You have an ICD-coded table, but want to use the SNOMED hierarchy to find descendants of a concept of interest, such as all the conditions subsumed by “diabetes mellitus”.
The vignette below shows you how to translate between the two ontologies using the vocabulary tables provided in your PicnicHealth dataset.
Important note
Conditions that are specifically relevant to the primary clinical population in your dataset are found in the cohort_ tables. Depending on your dataset’s data model, these may include the primary diagnosis, select comorbidities, and/or symptoms. In all datasets, these are coded in SNOMED.
Translating SNOMED to ICD
Working from an existing ICD code list
If you have already identified ICD codes to use in your analysis, you can use the tables provided in your dataset to identify the corresponding SNOMED codes.
[1] Load the PicnicHealth package and the data provided.
library(PicnicHealth)
library(dplyr)
data_set = load_data_set(data_path)
list2env(data_set, envir = .GlobalEnv)
[2] Join the visit_condition_occurrence table to the concept table to access the SNOMED code for each condition.
joined_condition_concepts = visit_condition_occurrence %>%
select(visit_condition_occurrence_id, person_id,
condition_concept_id, condition_concept_name, section) %>%
left_join(
concept %>%
select(concept_id, domain_concept_code, vocabulary, vocabulary_version),
by = c("condition_concept_id" = "concept_id"))
[3] Join the resulting table to the snomed_icd_mappings table. The domain_concept_code field maps onto the snomed_concept_code field. Note: This table is only present in datasets dated on or after July 1, 2024. If it is not present in your dataset, it may be provided to you as a .csv file in the PicnicHealth study portal; reach out to your project contact to receive access.
mapped_condition_concepts = joined_condition_concepts %>%
left_join(snomed_icd_mappings,
by = c("domain_concept_code" = "snomed_concept_code"))
[4] Filter the resulting table by the ICD codes of interest. In this example, we’re looking for instances of fatigue.
fatigue_icd_codes = c('R53.1', 'R53.81', 'R53.82', 'R53.83', '780.7', '780.79')
fatigue_conditions = mapped_condition_concepts %>%
filter(icd_concept_code %in% fatigue_icd_codes)
# kable() comes from the knitr package, so it is called with its namespace here
head(fatigue_conditions) %>%
  select(visit_condition_occurrence_id,
         snomed_concept_code = domain_concept_code, snomed_concept_name,
         icd_concept_code, icd_concept_name) %>%
  knitr::kable()
visit_condition_occurrence_id | snomed_concept_code | snomed_concept_name | icd_concept_code | icd_concept_name |
---|---|---|---|---|
6ad80e13-ed21-4512-90dc-627829f92c85 | 84229001 | Fatigue | R53.83 | Other fatigue |
ec00d0a6-11a7-4835-9891-39e0067668e5 | 13791008 | Asthenia | R53.1 | Weakness |
c4eb52bf-8fe1-4d4c-8f2c-d28c2dc590f7 | 52702003 | Chronic fatigue syndrome | R53.82 | Chronic fatigue, unspecified |
298299ef-8ae5-4034-a393-678816d75f28 | 367391008 | Malaise | R53.81 | Other malaise |
c803a3e4-1a5b-49dd-bd5b-e35d4c67cb08 | 26544005 | Muscle weakness | M62.81 | Muscle weakness (generalized) |
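If a SNOMED code corresponds to more than one ICD code in the mapping table, the left join in step 3 returns several rows for the same condition record, so row counts can overstate the number of records. The sketch below illustrates this on toy data. The pairing of 84229001 with 780.79 is a hypothetical mapping invented for the example, not one taken from the snomed_icd_mappings table.

```r
library(dplyr)

# Toy condition records already joined to their SNOMED codes.
toy_joined <- data.frame(
  visit_condition_occurrence_id = c("v1", "v2"),
  domain_concept_code = c("84229001", "13791008"),
  stringsAsFactors = FALSE
)

# Hypothetical mapping in which one SNOMED code maps to two ICD codes.
toy_mappings <- data.frame(
  snomed_concept_code = c("84229001", "84229001", "13791008"),
  icd_concept_code = c("R53.83", "780.79", "R53.1"),
  stringsAsFactors = FALSE
)

toy_fatigue <- toy_joined %>%
  left_join(toy_mappings,
            by = c("domain_concept_code" = "snomed_concept_code")) %>%
  filter(icd_concept_code %in% c("R53.83", "780.79", "R53.1"))

# Rows after the join versus distinct condition records.
n_rows <- nrow(toy_fatigue)
n_records <- toy_fatigue %>%
  distinct(visit_condition_occurrence_id) %>%
  nrow()

cat("rows:", n_rows, "distinct records:", n_records, "\n")
```

Counting distinct visit_condition_occurrence_id values (or distinct person_id values) avoids double counting when a record matches several ICD codes in your list.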
Developing novel code lists in SNOMED
If you are developing a novel code list, a tool such as OHDSI’s Athena can help you identify concepts of interest.
SNOMED provides ancestor-descendant relationships at many levels of granularity, enabling researchers to identify conditions of interest without needing to curate exhaustive code lists by hand.
For example, an analysis of diabetes mellitus (DM) can start with SNOMED code 73211009 and find all descendants of this concept, in order to identify patients who have DM as a comorbidity.
To do so in PicnicHealth data:
[1] Filter the concept table to rows where domain_concept_code = 73211009.
dm_concept = concept %>%
filter(domain_concept_code == "73211009" &
vocabulary == "Systematic Nomenclature of Medicine - Clinical Terms (IHTSDO)")
[2] Inner join this filtered table to the concept_ancestor table (fields to join: concept_id ←→ ancestor_concept_id).
dm_descendants = dm_concept %>%
inner_join(concept_ancestor,
by = c("concept_id" = "ancestor_concept_id"))
[3] Join this filtered table to visit_condition_occurrence (fields to join: descendant_concept_id ←→ condition_concept_id) to find all instances where a descendant concept of DM was present in a patient’s medical record.
dm_conditions = dm_descendants %>%
inner_join(visit_condition_occurrence,
by = c("descendant_concept_id" = "condition_concept_id")) %>%
select(visit_condition_occurrence_id, condition_concept_name,
section, visit_id)
visit_condition_occurrence_id | condition_concept_name | section | visit_id |
---|---|---|---|
3fbe116e-5b00-4781-994f-ab3dec09e2eb | Mononeuropathy due to type 2 diabetes mellitus | Problem list - Reported | 7a705a6d-c374-4e7a-a73c-914e8eb934fc |
fb6129b7-424c-4aaf-865a-10834094feaf | Type 2 diabetes mellitus | Evaluation + Plan note | 741bfaa0-9608-415c-8f3f-80ae5726ed73 |
f6165e73-0034-49d2-9bcb-e5c81a9a5797 | Mild nonproliferative retinopathy due to diabetes mellitus | Evaluation + Plan note | cd1efad8-c2eb-4f73-9899-c4ca2eff4d3d |
6e296041-0a81-4c60-a98d-142ae304bce2 | Type 2 diabetes mellitus without complication | Evaluation + Plan note | f3d26a4c-0d37-437c-b478-a79fbf7ed892 |
7f41830e-7d76-4aec-a942-ff7883a2a0ca | Mild nonproliferative retinopathy due to type 2 diabetes mellitus | Problem list - Reported | c1ad964a-b3cd-43ed-94ec-421594a6381b |
dc4e510b-eaf2-442b-b241-836d75fbe15b | Type 1 diabetes mellitus without complication | Evaluation + Plan note | 8a5d4259-879e-40f4-83fb-7fdbc0ef1f10 |
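A common next step is to reduce these condition rows to a list of distinct patients. The sketch below does this on toy data with made-up IDs. It assumes that, as in the standard OMOP concept_ancestor table, each concept is also listed as its own ancestor, so records coded with the parent concept itself are retained; it is worth confirming that this holds in your dataset.

```r
library(dplyr)

# Toy concept_ancestor rows. "dm" stands in for the concept_id of
# diabetes mellitus; "t2dm" and "t1dm" stand in for two descendants.
# The dm -> dm row reflects the assumed self-referencing record.
toy_concept_ancestor <- data.frame(
  ancestor_concept_id = c("dm", "dm", "dm"),
  descendant_concept_id = c("dm", "t2dm", "t1dm"),
  stringsAsFactors = FALSE
)

# Toy condition records; "htn" is an unrelated condition.
toy_conditions <- data.frame(
  person_id = c("p1", "p1", "p2", "p3"),
  condition_concept_id = c("t2dm", "dm", "t1dm", "htn"),
  stringsAsFactors = FALSE
)

# Keep one row per patient with any DM descendant on record.
dm_patients <- toy_concept_ancestor %>%
  filter(ancestor_concept_id == "dm") %>%
  inner_join(toy_conditions,
             by = c("descendant_concept_id" = "condition_concept_id")) %>%
  distinct(person_id)

print(dm_patients)
```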
Translating ICD to SNOMED
In datasets dated prior to July 1, 2024, conditions in the visit_condition_occurrence table are coded in ICD-9 or ICD-10, the codes commonly documented in the structured portions of U.S. patients’ medical records for billing purposes. This vignette walks through how to use the concept and concept_relationship tables to map from ICD to SNOMED codes.
Important notes:
- It is difficult to develop a direct translation from ICD to SNOMED. Follow the caveats in the “Maps to value” discussion below, and manually review output to ensure that only conditions of interest are included in your final code list.
- Only the full OMOP concept relationship table, which is provided separately from clinical datasets, contains the relationships needed to identify synonyms. Converting ICD to SNOMED codes as described below therefore requires downloading the full OMOP concept relationship table and loading it into R.
[1] Start with a list of ICD-9 or ICD-10 codes of interest. For this example, we will work with the ICD-10 codes for Type 2 diabetes mellitus: “E11” and all corresponding sub-codes (e.g., “E11.1”, “E11.2”). (Note: while this demo uses only ICD-10, we recommend including ICD-9 codes in your analysis as well.)
[2] Load the PicnicHealth package and the data provided.
library(PicnicHealth)
library(dplyr)
data_set = load_data_set(data_path)
list2env(data_set, envir = .GlobalEnv)
[3] Use the concept table to obtain the concept_id values corresponding to ICD-10 codes of interest. Here, we search the pattern “^E11” to match all codes that begin with this root.
DM_icd_concept_ids = concept %>%
filter(vocabulary == paste("International Classification of Diseases,",
"Tenth Revision, Clinical Modification (NCHS)") &
grepl("^E11", domain_concept_code)) %>%
select(concept_id) %>%
pull()
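The pattern “^E11” matches any string that starts with those three characters. When adapting this step to other code roots, a stricter pattern that requires the root to be followed by a period or the end of the string may be safer. The sketch below compares the two on a toy vector; “E110” is a made-up string included only to show the difference and is not a real ICD-10 code.

```r
# Toy code strings; "E110" is not a real ICD-10 code.
codes <- c("E11", "E11.9", "E11.641", "E10.9", "E110", "250.00")

# Loose match: anything beginning with "E11".
loose_match <- grepl("^E11", codes)

# Strict match: "E11" followed by a period or the end of the string.
strict_match <- grepl("^E11(\\.|$)", codes)

print(data.frame(codes, loose_match, strict_match))
```

For well-formed ICD-10 codes within a single vocabulary, as in step 3, both patterns should select the same rows; the stricter form mainly guards against unexpected values.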
[4] Identify all records in the concept_relationship table that map the concept_id values identified in step 3 to other concepts. This can be done using the concept_id_1 column in the concept_relationship table. When converting from ICD concepts to SNOMED concepts, we are only interested in relationship values of “Maps to” and “Maps to value”, so we will restrict to those. We will also rename the concept_id_1 and concept_id_2 columns to icd_concept_id and snomed_concept_id, respectively, to keep things organized.
As noted above, only the full OMOP concept relationship table that is provided independently from clinical data sets contains the “Maps to” and “Maps to value” relationships that are necessary for identifying synonyms.
DM_concept_relationships = concept_relationship %>%
filter(concept_id_1 %in% DM_icd_concept_ids &
relationship %in% c("Maps to", "Maps to value")) %>%
select(icd_concept_id = concept_id_1, relationship,
snomed_concept_id = concept_id_2)
icd_concept_id | relationship | snomed_concept_id |
---|---|---|
0acfbd0c-4942-5822-b99d-ec1dd711a07a | Maps to | 0877f62f-346f-5684-a3a4-773691622b3f |
45982c28-ab08-56ba-a267-a2b0c034b48f | Maps to | 939834b2-65a0-52d9-bc9f-a934a4e0aa62 |
5a96ed46-83f1-5ff2-9632-75d59467fd85 | Maps to | de305955-7485-5ca3-a33f-280af6a99cf1 |
2f0122d0-f493-59d4-8d8c-459f7d3f3b50 | Maps to | 262239b0-5396-59b9-8acc-46514feafd37 |
5a05995a-d531-59bf-9e2b-cc95bf6b1f1c | Maps to | 8dc97a07-7307-5d2e-bb75-586c1b26f989 |
ed0cb105-234a-5e38-92b0-dc5ab8d87c2a | Maps to | 172c52eb-63c1-5583-9073-190f61ee9dc2 |
[5] Now that we have our ICD and SNOMED concept_id values, we can join with the concept table twice (once for icd_concept_id and once for snomed_concept_id) to retrieve the corresponding codes.
DM_icd_snomed_map = DM_concept_relationships %>%
left_join(concept %>%
select(concept_id, icd_code = domain_concept_code),
by = c("icd_concept_id" = "concept_id")) %>%
left_join(concept %>%
select(concept_id, snomed_code = domain_concept_code),
by = c("snomed_concept_id" = "concept_id")) %>%
select(icd_concept_id, icd_code, relationship,
snomed_concept_id, snomed_code)
icd_concept_id | icd_code | relationship | snomed_concept_id | snomed_code |
---|---|---|---|---|
1395c6fb-c40a-5695-895b-adfa46b41c9a | E11.641 | Maps to | 747bee0c-e815-55b0-91e2-8607a6ad9094 | 719216001 |
281c58bb-4580-5cd2-ae09-b9d639221023 | E11.52 | Maps to | 4ecbde5a-97e7-53c3-9ba3-57e08977f083 | 421631007 |
79cbd97e-5f90-53c0-9db3-106eefb3b4d6 | E11.4 | Maps to | 643f4b78-1fdd-5ef4-84c6-27063428f19b | 421326000 |
4b00f6cd-ee82-59b8-9697-5cfcad429ee5 | E11.40 | Maps to | 643f4b78-1fdd-5ef4-84c6-27063428f19b | 421326000 |
c519515a-a88c-5f2a-b6ff-da26cafa2663 | E11.49 | Maps to | 643f4b78-1fdd-5ef4-84c6-27063428f19b | 421326000 |
7acb913f-f445-50ca-9bc5-2d45674034c8 | E11.3523 | Maps to | f0a4502c-3835-59be-bf07-923b3f9771f6 | 232010004 |
[6] Now that we have our map from our original ICD-10 codes to their corresponding SNOMED codes, we can extract the unique SNOMED codes as a vector and take a look at the first few results.
DM_snomed_codes = unique(DM_icd_snomed_map$snomed_code)
DM_snomed_codes[1:3]
#> [1] "312912001" "422014003" "421631007"
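Several ICD codes can map to the same SNOMED code, which is why step 6 takes unique values. As a rough check on a mapping like DM_icd_snomed_map, one can count how many ICD codes feed each SNOMED code. The sketch below uses only the six code pairs shown in the step 5 output as toy input; the object and column names other than icd_code and snomed_code are illustrative.

```r
library(dplyr)

# Code pairs copied from the step 5 example output.
toy_map <- data.frame(
  icd_code = c("E11.641", "E11.52", "E11.4", "E11.40", "E11.49", "E11.3523"),
  snomed_code = c("719216001", "421631007", "421326000",
                  "421326000", "421326000", "232010004"),
  stringsAsFactors = FALSE
)

# Number of distinct ICD codes mapped to each SNOMED code.
icd_per_snomed <- toy_map %>%
  group_by(snomed_code) %>%
  summarise(n_icd_codes = n_distinct(icd_code),
            icd_codes = paste(sort(icd_code), collapse = ", ")) %>%
  arrange(desc(n_icd_codes))

print(icd_per_snomed)
```

SNOMED codes fed by many ICD codes are usually the more general concepts, so they are good candidates for manual review.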
A note regarding the “Maps to value” relationship
Because ICD and SNOMED are distinct vocabularies, there does not exist a perfect one-to-one relationship for all concepts across both vocabularies. When translating from ICD to SNOMED, some caution is required.
The “Maps to” relationship corresponds to either a full equivalence between concepts or, if an equivalent mapping does not exist, a mapping from a more specific ICD code to a more general SNOMED code (e.g., the ICD-10 code for refractory anemia with multi-lineage dysplasia maps to the SNOMED code for refractory anemia). Furthermore, a single ICD code can map to one or more SNOMED codes.
The “Maps to value” relationship is a bit different. It is always used in conjunction with a single “Maps to” relationship, and is necessary to preserve the full meaning of certain types of ICD codes. One example of where “Maps to value” relationships arise in the context of ICD-to-SNOMED mapping is for ICD codes corresponding to abnormal levels of tests.
For example, let’s take a look at ICD-10 code R77.0, which represents abnormality of albumin, and obtain its corresponding SNOMED codes.
ab_alb_concept_id = concept %>%
filter(domain_concept_code == "R77.0") %>%
select(icd_concept_id = concept_id,
icd_code = domain_concept_code,
icd_concept_name = concept_name) %>%
left_join(concept_relationship,
by = c("icd_concept_id" = "concept_id_1")) %>%
filter(relationship %in% c("Maps to", "Maps to value"))%>%
left_join(concept %>% select(concept_id,
snomed_code = domain_concept_code,
snomed_concept_name = concept_name),
by = c("concept_id_2" = "concept_id")) %>%
select(icd_concept_id, icd_code, icd_concept_name, relationship,
snomed_concept_id = concept_id_2, snomed_code, snomed_concept_name)
icd_concept_id | icd_code | icd_concept_name | relationship | snomed_concept_id | snomed_code | snomed_concept_name |
---|---|---|---|---|---|---|
fe594980-4f0d-56d9-aec6-e266519c6834 | R77.0 | Abnormality of albumin | Maps to value | 771f019d-a4cb-5ffd-af79-c262249a9096 | 263654008 | Abnormal |
fe594980-4f0d-56d9-aec6-e266519c6834 | R77.0 | Abnormality of albumin | Maps to | e78bfb96-7a40-57b9-b707-cc0f94be8cfe | 26758005 | Albumin measurement |
In the mapping above, we see that ICD-10 code R77.0 for abnormality of albumin has two relationships with SNOMED codes:
- A “Maps to” relationship with SNOMED code 26758005 for albumin measurement; and
- A “Maps to value” relationship with SNOMED code 263654008 for abnormal.
The presence of a “Maps to value” relationship implies that the concurrent presence of two SNOMED codes (26758005 & 263654008) is necessary to capture the same meaning as a single ICD-10 code (R77.0).
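When building a code list, it can help to flag every ICD code that carries a “Maps to value” relationship so that those codes receive manual review. The sketch below does this on toy rows drawn from the example outputs in this vignette; the toy table and the has_value_mapping column are names chosen for illustration.

```r
library(dplyr)

# Toy relationship rows taken from the example outputs in this vignette.
toy_relationships <- data.frame(
  icd_code = c("R77.0", "R77.0", "E11.52"),
  relationship = c("Maps to", "Maps to value", "Maps to"),
  snomed_code = c("26758005", "263654008", "421631007"),
  stringsAsFactors = FALSE
)

# ICD codes whose meaning depends on a pair of SNOMED codes.
needs_review <- toy_relationships %>%
  group_by(icd_code) %>%
  summarise(has_value_mapping = any(relationship == "Maps to value")) %>%
  filter(has_value_mapping)

print(needs_review)
```

For flagged codes, matching on the “Maps to” SNOMED code alone would be too broad: in the R77.0 example, it would capture any albumin measurement rather than abnormal ones only.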
For more information, please see the official OMOP documentation.