Mapping ensembl_gene_id to hgnc_id via old bionty API

This mapping can be obtained via:

  1. hgnc

  2. ensembl

We should always use the reference table from the HGNC website when converting hgnc_id to other ids, as it’s carefully curated to ensure a unique mapping.

If you access the id mapping through Ensembl, multiple ensembl ids map to the same hgnc_id in many cases.

However, the REST API provided by HGNC is not working, even though the database itself is up to date.

import bionty as bt

gn = bt.Gene(species="human")

# table from ensembl
ens = gn.reference

# table from HGNC
hgnc = gn.hgnc()

# ensembl ids present in the ensembl table but absent from the HGNC table
diff_set = set(ens["ensembl_gene_id"].values).difference(hgnc["ensembl_gene_id"])

# rows of the ensembl table for those ids
ref_diff = ens[ens["ensembl_gene_id"].isin(diff_set)]

Here you can already see that two ensembl ids (index 40 and 42) map to HGNC:6338

ref_diff.head(10)
    ensembl_gene_id  entrezgene_id     hgnc_id  hgnc_symbol
37  ENSG00000278704            NaN         NaN          NaN
38  ENSG00000262826          65123  HGNC:26153        INTS3
39  ENSG00000275151            NaN         NaN          NaN
40  ENSG00000275717           3811   HGNC:6338      KIR3DL1
41  ENSG00000274714           3809   HGNC:6336      KIR2DS4
42  ENSG00000276379           3811   HGNC:6338      KIR3DL1
43  ENSG00000280538            NaN         NaN          NaN
44  ENSG00000274324           3809   HGNC:6336      KIR2DS4
45  ENSG00000271254      102724250         NaN          NaN
46  ENSG00000275047           3810   HGNC:6337      KIR2DS5
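
The duplication in this output can also be counted rather than spotted by eye. The following is a minimal sketch in plain Python, not part of the bionty API: the rows are copied by hand from the output above (only the rows that have an hgnc_id, plus one NaN row), and `group_by_hgnc` and `ambiguous` are hypothetical helper names introduced for illustration.

```python
from collections import defaultdict

# (ensembl_gene_id, hgnc_id) pairs copied by hand from the ref_diff output.
# None stands in for NaN.
rows = [
    ("ENSG00000278704", None),
    ("ENSG00000262826", "HGNC:26153"),
    ("ENSG00000275717", "HGNC:6338"),
    ("ENSG00000274714", "HGNC:6336"),
    ("ENSG00000276379", "HGNC:6338"),
    ("ENSG00000274324", "HGNC:6336"),
    ("ENSG00000275047", "HGNC:6337"),
]


def group_by_hgnc(pairs):
    """Collect the ensembl ids recorded for each hgnc_id, skipping missing ids."""
    grouped = defaultdict(list)
    for ensembl_id, hgnc_id in pairs:
        if hgnc_id is not None:
            grouped[hgnc_id].append(ensembl_id)
    return dict(grouped)


def ambiguous(grouped):
    """Keep only the hgnc_ids that have more than one ensembl id."""
    return {h: ids for h, ids in grouped.items() if len(ids) > 1}


dupes = ambiguous(group_by_hgnc(rows))
for hgnc_id, ids in sorted(dupes.items()):
    print(hgnc_id, ids)
```

On these rows the check reports HGNC:6336 and HGNC:6338, each with two ensembl ids, so the issue is not limited to KIR3DL1.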

If you search for the hgnc_id HGNC:6338 in the HGNC table, you get a different ensembl id, ENSG00000167633

hgnc[hgnc.hgnc_id == "HGNC:6338"]["ensembl_gene_id"]
13835    ENSG00000167633
Name: ensembl_gene_id, dtype: object

Check whether ENSG00000167633 is also mapped to HGNC:6338 in the ensembl table: it is.

ens[ens.ensembl_gene_id == "ENSG00000167633"]
       ensembl_gene_id  entrezgene_id    hgnc_id  hgnc_symbol
62030  ENSG00000167633           3811  HGNC:6338      KIR3DL1
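
The two lookups suggest a simple consistency check: for a given hgnc_id, the single ensembl id curated by HGNC should appear among the ids the ensembl table lists, and the remaining ids are alternates. Below is a minimal sketch in plain Python that uses only the ids shown in this notebook; `check_curated` and the two dictionaries are hypothetical names for illustration, not bionty objects.

```python
# Ids taken from the outputs in this notebook.
# ensembl table: one hgnc_id can be listed for several ensembl ids.
ensembl_ids_for = {
    "HGNC:6338": {"ENSG00000275717", "ENSG00000276379", "ENSG00000167633"},
}
# HGNC table: one curated ensembl id per hgnc_id.
hgnc_curated = {"HGNC:6338": "ENSG00000167633"}


def check_curated(hgnc_id, ensembl_map, hgnc_map):
    """Return (is_consistent, alternates) for one hgnc_id.

    is_consistent is True when the HGNC-curated ensembl id is also listed
    for this hgnc_id in the ensembl mapping; alternates are the other
    ensembl ids listed there, sorted for stable output.
    """
    curated = hgnc_map[hgnc_id]
    candidates = ensembl_map.get(hgnc_id, set())
    return curated in candidates, sorted(candidates - {curated})


ok, alternates = check_curated("HGNC:6338", ensembl_ids_for, hgnc_curated)
print(ok, alternates)
```

Under this assumption the HGNC id acts as the canonical choice, and the ensembl table is only used to confirm it.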

HGNC REST server is not accessible

from bionty._rest import fetch_endpoint
fetch_endpoint(
    "http://rest.genenames.org/", "search/ensembl_gene_id/ENSG00000157764", "text/xml"
)
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/Users/sunnysun/Documents/repos/bionty/docs/tasks/2022-05-29-ensembl-gene-ids.ipynb Cell 14' in <cell line: 1>()
----> 1 fetch_endpoint("http://rest.genenames.org/", "search/ensembl_gene_id/ENSG00000157764", "text/xml")

File /opt/miniconda3/envs/py39/lib/python3.9/site-packages/bionty/_rest.py:15, in fetch_endpoint(server, request, content_type)
     12 r = requests.get(server + request, headers={"Accept": content_type})
     14 if not r.ok:
---> 15     r.raise_for_status()
     16     sys.exit()
     18 if content_type == "application/json":

File /opt/miniconda3/envs/py39/lib/python3.9/site-packages/requests/models.py:960, in Response.raise_for_status(self)
    957     http_error_msg = u'%s Server Error: %s for url: %s' % (self.status_code, reason, self.url)
    959 if http_error_msg:
--> 960     raise HTTPError(http_error_msg, response=self)

HTTPError: 500 Server Error: Internal Server Error for url: http://rest.genenames.org/search/ensembl_gene_id/ENSG00000157764