Commit 494a1654 authored by Ryoko

Added summary of original article

Added a summary of the article by Scala et al. to understanding_image_1c. Also added a summary of an earlier study by Tasic et al. that Scala et al. refer to. Started writing a summary of another referenced article by Yao et al., but it is not finished yet.
parent 5bdc4ba9
@@ -10,13 +10,14 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"import pickle\n",
"import rnaseqTools\n",
"import matplotlib.pyplot as plt"
"import matplotlib.pyplot as plt\n",
"import pandas as pd"
]
},
{
@@ -72,7 +73,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 2,
"metadata": {},
"outputs": [
{
@@ -97,16 +98,16 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.collections.PathCollection at 0x7f882bfed0d0>"
"<matplotlib.collections.PathCollection at 0x7f89ce4098e0>"
]
},
"execution_count": 10,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
},
@@ -128,6 +129,317 @@
"plt.scatter(Z[:,0],Z[:,1], s=0.5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## First, what was the original article by Scala et al. doing?\n",
"\n",
"According to the article by Scala et al., there are two ways to classify neuron types: transcriptomic and morpho-electric. The transcriptomic method uses RNA sequencing of the neurons to assign them to families. The morpho-electric (morphology and electrophysiology) method first observes the actual shapes of the neurons by staining them with a substance called biocytin. Next, it injects electrical currents into the neurons (this article in particular uses a method called patch-clamp) and records their responses. Finally, it combines the neuron shapes with their electrical responses to classify the cells into families.\n",
"\n",
"Scala's article applied both transcriptomic and morpho-electric descriptions to mouse neurons from the primary motor cortex (MOp) to see how well the two agree. The experiment showed that the morpho-electric classification agreed with the transcriptomic classification for broad cell families (Pvalb, Vip, L5 ET, and other code-like names found in the article). However, the morpho-electric classification did not match the transcriptomic classification for finer neuron subtypes. The article concludes that neurons may not be separable as finely as the transcriptomic classification suggests, and can only be labeled by broad families with slight within-family variation.\n",
"\n",
"Scala et al. use transcriptomic clusters defined in an article by Yao et al. (https://www.biorxiv.org/content/10.1101/2020.02.29.970558v2.full.pdf). Yao et al. also classify neurons in the primary motor cortex.\n",
"\n",
"An earlier study by Tasic et al., which both Scala and Yao refer to (https://www.nature.com/articles/s41586-020-2907-3#Abs1), classified 23,822 cells from mice by single-cell RNA sequencing. These cells came from two areas of the mouse brain: the primary visual cortex and the anterior lateral motor cortex. The article this notebook uses analyzes only the motor cortex. It seems that a parallel study of the visual cortex was done, though it is not relevant to this notebook."
]
},
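The broad-versus-fine agreement described in the summary above can be illustrated with a small, self-contained sketch. The labels below are invented toy values, not data from Scala et al., and `purity` is a hypothetical helper implementing a standard cluster-purity score. It is not the analysis the paper performs, only one simple way to see how two labelings can agree at the family level and still disagree at the subtype level.

```python
from collections import Counter

# Toy labels for six hypothetical cells (NOT data from the paper).
# t_types: fine transcriptomic labels; me_types: made-up morpho-electric labels.
t_types = ["Pvalb Reln", "Pvalb Reln", "Pvalb Kank4",
           "Vip Gpc3", "Vip Gpc3", "Vip Htr1f"]
me_types = ["fast-spiking A", "fast-spiking B", "fast-spiking A",
            "bipolar A", "bipolar B", "bipolar A"]


def purity(labels_a, labels_b):
    """Cluster purity: for each group in labels_a, count the cells that carry
    the group's most common labels_b value, then divide the total by N."""
    pair_counts = Counter(zip(labels_a, labels_b))
    best_per_group = {}
    for (a, _b), n in pair_counts.items():
        best_per_group[a] = max(best_per_group.get(a, 0), n)
    return sum(best_per_group.values()) / len(labels_a)


# Broad families: keep only the first word of each label (toy convention).
broad_t = [label.split()[0] for label in t_types]
broad_me = [label.split()[0] for label in me_types]

fine_score = purity(t_types, me_types)    # subtype level
broad_score = purity(broad_t, broad_me)   # family level

print(f"fine-level purity:  {fine_score:.2f}")
print(f"broad-level purity: {broad_score:.2f}")
```

In this toy example the family-level score is perfect while the subtype-level score is lower, which mirrors the qualitative conclusion summarized above.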
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# ttypes is a dictionary with numpy arrays as values\n",
"# let's explore what the arrays mean\n",
"ttypes = pickle.load(open('../data/processed/rnaseq/ttypes.pickle', 'rb'))"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1329,)"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"ttypes['type'].shape"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"clusterAnn = pd.read_csv('../data/raw/allen/yao2020/cluster.annotation.csv')"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"['Lamp5 Pax6',\n",
" 'Lamp5 Egln3_1',\n",
" 'Lamp5 Egln3_2',\n",
" 'Lamp5 Egln3_3',\n",
" 'Lamp5 Pdlim5_1',\n",
" 'Lamp5 Pdlim5_2',\n",
" 'Lamp5 Slc35d3',\n",
" 'Lamp5 Lhx6',\n",
" 'Sncg Col14a1',\n",
" 'Sncg Slc17a8',\n",
" 'Sncg Calb1_1',\n",
" 'Sncg Calb1_2',\n",
" 'Sncg Npy2r',\n",
" 'Vip Sncg',\n",
" 'Vip Serpinf1_1',\n",
" 'Vip Serpinf1_2',\n",
" 'Vip Serpinf1_3',\n",
" 'Vip Htr1f',\n",
" 'Vip Gpc3',\n",
" 'Vip C1ql1',\n",
" 'Vip Mybpc1_2',\n",
" 'Vip Mybpc1_1',\n",
" 'Vip Chat_1',\n",
" 'Vip Mybpc1_3',\n",
" 'Vip Chat_2',\n",
" 'Vip Igfbp6_1',\n",
" 'Vip Igfbp6_2',\n",
" 'Sst Chodl',\n",
" 'Sst Penk',\n",
" 'Sst Myh8_1',\n",
" 'Sst Myh8_2',\n",
" 'Sst Myh8_3',\n",
" 'Sst Htr1a',\n",
" 'Sst Etv1',\n",
" 'Sst Pvalb Etv1',\n",
" 'Sst Crhr2_1',\n",
" 'Sst Crhr2_2',\n",
" 'Sst Hpse',\n",
" 'Sst Calb2',\n",
" 'Sst Pappa',\n",
" 'Sst Pvalb Calb2',\n",
" 'Sst C1ql3_1',\n",
" 'Sst C1ql3_2',\n",
" 'Sst Tac2',\n",
" 'Sst Th_1',\n",
" 'Sst Th_2',\n",
" 'Sst Th_3',\n",
" 'Pvalb Gabrg1',\n",
" 'Pvalb Egfem1',\n",
" 'Pvalb Gpr149',\n",
" 'Pvalb Kank4',\n",
" 'Pvalb Calb1_1',\n",
" 'Pvalb Calb1_2',\n",
" 'Pvalb Reln',\n",
" 'Pvalb Il1rapl2',\n",
" 'Pvalb Vipr2_1',\n",
" 'Pvalb Vipr2_2',\n",
" 'L2/3 IT_1',\n",
" 'L2/3 IT_2',\n",
" 'L2/3 IT_3',\n",
" 'L4/5 IT_1',\n",
" 'L4/5 IT_2',\n",
" 'L5 IT_1',\n",
" 'L5 IT_2',\n",
" 'L5 IT_3',\n",
" 'L5 IT_4',\n",
" 'L6 IT_1',\n",
" 'L6 IT_2',\n",
" 'L6 IT Car3',\n",
" 'L5 PT_1',\n",
" 'L5 PT_2',\n",
" 'L5 PT_3',\n",
" 'L5 PT_4',\n",
" 'L5/6 NP CT',\n",
" 'L6 CT Gpr139',\n",
" 'L6 CT Cpa6',\n",
" 'L6 CT Grp',\n",
" 'L6 CT Pou3f2',\n",
" 'L6 CT Kit_1',\n",
" 'L6 CT Kit_2',\n",
" 'L6b Col6a1',\n",
" 'L6b Shisa6_1',\n",
" 'L6b Shisa6_2',\n",
" 'L6b Ror1',\n",
" 'L6b Kcnip1',\n",
" 'L5/6 NP_1',\n",
" 'L5/6 NP_2',\n",
" 'L5/6 NP_3',\n",
" 'Meis2',\n",
" 'Meis2_Top2a',\n",
" 'OPC Pdgfra',\n",
" 'Astro_Top2a',\n",
" 'Astro Aqp4_Gfap',\n",
" 'Astro Aqp4_Slc7a10',\n",
" 'Oligo Enpp6_1',\n",
" 'Oligo Enpp6_2',\n",
" 'Oligo Enpp6_3',\n",
" 'Oligo Enpp6_4',\n",
" 'Oligo Opalin_1',\n",
" 'Oligo Opalin_2',\n",
" 'Oligo Opalin_3',\n",
" 'Oligo Opalin_4',\n",
" 'Endo',\n",
" 'VLMC_1',\n",
" 'VLMC_2',\n",
" 'VLMC_3',\n",
" 'VLMC_4',\n",
" 'VLMC_5',\n",
" 'VLMC_6',\n",
" 'VLMC_7',\n",
" 'SMC',\n",
" 'Peri',\n",
" 'Micro',\n",
" 'PVM_1',\n",
" 'PVM_2',\n",
" 'PVM_3']"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"list(clusterAnn[\"cluster_label\"])"
]
},
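The cluster labels printed above appear to follow a "family, then marker gene, then optional `_N` suffix" pattern. As a rough sketch, the labels could be grouped into broad families with a small heuristic. The function `broad_family` and its naming rule are assumptions inferred from the printed labels only, not taken from Yao et al.; if the annotation file carries its own subclass column, that would be the better source.

```python
import re
from collections import Counter


def broad_family(label: str) -> str:
    """Guess a broad family from a cluster label (heuristic, see assumptions)."""
    base = re.sub(r"_\d+$", "", label)  # drop a trailing "_1", "_2", ...
    tokens = base.split()
    # Layer-style labels such as "L2/3 IT" or "L5 PT": keep layer plus class.
    if len(tokens) > 1 and re.fullmatch(r"L\d(/\d)?", tokens[0]):
        return " ".join(tokens[:2])
    # Otherwise keep the first word, cut at any underscore ("Meis2_Top2a").
    return tokens[0].split("_")[0]


# A few labels copied from the output above.
sample_labels = [
    "Lamp5 Pax6", "Lamp5 Egln3_1", "Vip Chat_2", "L2/3 IT_1", "L5 PT_3",
    "L5/6 NP CT", "L6b Ror1", "Meis2_Top2a", "VLMC_7",
]

family_counts = Counter(broad_family(label) for label in sample_labels)
for family, n in family_counts.items():
    print(family, n)
```

Applied to the full list, the same call would be `Counter(broad_family(l) for l in clusterAnn["cluster_label"])`.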
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"# much of the ttypes dictionary seems to relate to data from a previous study\n",
"# that data is stored in the pickle file tasic2018.pickle\n",
"tasic2018 = pickle.load(open('../data/processed/reduced-allen-data/tasic2018.pickle', 'rb'))"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'counts': <23822x3000 sparse matrix of type '<class 'numpy.float64'>'\n",
" \twith 17616622 stored elements in Compressed Sparse Column format>,\n",
" 'genes': array(['0610040J01Rik', '1110008P14Rik', '1190002N15Rik', ..., 'Zmat4',\n",
" 'Zp2', 'Zwint'], dtype='<U28'),\n",
" 'areas': array([0, 0, 0, ..., 1, 1, 1]),\n",
" 'clusters': array([93, 72, 1, ..., 88, 34, 56]),\n",
" 'clusterColors': array(['#DDACC9', '#FF88AD', '#FFB8CE', '#DD6091', '#FF7290', '#FFA388',\n",
" '#C77963', '#9440F3', '#9900B3', '#C266D1', '#6C00BF', '#A700FF',\n",
" '#CA66FF', '#7779BF', '#8194CC', '#533691', '#9189FF', '#B09FFF',\n",
" '#756FB3', '#9FAAFF', '#FF00FF', '#AF00E6', '#FF00B3', '#B3128A',\n",
" '#FF4DC1', '#BD3D9A', '#882E81', '#AD589A', '#AC3491', '#FFFF00',\n",
" '#FFBB33', '#804811', '#B06411', '#BF480D', '#CC6D3D', '#FFDF11',\n",
" '#D6C300', '#FF8011', '#FF9F2C', '#FFB307', '#D9C566', '#BF9F00',\n",
" '#806B19', '#B95541', '#C77767', '#C11331', '#BF8219', '#994C00',\n",
" '#802600', '#A81111', '#ED4C50', '#FF2F7E', '#FF4343', '#BC2B11',\n",
" '#C94545', '#E62A5D', '#D6221D', '#E67A77', '#AF3F64', '#FF197F',\n",
" '#D9F077', '#A6E6A9', '#7AE6AB', '#82AD7D', '#B8FFCA', '#ADE6A6',\n",
" '#00979D', '#00DDC5', '#26FFF2', '#00A809', '#00FF00', '#26BF64',\n",
" '#33A9CE', '#0094C2', '#00A79D', '#2F8C4D', '#00879D', '#669D6A',\n",
" '#005C07', '#008F1F', '#53879D', '#53A39D', '#174F61', '#A19922',\n",
" '#7F9922', '#C2E32C', '#96E32C', '#5100FF', '#0000FF', '#22737F',\n",
" '#247740', '#00A863', '#29E043', '#53D385', '#1E806D', '#4A8044',\n",
" '#008F39', '#3D9946', '#73CA95', '#47867A', '#00B8C3', '#006091',\n",
" '#5C89CC', '#74A0FF', '#74CAFF', '#69A8E6', '#578EBF', '#1F6666',\n",
" '#2B7880', '#388899', '#266180', '#494566', '#336D99', '#254566',\n",
" '#335280', '#FF0000', '#00FF66', '#665C47', '#476655', '#475D4B',\n",
" '#6B998D', '#6C8581', '#476662', '#664747', '#89997A', '#6B8059',\n",
" '#74996B', '#665547', '#807059', '#997F7A', '#805F59', '#4C6647',\n",
" '#598069'], dtype='<U7'),\n",
" 'clusterNames': array(['Lamp5 Krt73', 'Lamp5 Fam19a1 Pax6', 'Lamp5 Fam19a1 Tmem182',\n",
" 'Lamp5 Ntn1 Npy2r', 'Lamp5 Plch2 Dock5', 'Lamp5 Lsp1',\n",
" 'Lamp5 Lhx6', 'Sncg Slc17a8', 'Sncg Vip Nptx2', 'Sncg Gpr50',\n",
" 'Sncg Vip Itih5', 'Serpinf1 Clrn1', 'Serpinf1 Aqp5 Vip',\n",
" 'Vip Igfbp6 Car10', 'Vip Igfbp6 Pltp', 'Vip Igfbp4 Mab21l1',\n",
" 'Vip Arhgap36 Hmcn1', 'Vip Gpc3 Slc18a3', 'Vip Lmo1 Fam159b',\n",
" 'Vip Lmo1 Myl1', 'Vip Ptprt Pkp2', 'Vip Rspo4 Rxfp1 Chat',\n",
" 'Vip Lect1 Oxtr', 'Vip Rspo1 Itga4', 'Vip Chat Htr1f',\n",
" 'Vip Pygm C1ql1', 'Vip Crispld2 Htr2c', 'Vip Crispld2 Kcne4',\n",
" 'Vip Col15a1 Pde1a', 'Sst Chodl', 'Sst Mme Fam114a1',\n",
" 'Sst Tac1 Htr1d', 'Sst Tac1 Tacr3', 'Sst Calb2 Necab1',\n",
" 'Sst Calb2 Pdlim5', 'Sst Nr2f2 Necab1', 'Sst Myh8 Etv1 ',\n",
" 'Sst Myh8 Fibin', 'Sst Chrna2 Glra3', 'Sst Chrna2 Ptgdr',\n",
" 'Sst Tac2 Myh4', 'Sst Hpse Sema3c', 'Sst Hpse Cbln4',\n",
" 'Sst Crhr2 Efemp1', 'Sst Crh 4930553C11Rik ', 'Sst Esm1',\n",
" 'Sst Tac2 Tacstd2', 'Sst Rxfp1 Eya1', 'Sst Rxfp1 Prdm8', 'Sst Nts',\n",
" 'Pvalb Gabrg1', 'Pvalb Th Sst', 'Pvalb Akr1c18 Ntf3',\n",
" 'Pvalb Calb1 Sst', 'Pvalb Sema3e Kank4', 'Pvalb Gpr149 Islr',\n",
" 'Pvalb Reln Tac1', 'Pvalb Reln Itm2a', 'Pvalb Tpbg', 'Pvalb Vipr2',\n",
" 'L2/3 IT VISp Rrad', 'L2/3 IT VISp Adamts2', 'L2/3 IT VISp Agmat',\n",
" 'L2/3 IT ALM Sla', 'L2/3 IT ALM Ptrf', 'L2/3 IT ALM Macc1 Lrg1',\n",
" 'L4 IT VISp Rspo1', 'L5 IT VISp Hsd11b1 Endou',\n",
" 'L5 IT VISp Whrn Tox2', 'L5 IT VISp Batf3',\n",
" 'L5 IT VISp Col6a1 Fezf2', 'L5 IT VISp Col27a1', 'L5 IT ALM Npw',\n",
" 'L5 IT ALM Pld5', 'L5 IT ALM Cbln4 Fezf2', 'L5 IT ALM Lypd1 Gpr88',\n",
" 'L5 IT ALM Tnc', 'L5 IT ALM Tmem163 Dmrtb1',\n",
" 'L5 IT ALM Tmem163 Arhgap25', 'L5 IT ALM Cpa6 Gpr88',\n",
" 'L5 IT ALM Gkn1 Pcdh19', 'L6 IT ALM Tgfb1', 'L6 IT ALM Oprk1',\n",
" 'L6 IT VISp Penk Col27a1', 'L6 IT VISp Penk Fst',\n",
" 'L6 IT VISp Col23a1 Adamts2', 'L6 IT VISp Col18a1',\n",
" 'L6 IT VISp Car3', 'L5 PT VISp Chrna6', 'L5 PT VISp Lgr5',\n",
" 'L5 PT VISp C1ql2 Ptgfr', 'L5 PT VISp C1ql2 Cdh13',\n",
" 'L5 PT VISp Krt80', 'L5 PT ALM Slco2a1', 'L5 PT ALM Npsr1',\n",
" 'L5 PT ALM Hpgd', 'L5 NP VISp Trhr Cpne7', 'L5 NP ALM Trhr Nefl',\n",
" 'L5 NP VISp Trhr Met', 'L6 NP ALM Trh', 'L6 CT ALM Cpa6',\n",
" 'L6 CT ALM Nxph2 Sla', 'L6 CT VISp Nxph2 Wls', 'L6 CT VISp Gpr139',\n",
" 'L6 CT VISp Ctxn3 Brinp3', 'L6 CT VISp Ctxn3 Sla',\n",
" 'L6 CT VISp Krt80 Sla', 'L6b VISp Col8a1 Rprm', 'L6b VISp Mup5',\n",
" 'L6b VISp Col8a1 Rxfp1', 'L6b ALM Olfr111 Spon1',\n",
" 'L6b ALM Olfr111 Nxph1', 'L6b P2ry12', 'L6b VISp Crh',\n",
" 'L6b Hsd17b2', 'Meis2 Adamts19', 'CR Lhx5', 'Astro Aqp4',\n",
" 'OPC Pdgfra Grm5', 'OPC Pdgfra Ccnb1', 'Oligo Rassf10',\n",
" 'Oligo Serpinb1a', 'Oligo Synpr', 'VLMC Osr1 Cd74',\n",
" 'VLMC Osr1 Mc5r', 'VLMC Spp1 Hs3st6', 'VLMC Spp1 Col15a1',\n",
" 'Peri Kcnj8', 'SMC Acta2', 'Endo Ctla2a', 'Endo Cytl1', 'PVM Mrc1',\n",
" 'Microglia Siglech'], dtype='<U26'),\n",
" 'seqDepths': array([[1769029.],\n",
" [ 811166.],\n",
" [1203840.],\n",
" ...,\n",
" [1002767.],\n",
" [1025818.],\n",
" [ 882435.]]),\n",
" 'intronCounts': <23822x3000 sparse matrix of type '<class 'numpy.float64'>'\n",
" \twith 12834637 stored elements in Compressed Sparse Column format>,\n",
" 'intronSeqDepths': array([[242757.],\n",
" [ 87358.],\n",
" [130679.],\n",
" ...,\n",
" [144836.],\n",
" [100540.],\n",
" [116728.]])}"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tasic2018"
]
},
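To make sense of the dictionary printed above, the sketch below uses a tiny made-up stand-in with the same key names. It assumes that each entry of `clusters` is an index into `clusterNames`, which the printed output suggests but does not confirm, and it does not guess which area the codes 0 and 1 in `areas` stand for.

```python
import numpy as np

# Toy stand-in with the same keys as the printed dictionary (values are made up).
toy = {
    "areas": np.array([0, 0, 1, 1, 1]),
    "clusters": np.array([2, 0, 1, 2, 2]),
    "clusterNames": np.array(["Lamp5 Lhx6", "Sst Chodl", "Pvalb Vipr2"]),
}

# Assumption: 'clusters' holds per-cell indices into 'clusterNames',
# so integer-array indexing yields one cluster name per cell.
cell_names = toy["clusterNames"][toy["clusters"]]

# Count cells per area code without interpreting what each code means.
area_codes, cells_per_area = np.unique(toy["areas"], return_counts=True)

print(cell_names)
print(dict(zip(area_codes.tolist(), cells_per_area.tolist())))
```

With the real data, the same two lines could be run on `tasic2018` in place of `toy`, provided the indexing assumption holds.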
{
"cell_type": "code",
"execution_count": null,
......