Gene Model: Htt

Nomenclature
   Symbol:   Htt
   Name:     Huntingtin
   Species:  Dracomimus familiaris

Genomic Location
   Chromosome:          1
   Linkage map:         unknown
   Genome Coordinates:  Chr1: 86 Mbp

Molecular Function

May play a role in microtubule-mediated transport or vesicle function.

Molecular Function Terms:

binding
   drug binding
      diazepam binding
   protein binding
      cytoskeletal protein binding
         dynactin binding
         tubulin binding
      dynein binding
      transcription factor binding
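
The indented term listing above encodes a parent/child hierarchy. As a minimal sketch, assuming three spaces of indentation per level (as in the listing) and using a hypothetical helper name `term_paths`, the full ancestor path of each term can be recovered like this:

```python
# Molecular Function terms copied from the listing above.
# Assumption: each nesting level is indented by exactly 3 spaces.
TERMS = """\
binding
   drug binding
      diazepam binding
   protein binding
      cytoskeletal protein binding
         dynactin binding
         tubulin binding
      dynein binding
      transcription factor binding
"""


def term_paths(text, indent=3):
    """Return one tuple per term holding its ancestors, root first."""
    stack = []  # stack[d] is the most recent term seen at depth d
    paths = []
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // indent
        del stack[depth:]          # drop terms that are not ancestors
        stack.append(line.strip())
        paths.append(tuple(stack))
    return paths


PATHS = term_paths(TERMS)
for path in PATHS:
    print(" > ".join(path))
```

Each printed line reads from the root term to the leaf, which makes it easier to see which terms share a parent.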

Human Disease Association

Defects in HTT are the cause of Huntington disease (HD) [MIM:143100]. HD is an autosomal dominant neurodegenerative disorder characterized by involuntary movements (chorea), general motor impairment, psychiatric disorders and dementia. Onset usually occurs in the third or fourth decade of life, and symptoms progressively worsen, leading to death within 10 to 20 years. Onset and clinical course depend on the degree of poly-Gln repeat expansion: longer expansions result in earlier onset and more severe clinical manifestations. HD affects 1 in 10,000 individuals of European origin. The neuropathology of Huntington disease shows a distinctive pattern of neuron loss, especially in the caudate and putamen (striatum).

Polymorphism: The poly-Gln region of HTT is highly polymorphic (10 to 35 repeats) in the normal population and is expanded to about 36-120 repeats in Huntington disease patients. The repeat length usually increases in successive generations but occasionally contracts. The adjacent poly-Pro region is also polymorphic and varies between 7 and 12 residues. Polyglutamine expansion leads to elevated susceptibility to apopain cleavage and likely results in accelerated neuronal apoptosis.
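
The repeat ranges quoted above can be turned into a simple check. The following is a rough sketch, not a diagnostic tool: the helper names `longest_polyq` and `classify_repeat` are hypothetical, the thresholds are the human ranges from the note above, and the test sequence is the first line of the predicted protein on this page, which comes from a different species.

```python
import re

# Human repeat ranges as quoted in the Polymorphism note above.
NORMAL_RANGE = (10, 35)
EXPANDED_RANGE = (36, 120)


def longest_polyq(protein):
    """Length of the longest uninterrupted run of glutamine (Q) residues."""
    runs = re.findall(r"Q+", protein)
    return max((len(run) for run in runs), default=0)


def classify_repeat(n):
    """Hypothetical helper: label a repeat count against the quoted ranges."""
    if NORMAL_RANGE[0] <= n <= NORMAL_RANGE[1]:
        return "normal range"
    if EXPANDED_RANGE[0] <= n <= EXPANDED_RANGE[1]:
        return "expanded range"
    return "outside the ranges quoted"


# First 60 residues of the predicted protein product on this page.
N_TERMINUS = "MATMEKLMKAFESLRSFQQQQSRAPQPLPRPERPLPPPPKSPLRDRSAPSTQPCIPEYQP"

run_length = longest_polyq(N_TERMINUS)
print(run_length, classify_repeat(run_length))
```

The predicted protein carries only a short glutamine run near its N-terminus, so it falls outside both human ranges; the ranges describe human HTT alleles and are used here only to illustrate the comparison.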

Predicted Transcript
     1 ATGGCTACCATGGAGAAGCTGATGAAAGCCTTCGAGTCCCTCCGCTCCTTCCAGCAGCAG
    61 CAGAGCCGGGCGCCTCAGCCGCTTCCTCGGCCGGAGCGGCCGCTGCCTCCGCCGCCGAAG
   121 AGCCCCCTCAGAGACCGATCGGCCCCTTCCACACAGCCCTGTATCCCAGAATATCAACCT
   181 TGGCATAAAAGTATCATTCCCCCCCCCCCCCCAAAAAAAATCCAGAAACCCTCTTATTTT
   241 AGAAAGAAAGAGCTTTCCACAACTAAGAAAGATCGAGTCAATCATTGTCTAACGATTTGT
   301 GAAAATATAGTCGCTCAGTCCTTGAGGAACTCTCCAGAATTTCAAAAGTTGCTGGGAATT
   361 GCTATGGAACTTTTCCTTCTCTGCAGTGATGATGTGGAGTCAGATGTAAGAATGGTTGCT
   421 GATGAATGTCTCAACAAAGTAATAAAAGCTTTGATGGACTCCAATCTTCCTAGGTTACAA
   481 TTAGAACTTTATAAAGAGATTAAAAAGAATGGTGCTTCTCGAAGTTTGCGTGCTGCTCTT
   541 TGGAGGTTTGCTGAACTTGCACACCTTGTTCGACCACAGAAATGCCGACCTTATTTTGTA
   601 AACCTTTTGCCCTGTCTAACACGAGTGAGCAACCGATCTGAGGAGTCGGTCCAAGAGACC
   661 TTGGCTGCAGCAATACCAAAAATCATGGCAGCTTTTGGAAATTTTGCAAATGATAATGAA
   721 ATTAAGATTTTATTGAAGGCATTTGTAGCTAACCTTAAGTCCACCTCTCCTACTATACGT
   781 CGAACGGCAGCTGCATCAGTAGTGAGCATTTGCCAGCATTCACGAAGAACACAATATTTT
   841 TATACTTGGCTGCTGAATGTACTATTAGGTTTATTATTTCCCATGGAAGATGAATATCCC
   901 AGTGTCTTAATTCTTGGGGTATTGCTCACATTAAGATATCTGATACCTTTGCTACAGCAG
   961 CAGGTGAAAGATACAAGTCTTAAAGGTAGCTTTGGTGTGACACAGAAAGAAACAGAGATT
  1021 TCTCCTTCTTTAAATCAACTTGTACAGGTTTATGAATTGAGTCTACATTATACACAGCAC
  1081 CGAGACCACAATGTTGTAACTGGATCATTAGAACTGTTGCAACAGCTTTTGAGAACGCCA
  1141 CCTCCTAATCTTCTGCTTGCCTTGACTACTGCTGGTGGCATTACACAAGCCTGTGTATCC
  1201 AAAGATGTGGTTGCCAGCAGAAACCGCAGTGATAGCATAGTAGAACTTATAGCTGGAGGG
  1261 GGGTCTTCATGCAGCCCTGTTTTTTCCAGAAAGCAAAAAGCTAAAATATTTCTTGGAGGA
  1321 GAAGAAGAAGGTTTAGAAGATGACTCTGAAACAAGATCTGAAGTCAGCAACACAAGCTTT
  1381 GCAGCATCCATGAAGAGTGAGTTAAATAGTGAATTAGCTTCCTCAGACATACCAACAGCG
  1441 ACAAATTCAGTAGCAGATTCGTCAGGACATGACATCATAACTGAGCAACCACGTTCTCAA
  1501 CATACACTACAGTCAGAATCAGACTTGACAAATTGTGAGTTAACAAGTTCAGCTAACAAA
  1561 GGAGATACTGATGATGATGTTCTTAGCAGAAGCTCCAGCCAAATCAGCACTGTCCAGTCA
  1621 GACCCTACAGTAGACCTGAACAGTGGCACTCGGGCATCCTCTCCCATTAGTGACAGCTCC
  1681 CAGACGACTACAGAAGGTCCGGATTCAGCTGTGACTCCTTCAGATTGTTCTGAAATGATT
  1741 ATAGAAGGTGCTGAAAACCAATACTCTGGAATGCAGATTGGACAGTTGCAAGATGAAGAA
  1801 GATGAAACAACCAATGTTCTCCAGGATGAGATGATCATTGATACAAGAAGTTCTTGTCTT
  1861 GAAGCCCTTCAGCAGTCCCATTTATTAAAAAGCATGGGTCACAGCAGACAGCCTTCTACC
  1921 AGCAGCATGGATAAATTTTCAACAAAAGAAGAAACAGAGCCTGGTGACCATGAAAATAAA
  1981 CTTTCCAGAATTAAAGGAGATATAGGGTGTTATACTGATGAAAATGCTGCTCCACTCGTT
  2041 CACTGTGTACGTCTTCTGTCAGCCTCATTTCTGCTTACAGAAGTGAAAGGTGCATTGGTA
  2101 CCAGATAAAGATGTGAGAGTCAGCGTGAAAGCTTTAGCGATAAGTTGTGTGAGTGCAGCA
  2161 GTTGCTCTCCATCCTGAAGCTTTCTTCAGCAAACTATATAAAATGCCTTCTGAAGCCAAC
  2221 ACAGGTGAAGAGGAGTATGTAAAGGATATTATGAATTATATTGATCATGGAGATCCCCAG
  2281 ATTAGAGGAGCAACAGCCATTTTATGTGGAACAATAGTTTATTCCATCCTCACCAAATCT
  2341 CGCTTTGATGTGGAAAACTGGCTAACAAATATAAGAGTTTCAACAGGAAATACATTTTCA
  2401 CTGGTAGATTGTATACTTTTGTTACAAAGAACACTGAAAGATGAATCTTCAGTTACCTGT
  2461 AAACTGGCTTGTACAGCTGTCAGGCATTGTATCATGGCTTTGTGCAGTAGCAGCTATAGT
  2521 GCCTTGGGTGTAAAGTTAATGGTTGATCTCCTTACATTGAGGAATAGTTCTTATTGGCTG
  2581 GTTAGAACTGAGCTGTTAGAAACTCTTGCTGATATTGATTTTCGGCTAATCAGTTTTTTG
  2641 GAAAGAAGGGCCAACAATTTACACAGAGGTACTCATCATTATACTGGACTGTTAAAACTT
  2701 CAAGATCGTGTGCTCAATGATGTAGTAATTTCTCTGCTTGGCGATGAAGACCCCAGAGTG
  2761 AGGCATGTTGCTGCTGCCACTTTAGTAAGGCTTGTTCCAAAGTTCTTTTATAATTGCGAT
  2821 CAAGGACAAGCTGACCCTGTAGTGGCAGTAGCAAGGGACCAAAGTAGTGTCTACCTAAAG
  2881 CTGCTAATGCACGAAACCCAACCTGCTTCTCAGTTTGCAGTAAGCACTATAACCAGAACG
  2941 TACAGAGGATATAACATACTGCAAAGTCCAACAGATGTGACAATGGAAAATAATCTTTCT
  3001 AGAGTTATTTCAGCAGTTTCCCATGCCTTGACAACATCGTCAACACGATCTCTAACATTT
  3061 GGTTGCTTTGAAGCTTTATGCCTTCTGTCCACATCATTTCCAGTTTGTACCTGGAATGTA
  3121 GGATGGCACTGTGGTTTTTACCTGCTGGGTTCTACAGAAGAGTCTCAGAAGAATAGCACC
  3181 ATTGGAATGGCAGGGTTGGTTCTATCACTTCTTTCATCAGCTTGGCTCCCACTAGACCTG
  3241 TCAGCTCATCAAGATGCTTTGATACTGGCTGGAAACTTGCTTGCAGCTAGTGCTTCAAAA
  3301 TCTTTGAAAACCCCTTGGACTAACGAGGATGACACTAATGTTGGTGCTACTAAGCAGGAA
  3361 GAGCCATGGCCAGCTCTAGCAGACAGAACTATAATTGTTTTGATAGAGCAGATGTTCTCC
  3421 CATTTGCTGAAAATCATTAATATTTGTGCTCATGTGTTGGACGATGTCATTCCTGGCCCA
  3481 ACAATAAAGGCAACATTACCATCTCTGACAAATCCACCTTCTTTGAGTCCATTGAAAGGA
  3541 CGCAAAGGAAAAGAGAAAGAGATTGCTGATCAAACCTCTGTGCCAATGAGCCCCAAAAAG
  3601 ACAAGTGAAAACAGCCCAGCCCCTAGGCAAACTGATGCTTCAGGTCCTGCTCCAGCAAGT
  3661 AAATCATCATCACTGGGCAACTTCTATCACCTTCCTTCATATCTGAAGCTATATGATGTT
  3721 TTGAAAGCTACCCATGCCAACTATAAGGTTACACTGGATCTCCAGAACAGCAGTGAAAAG
  3781 TTTGGTGCATTCCTGCGATCTGCTTTGGATGTTCTGTCTCAAATATTAGAACTTGCAACT
  3841 CTTCAGGACATTGGAAAATATGTTGAAGAAATTTTGGGTTATCTGAAATCCTGTTTCAGC
  3901 AGAGAACCAATGATGGCCACTTTGTGTGTACAGCAGTTGCTGAAGACATTGTTTGGCACA
  3961 AACCTGGCTTCACAGTATGATTGCTTGTCTTCAAACCCAAGTAGATCTCAAGGCAAAGCC
  4021 CAACGCTTGGGTTCGTCAAACTTGAGGCCTGGGCTTTATCATTATTGCTTCATGGCTCCT
  4081 TATACACATTTCACTCAGGCACTTGCAGATGCCAGTTTAAGGAATATGATGCAAGCTGAA
  4141 CAGGAACATGATACTTCTGGCTGGTTTGATATATTGCAGAAGGTTTCTTCCCAGCTTAAA
  4201 ACTGGCATGACAAGTGCAGTGAAGCATCGTGCTGACAAGAATATTATTCATAATCACATT
  4261 CGTTTGTTTGAACCCCTGGTCATCAAAGCTTTAAAGCAATATACTACAACAACATCAGTA
  4321 CAACTGCAGAGGCAAGTTTTAGATCTTCTTGCCCAACTCGTTCAGCTGCGGGTTAACTAT
  4381 TGTCTTCTAGACTCTGATCAGGTCTTCATTGGATTTGTTCTAAAGCAGTTTGAATACATT
  4441 GAAGTTGGACAATTCAGGGAATCAGAAGCAATTATTCCTAGTGTCTTTTTCTTCTTGGTG
  4501 CTACTGTCTTATGAACGCTACCACTCCAAGCAAATAATTGGAATCCCTAAAATTATTCAG
  4561 CTGTGTGATGGGATAATGGCTAGTGGAAGAAAAGCTGTTACACATGCAATACCTGCTCTA
  4621 CAGCCAATTGTTCATGATCTCTTTATATTGAGAGGAGCAAACAAAGCAGATGCTGGCAAA
  4681 GAACTGGAGACACAAAAAGAAGTAGTGGTTTCAATGTTGTTGCGACTTATCCAATACCAT
  4741 CAGGTTCTAGAGATGTTCATTCTAGTATTGCAGCAGTGTCACAAAGAGAATGAAGACAAG
  4801 TGGAAAAGATTGTCTCGACAGATAGCAGACATAATTCTCCCAATGCTAGCAAAACAACAG
  4861 AAACAGATGCAGATAGATTCCCATGAAGCTCTGGGAGTATTGAACACTTTATTTGAAATA
  4921 TTGGCACCATCTGCCCTTCGTCCAGTGGACATGCTTTTAAGGAGCATGTTTGTAACTCCA
  4981 AGTACAATGGCTTCAGTGAGCACAGTCCAGTTATGGATATCTGGTATTCTTGCTATTCTC
  5041 CGTGTTTTGATTTCTCAATCAACAGAAGACATTGTGCTTTCTCGTATTCAAGAGCTCTCC
  5101 TTCTCACCATATTTAATCTCACGGCAAACAATTAACAGATTACGCAATGATGAAAATATA
  5161 ACTTCTACAGATCAGGAACAAACTCTAGAAGAGCAAAATAAATATCTGCCTGAAGAGACA
  5221 TTTTCCAGGTTCTTATTACAATTAGTGGGAATTCTTCTTGAAGATATTACTAGCAAACAT
  5281 TTGAATGTGGATATGAATGAGCAGCAACATACATTCTACTGCCAAGAGTTGGGTACACTG
  5341 CTTATGTGCTTAATTCATATTTTCAAATCAGGAATGTTCAGAAGAATCACAGCTGCAGCC
  5401 AGCAAGCTTTTTATAGGAAATGGATTTGATCATCATTTCTATACCCTTGAAACTTTAAAT
  5461 GATCTTATTCAGTCCATGATACCCACCCATCCTTCCTTAGTGCTCCTGTGGTGCCAGATC
  5521 TTACTACTTGTTAATTATACAAACAGCACCTGGTGGTCAGCAGTGCATCAGACGCCAAAA
  5581 AGGCACAGCCTTTCCAGTAGTAAATTGCTTGGTTCTCAAACTTGCGAAGATAATGATGAA
  5641 GTAGATTCGGAATTCAAACTCAGTATGTGCAATCGAGAGATAGTCCGAAGAGGTGCACTC
  5701 ATCCTTTTTTGTGACTATGTGTGCCAAAATCTGCACGATTCTGAACACTTAACCTGGTTG
  5761 ACAGTGAACCATGTTCAAGAACTTATTAATCTGTCTCACGAGCCCCCAGTACAAGACTTA
  5821 ATTAGTGCCATTCACAGGAATTCTGCAGCAAGTGGTCTTTTCATCCAGGCTATTCAATCA
  5881 CGGTGTGAGAACTTTGCAGCTCCAACTTCTGTAAAGAAGGCATTACAATGCCTAGAGGGG
  5941 ATTCACCTGAGTCAATCTGGTGCTGTATTAATACTGTATGTGGATAAGCTTTTGTGTACA
  6001 CCATTCCGTGTGCTGGCTCGCATGGTTGACACTCTCGCTTGTCGCCGGGTGGAAATGCTT
  6061 TTAGCTGAAACTATGCAGAACAGCACTGCTCAGTTACCAGTTGAAGAATTGGATAGAATT
  6121 CAGGAATATCTTCAAAATAGTGGATTAGCCCACAGACACCAAAGACTCTACTCACTCTTA
  6181 GACAGGTTTCGCCTCATTGTCGCTCCAGAAACAATTAGCCCATCACCTGTTGTTACTGCA
  6241 CACCCACTGGATGGGGATAAGCAGCCTGCTCTAGAAACTGTGATTCCAGACAAAGATTGG
  6301 TATATTTCGCTAGTGAAATCACAGTGCTATGTGAAATCATCCTTGGCTCTGCTAGAAGGT
  6361 GCAGAGTTGGTGAATCGACTTCCTCAGCATGAACTGAATTCTTTTTTGATGCATAAGGAC
  6421 TTCAATATAAGTCTTCTCGCTCCATGTTTAAGTCTTGGCATGCATGAAATCTCCAAAGAC
  6481 CAGAAGAGGCTTTTTGAAACTGCCCGTAGGGTAACTCTCAATCACGTGACTGCTGTTGTA
  6541 AAAAAACTTCCCACCAATCACCACGTTTTTCAGCCATTGCAACCATTTGAAACATCAATG
  6601 TATTGGAACAAGCTGAGTGATATCTTTGGTGACACCACGGTGTATCAATCTACGATGGCC
  6661 TTATCTCATGCTTTAAGTCAGTATCTCGTGTTGCTTTCTAAACTGCCAAGCTCTCTGTGC
  6721 ATCCCACGTGAGAAAGAGAGTGACATTCTGAAGTTTGTTGTAATGTTGCTAGAGATGCTT
  6781 TCATGGCATTTAATACACAAATATATTCCACTAAGTATAGATCTCCAGGCTGTGTTGGAT
  6841 TGCTGTTGTTTGGCTTTACAACAGTCTAATTTATGGAATTTGCTGGCTTCAGCAGAATAT
  6901 ATGACCCATGCCTGCTCTGTCATCAACAGTATTCGATTCATTATAGAGGCAGTGGCAGTT
  6961 GAGCCTGGAAATCATCTTCTTGGTCCAGAAAAGAAGAAGAGTCATGTAAAAACTGCTACT
  7021 GAAGATGAAGTAGATTTCCAAGCACATAAATCTGAGTTTATTACAGCAACCTGTGAAACA
  7081 GTGGCAGAGTTGGTGGAATGTTTGCAATCTGTCTTATCCTTGGGACACAAGAAAAACAGC
  7141 AATATCCCAGCATTCCTGACCTCTGTTCTTAAGAACATCATTATCAGTTTATCCAGGCTG
  7201 CCTTTGGTAAACAGCTACACAAGAGTTCCTCCATTGGTCTGGAAGCTGGGCTGGTCTCCA
  7261 AAGCCTACAGGTGATTTTGGCACAACTTTCCCAGAAATCCCTGTAGAGTTTCTTCAAGAG
  7321 AAAGAAGTATTCAAGGAGTTTATTTATCGAATTAACACTTTAGGATGGACAAACCGCATG
  7381 CAGTTTGAAGAAACTTGGGCTACACTTCTCGGCGTTCTGGTAACCCAGCCAATTGTAATG
  7441 GACCAAGAGGAAAACCAGCAAGAGGAAGACACAGAAAGAACACAAATAAATGTTCTAGCT
  7501 GTTCAAGCCATAACATCCTTGGTGTTAAGTGCAATGACAATACCATTTGCTGGCAATCCG
  7561 TCAATTAGCTGCTTGGAACAGCAACCACGAAATAAAGCTTTAAAGGCTCTAGATACCAGG
  7621 TTTGGGAGAAAATTAAGTGTCATCAGAGGAATTGTAGAACAAGAAATCCAAGAAATGGCA
  7681 TCCAAAAGAGACAACATTGCTACGCATCATTTGTATCAAGCATGGGATCCGGTTCCATCT
  7741 CTGTCCCCTTCTTCTACAGGTGCCCTTATCAGCCATGAAAAACTCCTATTGCAGATAAAT
  7801 ACAGAGCGTGAAATAGGAAATATGGGATATAAACTAGGACAGGTGTCTATCCACTCTGTG
  7861 TGGCTTGGAAATAATATTACTCCATTACGAGAAGAAGAGTGGGATGAAGATGAAGACGAC
  7921 GAGGGTGATGTGCCTGTCCCTTCCTCACCACCATCCTCTCCCATCAATTCCAGAAAACAT
  7981 CGTGCTGGTGTTGACATACATTCCTGTTCTCAATTCTTACTGGAGTTGTACAGCCAGTGG
  8041 ATTTTGCCATCAAATCCAAACAAGAGAACTCCAATTGTTCTGATCAGTGAAGTTGTCCGA
  8101 TCTTTATTTATTGTGTCAGAA
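
The transcript listing ends at position 8121 (the last row starts at 8101 and holds 21 bases), which corresponds to 2707 codons read from position 1. As a minimal sketch, assuming the standard genetic code and reading frame 1, the first and last rows can be translated and compared with the predicted protein product; `clean_sequence` and `translate` are illustrative helper names, not part of any tool used to build this page.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def clean_sequence(block):
    """Strip position numbers and whitespace from a numbered listing.

    Only A, C, G and T are kept, so ambiguity codes would be dropped.
    """
    return "".join(ch for ch in block.upper() if ch in "ACGT")


def translate(dna):
    """Translate in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First and last rows of the transcript listing, copied with their numbers.
FIRST_ROW = "     1 ATGGCTACCATGGAGAAGCTGATGAAAGCCTTCGAGTCCCTCCGCTCCTTCCAGCAGCAG"
LAST_ROW = "  8101 TCTTTATTTATTGTGTCAGAA"

print(translate(clean_sequence(FIRST_ROW)))
print(translate(clean_sequence(LAST_ROW)))
```

Because 8100 is a multiple of three, the last row starts on a codon boundary, so translating it on its own stays in frame. Note that the listing contains no terminal stop codon.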

Predicted Protein Product
MATMEKLMKAFESLRSFQQQQSRAPQPLPRPERPLPPPPKSPLRDRSAPSTQPCIPEYQP
WHKSIIPPPPPKKIQKPSYFRKKELSTTKKDRVNHCLTICENIVAQSLRNSPEFQKLLGI
AMELFLLCSDDVESDVRMVADECLNKVIKALMDSNLPRLQLELYKEIKKNGASRSLRAAL
WRFAELAHLVRPQKCRPYFVNLLPCLTRVSNRSEESVQETLAAAIPKIMAAFGNFANDNE
IKILLKAFVANLKSTSPTIRRTAAASVVSICQHSRRTQYFYTWLLNVLLGLLFPMEDEYP
SVLILGVLLTLRYLIPLLQQQVKDTSLKGSFGVTQKETEISPSLNQLVQVYELSLHYTQH
RDHNVVTGSLELLQQLLRTPPPNLLLALTTAGGITQACVSKDVVASRNRSDSIVELIAGG
GSSCSPVFSRKQKAKIFLGGEEEGLEDDSETRSEVSNTSFAASMKSELNSELASSDIPTA
TNSVADSSGHDIITEQPRSQHTLQSESDLTNCELTSSANKGDTDDDVLSRSSSQISTVQS
DPTVDLNSGTRASSPISDSSQTTTEGPDSAVTPSDCSEMIIEGAENQYSGMQIGQLQDEE
DETTNVLQDEMIIDTRSSCLEALQQSHLLKSMGHSRQPSTSSMDKFSTKEETEPGDHENK
LSRIKGDIGCYTDENAAPLVHCVRLLSASFLLTEVKGALVPDKDVRVSVKALAISCVSAA
VALHPEAFFSKLYKMPSEANTGEEEYVKDIMNYIDHGDPQIRGATAILCGTIVYSILTKS
RFDVENWLTNIRVSTGNTFSLVDCILLLQRTLKDESSVTCKLACTAVRHCIMALCSSSYS
ALGVKLMVDLLTLRNSSYWLVRTELLETLADIDFRLISFLERRANNLHRGTHHYTGLLKL
QDRVLNDVVISLLGDEDPRVRHVAAATLVRLVPKFFYNCDQGQADPVVAVARDQSSVYLK
LLMHETQPASQFAVSTITRTYRGYNILQSPTDVTMENNLSRVISAVSHALTTSSTRSLTF
GCFEALCLLSTSFPVCTWNVGWHCGFYLLGSTEESQKNSTIGMAGLVLSLLSSAWLPLDL
SAHQDALILAGNLLAASASKSLKTPWTNEDDTNVGATKQEEPWPALADRTIIVLIEQMFS
HLLKIINICAHVLDDVIPGPTIKATLPSLTNPPSLSPLKGRKGKEKEIADQTSVPMSPKK
TSENSPAPRQTDASGPAPASKSSSLGNFYHLPSYLKLYDVLKATHANYKVTLDLQNSSEK
FGAFLRSALDVLSQILELATLQDIGKYVEEILGYLKSCFSREPMMATLCVQQLLKTLFGT
NLASQYDCLSSNPSRSQGKAQRLGSSNLRPGLYHYCFMAPYTHFTQALADASLRNMMQAE
QEHDTSGWFDILQKVSSQLKTGMTSAVKHRADKNIIHNHIRLFEPLVIKALKQYTTTTSV
QLQRQVLDLLAQLVQLRVNYCLLDSDQVFIGFVLKQFEYIEVGQFRESEAIIPSVFFFLV
LLSYERYHSKQIIGIPKIIQLCDGIMASGRKAVTHAIPALQPIVHDLFILRGANKADAGK
ELETQKEVVVSMLLRLIQYHQVLEMFILVLQQCHKENEDKWKRLSRQIADIILPMLAKQQ
KQMQIDSHEALGVLNTLFEILAPSALRPVDMLLRSMFVTPSTMASVSTVQLWISGILAIL
RVLISQSTEDIVLSRIQELSFSPYLISRQTINRLRNDENITSTDQEQTLEEQNKYLPEET
FSRFLLQLVGILLEDITSKHLNVDMNEQQHTFYCQELGTLLMCLIHIFKSGMFRRITAAA
SKLFIGNGFDHHFYTLETLNDLIQSMIPTHPSLVLLWCQILLLVNYTNSTWWSAVHQTPK
RHSLSSSKLLGSQTCEDNDEVDSEFKLSMCNREIVRRGALILFCDYVCQNLHDSEHLTWL
TVNHVQELINLSHEPPVQDLISAIHRNSAASGLFIQAIQSRCENFAAPTSVKKALQCLEG
IHLSQSGAVLILYVDKLLCTPFRVLARMVDTLACRRVEMLLAETMQNSTAQLPVEELDRI
QEYLQNSGLAHRHQRLYSLLDRFRLIVAPETISPSPVVTAHPLDGDKQPALETVIPDKDW
YISLVKSQCYVKSSLALLEGAELVNRLPQHELNSFLMHKDFNISLLAPCLSLGMHEISKD
QKRLFETARRVTLNHVTAVVKKLPTNHHVFQPLQPFETSMYWNKLSDIFGDTTVYQSTMA
LSHALSQYLVLLSKLPSSLCIPREKESDILKFVVMLLEMLSWHLIHKYIPLSIDLQAVLD
CCCLALQQSNLWNLLASAEYMTHACSVINSIRFIIEAVAVEPGNHLLGPEKKKSHVKTAT
EDEVDFQAHKSEFITATCETVAELVECLQSVLSLGHKKNSNIPAFLTSVLKNIIISLSRL
PLVNSYTRVPPLVWKLGWSPKPTGDFGTTFPEIPVEFLQEKEVFKEFIYRINTLGWTNRM
QFEETWATLLGVLVTQPIVMDQEENQQEEDTERTQINVLAVQAITSLVLSAMTIPFAGNP
SISCLEQQPRNKALKALDTRFGRKLSVIRGIVEQEIQEMASKRDNIATHHLYQAWDPVPS
LSPSSTGALISHEKLLLQINTEREIGNMGYKLGQVSIHSVWLGNNITPLREEEWDEDEDD
EGDVPVPSSPPSSPINSRKHRAGVDIHSCSQFLLELYSQWILPSNPNKRTPIVLISEVVR
SLFIVSE

Protein Alignment to Mouse
sp|P42859|HD_MOUSE Huntingtin OS=Mus musculus GN=Htt PE=1 SV=2
      MGI:96067 Htt huntingtin (Chr 5)
        Length = 3119

 Score = 9780 (3447.8 bits), Expect = 0., Sum P(3) = 0.
 Identities = 1916/2632 (72%), Positives = 2181/2632 (82%)

Query:    82 KKELSTTKKDRVNHCLTICENIVAQSLRNSPEFQKLLGIAMELFLLCSDDVESDVRMVAD 141
             KKELS TKKDRVNHCLTICENIVAQSLRNSPEFQKLLGIAMELFLLCS+D ESDVRMVAD
Sbjct:    68 KKELSATKKDRVNHCLTICENIVAQSLRNSPEFQKLLGIAMELFLLCSNDAESDVRMVAD 127

Query:   142 ECLNKVIKALMDSNLPRLQLELYKEIKKNGASRSLRAALWRFAELAHLVRPQKCRPYFVN 201
             ECLNKVIKALMDSNLPRLQLELYKEIKKNGA RSLRAALWRFAELAHLVRPQKCRPY VN
Sbjct:   128 ECLNKVIKALMDSNLPRLQLELYKEIKKNGAPRSLRAALWRFAELAHLVRPQKCRPYLVN 187

Query:   202 LLPCLTRVSNRSEESVQETLAAAIPKIMAAFGNFANDNEIKILLKAFVANLKSTSPTIRR 261
             LLPCLTR S R EESVQETLAAA+PKIMA+FGNFANDNEIK+LLKAF+ANLKS+SPT+RR
Sbjct:   188 LLPCLTRTSKRPEESVQETLAAAVPKIMASFGNFANDNEIKVLLKAFIANLKSSSPTVRR 247

Query:   262 TAAASVVSICQHSRRTQYFYTWLLNVLLGLLFPMEDEYPSVLILGVLLTLRYLIPLLQQQ 321
             TAA S VSICQHSRRTQYFY WLLNVLLGLL PME+E+ ++LILGVLLTLR L+PLLQQQ
Sbjct:   248 TAAGSAVSICQHSRRTQYFYNWLLNVLLGLLVPMEEEHSTLLILGVLLTLRCLVPLLQQQ 307

Query:   322 VKDTSLKGSFGVTQKETEISPSLNQLVQVYELSLHYTQHRDHNVVXXXXXXXXXXXXXXX 381
             VKDTSLKGSFGVT+KE E+SPS  QLVQVYEL+LH+TQH+DHNVV               
Sbjct:   308 VKDTSLKGSFGVTRKEMEVSPSTEQLVQVYELTLHHTQHQDHNVVTGALELLQQLFRTPP 367

Query:   382 XXXXXXXXXXXXXXXXCVSKDVVASRNRSDSIVELIAGGGSSCSPVFSRKQKAKIFXXXX 441
                              + ++    R RS SIVEL+AGGGSSCSPV SRKQK K+     
Sbjct:   368 PELLQALTTPGGLGQLTLVQEEARGRGRSGSIVELLAGGGSSCSPVLSRKQKGKVLLGEE 427

Query:   442 XXXXXDDSETRSEVSNTSFAASMKSELNSELASSDIPTATNSVADSSGHDIITEQPRSQH 501
                  DDSE+RS+VS+++FAAS+KSE+  ELA+S   +   SV    GHDIITEQPRSQH
Sbjct:   428 EALE-DDSESRSDVSSSAFAASVKSEIGGELAASSGVSTPGSV----GHDIITEQPRSQH 482

Query:   502 TLQSES-DLTNCELTSSANKGDTDDDVLXXXXXXXXXXXXDPTVDLNSGTRASSPISDSS 560
             TLQ++S DL+ C+LTS+A  GD ++D+L            DP +DLN GT+ASSPISDSS
Sbjct:   483 TLQADSVDLSGCDLTSAATDGD-EEDILSHSSSQFSAVPSDPAMDLNDGTQASSPISDSS 541

Query:   561 QTTTEGPDSAVTPSDCSEMIIEGAENQYSGMQIGQLQDEEDE-TTNVLQDEMIIDTRSSC 619
             QTTTEGPDSAVTPSD SE++++GA++QY GMQIGQ Q++++E    VL  E+    R+S 
Sbjct:   542 QTTTEGPDSAVTPSDSSEIVLDGADSQYLGMQIGQPQEDDEEGAAGVLSGEVSDVFRNSS 601

Query:   620 LEALQQSHLLKSMGHSRQPSTSSMDKFSTKEET-EPGDHENKLSRIKGDIGCYTDENAAP 678
             L ALQQ+HLL+ MGHSRQPS SS+DK+ T++E  E  D E+K  RIKGDIG   D+++AP
Sbjct:   602 L-ALQQAHLLERMGHSRQPSDSSIDKYVTRDEVAEASDPESKPCRIKGDIGQPNDDDSAP 660

Query:   679 LVHCVRLLSASFLLTEVKGALVPDKDVRVSVKALAISCVSAAVALHPEAFFSKLYKMP-S 737
             LVHCVRLLSASFLLT  K ALVPD+DVRVSVKALA+SC+ AAVALHPE+FFS+LYK+P +
Sbjct:   661 LVHCVRLLSASFLLTGEKKALVPDRDVRVSVKALALSCIGAAVALHPESFFSRLYKVPLN 720

Query:   738 EANTGEEEYVKDIMNYIDHGDPQIRGATAILCGTIVYSILTKSRFDVENWLTNIRVSTGN 797
                + EE+YV DI+NYIDHGDPQ+RGATAILCGT+VYSIL++SR  V +WL NIR  TGN
Sbjct:   721 TTESTEEQYVSDILNYIDHGDPQVRGATAILCGTLVYSILSRSRLRVGDWLGNIRTLTGN 780

Query:   798 TFSLVDCILLLQRTLKDESSVTCKLACTAVRHCIMALCSSSYSALGVKLMVDLLTLRNSS 857
             TFSLVDCI LLQ+TLKDESSVTCKLACTAVRHC+++LCSSSYS LG++L++D+L L+NSS
Sbjct:   781 TFSLVDCIPLLQKTLKDESSVTCKLACTAVRHCVLSLCSSSYSDLGLQLLIDMLPLKNSS 840

Query:   858 YWLVRTELLETLADIDFRLISFLERRANNLHRGTHHYTGLLKLQDRVLNDVVISLLGDED 917
             YWLVRTELL+TLA+IDFRL+SFLE +A +LHRG HHYTG LKLQ+RVLN+VVI LLGDED
Sbjct:   841 YWLVRTELLDTLAEIDFRLVSFLEAKAESLHRGAHHYTGFLKLQERVLNNVVIYLLGDED 900

Query:   918 PRVRHVAAATLVRLVPKFFYNCDQGQADPVVAVARDQSSVYLKLLMHETQPASQFAVSTI 977
             PRVRHVAA +L RLVPK FY CDQGQADPVVAVARDQSSVYLKLLMHETQP S F+VSTI
Sbjct:   901 PRVRHVAATSLTRLVPKLFYKCDQGQADPVVAVARDQSSVYLKLLMHETQPPSHFSVSTI 960

Query:   978 TRTYRGYNILQSPTDVTMENNLSRVISAVSHALTTSSTRSLTFGCFEALCLLSTSFPVCT 1037
             TR YRGY++L S TDVTMENNLSRV++AVSH L TS+TR+LTFGC EALCLLS +FPVCT
Sbjct:   961 TRIYRGYSLLPSITDVTMENNLSRVVAAVSHELITSTTRALTFGCCEALCLLSAAFPVCT 1020

Query:  1038 WNVGWHCGFYLLGSTEESQKNSTIGMAGLVLSLLSSAWLPLDLSAHQDALILAGNLLAAS 1097
             W++GWHCG   L +++ES+K+ T+GMA ++L+LLSSAW PLDLSAHQDALILAGNLLAAS
Sbjct:  1021 WSLGWHCGVPPLSASDESRKSCTVGMASMILTLLSSAWFPLDLSAHQDALILAGNLLAAS 1080

Query:  1098 ASKSLKTPWTNEDDTNVGATKQEEPWPALADRTIIVLIEQMFSHLLKIINICAHVLDDVI 1157
             A KSL++ WT+E++ N  AT+QEE WPAL DRT++ L+EQ+FSHLLK+INICAHVLDDV 
Sbjct:  1081 APKSLRSSWTSEEEANSAATRQEEIWPALGDRTLVPLVEQLFSHLLKVINICAHVLDDVT 1140

Query:  1158 PGPTIKAXXXXXXXXXXXXXXKGRKGKEKEIADQTSVPMSPKKTSENSPAPRQTDXXXXX 1217
             PGP IKA              + RKGKEKE  +Q S PMSPKK  E S A RQ+D     
Sbjct:  1141 PGPAIKAALPSLTNPPSLSPIR-RKGKEKEPGEQASTPMSPKKVGEASAASRQSDTSGPV 1199

Query:  1218 XXXXXXXLGNFYHLPSYLKLYDVLKATHANYKVTLDLQNSSEKFGAFLRSALDVLSQILE 1277
                    LG+FYHLPSYLKL+DVLKATHANYKVTLDLQNS+EKFG FLRSALDVLSQILE
Sbjct:  1200 TASKSSSLGSFYHLPSYLKLHDVLKATHANYKVTLDLQNSTEKFGGFLRSALDVLSQILE 1259

Query:  1278 LATLQDIGKYVEEILGYLKSCFSREPMMATLCVQQLLKTLFGTNLASQYDCLSSNPSRSQ 1337
             LATLQDIGK VEE+LGYLKSCFSREPMMAT+CVQQLLKTLFGTNLASQ+D LSSNPS+SQ
Sbjct:  1260 LATLQDIGKCVEEVLGYLKSCFSREPMMATVCVQQLLKTLFGTNLASQFDGLSSNPSKSQ 1319

Query:  1338 GKAQRLGSSNLRPGLYHYCFMAPYTHFTQALADASLRNMMQAEQEHDTSGWFDILQKVSS 1397
              +AQRLGSS++RPGLYHYCFMAPYTHFTQALADASLRNM+QAEQE D SGWFD+LQKVS+
Sbjct:  1320 CRAQRLGSSSVRPGLYHYCFMAPYTHFTQALADASLRNMVQAEQERDASGWFDVLQKVSA 1379

Query:  1398 QLKTGMTSAVKHRADKNIIHNHIRLFEPLVIKALKQYTTTTSXXXXXXXXXXXXXXXXXX 1457
             QLKT +TS  K+RADKN IHNHIRLFEPLVIKALKQYTTTTS                  
Sbjct:  1380 QLKTNLTSVTKNRADKNAIHNHIRLFEPLVIKALKQYTTTTSVQLQKQVLDLLAQLVQLR 1439

Query:  1458 XNYCLLDSDQVFIGFVLKQFEYIEVGQFRESEAIIPSVFFFLVLLSYERYHSKQIIGIPK 1517
              NYCLLDSDQVFIGFVLKQFEYIEVGQFRESEAIIP++FFFLVLLSYERYHSKQIIGIPK
Sbjct:  1440 VNYCLLDSDQVFIGFVLKQFEYIEVGQFRESEAIIPNIFFFLVLLSYERYHSKQIIGIPK 1499

Query:  1518 IIQLCDGIMASGRKAVTHAIPALQPIVHDLFILRGANKADAGKELETQKEVVVSMLLRLI 1577
             IIQLCDGIMASGRKAVTHAIPALQPIVHDLF+LRG NKADAGKELETQKEVVVSMLLRLI
Sbjct:  1500 IIQLCDGIMASGRKAVTHAIPALQPIVHDLFVLRGTNKADAGKELETQKEVVVSMLLRLI 1559

Query:  1578 QYHQVLEMFILVLQQCHKENEDKWKRLSRQIADIILPMLAKQQKQMQIDSHEALGVLNTL 1637
             QYHQVLEMFILVLQQCHKENEDKWKRLSRQ+ADIILPMLAKQQ  M IDSHEALGVLNTL
Sbjct:  1560 QYHQVLEMFILVLQQCHKENEDKWKRLSRQVADIILPMLAKQQ--MHIDSHEALGVLNTL 1617

Query:  1638 FEILAPSALRPVDMLLRSMFVTPSTMASVSTVQLWISGILAILRVLISQSTEDIVLSRIQ 1697
             FEILAPS+LRPVDMLLRSMF+TPSTMASVSTVQLWISGILAILRVLISQSTEDIVL RIQ
Sbjct:  1618 FEILAPSSLRPVDMLLRSMFITPSTMASVSTVQLWISGILAILRVLISQSTEDIVLCRIQ 1677

Query:  1698 ELSFSPYLISRQTINRLRNDE-NITSTDQEQTLEEQNKYLPEETFSRFLLQLVGILLEDI 1756
             ELSFSP+L+S   INRLR    N+T  +     E + K LPE+TFSRFLLQLVGILLEDI
Sbjct:  1678 ELSFSPHLLSCPVINRLRGGGGNVTLGECS---EGKQKSLPEDTFSRFLLQLVGILLEDI 1734

Query:  1757 TSKHLNVDMNEQQHTFYCQELGTLLMCLIHIFKSGMFRRITAAASKLFIGNGFDHHFYTL 1816
              +K L VDM+EQQHTFYCQELGTLLMCLIHIFKSGMFRRITAAA++LF  +G +  FYTL
Sbjct:  1735 VTKQLKVDMSEQQHTFYCQELGTLLMCLIHIFKSGMFRRITAAATRLFTSDGCEGSFYTL 1794

Query:  1817 ETLNDLIQSMIPTHPSXXXXXXXXXXXVNYTNSTWWSAVHQTPKRHSLSSSKLLGSQTCE 1876
             E+LN  ++SM+PTHP+           +N+T+  WW+ V QTPKRHSLS +K L  Q   
Sbjct:  1795 ESLNARVRSMVPTHPALVLLWCQILLLINHTDHRWWAEVQQTPKRHSLSCTKSLNPQKSG 1854

Query:  1877 DNDEVDSEFKLSMCNREIVRRGALILFCDYVCQNLHDSEHLTWLTVNHVQELINLSHEPP 1936
             + ++  S  +L MCNREIVRRGALILFCDYVCQNLHDSEHLTWL VNH+Q+LI+LSHEPP
Sbjct:  1855 EEEDSGSAAQLGMCNREIVRRGALILFCDYVCQNLHDSEHLTWLIVNHIQDLISLSHEPP 1914

Query:  1937 VQDLISAIHRNSAASGLFIQAIQSRCENFAAPTSVKKALQCLEGIHLSQSGAVLILYVDK 1996
             VQD ISAIHRNSAASGLFIQAIQSRCEN + PT++KK LQCLEGIHLSQSGAVL LYVD+
Sbjct:  1915 VQDFISAIHRNSAASGLFIQAIQSRCENLSTPTTLKKTLQCLEGIHLSQSGAVLTLYVDR 1974

Query:  1997 LLCTPFRVLARMVDTLACRRVEMLLAETMQNSTAQLPVEELDRIQEYLQNSGLAHRHQRL 2056
             LL TPFR LARMVDTLACRRVEMLLA  +Q+S AQLP EEL+RIQE+LQNSGLA RHQRL
Sbjct:  1975 LLGTPFRALARMVDTLACRRVEMLLAANLQSSMAQLPEEELNRIQEHLQNSGLAQRHQRL 2034

Query:  2057 YSLLDRFRLIVAPETISPSPVVTAHPLDGDKQPALETVIPDKDWYISLVKSQCYVKSSLA 2116
             YSLLDRFRL    +++SP P VT+HPLDGD   +LETV PDKDWY+ LV+SQC+ +S  A
Sbjct:  2035 YSLLDRFRLSTVQDSLSPLPPVTSHPLDGDGHTSLETVSPDKDWYLQLVRSQCWTRSDSA 2094

Query:  2117 LLEGAELVNRLPQHELNSFLMHKDFNISLLAPCLSLGMHEISKDQKR-LFETARRVTLNH 2175
             LLEGAELVNR+P  ++N F+M  +FN+SLLAPCLSLGM EI+  QK  LFE AR V LN 
Sbjct:  2095 LLEGAELVNRIPAEDMNDFMMSSEFNLSLLAPCLSLGMSEIANGQKSPLFEAARGVILNR 2154

Query:  2176 VTAVVKKLPTNHHVFQPLQPFETSMYWNKLSDIFGDTTVYQSTMALSHALSQYLVLLSKL 2235
             VT+VV++LP  H VFQP  P E + YWNKL+D+ GDTT YQS   L+ AL+QYLV+LSK+
Sbjct:  2155 VTSVVQQLPAVHQVFQPFLPIEPTAYWNKLNDLLGDTTSYQSLTILARALAQYLVVLSKV 2214

Query:  2236 PSSLCIPREKESDILKFVVMLLEMLSWHLIHKYIPLSIDLQAVLDCCCLALQQSNLWNLL 2295
             P+ L +P EKE D +KFVVM +E LSWHLIH+ IPLS+DLQA LDCCCLALQ   LW +L
Sbjct:  2215 PAHLHLPPEKEGDTVKFVVMTVEALSWHLIHEQIPLSLDLQAGLDCCCLALQVPGLWGVL 2274

Query:  2296 ASAEYMTHACSVINSIRFIIEAVAVEPGNHLLGPEKKKSHVKTATEDEVDFQAHKSEFIT 2355
             +S EY+THACS+I+ +RFI+EA+AV+PG+ LLGPE +    +   ++EVD        +T
Sbjct:  2275 SSPEYVTHACSLIHCVRFILEAIAVQPGDQLLGPESRSHTPRAVRKEEVDSDIQNLSHVT 2334

Query:  2356 ATCETVAELVECLQSVLSLGHKKNSNIPAFLTSVLKNIIISLSRLPLVNSYTRVPPLVWK 2415
             + CE VA++VE LQSVL+LGHK+NS +P+FLT+VLKNI+ISL+RLPLVNSYTRVPPLVWK
Sbjct:  2335 SACEMVADMVESLQSVLALGHKRNSTLPSFLTAVLKNIVISLARLPLVNSYTRVPPLVWK 2394

Query:  2416 LGWSPKPTGDFGTTFPEIPVEFLQEKEVFKEFIYRINTLGWTNRMQFEETWATLLGVLVT 2475
             LGWSPKP GDFGT FPEIPVEFLQEKE+ KEFIYRINTLGWTNR QFEETWATLLGVLVT
Sbjct:  2395 LGWSPKPGGDFGTVFPEIPVEFLQEKEILKEFIYRINTLGWTNRTQFEETWATLLGVLVT 2454

Query:  2476 QPIVMXXXXXXXXXXXXXXXINVLAVQAITSLVLSAMTIPFAGNPSISCLEQQPRNKALK 2535
             QP+VM               I+VLAVQAITSLVLSAMT+P AGNP++SCLEQQPRNK LK
Sbjct:  2455 QPLVMEQEESPPEEDTERTQIHVLAVQAITSLVLSAMTVPVAGNPAVSCLEQQPRNKPLK 2514

Query:  2536 ALDTRFGRKLSVIRGIVEQEIQEMASKRDNIATHHLYQAWDPVPSLSPSSTGALISHEKL 2595
             ALDTRFGRKLS+IRGIVEQEIQEM S+R+N ATHH +QAWDPVPSL P++TGALISH+KL
Sbjct:  2515 ALDTRFGRKLSMIRGIVEQEIQEMVSQRENTATHHSHQAWDPVPSLLPATTGALISHDKL 2574

Query:  2596 LLQINTEREIGNMGYKLGQVSIHSVWLGNNITPLRXXXXXXXXXXXXXXXXXXXXXXXXI 2655
             LLQIN ERE GNM YKLGQVSIHSVWLGNNITPLR                        +
Sbjct:  2575 LLQINPEREPGNMSYKLGQVSIHSVWLGNNITPLREEEWDEEEEEESDVPAPTSPPVSPV 2634

Query:  2656 NSRKHRAGVDIHSCSQFLLELYSQWILPSNPNKRTPIVLISEVVRSLFIVSE 2707
             NSRKHRAGVDIHSCSQFLLELYS+WILPS+  +RTP++LISEVVRSL +VS+
Sbjct:  2635 NSRKHRAGVDIHSCSQFLLELYSRWILPSSAARRTPVILISEVVRSLLVVSD 2686

 Score = 95 (38.5 bits), Expect = 0., Sum P(3) = 0.
 Identities = 19/21 (90%), Positives = 21/21 (100%)

Query:     1 MATMEKLMKAFESLRSFQQQQ 21
             MAT+EKLMKAFESL+SFQQQQ
Sbjct:     1 MATLEKLMKAFESLKSFQQQQ 21

 Score = 39 (18.8 bits), Expect = 0., Sum P(3) = 0.
 Identities = 21/78 (26%), Positives = 37/78 (47%)

Query:  2525 LEQQPRNKALKALDTRFGRKLSVIRGIVEQEIQEMASKRDNIATHHLYQ-------AWDP 2577
             LE+   ++ L  LDT    KLSV R  V+   + MA+    +   +  +       A DP
Sbjct:  2859 LERLLLSEQLSRLDTESLVKLSVDRVNVQSPHRAMAALGLMLTCMYTGKEKASPGRASDP 2918

Query:  2578 VPSLSPSSTGALISHEKL 2595
              P+ +P S   +++ E++
Sbjct:  2919 SPA-TPDSESVIVAMERV 2935
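
The Identities and Positives figures in each segment can be recomputed from the middle line of the alignment, where a letter marks an identical residue and `+` marks a conservative substitution. The sketch below does this for the short N-terminal segment (query residues 1 to 21); `midline_stats` is a hypothetical helper, and the percentages are rounded down on the assumption that this matches how the report prints them.

```python
def midline_stats(query, midline):
    """Count identities and positives for one alignment segment.

    Letters in the midline are identities; letters plus '+' are positives.
    The alignment length is taken from the query line.
    """
    midline = midline.ljust(len(query))  # restore trailing blanks if stripped
    identities = sum(ch.isalpha() for ch in midline)
    positives = identities + midline.count("+")
    return identities, positives, len(query)


# Query and middle line of the segment covering query residues 1 to 21.
QUERY = "MATMEKLMKAFESLRSFQQQQ"
MIDLINE = "MAT+EKLMKAFESL+SFQQQQ"

ident, pos, length = midline_stats(QUERY, MIDLINE)
print(f"Identities = {ident}/{length} ({100 * ident // length}%)")
print(f"Positives = {pos}/{length} ({100 * pos // length}%)")
```

For multi-row segments, the query and middle lines of each row would need to be concatenated before counting, with the leading label and coordinates removed so that the columns stay aligned.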