Barmak Modrek, Alissa Resch, Catherine Grasso and Christopher Lee*
Department of Chemistry and Biochemistry, University of California, 611 Charles E. Young Drive East, Los Angeles, CA 90095-1570, USA
* To whom correspondence should be addressed. Tel: +1 310 825 7374; Fax: +1 310 267 0248; Email: leec@mbi.ucla.edu
We have identified 6201 alternative splice relationships in human
genes, through a genome-wide analysis of expressed sequence tags (ESTs).
Starting with 2.1 million human mRNA and EST sequences, we mapped expressed
sequences onto the draft human genome sequence and only accepted splices
that obeyed the standard splice site consensus. A large fraction (47%)
of these were observed multiple times, indicating that they comprise a
substantial fraction of the mRNA species. The vast majority of the detected
alternative forms appear to be novel, and produce highly specific, biologically
meaningful control of function in both known and novel human genes, e.g.
specific removal of the lysosomal targeting signal from the HLA-DM β
chain, and replacement of the C-terminal transmembrane domain and cytoplasmic
tail in an Fc receptor β chain homolog with a different transmembrane
domain and cytoplasmic tail, likely modulating its signal transduction
activity. Our data indicate that a large proportion of human genes, probably
42% or more, are alternatively spliced, but that alternative splicing appears
to occur mainly in certain types of molecules (e.g. cell surface receptors) and
systemic functions, particularly the immune system and nervous system.
These results provide a comprehensive dataset for understanding the role
of alternative splicing in the human genome, accessible at:
http://www.bioinformatics.ucla.edu/HASDB
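
The consensus filter and the "observed multiple times" tally described above can be illustrated with a minimal sketch. It assumes that "standard splice site consensus" means the canonical GT...AG intron boundaries, and that each candidate splice is given as a half-open (start, end) intron interval on the genomic strand. The function names, the toy sequence, and the coordinates are hypothetical and are not taken from the authors' pipeline.

```python
from collections import Counter

def obeys_gt_ag(genome: str, intron_start: int, intron_end: int) -> bool:
    """True if genome[intron_start:intron_end] starts with GT and ends with AG."""
    intron = genome[intron_start:intron_end].upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")

def count_consensus_splices(genome: str, observed) -> Counter:
    """Count observations of each splice, keeping only GT...AG introns.

    `observed` holds one (intron_start, intron_end) pair per aligned
    mRNA/EST that supports the splice (an assumed input format).
    """
    return Counter(s for s in observed if obeys_gt_ag(genome, *s))

def fraction_observed_multiple(counts: Counter) -> float:
    """Fraction of accepted splices supported by more than one sequence."""
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n > 1) / len(counts)

# Toy genomic sequence (hypothetical): exon, intron, exon, intron, exon.
GENOME = "ACCAG" + "GTAAGTTTCCAG" + "GATTC" + "GTCCCCTTAG" + "TTGA"

# Two ESTs support the first intron, one supports the second, and one
# alignment implies an intron that does not start with GT (rejected).
OBSERVED = [(5, 17), (5, 17), (22, 32), (6, 17)]

COUNTS = count_consensus_splices(GENOME, OBSERVED)
MULTI = fraction_observed_multiple(COUNTS)
```

Here the misaligned candidate (6, 17) is discarded by the consensus check, and one of the two accepted splices is supported by more than one sequence. A real analysis would also have to handle reverse-strand alignments and sequencing errors, which this sketch ignores.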