PUTATIVE DRUG TARGET IDENTIFICATION FOR CHLAMYDIA TRACHOMATIS: AN IN SILICO PROTEOME ANALYSIS
A. Praveena*, R. Sindhuja, V. Anuradha,

The availability of whole genome sequences for pathogenic bacteria and for their hosts, such as human, has enabled the subtractive genomic approach, which can be used to identify potent vaccine and drug targets. In the present study, a subtractive genomic approach was used to identify therapeutic targets in Chlamydia trachomatis. C. trachomatis infection is now the most common sexually transmitted disease worldwide. The BlastP search against Homo sapiens revealed 551 non-homologous protein sequences out of 874 in C. trachomatis. Further analysis of these non-human homologous proteins predicted that 142 essential proteins are involved in unique metabolic pathways of C. trachomatis. Sub-cellular localization of the essential proteins was predicted in order to identify membrane proteins that can be used as vaccine targets. The current study found 63 unique, essential, non-human homologous therapeutic targets, which play a vital role in peptidoglycan biosynthesis, the phosphotransferase system, fatty acid biosynthesis and the bacterial secretion system of C. trachomatis.


INTRODUCTION
The completion of the human genome project and the sequencing of pathogenic bacterial genomes have increased the chances of identifying potent drug targets against life-threatening human pathogens. A number of bioinformatics tools and public databases have been developed to facilitate in silico analysis of gene sequence information 1 . Drug resistance among important microorganisms is a major challenge in modern medicine. C. trachomatis, an obligate intracellular Gram-negative human pathogen, is one of three bacterial species in the genus Chlamydia 2 . C. trachomatis was the first chlamydial agent discovered in humans, in 1907 3 . Numerous factors contribute to the pathogenicity of C. trachomatis.
Colonization by Chlamydia begins with attachment to sialic acid receptors on the eye, throat, or genitalia. In humans, infection is asymptomatic in many individuals, so treating only those with clinical symptoms is not the best way of controlling the spread of infection 4 . Empirical attempts were made in the 1960s and early 1970s to prevent trachoma caused by chlamydial infections using vaccination. Renewed attempts to protect against C. trachomatis infection were made in the mid 1980s following work on the major outer membrane protein (MOMP) of C. trachomatis 5,6 . As yet there is no consensus as to what constitutes a protective immune response against genital Chlamydia infection. One important strategy for identifying novel drug targets is to find bacterial genes that are non-homologous to human genes and important for the survival of the bacterium. Subtractive genomics and bioinformatics provide opportunities for finding drug targets against pathogens 7 . In the present study, the genome of C. trachomatis is compared with the host genome, Homo sapiens, to identify potential unique targets for the development of effective drugs and vaccines.
IJBR 2[2] [2011]151-160 www.ijbr.ssjournals.com

MATERIAL AND METHODS

Sequence Collection
The genome sequence of C. trachomatis (Accession number: NC_010287.1) was collected from the NCBI (National Center for Biotechnology Information) database. All 874 protein sequences of C. trachomatis were downloaded in FASTA format from the NCBI protein table. Protein sequences shorter than 100 amino acids were excluded from further study, because coding sequences of fewer than 100 amino acids are less likely to represent essential genes.
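The length filter described above can be illustrated with a minimal Python sketch. It assumes the proteome is available as FASTA-formatted text; the function names `parse_fasta` and `filter_by_length` and the toy records are hypothetical and are not taken from the study.

```python
from typing import Dict, Iterable

MIN_LENGTH = 100  # residues; the cut-off stated in the text


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {identifier: sequence}."""
    records: Dict[str, str] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the identifier.
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            records[current] = ""
        elif current is not None:
            # Sequence lines may be wrapped, so append to the current record.
            records[current] += line
    return records


def filter_by_length(records: Dict[str, str], min_length: int = MIN_LENGTH) -> Dict[str, str]:
    """Keep only sequences with at least `min_length` residues."""
    return {name: seq for name, seq in records.items() if len(seq) >= min_length}


# Toy input (hypothetical sequences, not real C. trachomatis proteins).
example = [
    ">short hypothetical protein", "M" * 50,
    ">long hypothetical protein", "M" * 60, "A" * 60,
    ">edge hypothetical protein", "M" * 100,
]
kept = filter_by_length(parse_fasta(example))
```

Here a sequence of exactly 100 residues is retained, since the text excludes only sequences with fewer than 100 amino acids.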

Identification of Paralogs
Paralogous sequences were identified at a similarity threshold of 60% using the CD-HIT suite. The paralogs were excluded and the remaining set of proteins was used for further analysis.
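As an illustration of how clustered output might be turned into a list of paralogs to exclude, the sketch below parses text in the CD-HIT `.clstr` cluster format. The study used the CD-HIT suite; whether its results were processed this way is an assumption, and the function names and toy cluster listing are hypothetical.

```python
import re
from typing import Dict, Iterable, List

# A member line looks like: "1<TAB>348aa, >protB... at 72.50%"
MEMBER = re.compile(r">(\S+?)\.\.\.")


def parse_clstr(lines: Iterable[str]) -> List[Dict]:
    """Parse CD-HIT .clstr text into a list of clusters."""
    clusters: List[Dict] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            clusters.append({"representative": None, "members": []})
            continue
        match = MEMBER.search(line)
        if match is None or not clusters:
            continue
        name = match.group(1)
        clusters[-1]["members"].append(name)
        if line.endswith("*"):
            # CD-HIT marks the representative sequence of a cluster with "*".
            clusters[-1]["representative"] = name
    return clusters


def paralog_candidates(clusters: List[Dict]) -> List[str]:
    """Return non-representative members, i.e. sequences that cluster with another."""
    redundant: List[str] = []
    for cluster in clusters:
        redundant.extend(m for m in cluster["members"] if m != cluster["representative"])
    return redundant


# Toy cluster listing (hypothetical identifiers).
example = [
    ">Cluster 0",
    "0\t350aa, >protA... *",
    "1\t348aa, >protB... at 72.50%",
    ">Cluster 1",
    "0\t200aa, >protC... *",
]
clusters = parse_clstr(example)
to_exclude = paralog_candidates(clusters)
```

A run in which every cluster holds a single sequence yields an empty exclusion list.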

Similarity Search
The non-paralogous proteins were subjected to a BlastP 8 search against Homo sapiens, using a threshold expectation value (E-value) of 10^-3, to find the non-human homologous proteins of C. trachomatis. The human homologs were excluded and the list of non-homologs was compiled.
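The subtraction step can be sketched as follows, assuming the BlastP results are available in the 12-column tabular format of BLAST+ (`-outfmt 6`), where the first column is the query identifier and the eleventh is the E-value. The output format actually used in the study is not stated, and the function names and toy rows are hypothetical.

```python
from typing import Iterable, List, Set

EVALUE_COL = 10  # 0-based index of the E-value column in BLAST tabular output


def queries_with_human_hit(rows: Iterable[str], max_evalue: float = 1e-3) -> Set[str]:
    """Collect query IDs that have at least one hit at or below the E-value threshold."""
    hits: Set[str] = set()
    for raw in rows:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Assumption: a hit exactly at the threshold counts as a homolog.
        if float(fields[EVALUE_COL]) <= max_evalue:
            hits.add(fields[0])
    return hits


def non_human_homologs(all_ids: Iterable[str], rows: Iterable[str],
                       max_evalue: float = 1e-3) -> List[str]:
    """Return query IDs with no significant human hit, preserving input order."""
    homologs = queries_with_human_hit(rows, max_evalue)
    return [pid for pid in all_ids if pid not in homologs]


# Toy tabular rows (hypothetical identifiers and values).
rows = [
    "protA\thumanX\t45.0\t300\t150\t5\t1\t300\t1\t300\t2e-50\t180",
    "protB\thumanY\t28.0\t60\t40\t2\t5\t64\t10\t69\t0.8\t25.0",
]
retained = non_human_homologs(["protA", "protB", "protC"], rows)
```

Proteins with no hit at all (such as `protC` in the toy input) are retained alongside those whose best hit is weaker than the threshold.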

Essential Protein Search
The selected non-human homologous proteins were then subjected to a similarity search using BlastP against the Database of Essential Genes (DEG) (http://tubic.tju.edu.cu/deg1). An expectation value (E-value) cut-off of 0.001 and a sequence identity of 30% or above were used to select proteins likely to be essential.
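A minimal sketch of the two-criterion filter follows, again assuming BLAST+ tabular output (`-outfmt 6`) in which column 3 is percent identity, column 11 the E-value and column 12 the bit score. The `Hit` class, the function names and the optional `min_bitscore` argument are illustrative assumptions, and the rows are invented.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Hit:
    query: str
    subject: str
    identity: float  # percent identity
    evalue: float
    bitscore: float


def parse_hits(rows: Iterable[str]) -> List[Hit]:
    """Parse BLAST tabular rows into Hit records."""
    hits: List[Hit] = []
    for raw in rows:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(Hit(f[0], f[1], float(f[2]), float(f[10]), float(f[11])))
    return hits


def essential_candidates(hits: Iterable[Hit], max_evalue: float = 1e-3,
                         min_identity: float = 30.0,
                         min_bitscore: Optional[float] = None) -> Set[str]:
    """Return queries with at least one hit passing every active criterion."""
    selected: Set[str] = set()
    for h in hits:
        if h.evalue > max_evalue or h.identity < min_identity:
            continue
        # Optional score cut-off; applied only when a value is supplied.
        if min_bitscore is not None and h.bitscore <= min_bitscore:
            continue
        selected.add(h.query)
    return selected


# Toy rows (hypothetical identifiers and values).
rows = [
    "protA\tDEG1\t45.0\t300\t150\t5\t1\t300\t1\t300\t1e-40\t150",
    "protB\tDEG2\t25.0\t200\t140\t6\t1\t200\t1\t200\t1e-10\t60",
    "protC\tDEG3\t35.0\t80\t50\t2\t1\t80\t1\t80\t0.01\t30",
    "protD\tDEG4\t32.0\t150\t95\t4\t1\t150\t1\t150\t1e-5\t48",
]
hits = parse_hits(rows)
candidates = essential_candidates(hits)
```

In the toy input, `protB` fails the identity criterion and `protC` fails the E-value criterion, so only `protA` and `protD` are kept.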

Functional classification of hypothetical proteins
Functional family prediction of the putative uncharacterized essential proteins was performed using the SVMProt web server (http://jing.cz3.nus.edu.sg/cgibin/svmprot.cgi) 9 .

Subcellular Localisation
Prediction of protein localization is important for identifying surface membrane proteins, which could be feasible vaccine targets. Sub-cellular localization analysis of the essential protein sequences was performed using the PSORTb v3.0 server (http://www.psort.org/psortb/).
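Once a localization label is available for each protein, tallying the categories and picking out membrane proteins is straightforward. The sketch below assumes the predictions have already been collected into a mapping from protein identifier to a PSORTb-style label for Gram-negative bacteria; the mapping, the function names and the choice of which labels count as "membrane" are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List

# Assumption: these two labels are treated as membrane localizations.
MEMBRANE_LABELS = {"CytoplasmicMembrane", "OuterMembrane"}


def tally_localizations(predictions: Dict[str, str]) -> Counter:
    """Count how many proteins fall into each localization category."""
    return Counter(predictions.values())


def membrane_proteins(predictions: Dict[str, str]) -> List[str]:
    """Return identifiers of proteins predicted to be membrane localized."""
    return sorted(pid for pid, label in predictions.items() if label in MEMBRANE_LABELS)


# Toy predictions (hypothetical identifiers).
predictions = {
    "protA": "Cytoplasmic",
    "protB": "OuterMembrane",
    "protC": "CytoplasmicMembrane",
    "protD": "Unknown",
    "protE": "Cytoplasmic",
}
counts = tally_localizations(predictions)
vaccine_candidates = membrane_proteins(predictions)
```

Proteins labelled `Unknown` are kept in the tally but never proposed as candidates.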

Therapeutic target identification
The essential proteins (enzymes) of C. trachomatis found in the DEG database were searched against the Therapeutic Target Database (TTD). The essential proteins (enzymes) that matched TTD entries were then searched in the KEGG database to identify pathways unique to C. trachomatis when compared against human metabolic pathways.
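The pathway comparison amounts to a set difference between the pathways annotated for the pathogen and those annotated for the host. The sketch below shows this with hand-typed pathway lists; the identifiers are KEGG reference map numbers entered for illustration only, not retrieved from KEGG for this study, and the function name is hypothetical.

```python
from typing import Dict

# Illustrative, hand-typed inputs: {KEGG map number: pathway name}.
pathogen_pathways: Dict[str, str] = {
    "00010": "Glycolysis / Gluconeogenesis",
    "00550": "Peptidoglycan biosynthesis",
    "02060": "Phosphotransferase system (PTS)",
    "03070": "Bacterial secretion system",
}
host_pathways: Dict[str, str] = {
    "00010": "Glycolysis / Gluconeogenesis",
    "00020": "Citrate cycle (TCA cycle)",
}


def unique_pathways(pathogen: Dict[str, str], host: Dict[str, str]) -> Dict[str, str]:
    """Return pathways present in the pathogen but absent from the host."""
    return {pid: name for pid, name in pathogen.items() if pid not in host}


unique = unique_pathways(pathogen_pathways, host_pathways)
```

Comparing on map numbers rather than names avoids mismatches caused by small differences in how a pathway title is written.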

RESULTS
In the present study, a comparative metabolic pathway analysis was used to find potential targets against C. trachomatis. Only those enzymes that are unique relative to the host were selected as targets (Table 1). The genome of C. trachomatis consists of 1,038,842 nucleotides and 934 genes, encodes 874 proteins and has a GC content of 41%. The 874 proteins of C. trachomatis were compared extensively with the proteins encoded by the H. sapiens genome. Fifty-five protein sequences were excluded from the total list, since coding sequences of fewer than 100 amino acids are less likely to represent essential genes. The CD-HIT suite results showed no duplicate proteins at an identity threshold of 60% or above. The remaining 819 sequences were therefore screened for non-human homologous sequences using the BlastP program. The BlastP search yielded 551 non-human homologous sequences, which were shortlisted based on an E-value of 0.005. These sequences were further analysed to identify essential genes using the DEG database server, with a cut-off score of >100 to enhance the specificity of the enzymes in C. trachomatis. A total of 142 proteins were found to be essential for the C. trachomatis life cycle.
The functional families of 7 hypothetical proteins predicted to be essential in the DEG database were identified using the SVMProt tool. The SVMProt results indicated that the hypothetical proteins belong mainly to the zinc-binding, DNA-binding, iron-binding, lipid-binding, metal-binding and transmembrane protein families, with high P-values, where the P-value is the expected classification accuracy expressed as a percentage (Table 2). Membrane-associated proteins have been suggested to be better targets for vaccine development. The subcellular localization results showed that 98 proteins were located in the cytoplasm, 1 was extracellular, 26 were membrane bound and 16 had no positive prediction (Graph 1). Out of 142 unique essential proteins, 63 matched significantly in the TTD search and were selected as therapeutic targets. The 63 putative therapeutic targets were then matched against the KEGG database to identify the unique pathways. The results identified 10 unique enzymes out of the 63, mainly involved in bacterial virulence pathways such as peptidoglycan biosynthesis, the phosphotransferase system, fatty acid biosynthesis and the bacterial secretion system (Table 3).
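The screening funnel reported in this section can be re-tallied with a short Python sketch. The counts are those stated in the text; the step labels and the `retention` helper are hypothetical names introduced for illustration.

```python
from typing import List, Tuple

# Counts as reported in the text; labels are informal descriptions.
FUNNEL: List[Tuple[str, int]] = [
    ("proteins in the annotated proteome", 874),
    ("after removing sequences under 100 aa", 874 - 55),
    ("non-human homologs (BlastP)", 551),
    ("essential proteins (DEG)", 142),
    ("selected as therapeutic targets", 63),
    ("unique enzymes in KEGG pathways", 10),
]


def retention(funnel: List[Tuple[str, int]]) -> List[Tuple[str, int, float]]:
    """For each step after the first, return (label, count, fraction of previous step)."""
    result = []
    for (_, before), (label, after) in zip(funnel, funnel[1:]):
        result.append((label, after, after / before))
    return result


for label, count, fraction in retention(FUNNEL):
    print(f"{label}: {count} ({fraction:.1%} of previous step)")
```

The second entry reproduces the 819 sequences carried into the BlastP search, and the difference between the second and third entries gives 268 sequences removed as human homologs.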

DISCUSSION
Computational approaches have been used to investigate novel drug targets in pathogenic organisms such as Pseudomonas aeruginosa 10, 11 and Helicobacter pylori 12 . Because antibacterials are essentially inhibitors of certain bacterial enzymes, all enzymes specific to bacteria can be considered potential drug targets 13 . The bacterial envelope and secretion systems are of particular interest in antimicrobial compound discovery. The bacterial envelope is known to actively extrude drugs via the action of efflux pumps. A unique multi-drug efflux system (BpeAB-OprB) confers on B. pseudomallei resistance to aminoglycoside and macrolide antibiotics 14 . Thus, active compounds that can destabilize bacterial cell membranes and disrupt lipopolysaccharides are good drug candidates. In addition, the bacterial secretion machinery is important for survival. Despite some similarity between prokaryotic and eukaryotic secretion systems, the differences in the bacterial secretion process are sufficient to infer that these systems might be useful as drug targets without the risk of disrupting host-cell functions 15 .
Almost all bacterial species share the common feature of a cell wall, which helps maintain cell structure and allows the bacterium to withstand tremendous internal pressure (up to 350 lbs/cm2) without bursting. The cell wall is composed of peptidoglycan, teichoic acids, and proteins 16 . Chemical analysis of the cell wall indicates that more than 70% of its weight is peptidoglycan and that the teichoic acid is covalently bound to the peptidoglycan through a phosphodiester bond 17 . Peptidoglycan biosynthesis therefore plays a vital role in the enlargement of the cell wall. The unique enzymes found in peptidoglycan biosynthesis of C. trachomatis are UDP-N-acetylmuramoyl-L-alanyl-D-glutamate synthetase, N-acetylglucosaminyl transferase, UDP-N-acetylmuramate-alanine ligase and penicillin-binding protein.
The phosphoenolpyruvate (PEP)-dependent phosphotransferase system (PTS) is a major mechanism used by bacteria for the uptake of energy sources such as carbohydrates, particularly hexoses, hexitols, and disaccharides. The unique non-human homologous enzyme found in the phosphotransferase system of C. trachomatis is the PTS family membrane transport protein IIA subunit.
Genomic analysis has revealed many novel potential targets for antimicrobial drugs, and many of them are in essential and conserved metabolic pathways or cell-cell communication systems 18,19,20 . One such mechanism, necessary for the virulence of microbes, is the bacterial secretion system. Pathogenic bacteria need virulence factors in order to infect their hosts and to survive the immune response 21, 22 . The secretion systems used for this purpose are in many cases very important or essential for bacterial virulence, and they are grouped into five classes according to their protein composition, amino acid similarities and mechanism. The unique enzymes found in the bacterial secretion system of C. trachomatis are preprotein translocase subunit SecA and general secretion pathway protein E.
Fatty acids are among the most important building blocks of cellular materials. In bacterial cells, fatty acids occur mainly in the cell membranes as the acyl constituents of phospholipids. The unique enzyme of C. trachomatis involved in fatty acid biosynthesis is (3R)-hydroxymyristoyl-ACP dehydratase.
Subtractive genomics studies between the host and pathogen genomes thus provide details about the proteins likely to be essential to the pathogen but absent in the host. By applying this approach the proteome of