Analysis of Bt S3076-1 Toxin Protein at Genome-wide Level

Zhongqi Wu<sup>1,2</sup>

Research Article

1 Institute of Life Science, Jiyang College of Zhejiang A&F University, Zhuji, 311800, Zhejiang, China
2 Cuixi Academy of Biotechnology, Zhuji, 311800, Zhejiang, China

Molecular Microbiology Research, 2014, Vol. 4, No. 1 doi: 10.5376/mmr.2014.04.0001
Received: 09 Oct., 2014 Accepted: 12 Nov., 2014 Published: 23 Dec., 2014

This is an open access article published under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Preferred citation for this article:

Wu Z.Q., 2014, Analysis of Bt S3076-1 toxin protein at genome-wide level, Molecular Microbiology Research, 4(1): 1-10 (doi: 10.5376/mmr.2014.04.0001)

Abstract

To systematically analyze the number, types and distribution of Bt toxin proteins carried by Bt S3076-1 strain, gene prediction, BLAST searches against a local database and protein structure analysis were carried out at the genome-wide level. A total of 45 candidate Bt toxin protein genes were identified in the genome of Bt S3076-1 strain. To further confirm the reliability of these genes, we analyzed the primary, secondary and tertiary structures of the candidate Bt toxin proteins. The physicochemical analysis showed that the candidate proteins ranged in length from 50 to 1,247 aa, in molecular weight from 6.02 to 141.55 kDa, and in isoelectric point from 4.3 to 10.88. The isoelectric points of 21 of the candidate proteins were greater than 7. Tertiary structure analysis and PFAM functional domain analysis indicated that the Bt S3076-1 genome contains seven Cry toxin proteins with typical 3-Domain endotoxin domains (Endotoxin_N, Endotoxin_M and delta_endotoxin_C), four Cry toxin proteins with atypical conserved domains (RICIN, Toxin_10, etc.), and one Vip toxin protein. Therefore, eleven genes encoding Cry toxin proteins and one gene encoding a Vip toxin protein are predicted to exist in Bt S3076-1 strain.

Keywords

Bacillus thuringiensis; Bt S3076-1; Bt toxin protein; Cry toxin protein; Vip toxin protein

Bacillus thuringiensis (Bt) is a gram-positive bacterium that occurs widely in natural environments such as soil and water (Ibrahim et al., 2010). Since Bt was named by Ernst Berliner in 1915, it has been studied and applied for over a hundred years. Bt has been widely used for the biological control of agricultural and forestry pests and insect vectors, and has become the most important component of the biological pesticide field (Crickmore et al., 1998).

Methods for identifying Bt toxin proteins have progressed steadily, from early physiological and biochemical characterization to molecular biology. With the spread of whole-genome sequencing, the speed and efficiency of Bt toxin protein identification have reached a new level. In particular, the wide adoption of second-generation high-throughput sequencing platforms has greatly reduced the cost of microbial whole-genome sequencing, making it possible to predict and identify Bt toxin proteins at the genome-wide level. Since 2009, the number of Bt genome sequencing projects registered in the NCBI database has increased by dozens every year. More and more research groups apply whole-genome sequencing to study Bt strains deeply and comprehensively at the genome level, in an attempt to discover more Bt insecticidal toxin proteins.

Bacillus thuringiensis strain S3076-1 was isolated in 2007 from soil of the Hainan Diaoluoshan National Nature Reserve by the Hainan Institute of Tropical Agricultural Resources (Isolated Strain No. 20070720S3076). Scanning electron microscopy indicated that the strain produces a large number of spherical and square spore crystals. SDS-PAGE of the total protein of the strain further showed that, in the late stage of spore formation, the proteins produced by the strain had molecular weights of mainly 140 kDa, 90 kDa, 70 kDa, 50 kDa, 45 kDa and 25 kDa (Wu et al., 2017).

In this study, the genomic data of Bt S3076-1 obtained earlier by de novo sequencing (Wu et al., 2016) were used to predict and analyze, by bioinformatic means, the Bt toxin proteins possibly carried by the strain at the genome level, laying a basis for the further identification of novel toxin proteins.

1 Results and Analysis

1.1 Prediction and recognition of toxin protein coding gene

The predicted genes were aligned by BLAST against the local Bt toxin protein database, and the genes with hits in the alignment results were listed as toxin protein coding genes possibly carried by Bt S3076-1 strain. In total, 45 genes that may encode toxin proteins were screened out. Seven of these genes were predicted to encode Vip toxin-like proteins, and 38 were predicted to encode Cry/Cyt toxin-like proteins (Table 1).

Table 1 Prediction and analysis of toxin protein coding genes of Bt S3076-1 strain

1.2 Physicochemical properties analysis of toxin proteins

The ProtParam online analysis tool was used to analyze the physicochemical properties, such as isoelectric point, molecular weight and amino acid composition, of the 45 amino acid sequences that possibly encode Cry/Cyt/Vip Bt toxin proteins. The predicted molecular weights and isoelectric points of the 45 proteins differed considerably. Bt toxin proteins usually have a relatively large molecular weight (40~140 kDa) and an isoelectric point of about 5.5 (Tapp et al., 1994). Therefore, the subsequent structural domain analysis can focus on predicted toxin proteins with a larger molecular weight and an isoelectric point around 5.5 (Table 2).
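
The prioritization described above can be illustrated with a minimal Python sketch. It assumes that molecular weight (kDa) and isoelectric point have already been computed for each protein; the function name `prioritize`, the pI tolerance of 1.0 and the example records are hypothetical and are not taken from this study.

```python
def prioritize(proteins, mw_range=(40.0, 140.0), pi_target=5.5, pi_tolerance=1.0):
    """Return names of proteins whose MW (kDa) and pI fall in the preferred ranges.

    mw_range and pi_target follow the values quoted from Tapp et al. (1994);
    pi_tolerance is an illustrative assumption.
    """
    low, high = mw_range
    return [
        name
        for name, mw_kda, pi in proteins
        if low <= mw_kda <= high and abs(pi - pi_target) <= pi_tolerance
    ]


# Hypothetical records: (name, molecular weight in kDa, isoelectric point)
records = [("protA", 130.2, 5.1), ("protB", 6.0, 10.9), ("protC", 75.4, 8.3)]
selected = prioritize(records)
print(selected)
```

Such a filter only ranks candidates for closer inspection; it does not exclude proteins outside the ranges from being toxins.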

Table 2 Physicochemical properties of toxin protein of Bt S3076-1 strain

1.3 Prediction and analysis of toxin protein structure

The secondary structures of the 45 predicted toxin proteins were analyzed by SOPMA. The positions and peak maps of α-helix, β-turn, β-sheet and random coil in each protein were obtained (Figure 1A; Figure 1B). SWISS-MODEL was then used to predict the tertiary structures of the 45 predicted toxin proteins. Fourteen of the predicted proteins matched tertiary structure models of currently known Bt insecticidal crystal proteins (Figure 1C), while the other 31 either matched no tertiary structure model or matched models that were not Bt insecticidal crystal proteins (Table 3).

Figure 1 Example of P01G-139 encoded protein analysis results

Note: Figure A and Figure B, SOPMA secondary structure analysis and prediction of P01G-139 encoded protein; Figure C, SWISS-MODEL was used to predict the tertiary structure of P01G-139 encoded protein, and the three-dimensional structure model of the two proteins was constructed

Table 3 Tertiary structure information of toxin protein predicted by Bt S3076-1 strain

1.4 Conserved domain analysis of predicted toxin proteins

The DELTA-BLAST protein sequence alignment algorithm was used to identify and analyze the conserved domains of the predicted Bt toxin proteins. Combined with the SWISS-MODEL tertiary structure analysis, seven of the predicted proteins were found to have the tertiary structural characteristics of Bt toxin proteins and to contain, fully or in part, the Endotoxin domains specific to Bt toxin proteins. From the perspective of predictive analysis, the genes P01G-35, P01G-44, P01G-73, P01G-139, P02G-41, P02G-44 and P02G-521, all located on Bt S3076-1 plasmids, could therefore be identified as toxin protein coding genes of this strain. However, not all registered Cry proteins have the three typical domains Endotoxin_N, Endotoxin_M and delta_endotoxin_C; some have RICIN or Toxin_10 domains instead. During the BLAST analysis, the proteins CG-5295 and CG-5303, located on the chromosome, and P01G-34 and P01G-43, located on plasmids, were found to have such domains. These four predicted genes were therefore also preliminarily identified as Cry toxin protein genes. In addition, the protein encoded by the predicted gene P02G-219 contained the typical domains of Vip1-type proteins, the PA14 family and Binary_toxB, so this gene was preliminarily identified as a Bt toxin protein gene as well. In contrast, P01G-141 and P01G-180, which had been considered Bt toxin protein coding genes on the basis of the SWISS-MODEL prediction, showed no typical domains in the DELTA-BLAST analysis and were therefore classified as non-Bt toxin protein genes. In total, 12 Bt toxin protein genes were finally predicted (Table 4).

Table 4 Analysis of the conserved domain of Bt S3076-1 predicted toxin protein

Note: “√” indicates that the protein has the domain; “×” indicates the protein does not have this domain

Combining the results of the local BLAST database alignment and the NCBI online DELTA-BLAST conserved domain analysis, homologous protein analysis was performed for the 12 predicted Bt toxin proteins. The results showed that Bt S3076-1 might carry multiple Cry-like proteins, such as Cry1A, Cry1B, Cry35A, Cry39A and Cry54A. In addition, a Vip1A-like protein was found on plasmid S3076-1P02; its sequence homology was lower than 45%, so it is very likely a new Vip-like protein (Table 5).

Table 5 Homology comparison of 12 predicted Bt toxin proteins in Bt S3076-1

2 Discussion

As of December 31, 2018, a total of 1,000 protein or nucleotide sequences of Bt toxin proteins named by the Bacillus thuringiensis Toxin Nomenclature Committee could be retrieved from public databases, including 833 amino acid sequences and 167 nucleotide sequences. These protein and nucleotide sequences were obtained from the NCBI database, a local BLAST alignment database was constructed with the ncbi-blast-2.2.27 sequence alignment platform, and the Blastp and Blastn programs were used to screen the predicted genes for those that might encode toxin proteins.

A total of 45 possible toxin proteins were obtained from the BLAST results against the local Bt toxin protein database. Their physicochemical properties were analyzed by ProtParam, and a preliminary judgment was made according to isoelectric point and molecular weight. SOPMA and SWISS-MODEL were used to analyze the secondary and tertiary structures of the predicted proteins, respectively. Protein structure models with high homology in the PDB and SWISS-PROT databases were selected as templates to construct three-dimensional structure models of the predicted proteins. This preliminary analysis suggested that 14 predicted proteins might be Bt toxin proteins. The conserved domains of the 45 predicted proteins were then predicted and analyzed with the NCBI online program DELTA-BLAST. Cry-like toxin proteins usually have three typical conserved domains, Endotoxin_N, Endotoxin_M and delta_endotoxin_C, but some Cry-like proteins do not. For example, the conserved domain of Cry35-like proteins is RICIN, while Cry46-like proteins have no conserved domains. Similarly, Vip-like proteins have their own unique conserved domains. The DELTA-BLAST conserved domain analysis further narrowed the toxin proteins of Bt S3076-1 down to 12. Their homologous proteins included 9 Cry proteins, such as Cry1A, Cry1B, Cry35A, Cry39A and Cry54A, and 1 Vip1 protein. The amino acid sequence similarity of P01G-34, P01G-35, P01G-43 and P01G-44 to their target proteins is above 75%, which places them in the third category of the Bt toxin protein classification system, and these results are reliable. The sequence similarity of the other eight predicted proteins is between 18% and 66%, which places them in the first or second category. Because their similarity is relatively low, they are more likely to be new Bt proteins.
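
The assignment of a predicted protein to a category by sequence similarity can be sketched as follows. This minimal Python example assumes the rank boundaries of 45%, 78% and 95% amino acid identity from the nomenclature of Crickmore et al. (1998). The text above groups proteins above 75% into the third category, so the exact cut-offs applied in this study may differ, and the function name is hypothetical.

```python
def nomenclature_rank(identity_percent):
    """Map percent amino acid identity to a nomenclature rank.

    Boundaries (45, 78, 95) are assumed from Crickmore et al. (1998);
    they are not stated explicitly in this article.
    """
    if not 0 <= identity_percent <= 100:
        raise ValueError("identity must be between 0 and 100")
    if identity_percent < 45:
        return "primary"      # differs at the first rank: likely a new class
    if identity_percent < 78:
        return "secondary"
    if identity_percent < 95:
        return "tertiary"
    return "quaternary"


for value in (18, 66, 80):
    print(value, nomenclature_rank(value))
```

Under these assumed boundaries, a protein with 18% identity to its best match would receive a new primary rank, while one with 66% identity would share the primary rank of its match and differ at the secondary rank.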

An interesting issue also emerged in this study: a Vip1A-like protein was found on plasmid S3076-1P02. Vip1A/Vip2A is known to be a binary toxin, in which the expression products of vip1A and vip2A jointly produce the toxic effect. Generally, the vip2A gene is located upstream of the vip1A gene, and the ORFs of the two genes are separated by 4 bases (Warren, 1997). In this study, however, no vip2A ORF was found upstream of the predicted vip1A gene. Yu (2012) investigated and evaluated the vip gene resources of Bacillus thuringiensis in Sichuan and found that some Bt strains contained only the vip1A or the vip2A gene, but the specific strain numbers and more detailed information were not provided. Whether the independent vip1A gene found in this study is a similar case remains open. In the next step, the sequences upstream and downstream of the vip1A gene need to be cloned separately and re-sequenced to analyze the structural composition of this region.

More and more research teams are using whole-genome sequencing to study Bt strains comprehensively and deeply at the genome level, in order to mine more insecticidal gene resources, to analyze insecticidal mechanisms and the regulation of insecticidal gene expression, and to better understand the evolutionary relationships between Bt and Ba (Bacillus anthracis) and between Bt and Bc (Bacillus cereus). Furthermore, these studies may provide new ideas for solving the problem of insect resistance to Bt toxin proteins, so that Bt resources can be put to better use in actual production.

3 Materials and Methods

3.1 Materials

The chromosome and plasmid genome sequences of the spliced Bt S3076-1 were used as the bases for prediction and analysis (Wu et al., 2016; 2017).

3.2 Prediction and recognition of toxin protein encoding gene

A total of 607 amino acid sequences and 122 nucleotide sequences of Bt toxin proteins named by the International Bacillus thuringiensis Toxin Nomenclature Committee were collected from a public database (http://www.lifesci.sussex.ac.uk/home/Neil_Crickmore/Bt/index.html), and these sequences were used to build a local BLAST alignment database. The predicted genes were screened with the blastp and blastn programs for genes that might encode toxin proteins. The significance threshold for the alignments was set at e-value < 0.0001.
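
The screening step above can be illustrated with a minimal Python sketch that filters BLAST hits by e-value. It assumes the alignment results were exported in the standard 12-column tabular format of BLAST+ (`-outfmt 6`), which may not be the format used in this study; the function name and the example rows are hypothetical.

```python
import csv
import io

# Column order of the standard BLAST+ tabular output (-outfmt 6)
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def candidate_genes(tabular_text, max_evalue=1e-4):
    """Return the lowest-e-value hit per query among hits with e-value < max_evalue."""
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST_COLUMNS, row))
        evalue = float(hit["evalue"])
        if evalue >= max_evalue:
            continue  # hit does not pass the significance threshold
        current = best.get(hit["qseqid"])
        if current is None or evalue < float(current["evalue"]):
            best[hit["qseqid"]] = hit
    return best


# Hypothetical rows for illustration only (not results from this study)
example = (
    "geneA\ttoxin1\t62.5\t600\t200\t5\t1\t600\t1\t610\t1e-150\t520\n"
    "geneA\ttoxin2\t40.0\t580\t330\t9\t1\t580\t1\t590\t2e-80\t300\n"
    "geneB\ttoxin3\t25.0\t80\t60\t0\t1\t80\t1\t80\t0.5\t20\n"
)
hits = candidate_genes(example)
print(sorted(hits))
```

Keeping only the best hit per query is a design choice of this sketch; all hits passing the threshold could equally be retained for later homology comparison.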

3.3 Analysis of physicochemical properties of toxin proteins

ProtParam (http://web.expasy.org/protparam/) is an analysis tool for the primary structure of proteins. It computes physical and chemical parameters, such as molecular weight, pI and amino acid composition, for protein sequences in the SWISS-PROT and TrEMBL databases or for user-entered protein sequences (Wilkins et al., 1999).
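
As an illustration of one such parameter, the following minimal Python sketch estimates molecular weight from an amino acid sequence by summing average residue masses and adding one water molecule. It is not the ProtParam implementation; the mass table uses commonly tabulated average values, and the function name is hypothetical. The isoelectric point is not computed here.

```python
# Average residue masses in Da (free amino acid minus one water molecule)
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01524  # added once for the free N- and C-termini


def molecular_weight_kda(sequence):
    """Estimate the molecular weight (kDa) of an unmodified protein sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    total_da = sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS
    return total_da / 1000.0


print(round(molecular_weight_kda("GA") * 1000, 2))  # dipeptide Gly-Ala, in Da
```

Ambiguous residue codes such as X or B are rejected rather than approximated, so sequences containing them would need to be handled separately.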

3.4 Structure prediction and analysis of Bt S3076-1 toxin protein

Protein secondary structure refers to the conformation of the peptide main chain, which coils and folds regularly through hydrogen bonds and shows a periodic structure along one dimension. The common secondary structures are mainly α-helix, β-turn, β-sheet and random coil. SOPMA (http://npsa-pbil.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_sopma.html) is an online tool for predicting and analyzing protein secondary structure. In this study, SOPMA was used to analyze the secondary structure of the possible toxin proteins previously predicted from the Bt S3076-1 genome (Geourjon et al., 1995), with the default parameter settings: window width, 17; similarity threshold, 8; number of states, 4.
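
The four-state prediction can be summarized as the percentage of residues in each state. The minimal Python sketch below assumes that the per-residue prediction is available as a string of one-letter state codes (h, e, t, c, the convention used for four-state SOPMA output); the function name and the example string are hypothetical.

```python
from collections import Counter

# One-letter state codes assumed for a four-state prediction
STATE_NAMES = {
    "h": "alpha-helix",
    "e": "beta-sheet (extended strand)",
    "t": "beta-turn",
    "c": "random coil",
}


def state_composition(states):
    """Return the percentage of residues assigned to each secondary structure state."""
    counts = Counter(states.lower())
    total = sum(counts[code] for code in STATE_NAMES)
    if total == 0:
        raise ValueError("no recognized state codes in input")
    return {name: 100.0 * counts[code] / total for code, name in STATE_NAMES.items()}


composition = state_composition("hhhheettcc")  # hypothetical 10-residue prediction
for name, percent in composition.items():
    print(f"{name}: {percent:.1f}%")
```

Characters outside the four state codes are ignored, so the percentages always refer to residues with an assigned state.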

Protein tertiary structure refers to the specific spatial structure formed when a polypeptide chain coils and folds further on the basis of secondary structure, super-secondary structure or domains, and is maintained and fixed by secondary bonds. SWISS-MODEL (http://swissmodel.expasy.org/) is a highly automated tool for analyzing and predicting the tertiary structure of protein sequences based on homology modeling (Schwede et al., 2003). SWISS-MODEL was used to predict the tertiary structures of the 45 toxin protein sequences of S3076-1 strain.

3.5 Conserved domain analysis of the Bt S3076-1 predicted toxin proteins

Most Cry, Cyt and Vip proteins of Bt have their own conserved domains. For example, the typical conserved domains of Cry proteins are Endotoxin_N, Endotoxin_M and delta_endotoxin_C, and the typical domains of Vip3 proteins are CBM_4_9 and Vip3A_N. In this study, the conserved domains of the predicted Bt toxin proteins were identified and analyzed with the protein sequence alignment algorithm of the NCBI online analysis tool DELTA-BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Predicted proteins containing the typical 3-Domain structure were preliminarily judged to be Bt toxin proteins. Predicted proteins containing atypical domains, such as CBM_4_9, Vip3A_N and RICIN, were also preliminarily classified as Bt toxin proteins.
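
The domain-based judgment described above can be sketched as a simple rule set. This minimal Python example uses only domain names mentioned in this article; the grouping into sets, the order of the rules and the function name are illustrative assumptions rather than the exact procedure applied in this study.

```python
# Domain names as they appear in this article; the grouping is an assumption
THREE_DOMAIN = {"Endotoxin_N", "Endotoxin_M", "delta_endotoxin_C"}
ATYPICAL_CRY = {"RICIN", "Toxin_10"}
VIP_DOMAINS = {"PA14", "Binary_toxB", "CBM_4_9", "Vip3A_N"}


def classify(domains):
    """Give a preliminary label for a protein from its set of detected domains."""
    found = set(domains)
    if found & THREE_DOMAIN:
        # contains all or part of the typical endotoxin domains
        return "Cry (3-domain type)"
    if found & ATYPICAL_CRY:
        return "Cry (atypical domains)"
    if found & VIP_DOMAINS:
        return "Vip"
    return "not retained"


print(classify(["Endotoxin_N", "Endotoxin_M"]))
print(classify(["RICIN"]))
print(classify(["PA14", "Binary_toxB"]))
print(classify([]))
```

A protein without any of the listed domains is labeled "not retained", mirroring how candidates without typical domains were excluded even when a tertiary structure model matched.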

Authors’ Contributions

WZQ led and conceived the research project, carried out the Bt genome data analysis, and wrote, revised and finalized the draft. The author read and approved the final text.

Acknowledgments

This study was funded by China National Bt Strains Resource Initiative (BtSRI) of Hainan Institute of Tropical Agricultural Resources. All intellectual property and rights belong to Hainan Institute of Tropical Agricultural Resources. The authors thank QTY for her review of this paper in English.

References

Crickmore N., Zeigler D.R., Feitelson J., Schnepf E., Van Rie J., Lereclus D., Baum J., and Dean D.H., 1998, Revision of the nomenclature for the Bacillus thuringiensis pesticidal crystal proteins, Microbiology and Molecular Biology Reviews, 62(3): 807-13

Geourjon C., and Deléage G., 1995, SOPMA: significant improvements in protein secondary structure prediction by consensus prediction from multiple alignments, Bioinformatics, 11(6): 681-684

https://doi.org/10.1093/bioinformatics/11.6.681

Ibrahim M.A., Griko N., Junker M., and Bulla L.A., 2010, Bacillus thuringiensis: A genomics and proteomics perspective, Bioengineered Bugs, 1(1): 31-50

https://doi.org/10.4161/bbug.1.1.10519

PMid:21327125 PMCid:PMC3035146

Schwede T., Kopp J., Guex N., and Peitsch M.C., 2003, SWISS-MODEL: An automated protein homology-modeling server, Nucleic Acids Research, 31(13): 3381-3385

https://doi.org/10.1093/nar/gkg520

PMid:12824332 PMCid:PMC168927

Tapp H., Calamai L., and Stotzky G., 1994, Adsorption and binding of the insecticidal proteins from Bacillus thuringiensis subsp. kurstaki and subsp. tenebrionis on clay minerals, Soil Biology & Biochemistry, 26(6): 663-679

https://doi.org/10.1016/0038-0717(94)90258-5

Warren G.W., 1997, Vegetative Insecticidal Proteins: Novel Proteins for Control of Corn Pests. Advances in insect control: The role of transgenic plants. London: Taylor & Francis, 1997: 109-122

https://doi.org/10.4324/9780203211731_chapter_7

Wilkins M.R., Gasteiger E., Bairoch A., Sanchez J.C., Williams K.L., Appel R.D., and Hochstrasser D.F., 1999, Protein identification and analysis tools in the ExPASy server, Methods in Molecular Biology, 112(112): 531-552

https://doi.org/10.1385/1-59259-584-7:531

Wu et al., 2016, Draft genome sequence of Bacillus thuringiensis strain S3076-1, Bt Research, Vol.7, No.1, 1-7

Wu et al., 2017, Bt S3076-1, A Novel Strain with Larvacidal Toxicity Against Lepidopteran Insects, Bt Research, Vol.8, No.1, 1-5

Yu X.M., 2012, Characterization of beneficial and agricultural Bacteria sources and the clone and function test of some useful genes from Bacteria, Dissertation for Ph.D., Sichuan Agriculture University, Supervisor: Li P., pp.36-37
