
The porcine translational research database: a manually curated, genomics and proteomics-based research resource

Harry D. Dawson

1 United States Department of Agriculture, Agricultural Research Service, Beltsville Human Nutrition Research Center, Diet, Genomics and Immunology Laboratory, Beltsville, MD USA

Celine Chen

Brady Gaynor

2 United States Department of Agriculture, Agricultural Research Service, Beltsville Agricultural Research Center, Molecular Plant Pathology Lab, Beltsville, MD 20705 USA

Jonathan Shao

Joseph F. Urban, Jr.

The use of swine in biomedical research has increased dramatically in the last decade. Diverse genomic and proteomic databases have been developed to facilitate research using human and rodent models. Current porcine gene databases, however, lack the robust annotation needed to study pig models that are relevant to human studies and for comparative evaluation with rodent models. Furthermore, they contain a significant number of errors due to their primary reliance on machine-based annotation. To address these deficiencies, a comprehensive literature-based survey was conducted to identify selected genes that have demonstrated function in humans, mice or pigs.

The process identified 13,054 candidate human, bovine, mouse or rat genes/proteins used to select potential porcine homologs by searching multiple online sources of porcine gene information. The data in the Porcine Translational Research Database (http://www.ars.usda.gov/Services/docs.htm?docid=6065) is supported by >5800 references, and contains 65 data fields for each entry, including >9700 full length (5′ and 3′) unambiguous pig sequences, >2400 real time PCR assays and reactivity information on >1700 antibodies. It also contains gene and/or protein expression data for >2200 genes and identifies and corrects 8187 errors (gene duplication artifacts, mis-assemblies, mis-annotations, and incorrect species assignments) for 5337 porcine genes.

Conclusions

This database is the largest manually curated database for any single veterinary species and is unique among porcine gene databases in regard to linking gene expression to gene function, identifying related gene pathways, and connecting data with other porcine gene databases. This database provides the first comprehensive description of three major Super-families or functionally related groups of proteins (Cluster of Differentiation (CD) Marker genes, Solute Carrier Superfamily, ATP binding Cassette Superfamily), and a comparative description of porcine microRNAs.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-017-4009-7) contains supplementary material, which is available to authorized users.

Swine are an important model for human anatomy, nutrition, metabolism and immunology [1–3]. Their organs are anatomically and histologically similar to those of humans, as are their sensory innervation and blood supply [4]. Pigs are naturally susceptible to infection with organisms that are closely related or identical to those species infecting humans, including helminths (Ascaris, Taenia, Trichuris, Trichinella, Schistosoma, Strongyloides), bacteria (Campylobacter, Chlamydia, Escherichia coli, Helicobacter, Neisseria, Mycoplasma, Salmonella), protozoans (Toxoplasma) and viruses (Coronavirus, Hepatitis E, Influenza, Nipah, Reovirus, Rotavirus) [2, 5, 6]. The last 10 years have seen a boom in the development of genetically modified pigs as models for human cardiovascular and lung disease, neurodegenerative and musculoskeletal disorders [7, 8] and cancer [9]. There is also a robust effort to develop pigs as sources of organs and tissues for human xenotransplantation [10].

Despite these potential strengths as a model, the lack of an annotated database for porcine gene and protein expression data is a limiting factor for translating findings in one species to another. Multiple online databases exist for the storage and retrieval of diverse bovine, rodent or human biomedical data [ 11 – 19 ]. Other databases exist for Zebrafish (ZFIN, [ 20 ]), C. elegans (WormBase, [ 21 ]), and Drosophila melanogaster (Flybase, [ 22 ]). Databases that encompass multispecies analysis such as Homologene and/or that rely on manual annotation such as InnateDb [ 23 ] include bovine but not porcine genes. Several porcine genome companion databases exist; however they lack robust manual annotation and are somewhat limited in scope or are infrequently updated [ 16 – 19 ]. Agbase, a large, multispecies functional analysis database allows the user to search 51,489 porcine genes based on 12 criteria including gene and protein names (UniProt) and Gene Ontology (GO) annotations. Furthermore, databases can contain a significant number of errors due to their primary reliance on machine-based annotation [ 24 ]. For example, the SUS-BAR database [ 19 ] is designed to identify protein orthologs based upon data that includes annotations from the machine-annotated NCBI genome. NCBI has recently begun to include GO annotations into curated entries for non-human and rodent species but most of these are indirect and often based on observations made in other species. As swine are an important model for comparative human studies, there is a critical need to have a centralized, manually-curated source of information for biomedical research. To address these needs, we created the Porcine Translational Research Database.

Construction and Content

To generate content of immunological relevance, broad-based literature searches were conducted using the following terms: apoptosis, B cell development or activation, CD markers, chemokines, chemokine receptors, cytokines, cytokine receptors, dendritic cells, type 1 IFN induced genes, inflammation, nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) signaling pathway, toll receptor signaling pathway, T cell development or activation, Th1 cell development and Th2 cell development. In addition, immunologically related genes associated with the susceptibility to or pathology of allergy, asthma, arthritis, atherosclerosis and inflammation were included. The Gene Ontology Consortium's community annotation wikis for immunology, cardiovascular disease and muscle biology were also searched (http://wiki.geneontology.org/index.php/Main_Page). The Jackson Laboratory database of knockout mouse phenotypes was searched for genes leading to defects in immune or metabolic phenotypes when over- or under-expressed. These genes include the vast majority of genes that are related to immunity and inflammation [2, 3, 25, 26]. For additional metabolically related genes, genes involved in the transport or metabolism of macronutrients, trace vitamins and minerals were searched. Other genes, associated with the susceptibility to or pathology of atherosclerosis, diabetes, and obesity, were identified. This process identified 13,054 candidate human, bovine, mouse or rat genes/proteins of interest used to select potential porcine orthologs by searching various online sources of porcine gene information. One-to-one orthology of protein-coding genes was determined by protein structure similarity (best reciprocal BLAST hits) and the presence of a corresponding gene in the syntenic region of the human and/or mouse genome. No 1:1 orthology could be established for members of some gene families, including the Leukocyte Immunoglobulin-like Receptor (LILR), Killer Cell Immunoglobulin-like Receptor (KIR), Carcinoembryonic antigen-related cell adhesion molecule (CEACAM) and Cytochrome P450 superfamilies. One-to-one porcine orthologs of human genes utilize the approved HGNC name according to the International Society for Animal Genetics (ISAG) publishing guidelines. We defined pseudogenes by the criteria used by Ensembl and ENSCODE, namely the presence of one or more stop codons in the open reading frame that disrupt the protein structure, and (usually) a lack of intron structure at the genome level [27]. Pseudogenes are further classified into Processed, Duplicated, Unitary or Polymorphic categories [27].
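
The reciprocal-best-hit part of the orthology criterion can be illustrated with a minimal Python sketch. It assumes that BLAST results have already been reduced to (query, subject, bit score) tuples; the identifiers and scores below are invented for illustration, and the synteny check described above is not modeled.

```python
from typing import Dict, Iterable, Tuple

Hit = Tuple[str, str, float]  # (query_id, subject_id, bit_score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Return the highest-scoring subject for each query."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> Dict[str, str]:
    """Keep only pairs where each sequence is the other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {a: b for a, b in forward.items() if reverse.get(b) == a}


# Hypothetical hit tables (made-up identifiers and scores).
human_vs_pig = [
    ("IL10_human", "IL10_pig", 350.0),
    ("IL10_human", "IL19_pig", 120.0),
    ("KIR2DL1_human", "KIR_pig_1", 200.0),
]
pig_vs_human = [
    ("IL10_pig", "IL10_human", 348.0),
    ("IL19_pig", "IL19_human", 300.0),
    ("KIR_pig_1", "KIR3DL1_human", 210.0),
]

pairs = reciprocal_best_hits(human_vs_pig, pig_vs_human)
print(pairs)  # {'IL10_human': 'IL10_pig'}
```

In this toy example the KIR pair is rejected because the best hits are not mutual, which mirrors the situation in which rapidly evolving families fail to yield 1:1 orthologs.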

Sequence generation

Genbank (non-redundant, expressed sequence tag, high throughput genomic sequence, trace archive and whole genome shotgun contig databases) was searched by discontiguous Megablast using default settings (word size = 11), using reference sequence accession numbers for human, bovine, mouse or rat genes/proteins of interest. A similar search was conducted in the following databases using the human or bovine reference sequence: NIH Intramural Sequencing Center (NISC) Comparative Vertebrate Sequencing Project [28]; National Center for Biotechnology Information (NCBI) Sus scrofa Genome Assembly releases 102 to 105; and Ensembl v10.2 releases 83 to 89. For genes that were determined to be missing from build 10.2 (Additional file 1) and for the mis-assembled or duplicated gene artifacts (Additional file 1), we also constructed templates from de novo assemblies derived from Illumina 80 bp reads of the pig alveolar macrophage transcriptome (Dawson, unpublished results) using the de novo assembly algorithm of CLC Genomics Workbench with a word size of 20 and a bubble size of 50. When necessary, predicted templates (from bovine or human sequences) were supplemented with porcine expressed sequence tag (EST) assemblies, single ESTs and portions of the published Tibetan (Bioproject # PRJNA291130), Wuzhishan (Bioproject # PRJNA144099), Goettingen (Bioproject # PRJNA291011) [29], Jinhua, Meishan, Bamei, Large White, Berkshire, Hampshire, Pietrain, Landrace, Rongchang and Duroc (Bioproject # PRJNA309108) porcine genomes [30]. ESTs were assembled using CAP3 (http://doua.prabi.fr/software/cap3). RNASeq reads were then mapped to these predicted templates in order to derive the full-length consensus sequence (unambiguous 6X coverage) using CLC Genomics Workbench 7.0 (QIAGEN Bioinformatics, Redwood City CA). The following settings were used: mismatch cost = 2, insertion cost = 3, deletion cost = 3, similarity fraction = 0.95, length fraction = 0.95. Nucleotide sequences were translated using the ExPASy translate tool (http://web.expasy.org/translate/). A total of 1279 of these sequences have been deposited to the transcriptome shotgun assembly sequence database under Bioproject PRJNA80971 and the short read archive under project SRP013743. In silico-derived full-length RNA sequences are provided for an additional 3391 genes. This process/pipeline is summarized in Fig. 1. A summary of these sequences is provided in Table 1.
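
To make the mapping thresholds and the coverage requirement concrete, the following Python sketch shows one way such criteria could be applied. It is an illustration under stated assumptions, not the CLC Genomics Workbench implementation: the function names are hypothetical, the length fraction is taken to mean the aligned share of the read and the similarity fraction the identity within the aligned region, and "unambiguous 6X coverage" is interpreted here as at least six reads that all agree at a position.

```python
from collections import Counter
from typing import List

MIN_DEPTH = 6  # "6X coverage" threshold quoted in the text


def passes_mapping_filter(read_length: int, aligned_length: int, matches: int,
                          length_fraction: float = 0.95,
                          similarity_fraction: float = 0.95) -> bool:
    """Accept a read if enough of it aligns and the aligned part is similar enough."""
    if read_length <= 0 or aligned_length <= 0:
        return False
    return (aligned_length / read_length >= length_fraction
            and matches / aligned_length >= similarity_fraction)


def consensus(columns: List[str], min_depth: int = MIN_DEPTH) -> str:
    """Call a base where depth >= min_depth and all reads agree; otherwise emit 'N'.

    Each element of `columns` holds the read bases piled up at one template position.
    """
    called = []
    for bases in columns:
        counts = Counter(bases)
        if len(bases) >= min_depth and len(counts) == 1:
            called.append(bases[0])
        else:
            called.append("N")
    return "".join(called)


# An 80 bp read with 78 aligned bases, 76 of them matching, passes both thresholds.
print(passes_mapping_filter(80, 78, 76))
# Position 2 is too shallow and position 3 is ambiguous, so both become 'N'.
print(consensus(["AAAAAA", "CCCCC", "GGGGGGT", "TTTTTTT"]))
```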

Fig. 1 Porcine Translational Research Database (PTR) Construction Flowchart

Current Database Statistics (07/12/2017)

Sequence analysis

We randomly chose 268 of these mRNAs to compare the lengths of the 5′ end, 3′ end and ORF with those of the corresponding human mRNA. Data are presented in Additional file 4. For the 1041 protein-coding genes missing from the genome, we entered the gene symbols into DAVID version 6.8 (https://david.ncifcrf.gov) to assess overrepresentation of groups of genes with related function. The functional data were limited to human. Of the 1041 genes, 956 were recognized and 955 had functional annotations; of the unrecognized genes, 41 are pig- or artiodactyl-specific. Data on functional enrichment of genes with a multiple comparison adjustment (Benjamini) value of >0.05 are presented in Table 3. We chose the 60 largest proteins of extreme size (>3000 amino acids) to compare the status (number of loci and completeness) in the NCBI and Ensembl build 10.2 genome. Because exon structure is usually well conserved and there is fragmentation of certain areas of the porcine genome, the number of exons for the corresponding human gene was used for comparison. Lastly, we determined the chromosomal location of 1307 duplicated gene artifacts (2889 loci, Additional file 2) to identify problematic regions. Data are expressed as duplications per megabase (number of bases derived from the NCBI genome build, http://www.ncbi.nlm.nih.gov/genome?term=sus%20scrofa) and are presented in Fig. 2.
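
For readers unfamiliar with the Benjamini adjustment, the sketch below implements the standard Benjamini-Hochberg step-up procedure in plain Python. It is assumed here that this is the adjustment DAVID reports as "Benjamini"; the p-values in the example are invented and are not taken from the analysis.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four hypothetical annotation terms.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))  # approximately [0.02, 0.04, 0.04, 0.02]
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.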

Fig. 2 Chromosomal Locations of 1307 Duplicated Gene Artifacts (2889 Loci)

Functional Annotations for 1041 Protein-Coding Genes that are Missing from Ensembl build 10.2

Database implementation

The currently described database was constructed in the FileMaker Pro Advanced v14.0 program (FileMaker Inc., Santa Clara, CA). The layout is illustrated in the sample database entry for the cytokine IL10 (Fig. 3, panels a–d). It was deployed using the FileMaker Server Advanced v14.0 program (FileMaker Inc., Santa Clara, CA). External access to the database has been successfully tested using Chrome, Internet Explorer and Safari browsers. Other areas of the database were populated from existing published or our own unpublished data. Each publication is manually reviewed and data (antibodies, real-time PCR assays, RNA or protein expression data, functional data) are abstracted and entered into the database, along with the Pubmed ID, in the appropriate field. We have developed Taqman real-time PCR assays for 1867 of these genes, making them cross reactive for as many species as possible (1067 are partially or fully human gene cross reactive). This is to ensure that comparable areas of the gene are being analyzed, as well as for economic reasons. We also conducted a literature survey to determine the sequence of porcine SYBR green PCR assays. Tissue-specific gene expression summaries, using these assays, are provided for these and other studies (i.e., those using microarray and RNASeq). A comprehensive search of catalogs and published literature was also conducted to identify antibodies to the corresponding proteins. Last, the “Notes Field” in the database was populated with information such as types of errors discovered, degree of 5′ and 3′ UTR conservation, degree of positive selection pressure in various species, and intron status. When the gene (sequence) is present in the genome but not annotated as a gene, we annotate the gene in the Notes field as “Not an identified gene in Ensembl build 10.2.” or “not an identified gene in NCBI build 10.2”.

Fig. 3 a–d Sample Database Entry

To date, we have generated 9720 full-length transcripts representing 9165 genes (Table 1). They include 1354 genes missing from Ensembl build 10.2 (Table 2 and Additional file 1) and 1400 genes that have been sequenced at least two times (gene duplicated artifacts shown in Table 2 and Additional file 2) that were annotated as separate genes in either Ensembl or NCBI builds. Functional enrichment analysis of 1041 protein-coding genes that are missing from the genome reveals that genes that are annotated as cytokines (24, p = 0.0053) and transcription factors (68), particularly Homeodomain-like transcription factors (34, p = 0.032) and CENP-B/Helix-turn-helix (HTH) domains (6, p = 0.035), are significantly overrepresented (Table 3). Of note, the great majority of the Interleukin 1 Superfamily members (IL1F10, IL1RN, IL36A, IL36B, IL36G, IL36RN, IL37) are significantly (p = 0.0073) overrepresented. Data analyses that do not account for these genes risk missing important genes involved in inflammation and development.

Number and Types of Errors Located in Publicly Available Porcine Databases

Based upon gene number estimates from other closely related species such as human and cow, we estimated that our database has a coverage rate of approximately 42% of the porcine genome. These represent sequences found in 10,232 Unigene entries (1.45 per gene) and 9967 NCBI loci (5756 are single loci that are not duplicated gene artifacts or split into multiple loci, and 1793 genes have multiple (4211) loci). A total of 2109 and 1616 of the genes have no assigned Unigene number or NCBI locus, respectively. In addition to GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, literature-based functional annotations (derived from more than 5500 references) are provided for these sequences. We have also discovered a relatively large number (178) of porcine or artiodactyl-specific paralogs (Additional file 3) for 104 protein or non-protein coding porcine genes. For genes with multiple paralogs, genes are named in the order of phylogenetic distance from the parent human or bovine gene. Some of these genes are expressed pseudogenes. Some of these genes have been previously discussed (e.g., CD36, IL1B [25, 31]) or will be discussed in the following sections.
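
The locus and Unigene counts in this paragraph can be cross-checked with simple arithmetic. The Python sketch below assumes that the counts of genes lacking a Unigene number or an NCBI locus are subsets of the 9165 genes reported earlier in the Results; under that assumption the figures reconcile.

```python
# Counts quoted in the text.
total_genes = 9165               # genes represented by full-length transcripts
unigene_entries = 10232
ncbi_loci = 9967
single_locus_genes = 5756
multi_locus_genes = 1793
multi_locus_loci = 4211
genes_without_unigene = 2109
genes_without_ncbi_locus = 1616

# Assumption: the "no assigned" counts are subsets of total_genes.
genes_with_unigene = total_genes - genes_without_unigene
genes_with_ncbi_locus = total_genes - genes_without_ncbi_locus

unigene_per_gene = unigene_entries / genes_with_unigene
loci_per_multi_locus_gene = multi_locus_loci / multi_locus_genes

print(genes_with_unigene)                          # 7056
print(round(unigene_per_gene, 2))                  # 1.45
print(single_locus_genes + multi_locus_loci)       # 9967, the NCBI locus total
print(single_locus_genes + multi_locus_genes)      # 7549, genes with an NCBI locus
print(round(loci_per_multi_locus_gene, 2))         # 2.35
```

The 1.45 entries per gene quoted above therefore appears to be computed over the 7056 genes that have at least one Unigene assignment.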

The transcripts we have generated for protein-coding genes include, on average, 70.5% of the corresponding 5′ and 3′ ends (each) of the human sequence (Additional file 4). The ORF is 99.4% conserved on a nucleotide count basis. These percentages indicate the fidelity of our procedure. We discovered extensive gene truncation (incomplete ORF) and gene duplicated artifacts (genes sequenced more than once) among the machine annotated versions of these genes. These problems are common among first drafts of other genomes [32, 33]. Gene duplicated artifacts appear most frequently for chromosomes 12, 2 and 3, and less frequently for chromosomes X, 11, 13, and 1 (Fig. 2). The most frequently affected areas should be targeted for re-sequencing or reassembly. Analysis of the 60 largest porcine proteins in the database shows that gene fragmentation and truncation roughly correlate with protein size and number of exons (Table 4). Figure 4 shows BLAST search results from two extremely large proteins, hemicentin (HMCN1, panel a) and titin (TTN, panel b), that have 9 loci assignments each in the current NCBI build. Surprisingly, these proteins are not represented in Ensembl build 10.2 as annotated genes. Overall, of the 60 largest porcine proteins, only 6 and 10 are represented as single full-length sequences of the correct size in Ensembl and Genbank, respectively. We have deposited 12 de novo assemblies in the TSA archive and have provided in silico predicted RNA and protein sequences for 37 of these genes.

Fig. 4 Hemicentin (a) and Titin (b) Assembly Blasts

Extensive Gene Fragmentation/Truncation Frequently Occurs Among Proteins of Extreme Size

In previous studies, we extensively compared porcine, human and mouse genes related to immunity and inflammation [2, 3, 25, 26]. In the following section, we will summarize our findings for three major Superfamilies or functionally related groups of proteins (CD marker genes, Solute carrier superfamily, ATP binding cassette superfamily) or non-coding RNA (microRNA) that have complete or nearly complete representation. CD markers (accessible as a group by entering CD markers in the Annotations field) encode a heterogeneous group of cell surface proteins. The Human Leucocyte Differentiation Antigen (HLDA) workshop has designated 408 molecules (some of which are grouped within a CD) as CD markers [34]. Based upon our assembly and analysis, we could establish 1:1 orthology for 357 porcine genes to those that compose HLDA version 10. Forty-three genes are not present in the porcine genome or could not be designated as 1:1 orthologs. Of these, nine genes (CLEC4C, CLEC4M, SIGLEC7, BTN3A1, LILRA1, LAIR2, PSG1, SIRPG, TNFRSF10C) are primate-specific [35–37]. KLRC2 (CD159c) is found in humans and rodents but not pigs. FCGR2C is a human-specific gene/pseudogene that belongs to a family of three low-affinity immunoglobulin gamma Fc receptors (CD32) [38]. We have determined that pigs have two members of this family that roughly correspond to FCGR2A and FCGR2B. TNFRSF14 (CD270) is a marker for B cells, dendritic cells, monocytes, and Treg cells [39] found in humans and rodents, but not cows. Although canine, feline, equine and ursine homologs have been identified, this gene may be a pseudogene in pigs as the putative ORF is interrupted by an endogenous retroviral sequence (H. Dawson, unpublished). FCRL2 (CD307b) is a marker for B cells in humans. Although sequences corresponding to FCRL2 have been identified in other mammals including dog and horse, no mouse ortholog has been identified [40]. This gene shows evidence of positive selection in humans [41] and is most likely a pseudogene in pigs.

Due to rapid evolution and post-speciation gene duplication, no 1:1 orthology could be established for most mouse and pig LILR or KIR family members, including LILRA4 (CD85G) and LILRB4 (CD85K) [42]. Similarly, other than CEACAM1 (CD66) and CEACAM6 (CD66C), no 1:1 orthology could be established for most pig and mouse CEACAM family members, such as CEACAM3 (CD66D) and CEACAM5 (CD66E). CEACAM8 (CD67) may be a pseudogene as ESTs in Unigene Ssc.60435 predict a 243 amino acid protein interrupted by several stop codons. CEACAM8 and CEACAM6 were previously determined to have no direct murine orthologs [35]. Several other shared human-pig CD marker orthologs (ADGRE2 (CD312), ADGRE3 (CD313r), CD1A, CD1E, CR1 (CD35), CD58, FCGR2A (CD32), FCAR (CD89), FCRL3 (CD307c), FCRL4 (CD307d), ICAM3 (CD50), NCR2 (CD336), NCR3 (CD337) and TLR10 (CD290r)) have no rodent orthologs [2, 40, 43].

A significant number of errors were discovered in genes encoding porcine orthologs of human CD markers; 25 are not present in Ensembl build 10.2, 88 of the proteins are truncated and 52 are duplicated gene artifacts. Sixty-seven full-length mRNA sequences encoding proteins, assembled from macrophage RNA-Seq reads, have been deposited in Genbank. An additional 79 in silico constructs are provided. Antibody data, gathered from publications, manufacturers or generated in house, is provided for 186 proteins including 395 monoclonal and 285 polyclonal antibodies. Additional cross reactivity for 29 proteins is expected because they are >95% similar to human proteins. Several of the CD Marker family are members of other gene families including the Solute Carrier and ATP-binding Cassette Super Family.

The Human Genome Organization’s gene nomenclature committee (HGNC) has assigned 395 genes to the Solute Carrier Superfamily; 21 are pseudogenes and 374 encode proteins (accessible as a group by entering Solute Carrier Superfamily in the Annotations field). These are organized into 52 subfamilies; about 25% are dedicated to nutrient transport. The porcine Solute Carrier Superfamily contains 398 protein-coding members and all human subfamilies are represented. Forty-two of these genes are present in other porcine genomes but missing from Ensembl build 10.2, 113 are truncated and 58 are duplicated gene artifacts. Sixty-three full-length mRNA sequences, assembled from macrophage RNA-Seq reads, have been deposited in Genbank and an additional 159 in silico constructs are provided. Forty-two of these genes are missing from all porcine genomes or are present as pseudogenes. Among these genes are UCP1 (thermogenin), a protein involved in non-shivering thermogenesis and a pseudogene in pigs [44], and SLC52A2, a primate-specific riboflavin transporter [45]. Other species-specific genes include eight primate-specific genes (SLC2A14, SLC22A24, SLC35E2, SLC35G3, SLC35G4, SLC35G5, SLCO1B1, SLCO1B7), one human-specific gene (SLC22A25) and 14 mouse- or rodent-specific genes (Slc6a20b, Slc7a12, Slc21a4, Slc22a19, Slc22a21, Slc22a22, Slc22a26, Slc22a27, Slc22a28, Slc22a29, Slc22a30, Slco1a1, Slco1b2, and Slco6b1). SLC25A18 is present in human and rodent genomes but is missing from bovine and porcine genomes. SLC25A52 is present in primate and rat genomes but not mouse. SLC9C2 is a pseudogene in mouse [46]. SLC22A31 is an expressed pseudogene in pigs and is missing in rodents. SLC22A11 is an expressed pseudogene in pigs and a non-expressed pseudogene in mouse. Lastly, SLC23A4, an intestinal nucleobase transporter [47], is a pseudogene in humans but is present in pig, cow and rodent genomes. Several porcine or artiodactyl-specific gene expansions are found in subfamilies (Additional file 3), including SLC7A3 (14 members), SLC7A13 (3 members), SLC22A6 (2 members), SLC22A10 (4 members) and SLC47A1 (2 members). The biological functions of these paralogs remain to be determined; however, the parent genes are involved in amino acid (SLC7A3, SLC7A13) or dipeptide transport (SLC22A6) [48, 49].

The HGNC has assigned 51 genes to the ATP binding Cassette Superfamily; three are pseudogenes and 48 encode proteins (accessible as a group by entering ATP binding Cassette Superfamily in the Annotations field). These are organized into seven subfamilies (A–G); about 20% are dedicated to nutrient (e.g., carotenoid, cholesterol and vitamin A) transport. The porcine ATP binding Cassette Superfamily contains 57 members and all human subfamilies are represented. Five of these genes are present in other porcine genomes but missing from Ensembl build 10.2, 21 are truncated, and 18 are duplicated gene artifacts. Eleven full-length mRNA sequences, assembled from macrophage RNA-Seq reads, have been deposited in Genbank and an additional 24 in silico constructs are provided.

An analysis of this superfamily revealed that ABCC11 has no murine ortholog [50] and ABCA8 has no direct rodent ortholog as the gene has diverged into two paralogs, Abca8a and Abca8b [51]. ABCC4, a prostaglandin E2 transporter [52], has diverged from the parent gene into five paralogs (ABCC4L1, ABCC4L2, ABCC4L3, ABCC4L4 and ABCC4L5; Additional file 3). ABCA10, involved in human macrophage cholesterol transport [53], is a pseudogene in rodents. It may be an expressed pseudogene in pigs as the predicted protein is half (787 amino acids) the size of human ABCA10 (1543 amino acids), and weak expression (by RNASeq) was detected in macrophages and moderate expression in intestine (H. Dawson, unpublished). ABCA17 is an expressed pseudogene in humans and pigs. Like the Solute Carrier Superfamily, most of the genes in the ATP binding Cassette Superfamily have not been characterized at the functional level. Nevertheless, the similarities and differences in the Solute Carrier and ATP binding Cassette Superfamilies impact the suitability of rodents and pigs as models for human drug and nutrient transport and metabolism.

The exact number of microRNAs in the porcine genome is unknown. There are 4272 annotated microRNAs in the human genome (build 30). Several papers describe the measurement of porcine microRNAs in various tissues or estimate the number in the porcine genome [54–57], and there are three partially overlapping sources of porcine microRNA sequences. There are only 382, 385 and 816 (non-redundant) annotated pig miRNA sequences in Mirbase, the NCBI gene build, and Ensembl build 10.2, respectively. These three sources of information have a significant amount of overlap (Fig. 5a). We have consolidated this information and provide sequence data for our own predicted sequences, based on conserved sequence identity to 1900 human, mouse or bovine sequences, to provide 1033 non-redundant porcine microRNA sequences (accessible as a group by entering MicroRNA in the Annotations field). Of note, all of the sequences found in Mirbase were found in the NCBI gene build, 59 of the microRNA sequences in Ensembl were found to be duplicated artifacts, and 214 of the 1033 sequences are not present in the current Ensembl gene build (10.2). This includes 81 that we have predicted based upon their presence in other species and other unfinished porcine genomes. We discovered the following species- or genera-specific microRNAs: pigs (454), humans (199), primates (111), bovine (179), mouse (76) and rodents (20). Many of the porcine-specific microRNAs have arisen from biological duplication/expansion (Additional file 3). A comparison of microRNAs that are present in pigs and shared with at least one of the three other species (humans, cows, and mice) revealed that 318 microRNAs are shared among the four species, 107 are shared between pigs, humans and cows but not mice, and 34 are shared between pigs, mice and cows but not humans (Fig. 5b). Thus, microRNAs that are not conserved across all four species are shared between human and pig nearly three times as often as between mouse and pig.
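
The species-sharing categories used in this comparison amount to partitioning the pig microRNA set by which other species carry each member. A minimal Python sketch of that bookkeeping follows; the microRNA names are placeholders and the function name is hypothetical, so the counts it prints are not the database's counts.

```python
from collections import Counter
from typing import Dict, FrozenSet, Set


def sharing_pattern(pig: Set[str], others: Dict[str, Set[str]]) -> Counter:
    """Count pig microRNAs by the set of other species in which each is also found."""
    counts: Counter = Counter()
    for mirna in pig:
        shared_with: FrozenSet[str] = frozenset(
            species for species, members in others.items() if mirna in members
        )
        counts[shared_with] += 1
    return counts


# Placeholder data for illustration only.
pig_mirnas = {"miR-a", "miR-b", "miR-c", "miR-d"}
other_species = {
    "human": {"miR-a", "miR-b"},
    "cow": {"miR-a", "miR-b"},
    "mouse": {"miR-a"},
}

pattern = sharing_pattern(pig_mirnas, other_species)
print(pattern[frozenset({"human", "cow", "mouse"})])  # 1, shared among all four species
print(pattern[frozenset({"human", "cow"})])           # 1, shared with human and cow only
print(pattern[frozenset()])                           # 2, found only in pig
```

Keying the counter by a frozenset of species gives every region of the Venn diagram its own entry without enumerating the combinations by hand.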

Fig. 5 Analysis of MicroRNA Sequence Origin and Species Similarity. These 3 sources of information for our 1047 MicroRNA sequences have a significant amount of overlap (a) and include 81 that we have predicted based upon their presence in other species and other unfinished porcine genomes. Of these sequences, 454 are unique to pigs, 318 are shared among the four species (b), 55 are shared between humans and pigs but not mice and cows and 25 are shared between mice and pigs but not humans and cows

The Porcine Translational Research Database is so named because of its unique utility to translate findings made in rodents to pigs and from those in pigs to humans. A comprehensive literature-based survey was conducted to identify genes that have demonstrated function in humans, mice or pigs. The resulting data in the database is documented by >6000 references. The database currently contains 65 data fields for each entry. Our efforts to improve the genome and its annotation are similar to other efforts, for example the sequencing of 12,000 genes to supplement annotation of the pig genome [32, 33, 58] and de novo assembly of multiple pig genomes to reveal 1737 protein-coding genes that are missing from Ensembl build 10.2 [30]. The online Supplemental data from the latter manuscript was unavailable at the time of the preparation of this manuscript, so no comparison could be made. The manual assembly of >9700 RNA sequences has direct practical implications for genomics-based analysis. The state of the current genome build (mis-annotations, duplication artifacts, and missing sequences) effectively prohibits its use for aligning RNAseq reads. We have used these sequences to compare gene expression separately from Ensembl 10.2 and have also compared the number of reads obtained from the corresponding templates in Ensembl 10.2. For the great majority of transcripts compared, as expected, our full-length sequences provided a higher level of sensitivity than the corresponding Ensembl sequences (H. Dawson, unpublished).

The full 5′ and 3′ representation of each gene will also allow for characterization of regulatory regions and miRNA target sites. In our estimation, >40% of transcripts in Ensembl or NCBI genomes do not represent the full-length gene. Our efforts will also allow for further consolidation of porcine Unigene numbers. Currently, each gene is represented by 0 to >10 Unigene assignments, and >10% of genes have more than one.

It is significant that we discovered a large number of errors (about 30% of entries) in the publicly available sequence databases; these can be accessed by searching the “Notes Field” using the word “error” (Fig. 3). In addition to the duplication artifacts, mis-annotations and missing genes, we also encountered a number of RNA sequences in publicly available archives that belong to other species. For example, human (AHR, AF233432.1), panda (IL2, NM_001199892.1) and rat (NUDT14, ESTs in Unigene Ssc.85635) RNA sequences are annotated as porcine derived. We also found sources of contaminating DNA from completely unrelated species. For example, about 1/5 of porcine chromosome 4 clone CU076066.6 is from Zebrafish. These sequences represent 6 Zebrafish genes (LOC100003615, LOC447815, LOC108179932, LOC108183883, LOC108183971, and LOC103910681) and are annotated as porcine genes by Ensembl build 10.2 (ENSSSCG00000006223) and NCBI genomes (LOC100739857). Similarly, several NCBI loci (ASNA1L*, LOC100737282, LOC100737202, LOC100620149, LOC100737282) and one Ensembl locus (ENSSSCG00000026988) are derived from contaminating Babesia bigemina genomic DNA.

We have discovered several sources of systematic errors in the Ensembl and NCBI gene/protein prediction or annotation pipelines. For example, all selenoproteins in Ensembl are truncated because the codon for selenocysteine (UGA) is translated as a stop codon. We and others have identified a systematic error in the identification of another gene family, the Taste receptor, type 2 (TAS2R) Superfamily. Although these genes are intronless and mostly devoid of 5′ and 3′ UTR regions, Ensembl consistently fails to recognize them as genes [ 3 ]. These data illustrate the critical importance of the manual-curation process to reduce errors.
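The selenoprotein truncation described above can be illustrated with a small translation routine. This is a sketch under stated assumptions, not the Ensembl pipeline or the authors' method: the function `translate` and its `sec_positions` argument (codon indices that a curator has marked as selenocysteine) are hypothetical, and the short ORF is invented for illustration. In real transcripts, UGA recoding depends on additional sequence context that this sketch does not model.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, sec_positions=frozenset()):
    """Translate an in-frame mRNA string until the first stop codon.

    sec_positions holds 0-based codon indices at which UGA is read as
    selenocysteine ("U") instead of a stop (an assumed, curator-supplied input).
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and i // 3 in sec_positions:
            protein.append("U")  # one-letter code for selenocysteine
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # ordinary stop codon ends translation
        protein.append(residue)
    return "".join(protein)


# Invented toy ORF: AUG GCU UGA GGU UAA
orf = "AUGGCUUGAGGUUAA"
naive = translate(orf)                       # UGA treated as stop
curated = translate(orf, sec_positions={2})  # UGA at codon 2 read as Sec
print(naive, curated)
```

With the naive reading the product ends at the in-frame UGA, which is the truncation pattern described for the automated annotations; marking that codon as selenocysteine recovers the downstream residues.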

We believe that this is the largest manually curated database for any veterinary species and that the informatics are unique among those targeting a veterinary species in regard to linking gene expression to gene function, identification of related gene pathways, and connectivity with other porcine gene databases, as well as for reagents that measure gene and protein expression. In addition, it is the largest source of centralized antibody information for the pig. Any database must be updated frequently in order to be useful. Currently the database is updated monthly, and we anticipate expanding the content to include all porcine genes. Several superfamilies of genes will be the next targets of our efforts. One is the GPCR superfamily. Its exact size is still unknown, but nearly 800 different human genes (or ~4% of the entire protein-coding genome) have been predicted to code for GPCRs. We will also continue to develop and annotate new assays. We intend to include our own prediction analysis for the promoter and 3′ UTR region of RNA for transcription factor and microRNA binding sites. Lastly, we intend to synchronize our database with the porcine “Snowball” array and porcine gene expression atlas [ 59 ].

Additional files

Porcine genes missing in Ensembl build 10.2 of the porcine genome. Gene names and evidence/source for RNA sequence of genes that are missing from Ensembl build 10.2. (XLSX 112 kb)

Artifactually duplicated genes in Ensembl build 10.2. Gene names, Ensembl and NCBI loci numbers and NCBI genome build 10.2 coordinates of artifactually duplicated genes (XLSX 282 kb)

Porcine or artiodactyl-specific paralogs. Gene names, Ensembl and NCBI loci numbers and Build 10.2 NCBI gene coordinates of porcine or artiodactyl-specific paralogs (XLSX 58 kb)

5′, ORF and 3′ end comparison of porcine and human mRNAs. 5′, ORF and 3′ end comparison of porcine and human mRNAs (XLSX 66 kb)

Supported by USDA/ARS Project Plan 1235-51000-055-00D.

Availability and requirements

The dataset(s) supporting the conclusions of this article are included within the article, its additional files (Additional files 1, 2, 3 and 4) and within the online database (http://www.ars.usda.gov/Services/docs.htm?docid=6065).

Authors’ contributions

HDD, CC, BG and JS contributed to the content of the database. HDD and JFU wrote the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Harry D. Dawson, Email: [email protected] .

Celine Chen, Email: [email protected] .

Brady Gaynor, Email: [email protected] .

Jonathan Shao, Email: [email protected] .

Joseph F. Urban, Jr, Email: [email protected] .


The Porcine Translational Research Database

The data in the Porcine Translational Research Database is supported by >5800 references, and contains 65 data fields for each entry, including >9700 full length (5′ and 3′) unambiguous pig sequences, >2400 real time PCR assays and reactivity information on >1700 antibodies. It also contains gene and/or protein expression data for >2200 genes and identifies and corrects errors (gene duplications artifacts, mis-assemblies, mis-annotations, and incorrect species assignments) for >2,000 porcine genes. This database is the largest manually curated database for any single veterinary species and is unique among porcine gene databases in regard to linking gene expression to gene function, identifying related gene pathways, and connecting data with other porcine gene database.

Resource Title: The Porcine Translational Research Database.

File Name: Web Page, url: https://www.ars.usda.gov/northeast-area/beltsville-md/beltsville-human-nutrition-research-center/diet-genomics-and-immunology-laboratory/docs/dgil-porcine-translational-research-database/

USDA-ARS: 1235-51000-055-00D


Table 4 Extensive Gene Fragmentation/Truncation Frequently Occurs Among Proteins of Extreme Size

From: The porcine translational research database: a manually curated, genomics and proteomics-based research resource

BMC Genomics

ISSN: 1471-2164


The porcine translational research database: a manually curated, genomics and proteomics-based research resource

Affiliations.

  • 1 United States Department of Agriculture, Agricultural Research Service, Beltsville Human Nutrition Research Center, Diet, Genomics and Immunology Laboratory, Beltsville, MD, USA.
  • 2 United States Department of Agriculture, Agricultural Research Service, Beltsville Human Nutrition Research Center, Diet, Genomics and Immunology Laboratory, Beltsville, MD, USA.
  • 3 United States Department of Agriculture, Agricultural Research Service, Beltsville Agricultural Research Center, Molecular Plant Pathology Lab, Beltsville, MD, 20705, USA.
  • PMID: 28830355
  • PMCID: PMC5568366
  • DOI: 10.1186/s12864-017-4009-7

Background: The use of swine in biomedical research has increased dramatically in the last decade. Diverse genomic- and proteomic databases have been developed to facilitate research using human and rodent models. Current porcine gene databases, however, lack the robust annotation to study pig models that are relevant to human studies and for comparative evaluation with rodent models. Furthermore, they contain a significant number of errors due to their primary reliance on machine-based annotation. To address these deficiencies, a comprehensive literature-based survey was conducted to identify certain selected genes that have demonstrated function in humans, mice or pigs.

Results: The process identified 13,054 candidate human, bovine, mouse or rat genes/proteins used to select potential porcine homologs by searching multiple online sources of porcine gene information. The data in the Porcine Translational Research Database (http://www.ars.usda.gov/Services/docs.htm?docid=6065) is supported by >5800 references, and contains 65 data fields for each entry, including >9700 full length (5' and 3') unambiguous pig sequences, >2400 real time PCR assays and reactivity information on >1700 antibodies. It also contains gene and/or protein expression data for >2200 genes and identifies and corrects 8187 errors (gene duplication artifacts, mis-assemblies, mis-annotations, and incorrect species assignments) for 5337 porcine genes.

Conclusions: This database is the largest manually curated database for any single veterinary species and is unique among porcine gene databases in regard to linking gene expression to gene function, identifying related gene pathways, and connecting data with other porcine gene databases. This database provides the first comprehensive description of three major Super-families or functionally related groups of proteins (Cluster of Differentiation (CD) Marker genes, Solute Carrier Superfamily, ATP binding Cassette Superfamily), and a comparative description of porcine microRNAs.

Keywords: Comparative genomics; Database; Porcine.

  • Databases, Genetic*
  • Proteomics / methods*
  • Translational Research, Biomedical*


Pigs Useful in Immune and Obesity Research

Nutritionist Harry Dawson and microbiologist Gloria Solano-Aguilar, both scientists at the Agricultural Research Service’s Beltsville [Maryland] Human Nutrition Research Center (BHNRC), have teamed with scientists from ARS and other organizations to use the pig as an animal model to promote both human and animal health. This research focuses on assessing the effect of nutrition on immune and inflammatory responses.

Dawson helped develop and continues to curate the publicly available Porcine Translational Research Database of genes and proteins for comparison with those prominently studied in rodents and humans. “This database contains functional information on more than 5,800 genes commonly studied in humans, pigs, and mice, including about 2,240 that have been sequenced at BHNRC.” The database can be found at tinyurl.com/porcinedata .

The database contains “manually annotated” genes, meaning that all genes and protein sequences included in the database, as well as information about their functions, were manually entered. Annotated genes can also be entered by computer software programs that predict the structure and identity of genes and proteins based on algorithms.

“These computer programs, while fast, are prone to error that can be corrected only by manual annotation,” says Dawson.

Immune System Similarities

In addition, Dawson conducted a comparative analysis and assessment of specific portions of the swine, mouse, and human genomes. He found that humans share far more immune-system-related genes and proteins with pigs than they do with mice. He reported that when a functional part of a protein is missing among one of the three species, the chance that it is preserved only in pigs and humans is nearly two times greater than the chance it is preserved only in mice and humans. Dawson’s book chapter, “A Comparative Assessment of the Pig, Mouse, and Human Genomes,” was published by CRC Press in 2011 in The Minipig in Biomedical Research.

The first complete pig genome sequence, version Sscrofa 10.2, was released by the Swine Genome Sequencing Consortium in 2012. As part of that effort, a subgroup called the “Immune Response Annotation Group” annotated more than 1,400 swine genes involved in the animal’s immune response. This group included ARS’s Dawson, molecular biologist Celine Chen, chemist Joan Lunney, and others.

The group discovered that the immunity genes of pigs and humans are very similar and evolve at a similar rate in both species. These findings were reported in the journal Nature in 2012.

Later, the group published a study that further characterized the structure and function of the porcine immunome. An immunome is a collection, or reference set, of immune-system-related genes and proteins of a given species. The group provided new immune-response annotations for more than 500 porcine genes and 3,472 protein-coding transcripts.

“The porcine genome is not yet complete, and additional genes may be discovered,” says Dawson. “But these comprehensive and integrated analyses provide important tools for measuring the porcine immune response.” The findings were published in BMC Genomics in 2013.

These comparative studies provide compelling evidence for using swine in research on both human and animal health, says Dawson. “These studies indicate that pigs are a good species to further test concepts and principles that have been discovered by first using mice as a model, particularly for immune-response research.”

Obesity Research Goes to the Hogs

Also at the BHNRC, Solano-Aguilar has worked on a series of studies showing that the pig is instrumental as a model for human obesity-related research. She worked with Kati Hanhineva, of the University of Eastern Finland in Kuopio, to study metabolic changes that occur in pig tissues and biofluids after the pigs consumed a high-fat diet.

The researchers studied the Ossabaw pig because it has a greater tendency to deposit excess fat and develop obesity-related diseases when fed a high-calorie diet, compared to other pig breeds. The emphasis was on using juvenile pigs as a model for obesity in children. “This is an important area because it is generally difficult to evaluate obesity-related metabolic disturbances in children,” says Solano-Aguilar.

The authors wanted to study diet-induced metabolic changes taking place in the tissues they collected from the pigs—liver, pancreas, brain, and intestine. And they wanted to compare whether the changes they found in the tissues were also present in the pig’s urine and plasma—biofluids that are typically collected during human clinical studies.

The study pigs were fed either a maintenance diet or a high-fat diet. The researchers found changes in lipid metabolites in all analyzed host tissue samples from the pigs fed the high-fat diet. Some tissue-dependent changes were not reflected in the biofluids.

Using swine as a biomedical research model was useful for studying metabolic effects induced by a high-fat diet, says Joseph Urban, a coauthor at the BHNRC laboratory who initiated a multi-institute cooperative agreement with the Finnish scientists.

“Biofluids give us part of the picture,” Urban says, “but being able to look at organ tissue helped us target changes that are indicative of both disease and poor response to diet.”

The study was published in the Journal of Proteome Research in 2013.—By Rosalie Marion Bliss, Agricultural Research Service Information Staff.

This research is part of Human Nutrition (#107) and Animal Health (#103), two ARS national programs described at www.nps.ars.usda.gov.

Harry Dawson and Gloria Solano-Aguilar are with the USDA-ARS Diet, Genomics, and Immunology Laboratory, 10300 Baltimore Ave., Beltsville, MD 20705-2350; (301) 504-9412, ext. 278 [Dawson], (301) 504-8068, ext. 295 [Solano-Aguilar].

"Pigs Useful in Immune and Obesity Research" was published in the May/June 2014 issue of Agricultural Research magazine.

Proteomics Overview

Proteomics involves the systematic study of proteins in order to provide a comprehensive view of the structure, function and regulation of biological systems. Advances in instrumentation and methodologies have fueled an expansion of the scope of biological studies from simple biochemical analysis of single proteins to measurements of complex protein mixtures. Proteomics is rapidly becoming an essential component of biological research. Coupled with advances in bioinformatics, this approach to comprehensively describing biological systems will undoubtedly have a major impact on our understanding of the phenotypes of both normal and diseased cells.

Initially, proteomics focused on the generation of protein maps using two-dimensional polyacrylamide gel electrophoresis. The field has since expanded to include not only protein expression profiling, but also the analysis of post-translational modifications and protein-protein interactions. Protein expression, or the quantitative measurement of the global levels of proteins, may still be done with two-dimensional gels; however, mass spectrometry has been incorporated to increase sensitivity and specificity and to provide results in a high-throughput format. A variety of platforms are available to conduct protein expression studies, and this site provides links to these resources.

The study of protein-protein interactions has been revolutionized by the development of protein microarrays. Analogous to DNA microarrays, these biochips are printed with antibodies or proteins and probed with a complex protein mixture. The intensity or identity of the resulting protein-protein interactions may be detected by fluorescence imaging or mass spectrometry. Other protein capture methods may be used in place of arrays, including the yeast two-hybrid system or the isolation of proteins/protein complexes by affinity chromatography or other separation techniques.


Ag Data Commons User Guide

What is Ag Data Commons?

The Ag Data Commons is a research data catalog and repository for public access to data produced during research funded or co-funded by the United States Department of Agriculture.

In accordance with the USDA Public Access DR, all USDA-funded researchers must ensure that a catalog record indicating the point of public access to their data is created in the Ag Data Commons. Supplementary repository service is optional and is open to data without a community- or subject-specific home.

Through the Ag Data Commons, the USDA National Agricultural Library (NAL) provides services to make USDA-funded research data systems and data products Findable, Accessible, Interoperable, and Reusable (FAIR).  Each submitted dataset goes through a review by Ag Data Commons data curators -- NAL metadata librarians who ensure completeness and accuracy of the submission before approving it for publication.  

Eligibility for submissions to the Ag Data Commons

If you are thinking about sharing your research output through the Ag Data Commons, please make sure that your submission meets the following conditions:

1. USDA-funding . A submission must satisfy at least one of the two USDA-funding criteria:

   a. One of the co-authors of the data product is affiliated with USDA, or

   b. It was produced as a result of a research project funded or co-funded by USDA, for example through NIFA grant or through an agency-approved research project.

2. Content type . The item(s) considered for submission must fall into one of the following types:

   a. Data (e.g. tabular data, genomic sequences, multimedia materials)

   b. Data product (e.g., database)*

   c. Non-executable software created to help users process or model data*.

3. Level of access - public. Your submission must contain only data and/or data products intended for public access. The Ag Data Commons does not accept datasets with Personally Identifiable Information (PII).

*See Ag Data Commons Collection Policy  for details about these content types.  

Account creation

After verifying that your submission is eligible to deposit in the Ag Data Commons, you can create an account. Note that a user account is only required to submit data resources to the Ag Data Commons. Those wishing to view or download data do not require accounts.

To create an account, select “ Log In ” at the upper right corner of the Ag Data Commons home page. You will be prompted to select the user type for your account. Options include:

  • Customer:  Use this option for non-federal users (e.g., university partners, grant recipients, etc.). You can use Login.gov or eAuth to log into your account.
  • USDA Employee/Contractor:  Use this option if you work directly for the USDA. You can use your PIV card, USDA MobileLinc, or USDA Work Account through Microsoft to log into your account.
  • Other Federal Employee/Contractor:  Use this option if you work for a federal department outside the USDA. You can use your PIC/CAC PIN or Login.gov to log into your account.

Your account will be automatically created the first time you log in.

New users will start with zero storage capacity and need to request storage space from the curation team to upload dataset files. The Data Submission Manual provides instructions on this process.  A curator will review your request and may follow up with you for additional information before granting the request.


The porcine translational research database: a manually curated, genomics and proteomics-based research resource.


ORCIDs linked to this article

  • Urban JF Jr | 0000-0002-1590-8869
  • Gaynor B | 0000-0002-4142-0613
  • Dawson HD | 0000-0002-7648-9952

BMC Genomics , 22 Aug 2017 , 18(1): 643 https://doi.org/10.1186/s12864-017-4009-7   PMID: 28830355  PMCID: PMC5568366


Electronic supplementary material

The online version of this article (10.1186/s12864-017-4009-7) contains supplementary material, which is available to authorized users.

Swine are important models for human anatomy, nutrition, metabolism and immunology [ 1 – 3 ]. Their organs are anatomically and histologically similar to those of humans, as are their sensory innervation and blood supply [ 4 ]. Pigs are naturally susceptible to infection with organisms that are closely related or identical to those species infecting humans, including helminths ( Ascaris, Taenia, Trichuris, Trichinella, Schistosoma, Strongyloides ), bacteria ( Campylobacter, Chlamydia, Escherichia coli, Helicobacter, Neisseria, Mycoplasma, Salmonella ), protozoans ( Toxoplasma ) and viruses ( Coronavirus, Hepatitis E, Influenza, Nipah, Reovirus, Rotavirus ) [ 2 , 5 , 6 ]. The last 10 years have seen a boom in the development of genetically modified pigs as models for human cardiovascular and lung disease, neurodegenerative and musculoskeletal disorders [ 7 , 8 ] and cancer [ 9 ]. There is also a robust effort to develop pigs as sources of organs and tissues for human xenotransplantation [ 10 ].

Despite these potential strengths as a model, the lack of an annotated database for porcine gene and protein expression data is a limiting factor for translating findings in one species to another. Multiple online databases exist for the storage and retrieval of diverse bovine, rodent or human biomedical data [ 11 – 19 ]. Other databases exist for Zebrafish (ZFIN, [ 20 ]), C. elegans (WormBase, [ 21 ]), and Drosophila melanogaster (FlyBase, [ 22 ]). Databases that encompass multispecies analysis such as Homologene and/or that rely on manual annotation such as InnateDB [ 23 ] include bovine but not porcine genes. Several porcine genome companion databases exist; however, they lack robust manual annotation and are somewhat limited in scope or are infrequently updated [ 16 – 19 ]. AgBase, a large, multispecies functional analysis database, allows the user to search 51,489 porcine genes based on 12 criteria including gene and protein names (UniProt) and Gene Ontology (GO) annotations. Furthermore, databases can contain a significant number of errors due to their primary reliance on machine-based annotation [ 24 ]. For example, the SUS-BAR database [ 19 ] is designed to identify protein orthologs based upon data that includes annotations from the machine-annotated NCBI genome. NCBI has recently begun to include GO annotations in curated entries for species other than humans and rodents, but most of these are indirect and often based on observations made in other species. As swine are an important model for comparative human studies, there is a critical need to have a centralized, manually-curated source of information for biomedical research. To address these needs, we created the Porcine Translational Research Database.

Construction and Content

To generate content of immunological relevance, broad-based literature searches were conducted using the following terms: apoptosis, B cell development or activation, CD markers, chemokines, chemokine receptors, cytokines, cytokine receptors, dendritic cells, type 1 IFN induced genes, inflammation, nuclear factor kappa-light-chain-enhancer of activated B cells (NF-κB) signaling pathway, toll receptor signaling pathway, T cell development or activation, Th1 cell development and Th2 cell development. In addition, immunologically related genes associated with the susceptibility to or pathology of allergy, asthma, arthritis, atherosclerosis and inflammation were included. The Gene Ontology consortium’s community annotation wikis for immunology, cardiovascular disease and muscle biology were also searched (http://wiki.geneontology.org/index.php/Main_Page). The Jackson Laboratory database of knockout mouse phenotypes was searched for genes leading to defects in immune or metabolic phenotypes when over- or under-expressed. These genes include the vast majority of genes that are related to immunity and inflammation [2, 3, 25, 26]. For additional metabolically related genes, genes involved in the transport or metabolism of macronutrients, trace vitamins and minerals were searched. Other genes, associated with the susceptibility to or pathology of atherosclerosis, diabetes and obesity, were identified. This process identified 13,054 candidate human, bovine, mouse or rat genes/proteins of interest, which were used to select potential porcine orthologs by searching various online sources of porcine gene information. One-to-one orthology of protein-coding genes was determined by protein structure similarity (best reciprocal BLAST hits) and the presence of a corresponding gene in the syntenic region of the human and/or mouse genome. No 1:1 orthology could be established for members of some gene families, including the Leukocyte Immunoglobulin-like Receptor (LILR), Killer Cell Immunoglobulin-like Receptor (KIR), Carcinoembryonic antigen-related cell adhesion molecule (CEACAM) and Cytochrome P450 superfamilies. One-to-one porcine orthologs of human genes use the approved HGNC name, according to the International Society for Animal Genetics (ISAG) publishing guidelines. We defined pseudogenes by the criteria used by Ensembl and ENSCODE, namely the presence of one or more stop codons in the open reading frame that disrupt the protein structure and (usually) a lack of intron structure at the genome level [27]. Pseudogenes are further classified into Processed, Duplicated, Unitary or Polymorphic categories [27].
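The best reciprocal BLAST hit criterion described above can be illustrated with a short sketch. It assumes that BLAST results have already been parsed into (query, subject, bit score) tuples; the function names, identifiers and scores are hypothetical, and the synteny check that the authors also applied is not modeled.

```python
from typing import Dict, Iterable, Tuple

Hit = Tuple[str, str, float]  # (query_id, subject_id, bit_score)


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Return the highest-scoring subject for each query."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit],
                         reverse: Iterable[Hit]) -> Dict[str, str]:
    """Keep only query/subject pairs that are each other's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return {q: s for q, s in fwd.items() if rev.get(s) == q}


# Hypothetical identifiers and scores, for illustration only.
human_vs_pig = [
    ("hs_IL10", "ss_IL10", 350.0),
    ("hs_IL10", "ss_IL19", 120.0),
    ("hs_LILRA4", "ss_LILR_x", 200.0),
]
pig_vs_human = [
    ("ss_IL10", "hs_IL10", 348.0),
    ("ss_IL19", "hs_IL19", 330.0),
    ("ss_LILR_x", "hs_LILRB2", 210.0),
]

pairs = reciprocal_best_hits(human_vs_pig, pig_vs_human)
print(pairs)
```

In this toy input the LILR query is dropped because its best pig hit points back to a different human gene, which mirrors the situation in which no 1:1 orthology can be assigned within rapidly evolving gene families.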

Sequence generation

Genbank (non-redundant, expressed sequence tag, high throughput genomic sequence, trace archive and whole genome shotgun contig databases) was searched by discontiguous Megablast using default settings (word size = 11), using reference sequence accession numbers for the human, bovine, mouse or rat genes/proteins of interest. A similar search was conducted in the following databases using the human or bovine reference sequence: NIH Intramural Sequencing Center (NISC) Comparative Vertebrate Sequencing Project [28]; National Center for Biotechnology Information (NCBI) Sus scrofa Genome Assembly releases 102 to 105; and Ensembl v10.2 releases 83 to 89. For genes that were determined to be missing from build 10.2 (Additional file 1) and for the mis-assembled or duplicated gene artifacts (Additional file 1), we also constructed templates from de novo assemblies derived from Illumina 80 bp reads of the pig alveolar macrophage transcriptome (Dawson, unpublished results) using the de novo assembly algorithm of CLC Genomics Workbench with a word size of 20 and a bubble size of 50. When necessary, predicted templates (from bovine or human sequences) were supplemented with porcine expressed sequence tag (EST) assemblies, single ESTs and portions of the published Tibetan (Bioproject # PRJNA291130), Wuzhishan (Bioproject # PRJNA144099), Goettingen (Bioproject # PRJNA291011) [29], Jinhua, Meishan, Bamei, Large White, Berkshire, Hampshire, Pietrain, Landrace, Rongshang and Duroc (Bioproject # PRJNA309108) porcine genomes [30]. ESTs were assembled using CAP3 (http://doua.prabi.fr/software/cap3). RNASeq reads were then mapped to these predicted templates in order to derive the full-length consensus sequence (unambiguous 6X coverage) using CLC Genomics Workbench 7.0 (QIAGEN Bioinformatics, Redwood City CA). The following settings were used: mismatch cost = 2, insertion cost = 3, deletion cost = 3, similarity fraction = 0.95, length fraction = 0.95. Nucleotide sequences were translated using the ExPASy translate tool (http://web.expasy.org/translate/). A total of 1279 of these sequences have been deposited in the transcriptome shotgun assembly sequence database under Bioproject PRJNA80971 and the short read archive under project SRP013743. In silico-derived full-length RNA sequences are provided for an additional 3391 genes. This process/pipeline is summarized in Fig. 1. A summary of these sequences is provided in Table 1.
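The similarity fraction and length fraction settings listed above can be read as a per-read acceptance test. The sketch below is one simplified interpretation, not the CLC Genomics Workbench implementation: it assumes that the length fraction is the share of the read included in the alignment and that the similarity fraction is the share of identical positions within the aligned region. The class and function names are hypothetical, and the mismatch, insertion and deletion costs are not modeled.

```python
from dataclasses import dataclass


@dataclass
class ReadAlignment:
    read_length: int     # total bases in the read
    aligned_length: int  # bases of the read included in the alignment
    matches: int         # identical positions within the aligned region


def passes_filters(aln: ReadAlignment,
                   length_fraction: float = 0.95,
                   similarity_fraction: float = 0.95) -> bool:
    """Accept a read only if enough of it aligns and the aligned part is similar enough."""
    if aln.read_length <= 0 or aln.aligned_length <= 0:
        return False
    long_enough = aln.aligned_length / aln.read_length >= length_fraction
    similar_enough = aln.matches / aln.aligned_length >= similarity_fraction
    return long_enough and similar_enough


# Hypothetical 80 bp reads.
good = ReadAlignment(read_length=80, aligned_length=78, matches=76)
too_short = ReadAlignment(read_length=80, aligned_length=70, matches=70)
too_divergent = ReadAlignment(read_length=80, aligned_length=80, matches=74)

print(passes_filters(good), passes_filters(too_short), passes_filters(too_divergent))
```

With both thresholds at 0.95, an 80 bp read must align over at least 76 bases, and at most about 5% of the aligned positions may differ from the template.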

Fig. 1 Porcine Translational Research Database (PTR) Construction Flowchart

Table 1 Current Database Statistics (07/12/2017)

Sequence analysis

We randomly chose 268 of these mRNAs to compare the lengths of the 5′ end, 3′ end and ORF with those of the corresponding human mRNA. Data are presented in Additional file 4. For the 1041 protein-coding genes missing from the genome, we entered the gene symbols into DAVID version 6.8 (https://david.ncifcrf.gov) to assess overrepresentation of groups of genes with related function. The functional data were limited to human. Of the 1041 genes, 956 were recognized and 955 had functional annotations; of the unrecognized genes, 41 are pig- or artiodactyl-specific. Data on functional enrichment of genes with a multiple comparison adjustment (Benjamini) value of <0.05 are presented in Table 3. We chose the 60 largest proteins of extreme size (>3000 amino acids) to compare their status (number of loci and completeness) in the NCBI and Ensembl build 10.2 genomes. Because exon number is usually well conserved and certain areas of the porcine genome are fragmented, the number of exons for the corresponding human gene was used for comparison. Lastly, we determined the chromosomal location of 1307 duplicated gene artifacts (2889 loci, Additional file 2) to identify problematic regions. Data are expressed as duplications per megabase (number of bases derived from the NCBI genome build, http://www.ncbi.nlm.nih.gov/genome?term=sus%20scrofa) and are presented in Fig. 2.
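The duplications-per-megabase measure can be sketched as follows. The chromosome names, locus counts and lengths are hypothetical placeholders rather than values from the NCBI build, and the function name is an assumption made for illustration.

```python
from typing import Dict


def duplications_per_megabase(loci_per_chrom: Dict[str, int],
                              chrom_length_bp: Dict[str, int]) -> Dict[str, float]:
    """Divide the number of duplicated loci on each chromosome by its length in Mb."""
    density: Dict[str, float] = {}
    for chrom, n_loci in loci_per_chrom.items():
        length_mb = chrom_length_bp[chrom] / 1_000_000
        density[chrom] = n_loci / length_mb
    return density


# Hypothetical counts and chromosome lengths, for illustration only.
loci = {"chrA": 30, "chrB": 12}
lengths = {"chrA": 60_000_000, "chrB": 120_000_000}

density = duplications_per_megabase(loci, lengths)
ranked = sorted(density, key=density.get, reverse=True)
print(density, ranked)
```

Normalizing by chromosome length keeps a long chromosome with many loci from appearing more problematic than a short chromosome with a denser cluster of artifacts.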

Fig. 2 Chromosomal Locations of 1307 Duplicated Gene Artifacts (2889 Loci)

Table 3 Functional Annotations for 1041 Protein-Coding Genes that are Missing from Ensembl build 10.2

Database implementation

The currently described database was constructed in the Filemaker Pro Advanced v14.0 program (Filemaker Inc., Santa Clara, CA). The layout is illustrated in the sample database entry for the cytokine IL10 (Fig. 3, panels a–d). It was deployed using the Filemaker Server Advanced v14.0 program (Filemaker Inc., Santa Clara, CA). External access to the database has been successfully tested using the Chrome, Internet Explorer and Safari browsers. Other areas of the database were populated from existing published data or our own unpublished data. Each publication is manually reviewed, and data (antibodies, real-time PCR assays, RNA or protein expression data, functional data) are abstracted and entered into the database, along with the Pubmed ID, in the appropriate field. We have developed Taqman real-time PCR assays for 1867 of these genes, making them cross-reactive with as many species as possible (1067 are partially or fully cross-reactive with the human gene). This is to ensure that comparable areas of the gene are analyzed, and also for economic reasons. We also conducted a literature survey to determine the sequences of porcine SYBR green PCR assays. Tissue-specific gene expression summaries are provided for studies using these assays and for other studies (e.g., those using microarray and RNASeq). We also conducted a comprehensive search of catalogs and the published literature to identify antibodies to the corresponding proteins. Last, the “Notes Field” in the database was populated with information such as types of errors discovered, degree of 5′ and 3′ UTR conservation, degree of positive selection pressure in various species, and intron status. When the gene (sequence) is present in the genome but not annotated as a gene, we annotate the gene in the Notes field as “Not an identified gene in Ensembl build 10.2.” or “not an identified gene in NCBI build 10.2”.

Fig. 3 a–d Sample Database Entry

To date, we have generated 9720 full-length transcripts representing 9165 genes (Table 1). They include 1354 genes missing from Ensembl build 10.2 (Table 2 and Additional file 1) and 1400 genes that have been sequenced at least two times (gene duplicated artifacts, shown in Table 2 and Additional file 2) that were annotated as separate genes in either the Ensembl or NCBI builds. Functional enrichment analysis of the 1041 protein-coding genes that are missing from the genome reveals that genes annotated as cytokines (24, p = 0.0053) and transcription factors (68), particularly Homeodomain-like transcription factors (34, p = 0.032) and CENP-B/Helix-turn-helix (HTH) domains (6, p = 0.035), are significantly overrepresented (Table 3). Of note, the great majority of Interleukin 1 Superfamily members (IL1F10, IL1RN, IL36A, IL36B, IL36G, IL36RN, IL37) are significantly (p = 0.0073) overrepresented. Data analyses that do not account for these genes risk overlooking important genes involved in inflammation and development.
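The enrichment values reported here carry a multiple comparison adjustment (Benjamini). As a minimal sketch, and assuming that this refers to the standard Benjamini-Hochberg step-up procedure, adjusted values could be computed as below. DAVID performs its own calculation; the function name and input p-values are hypothetical and are not taken from Table 3.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four annotation terms.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
print(adjusted)
```

Each raw p-value is scaled by the number of tests divided by its rank, so a term tested alongside many others needs a smaller raw p-value to remain notable after adjustment.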

Table 2 Number and Types of Errors Located in Publicly Available Porcine Databases

Based upon gene number estimates from other closely related species such as human and cow, we estimate that our database covers approximately 42% of the porcine genome. These represent sequences found in 10,232 Unigene entries (1.45 per gene) and 9967 NCBI loci (5756 are single loci that are not duplicated gene artifacts or split into multiple loci, and 1793 genes have multiple (4211) loci). A total of 2109 and 1616 of the genes have no assigned Unigene number or NCBI locus, respectively. In addition to GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, literature-based functional annotations (derived from more than 5500 references) are provided for these sequences. We have also discovered a relatively large number (178) of porcine or artiodactyl-specific paralogs (Additional file 3) for 104 protein or non-protein coding porcine genes. For genes with multiple paralogs, genes are named in order of phylogenetic distance from the parent human or bovine gene. Some of these genes are expressed pseudogenes. Some of these genes have been previously discussed (e.g., CD36, IL1B [25, 31]) or will be discussed in the following sections.

The transcripts we have generated for protein-coding genes include, on average, 70.5% of the corresponding 5′ and 3′ ends (each) of the human sequence (Additional file 4). The ORF is 99.4% conserved on a nucleotide count basis. These percentages indicate the fidelity of our procedure. We discovered extensive gene truncation (incomplete ORF) and gene duplicated artifacts (genes sequenced more than once) among the machine-annotated versions of these genes. These problems are common among first drafts of other genomes [32, 33]. Gene duplicated artifacts appear most frequently on chromosomes 12, 2 and 3, and less frequently on chromosomes X, 11, 13 and 1 (Fig. 2). The most frequently affected areas should be targeted for re-sequencing or reassembly. Analysis of the 60 largest porcine proteins in the database shows that gene fragmentation and truncation roughly correlate with protein size and number of exons (Table 4). Figure 4 shows BLAST search results for two extremely large proteins, hemicentin (HMCN1, panel a) and titin (TTN, panel b), that have 9 loci assignments each in the current NCBI build. Surprisingly, these proteins are not represented in Ensembl build 10.2 as annotated genes. Overall, of the 60 largest porcine proteins, only 6 and 10 are represented as single full-length sequences of the correct size in Ensembl and Genbank, respectively. We have deposited 12 de novo assemblies in the TSA archive and have provided in silico predicted RNA and protein sequences for 37 of these genes.

Fig. 4 Hemicentin (a) and Titin (b) Assembly Blasts

Table 4 Extensive Gene Fragmentation/Truncation Frequently Occurs Among Proteins of Extreme Size

In previous studies, we extensively compared porcine, human and mouse genes related to immunity and inflammation [2, 3, 25, 26]. In the following section, we summarize our findings for three major superfamilies or functionally related groups of proteins (CD marker genes, Solute Carrier Superfamily, ATP binding Cassette Superfamily) and one class of non-coding RNA (microRNA) that have complete or nearly complete representation. CD marker genes (accessible as a group by entering CD markers in the Annotations field) encode a heterogeneous group of cell surface proteins. The Human Leucocyte Differentiation Antigen (HLDA) workshop has designated 408 molecules (some of which are grouped within a CD) as CD markers [34]. Based upon our assembly and analysis, we could establish 1:1 orthology for 357 porcine genes to those that compose HLDA version 10. Forty-three genes are not present in the porcine genome or could not be designated as 1:1 orthologs. Of these, nine genes (CLEC4C, CLEC4M, SIGLEC7, BTN3A1, LILRA1, LAIR2, PSG1, SIRPG, TNFRSF10C) are primate-specific [35–37]. KLRC2 (CD159c) is found in humans and rodents but not pigs. FCGR2C is a human-specific gene/pseudogene that belongs to a family of three low-affinity immunoglobulin gamma Fc receptors (CD32) [38]. We have determined that pigs have two members of this family that roughly correspond to FCGR2A and FCGR2B. TNFRSF14 (CD270) is a marker for B cells, dendritic cells, monocytes and Treg cells [39] found in humans and rodents, but not cows. Although canine, feline, equine and ursine homologs have been identified, this gene may be a pseudogene in pigs, as the putative ORF is interrupted by an endogenous retroviral sequence (H. Dawson, unpublished). FCRL2 (CD307b) is a marker for B cells in humans. Although sequences corresponding to FCRL2 have been identified in other mammals, including dog and horse, no mouse ortholog has been identified [40]. This gene shows evidence of positive selection in humans [41] and is most likely a pseudogene in pigs.

Due to rapid evolution and post-speciation gene duplication, no 1:1 orthology could be established for most mouse and pig LILR or KIR family members, including LILRA4 (CD85G) and LILRB4 (CD85K) [42]. Similarly, other than CEACAM1 (CD66) and CEACAM6 (CD66C), no 1:1 orthology could be established for most pig and mouse CEACAM family members (CEACAM3 (CD66D), CEACAM5 (CD66E)). CEACAM8 (CD67) may be a pseudogene, as ESTs in Unigene Ssc.60435 predict a 243 amino acid protein interrupted by several stop codons. CEACAM8 and CEACAM6 were previously determined to have no direct murine orthologs [35]. Several other shared human-pig CD marker orthologs (ADGRE2 (CD312), ADGRE3 (CD313r), CD1A, CD1E, CR1 (CD35), CD58, FCGR2A (CD32), FCAR (CD89), FCRL3 (CD307c), FCRL4 (CD307d), ICAM3 (CD50), NCR2 (CD336), NCR3 (CD337) and TLR10 (CD290r)) have no rodent orthologs [2, 40, 43].

A significant number of errors were discovered in genes encoding porcine orthologs of human CD markers: 25 are not present in Ensembl build 10.2, 88 of the proteins are truncated and 52 are duplicated gene artifacts. Sixty-seven full-length mRNA sequences encoding proteins, assembled from macrophage RNA-Seq reads, have been deposited in Genbank. An additional 79 in silico constructs are provided. Antibody data, gathered from publications or manufacturers or generated in house, are provided for 186 proteins, including 395 monoclonal and 285 polyclonal antibodies. Additional cross-reactivity for 29 proteins is expected because they are >95% similar to human proteins. Several members of the CD marker family also belong to other gene families, including the Solute Carrier and ATP binding Cassette Superfamilies.

The Human Genome Organization’s gene nomenclature committee (HGNC) has assigned 395 genes to the Solute Carrier Superfamily; 21 are pseudogenes and 374 encode proteins (accessible as a group by entering Solute Carrier Superfamily in the Annotations field). These are organized into 52 subfamilies; about 25% are dedicated to nutrient transport. The porcine Solute Carrier Superfamily contains 398 protein-coding members, and all human subfamilies are represented. Forty-two of these genes are present in other porcine genomes but missing from Ensembl build 10.2, 113 are truncated and 58 are duplicated gene artifacts. Sixty-three full-length mRNA sequences, assembled from macrophage RNA-Seq reads, have been deposited in Genbank, and an additional 159 in silico constructs are provided. Forty-two of these genes are missing from all porcine genomes or are present as pseudogenes. Among these genes are UCP1 (thermogenin), a protein involved in non-shivering thermogenesis and a pseudogene in pigs [44], and SLC52A2, a primate-specific riboflavin transporter [45]. Other species-specific genes include eight primate-specific genes (SLC2A14, SLC22A24, SLC35E2, SLC35G3, SLC35G4, SLC35G5, SLCO1B1, SLCO1B7), one human-specific gene (SLC22A25) and 14 mouse- or rodent-specific genes (Slc6a20b, Slc7a12, Slc21a4, Slc22a19, Slc22a21, Slc22a22, Slc22a26, Slc22a27, Slc22a28, Slc22a29, Slc22a30, Slco1a1, Slco1b2 and Slco6b1). SLC25A18 is present in human and rodent genomes but is missing from bovine and porcine genomes. SLC25A52 is present in primate and rat genomes but not mouse. SLC9C2 is a pseudogene in mouse [46]. SLC22A31 is an expressed pseudogene in pigs and is missing in rodents. SLC22A11 is an expressed pseudogene in pigs and a non-expressed pseudogene in mouse. Lastly, SLC23A4, an intestinal nucleobase transporter [47], is a pseudogene in humans but is present in pig, cow and rodent genomes. Several porcine or artiodactyl-specific gene expansions are found in subfamilies (Additional file 3), including SLC7A3 (14 members), SLC7A13 (3 members), SLC22A6 (2 members), SLC22A10 (4 members) and SLC47A1 (2 members). The biological functions of these paralogs remain to be determined; however, the parent genes are involved in amino acid (SLC7A3, SLC7A13) or dipeptide transport (SLC22A6) [48, 49].

The HGNC has assigned 51 genes to the ATP binding Cassette Superfamily; three are pseudogenes and 48 encode proteins (accessible as a group by entering ATP binding Cassette Superfamily in the Annotations field). These are organized into seven subfamilies (A–G); about 20% are dedicated to nutrient (i.e., carotenoid, cholesterol and vitamin A) transport. The porcine ATP binding Cassette Superfamily contains 57 members, and all human subfamilies are represented. Five of these genes are present in other porcine genomes but missing from Ensembl build 10.2, 21 are truncated and 18 are duplicated gene artifacts. Eleven full-length mRNA sequences, assembled from macrophage RNA-Seq reads, have been deposited in Genbank, and an additional 24 in silico constructs are provided.

An analysis of this superfamily revealed that ABCC11 has no murine ortholog [50] and that ABCA8 has no direct rodent ortholog, as the gene has diverged into two paralogs, Abca8a and Abca8b [51]. ABCC4, a prostaglandin E2 transporter [52], has diverged from the parent gene into five paralogs (ABCC4L1, ABCC4L2, ABCC4L3, ABCC4L4 and ABCC4L5; Additional file 3). ABCA10, involved in human macrophage cholesterol transport [53], is a pseudogene in rodents. It may be an expressed pseudogene in pigs, as the predicted protein is half (787 amino acids) the size of human ABCA10 (1543 amino acids), and weak expression (by RNASeq) was detected in macrophages and moderate expression in intestine (H. Dawson, unpublished). ABCA17 is an expressed pseudogene in humans and pigs. Like the Solute Carrier Superfamily, most of the genes in the ATP binding Cassette Superfamily have not been characterized at the functional level. Nevertheless, the similarities and differences in the Solute Carrier and ATP binding Cassette Superfamilies affect the suitability of rodents and pigs as models for human drug and nutrient transport and metabolism.

The exact number of microRNAs in the porcine genome is unknown. There are 4272 annotated microRNAs in the human genome (build 30). Several papers describe the measurement of porcine microRNAs in various tissues or estimate their number in the porcine genome [54–57], and there are three partially overlapping sources of porcine microRNA sequences. There are only 382, 385 and 816 (non-redundant) annotated pig miRNA sequences in Mirbase, the NCBI gene build and Ensembl build 10.2, respectively. These three sources of information overlap substantially (Fig. 5a). We have consolidated this information and added our own predicted sequences, based on conserved sequence identity to 1900 human, mouse or bovine sequences, to provide 1033 non-redundant porcine microRNA sequences (accessible as a group by entering MicroRNA in the Annotations field). Of note, all of the sequences found in Mirbase were found in the NCBI gene build, 59 of the microRNA sequences in Ensembl were found to be duplicated artifacts, and 214 of the 1033 sequences are not present in the current Ensembl gene build (10.2). This includes 81 that we have predicted based upon their presence in other species and other unfinished porcine genomes. We discovered the following species- or genera-specific microRNAs: pigs (454), humans (199), primates (111), bovine (179), mouse (76) and rodents (20). Many of the porcine-specific microRNAs have arisen from biological duplication/expansion (Additional file 3). A comparison of microRNAs that are present in pigs and shared with at least one of the three other species (humans, cows and mice) revealed that 318 microRNAs are shared among the four species, 107 are shared between pigs, humans and cows but not mice, and 34 are shared between pigs, mice and cows but not humans (Fig. 5b). Thus, the frequency of non-conserved microRNA preservation between human and pig is nearly three times that of mouse to pig.
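The species-sharing comparison can be expressed with set operations. The sketch below groups pig microRNAs by the exact combination of other species in which they are also found. The microRNA names and species memberships are hypothetical, and the function name is an assumption made for illustration.

```python
from typing import Dict, FrozenSet, Set


def sharing_pattern_counts(pig: Set[str],
                           others: Dict[str, Set[str]]) -> Dict[FrozenSet[str], int]:
    """Count pig microRNAs by the exact set of other species that also carry them."""
    counts: Dict[FrozenSet[str], int] = {}
    for mirna in pig:
        shared_with = frozenset(
            species for species, members in others.items() if mirna in members
        )
        counts[shared_with] = counts.get(shared_with, 0) + 1
    return counts


# Hypothetical microRNA names, for illustration only.
pig = {"miR-a", "miR-b", "miR-c", "miR-d"}
others = {
    "human": {"miR-a", "miR-b"},
    "cow": {"miR-a", "miR-b", "miR-c"},
    "mouse": {"miR-a", "miR-c"},
}

counts = sharing_pattern_counts(pig, others)
for pattern, n in counts.items():
    print(sorted(pattern) or ["pig only"], n)
```

An empty pattern corresponds to a pig-specific microRNA, and the pattern containing all three other species corresponds to sequences shared among the four species.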

Fig. 5 Analysis of MicroRNA Sequence Origin and Species Similarity. These 3 sources of information for our 1047 MicroRNA sequences have a significant amount of overlap (a) and include 81 that we have predicted based upon their presence in other species and other unfinished porcine genomes. Of these sequences, 454 are unique to pigs, 318 are shared among the four species (b), 55 are shared between humans and pigs but not mice and cows and 25 are shared between mice and pigs but not humans and cows.

The Porcine Translational Research Database is so named because of its unique utility for translating findings made in rodents to pigs and those made in pigs to humans. A comprehensive literature-based survey was conducted to identify genes that have demonstrated function in humans, mice or pigs. The resulting data in the database are documented by >6000 references. The database currently contains 65 data fields for each entry. Our efforts to improve the genome and its annotation are similar to other efforts, for example the sequencing of 12,000 genes to supplement annotation of the pig genome [32, 33, 58] and the de novo assembly of multiple pig genomes to reveal 1737 protein-coding genes that are missing from Ensembl build 10.2 [30]. The online Supplemental data from the latter manuscript were unavailable at the time of preparation of this manuscript, so no comparison could be made. The manual assembly of >9700 RNA sequences has direct practical implications for genomics-based analysis. The state of the current genome build (mis-annotations, duplication artifacts and missing sequences) effectively prohibits its use for aligning RNAseq reads. We have used these sequences to compare gene expression separately from Ensembl 10.2 and have also compared the number of reads obtained from the corresponding templates in Ensembl 10.2. For the great majority of transcripts compared, as expected, our full-length sequences provided a higher level of sensitivity than the corresponding Ensembl sequences (H. Dawson, unpublished).

The full 5′ and 3′ representation of each gene will also allow for characterization of regulatory regions and miRNA target sites. In our estimation, >40% of transcripts in the Ensembl or NCBI genomes do not represent the full-length gene. Our efforts will also allow for further consolidation of porcine Unigene numbers. Currently, each gene is represented by 0 to >10 Unigene assignments, and >10% of genes have more than one.

It is significant that we discovered a large number of errors (about 30% of entries) in the publicly available sequence databases; these can be accessed by searching the “Notes Field” using the word “error” (Fig. 3). In addition to the duplication artifacts, mis-annotations and missing genes, we also encountered a number of RNA sequences in publicly available archives that belong to other species. For example, human (AHR, AF233432.1), panda (IL2, NM_001199892.1) and rat (NUDT14, ESTs in Unigene Ssc.85635) RNA sequences are annotated as porcine derived. We also found contaminating DNA from completely unrelated species. For example, about 1/5 of porcine chromosome 4 clone CU076066.6 is from zebrafish. These sequences represent 6 zebrafish genes (LOC100003615, LOC447815, LOC108179932, LOC108183883, LOC108183971 and LOC103910681) and are annotated as porcine genes by Ensembl build 10.2 (ENSSSCG00000006223) and NCBI genomes (LOC100739857). Similarly, several NCBI loci (ASNA1L*, LOC100737282, LOC100737202, LOC100620149, LOC100737282) and one Ensembl locus (ENSSSCG00000026988) are derived from contaminating Babesia bigemina genomic DNA.

We have discovered several sources of systematic errors in the Ensembl and NCBI gene/protein prediction or annotation pipelines. For example, all selenoproteins in Ensembl are truncated because the codon for selenocysteine (UGA) is translated as a stop codon. We and others have identified a systematic error in the identification of another gene family, the Taste receptor, type 2 (TAS2R) Superfamily. Although these genes are intronless and mostly devoid of 5′ and 3′ UTR regions, Ensembl consistently fails to recognize them as genes [3]. These data illustrate the critical importance of the manual curation process for reducing errors.
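The selenoprotein truncation can be illustrated with a small translation sketch. It uses the standard genetic code and assumes that the caller already knows which in-frame UGA codons encode selenocysteine; real recoding depends on sequence context that is not modeled here. The function name, the `sec_codons` parameter and the example ORF are hypothetical.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(orf: str, sec_codons=frozenset()) -> str:
    """Translate an ORF up to the first stop codon.

    Codon indices (0-based) listed in sec_codons are read as
    selenocysteine (U) when the codon is TGA/UGA.
    Codons containing ambiguous bases raise KeyError.
    """
    protein = []
    for index in range(len(orf) // 3):
        codon = orf[3 * index: 3 * index + 3].upper().replace("U", "T")
        if codon == "TGA" and index in sec_codons:
            protein.append("U")
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical ORF with an in-frame TGA at codon index 2.
orf = "ATGGCTTGAGGTTAA"
naive = translate(orf)                    # TGA treated as a stop codon
recoded = translate(orf, sec_codons={2})  # TGA treated as selenocysteine
print(naive, recoded)
```

Reading the in-frame UGA as a stop ends the protein at that position, which is the truncation pattern described for selenoproteins.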

We believe that this is the largest manually curated database for any veterinary species and that its informatics are unique among those targeting a veterinary species with regard to linking gene expression to gene function, identifying related gene pathways and connecting with other porcine gene databases, as well as for reagents that measure gene and protein expression. In addition, it is the largest source of centralized antibody information for the pig. Any database must be updated frequently in order to be useful. Currently the database is updated monthly, and we anticipate expanding the content to include all porcine genes. Several superfamilies of genes will be the next targets of our efforts. One is the GPCR superfamily; its exact size is still unknown, but nearly 800 different human genes (or ~4% of the entire protein-coding genome) have been predicted to code for GPCRs. We will also continue to develop and annotate new assays. We intend to include our own prediction analysis of the promoter and 3′ UTR regions of RNA for transcription factor and microRNA binding sites. Lastly, we intend to synchronize our database with the porcine “Snowball” array and porcine gene expression atlas [59].

Additional files

Additional file 1: Porcine genes missing in Ensembl build 10.2 of the porcine genome. Gene names and evidence/source for RNA sequence of genes that are missing from Ensembl build 10.2. (XLSX 112 kb)

Additional file 2: Artifactually duplicated genes in Ensembl build 10.2. Gene names, Ensembl and NCBI loci numbers and NCBI genome build 10.2 coordinates of artifactually duplicated genes. (XLSX 282 kb)

Additional file 3: Porcine or artiodactyl-specific paralogs. Gene names, Ensembl and NCBI loci numbers and Build 10.2 NCBI gene coordinates of porcine or artiodactyl-specific paralogs. (XLSX 58 kb)

Additional file 4: 5′, ORF and 3′ end comparison of porcine and human mRNAs. (XLSX 66 kb)

Supported by USDA/ARS Project Plan 1235-51,000-055-00D.

Availability and requirements

The dataset(s) supporting the conclusions of this article are included within the article, its additional files (Additional files 1, 2, 3 and 4) and the online database (http://www.ars.usda.gov/Services/docs.htm?docid=6065).

Authors’ contributions

HDD, CC, BG and JS contributed to the content of the database. HDD and JFU wrote the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Competing interests.

The authors declare that they have no competing interests

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Harry D. Dawson

Celine Chen

Brady Gaynor

Jonathan Shao

Joseph F. Urban, Jr

Full text links 

Read article at publisher's site: https://doi.org/10.1186/s12864-017-4009-7

Data behind the article

This data has been text mined from the article, or deposited into data resources.

BioStudies: supplemental material and supporting data

  • http://www.ebi.ac.uk/biostudies/studies/S-EPMC5568366?xr=true

BioProject (5)

  • (1 citation) BioProject - PRJNA291011
  • (1 citation) BioProject - PRJNA80971
  • (1 citation) BioProject - PRJNA144099
  • (1 citation) BioProject - PRJNA309108
  • (1 citation) BioProject - PRJNA291130

Ensembl Genome Browser (2)

  • (1 citation) Ensembl - ENSSSCG00000006223
  • (1 citation) Ensembl - ENSSSCG00000026988

Nucleotide Sequences (Showing 13 of 13)

  • (1 citation) ENA - SRP013743
  • (1 citation) ENA - JAG69421
  • (1 citation) ENA - JAA53703
  • (1 citation) ENA - JAA53804
  • (1 citation) ENA - JAA53694
  • (1 citation) ENA - JAA53695
  • (1 citation) ENA - JAA53665
  • (1 citation) ENA - JAA53700
  • (1 citation) ENA - JAA53656
  • (1 citation) ENA - JAG69485
  • (1 citation) ENA - JAG69054
  • (1 citation) ENA - JAG69152
  • (1 citation) ENA - JAG69140

Funding 

Funders who supported this work.

Agricultural Research Service (1)

Grant ID: 1235-51000-055-00D


ORIGINAL RESEARCH article

Development of a triplex quantitative reverse transcription-polymerase chain reaction for the detection of porcine epidemic diarrhea virus, porcine transmissible gastroenteritis virus, and porcine rotavirus A

Tingyu Luo

  • State Key Laboratory for Animal Disease Control and Prevention, Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine, National Poultry Laboratory Animal Resource Center, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin, China

Porcine viral diarrhea is caused by many pathogens and can result in watery diarrhea, dehydration and death. Various detection methods, such as polymerase chain reaction (PCR) and real-time quantitative PCR (qPCR), have been widely used for molecular diagnosis. We developed a triplex real-time quantitative reverse transcription PCR (qRT-PCR) for the simultaneous detection of three RNA viruses potentially associated with porcine viral diarrhea: porcine epidemic diarrhea virus (PEDV), porcine transmissible gastroenteritis virus (TGEV), and porcine rotavirus A (PoRVA). The triplex qRT-PCR had R² values of 0.999 for the standard curves of PEDV, TGEV and PoRVA. Importantly, the limits of detection for PEDV, TGEV and PoRVA were 10 copies/μL. The specificity test showed that the triplex qRT-PCR detected these three pathogens specifically, without cross-reaction with other pathogens. In addition, the approach had good repeatability and reproducibility, with intra- and inter-assay coefficients of variation <1%. Finally, this approach was evaluated for its practicality in the field using 256 anal swab samples. The positive rates of PEDV, TGEV and PoRVA were 2.73% (7/256), 3.91% (10/256) and 19.14% (49/256), respectively. The co-infection rate of two or more pathogens was 2.73% (7/256). The new triplex qRT-PCR was compared with the triplex RT-PCR recommended by the Chinese national standard (GB/T 36871-2018) and showed 100% agreement for PEDV and TGEV and 95.70% for PoRVA. Therefore, the triplex qRT-PCR provided an accurate and sensitive method for identifying three RNA viruses potentially associated with porcine viral diarrhea, and it could be applied to diagnosis, surveillance and epidemiological investigation.

1 Introduction

The main pathogens causing diarrhea in piglets are porcine epidemic diarrhea virus (PEDV), porcine transmissible gastroenteritis virus (TGEV) and porcine rotavirus A (PoRVA) ( Zhang et al., 2013 ; Monteagudo et al., 2022 ). Co-infection by these viruses is common in swine and poses a serious challenge for diarrhea control in swine farms ( Zhang et al., 2019 ; Shi et al., 2021 ). PEDV is an enveloped, single-stranded, positive-sense RNA virus belonging to the genus Coronavirus in the family Coronaviridae . In 1978, researchers first isolated PEDV from the intestinal contents of pigs in the UK ( Pensaert and de Bouck, 1978 ). PEDV has spread globally since then, causing watery diarrhea, vomiting, dehydration and death in pigs, resulting in severe economic losses for the swine industry ( Wang et al., 2016 ). The PEDV genome is approximately 28 kb long and comprises seven open reading frames (ORFs). The M gene (ORF5) encodes the membrane protein M and has a relatively conserved sequence, which makes it a suitable molecular detection target for PEDV diagnosis ( Kocherhans et al., 2001 ; Yang et al., 2014 ; Rasmussen et al., 2018 ). TGEV is also an RNA virus that belongs to the Coronavirus genus and Coronaviridae family. It was the first coronavirus identified in pigs and is responsible for porcine viral diarrhea. The genome of TGEV is approximately 28.6 kb in length and comprises nine major ORFs. The N gene (ORF6) encodes the nucleocapsid protein N and is relatively conserved in the TGEV genome ( Eleouet et al., 1995 ; Yount et al., 2000 ). Another porcine enteric virus, PoRVA, is a non-enveloped double-stranded RNA virus belonging to the genus Rotavirus and family Reoviridae . It is one of the major pathogens responsible for severe diarrhea in piglets worldwide ( Vlasova et al., 2017 ; Luo et al., 2023 ). The PoRVA genome is approximately 18.5 kb and has 11 double-stranded RNA segments. 
NSP3 is a relatively conserved gene that plays a key role in viral replication and transcription, and is a common target gene for detecting PoRVA infection ( Ghosh and Kobayashi, 2011 ).

Porcine viral diarrhea caused by these three enteric viruses poses serious health and economic threats to pig farming in China. It is also a challenge to the microbiological quality control of specific pathogen-free pigs for scientific research. To cope with the challenges of PEDV, TGEV and PoRVA, 34 standards have been publicly released in China so far, including 3 national standards, 6 agricultural industry and entry-exit inspection and quarantine industry standards, and 25 provincial local standards. These standards specify the detection methods for the three pathogens, such as reverse transcription polymerase chain reaction (RT-PCR), nested RT-PCR, single real-time quantitative RT-PCR (qRT-PCR) and duplex qRT-PCR. Among the 34 standards, only the Chinese national standard ( GB/T 36871-2018, 2018 ) establishes a triplex RT-PCR technique for simultaneous detection and diagnosis of PEDV, TGEV and PoRVA. In addition, duplex RT-PCR and duplex qRT-PCR techniques for differential diagnosis of dual infections by porcine viral diarrhea viruses were developed in some Chinese provincial local standards, e.g., the Zhejiang provincial local standard ( DB33/T 2254-2020, 2020 ) and the Anhui provincial local standard ( DB34/T 2795-2016, 2016 ). Porcine viral diarrhea is also severe worldwide, which has led to the development of various pathogen detection techniques, such as triplex RT-PCR, nested RT-PCR and qRT-PCR ( Huang et al., 2019 ; Chen et al., 2023 ). Accurate and rapid molecular diagnosis is essential for the prevention and control of the diseases caused by PEDV, TGEV and PoRVA. Therefore, it is necessary to establish a detection method with high specificity, sensitivity and efficiency.

Real-time qPCR is an accurate, sensitive, and rapid method for detecting and quantifying target genomes. Compared to conventional single qPCR, multiplex qPCR can simultaneously detect multiple target genes in a single reaction, offering advantages such as high efficiency, high throughput and cost-effectiveness ( Mackay, 2004 ; Yang et al., 2022 ). Advances in molecular biology techniques have led to widespread use of multiplex qPCR in clinical detection ( Wang et al., 2020 ). In this study, we designed primers and probes based on the conserved fragments of the PEDV M gene, TGEV N gene and PoRVA NSP3 gene, and successfully developed a triplex qRT-PCR based on TaqMan probes. This method was highly sensitive and specific and did not cross-react with the genomes of other swine pathogens. It could be used for diagnosis, epidemiological investigation and microbiological quality control of specific pathogen-free pigs.

2 Materials and methods

2.1 Viral nucleic acids and clinical samples

The genomes (DNA or RNA) of PEDV, TGEV, PoRVA, pseudorabies virus (PRV), porcine circovirus type 2 (PCV2), porcine parvovirus (PPV), porcine deltacoronavirus (PDCoV), Seneca virus A (SVA), Toxoplasma gondii , Leptospira interrogans , Mycoplasma hyopneumoniae , Mycoplasma hyorhinis , Haemophilus parasuis , Streptococcus suis , Pasteurella multocida and Actinobacillus pleuropneumoniae were preserved by the State Key Laboratory for Animal Disease Control and Prevention of China or Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine. In addition, 256 anal swab samples from pigs with or without clinical diarrhea were obtained from Animal Health Testing Center of Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences. The Institutional Review Board of Harbin Veterinary Research Institute or the Regulations on the Administration of Laboratory Animals of China did not require the study to be reviewed or approved by an ethics committee because the samples were collected from animals that were already dead or euthanized for other purposes, and no additional harm or intervention was imposed on the animals.

2.2 Primers and TaqMan probes

To ensure the detection performance of the primers and probes used in the triplex qRT-PCR, the M gene of the PEDV genome, the N gene of the TGEV genome and the NSP3 gene of the PoRVA genome were selected as the detection targets, based on their relative conservation. We obtained 86 PEDV M gene sequences, 46 TGEV N gene sequences and 20 PoRVA NSP3 gene sequences from the GenBank database. We used MegAlign to align them and determine the most conserved regions of each target gene. Using Primer Express 3.0.1, we designed primers and probes for the three viruses with the following conditions: primer length 18–30 bp, primer melting temperature (Tm) 58–62°C, primer GC content 40–60%, product Tm 70–90°C and product size 70–150 bp. We ensured that the probe Tm value was higher than that of the primers. The specificity of the primers and probes was verified using the BLAST tool provided by the National Center for Biotechnology Information. For triplex detection, the probes for the three viral genes were labeled with different 5′-reporter dyes, Victoria Blue (VIC), Cyanine 5 (Cy5) and Fluorescein (FAM), and corresponding 3′-quenchers, Black Hole Quencher 1 (BHQ1), Black Hole Quencher 2 (BHQ2) and Minor Groove Binder (MGB). The triplex RT-PCR recommended by the Chinese national standard ( GB/T 36871-2018, 2018 ) was used to verify the accuracy of the results for clinical samples. The details of the primers and probes are provided in Table 1 . Figure 1 shows the locations of the triplex qRT-PCR primers and probes for PEDV, TGEV and PoRVA in different reference strains.
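The length and GC-content limits listed above are simple to check programmatically. The following is a minimal sketch only: the sequence is a made-up placeholder rather than one of the primers in Table 1, the function names are hypothetical, and Tm is left out because the Tm model used by Primer Express is not described in the text.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def meets_primer_criteria(seq: str) -> bool:
    """Check the stated primer limits: length 18-30 nt and GC content 40-60%."""
    return 18 <= len(seq) <= 30 and 40.0 <= gc_content(seq) <= 60.0


# Hypothetical 20-nt sequence, used only to exercise the checks.
example_primer = "ATGCGTACCTGAGCTTAGCA"
print(len(example_primer), gc_content(example_primer),
      meets_primer_criteria(example_primer))
```

A candidate that fails either limit would simply be rejected before the more expensive Tm and BLAST specificity checks.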

Table 1 . Primers and probes used in this study.

Figure 1 . Alignment of sequences of reference viral strains collected from GenBank database. The locations of the primers/probes specific for PEDV M gene, TGEV N gene and PoRVA NSP3 gene are shown. The positions of the partial nucleotide fragments are indicated by numbers.

2.3 Preparation of standard plasmids

A synthetic gene fragment (PEDV-M-TGEV-N-PoRVA-NSP3) containing partial sequences of the PEDV M gene, TGEV N gene and PoRVA NSP3 gene was synthesized by Sangon Biotech Co., Ltd. (Shanghai, China). This fragment was inserted into the pUC57 cloning vector, forming a standard plasmid (pUC57-PEDV M & TGEV N & PoRVA NSP3 ) for subsequent detection ( Figure 2 ). The sequence of the synthetic gene fragment is shown in Supplementary Table S1 . The plasmids were quantified by ultraviolet absorbance at 260 and 280 nm wavelengths using a NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, United States) and their copy number was calculated based on the size of the standard plasmid template using the following formula: copies/μL = (A260 (ng/μL) × 10⁻⁹ × 6.02 × 10²³)/(DNA length × 650). The standard plasmids were serially diluted 10-fold to a concentration gradient of 10⁸–10⁰ copies/μL with EASY Dilution (TaKaRa, Dalian, China).
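As a worked illustration of the copy-number formula, the sketch below converts a plasmid concentration into copies/μL. The input values (100 ng/μL and a 4,000-bp plasmid) are hypothetical placeholders chosen for easy arithmetic; they are not the measured concentration or the actual size of the standard plasmid.

```python
AVOGADRO = 6.02e23            # molecules per mole, as used in the formula
BP_MASS_G_PER_MOL = 650.0     # average molar mass of one double-stranded base pair


def copies_per_ul(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA concentration in ng/uL to copies/uL."""
    grams_per_ul = conc_ng_per_ul * 1e-9
    return grams_per_ul * AVOGADRO / (length_bp * BP_MASS_G_PER_MOL)


# Hypothetical inputs: 100 ng/uL of a 4,000-bp plasmid.
stock = copies_per_ul(100.0, 4000)
print(f"{stock:.3e} copies/uL")

# Target concentrations of the 10-fold dilution series, 10^8 down to 10^0.
dilution_targets = [10 ** k for k in range(8, -1, -1)]
print(dilution_targets)
```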

Figure 2 . Standard plasmid containing three conserved gene fragments: PEDV M (693 bp), TGEV N (649 bp) and PoRVA NSP3 (114 bp). Each fragment had specific restriction enzyme cutting sites at both ends: NdeI and SacI for PEDV M ; KpnI and BamHI for TGEV N ; and ApaI and XhoI for PoRVA NSP3 .

2.4 Optimization of the triplex qRT-PCR

The unoptimized triplex qRT-PCR consisted of 10 μL 2× One Step U+ Mix (Vazyme, Nanjing, China), 1 μL One Step U+ Enzyme Mix (Vazyme), 0.4 μL 50× ROX Reference Dye II, 0.4 μL of each primer (final concentration of 200 nM), 0.2 μL of each probe (final concentration of 100 nM), 4 μL template and 1.6 μL RNase-free water in a total volume of 20 μL. The amplification was performed on an ABI QuantStudio5 real-time PCR system (Thermo Fisher Scientific) with the following program: 55°C for 15 min, 95°C for 30 s; 40 cycles of 95°C for 10 s and 60°C for 30 s. The fluorescence signal was automatically collected at the end of each cycle. To optimize the reaction system, we varied the volumes of the primer and probe stocks (both 10 μM). Primer volumes of 0.3–1.2 μL were assessed, giving final concentrations of 150–600 nM. Probe volumes were varied from 0.1 to 0.6 μL, giving final concentrations of 50–300 nM. Recombinant plasmids (10⁷ copies/μL) served as the detection template for optimization. Finally, the fluorescence intensity and cycle threshold (Ct) values of each primer and probe concentration were compared to determine the optimal volumes. The annealing temperature was also optimized by setting six gradients from 56 to 61°C and comparing the fluorescence intensity and Ct values of each gradient.
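The stated volumes and final concentrations can be cross-checked with a few lines of arithmetic. This sketch assumes six primers (a forward and a reverse primer for each of the three targets) and three probes, which is how the per-component volumes add up to the stated 20 μL; the function name is illustrative.

```python
def final_conc_nm(volume_ul: float, stock_um: float = 10.0, total_ul: float = 20.0) -> float:
    """Final concentration (nM) of an oligo added from a stock (uM) to a reaction."""
    return volume_ul * stock_um * 1000.0 / total_ul


# Component volumes of the unoptimized 20-uL reaction (uL).
components = {
    "2x One Step U+ Mix": 10.0,
    "One Step U+ Enzyme Mix": 1.0,
    "50x ROX Reference Dye II": 0.4,
    "primers (6 x 0.4)": 6 * 0.4,   # assumed: forward + reverse for 3 targets
    "probes (3 x 0.2)": 3 * 0.2,
    "template": 4.0,
    "RNase-free water": 1.6,
}
total_volume = sum(components.values())

print(round(total_volume, 2))          # total reaction volume
print(round(final_conc_nm(0.4), 1))    # each primer, unoptimized
print(round(final_conc_nm(0.2), 1))    # each probe, unoptimized
print(round(final_conc_nm(0.7), 1))    # optimized TGEV/PoRVA primer volume
```

The same function reproduces the optimization ranges: 0.3–1.2 μL of primer corresponds to 150–600 nM, and 0.1–0.6 μL of probe to 50–300 nM.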

2.5 Establishment of standard curves for the triplex qRT-PCR

On the basis of the optimized reaction and protocol, three replicates of plasmid samples with serial dilutions from 10⁸ to 10¹ copies/μL were detected using the triplex qRT-PCR and subjected to linear regression between Ct values and the logarithm of plasmid copy numbers. Eight-point standard curves were established for PEDV, TGEV and PoRVA, including negative controls.
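The regression described here is an ordinary least-squares fit of Ct against log10(copies/μL). The sketch below shows one way to compute the slope, intercept and R² without external libraries. The Ct values are synthetic, generated from an assumed line (slope −3.3, intercept 40) purely to make the example runnable; they are not data from this study.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct against log10(copies); returns slope, intercept, R^2."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ct_values))
    ss_tot = sum((y - mean_y) ** 2 for y in ct_values)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Eight dilution points, 10^8 down to 10^1 copies/uL.
copies = [10 ** k for k in range(8, 0, -1)]
# Synthetic Ct values from an assumed line; NOT measured data.
ct_values = [40.0 - 3.3 * math.log10(c) for c in copies]

slope, intercept, r_squared = fit_standard_curve(copies, ct_values)
print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

With real triplicate data, the mean Ct per dilution (or all replicate points) would be passed in place of the synthetic values.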

2.6 Specificity of the triplex qRT-PCR

To evaluate the specificity of the primer and probe sets, genomes (DNA or RNA) of PEDV, TGEV, PoRVA, PRV, PCV2, PPV, PDCoV, SVA, T. gondii , L. interrogans , M. hyopneumoniae , M. hyorhinis , H. parasuis , S. suis , P. multocida and A. pleuropneumoniae were tested using the triplex qRT-PCR. All nucleic acid samples were stored previously in our laboratory.

2.7 Sensitivity of the triplex qRT-PCR

For sensitivity assessment, standard plasmids were serially diluted 10-fold to final concentrations ranging from 10⁸ to 10⁰ copies/μL. These dilutions were used as templates to determine the limit of detection (LoD) for the triplex qRT-PCR, and each reaction was repeated three times in a single test. To confirm the LoD, we tested the standard plasmids at 100, 10 and 1 copies/μL 20 times each. We set the LoD as the lowest concentration of standard plasmids that gave positive results in at least 85% of the replicates and marked it on the amplification curves. The threshold was set in the middle of the exponential amplification phase in the logarithmic view. A positive test result was defined as an exponential fluorescence curve that crossed the threshold within 35 cycles (Ct < 35). According to this definition, we calculated the positive rates at 100, 10 and 1 copies/μL of standard plasmids.

2.8 Repeatability and reproducibility of the triplex qRT-PCR

The repeatability (intra-assay precision) and reproducibility (inter-assay precision) of the developed triplex qRT-PCR were determined using standard plasmids at three different concentrations (10⁶, 10⁴ and 100 copies/μL). We analyzed each dilution in triplicate on the same day for intra-assay variability and in six independent experiments by two different operators on different days for inter-assay variability. The coefficient of variation (CV) of the Ct values of the samples at different concentrations was calculated in both intra-assay and inter-assay tests to estimate the repeatability and reproducibility.
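The CV is the standard deviation of the replicate Ct values divided by their mean, expressed as a percentage. A minimal sketch follows; the triplicate Ct values are invented for illustration, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which form was used.

```python
import statistics


def cv_percent(ct_values):
    """Coefficient of variation (%) of replicate Ct values, using the sample SD."""
    return 100.0 * statistics.stdev(ct_values) / statistics.mean(ct_values)


# Hypothetical triplicate Ct values for one dilution; not data from the study.
example_ct = [20.0, 20.1, 19.9]
print(round(cv_percent(example_ct), 2))
```

For the inter-assay CV, the same function would be applied to the Ct values collected across the six independent runs.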

2.9 Detection of clinical samples by the triplex qRT-PCR

Viral RNA was extracted from 256 anal swab samples using the AxyPrep Body Fluid Viral DNA/RNA Miniprep Kit (Corning Life Sciences, Wujiang, China). The RNA samples were tested in triplicate by the optimized triplex qRT-PCR. Subsequently, the sample RNA was reverse transcribed into cDNA using PrimeScript™ RT Master Mix (Perfect Real Time) (TaKaRa) and detected by the triplex RT-PCR recommended by the Chinese national standard ( GB/T 36871-2018, 2018 ), to validate the clinical performance of the developed triplex qRT-PCR assay. For the triplex RT-PCR, the reaction mixture (25 μL) contained 12.5 μL 2× Taq PCR Star Mix (Genstar, Beijing, China), 0.2 μL of each PEDV primer (final concentration of 80 nM), 0.4 μL of each TGEV primer (final concentration of 160 nM), 1 μL of each PoRVA primer (final concentration of 400 nM), 4 μL cDNA template and 5.3 μL RNase-free water. We performed the triplex RT-PCR with the following parameters: pre-denaturation at 94°C for 2 min, followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 55°C for 30 s and extension at 72°C for 1 min, and a final extension at 72°C for 10 min.

3 Results

3.1 Construction of the standard plasmid

A plasmid with three conserved gene fragments of PEDV M (693 bp), TGEV N (649 bp) and PoRVA NSP3 (114 bp) was constructed, all containing their respective qRT-PCR amplicons. The plasmid was used as the standard for subsequent detection.

3.2 Optimization of the triplex qRT-PCR

Using the standard plasmid pUC57-PEDV M & TGEV N & PoRVA NSP3 with the target fragments as the template, we optimized the reaction conditions of the triplex qRT-PCR. Orthogonal experiments determined the optimal annealing temperature and concentrations of primers and probes. For TGEV and PoRVA, the optimal volumes were 0.7 μL for each primer and 0.4 μL for each probe; for PEDV, both were 0.3 μL ( Figure 3 ). The final reaction system is listed in Table 2 . The optimal annealing temperature was 60°C, which yielded the highest amplification efficiency ( Figure 4 ).

Figure 3 . Optimization of the triplex qRT-PCR assay. (A) Optimization of primer volumes and the final concentrations in the reaction. The optimal volumes of forward and reverse primers were 0.7 μL for TGEV (final concentration: 350 nM) and PoRVA (final concentration: 350 nM), and 0.3 μL for PEDV (final concentration: 150 nM). (B) Optimization of probe volumes and the final concentrations in the reaction. The optimal volumes of probes were 0.4 μL for TGEV (final concentration: 200 nM) and PoRVA (final concentration: 200 nM), and 0.3 μL for PEDV (final concentration: 150 nM).

Table 2 . The reaction system of triplex qRT-PCR.

Figure 4 . Triplex qRT-PCR amplification curves at different annealing temperatures. The optimal annealing temperature for the triplex qRT-PCR was determined by comparing the amplification efficiency of the reaction across the tested annealing temperatures. The highest amplification efficiency was achieved at 60°C.

3.3 Establishment of the standard curve

The standard plasmid was diluted in a 10-fold series and eight standard samples (10⁸–10¹ copies/μL) were selected as templates to establish the standard curve of the triplex qRT-PCR. Figure 5 shows the correlation coefficients (R²), equation slopes and amplification efficiencies (E) for each virus: PEDV, 0.999, −3.272 and 102.118%; TGEV, 0.999, −3.294 and 101.179%; and PoRVA, 0.999, −3.22 and 104.441%. The R² and E values indicated a good linear relationship between the logarithm of the initial template copy number and the Ct value.
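The reported efficiencies are consistent with the commonly used relation E = 10^(−1/slope) − 1. The text does not state which formula was applied, so this is an assumption; the sketch below applies it to the reported (rounded) slopes, and small differences in the last decimals are expected because the published slopes are rounded.

```python
def amplification_efficiency(slope: float) -> float:
    """Amplification efficiency (%) from a standard-curve slope: 10^(-1/slope) - 1."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


# Slopes as reported in the text.
reported_slopes = {"PEDV": -3.272, "TGEV": -3.294, "PoRVA": -3.22}

for virus, slope in reported_slopes.items():
    print(virus, round(amplification_efficiency(slope), 2))
```

A slope of about −3.32 corresponds to 100% efficiency, that is, a doubling of product in every cycle.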

Figure 5 . Standard curve for the triplex qRT-PCR assay. (A) Standard curve for PEDV M gene. (B) Standard curve for TGEV N gene. (C) Standard curve for PoRVA NSP3 gene.

3.4 Specificity of the triplex qRT-PCR

Genomic DNA or RNA of 16 porcine pathogens (PEDV, TGEV, PoRVA, PRV, PCV2, PPV, PDCoV, SVA, T. gondii , L. interrogans , M. hyopneumoniae , M. hyorhinis , H. parasuis , S. suis , P. multocida and A. pleuropneumoniae ) was used as a template for the triplex qRT-PCR. Amplification curves were obtained for PEDV, TGEV and PoRVA but not for the other porcine pathogens ( Figure 6 ). Therefore, the triplex qRT-PCR assay was specific for detection of PEDV, TGEV and PoRVA, and had no cross-reaction with other porcine pathogens.

Figure 6 . Specific amplification curves for the triplex qRT-PCR assay. Three fluorescent signals were monitored by triplex qRT-PCR. RNA of PEDV, TGEV and PoRVA was used as a positive control. No fluorescent signal was observed when genomes of other porcine pathogens were used as templates. The graph type was set in linear phase to simultaneously display the three different fluorescent signals (VIC, Cy5 and FAM) with distinct signal intensities. Other pathogens included PRV, PCV2, PPV, PDCoV, SVA, T. gondii , L. interrogans , M. hyopneumoniae , M. hyorhinis , H. parasuis , S. suis , P. multocida and A. pleuropneumoniae.

3.5 Sensitivity of the triplex qRT-PCR

Different concentrations of standard plasmids were used as templates for the triplex qRT-PCR. Table 3 shows that a plasmid concentration of 100 copies/μL resulted in 100% positive detection rates for PEDV, TGEV and PoRVA. At 10 copies/μL, the positive detection rates were 100%, 90% and 95% for PEDV, TGEV and PoRVA, respectively. At 1 copy/μL, the positive detection rates were 75%, 5% and 0% for PEDV, TGEV and PoRVA, respectively. The LoD was defined as the lowest standard plasmid concentration with positive results in at least 85% of 20 replicates. Therefore, the triplex qRT-PCR showed high sensitivity, with a LoD of 10 copies/μL for PEDV, TGEV and PoRVA ( Figure 7 ).
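The LoD rule can be applied directly to the reported detection rates. In this sketch the positive counts are back-calculated from the stated percentages of 20 replicates (for example, 90% of 20 is 18); they are derived values, and the names used are illustrative.

```python
LOD_THRESHOLD = 0.85   # minimum fraction of positive replicates
REPLICATES = 20

# Positive counts per concentration (copies/uL), derived from the reported rates.
positives = {
    "PEDV":  {100: 20, 10: 20, 1: 15},
    "TGEV":  {100: 20, 10: 18, 1: 1},
    "PoRVA": {100: 20, 10: 19, 1: 0},
}


def limit_of_detection(counts, replicates=REPLICATES, threshold=LOD_THRESHOLD):
    """Return the lowest concentration whose positive rate meets the threshold."""
    passing = [conc for conc, n_pos in counts.items()
               if n_pos / replicates >= threshold]
    return min(passing) if passing else None


lod = {virus: limit_of_detection(counts) for virus, counts in positives.items()}
print(lod)
```

PEDV is the informative case: 75% positivity at 1 copy/μL falls below the 85% cut-off, so its LoD is 10 copies/μL like the other two targets.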

Table 3 . Positive detection rates for standard plasmids at 100, 10 and 1 copies/μL across 20 replicates.

Figure 7 . The sensitivity of the triplex qRT-PCR assay. (A) Sensitivity for PEDV M gene. (B) Sensitivity for TGEV N gene. (C) Sensitivity for PoRVA NSP3 gene.

3.6 Repeatability and reproducibility of the triplex qRT-PCR

Three concentrations of standard plasmids, 10⁶, 10⁴ and 100 copies/μL, were used to assess the repeatability and reproducibility of the triplex qRT-PCR. The intra- and inter-assay CVs were 0.08–0.79% and 0.37–0.83%, respectively ( Table 4 ), which indicated good repeatability and reproducibility.

Table 4 . Repeatability and reproducibility evaluation of the triplex qRT-PCR assay.

3.7 Detection of clinical samples

A total of 256 porcine anal swab samples were tested using the triplex qRT-PCR. The positive rates for PEDV, TGEV and PoRVA were 2.73% (7/256), 3.91% (10/256) and 19.14% (49/256), respectively. To verify the accuracy of the method, the clinical samples were also tested by the triplex RT-PCR recommended by the Chinese national standard ( GB/T 36871-2018, 2018 ). The triplex RT-PCR results showed that the positive rates for PEDV, TGEV and PoRVA were 2.73% (7/256), 3.91% (10/256) and 14.84% (38/256), respectively. Both methods detected seven samples co-infected with PEDV and PoRVA. The new triplex qRT-PCR had 100% (PEDV), 100% (TGEV) and 95.70% (PoRVA) agreement with the triplex RT-PCR, indicating that the new approach was accurate, reliable and more sensitive ( Table 5 ).
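The positive rates and agreement figures follow from the sample counts. The sketch below reproduces them under one explicit assumption that the text does not state: every sample positive by the triplex RT-PCR was also positive by the triplex qRT-PCR, so the number of discordant samples equals the difference in positive counts.

```python
TOTAL_SAMPLES = 256

qrt_pcr_positive = {"PEDV": 7, "TGEV": 10, "PoRVA": 49}   # triplex qRT-PCR
rt_pcr_positive = {"PEDV": 7, "TGEV": 10, "PoRVA": 38}    # triplex RT-PCR


def percent(part: int, whole: int) -> float:
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


positive_rate = {v: percent(n, TOTAL_SAMPLES) for v, n in qrt_pcr_positive.items()}

# Assumption: RT-PCR positives are a subset of qRT-PCR positives, so the
# discordant samples are the extra qRT-PCR positives.
agreement = {
    v: percent(TOTAL_SAMPLES - (qrt_pcr_positive[v] - rt_pcr_positive[v]), TOTAL_SAMPLES)
    for v in qrt_pcr_positive
}

for v in qrt_pcr_positive:
    print(v, round(positive_rate[v], 2), round(agreement[v], 2))
```

Under that assumption, the 11 additional PoRVA positives account for the 95.70% agreement (245 of 256 concordant samples).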

Table 5 . Detection of clinical samples by the triplex qRT-PCR and RT-PCR methods.

4 Discussion

PEDV, TGEV and PoRVA are porcine enteric RNA viruses that cause porcine viral diarrhea ( Chen et al., 2010 ; Zhao et al., 2016 ; Zhang et al., 2017 ). Co-infections with various combinations and all three viruses are common in swine herds worldwide ( Song et al., 2006 ; Liu et al., 2019 ; El-Tholoth et al., 2021 ). These co-infections severely compromise the herd immunity and result in an increased risk of secondary infections, higher piglet mortality and significant economic losses, and they are a major concern for the swine industry ( Jung et al., 2008 ; Mesonero-Escuredo et al., 2018 ; Song et al., 2022 ).

Currently, the commonly used molecular detection techniques for PEDV, TGEV and PoRVA include RT-PCR, nested RT-PCR, qRT-PCR, reverse transcription loop-mediated isothermal amplification, reverse transcription recombinase-aided amplification, and CRISPR-Cas nucleic acid detection ( Marthaler et al., 2014 ; Areekit et al., 2022 ; Wu et al., 2022 ; Lazov et al., 2023 ; Xia et al., 2024 ). RT-PCR and nested RT-PCR are not capable of quantitative analysis, and they are cumbersome, time-consuming and less sensitive. Isothermal amplification techniques, including reverse transcription loop-mediated isothermal amplification and reverse transcription recombinase-aided amplification, are prone to false-positive results. CRISPR-Cas nucleic acid detection is expensive and not suitable for large-scale detection, and its multiplex technology is not yet mature. qRT-PCR is a highly specific and sensitive method for quantifying trace amounts of RNA in samples and is the most practical technique for the detection of viral RNA. It displays the results as fluorescent signals, which are easy to interpret. In particular, the multiplex qRT-PCR technique can detect multiple target genes of various pathogens in a single-tube reaction, using specific primers and probes with different fluorescent labels. Researchers have established some multiplex qRT-PCR detection methods for pathogens related to porcine viral diarrhea ( Han et al., 2019 ; Huang et al., 2019 ; Jia et al., 2019 ). Methods that use SYBR Green fluorescent dye require melting curve analysis to validate product specificity, and in practice false-positive signals, dye redistribution and low sensitivity can affect the reliability of their results. By contrast, the one-step TaqMan probe-based multiplex qRT-PCR method allows simultaneous detection of various RNA viruses without a separate reverse transcription step. It is also easier to operate and well suited to routine monitoring of pig diseases.

In this study, we designed three pairs of virus-gene-specific primers and corresponding probes for one-step triplex qRT-PCR, which can simultaneously detect PEDV, TGEV and PoRVA in one tube. We inserted three gene fragments into the same vector to generate a standard plasmid containing three gene targets for the triplex qRT-PCR, rather than using a mixture of three standard plasmids. This approach reduced the preparation cost of standard plasmids and minimized the systematic errors from adding three different plasmids. The sensitivity test revealed a LoD of 10 copies/μL for each pathogen. A strong linear correlation between Ct values and standard copy numbers was demonstrated by the standard curve plots. The primer and probe sequences used in the detection method were highly specific, and the fluorescent dyes VIC, Cy5 and FAM did not interfere with each other. Thus, PEDV, TGEV and PoRVA were accurately detected without cross-reaction with other swine pathogens (PRV, PCV2, PPV, PDCoV, SVA, T. gondii , L. interrogans , M. hyopneumoniae , M. hyorhinis , H. parasuis , S. suis , P. multocida and A. pleuropneumoniae ). Furthermore, we tested 256 porcine anal swab samples with the developed triplex qRT-PCR method to verify its practicality and usefulness in clinical samples. The results indicated that PEDV, TGEV and PoRVA were detected in 7 (2.73%), 10 (3.91%) and 49 (19.14%) samples, respectively. This suggested that PEDV, TGEV and PoRVA persisted in pig herds. Co-infection with two or more of PEDV, TGEV and PoRVA was also common, which can worsen immunosuppression and inflammation, increase the risk of secondary infection by other pathogens, and exacerbate these diseases. This was supported by the detection of seven samples that were co-infected with PEDV and PoRVA in 256 clinical samples. We also tested 256 samples with the triplex RT-PCR detection method recommended by the Chinese national standard ( GB/T 36871-2018, 2018 ). 
The results showed consistency rates of 100% (PEDV), 100% (TGEV) and 95.70% (PoRVA) between the two methods. The triplex qRT-PCR identified more PoRVA-positive samples than the triplex RT-PCR (49 vs. 38), indicating that it was the more sensitive method.

5 Conclusion

We developed a triplex qRT-PCR for simultaneous and differential detection of PEDV, TGEV and PoRVA. This new method is cost-effective, efficient and user-friendly. It can obtain results within an hour regardless of the number of samples to be tested or diagnosed, whether for routine screening or temporary diagnosis of these three pathogens in pig herds. It provides a reliable detection technique for accurate diagnosis, epidemiological investigation and microbiological quality control of laboratory pigs.

Data availability statement

The sequence presented in the study is shown in Supplementary Table S1 ; further inquiries can be directed to the corresponding authors.

Ethics statement

The manuscript presents research on animals that do not require ethical approval for their study.

Author contributions

TL: Investigation, Methodology, Writing – original draft. KL: Methodology, Validation, Writing – original draft. CL: Investigation, Project administration, Writing – original draft. CX: Project administration, Supervision, Writing – original draft. CG: Funding acquisition, Project administration, Supervision, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This study was supported by the National Key Research and Development Program (2021YFF0703000), the Pilot Technology Project of National Pig Technology Innovation Center (NCTIP-XD1C09), and the Special Funds for Basic Scientific Research Operations of Central Public Welfare Scientific Research Institutions (1610302022018).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmicb.2024.1390328/full#supplementary-material

Anhui provincial local standard DB34/T 2795-2016. (2016). Duplex RT-PCR assay for detection of porcine transmissible gastroenteritis virus and porcine epidemic diarrhea virus. (in Chinese). Available at: https://std.samr.gov.cn/search/std?q=DB34%2FT%202795-2016

Areekit, S., Tangjitrungrot, P., Khuchareontaworn, S., Rattanathanawan, K., Jaratsing, P., Yasawong, M., et al. (2022). Development of duplex LAMP technique for detection of porcine epidemic diarrhea virus (PEDV) and porcine circovirus type 2 (PCV 2). Curr. Issues Mol. Biol. 44, 5427–5439. doi: 10.3390/cimb44110368

Chen, Q., Li, J., Fang, X., and Xiong, W. (2010). Detection of swine transmissible gastroenteritis coronavirus using loop-mediated isothermal amplification. Virol. J. 7:206. doi: 10.1186/1743-422X-7-206

Chen, J., Liu, R., Liu, H., Chen, J., Li, X., Zhang, J., et al. (2023). Development of a multiplex quantitative PCR for detecting porcine epidemic diarrhea virus, transmissible gastroenteritis virus, and porcine deltacoronavirus simultaneously in China. Vet. Sci. 10:402. doi: 10.3390/vetsci10060402

Chinese national standard GB/T 36871-2018. (2018). Multiplex RT-PCR to detect porcine transmissible gastroenteritis virus, porcine epidemic diarrhea virus and porcine rotavirus. (in Chinese). Available at: https://std.samr.gov.cn/search/std?q=GB%2FT%2036871-2018

Eleouet, J. F., Rasschaert, D., Lambert, P., Levy, L., Vende, P., and Laude, H. (1995). Complete sequence (20 kilobases) of the polyprotein-encoding gene 1 of transmissible gastroenteritis virus. Virology 206, 817–822. doi: 10.1006/viro.1995.1004

El-Tholoth, M., Bai, H., Mauk, M. G., Saif, L., and Bau, H. H. (2021). A portable, 3D printed, microfluidic device for multiplexed, real time, molecular detection of the porcine epidemic diarrhea virus, transmissible gastroenteritis virus, and porcine deltacoronavirus at the point of need. Lab Chip 21, 1118–1130. doi: 10.1039/d0lc01229g

Ghosh, S., and Kobayashi, N. (2011). Whole-genomic analysis of rotavirus strains: current status and future prospects. Future Microbiol. 6, 1049–1065. doi: 10.2217/fmb.11.90

Han, H., Zheng, H., Zhao, Y., Tian, R., Xu, P., Hou, H., et al. (2019). Development of a SYBR green I-based duplex real-time fluorescence quantitative PCR assay for the simultaneous detection of porcine epidemic diarrhea virus and porcine circovirus 3. Mol. Cell. Probes 44, 44–50. doi: 10.1016/j.mcp.2019.02.002

Huang, X., Chen, J., Yao, G., Guo, Q., Wang, J., and Liu, G. (2019). A TaqMan-probe-based multiplex real-time RT-qPCR for simultaneous detection of porcine enteric coronaviruses. Appl. Microbiol. Biotechnol. 103, 4943–4952. doi: 10.1007/s00253-019-09835-7

Jia, S., Feng, B., Wang, Z., Ma, Y., Gao, X., Jiang, Y., et al. (2019). Dual priming oligonucleotide (DPO)-based real-time RT-PCR assay for accurate differentiation of four major viruses causing porcine viral diarrhea. Mol. Cell. Probes 47:101435. doi: 10.1016/j.mcp.2019.101435

Jung, K., Kang, B. K., Lee, C. S., and Song, D. S. (2008). Impact of porcine group A rotavirus co-infection on porcine epidemic diarrhea virus pathogenicity in piglets. Res. Vet. Sci. 84, 502–506. doi: 10.1016/j.rvsc.2007.07.004

Kocherhans, R., Bridgen, A., Ackermann, M., and Tobler, K. (2001). Completion of the porcine epidemic diarrhoea coronavirus (PEDV) genome sequence. Virus Genes 23, 137–144. doi: 10.1023/A:1011831902219

Lazov, C. M., Papetti, A., Belsham, G. J., Bøtner, A., Rasmussen, T. B., and Boniotti, M. B. (2023). Multiplex real-time RT-PCR assays for detection and differentiation of porcine enteric coronaviruses. Pathogens 12:1040. doi: 10.3390/pathogens12081040

Liu, G., Jiang, Y., Opriessnig, T., Gu, K., Zhang, H., and Yang, Z. (2019). Detection and differentiation of five diarrhea related pig viruses utilizing a multiplex PCR assay. J. Virol. Methods 263, 32–37. doi: 10.1016/j.jviromet.2018.10.009

Luo, S., Chen, X., Yan, G., Chen, S., Pan, J., Zeng, M., et al. (2023). Emergence of human-porcine reassortment G9P[19] porcine rotavirus A strain in Guangdong province, China. Front. Vet. Sci. 9:1111919. doi: 10.3389/fvets.2022.1111919

Mackay, I. M. (2004). Real-time PCR in the microbiology laboratory. Clin. Microbiol. Infect. 10, 190–212. doi: 10.1111/j.1198-743x.2004.00722.x

Marthaler, D., Homwong, N., Rossow, K., Culhane, M., Goyal, S., Collins, J., et al. (2014). Rapid detection and high occurrence of porcine rotavirus A, B, and C by RT-qPCR in diagnostic samples. J. Virol. Methods 209, 30–34. doi: 10.1016/j.jviromet.2014.08.018

Mesonero-Escuredo, S., Strutzberg-Minder, K., Casanovas, C., and Segalés, J. (2018). Viral and bacterial investigations on the aetiology of recurrent pig neonatal diarrhoea cases in Spain. Porcine Health Manag. 4:5. doi: 10.1186/s40813-018-0083-8

Monteagudo, L. V., Benito, A. A., Lázaro-Gaspar, S., Arnal, J. L., Martin-Jurado, D., Menjon, R., et al. (2022). Occurrence of rotavirus A genotypes and other enteric pathogens in diarrheic suckling piglets from Spanish swine farms. Animals 12:251. doi: 10.3390/ani12030251

Pensaert, M. B., and de Bouck, P. (1978). A new coronavirus-like particle associated with diarrhea in swine. Arch. Virol. 58, 243–247. doi: 10.1007/BF01317606

Rasmussen, T. B., Boniotti, M. B., Papetti, A., Grasland, B., Frossard, J., Dastjerdi, A., et al. (2018). Full-length genome sequences of porcine epidemic diarrhoea virus strain CV777; use of NGS to analyse genomic and sub-genomic RNAs. PLoS One 13:e0193682. doi: 10.1371/journal.pone.0193682

Shi, Y., Li, B., Tao, J., Cheng, J., and Liu, H. (2021). The complex co-infections of multiple porcine diarrhea viruses in local area based on the luminex xTAG multiplex detection method. Front. Vet. Sci. 8:602866. doi: 10.3389/fvets.2021.602866

Song, L., Chen, J., Hao, P., Jiang, Y., Xu, W., Li, L., et al. (2022). Differential transcriptomics analysis of IPEC-J2 cells single or coinfected with porcine epidemic diarrhea virus and transmissible gastroenteritis virus. Front. Immunol. 13:844657. doi: 10.3389/fimmu.2022.844657

Song, D. S., Kang, B. K., Oh, J. S., Ha, G. W., Yang, J. S., Moon, H. J., et al. (2006). Multiplex reverse transcription-PCR for rapid differential detection of porcine epidemic diarrhea virus, transmissible gastroenteritis virus, and porcine group A rotavirus. J. Vet. Diagn. Invest. 18, 278–281. doi: 10.1177/104063870601800309

Vlasova, A. N., Amimo, J. O., and Saif, L. J. (2017). Porcine rotaviruses: epidemiology, immune responses and control strategies. Viruses 9:48. doi: 10.3390/v9030048

Wang, Y., Das, A., Zheng, W., Porter, E., Xu, L., Noll, L., et al. (2020). Development and evaluation of multiplex real-time RT-PCR assays for the detection and differentiation of foot-and-mouth disease virus and Seneca Valley virus 1. Transbound. Emerg. Dis. 67, 604–616. doi: 10.1111/tbed.13373

Wang, D., Fang, L., and Xiao, S. (2016). Porcine epidemic diarrhea in China. Virus Res. 226, 7–13. doi: 10.1016/j.virusres.2016.05.026

Wu, X., Liu, Y., Gao, L., Yan, Z., Zhao, Q., Chen, F., et al. (2022). Development and application of a reverse-transcription recombinase-aided amplification assay for porcine epidemic diarrhea virus. Viruses 14:591. doi: 10.3390/v14030591

Xia, Y., Li, Y., He, Y., Wang, X., Qiu, W., Diao, X., et al. (2024). Development of a CRISPR-Cas12a based assay for the detection of swine enteric coronaviruses in pig herds in China. Adv. Biotechnol. 2:7. doi: 10.1007/s44307-024-00015-x

Yang, D., Ge, F., Ju, H., Wang, J., Liu, J., Ning, K., et al. (2014). Whole-genome analysis of porcine epidemic diarrhea virus (PEDV) from eastern China. Arch. Virol. 159, 2777–2785. doi: 10.1007/s00705-014-2102-7

Yang, J., Li, D., Wang, J., Zhang, R., and Li, J. (2022). Design, optimization, and application of multiplex rRT-PCR in the detection of respiratory viruses. Crit. Rev. Clin. Lab. Sci. 59, 555–572. doi: 10.1080/10408363.2022.2072467

Yount, B., Curtis, K. M., and Baric, R. S. (2000). Strategy for systematic assembly of large RNA and DNA genomes: transmissible gastroenteritis virus model. J. Virol. 74, 10600–10611. doi: 10.1128/jvi.74.22.10600-10611.2000

Zhang, Q., Hu, R., Tang, X., Wu, C., He, Q., Zhao, Z., et al. (2013). Occurrence and investigation of enteric viral infections in pigs with diarrhea in China. Arch. Virol. 158, 1631–1636. doi: 10.1007/s00705-013-1659-x

Zhang, Q., Liu, X., Fang, Y., Zhou, P., Wang, Y., and Zhang, Y. (2017). Detection and phylogenetic analyses of spike genes in porcine epidemic diarrhea virus strains circulating in China in 2016–2017. Virol. J. 14:194. doi: 10.1186/s12985-017-0860-z

Zhang, F., Luo, S., Gu, J., Li, Z., Li, K., Yuan, W., et al. (2019). Prevalence and phylogenetic analysis of porcine diarrhea associated viruses in southern China from 2012 to 2018. BMC Vet. Res. 15:470. doi: 10.1186/s12917-019-2212-2

Zhao, Z., Yang, Z., Lin, W., Wang, W., Yang, J., Jin, W., et al. (2016). The rate of co-infection for piglet diarrhea viruses in China and the genetic characterization of porcine epidemic diarrhea virus and porcine kobuvirus. Acta Virol. 60, 55–61. doi: 10.4149/av_2016_01_55

Zhejiang provincial local standard DB33/T 2254-2020. (2020). Method of duplex fluorescence quantitative RT-PCR for the detection of porcine epidemic diarrhea virus and transmissible gastroenteritis virus. (in Chinese). Available at: https://std.samr.gov.cn/search/std?q=DB33%2FT%202254-2020

Keywords: porcine epidemic diarrhea virus, porcine transmissible gastroenteritis virus, porcine rotavirus A, porcine enteric viruses, triplex real-time qRT-PCR

Citation: Luo T, Li K, Li C, Xia C and Gao C (2024) Development of a triplex quantitative reverse transcription-polymerase chain reaction for the detection of porcine epidemic diarrhea virus, porcine transmissible gastroenteritis virus, and porcine rotavirus A. Front. Microbiol. 15:1390328. doi: 10.3389/fmicb.2024.1390328

Received: 23 February 2024; Accepted: 17 April 2024; Published: 10 May 2024.

Copyright © 2024 Luo, Li, Li, Xia and Gao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Changyou Xia, [email protected] ; Caixia Gao, [email protected]
