Microbial communities account for most of the organisms in our environment and have a great capacity to transform the world around them. However, a large number of these microbes remain uncultivable, and identifying and functionally validating uncultured microbial communities in environmental samples remains a challenging task. Metagenomics has become an indispensable tool for studying the diversity and function of these uncultured microbes. Our sample was collected from a public toilet in an undeveloped area of North Coimbatore District, Tamil Nadu, India. Waste effluents from this public toilet were used to generate electricity for the area using a Microbial Fuel Cell (MFC), which relies on biofilm formation to help electron transfer. Our study aims to identify the diverse microbial community present in the urine sample and to functionally annotate the gene responsible for biofilm formation, which aids the electron transfer process. As metagenomic studies become widespread, a large number of computational tools and a wide range of bioinformatics pipelines have been developed for metagenomic assembly and for taxonomic and functional annotation. Choosing an appropriate tool is crucial and requires validation. In the present study, a range of bioinformatics tools were used to quantify the sequence reads, to classify the microbial population in the urine sample taxonomically, to assemble the mixed sequence reads from multiple species, and to functionally annotate the hypothetical protein.
