This work is being carried out to develop an Auto-photo-creator, a system intended to lend a hand to physically challenged people. Initially, we used an Attentional Generative Adversarial Network (AttnGAN), updated with a DenseNet architecture, for text-to-image conversion, along with a Deep Attentional Multimodal Similarity Model (DAMSM) that calculates the image-text matching loss and thereby helps produce high-quality images. With the DenseNet update, the loss was reduced by 1.62% and each iteration completed 768 seconds faster than with the existing Inception CNN architecture. We have further extended our work by building a system that accepts voice or text as input: voice input is first converted to text, and the images are then generated from that text. The underlying process is supported by Artificial Intelligence (AI), Machine Learning (ML) and Natural Language Processing (NLP). The proposed work is expected to be helpful for people with hearing impairment and visual loss.
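
To make the role of the matching loss more concrete, the following is a minimal sketch of a sentence-level image-text matching loss in the spirit of DAMSM. It is not the implementation used in this work: the function names, the toy embeddings, the averaging over pairs, and the smoothing factor `gamma=10.0` are all assumptions made for illustration, and the word-level attention terms of DAMSM are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def logsumexp(xs):
    """Numerically stable log(sum(exp(x))) over a list of floats."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def matching_loss(image_vecs, text_vecs, gamma=10.0):
    """Illustrative sentence-level matching loss (assumed simplification).

    image_vecs[i] and text_vecs[i] are treated as a matched pair; every
    other combination in the batch is treated as a mismatch. gamma is an
    assumed smoothing factor, not a value taken from this work.
    """
    n = len(image_vecs)
    # Scaled similarity of every image against every sentence in the batch.
    sim = [[gamma * cosine(img, txt) for txt in text_vecs] for img in image_vecs]
    loss = 0.0
    for i in range(n):
        row = sim[i]                         # image i against all sentences
        col = [sim[j][i] for j in range(n)]  # sentence i against all images
        loss -= row[i] - logsumexp(row)      # -log P(sentence i | image i)
        loss -= col[i] - logsumexp(col)      # -log P(image i | sentence i)
    return loss / n


# Toy embeddings (hypothetical): two images and two sentences.
images = [[1.0, 0.0], [0.0, 1.0]]
matched_texts = [[1.0, 0.0], [0.0, 1.0]]
swapped_texts = [[0.0, 1.0], [1.0, 0.0]]

matched = matching_loss(images, matched_texts)
swapped = matching_loss(images, swapped_texts)
```

In this toy setting the loss is small when each sentence embedding lines up with its own image embedding and large when the pairs are swapped, which is the behaviour a matching loss is meant to reward during training.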
