Image synthesis method based on multiple text description

admin · Posted 2024-12-14 11:59

Document title: Image synthesis method based on multiple text description
Abstract: To address the low quality and structural errors of images generated from a single text description, a multi-stage generative adversarial network model was adopted, and interpolation across different text sequences was proposed, so that features extracted from multiple text descriptions enrich the given description and give the generated images more detail. To generate images more relevant to the text, a multi-caption deep attentional multi-modal similarity model was introduced to obtain attention features, which were combined with the visual features of the preceding layer as the input to the next layer, improving both the realism of the generated images and their semantic consistency with the text descriptions. In addition, a self-attention mechanism was incorporated so that the model learns to coordinate the details at each position, allowing the generator to produce images that better match real-world scenes. The optimized model was verified on the CUB and MS-COCO datasets; the generated images have complete structures, stronger semantic consistency, and richer, more diverse visual effects.

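Authors: NIE Kaiqin, NI Zhengwei
Affiliation: School of Information and Electronic Engineering, Zhejiang Gongshang University, Hangzhou 310018, Zhejiang, China
Journal: Telecommunications Science (电信科学) (indexed: ISTIC, PKU)
Year, volume (issue): 2024, 40(5)
Classification number: TP183
Keywords: text-to-image; generative adversarial network; computer vision; semantic consistency; self-attention
Machine-assigned classification numbers: TP391.41; TN911.73; TP183
Online publication date: July 1, 2024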

