Abstract: Image-text matching is a vital task in multi-modal intelligence. Recently, researchers have moved beyond simply aligning fragments between image regions and text words at a low level. They ...
Required packages: follow the requirements.txt file. # from Google Drive # https://drive.google.com/file/d/1nscytAhDEPdDCn5K9nfT7d_RFTXoQZcZ/view?usp=drive_link pip ...