Multi-modal neural network with double-loop learning that fuses vision and text, integrates external APIs for computational knowledge, and is optimized for training on single-GPU or NPU-accelerated ...