Large Multimodal Models (LMMs) are essentially LLMs that also accept other modes of input, such as images or audio. This class of models is growing quickly, and the most widely used one is probably GPT-4. However, there are a number of other LMM offerings, both closed and open source, including LLaVA and Qwen-VL.

See more models tagged with LMM