Apple collaborates with NVIDIA to research faster LLM performance

0

In a blog post today, Apple engineers have shared new details on a collaboration with NVIDIA to implement faster text generation performance with large language models.

Apple published and open sourced its Recurrent Drafter (ReDrafter) technique earlier this year. It represents a new method for generating text with LLMs that is significantly faster and “achieves state of the art performance.” It combines two techniques: beam search (to explore multiple possibilities) and dynamic tree attention (to efficiently handle choices).

more…

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.