A Perspective for OEMs
As technology evolves and greater computing power becomes available in more efficient and accessible forms, several areas present themselves as potential ground for adopting Gen AI and LLM-based approaches. However, two important aspects must be considered when adopting LLMs.
The first is choosing the right LLM. This is a serious exercise that requires a thorough understanding of both the candidate model and the areas where it will be used, and the underlying research must be detailed. Our work in this area shows clear differences in inference timings and accuracies at the subtask level when different LLMs are compared. Table 1 illustrates some of our findings in this direction.
"Subjective Score" is the human evaluation score obtained by comparing output from ChatGPT and the listed LLMs on various generic NLP tasks.
Our teams created comparison charts for various NLP tasks with popular open-source LLMs that can be deployed on moderate GPU configurations. We found that:
The smaller model (Flan-T5) performs at acceptable levels on non-generative tasks such as information extraction, text classification, and language translation, while
LLaMA2-7B performed well on most NLP tasks, with the exceptions of arithmetic reasoning and translation.
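A task-level comparison like the one above can be driven by a small benchmarking harness that times each model on each task and collects outputs for later human grading. The sketch below is illustrative only: the model callables and prompts are hypothetical stand-ins, and in practice each callable would wrap a real LLM inference call (for example via Hugging Face `transformers`).

```python
import time

def run_benchmark(models, tasks):
    """Time each (model, task) pair and collect outputs for human scoring.

    `models` maps a model name to a callable taking a prompt string;
    `tasks` maps a task name to a prompt. Both are stand-ins here.
    """
    results = []
    for model_name, infer in models.items():
        for task_name, prompt in tasks.items():
            start = time.perf_counter()
            output = infer(prompt)
            elapsed = time.perf_counter() - start
            results.append({
                "model": model_name,
                "task": task_name,
                "latency_s": round(elapsed, 4),
                "output": output,  # later graded by a human for the subjective score
            })
    return results

# Hypothetical stand-ins for real LLM inference calls.
models = {
    "flan-t5-small": lambda p: f"[flan-t5] {p}",
    "llama2-7b": lambda p: f"[llama2] {p}",
}
tasks = {
    "classification": "Classify the sentiment: 'The brakes feel responsive.'",
    "translation": "Translate to German: 'engine control unit'",
}
rows = run_benchmark(models, tasks)
```

Keeping latency and output side by side in one record makes it straightforward to build the kind of per-subtask comparison table shown above once human scores are attached.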
The second area of importance is ensuring that LLMs are trained properly for improved accuracy. It is worth remembering that LLMs are particularly effective at capturing the complex relationships between words and phrases in natural language, and can achieve strong results across a variety of NLP tasks. Training LLMs is data-intensive, requiring significant computing resources and specialized algorithms to process large amounts of data.
Table 2 (below) illustrates the evaluation of LLM performance using foundation models. Our teams created a comparison chart for summarization and Q&A tasks with popular LLMs that can be deployed on moderate GPU configurations. The focus was on striking a balance between inference time and accuracy, and we found the 8-bit quantized LLaMA 2 to be more memory efficient while retaining acceptable accuracy.
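The memory saving from 8-bit quantization can be estimated from the parameter count alone. The sketch below uses rough numbers (a 7-billion-parameter model, counting weights only and ignoring activation and KV-cache memory) to show why an 8-bit model fits comfortably on a moderate GPU where an fp16 copy may not.

```python
def weight_memory_gib(n_params, bytes_per_param):
    """Approximate weight memory in GiB (weights only, no activations/KV cache)."""
    return n_params * bytes_per_param / 1024**3

N = 7_000_000_000  # roughly the size of LLaMA2-7B

fp16 = weight_memory_gib(N, 2)  # 16-bit floats: 2 bytes per parameter
int8 = weight_memory_gib(N, 1)  # 8-bit quantized: 1 byte per parameter

print(f"fp16: {fp16:.1f} GiB, int8: {int8:.1f} GiB")  # int8 halves the weight footprint
```

In practice the total footprint is somewhat higher than this weight-only estimate, but the 2x ratio between fp16 and int8 weights is what makes quantization attractive on moderate GPU configurations.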
"Subjective Score" is the human evaluation score obtained by comparing output from ChatGPT and the listed LLMs on various generic NLP tasks.
As indicated earlier, improving the accuracy of LLMs can significantly improve the performance of natural language processing applications. More accurate machine translation, for example, can have a significant impact on cross-border communication and global business, while more accurate search and text classification systems help streamline and revitalize information retrieval and knowledge discovery.
As LLMs continue to evolve rapidly, another area where we can effectively leverage their capabilities is vision computing for Autonomous Driving / Advanced Driver Assistance Systems (AD/ADAS). This includes using the Segment Anything Model (SAM) for image segmentation, or leveraging Contrastive Language-Image Pre-training (CLIP) or Florence for image classification, segmentation, and generation. The scale of these models is clear from their parameter counts, ranging from about 100 million for SAM to about 1 billion for Florence. Figure 2 illustrates the foundation models for vision computing.
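CLIP-style models classify an image by embedding it and a set of candidate text labels into a shared vector space, then picking the label whose embedding is most similar to the image embedding. The sketch below shows only this matching step with hypothetical toy vectors; in a real AD/ADAS pipeline the embeddings would come from a pretrained CLIP image encoder and text encoder.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def zero_shot_classify(image_emb, label_embs):
    """Return the label whose text embedding is closest to the image embedding."""
    return max(label_embs, key=lambda lbl: cosine(image_emb, label_embs[lbl]))

# Hypothetical toy embeddings; real ones come from CLIP's encoders.
label_embs = {
    "pedestrian": [0.9, 0.1, 0.0],
    "traffic light": [0.1, 0.9, 0.2],
    "vehicle": [0.0, 0.2, 0.9],
}
image_emb = [0.85, 0.15, 0.05]  # an image embedding close to "pedestrian"

print(zero_shot_classify(image_emb, label_embs))  # → pedestrian
```

Because the label set is just a list of text prompts, new object categories can be added without retraining, which is one reason these foundation models are attractive next to classical dense CNN pipelines.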
Figure 2: Foundation Model for Vision Computing: AD/ADAS
All of these models significantly outperform classical dense CNN/GAN models on most of the tasks needed to enable the next level of AD/ADAS performance.