monkeyjump labs

2023-12-21

Llamas, Haystacks, and Hugging Faces: AI Names You May Want to Know

Breaking down the peculiar names in applied AI

Written by Emily Copeland

The range of artificial intelligence tools and capabilities is vast, and Large Language Models (LLMs) in particular are improving at a rapid pace. New players keep joining the effort to develop LLMs, and that list of players is growing, too.

Amid that complexity, keeping track of who is involved in the AI tools space and what each one offers can make your head spin, especially when you hear bewildering names like "Llamas," "Haystacks," and "Hugging Faces." What do you need to know about these tools? How do their capabilities affect what AI can do for you?

We're going to unpack information about these new(er) AI players so you can understand what they offer to companies looking to get into or level up within the AI space.

Quick note: we're not going to talk about the more established, foundational tools here, such as TensorFlow, Python, and PyTorch.

BIG PICTURE:

What categories of tools exist that we will be talking about?

  • Platform Communities and Toolsets
  • ML, neural network (CNN, RNN, and so on), and LLM models
  • Hosted "black box" services
  • Libraries and SDKs

BIG PLAYERS:

Who is hosting the major LLMs?

  • OpenAI: provides LLMs and other models
  • Microsoft: key partner of OpenAI; provides AI/ML platform capabilities
  • Meta: provides an open source LLM in Llama 2, plus other AI models
  • Google: provides an LLM in Bard, along with other AI models
  • Anthropic: develops and provides human-focused LLMs, such as Claude

We want to provide information about these AI players that is helpful and easy to understand so you can make informed decisions as you open up your company to the world of AI. The information below will be brief, but loaded with value.

Let’s dive in.

The Peculiar Names of Applied AI

Llama 2:

Open source

Good for: Enhanced context-aware interactions

The letters in Llama contain “LLM,” and the name (originally styled LLaMA) stands for “Large Language Model Meta AI.” Meta AI created Llama as a family of LLMs, and the first version was not released as open source. It was trained on large amounts of text to generate natural-sounding output for chatbots and other interactive AI applications.

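Llama 2 is Meta’s second generation of the model. It is openly available (Meta releases it under its own community license, which carries some usage restrictions) and is similar in purpose to OpenAI’s GPT-4, but smaller in scale. Llama 2 uses a neural network to predict the text that should come next, which makes its responses seem very human-like for the prompt and settings it’s given.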
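To make "predicting the text that comes next" a little more concrete, here is a deliberately tiny sketch. It is not how Llama 2 works internally: instead of a neural network, it simply counts which word most often follows another in a made-up sample sentence. The function names and the sample text are our own, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    options = followers.get(word.lower())
    if not options:
        return None
    return options.most_common(1)[0][0]


# Made-up training text; a real LLM learns from vastly more data.
corpus = "the llama eats grass . the llama eats hay . the llama sleeps"
model = train_bigrams(corpus)
print(predict_next(model, "llama"))
```

A real LLM does the same kind of guessing, but with billions of learned parameters and far more context than a single preceding word.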

Then you have LlamaIndex: a library used to facilitate ETL/ELT processes (extract, transform, load; in other words, moving data from one place to another) that prepare your data for use with an LLM. Despite the name, it is an independent project rather than a Meta product, and it works with many different LLMs. It is licensed under an open-source license.
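If ETL is a new term for you, the three steps can be sketched in a few lines of plain Python. This is only an illustration of the extract, transform, load idea; it does not use the library described above, and every name in it (the functions, the file names, the sample text) is hypothetical.

```python
def extract(raw_sources):
    """Extract: pull the text out of each source (here, strings already in memory)."""
    return [{"source": name, "text": text.strip()} for name, text in raw_sources.items()]


def transform(documents, chunk_size=8):
    """Transform: split each document into chunks of at most `chunk_size` words."""
    chunks = []
    for doc in documents:
        words = doc["text"].split()
        for start in range(0, len(words), chunk_size):
            piece = " ".join(words[start:start + chunk_size])
            chunks.append({"source": doc["source"], "text": piece})
    return chunks


def load(chunks):
    """Load: build a simple word -> chunk positions index that an app could query."""
    index = {}
    for position, chunk in enumerate(chunks):
        for word in set(chunk["text"].lower().split()):
            index.setdefault(word, []).append(position)
    return index


# Hypothetical sources standing in for files, web pages, or database rows.
raw_sources = {
    "faq.txt": "Llamas are domesticated animals from South America. "
               "They are often used as pack animals.",
    "notes.txt": "LLMs predict text.",
}
chunks = transform(extract(raw_sources))
index = load(chunks)
print(len(chunks), index["pack"])
```

In a real project, the "load" step would typically write to a vector store or search index rather than a Python dictionary.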

Finally, there is LlamaHub, a community platform for sharing implementations of ETL processes (such as data loaders) for LlamaIndex. It is licensed under an open-source license.

Hugging Face:

Varied licenses

Good for: Community knowledge-sharing

Hugging Face is a newer player that has partnered with Meta to make Llama 2 available to its users. Hugging Face aims to be the GitHub of AI models: a community where researchers, data scientists, and machine learning engineers can share and engage.

For consumers and implementers of AI models, Hugging Face is a platform that will become increasingly important as they use training datasets, share and collaborate on models, and deploy public-facing AI models. Those three components (datasets, models, and PaaS/SaaS services) form the core of Hugging Face.

LangChain:

Open source

Good for: Generative data summarization (chatbots)

LangChain is one of the most widely used Python libraries for building applied LLM products. Essentially, you can create applications that adapt to data changes and use the latest advancements in natural language processing. Upstream, LangChain relies on models (like those from OpenAI and others) to do its work. It combines those models with various other capabilities, such as pairing a large language model with a math tool and a Wikipedia lookup, to produce a full-featured experience and meaningful products.

In general, LangChain efficiently configures prompts for chatbots and anything else that requires generative data summarization, making them interactive and compatible.
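The idea of combining a language model with a math tool and a lookup tool can be sketched without any library at all. The example below is a plain-Python illustration of the pattern, not LangChain's actual API: the "router" is a hard-coded stand-in for the decision a real LLM would make, and the small fact table stands in for a Wikipedia lookup. All names are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

# Stand-in for a Wikipedia lookup; a real tool would call an external service.
FACTS = {"llama": "A llama is a domesticated South American camelid."}


def math_tool(expression):
    """Evaluate a simple 'number operator number' expression such as '6 * 7'."""
    left, op, right = expression.split()
    return str(OPS[op](float(left), float(right)))


def lookup_tool(term):
    """Return a stored fact for the term, or a fallback message."""
    return FACTS.get(term.lower(), "No entry found.")


def fake_llm_router(question):
    """Stand-in for the LLM: decide which tool fits the question and what to pass it."""
    lowered = question.lower()
    if lowered.startswith("calculate "):
        return "math", question[len("calculate "):]
    if lowered.startswith("what is a "):
        return "lookup", question[len("what is a "):].rstrip("?")
    return "none", question


TOOLS = {"math": math_tool, "lookup": lookup_tool}


def run_chain(question):
    """Route the question to a tool, run it, and label the answer with the tool used."""
    tool_name, tool_input = fake_llm_router(question)
    if tool_name == "none":
        return "I don't have a tool for that."
    return f"[{tool_name}] {TOOLS[tool_name](tool_input)}"


print(run_chain("Calculate 6 * 7"))
print(run_chain("What is a llama?"))
```

In a real application, the language model itself reads the question and the tool descriptions, then chooses the tool; the framework handles the plumbing between them.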

Haystack:

Open source

Good for: Large information systems

Similar to LangChain, Haystack is a framework for building applications on top of LLMs and AI. While it has a different license, it is also a Python library for applied LLM products, with slightly different patterns for implementing products and use cases.

Haystack uses pipelines to retrieve, read, and generate information. To make life (and application building) easier, deepset, the company behind Haystack, has ready-made pipelines available for download.
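A retrieve, read, generate pipeline can be pictured as three small steps handing work to one another. The sketch below is plain Python written for illustration only; it is not Haystack's API. The documents, the word-overlap scoring, and the function names are all assumptions, and the "generate" step is just a text template where a real pipeline would call an LLM.

```python
# Hypothetical document store; a real system would hold many more documents.
DOCUMENTS = [
    "Haystack is an open source framework for building search and question answering systems.",
    "Llamas are domesticated animals from South America.",
    "LangChain is a Python library for building LLM applications.",
]


def retrieve(query, documents, top_k=1):
    """Retrieve: rank documents by how many words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def read(passages):
    """Read: pick the evidence to answer from (here, simply the top-ranked passage)."""
    return passages[0]


def generate(query, evidence):
    """Generate: format an answer (a real pipeline would prompt an LLM here)."""
    return f"Q: {query}\nA (based on retrieved text): {evidence}"


def run_pipeline(query):
    """Run the three steps in order, passing each step's output to the next."""
    passages = retrieve(query, DOCUMENTS)
    evidence = read(passages)
    return generate(query, evidence)


print(run_pipeline("where are llamas from"))
```

The value of the pipeline pattern is that each step can be swapped out on its own, for example replacing the word-overlap retriever with a proper search index.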

Semantic Kernel:

Open source

Good for: A more personalized user experience

A kernel (yes, like popcorn) is the core of an operating system. Microsoft's stab at an applied LLM framework builds on this concept: it aims to make applied LLM capabilities available outside of Python, with Microsoft's own spin. Instead of only using Python, Semantic Kernel also lets people use C#. Like LangChain, it builds "chains" that can be glued together to make a full LLM-powered experience. Semantic Kernel can be used to help improve online searches, suggest content you might like, and make AI chatbots and assistants more human-like in their responses.

Ready to put AI to work for you?

Having trouble determining if/when/how AI could advance your business? Book a 30-minute call with us to discuss the revenue opportunities or cost savings in your digital space.

If we determine on the call that AI enhancement is viable, we will then schedule a 4-hour workshop to bring our teams together and get to work.