What is LangChain?

With the recent surge in AI interest, LangChain has been coming up a lot in my work conversations. I wasn't familiar with the technology, so I watched this short video, which helped me understand the concept.

At its core, LangChain is an open-source framework designed to enable AI developers to integrate Large Language Models (LLMs) like GPT-4 with other computational sources and datasets. As the name suggests, LangChain facilitates the chaining of language models with other functionality.

LangChain consists of three key parts:

  1. Components: These include LLM wrappers, prompt templates, and indexes for retrieving relevant information.
  2. Chains: These allow developers to assemble components to tackle specific tasks.
  3. Agents: These enable LLMs to interact with their environment, such as making API requests with specified actions.
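
To make the first two parts concrete, here is a minimal sketch in plain Python. It does not use the LangChain library itself; `fake_llm`, `make_prompt_template`, and `make_chain` are hypothetical names invented for illustration, and the "model" is a stub that just echoes its prompt. It only shows the idea of wrapping a model, filling a prompt template, and chaining the two together.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM wrapper; a real one would call a model API."""
    return f"[model reply to: {prompt}]"


def make_prompt_template(template: str) -> Callable[..., str]:
    """Return a function that fills the named slots in `template`."""
    def fill(**values: str) -> str:
        return template.format(**values)
    return fill


def make_chain(prompt: Callable[..., str], llm: Callable[[str], str]) -> Callable[..., str]:
    """Chain two components: build the prompt, then pass it to the model."""
    def run(**values: str) -> str:
        return llm(prompt(**values))
    return run


explain = make_chain(make_prompt_template("Explain {topic} in one sentence."), fake_llm)
print(explain(topic="vector databases"))
```

An agent would go one step further and let the model's reply decide which function to call next; that part is left out here to keep the sketch short.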

LangChain is useful because there can be limitations with the data that LLMs are trained on. That data may be outdated, and it won't include proprietary information. LangChain gives developers a way to incorporate their own data into the mix, drawing on entire databases.

Setting up a vector database

To create a database for use by LangChain, text data is first split into smaller chunks, and each chunk is then converted into an embedding via an embedding model. An embedding is a vector (list) of floating-point numbers. Embeddings are then stored in a vector database. Many existing databases can serve as vector stores, such as Cloud SQL, AlloyDB, Spanner, Bigtable, Redis, and Firestore.
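
As a rough illustration of the chunk, embed, and store steps, here is a small standard-library sketch. Everything in it is an assumption made for the example: `chunk_text` and `toy_embed` are hypothetical helpers, the "embedding" is just a hashed word count (a real embedding model captures meaning, this one does not), and the "vector database" is an in-memory Python list.

```python
import hashlib
import math


def chunk_text(text: str, max_words: int = 8) -> list[str]:
    """Split text into chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str, dims: int = 16) -> list[float]:
    """Toy embedding: hash each word into one of `dims` buckets, then L2-normalize."""
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


document = (
    "LangChain connects language models to external data. "
    "Embeddings are vectors of floating point numbers stored in a vector database."
)

# In-memory stand-in for a vector database: a list of (chunk, embedding) pairs.
store = [(chunk, toy_embed(chunk)) for chunk in chunk_text(document)]

for chunk, vector in store:
    print(len(vector), chunk)
```

In a real setup the embedding call would go to an embedding model and the pairs would be written to one of the databases listed above.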

Creating a pipeline

Screenshot from https://youtu.be/aywZrzNaKjs?si=f676mvll4YrPrIz3

To illustrate a typical pipeline with LangChain:

  1. A user asks a question
  2. The question is converted into a vector representation, which is then used to run a similarity search and retrieve related information from a vector database.
  3. Both the question and the retrieved information are sent to the LLM, such as GPT-4.
  4. The LLM can proceed to answer the question or execute an action.
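
The four steps above can be sketched end to end in plain Python. This is a toy under stated assumptions, not LangChain's API: `toy_embed` is a hypothetical word-count "embedding", `retrieve` does a brute-force cosine similarity search over a small list instead of querying a vector database, and `fake_llm` is a stub in place of a model such as GPT-4.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(text.lower().replace("?", "").replace(".", "").split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Step 2: embed the question and return the k most similar chunks."""
    q = toy_embed(question)
    return sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)[:k]


def fake_llm(prompt: str) -> str:
    """Hypothetical LLM stub; a real call would send the prompt to a model."""
    return f"[model answer based on a {len(prompt)}-character prompt]"


def answer(question: str, chunks: list[str]) -> str:
    """Steps 3 and 4: send the question plus retrieved context to the model."""
    context = "\n".join(retrieve(question, chunks))
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return fake_llm(prompt)


chunks = [
    "Refunds are processed within five business days.",
    "Our office is closed on public holidays.",
    "Support is available by email and chat.",
]

question = "How long do refunds take?"  # Step 1: the user asks a question
print(retrieve(question, chunks))
print(answer(question, chunks))
```

The important design point is that the model never sees the whole dataset, only the few chunks the similarity search judged most relevant to the question.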

LangChain enables developers to chain all of these parts together and build applications that are not only data-aware but also capable of taking actions. This makes it possible to build applications such as:

  • Personal assistant apps for tasks such as booking flights, studying, or managing taxes
  • Line of business apps that leverage customer data to improve employee workflows
  • Productivity apps that connect to existing APIs, such as Google Workspace