In today’s tech-driven world, data plays a crucial role in decision-making, especially with advancements in artificial intelligence (AI) and machine learning (ML). Traditional databases are great for structured data, but they fall short when handling unstructured or complex data like images, text, and audio. This is where vector databases come in.

In this guide, we’ll explain what vector databases are, why they’re important, and how to use them in your projects. We’ll also walk you through performing basic CRUD (Create, Read, Update, Delete) operations using Node.js, Express, and Pinecone, a popular vector database.

What is a Vector Database?

A vector database stores data as high-dimensional vectors. These vectors represent unstructured data (like text or images) in a way that computers can process. Unlike traditional databases that rely on exact matches, vector databases excel at finding similar data based on mathematical proximity, making them perfect for AI-driven applications like recommendation systems, image searches, and natural language processing.

For example, imagine searching for articles similar to one you’re reading. A vector database compares the “vector” of your current article with others in the database to find the closest matches.
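To make "closest matches" concrete, here is a minimal sketch of the comparison using cosine similarity, one common proximity measure. The three-number vectors and article titles are made up for illustration; real embeddings have hundreds of dimensions, and a vector database uses indexing structures rather than this brute-force loop.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Values near 1 mean the vectors point in nearly the same direction.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // similarity is undefined for zero vectors
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Ranks articles by similarity to the query vector and keeps the top K.
function findClosest(queryVector, articles, topK = 2) {
  return articles
    .map((article) => ({
      title: article.title,
      score: cosineSimilarity(queryVector, article.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Hypothetical toy data, not real embeddings.
const articles = [
  { title: "Hiking in the Alps", vector: [0.9, 0.1, 0.0] },
  { title: "Budget travel tips", vector: [0.7, 0.3, 0.1] },
  { title: "Intro to SQL joins", vector: [0.0, 0.2, 0.9] },
];
const current = [0.8, 0.2, 0.0];

console.log(findClosest(current, articles));
```

The two travel articles rank above the SQL article because their vectors point in nearly the same direction as the current one.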

Why Use Vector Databases?

Vector databases are becoming essential because they perform similarity searches, handle high-dimensional data, and integrate with AI models.

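What Are Vectors and Embeddings?

A vector is an ordered list of numbers. An embedding is a vector produced by a machine learning model to represent a piece of data, such as a sentence or an image, so that similar items end up close together in the vector space.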

Building a CRUD Application with Pinecone and Node.js

We’ll now build a simple blog post application that interacts with a vector database. Here’s the step-by-step process.

Step 1: Set Up Your Project

  1. Create a new Node.js project:

```bash
mkdir vector-db-crud
cd vector-db-crud
npm init -y
```

  2. Install the required packages:

```bash
npm i express @pinecone-database/pinecone @huggingface/inference body-parser dotenv morgan cors nanoid
```

    • express: For building APIs.
    • @pinecone-database/pinecone: For working with Pinecone.
    • @huggingface/inference: To convert data into embeddings.

Step 2: Set Up Your Folder Structure

Organize your project like this:

```
vector-db-crud/
├── controllers/ # Logic for CRUD operations
├── routes/      # API routes
├── server.mjs   # Main server file
├── .env         # Environment variables
└── package.json
```

Step 3: Create the Server File

In server.mjs, set up the Express server:

```javascript
import "dotenv/config";
import express from "express";
import cors from "cors";
import bodyParser from "body-parser";
import morgan from "morgan";
import postRoutes from "./routes/postRoutes.mjs";

const app = express();
const PORT = process.env.PORT || 7000;

app.use(morgan("dev")); // log each request
app.use(bodyParser.json()); // parse JSON request bodies
app.use(cors()); // allow cross-origin requests

// Mount the post routes under /api
app.use("/api", postRoutes);

app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});
```

Step 4: Define Routes

In routes/postRoutes.mjs, create routes for the CRUD operations:

```javascript
import express from "express";
import {
  createPost,
  getAllPosts,
  updatePostById,
  deleteAllPosts,
} from "../controllers/postController.mjs";

const router = express.Router();

router.post("/create-post", createPost);
router.get("/get-all-posts", getAllPosts);
router.patch("/update-post/:id", updatePostById);
router.delete("/delete-all-posts", deleteAllPosts);

export default router;
```

Step 5: Implement Controllers

In controllers/postController.mjs, define the logic for CRUD operations:

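  1. Create a Post (POST):

```javascript
import { nanoid } from "nanoid";
import { HfInference } from "@huggingface/inference";
import { Pinecone } from "@pinecone-database/pinecone";

// Clients shared by all controllers. The environment variable names below are
// assumptions; use whatever names your .env file defines.
const inference = new HfInference(process.env.HF_ACCESS_TOKEN);
const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY });
const pcIndex = pinecone.index(process.env.PINECONE_INDEX_NAME);

export const createPost = async (req, res) => {
  try {
    // Convert the post text into an embedding vector. The index dimension
    // must match the size of the vectors this model returns.
    const vector = await inference.featureExtraction({
      model: "sentence-transformers/distilbert-base-nli-mean-tokens",
      inputs: `${req.body.title} ${req.body.description}`,
    });

    const record = {
      id: nanoid(),
      values: vector,
      metadata: {
        title: req.body.title,
        description: req.body.description,
        createdAt: new Date().toISOString(),
      },
    };

    await pcIndex.upsert([record]);
    res.json({ message: "Post created successfully!" });
  } catch (error) {
    res.status(500).json({ message: error.message });
  }
};
```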
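The createPost controller reads `req.body.title` and `req.body.description` directly, so a missing field would be embedded as the literal text "undefined". A minimal sketch of a guard follows, assuming both fields should be non-empty strings. `validatePostBody` is a hypothetical helper, not part of the original code.

```javascript
// Hypothetical helper: checks the two fields createPost reads
// before they are sent to the embedding model.
function validatePostBody(body) {
  const errors = [];
  for (const field of ["title", "description"]) {
    const value = body?.[field];
    if (typeof value !== "string" || value.trim() === "") {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  return { valid: errors.length === 0, errors };
}

console.log(validatePostBody({ title: "Hello" }));
```

One way to use it is to call it at the top of the controller and respond with `res.status(400).json({ errors })` when `valid` is false.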

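  2. Get All Posts (GET):

```javascript
export const getAllPosts = async (req, res) => {
  try {
    // query() needs a vector (or a record id) to compare against. A zero
    // vector is used here as a placeholder to list records; 768 is assumed
    // to match the embedding size of the model used in createPost.
    const data = await pcIndex.query({
      vector: new Array(768).fill(0),
      topK: 100,
      includeMetadata: true,
    });
    res.json({ posts: data.matches });
  } catch (error) {
    res.status(500).json({ message: error.message });
  }
};
```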
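Each entry in `data.matches` carries an `id`, a similarity `score`, and the stored `metadata`. If the client only needs plain post objects, the matches can be flattened before they are returned. The sketch below assumes the metadata fields written by createPost; `toPost` is a hypothetical helper and the sample match is invented for illustration.

```javascript
// Hypothetical helper: flattens one query match into a plain post object.
function toPost(match) {
  const { title, description, createdAt } = match.metadata ?? {};
  return { id: match.id, title, description, createdAt, score: match.score };
}

// Made-up sample shaped like a query response, for illustration only.
const sampleMatches = [
  {
    id: "abc123",
    score: 0.42,
    metadata: {
      title: "First post",
      description: "Hello",
      createdAt: "2024-01-01T00:00:00.000Z",
    },
  },
];

console.log(sampleMatches.map(toPost));
```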

  3. Update a Post (PATCH):

```javascript
export const updatePostById = async (req, res) => {
  try {
    // Only the title in the metadata is changed; the stored vector is left as-is.
    await pcIndex.update({
      id: req.params.id,
      metadata: { title: req.body.title },
    });
    res.json({ message: "Post updated!" });
  } catch (error) {
    res.status(500).json({ message: error.message });
  }
};
```
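Because the update changes only the metadata, the stored vector still reflects the old title. If similarity results should follow the new title, the embedding can be rebuilt at update time. The sketch below shows one way to do that. It assumes the index's `fetch` response exposes `records` keyed by id and that `update` accepts new `values`. `updatePostWithEmbedding`, `embed`, and `index` are hypothetical names, injected so the logic can be tried with stand-ins instead of live services.

```javascript
// Sketch: re-embed a post when its title changes.
// embed: (text) => Promise<number[]>
// index: object with Pinecone-style fetch() and update() methods
async function updatePostWithEmbedding({ embed, index }, id, newTitle) {
  const fetched = await index.fetch([id]);
  const existing = fetched.records?.[id];
  if (!existing) return null; // nothing stored under this id

  // Rebuild the text the same way createPost does: title + " " + description.
  const description = existing.metadata?.description ?? "";
  const values = await embed(`${newTitle} ${description}`);

  await index.update({ id, values, metadata: { title: newTitle } });
  return { id, title: newTitle, description };
}
```

In the controller file, `embed` could wrap the same `inference.featureExtraction` call used in createPost, and `index` could be `pcIndex`.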

  4. Delete All Posts (DELETE):

```javascript
export const deleteAllPosts = async (req, res) => {
  try {
    // Removes every record in the index's default namespace.
    await pcIndex.deleteAll();
    res.json({ message: "All posts deleted!" });
  } catch (error) {
    res.status(500).json({ message: error.message });
  }
};
```
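With the server running, the four endpoints can be exercised from any HTTP client. The sketch below builds the requests in JavaScript. It assumes the server listens locally on port 7000 (the fallback in server.mjs) and that the global `fetch` of Node.js 18 or later is available. `buildRequest` and `callApi` are hypothetical helper names.

```javascript
// Assumption: local server on the default port with the /api prefix.
const BASE_URL = "http://localhost:7000/api";

// Builds the URL and fetch options for each route defined in Step 4.
function buildRequest(action, { id, body } = {}) {
  const routes = {
    create: { method: "POST", path: "/create-post" },
    list: { method: "GET", path: "/get-all-posts" },
    update: { method: "PATCH", path: `/update-post/${encodeURIComponent(id)}` },
    remove: { method: "DELETE", path: "/delete-all-posts" },
  };
  const route = routes[action];
  if (!route) throw new Error(`Unknown action: ${action}`);

  const options = { method: route.method };
  if (body !== undefined) {
    options.headers = { "Content-Type": "application/json" };
    options.body = JSON.stringify(body);
  }
  return { url: BASE_URL + route.path, options };
}

// Sends the request and parses the JSON response.
async function callApi(action, params) {
  const { url, options } = buildRequest(action, params);
  const response = await fetch(url, options);
  return response.json();
}

// Example (requires the server to be running):
// callApi("create", { body: { title: "Hello", description: "My first post" } })
//   .then(console.log);
```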

Bonus: Adding Search

Vector databases enable powerful searches based on similarity. For example, searching “travel” might return related posts like “adventure” or “vacation” based on their embeddings.
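A search endpoint follows the same pattern as the other controllers: embed the search text, then ask the index for the nearest vectors. The sketch below is one possible shape, not the only one. `makeSearchHandler` is a hypothetical factory, and `embed` and `index` are injected so the handler can be tried with stand-ins before wiring in Hugging Face and Pinecone.

```javascript
// Sketch of a similarity-search handler for Express.
// embed: (text) => Promise<number[]>
// index: object with a Pinecone-style query() method
function makeSearchHandler({ embed, index, topK = 5 }) {
  return async (req, res) => {
    try {
      const text = String(req.query.q ?? "").trim();
      if (!text) {
        return res.status(400).json({ message: "Query parameter q is required" });
      }
      // Embed the search text with the same model used for the posts,
      // so the query and the stored vectors live in the same space.
      const vector = await embed(text);
      const data = await index.query({ vector, topK, includeMetadata: true });
      res.json({ posts: data.matches });
    } catch (error) {
      res.status(500).json({ message: error.message });
    }
  };
}
```

In the controller file, `embed` could wrap the `inference.featureExtraction` call from createPost and `index` could be `pcIndex`. The handler could then be mounted on a route such as `/search` (a route name assumed here) and called with `?q=travel`.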

Why Vector Databases Matter

Vector databases are revolutionizing how we handle data. Their ability to perform similarity searches, handle high-dimensional data, and integrate with AI makes them a must-have for modern applications.

Start integrating vector databases into your projects today and unlock the potential of smarter, AI-driven solutions.
