How to search for images by text description or using a similar image

Building on some earlier articles, let's look at how we can use embeddings not just for text but for images too. In fact, let's take it a stage further and embrace a multi-modal model / tokenizer called "openai/clip-vit-base-patch32" (from the guys at OpenAI and available for download on Hugging Face) for the embeddings.

Python

These steps are required for the Python-specific code.

mkdir embedding
cd embedding/

Creating and activating a virtual environment.

python3 -m venv .
source bin/activate

Installing the necessary packages.

pip install transformers
pip install torch
pip install pillow
pip install numpy
pip install fastapi
pip install uvicorn
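If you want to confirm the installs worked before going any further, a small check like this can help. It's only a sketch: the list of module names is an assumption based on the packages above (note that pillow is imported as PIL), and `missing_modules` is a hypothetical helper rather than part of any of these libraries.

```python
from importlib.util import find_spec

# Import names for the packages installed above (assumption: the "pillow"
# distribution is imported as "PIL", the others under their own names).
REQUIRED_MODULES = ["transformers", "torch", "PIL", "numpy", "fastapi", "uvicorn"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Run it inside the activated virtual environment so it checks the right set of packages.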

Node / JS

These steps are required for the JavaScript-specific code.

mkdir api
cd api/

Installing the necessary packages.

npm install axios
npm install commander
npm install compromise
npm install cookie-parser
npm install cors
npm install dotenv
npm install express
npm install express-es6-template-engine
npm install he
npm install joi
npm install moment
npm install multer
npm install nodemon
npm install pg
npm install pg-hstore
npm install pgvector
npm install sequelize
npm install sequelize-auto
npm install sequelize-pagination
npm install uuid
npm install prettier

Python

Let's create a throw-away Python script that shows us how to use the model / tokenizer to generate embeddings from an image.

test-image-embedding.py

from transformers import AutoProcessor, AutoModelForZeroShotImageClassification
import torch
from PIL import Image

# Load the CLIP processor (image pre-processing plus tokenizer) and the model.
processor = AutoProcessor.from_pretrained("openai/clip-vit-base-patch32")
model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")

image_path = "blonde-woman.png"

image = Image.open(image_path)
inputs = processor(images=image, return_tensors="pt")

# We're only running inference, so skip gradient tracking.
with torch.no_grad():
    image_features = model.get_image_features(**inputs)

# Convert the tensor to a NumPy array for printing.
image_features_np = image_features.numpy()

# The batch holds a single image, so print its embedding.
print(image_features_np[0])

Let's run the script:

python test-image-embedding.py
[-5.40132821e-02 -1.93054989e-01 -3.95455211e-03  1.91171229e-01
 -3.75162125e-01  9.22233611e-03 -6.77868351e-03 -1.44027546e-01
 -6.85096860e-01 -8.37914348e-02  1.77640423e-01  1.25645205e-01
 -3.46377909e-01 -1.06426030e-02  3.37736040e-01 -5.01924753e-01
 -2.18784124e-01  2.61504769e-01 -5.83848804e-02 -1.09245747e-01
  1.24091530e+00  3.88815776e-02 -6.49410486e-02 -5.26623964e-01
 -1.60188228e-01  1.56638145e-01  4.17744070e-01 -2.97322392e-01
  3.41349468e-02  2.08359420e-01 -5.40926814e-01  2.63510168e-01
 -4.04175133e-01  3.67467046e-01 -1.64146096e-01  2.66779751e-01
  1.82682618e-01  2.32544899e-01 -8.85631964e-02  3.41229677e-01
 -3.25820118e-01  3.25782239e-01 -1.00399166e-01 -3.37315708e-01
  7.60497153e-02  5.88586748e-01  3.53650153e-01  1.76775903e-01
 -3.95788670e-01  7.47422576e-02  1.70323253e-01 -4.80736852e-01
  3.90328407e-01 -1.02524199e-01 -4.68754172e-01  6.71297908e-02
  6.51074648e-02  1.48725420e-01 -2.87767947e-01 -4.08542484e-01
  3.17822427e-01  2.42569029e-01  2.53024817e-01 -6.30636588e-02
 -2.72080183e-01  1.74388930e-01  2.75049806e-02 -9.65860009e-01
  1.58574894e-01  6.06836230e-02 -3.32381606e-01  2.68457413e-01
 -5.07304609e-01  6.31776750e-02 -3.61066125e-02 -2.78394133e-01
  1.16228595e-01  4.81769830e-01  1.85684770e-01 -1.47018865e-01
  1.60435230e-01 -1.33053288e-01  1.97314218e-01  6.58724189e-01
 -5.75581118e-02  9.00070518e-02  1.54322326e+00 -4.79513288e-01
  3.89355779e-01  1.57677174e-01  2.44223967e-01 -1.73689425e-01
 -6.77042580e+00  6.37175322e-01  6.37708008e-02 -2.70652354e-01
 -7.96685517e-02 -2.32987404e-01 -7.92721748e-01 -1.04933548e+00
  2.44591787e-01 -6.62040293e-01 -2.64316320e-01 -4.00982589e-01
 -5.97513139e-01 -2.11769968e-01  1.03814578e+00 -1.32618055e-01
  1.26016200e-01  3.51098686e-01 -1.24301620e-01 -1.70093596e+00
  1.29336268e-01  1.94604993e-01 -1.68729663e-01  2.31825486e-01
  5.47995746e-01 -1.33333594e-01  8.10879767e-02 -3.01703572e-01
 -2.95724571e-01  3.98139864e-01 -9.58971530e-02 -1.08954504e-01
 -4.29436088e-01 -1.44610956e-01  3.82165127e-02 -2.05728635e-01
 -1.38953060e-01 -2.42466778e-02  7.01146245e-01 -5.64890504e-01
 -2.79967010e-01  9.18756962e-01 -2.06098072e-02  6.28092408e-01
  1.79266632e-02 -1.78675815e-01  9.51453745e-02 -1.21693417e-01
 -3.74287605e-01 -1.23257786e-02 -1.89352110e-01  2.10022867e-01
 -2.02477545e-01  1.34273767e-01  4.31957319e-02  3.86657000e-01
 -2.51282036e-01  3.98504376e-01 -2.08202928e-01  7.97248930e-02
  7.68799782e-01  3.64595577e-02 -3.01613957e-02 -5.65646648e-01
  3.44998807e-01  3.46816480e-02 -1.95582837e-01  2.49393523e-01
 -1.82594836e-01 -1.31168455e-01 -5.85692525e-02  1.20639294e-01
 -8.85587931e-02  5.55291846e-02  8.23581457e-01  2.85749793e-01
 -2.15871558e-01 -8.33057016e-02  1.37687400e-02  2.75512874e-01
 -4.22006100e-02 -2.50869513e-01  5.02870604e-02 -3.95914316e-02
  9.10528362e-01  4.83117253e-02  2.83215702e-01  2.40392461e-01
  4.55593139e-01 -1.46761909e-01  7.09548593e-01  1.37666523e-01
 -2.25647822e-01  1.03631921e-01  1.63447559e-02 -1.09305717e-01
  2.52433270e-01  2.16893464e-01  2.96454787e-01  4.57040787e-01
  2.77082980e-01  9.31173563e-04 -3.76722664e-02  1.28894463e-01
 -2.51551390e-01 -5.28474808e-01  4.10098463e-01 -5.47439873e-01
 -1.95222080e-01  1.54247344e-01  4.13270950e-01 -3.02311599e-01
  2.29907945e-01 -5.08035123e-01  2.22629935e-01 -1.00009888e-01
  9.34797227e-02  1.66726500e-01  1.11586547e+00  4.47894096e-01
 -2.09018052e-01  4.17754471e-01 -4.44357216e-01 -1.17062956e-01
 -6.20028377e-03  1.18191361e-01  3.16576779e-01 -2.21808419e-01
  1.66782290e-02  6.11949444e-01 -6.92677647e-02  1.11415312e-01
 -3.96258175e-01  1.32691696e-01  1.25753418e-01 -1.14196293e-01
 -2.36799404e-01 -9.57945138e-02  7.86440596e-02 -1.26604736e-02
  7.92662948e-02  1.90126315e-01  1.22421324e-01 -5.18528104e-01
 -8.53140280e-02 -4.20529962e-01  1.04288846e-01 -1.54540062e-01
  3.00026596e-01  1.11398101e-02 -1.07361630e-01 -5.09832144e-01
 -1.15582354e-01 -7.68002331e-01 -1.98700160e-01  2.93628752e-01
 -4.63021874e-01  2.97747366e-02 -8.91805887e-02  2.78878957e-02
  2.31536344e-01 -2.41299450e-01 -1.33296221e-01 -6.87943399e-03
 -2.38735601e-01 -9.13164616e-02 -1.05474532e+00 -3.74551237e-01
  2.15849027e-01  2.37684399e-02  8.90867934e-02 -9.56982136e-01
 -2.10128516e-01 -3.77484918e-01 -4.70648468e-01  1.59837127e-01
 -5.44727892e-02 -2.53751367e-01  6.87814504e-02  2.45265856e-01
  2.55720019e-01  1.84735000e-01 -1.17877781e-01 -3.40856612e-02
 -2.46014178e-01  2.67045856e-01 -6.36359155e-02 -4.24352705e-01
 -3.55958045e-02 -1.09272853e-01 -7.68991947e-01  4.40233618e-01
 -2.67008156e-01 -6.27204031e-02 -3.49275053e-01 -1.02338009e-01
 -3.71907681e-01 -5.87717295e-01 -2.67961472e-01 -5.24523854e-03
 -4.85436805e-02  3.26138794e-01 -2.64503777e-01 -2.90730029e-01
 -6.72322392e-01  1.05043486e-01  5.12615889e-02 -1.37341663e-01
 -1.25858381e-01  1.41833410e-01 -5.66676617e-01 -1.08450904e-01
  9.41905379e-02  8.42899621e-01 -8.25493187e-02  1.47096068e-01
 -8.69344175e-02  5.15298188e-01 -9.83569026e-02  2.25818157e-01
  9.16545212e-01 -1.34785473e-01 -5.43541983e-02  7.77317524e-01
  1.62835568e-01 -1.84425935e-01 -4.82330650e-01 -7.16550112e-01
  2.48820931e-01  2.36239731e-02 -9.44472551e-02 -1.70316190e-01
  4.80618775e-01  2.11296141e-01  2.08745807e-01  5.09166867e-02
  6.24350160e-02 -2.41702929e-01  3.77914429e-01 -3.48765016e-01
  1.94830775e-01  2.99026936e-01 -1.21971816e-01  1.81951791e-01
  1.12946880e+00  3.82078409e-01  4.49622244e-01  5.36007404e-01
  1.13294274e-01 -2.85540342e-01 -2.19394624e-01  1.78703219e-01
 -4.22878526e-02  1.78477407e-01 -2.08368167e-01  2.83520818e-01
 -1.74282804e-01  9.85179916e-02 -2.82502115e-01 -2.02017322e-01
  4.04816628e-01 -5.81855029e-02  9.41279531e-03  4.75115597e-01
  1.09255716e-01  1.01144004e+00  2.65411437e-01 -2.85278767e-01
 -1.51898786e-01 -1.09826326e-01  1.77694798e-01  4.34331954e-01
  7.72409022e-01 -1.56922638e-03 -3.24966818e-01 -1.11104822e+00
 -4.12855536e-01 -2.00007945e-01  9.52722132e-02 -2.04372436e-01
  3.60612571e-01  3.26962471e-01  8.62678140e-02 -5.95970213e-01
  1.61683905e+00  1.04747035e-01 -2.17336237e-01  2.63533220e-02
 -6.13715388e-02 -6.61238670e-01 -2.28930384e-01 -2.49526441e-01
 -1.41146243e-01  5.08370042e-01  5.84921598e-01 -1.98167592e-01
  1.17925346e-01  1.57207072e+00 -4.18866366e-01 -2.29065955e-01
 -1.24670461e-01 -2.19841763e-01 -3.83840203e-01 -1.36076108e-01
  4.36393261e-01 -1.76403493e-01  4.17627618e-02  1.16413474e-01
 -2.03465819e-02 -2.52857059e-02  2.84987867e-01 -3.58117223e-01
  1.81351304e-02 -1.90663040e-01 -1.02527514e-02  1.98255748e-01
  9.24774706e-02  6.57313168e-02 -1.51287496e-01 -5.58991507e-02
  6.23843819e-02  4.06558990e-01  1.49780214e-01  2.58329302e-01
  2.18302280e-01  2.36582294e-01  2.04655513e-01 -1.81472927e-01
 -2.86354125e-02 -4.89747524e-02  3.40251446e-01 -7.93629050e-01
  1.09296322e-01 -5.80129176e-02 -5.86714447e-02  2.34155416e-01
 -3.22000444e-01 -9.33362320e-02 -1.18350074e-01 -8.35705549e-04
  8.91422868e-01  1.07855037e-01 -2.66656220e-01  1.09330118e-01
 -9.29073095e-02 -6.46081567e-02  1.38101697e-01 -3.96992326e-01
 -2.96084285e-01  3.48464325e-02 -4.63098288e-01  4.12537932e-01
 -6.16104305e-02 -2.08837256e-01 -1.79680765e-01 -5.29724061e-02
 -3.15027714e-01  2.51084805e-01  2.40888387e-01  3.85966599e-01
 -2.13772476e-01  3.49766277e-02 -8.26992542e-02  3.52111906e-01
 -4.54889536e-02 -2.41713375e-02 -1.09562993e-01  7.44453222e-02
  1.56931877e-01 -3.50701958e-02  3.26353498e-02 -7.38986492e-01
  6.67737186e-01  6.12415373e-04  3.15411568e-01 -2.33650491e-01
 -1.91711351e-01 -2.39452198e-02  1.34741440e-01  6.45889193e-02
 -1.89972386e-01 -4.44191992e-01 -1.11413486e-01 -1.14576057e-01
 -2.61343271e-01  1.49857491e-01 -1.60966724e-01  6.66997731e-02
 -5.55049181e-01  1.51970565e-01 -3.45902681e-01  7.42579773e-02
  6.39410019e-02  6.14605665e-01  1.69301122e-01 -2.33500630e-01
 -2.39544705e-01 -2.97120929e-01  2.80148119e-01  7.52456039e-02
  5.05358279e-02 -6.12225473e-01  3.51461887e-01  6.72973037e-01
  3.55735064e-01  1.63332045e-01  3.24618012e-01 -2.27422804e-01
 -5.41151106e-01  1.24591038e-01 -1.64012462e-01 -1.93415880e-01
  1.27287912e+00 -2.10435316e-01 -5.51056504e-01 -2.67660290e-01
 -3.00241798e-01  9.22407210e-02  4.66299921e-01  3.70588720e-01]

Here, we're outputting the embedding. Looking good.

Okay. Let's create an embedding API using FastAPI so that we can call this Python code from our Express RESTful API.

import base64
from io import BytesIO

import torch
from fastapi import FastAPI, HTTPException
from PIL import Image
from pydantic import BaseModel
from transformers import AutoProcessor, AutoModelForZeroShotImageClassification

app = FastAPI()

# Load the processor and model once at start-up, not on every request.
processor = AutoProcessor.from_pretrained("openai/clip-vit-base-patch32")
model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")

class SentenceRequest(BaseModel):
    sentence: str


@app.post("/api/generate-embedding")
async def generate_embedding(request: SentenceRequest):
    """
    Generates an embedding for a text sentence or a base64 encoded image.

    Args:
        request: A SentenceRequest object containing the text or base64 encoded image to embed.

    Returns:
        A JSON response containing the embedding.
    """
    try:
        sentence = request.sentence
        try:
            # If the string decodes to a valid image, treat it as an image...
            image = Image.open(BytesIO(base64.b64decode(sentence)))
            inputs = processor(images=image, return_tensors="pt")
            is_image = True
        except Exception:
            # ...otherwise fall back to treating it as plain text.
            # CLIP's text encoder accepts at most 77 tokens, so truncate longer input.
            is_image = False
            inputs = processor(text=sentence, return_tensors="pt", truncation=True)
        with torch.no_grad():
            if is_image:
                features = model.get_image_features(**inputs)
            else:
                features = model.get_text_features(**inputs)
        features_np = features.numpy()
        return {"embedding": features_np[0].tolist()}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
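
The endpoint decides between image and text by simply trying to open the decoded bytes as an image. If you'd rather make that decision explicit, one option is to check the decoded bytes for a known image signature first. This is a minimal sketch using only the standard library; `sniff_image_type` and the short signature table are hypothetical, and the table only covers PNG, JPEG and GIF.

```python
import base64
import binascii

# Leading bytes ("magic numbers") of a few common image formats.
IMAGE_SIGNATURES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "image/gif": b"GIF8",
}


def sniff_image_type(payload: str):
    """Return a mimetype if payload is base64 for a known image format, else None."""
    try:
        # validate=True rejects characters outside the base64 alphabet,
        # so ordinary sentences containing spaces fail fast here.
        raw = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return None  # not valid base64, so treat it as plain text
    for mimetype, signature in IMAGE_SIGNATURES.items():
        if raw.startswith(signature):
            return mimetype
    return None
```

A return value of None would mean "treat the string as text", which matches the fallback branch in the endpoint.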

Node / JS

Okay. So what we've done here is create a separate, stand-alone API on our own internal network. And our "public" API will use it when it needs to generate embeddings.
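
To make the hand-off concrete, here's a rough sketch of a client for that internal API, written in Python with only the standard library. The localhost URL is an assumption (port 7474 is the one used elsewhere in this article), and `build_payload` and `fetch_embedding` are hypothetical helper names.

```python
import base64
import json
import urllib.request

# Assumption: the embedding service is reachable on localhost.
EMBEDDING_URL = "http://localhost:7474/api/generate-embedding"


def build_payload(text=None, image_bytes=None):
    """Build the JSON body: plain text as-is, image bytes as a base64 string."""
    if (text is None) == (image_bytes is None):
        raise ValueError("pass exactly one of text or image_bytes")
    if text is not None:
        sentence = text
    else:
        sentence = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"sentence": sentence}).encode("utf-8")


def fetch_embedding(payload, url=EMBEDDING_URL):
    """POST the payload and return the embedding as a list of floats."""
    request = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["embedding"]
```

Either way the service receives a single "sentence" string, which is why the same endpoint can serve both kinds of search.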

Okay. Let's create a CLI tool we can use to generate the embeddings for the SQL files. This tool will take a description or base64 encoded string, send it to the embedding API and then output the embedding in the format we can use for PostgreSQL.

const { program } = require("commander");
const { post } = require("axios");

program.version("0.0.1").description("A command-line tool for generating embeddings");

program
    .command("generate <sentence>")
    .description("Generates embeddings given a sentence and that sentence might be text or base64 encoded string")
    .action(async (sentence) => {
        const response = await post(
            "http://image_vector_search_example_embeddings:7474/api/generate-embedding",
            {
                sentence: sentence,
            },
        );
        const embeddings = response.data.embedding;
        console.log(`ARRAY[${embeddings.join(", ")}]::vector(512)`);
    });

program.parse(process.argv);

You can use it as follows:

node console/embedding.js
Usage: embedding [options] [command]

A command-line tool for generating embeddings

Options:
  -V, --version        output the version number
  -h, --help           display help for command

Commands:
  generate <sentence>  Generates embeddings given a sentence and that sentence might be text or base64 encoded string
  help [command]       display help for command

Here is an example using a text description:

node console/embedding.js generate "Blonde woman standing in front of a concrete wall"
ARRAY[0.18320761620998383, 0.03161871060729027, -0.24215754866600037, 0.002381939673796296, -0.3815241754055023, 0.11699826270341873, -0.06393322348594666, -0.32643017172813416, -0.06508059799671173, -0.016655657440423965, 0.12588918209075928, 0.08714832365512848, 0.1825539916753769, 0.3111313283443451, 0.1409342736005783, -0.2670823335647583, -0.05960197001695633, -0.057347264140844345, 0.0851236879825592, -0.018667513504624367, 0.40941762924194336, 0.06750501692295074, 0.010939118452370167, -0.21832513809204102, 0.007390803191810846, 0.17483380436897278, 0.013441544026136398, 0.21434886753559113, 0.17463350296020508, 0.3800116777420044, 0.43186306953430176, -0.6189807653427124, -0.09298565983772278, -0.009257301688194275, -0.063480906188488, 0.17932671308517456, -0.005677626468241215, 0.37541261315345764, -0.3637228012084961, 0.3403746485710144, -0.13458044826984406, 0.20003224909305573, 0.21956512331962585, -0.06931434571743011, -0.0733635425567627, 0.1582181453704834, 0.1460110992193222, -0.18139229714870453, -0.055013470351696014, -0.32213377952575684, 0.02394952066242695, -0.4399184286594391, 0.1836244761943817, -0.46314653754234314, -0.23396655917167664, 0.08557330071926117, 0.28264233469963074, -0.08427311480045319, -0.2938283383846283, -0.288257360458374, 0.3792520761489868, -0.18671190738677979, 0.38342440128326416, 0.13260459899902344, -0.26889917254447937, -0.1900402009487152, 0.12867343425750732, -0.34942781925201416, -0.14439786970615387, -0.13558758795261383, -0.11739916354417801, 0.07086839526891708, 0.15885205566883087, -0.023332800716161728, 0.19701221585273743, -0.2463913857936859, 0.31158074736595154, 0.19193068146705627, 0.12548136711120605, -0.21132497489452362, 0.24094906449317932, 0.18685618042945862, 0.27544763684272766, 0.18937289714813232, -0.1478014439344406, -0.008608809672296047, -0.28316086530685425, 0.19025227427482605, 0.07392159104347229, 0.056241899728775024, 0.19267235696315765, 0.40285584330558777, -0.734610378742218, 
0.36875903606414795, 0.530949592590332, -0.2714320421218872, 0.192481130361557, 0.14686863124370575, -0.37205198407173157, -0.33547890186309814, 0.29407593607902527, -0.041811954230070114, 0.048648834228515625, -0.4006766378879547, 0.05889435112476349, -0.38627687096595764, -0.04212673753499985, 0.09793310612440109, 0.44871851801872253, -0.34463536739349365, -0.3357393443584442, -0.17218787968158722, 0.5231924057006836, -0.35198652744293213, -0.6286910176277161, 0.19060413539409637, 0.09887733310461044, -0.011740289628505707, 0.08027203381061554, -0.2311287373304367, 0.018516944721341133, -0.6229166388511658, 0.324867844581604, -0.42825278639793396, -0.07613208144903183, 0.198409765958786, -0.011468782089650631, -0.11272446066141129, 0.272502601146698, 0.4532114565372467, 0.1331571638584137, 0.3604752719402313, -0.2653084993362427, 1.8272792100906372, -0.5559777617454529, 0.5493554472923279, 0.012088047340512276, -0.007365081459283829, 0.052574507892131805, -0.006460706703364849, 0.020373916253447533, -0.098611019551754, -0.9145404696464539, 1.0089466571807861, 0.15015417337417603, -0.07220616191625595, 0.17096614837646484, 0.4602961838245392, -0.5027254223823547, 0.31969448924064636, -0.3716675341129303, -0.24277935922145844, 0.17500081658363342, -0.007330306340008974, -0.4239054024219513, -0.03606951981782913, 0.0037081113550812006, 0.039805926382541656, -0.050204697996377945, 0.4800964295864105, -0.46841755509376526, 0.2082263082265854, -0.03534712269902229, 0.029579732567071915, -0.03793754801154137, 0.36987927556037903, 0.2054104208946228, 0.12230900675058365, -0.4559875726699829, -0.13300426304340363, -0.43109604716300964, -0.2516193985939026, -0.04285894334316254, -0.02255505695939064, -0.14487141370773315, 0.20051859319210052, 0.11653833091259003, -0.20984508097171783, 0.2732856273651123, -0.2821314036846161, -0.23106496036052704, -0.2692067325115204, 0.23294344544410706, 0.23291316628456116, -0.09978078305721283, -0.42728888988494873, -0.3605228066444397, 
-0.06111738458275795, -0.15404172241687775, 0.4982629120349884, 0.04908117651939392, 0.44712162017822266, -0.22099830210208893, -0.4077518880367279, -0.474028617143631, 0.2252333164215088, 0.048096172511577606, -0.18651093542575836, -0.05525001883506775, -0.4308834671974182, -0.05295749381184578, 0.08328722417354584, 0.49740681052207947, -0.09472580254077911, -0.14828002452850342, -0.047292500734329224, 0.5883834958076477, 0.04992053285241127, -0.23273952305316925, 0.21077242493629456, 0.8096718192100525, 0.3225979804992676, 0.04389605298638344, 0.4176429808139801, 0.24365273118019104, -0.21760311722755432, 0.23110586404800415, -0.0115760937333107, 0.3578460216522217, -0.057855233550071716, 0.35614025592803955, 0.025827746838331223, -0.35924452543258667, -0.1638335883617401, -0.6468785405158997, 0.18430623412132263, 0.13102470338344574, -0.011148101650178432, -0.2766386866569519, -0.15531618893146515, -0.14854350686073303, 0.017327124252915382, -0.1750909686088562, 0.0494525209069252, 0.522391676902771, -0.3823597729206085, 0.625869870185852, -0.12237784266471863, -0.4376762807369232, -0.2474554032087326, 0.1449665129184723, 0.17815206944942474, -0.06530159711837769, 0.10113844275474548, -0.20453868806362152, 0.12214796245098114, -0.05643964558839798, 0.3488086462020874, -0.12369988858699799, -0.06253030896186829, 0.02321617119014263, -0.13727237284183502, -0.04255237430334091, 0.00031793946982361376, 0.11082957684993744, 0.13972096145153046, -0.16037924587726593, 0.04480183124542236, 0.17697395384311676, 0.1526934802532196, 0.11159809678792953, -0.1704074889421463, 0.521111011505127, -0.31910401582717896, 0.1905408650636673, -0.16451969742774963, 0.24540314078330994, -0.12473762780427933, 0.22389020025730133, -0.4061351716518402, -0.18735837936401367, 0.3350383937358856, 0.06288840621709824, -0.10963790118694305, 0.061673711985349655, -0.03069629706442356, -0.400185763835907, -0.11688192933797836, 0.024786440655589104, -0.21992020308971405, -0.3526710271835327, 
0.09666597843170166, -0.21711817383766174, 0.015498803928494453, -0.04464738816022873, 0.08089850842952728, -0.4951034486293793, -0.003317497903481126, 0.14044064283370972, -0.4955803453922272, -0.2822767496109009, 0.09655940532684326, -0.15445704758167267, 0.2305513471364975, -0.0567755252122879, -0.4003731906414032, -0.1417245864868164, 0.023199480026960373, 0.17864802479743958, -0.6016622185707092, -0.21647945046424866, -0.012226293794810772, 0.12389914691448212, -0.15582191944122314, -0.30015435814857483, -0.7697263360023499, 0.0674356147646904, 0.057904962450265884, 0.3939399719238281, 0.5404149293899536, 0.19319896399974823, -0.2791910171508789, 1.8278433084487915, 0.30572158098220825, 0.4915773868560791, 0.48899975419044495, -0.2239573895931244, -0.5855585336685181, -0.15145756304264069, -0.32410115003585815, 0.13519792258739471, 0.33430036902427673, -0.07638221979141235, -0.04510635510087013, 0.045370277017354965, 0.2756231129169464, -0.21859993040561676, -0.021257346495985985, -0.15004757046699524, -0.5165911912918091, 0.025063667446374893, -0.463236540555954, -0.09404203295707703, -0.3654698133468628, -0.11659350991249084, 0.11813821643590927, 0.049309343099594116, 0.23712363839149475, 0.3683038353919983, 0.09738539904356003, -0.318829208612442, 0.14393991231918335, 0.08104397356510162, -0.259915828704834, 0.12142622470855713, -0.3120679557323456, 0.1939970850944519, 0.0712263286113739, 0.26806801557540894, -0.07302326709032059, 0.20989781618118286, -0.07551079243421555, 0.33883434534072876, 0.028321625664830208, -0.04849167540669441, -0.019721360877156258, -0.1519397348165512, 0.4897685647010803, 0.17571735382080078, -0.16773556172847748, 0.16563838720321655, -0.36857375502586365, 0.18253324925899506, 0.13409125804901123, -0.1410038024187088, 0.14604905247688293, -0.1321098506450653, -0.15063968300819397, 0.037616413086652756, 0.1624722182750702, -0.044324882328510284, -0.21483537554740906, 0.1227593645453453, -0.10093070566654205, -0.23645834624767303, 
-0.30633944272994995, 0.27131882309913635, 0.16170825064182281, -0.7078282833099365, 0.2671409547328949, 0.25450751185417175, 0.5182300209999084, -0.0952322781085968, -0.27931615710258484, 0.26408806443214417, 0.21787187457084656, -0.028779391199350357, -0.009930015541613102, -0.42213544249534607, 0.19434984028339386, -0.7372940182685852, -0.16505108773708344, -0.14029450714588165, 0.4269119203090668, 0.41406553983688354, 0.21454691886901855, 0.23170803487300873, 0.20499533414840698, 0.061538323760032654, 0.23449143767356873, -0.13078975677490234, -0.12735670804977417, 0.004640067461878061, 0.023091239854693413, 0.19143415987491608, -0.5198346972465515, 0.4803995192050934, -0.2935032248497009, 0.36202722787857056, -0.18045593798160553, -0.2110348343849182, -0.36658555269241333, -0.14479045569896698, 0.16400666534900665, 0.281674861907959, 0.27595600485801697, 0.30481716990470886, 0.10732381045818329, 0.0662091001868248, 0.057474128901958466, 0.1314658522605896, 0.3050023913383484, 0.21605689823627472, -0.356924831867218, 0.7546390891075134, 0.061432626098394394, 0.44972312450408936, 0.3294881284236908, -0.16517288982868195, 0.1717618703842163, 0.20931179821491241, 0.03940201550722122, 0.1584504246711731, 0.015166008844971657, -0.36681002378463745, 0.05253731831908226, -0.14946019649505615, -0.026552796363830566, -0.16984863579273224, -0.11084974557161331, -0.29363372921943665, -0.1455077975988388, -0.3935563564300537, -0.30397215485572815, 0.05908620357513428, 0.06961996853351593, -0.7056646347045898, -0.4824918508529663, -0.3544345498085022, 0.03629116714000702, 0.14571990072727203, 0.04891129955649376, -0.10198657959699631, 0.3325859010219574, 0.09042026102542877, 0.03576710447669029, -0.25503459572792053, 0.009950867854058743, -0.09176511317491531, -0.31916332244873047, 0.42978087067604065, -0.08646591752767563, 0.014205148443579674, -0.10038730502128601, 0.03813614696264267, 0.0994037315249443, -0.1212572529911995, -0.055020857602357864, -0.07485558837652206, 
-0.15094901621341705, 0.12740400433540344, 0.299965500831604, 0.31172463297843933, -0.057152025401592255, -0.3193354308605194, 0.2412383109331131, 0.1514105349779129, 0.02667844295501709, 0.8067172169685364, -0.35850760340690613, -0.23788148164749146, -0.196895033121109, 0.19713708758354187, -0.01743750460445881, -0.05868763476610184, -0.007379439659416676, 0.186065673828125, -0.07089875638484955, -0.04931574687361717, -0.01708357036113739, 0.5363094210624695, 0.3172820508480072, 0.0754103884100914, 0.6118435859680176, 0.18285530805587769, 0.38011351227760315, -0.057567957788705826, -0.18636561930179596, 0.12354051321744919, -0.14239268004894257, -0.01393868587911129, -0.1977919191122055, -0.2816285192966461, 0.13766126334667206, 0.3340049684047699, -0.05721529200673103, -0.04470556229352951, -0.47196903824806213, 0.06261805444955826, -0.06662864983081818, 0.022203583270311356, -0.17256979644298553]::vector(512)
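
The last line of the tool is where the PostgreSQL-friendly format comes from. Here's the same idea sketched in Python, with a length check added since a cast to vector(512) only accepts exactly 512 values. `to_pgvector_literal` is a hypothetical helper, and rejecting non-finite values is an extra assumption of this sketch.

```python
import math


def to_pgvector_literal(embedding, dimensions=512):
    """Format a list of numbers as an ARRAY[...]::vector(n) SQL literal."""
    if len(embedding) != dimensions:
        raise ValueError(f"expected {dimensions} values, got {len(embedding)}")
    values = [float(value) for value in embedding]
    if not all(math.isfinite(value) for value in values):
        raise ValueError("embedding contains NaN or infinity")
    joined = ", ".join(repr(value) for value in values)
    return f"ARRAY[{joined}]::vector({dimensions})"
```

For example, `to_pgvector_literal([0.5, -1.0, 2.0], dimensions=3)` gives `ARRAY[0.5, -1.0, 2.0]::vector(3)`.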

Okay. We're now about half-way. Let's create a simple schema for persisting the images and their embeddings, and then we'll build a simple front-end / back-end to demonstrate the functionality and wrap things up.

We're going to be a bit naughty here and store the image as a base64 string. The mimetype is there to help us output it into an HTML img element. But the key thing is the VECTOR(512) type in the database. The model / tokenizer we're using outputs 512 dimensions so we want to make sure the column uses that too.

CREATE EXTENSION vector;

CREATE TABLE images
(
    id SERIAL PRIMARY KEY,
    mimetype VARCHAR,
    image TEXT,
    embedding VECTOR(512)
);
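
As a rough illustration of what goes into the mimetype and image columns, the sketch below turns a file name and its bytes into those two values, and then builds the data URI an img element would use. It uses only the Python standard library; `image_row_values` and `data_uri` are hypothetical helpers, and guessing the mimetype from the file extension is an assumption of this sketch rather than something the article's code does.

```python
import base64
import mimetypes


def image_row_values(filename, image_bytes):
    """Return the (mimetype, image) pair to store for one image."""
    mimetype, _ = mimetypes.guess_type(filename)
    if mimetype is None or not mimetype.startswith("image/"):
        raise ValueError(f"{filename!r} does not look like an image file")
    # Store the raw bytes as a base64 string, as the TEXT column expects.
    return mimetype, base64.b64encode(image_bytes).decode("ascii")


def data_uri(mimetype, image_base64):
    """Build the value for an img element's src attribute."""
    return f"data:{mimetype};base64,{image_base64}"
```

Keeping the mimetype next to the base64 string means the page can rebuild the data URI without inspecting the image bytes again.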

Let's create some entries into the database.

I've stripped out the gubbins so these are just for illustrative purposes. DO NOT TRY INSERTING!

INSERT INTO images (mimetype, image, embedding) VALUES ('image/png', 'iVBORw0KGgoAAAANSUhEUgAAAOAAAADgCAIAAACVT/22AAAAwXpUWHRSYXcgcHJvZmlsZSB0eXBlIGV4aWYAAHjabVDbDcMgDPz3FB0BPyBmHNJQqRt0/BrsREnTkzg/ddiG/nm/3mHQpor1jnXLQqV1JqE5zDK+OI1KptbUGEethdTMmIaI9SCzSc5EMJVMjohy7xjD55EwU/h9nxpzuA80jxAAAAABJRU5ErkJggg==',ARRAY[0.23651084303855896, -0.06072381138801575, -0.09976153075695038, 0.07470647990703583, -0.15835444629192352, -0.21511252224445343, -0.3690471351146698, -1.135039210319519, -0.5108910202980042, -0.31614992022514343, -0.09622760117053986, -0.011659342795610428, -0.15897372364997864, 0.07606512308120728]::vector(512));
INSERT INTO images (mimetype, image, embedding) VALUES ('image/png', 'iVBORw0KGgoAAAANSUhEUgAAAOAAAADgCAIAAACVT/22AAAAwXpUWHRSYXcgcHJvZmlsZSB0eXBlIGV4aWYAAHjabVDbDcMgDPz3FB0BPyBmHNJQqRt0/BrsREnTkzg/ddiG/nm/3mHQpor1jnXLQqV1JqE5zDK+OI1KptbUGEethdTMmIaI9SCzSc5EMJVMjohy7xjD55EwU/h9nxpzuA80jxAAAAABJRU5ErkJggg==',ARRAY[0.23651084303855896, -0.06072381138801575, -0.09976153075695038, 0.07470647990703583, -0.15835444629192352, -0.21511252224445343, -0.3690471351146698, -1.135039210319519, -0.5108910202980042, -0.31614992022514343, -0.09622760117053986, -0.011659342795610428, -0.15897372364997864, 0.07606512308120728]::vector(512));
INSERT INTO images (mimetype, image, embedding) VALUES ('image/png', 'iVBORw0KGgoAAAANSUhEUgAAAOAAAADgCAIAAACVT/22AAAAwXpUWHRSYXcgcHJvZmlsZSB0eXBlIGV4aWYAAHjabVDbDcMgDPz3FB0BPyBmHNJQqRt0/BrsREnTkzg/ddiG/nm/3mHQpor1jnXLQqV1JqE5zDK+OI1KptbUGEethdTMmIaI9SCzSc5EMJVMjohy7xjD55EwU/h9nxpzuA80jxAAAAABJRU5ErkJggg==',ARRAY[0.23651084303855896, -0.06072381138801575, -0.09976153075695038, 0.07470647990703583, -0.15835444629192352, -0.21511252224445343, -0.3690471351146698, -1.135039210319519, -0.5108910202980042, -0.31614992022514343, -0.09622760117053986, -0.011659342795610428, -0.15897372364997864, 0.07606512308120728]::vector(512));

Okay. Let's flesh out the Express App.

app.post("/", upload.single('file'), async (req, res) => {
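    let sentence = "";
    if (!req.file) {
        // Text search: fall back to an empty string if the field is missing.
        sentence = req.body.sentence || "";
    } else {
        // Image search: send the uploaded file as a base64 encoded string.
        const file = req.file;
        sentence = file.buffer.toString('base64');
    }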
    let matches = [];
    if (sentence.length > 0) {
        // Let's generate the appropriate embeddings...
        const response = await post(
            "http://image_vector_search_example_embeddings:7474/api/generate-embedding",
            {
                sentence: sentence,
            },
        );
        const embedding = response.data.embedding;
        const threshold = 0.1;
        const limit = 10;
        const results = await db.sequelize.query(
            `SELECT id,
                    mimetype,
                    image,
                    embedding,
                    1 - (embedding <=> ARRAY[${embedding.join(", ")}]::vector(512)) AS similarity
             FROM images
             WHERE (1 - (embedding <=> ARRAY[${embedding.join(", ")}]::vector(512))) > ${threshold}
             ORDER BY similarity DESC
                 LIMIT ${limit}`,
        );
        matches = results[0];
    }
    res.render("template", {
        locals: {
            sentence: req.file ? "" : sentence,
            matches
        },
        partials: {
            partial: "/index",
        },
    });
})

Okay. So this is really the brains of the App, pulling everything together. If it's an image that has been uploaded, it will be turned into a base64 encoded string and we'll get the embeddings for it. And if it's a text prompt, we'll get the embeddings for it (but skipping the base64 encoding step). Either way, it's a string being sent to the Embeddings API, which is multi-modal so it will work with both. And then the code searching for similar images based on the embeddings is practically a copy / paste from our previous tutorial.
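
If you're wondering what that similarity number actually is: pgvector's <=> operator returns the cosine distance, so 1 minus it is the cosine similarity. The sketch below reproduces the query's filter / sort / limit logic in plain Python on tiny made-up vectors. `cosine_similarity` and `search` are hypothetical helpers, and all-zero vectors aren't handled.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity, i.e. 1 minus the cosine distance."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, rows, threshold=0.1, limit=10):
    """Rank (id, embedding) rows like the SQL query: filter, sort, limit."""
    scored = [(row_id, cosine_similarity(query, emb)) for row_id, emb in rows]
    # WHERE similarity > threshold
    matches = [item for item in scored if item[1] > threshold]
    # ORDER BY similarity DESC
    matches.sort(key=lambda item: item[1], reverse=True)
    # LIMIT
    return matches[:limit]
```

Because cosine similarity ignores vector length, the embeddings don't need to be normalised before they're stored.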

And here's the markup inside the index view.

<main>
    <div class="column controls">
        <div class="column">
            <form method="post" action="/">
                <label for="sentence">Search Text</label>
                <input type="text" value="${sentence}" name="sentence" id="sentence" />
                <button>Search</button>
            </form>
        </div>
        <div class="column">
            <form enctype="multipart/form-data" method="post" action="/">
                <label for="file">Search Image</label>
                <input type="file" value="" name="file" id="file" />
                <button>Search</button>
            </form>
        </div>
    </div>
    <div class="row matches">
        ${matches.map((match) => (`
        <div class="column match">
            <img alt="image of person" src="data:${match.mimetype};base64,${match.image}" />
            <p>${match.similarity}</p>
        </div>
        `)).join('')}
    </div>
</main>

Figure: Search By Image or Text Vector Embedding

It isn't the prettiest UI / UX but you can search by text or by uploading a similar image. This could be a powerful feature for the right set of requirements.
