
feat(oracle): Vector datatype support#6

Open
hjamil-24 wants to merge 25 commits into v6 from vector_support


@hjamil-24 hjamil-24 commented Feb 3, 2025

Pull Request Checklist

  • Have you added new tests to prevent regressions?
  • If a documentation update is necessary, have you opened a PR to the documentation repository?
  • Did you update the typescript typings accordingly (if applicable)?
  • Does the description below contain a link to an existing issue (Closes #[issue]) or a description of the issue you are solving?
  • Does the name of your PR follow our conventions?

Description of Changes

Vector support for Sequelize

This PR adds vector support to Sequelize v6 for Oracle Database. To support similarity search, the following functions are introduced and can be called through sequelize.fn:

  • sequelize.fn('COSINE_DISTANCE', <columnName>, <vectorValue>)
  • sequelize.fn('INNER_PRODUCT', <columnName>, <vectorValue>)
  • sequelize.fn('L1_DISTANCE', <columnName>, <vectorValue>)
  • sequelize.fn('L2_DISTANCE', <columnName>, <vectorValue>)
  • sequelize.fn('VECTOR_DISTANCE', <columnName>, <vectorValue>)

These functions take the following arguments:

  • columnName (name of the vector column or field)
  • vectorValue (the vector whose distance to the column value is computed)
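For intuition about what these functions return, here is a minimal plain JavaScript sketch of the four metrics under their usual mathematical definitions. The helper names (innerProduct, l1Distance, l2Distance, cosineDistance) are hypothetical and not part of this PR; the database computes the real values server-side, and its exact numeric behaviour may differ.

```javascript
// Illustrative helpers only: these names are NOT part of Sequelize or this PR.
// They mirror the usual definitions behind INNER_PRODUCT, L1_DISTANCE,
// L2_DISTANCE and COSINE_DISTANCE.
function innerProduct(a, b) {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

function l1Distance(a, b) {
  // Sum of absolute component differences (Manhattan distance).
  return a.reduce((sum, x, i) => sum + Math.abs(x - b[i]), 0);
}

function l2Distance(a, b) {
  // Euclidean distance.
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

function cosineDistance(a, b) {
  // 1 minus the cosine similarity of the two vectors.
  const norm = v => Math.sqrt(innerProduct(v, v));
  return 1 - innerProduct(a, b) / (norm(a) * norm(b));
}

const queryVector = [1, 2, 3, 4];
const stored = [1, 2, 3, 3];

console.log(innerProduct(stored, queryVector)); // 26
console.log(l1Distance(stored, queryVector));   // 1
console.log(l2Distance(stored, queryVector));   // 1
console.log(cosineDistance(stored, queryVector));
```

With these values, a filter such as "INNER_PRODUCT greater than 2" would keep the stored vector, while "L1_DISTANCE greater than 2" would drop it.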

Vector Datatype for other dialects:

To support the VECTOR type in other dialects, the following should be done:

  • Extend the abstract vector datatype class (add toSql() and validate(), and optionally _stringify()).
  • Override handleSequelizeMethod so that Sequelize.fn implements the vector similarity search functions.
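The first step above can be sketched roughly as follows. This is a self-contained illustration, not the PR's code: AbstractVector is a hypothetical stand-in for the abstract vector datatype class, ExampleDialectVector is an invented dialect subclass, and the SQL type name and literal format it emits are assumptions that a real dialect would replace with its own.

```javascript
// Hypothetical stand-in for the abstract vector datatype class.
// The real base class lives in Sequelize's data-types module.
class AbstractVector {
  constructor(dimension) {
    this.dimension = dimension;
  }
  toSql() {
    throw new Error('toSql() must be implemented by the dialect');
  }
  validate() {
    throw new Error('validate() must be implemented by the dialect');
  }
}

// Invented dialect subclass showing the three methods named above.
class ExampleDialectVector extends AbstractVector {
  // Column type used in CREATE TABLE (assumed syntax).
  toSql() {
    return `VECTOR(${this.dimension})`;
  }

  // Reject values that are not numeric vectors of the declared dimension.
  validate(value) {
    if (!Array.isArray(value) && !(value instanceof Float32Array)) {
      throw new TypeError('vector must be an array or Float32Array');
    }
    if (value.length !== this.dimension) {
      throw new RangeError(`expected ${this.dimension} dimensions, got ${value.length}`);
    }
    if (!Array.from(value).every(Number.isFinite)) {
      throw new TypeError('vector components must be finite numbers');
    }
    return true;
  }

  // Optional: serialize the value as a textual vector literal (assumed format).
  _stringify(value) {
    return `[${Array.from(value).join(',')}]`;
  }
}

const type = new ExampleDialectVector(4);
console.log(type.toSql());                                    // VECTOR(4)
console.log(type._stringify(new Float32Array([1, 2, 3, 4]))); // [1,2,3,4]
```

The second step, overriding handleSequelizeMethod, depends on each dialect's query generator and is not sketched here.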

An example to test Vector datatype:

const Sequelize = require('sequelize');
const Op = Sequelize.Op;

async function run() {
  // Connection details are placeholders for a demo database.
  const sequelize = new Sequelize({username: 'demouser', password: 'password4demo', dialect: 'oracle', dialectOptions: {connectString: 'demoString'}});

  // A model with a 4-dimensional vector column.
  const Item = sequelize.define('item', {
    embedding: Sequelize.DataTypes.VECTOR(4)
  });

  try {
    const queryVector = [1,2,3,4];
    await sequelize.sync({force: true});
    await Item.create({embedding: new Float32Array([1,1,1,1])});
    await Item.create({embedding: new Float32Array([1,2,3,3])});
    // Sequelize v6 exposes where() on the instance; there is no Sequelize.sql namespace in v6.
    const result = await Item.findAll({
      where: sequelize.where(sequelize.fn('VECTOR_DISTANCE','embedding', queryVector), Op.lt, 2)
    });
    const result1 = await Item.findAll({
      where: sequelize.where(sequelize.fn('INNER_PRODUCT','embedding', queryVector), Op.gt, 2)
    });
    const result2 = await Item.findAll({
      where: sequelize.where(sequelize.fn('L1_DISTANCE','embedding', queryVector), Op.gt, 2)
    });
    const result3 = await Item.findAll({
      where: sequelize.where(sequelize.fn('L2_DISTANCE','embedding', queryVector), Op.gt, 0)
    });
    console.log(result[0].embedding);
    console.log(result1[1].embedding);
    console.log(result2[0].embedding);
    console.log(result3[1].embedding);
  } catch (err) {
    console.log(err);
  } finally {
    // Wait for the connection pool to close before exiting.
    await sequelize.close();
  }
}

run();

Example to fetch top 5 nearest neighbours

// Retrieve the top 5 similar embeddings for the input query vector.
// `Item` and `queryVector` are the model and query vector from the example above.
const result = await Item.findAll({
  attributes: ['embedding'],
  order: [
    sequelize.fn('VECTOR_DISTANCE', 'embedding', queryVector)
  ],
  limit: 5
});
// SQL: SELECT "embedding" FROM "items" "item" ORDER BY VECTOR_DISTANCE("embedding", VECTOR('[1,2,3,4]')) OFFSET 0 ROWS FETCH NEXT 5 ROWS ONLY;
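Conceptually, the query above ranks every row by its distance to the query vector and keeps the closest ones. A minimal in-memory sketch of that ranking is shown below; topK and l2 are hypothetical helpers, and Euclidean distance is used purely for illustration (which metric VECTOR_DISTANCE applies when none is specified is not stated in this PR).

```javascript
// Illustrative brute-force nearest-neighbour ranking; not part of the PR.
function l2(a, b) {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

// Return the k vectors closest to `query` under the given distance function.
function topK(vectors, query, k, distance = l2) {
  return vectors
    .map(vector => ({ vector, dist: distance(vector, query) }))
    .sort((a, b) => a.dist - b.dist) // ascending distance, like ORDER BY
    .slice(0, k)                     // like FETCH NEXT k ROWS ONLY
    .map(entry => entry.vector);
}

const rows = [[1, 1, 1, 1], [1, 2, 3, 3], [9, 9, 9, 9]];
console.log(topK(rows, [1, 2, 3, 4], 2)); // [ [ 1, 2, 3, 3 ], [ 1, 1, 1, 1 ] ]
```

A vector index avoids this full scan by searching an approximate structure instead, typically trading some accuracy for speed.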

Examples to test Vector indexes (hnsw):

// `sequelize` is a connected Sequelize instance, created as in the first example.
async function run(sequelize) {
  // The query interface is obtained from the instance, not from the Sequelize class.
  const queryInterface = sequelize.getQueryInterface();
  // The "items" table exists in the database and its "embedding" column is of datatype VECTOR.
  await queryInterface.addIndex('items', ['embedding'], { type: 'VECTOR', using: 'hnsw', parameter: { neighbor: 10, efconstruction: 10} });
}

// SQL: CREATE VECTOR INDEX "items_embedding" ON "items" ("embedding") ORGANIZATION INMEMORY NEIGHBOR GRAPH PARAMETERS (type hnsw, neighbor 10, efconstruction 10)

IVF index:

// `sequelize` is a connected Sequelize instance, created as in the first example.
async function run(sequelize) {
  // The query interface is obtained from the instance, not from the Sequelize class.
  const queryInterface = sequelize.getQueryInterface();
  // The "items" table exists in the database and its "embedding" column is of datatype VECTOR.
  await queryInterface.addIndex('items', ['embedding'], { type: 'VECTOR', using: 'ivf', parameter: { partitions: 5, samplesPerPartition: 10, minVectors: 10 } });
}

// SQL: CREATE VECTOR INDEX "items_embedding" ON "items" ("embedding") ORGANIZATION NEIGHBOR PARTITION GRAPH PARAMETERS (type ivf, NEIGHBOR PARTITION 5, SAMPLES_PER_PARTITION 10, MIN_VECTORS_PER_PARTITIONS 10)

To learn more about the Oracle VECTOR type:

Getting started with vectors in 23ai

Guidelines for Using Vector Indexes

List of Breaking Changes

@hjamil-24 hjamil-24 changed the title Vector support feat(oracle): Vector datatype support Feb 3, 2025
@hjamil-24 hjamil-24 closed this Feb 5, 2025
@hjamil-24 hjamil-24 reopened this Apr 29, 2025