Real Estate Vector Search API

Overview

This project implements a vector search-based real estate recommendation system using MongoDB, OpenAI embeddings, and Flask. It allows users to search for properties using natural language queries, leveraging vector similarity to find relevant listings and providing AI-enhanced responses.


Features

Natural language property search using vector embeddings
AI-powered response generation for property recommendations
MongoDB Atlas vector search integration
RESTful API endpoint for property queries

Tech Stack

Python 3.8+
Flask (Web framework)
MongoDB Atlas (Database with vector search capability)
OpenAI API (for embeddings and response generation)

Prerequisites

Python 3.8 or higher
MongoDB Atlas account with vector search enabled
OpenAI API key

Project Structure

/vector_search_project
│
├── /app
│   ├── __init__.py
│   ├── embeddings.py     # Handles embedding generation
│   ├── db.py             # Database connection and operations
│   └── api.py            # Flask API endpoints
│
├── /data
│   └── dataset.csv       # Real estate dataset
│
├── /scripts
│   └── load_data.py      # Script to load and embed data
│
├── .env                  # Environment variables
├── requirements.txt      # Python dependencies
└── app.py               # Main application entry point

Installation

Clone the repository:

git clone https://github.com/yourusername/vector_search_project.git
cd vector_search_project

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

Install dependencies:

pip install -r requirements.txt

Set up environment variables: Create a .env file in the project root with the following variables:

OPENAI_API_KEY=your_openai_api_key
MONGO_URI=your_mongodb_connection_string

Data Loading and Embedding

Before running the API, you need to load and embed the real estate data:

Ensure your dataset is in the correct format and placed in data/dataset.csv
Run the data loading script:

python scripts/load_data.py

This script will:

Load the real estate data
Generate embeddings for each property
Store the data and embeddings in MongoDB
Create the necessary vector search index

If necessary, create a custom vector search index manually in MongoDB Atlas.
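For the index step, a definition along the following lines is one plausible starting point. The field path `embedding_vector` matches the search pipeline in Technical Details, and 1536 is the default output size of text-embedding-3-small. The `cosine` similarity setting and the helper name `build_vector_index_definition` are assumptions made for illustration, not values confirmed by the project.

```python
import json


def build_vector_index_definition(path="embedding_vector",
                                  num_dimensions=1536,
                                  similarity="cosine"):
    """Build an Atlas Vector Search index definition as a plain dict.

    Hypothetical helper; the similarity metric is an assumption.
    """
    if similarity not in ("cosine", "euclidean", "dotProduct"):
        raise ValueError(f"unsupported similarity: {similarity}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


# Print the definition as JSON so it can be reviewed or pasted elsewhere.
definition = build_vector_index_definition()
print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the Atlas index editor when creating a vector search index by hand. The index name must match the `index` value used in the query pipeline.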

Running the Application

Start the Flask application:

python app.py

The API will be available at http://localhost:5000

API Usage

Vector Search Endpoint

Endpoint: POST /vector_search

Request Body:

{
  "query": "3 bedroom house in Aguadilla under $200,000"
}

Response:

{
  "response": "Detailed AI-generated response about matching properties",
  "source_information": "Information about the properties used to generate the response"
}
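A minimal client sketch using only the Python standard library is shown here. It assumes the server runs at the default local address and accepts a JSON body with a `Content-Type: application/json` header. The function names `build_search_request` and `search` are hypothetical.

```python
import json
import urllib.request


def build_search_request(query, base_url="http://localhost:5000"):
    """Build a POST request for the /vector_search endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/vector_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query, base_url="http://localhost:5000"):
    """Send the query and return the decoded JSON response as a dict."""
    request = build_search_request(query, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `search("3 bedroom house in Aguadilla under $200,000")` should return a dict with the `response` and `source_information` keys described above.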

Example Queries

Basic location and bedroom query:

{
  "query": "3 bedroom houses in Aguadilla"
}

Price range query:

{
  "query": "homes under $150,000 in San Juan"
}

Complex feature query:

{
  "query": "large houses with more than 2000 square feet and a pool"
}

Technical Details

Vector Search Implementation

The system uses the following pipeline for vector search:

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",
            "queryVector": query_embedding,
            "path": "embedding_vector",
            "numCandidates": 150,
            "limit": 5
        }
    },
    {
        "$project": {
            "_id": 0,
            "brokered_by": 1,
            "status": 1,
            "price": 1,
            # ... other fields
        }
    }
]
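One way to make this pipeline reusable is to wrap it in a function that takes the query vector and the result limits as parameters. The sketch below is an illustration, not the project's code: `build_pipeline` is a hypothetical name, only the three projected fields shown above are included, and the extra `score` field is an optional addition that exposes the similarity score Atlas computes for each match.

```python
def build_pipeline(query_embedding, limit=5, num_candidates=150):
    """Return a $vectorSearch aggregation pipeline for one query vector."""
    # Atlas requires numCandidates to be at least as large as limit.
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least limit")
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",
                "queryVector": list(query_embedding),
                "path": "embedding_vector",
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            "$project": {
                "_id": 0,
                "brokered_by": 1,
                "status": 1,
                "price": 1,
                # Optional: similarity score computed by Atlas for each match.
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

Given a PyMongo collection object (here assumed to be called `collection`), the results could then be fetched with `list(collection.aggregate(build_pipeline(query_embedding)))`. A larger `numCandidates` generally improves recall at the cost of latency.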

Embedding Generation

Properties are embedded using OpenAI's text-embedding-3-small model. The embedding input combines various property features:

embedding_input = f"{property['brokered_by']}, {property['status']}, Price: {property['price']}, Beds: {property['bed']}, ..."
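The two steps could be sketched as follows. `build_embedding_input` uses only the four fields visible in the line above; the remaining fields are elided in the source and are not guessed here. `embed_text` assumes the official `openai` Python package (version 1.x) with `OPENAI_API_KEY` set in the environment. Both function names are hypothetical.

```python
def build_embedding_input(prop):
    """Join the property fields shown above into one text string."""
    return (
        f"{prop['brokered_by']}, {prop['status']}, "
        f"Price: {prop['price']}, Beds: {prop['bed']}"
    )


def embed_text(text, model="text-embedding-3-small"):
    """Return the embedding vector for `text` as a list of floats."""
    # Imported lazily so the text-building helper works without the package.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.embeddings.create(model=model, input=text)
    return response.data[0].embedding
```

The same `embed_text` call should be used for both documents and queries, since vectors from different models are not comparable.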

Troubleshooting

Common issues and solutions:

No results returned:
- Verify that the vector index was created correctly
- Check that documents have embedding vectors
- Ensure the query embedding dimensionality matches the document embeddings

MongoDB connection issues:
- Verify your MongoDB URI in the .env file
- Ensure your IP address is on the IP access list in MongoDB Atlas
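The embedding checks in the first group can be automated with a small helper like the one below. It is a sketch under assumptions: `find_embedding_problems` is a hypothetical name, the field name `embedding_vector` comes from the search pipeline, and the default of 1536 dimensions is the default output size of text-embedding-3-small.

```python
def find_embedding_problems(documents, expected_dim=1536, field="embedding_vector"):
    """Return (index, reason) pairs for documents with unusable embeddings."""
    problems = []
    for index, doc in enumerate(documents):
        vector = doc.get(field)
        if vector is None:
            problems.append((index, "missing embedding"))
        elif len(vector) != expected_dim:
            problems.append(
                (index, f"expected {expected_dim} dimensions, got {len(vector)}")
            )
    return problems
```

Running it over a small sample of documents read from the collection is usually enough to tell whether the load step skipped embeddings or used a different model.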

Contributing

Fork the repository
Create a new branch for your feature
Commit your changes
Push to the branch
Create a new Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

OpenAI for providing the embedding and language models
MongoDB for their vector search capability
Thanks, MongoDB, for letting me explore so much for free

PrathameshPawar119/Vector-Search-Real-Estate
