Commit
1 parent 3e30ee6, commit 0b2a2c9
Showing 18 changed files with 320 additions and 205 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@@ -0,0 +1,10 @@
+**Table 1: STREAM_HACKATHON.STREAMLIT.CUSTOMER_DETAILS** (Stores customer information)
+
+This table contains the personal information of customers who have made purchases on the platform.
+
+- CUSTOMER_ID: Number (38,0) [Primary Key, Not Null] - Unique identifier for customers
+- FIRST_NAME: Varchar (255) - First name of the customer
+- LAST_NAME: Varchar (255) - Last name of the customer
+- EMAIL: Varchar (255) - Email address of the customer
+- PHONE: Varchar (20) - Phone number of the customer
+- ADDRESS: Varchar (255) - Physical address of the customer
@@ -0,0 +1,8 @@
+**Table 2: STREAM_HACKATHON.STREAMLIT.ORDER_DETAILS** (Stores order information)
+
+This table contains information about orders placed by customers, including the date and total amount of each order.
+
+- ORDER_ID: Number (38,0) [Primary Key, Not Null] - Unique identifier for orders
+- CUSTOMER_ID: Number (38,0) [Foreign Key - CUSTOMER_DETAILS(CUSTOMER_ID)] - Customer who made the order
+- ORDER_DATE: Date - Date when the order was made
+- TOTAL_AMOUNT: Number (10,2) - Total amount of the order
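
The foreign key from ORDER_DETAILS to CUSTOMER_DETAILS can be exercised with a small join. The following is a minimal sketch, not part of the commit: it uses an in-memory SQLite database as a stand-in for Snowflake, approximates the column types, drops the `STREAM_HACKATHON.STREAMLIT` prefix, and the sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Snowflake; types are approximated.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE CUSTOMER_DETAILS (
        CUSTOMER_ID INTEGER PRIMARY KEY NOT NULL,
        FIRST_NAME TEXT,
        LAST_NAME TEXT,
        EMAIL TEXT,
        PHONE TEXT,
        ADDRESS TEXT
    );
    CREATE TABLE ORDER_DETAILS (
        ORDER_ID INTEGER PRIMARY KEY NOT NULL,
        CUSTOMER_ID INTEGER REFERENCES CUSTOMER_DETAILS(CUSTOMER_ID),
        ORDER_DATE TEXT,
        TOTAL_AMOUNT NUMERIC
    );
    """
)

# Hypothetical sample data, for illustration only.
conn.execute(
    "INSERT INTO CUSTOMER_DETAILS VALUES (?, ?, ?, ?, ?, ?)",
    (1, "Ada", "Lovelace", "ada@example.com", "555-0100", "1 Example St"),
)
conn.executemany(
    "INSERT INTO ORDER_DETAILS VALUES (?, ?, ?, ?)",
    [(10, 1, "2023-05-01", 25.50), (11, 1, "2023-05-03", 74.50)],
)

# Number of orders and total spend per customer, joined on CUSTOMER_ID.
rows = conn.execute(
    """
    SELECT c.FIRST_NAME, COUNT(o.ORDER_ID), SUM(o.TOTAL_AMOUNT)
    FROM CUSTOMER_DETAILS AS c
    JOIN ORDER_DETAILS AS o ON o.CUSTOMER_ID = c.CUSTOMER_ID
    GROUP BY c.CUSTOMER_ID
    """
).fetchall()
print(rows)
```

The same join shape should carry over to the Snowflake tables once the fully qualified table names are restored.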
@@ -0,0 +1,8 @@
+**Table 3: STREAM_HACKATHON.STREAMLIT.PAYMENTS** (Stores payment information)
+
+This table contains information about payments made by customers for their orders, including the date and amount of each payment.
+
+- PAYMENT_ID: Number (38,0) [Primary Key, Not Null] - Unique identifier for payments
+- ORDER_ID: Number (38,0) [Foreign Key - ORDER_DETAILS(ORDER_ID)] - Associated order for the payment
+- PAYMENT_DATE: Date - Date when the payment was made
+- AMOUNT: Number (10,2) - Amount of the payment
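
Because PAYMENTS references ORDER_DETAILS by ORDER_ID, one order may have several payment rows. The sketch below shows how an outstanding balance could be derived from them. It is an illustration only: `outstanding_balance` is a hypothetical helper, the amounts are made up, and the schema itself does not say whether partial payments are allowed.

```python
from decimal import Decimal
from typing import Iterable


def outstanding_balance(total_amount: Decimal, payment_amounts: Iterable[Decimal]) -> Decimal:
    """Return ORDER_DETAILS.TOTAL_AMOUNT minus the sum of PAYMENTS.AMOUNT for one order."""
    # Decimal mirrors the Number(10,2) columns and avoids float rounding.
    paid = sum(payment_amounts, Decimal("0.00"))
    return total_amount - paid


# Hypothetical order of 100.00 with two partial payments.
balance = outstanding_balance(Decimal("100.00"), [Decimal("40.00"), Decimal("35.50")])
print(balance)
```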
@@ -0,0 +1,8 @@
+**Table 4: STREAM_HACKATHON.STREAMLIT.PRODUCTS** (Stores product information)
+
+This table contains information about the products available for purchase on the platform, including their name, category, and price.
+
+- PRODUCT_ID: Number (38,0) [Primary Key, Not Null] - Unique identifier for products
+- PRODUCT_NAME: Varchar (255) - Name of the product
+- CATEGORY: Varchar (255) - Category of the product
+- PRICE: Number (10,2) - Price of the product
@@ -0,0 +1,9 @@
+**Table 5: STREAM_HACKATHON.STREAMLIT.TRANSACTIONS** (Stores transaction information)
+
+This table contains information about individual transactions that occur when customers purchase products, including the associated order, product, quantity, and price.
+
+- TRANSACTION_ID: Number (38,0) [Primary Key, Not Null] - Unique identifier for transactions
+- ORDER_ID: Number (38,0) [Foreign Key - ORDER_DETAILS(ORDER_ID)] - Associated order for the transaction
+- PRODUCT_ID: Number (38,0) - Product involved in the transaction
+- QUANTITY: Number (38,0) - Quantity of the product in the transaction
+- PRICE: Number (10,2) - Price of the product in the transaction
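
Each TRANSACTIONS row carries a QUANTITY and a PRICE for one product within an order, so a per-order total can be computed from its rows. The sketch below assumes that PRICE is a unit price and that QUANTITY × PRICE summed over an order should match ORDER_DETAILS.TOTAL_AMOUNT. Neither assumption is stated in the schema, and the `Transaction` class and sample rows are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable


@dataclass(frozen=True)
class Transaction:
    """One TRANSACTIONS row; field names mirror the column names."""

    transaction_id: int
    order_id: int
    product_id: int
    quantity: int
    price: Decimal  # Number(10,2); assumed here to be a unit price


def order_total(transactions: Iterable[Transaction], order_id: int) -> Decimal:
    """Sum quantity * price over the rows that belong to one order."""
    return sum(
        (t.quantity * t.price for t in transactions if t.order_id == order_id),
        Decimal("0.00"),
    )


# Hypothetical rows: two line items for order 10, one for order 11.
sample = [
    Transaction(1, 10, 100, 2, Decimal("9.99")),
    Transaction(2, 10, 101, 1, Decimal("5.52")),
    Transaction(3, 11, 100, 1, Decimal("9.99")),
]
total_for_order_10 = order_total(sample, 10)
print(total_for_order_10)
```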
Binary file not shown.
Binary file not shown.
@@ -1,20 +1,57 @@
-
-from langchain.embeddings import OpenAIEmbeddings
-from langchain.text_splitter import RecursiveCharacterTextSplitter
-from langchain.document_loaders import UnstructuredMarkdownLoader
-from langchain.vectorstores import FAISS
+from pydantic import BaseModel
+from langchain.text_splitter import CharacterTextSplitter
+from langchain.vectorstores import SupabaseVectorStore
+from langchain.embeddings.openai import OpenAIEmbeddings
+from langchain.document_loaders import DirectoryLoader
 import streamlit as st
+from supabase.client import Client, create_client
+from typing import Any, Dict
+
+
+class Secrets(BaseModel):
+    SUPABASE_URL: str
+    SUPABASE_SERVICE_KEY: str
+    OPENAI_API_KEY: str
+
+
+class Config(BaseModel):
+    chunk_size: int = 1000
+    chunk_overlap: int = 0
+    docs_dir: str = "docs/"
+    docs_glob: str = "**/*.md"
+
+
+class DocumentProcessor:
+    def __init__(self, secrets: Secrets, config: Config):
+        self.client: Client = create_client(
+            secrets.SUPABASE_URL, secrets.SUPABASE_SERVICE_KEY
+        )
+        self.loader = DirectoryLoader(config.docs_dir, glob=config.docs_glob)
+        self.text_splitter = CharacterTextSplitter(
+            chunk_size=config.chunk_size, chunk_overlap=config.chunk_overlap
+        )
+        self.embeddings = OpenAIEmbeddings(openai_api_key=secrets.OPENAI_API_KEY)

-loader = UnstructuredMarkdownLoader('schema.md')
-data = loader.load()
+    def process(self) -> Dict[str, Any]:
+        data = self.loader.load()
+        texts = self.text_splitter.split_documents(data)
+        vector_store = SupabaseVectorStore.from_documents(
+            texts, self.embeddings, client=self.client
+        )
+        return vector_store

-text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=20)
-texts = text_splitter.split_documents(data)

-embeddings = OpenAIEmbeddings(openai_api_key = st.secrets["OPENAI_API_KEY"])
-docsearch = FAISS.from_documents(texts, embeddings)
+def run():
+    secrets = Secrets(
+        SUPABASE_URL=st.secrets["SUPABASE_URL"],
+        SUPABASE_SERVICE_KEY=st.secrets["SUPABASE_SERVICE_KEY"],
+        OPENAI_API_KEY=st.secrets["OPENAI_API_KEY"],
+    )
+    config = Config()
+    doc_processor = DocumentProcessor(secrets, config)
+    result = doc_processor.process()
+    return result

-docsearch.save_local("faiss_index")

-# with open("vectors.pkl", "wb") as f:
-# pickle.dump(docsearch, f)
+if __name__ == "__main__":
+    run()
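
The `chunk_size` and `chunk_overlap` settings in this change (1000 and 0, replacing the earlier 1000 and 20) control how loaded documents are cut up before embedding. The sketch below illustrates the idea with fixed-width character windows in plain Python. It is a simplification, not LangChain's algorithm: the library's splitters first split on a separator and then merge pieces, so their chunk boundaries will differ. `split_text` is a hypothetical helper.

```python
from typing import List


def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 0) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Each new window starts this many characters after the previous one.
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


document = "x" * 2500  # placeholder text, for illustration only
no_overlap = split_text(document, chunk_size=1000, chunk_overlap=0)
with_overlap = split_text(document, chunk_size=1000, chunk_overlap=20)
print([len(c) for c in no_overlap])
print([len(c) for c in with_overlap])
```

With no overlap, every character lands in exactly one chunk. A nonzero overlap repeats the tail of each chunk at the head of the next, which can help when a relevant sentence straddles a boundary, at the cost of embedding some text twice.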