{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "65af1ebf-354a-47b9-a882-f88ee8a28d64",
   "metadata": {},
   "source": [
    "## LangChain + PDF Full Stack App"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1f65de69-4c22-48cb-8eb2-caf234d31f39",
   "metadata": {},
   "source": [
    "### Add OpenAI key in backend/.env"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "af660ffe-b777-4053-a2a4-73f8b6da9842",
   "metadata": {},
   "source": [
    "### Add imports in backend/routers/pdfs.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5508ce92-1e7e-4302-9c15-e1e4bd598e93",
   "metadata": {},
   "outputs": [],
   "source": [
    "# from langchain import OpenAI, PromptTemplate\n",
    "# from langchain.chains import LLMChain"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3f1ab301-9e2f-484a-b990-896da9b69265",
   "metadata": {},
   "source": [
    "### Add basic LangChain code in backend/routers/pdfs.py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b0aeae91-3a1f-46c4-b63d-a4d7c8f441f1",
   "metadata": {},
   "outputs": [],
   "source": [
    "# # LANGCHAIN\n",
    "# langchain_llm = OpenAI(temperature=0)\n",
    "\n",
    "# summarize_template_string = \"\"\"\n",
    "#         Provide a summary for the following text:\n",
    "#         {text}\n",
    "# \"\"\"\n",
    "\n",
    "# summarize_prompt = PromptTemplate(\n",
    "#     template=summarize_template_string,\n",
    "#     input_variables=['text'],\n",
    "# )\n",
    "\n",
    "# summarize_chain = LLMChain(\n",
    "#     llm=langchain_llm,\n",
    "#     prompt=summarize_prompt,\n",
    "# )\n",
    "\n",
    "# @router.post('/summarize-text')\n",
    "# async def summarize_text(text: str):\n",
    "#     summary = summarize_chain.run(text=text)\n",
    "#     return {'summary': summary}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab976fcc-1eed-4f63-ad31-8f822ae43472",
   "metadata": {},
   "source": [
    "### Check backend"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "1ab84822-47ad-49b0-bb23-040db5a5b368",
   "metadata": {},
   "outputs": [],
   "source": [
    "#uvicorn main:app --reload"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "259699f1-8b01-44fe-bd3b-3567db2798f8",
   "metadata": {},
   "source": [
    "http://127.0.0.1:8000/docs\n",
    "* Check how POST /todos/summarize-test works"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "aee41fa1-674e-421f-b2b9-1cbd147e7fed",
   "metadata": {},
   "source": [
    "### Add advanced LangChain code in backend/routers/pdfs.py"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "57d956af-fee1-4bf2-8507-ff62617430b4",
   "metadata": {},
   "source": [
    "The following route includes the RAG technique to ask a question about the PDF file identified by id:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "dec7abfc-1ffa-4f7f-a0ba-ce374c5f5ad4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# @router.post(\"/qa-pdf/{id}\")\n",
    "# def qa_pdf_by_id(id: int, question_request: QuestionRequest,db: Session = Depends(get_db)):\n",
    "#     pdf = crud.read_pdf(db, id)\n",
    "#     if pdf is None:\n",
    "#         raise HTTPException(status_code=404, detail=\"PDF not found\")\n",
    "#     print(pdf.file)\n",
    "#     loader = PyPDFLoader(pdf.file)\n",
    "#     document = loader.load()\n",
    "#     text_splitter = RecursiveCharacterTextSplitter(chunk_size=3000,chunk_overlap=400)\n",
    "#     document_chunks = text_splitter.split_documents(document)\n",
    "#     embeddings = OpenAIEmbeddings()\n",
    "#     stored_embeddings = FAISS.from_documents(document_chunks, embeddings)\n",
    "#     QA_chain = RetrievalQA.from_chain_type(llm=llm,chain_type=\"stuff\",retriever=stored_embeddings.as_retriever())\n",
    "#     question = question_request.question\n",
    "#     answer = QA_chain.run(question)\n",
    "#     return answer"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b3438641-3646-4398-a234-e0e348177b14",
   "metadata": {},
   "source": [
    "In order for it to work, we need to add the following lines on the top of the file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a4e5ba99-1ae7-4198-9892-1879516fca2a",
   "metadata": {},
   "outputs": [],
   "source": [
    "# from langchain.document_loaders import PyPDFLoader\n",
    "# from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
    "# from langchain.embeddings.openai import OpenAIEmbeddings\n",
    "# from langchain.vectorstores import FAISS\n",
    "# from langchain.chains import RetrievalQA\n",
    "# from schemas import QuestionRequest\n",
    "# llm = OpenAI()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "af05089c-f71b-402c-954e-cb3f0002e98e",
   "metadata": {},
   "source": [
    "You can check it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "5273e9cb-ec45-4bc1-93dd-cdfc5e87094e",
   "metadata": {},
   "outputs": [],
   "source": [
    "#uvicorn main:app --reload"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4723c26b-98c2-4fc4-b6ce-35708fb277e1",
   "metadata": {},
   "source": [
    "http://127.0.0.1:8000/docs\n",
    "* Check how POST /pdfs/qa-pdf/{id}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bd3e0b90-beac-4ac6-9689-fe7d045b4d00",
   "metadata": {},
   "source": [
    "## Task for you: update frontend"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a73858ea-8f2f-43f0-bd74-a0e7517c6ba6",
   "metadata": {},
   "source": [
    "### Desired behavior:\n",
    "* The user selects one PDF file\n",
    "* Then he can enter a question about it in the input box below the main pdf-list box\n",
    "* After the question is submitted, the answer is displayed below"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "34cb4aa8-1e2f-4297-b887-414330dd0940",
   "metadata": {},
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7c690fba-195d-48ae-bcfd-80836d769afb",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
