AI? AI. The Privacy GPT Odyssey
This app partially exists because I needed a GPT bot that can answer my queries on the latest data privacy regulations in MENA. Partially because FOMO.
Let’s get to the nitty-gritty. LangChain is a versatile Python framework that lets developers build context-aware applications powered by LLMs. It offers easy-to-use components and pre-built chains to facilitate reasoning and customization across a wide range of applications. Essentially, it makes harnessing the power of LLMs like GPT-3.5+ a breeze for noobies like me.
To give GPT context from our custom knowledge base (“training” it, loosely speaking), we ideally need some form of database. Also, if GPT had to read through every law each time I asked it something, I would be in severe credit card debt. Enter Qdrant, a persistent vector database with a cloud offering. Its similarity search makes it easy to fetch only the passages relevant to a question, so GPT sees less text and costs stay down. We take advantage of this by splitting the raw text of the various laws into chunks. These chunks are then converted into vectors and stored in Qdrant in what it calls a “collection.”
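To make the chunking step a bit more concrete, here is a minimal sketch in plain Python. It is not the app's actual code and it does not use LangChain's text splitters; the function name `split_into_chunks` and the `chunk_size` and `overlap` values are assumptions picked for illustration. The idea is simply to cut the text into fixed-size windows that overlap a little, so a sentence sitting on a boundary still appears whole in at least one chunk.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly.

    Hypothetical helper for illustration; sizes are in characters, not tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # otherwise we would emit a trailing chunk made only of overlap.
        if start + chunk_size >= len(text):
            break
    return chunks
```

In the real app each chunk would then be turned into a vector by an embedding model and uploaded to the Qdrant collection. Splitting on characters is the crudest option; splitting on paragraphs or article boundaries would likely suit legal text better.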
The vector store is then plugged into a LangChain retrieval chain, which looks up the stored chunks most similar to your question and passes them to the OpenAI LLM along with the question, giving you the desired response.
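The retrieve-then-ask flow can be sketched without any external services. The code below is a toy stand-in, not LangChain's or Qdrant's API: word counts play the role of embeddings, a sorted list plays the role of the vector database, and the final prompt is built but never sent to a model. All function names and the sample chunks are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts. A real app would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up placeholder chunks, not quotes from any real law.
chunks = [
    "Controllers must notify the regulator of a data breach within 72 hours.",
    "Consent must be obtained before processing personal data for marketing.",
    "The regulator may issue fines for failure to register a database.",
]
question = "When do I have to report a data breach?"
top_chunks = retrieve(question, chunks, top_k=1)
prompt = build_prompt(question, top_chunks)
```

Only the retrieved chunk ends up in the prompt, which is the whole cost-saving trick: the model reads a few relevant passages instead of every law.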
(Heyyo, Shanen from the future here! This app has been deprecated and replaced with https://privacybotweb.shanen.works/. You can read in depth about why here: https://shanen.works/flask/)
Try Privacy GPT out now!
P.S. A lot of the details above have been heavily watered down.
Deploying a web app in Python was quite quick thanks to Streamlit. The app was deployed on an AWS EC2 instance behind a reverse proxy for security.