AI? AI. The Privacy GPT Odyssey

This app partially exists because I needed a GPT bot that can answer my queries on the latest data privacy regulations in MENA. Partially because of FOMO.

Let’s get to the nitty-gritty. LangChain is a versatile Python framework for building context-aware applications powered by LLMs. It offers easy-to-use components and pre-built chains that help with reasoning and customization across a wide range of applications. Basically, it makes harnessing the power of LLMs like GPT-3.5+ a breeze for noobs like me.

To bring in context, or “train”/create our custom knowledge base, we ideally need some sort of database. Also, if GPT had to read through every law each time I asked it something, I would be in severe credit card debt. Enter Qdrant, a persistent vector database with a cloud offering. Qdrant’s similarity search makes it much simpler to find the text relevant to a question, so only that text goes to GPT, which reduces costs. We take advantage of this by splitting the raw text of the various laws into chunks. These chunks are then converted into vectors (embeddings) and added to a Qdrant “Collection”.
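To make the chunking step concrete, here is a minimal sketch in plain Python. It is not the splitter the app actually uses: the function name `split_into_chunks` and the `chunk_size` and `overlap` parameters are my own illustrative choices, and the windows are measured in characters for simplicity. The overlap is there so a sentence cut at a chunk boundary still appears whole in at least one chunk.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so we do not emit a trailing chunk that is pure overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "abcdefghij"
    print(split_into_chunks(sample, chunk_size=4, overlap=1))
```

In a real pipeline each chunk would then be embedded and uploaded to the collection; that part depends on the embedding model and client library, so it is left out here.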

The vector store is then plugged into a “Retrieval chain” in LangChain, which pulls the most relevant chunks out of the store and passes them to the OpenAI LLM along with your question, giving you the desired response.
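The retrieve-then-ask idea can be sketched without any external services. This is a toy stand-in, not LangChain's or Qdrant's API: the `embed` function below is just a bag-of-words counter standing in for a real embedding model, the helper names are hypothetical, the sample sentences are made up and are not real regulation text, and the final LLM call is left out entirely. It only shows the shape of the flow: score chunks against the question, keep the top few, and build a prompt from them.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Made-up sample sentences, not actual regulation text.
    chunks = [
        "Data controllers must report a breach within 72 hours.",
        "Consent is required before processing personal data.",
        "The regulator may issue fines for violations.",
    ]
    question = "When must a breach be reported?"
    print(build_prompt(question, retrieve(question, chunks, k=1)))
```

The cost saving comes from the prompt containing only the retrieved chunks rather than every law in the knowledge base.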


Try Privacy GPT out now!

P.S: A lot of details above have been very watered down.

Deploying a web app in Python was quite quick thanks to Streamlit. The app runs on an AWS EC2 instance behind a reverse proxy for security.

Written on September 15, 2023