SOPHIE YANG / work
Mar 2025
Completed

Real Estate Legal Chatbot

A custom legal assistant chatbot specializing in real-estate policy for Canadians, built to reduce friction so Canadians can buy their first and subsequent properties faster.

It feels like a rite of passage for every software engineer to build a chatbot at some point.

And what better use case than turning the three business days it usually takes for your lawyer to reply to an email… into a few seconds?

We built the Philer Legal Real Estate chatbot to serve both sides: home buyers get quick, reliable answers to basic policy questions, and lawyers/agents can focus on higher-value work instead of inbox triage.

I worked with two other developers and a PM, and our biggest challenge was figuring out the backend tradeoffs around performance, reliability, and safety. From deciding how to store and retrieve embeddings efficiently to putting guardrails in place so the model wouldn’t hallucinate legal advice, every decision forced us to weigh cost against scalability and convenience against compliance.

1-week sprints

Our chatbot had a tight 4-week deadline, so we worked in 1-week sprints, meeting Mon/Wed/Fri to cycle through planning, design, development, and review.

A week just for messing around

Despite the tight deadlines, our group unanimously agreed to spend the first week “messing around”: researching frameworks, tools, and patterns that could actually hold up in production.

During this week, I took a deep dive into Pinecone, learning how vector databases handle similarity search at scale and why indexing strategies matter for both speed and cost.

Then, to test my understanding, I built a small side project with LangChain vector embeddings, experimenting with how different chunking and embedding strategies affect retrieval accuracy.
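To make the chunking idea concrete, here is a minimal sketch of one common strategy: fixed-size chunks with overlap. It is not the code from the side project. The function name `chunk_text` and the character-based sizes are assumptions chosen for illustration, and a real pipeline would more likely split on tokens or sentences.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into fixed-size character chunks that share `overlap` characters.

    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so we do not emit a trailing chunk that is pure overlap.
        if start + chunk_size >= len(text):
            break
    return chunks
```

A larger overlap means a sentence that straddles a boundary is more likely to appear whole in at least one chunk, at the cost of storing and embedding more text. That is one of the tradeoffs worth experimenting with.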

By the end of this week, each of us felt more confident and clear in our approach.

Technical implementation

Before we began building, we drafted the chatbot’s high-level architecture together, and the final product stayed consistent with that design.

On the frontend, we built a familiar and intuitive chatbot interface with React, making it easy to embed into Philer’s existing platform. Although we initially created an embedded chatbot component, we ultimately pivoted to a full-screen interface that would live on a subdomain.

On the backend, we powered responses with GPT-4 Turbo, wrapped in LangChain to handle retrieval and guardrails. All of our real estate policy documents were stored in AWS S3 buckets, then embedded and indexed in a Pinecone database.

This allowed us to run fast similarity searches, so the chatbot could ground answers in real documents instead of guessing.
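As a rough illustration of what a similarity search does, the sketch below ranks stored vectors by cosine similarity to a query vector. It is a toy, in-memory stand-in and not Pinecone’s API or the project’s code. The names `cosine_similarity` and `top_k`, and the tiny two-dimensional vectors in the usage note, are assumptions for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Return the k (chunk_id, score) pairs most similar to the query, best first.

    `index` is a hypothetical in-memory map of chunk id -> embedding vector.
    """
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

For example, with `index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}`, calling `top_k([1.0, 0.1], index, 2)` ranks "a" first and "c" second. The text of the top-ranked chunks is what would then be passed to the model as context. A vector database does the same ranking with approximate indexes so it stays fast at scale.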

One of the challenges our team faced was handling recency, since we were dealing with dynamic information. Policies update frequently, so we designed a refresh pipeline to re-embed documents and sync Pinecone whenever new ones were added, ensuring that the bot had up-to-date information.
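One way such a refresh step could decide what to re-embed is by comparing content hashes, sketched below. This is an assumption about how a pipeline like this might work, not a description of the one we built. The function `plan_refresh`, the hash-per-document bookkeeping, and the handling of changed or removed documents are all hypothetical additions for illustration.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_refresh(
    documents: dict[str, str],
    indexed_hashes: dict[str, str],
) -> tuple[list[str], list[str]]:
    """Compare current documents against what the index last saw.

    documents:      hypothetical map of document id -> current text (e.g. read from storage)
    indexed_hashes: hypothetical map of document id -> hash recorded at last embedding
    Returns (ids to re-embed and upsert, ids to delete from the index).
    """
    to_upsert = sorted(
        doc_id
        for doc_id, text in documents.items()
        if indexed_hashes.get(doc_id) != content_hash(text)  # new or changed
    )
    to_delete = sorted(doc_id for doc_id in indexed_hashes if doc_id not in documents)
    return to_upsert, to_delete
```

Only the documents in the first list need to be sent through the embedding model again, which keeps a refresh cheap when most policies are unchanged.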

Final Thoughts

This chatbot was built from 0 to 1 in just a month! Could I have prompted the entirety of it in a week? Probably! But learning how to collaborate with developers, communicate with my PM, and learn from scratch were important lessons I took away.

Feel free to try the chatbot out, or keep it in the back of your mind until you’re buying your first property (fingers crossed).

END.
BUILT WITH
AWS, Python, Tailwind CSS, React.js, LangChain, Pinecone