Let's Be Honest About This
Some questions you probably have, answered with complete transparency.
Ok let's start there. Why is it called 'be right back'?
It is an allusion to a dark Black Mirror episode.
Why build this?
Needed an excuse for that cute pixelated ghost running around.
Jokes aside (and I truly don't think this is a matter to joke about)... this is the question. Why did I build this? I suppose there are a lot of answers.
Part of it was just my fascination with Black Mirror, and that episode sticking with me for years.
Part of it is just the sheer technical element. I wanted a project where:
- 1. I had lots of accessible data (see tennis-scorigami for my woes on the price of data)
- 2. It involved local LLMs, since I think that's valuable experience now and models will continue to move to the edge
- a. Furthermore, it had LoRA / QLoRA opportunities.
- 3. It involved Rust, as I've been trying to get better there.
- 4. It involved RAG
- 5. It involved DuckDB. I've been looking for excuses to use DuckDB forever and read their vector similarity search article recently.
And finally, yeah I was partially curious if it was feasible.
Ok but so.... do you think you SHOULD have built it?
Hm in which perspective?
From a time perspective?
From a time perspective, probably not? It's not going to help me get a job at a foundational AI lab. My girlfriend definitely didn't think it was a better use of time than prepping for interviews or spending quality time with her... which I'd have to agree with her on.
What about from an engineer's perspective?
From a personal growth as an engineer perspective, yeah it was a good investment imo. New technology, better understanding of transformers (although I hate the craze), more Rust knowledge.
Ok you idiot what about ethically?
Yeah, no. I don't agree with this project. I don't love that we're asymptotically approaching all Black Mirror episodes. It obviously requires vast amounts of consent that are currently lacking. Extensions could include flipping it around and training only on your own responses (which people have done), or requiring a confirmation code from anyone you're going to spin up a persona for.
There are obviously real use cases. You should reach out if you're legitimately interested in using this. Karpathy said we're building ghosts and well this truly does feel like that. I don't like it though.
We do not need to be edging closer to a show built on dystopian premises.
My lord guys what are we doing. Is this how bleak it's going to get?
Regardless, my whole thesis / point is that this took maybe two or three weekends to build. Yeah, a lot of late nights and early mornings, but the point is that Pandora's box has already been opened. These tools are out in the public.
I'm not going to be the last one to build this. Hell, I'm probably not even the FIRST one to build this. There's far more dark shit going on with voice cloning, deepfakes for the dark web, and AI-driven cybersecurity attacks.
TLDR: I'm not releasing this project because I don't agree with it ethically; I just wanted to see if it was possible, and to learn new techniques and skills that are applicable elsewhere.
That's dumb... Is there any functional use of this?
At the moment, no, not really. I should (and hopefully will) take the bones / infra of this project and build a wrapper around iMessage on your Mac, so you can ask general questions and be reminded of when someone's birthday is, or what you and your partner talked about on a certain date two months ago, etc. That product seems more helpful and more beneficial. Should have done that... though it's slightly less hard, tbh.
What's your point with all this?
I think my point is two-fold. Technically, I wanted to learn new and interesting tech, skills, and practices that are seemingly everywhere. Ethically, I'm extremely concerned about the misuse of AI tools and the rate at which they're growing and being distributed.
How does it actually work?
Here's the technical rundown:
- 1. Data Extraction: We read your macOS Messages database (~/Library/Messages/chat.db) and extract conversations with your chosen contacts.
- 2. Training Data Prep: Messages are formatted into training examples that teach an AI model how that person texts.
- 3. LoRA Fine-Tuning: Using Apple's MLX framework, we train a small “adapter” (~50-200MB) on top of a base model like Qwen3.5. This is way more efficient than training a full model.
- 4. Local Inference: When you chat, the model generates responses using the base model + your trained adapter. Everything runs on your Mac's GPU via Metal acceleration.
- 5. RAG (Retrieval-Augmented Generation): We also index your conversations in a vector database, so the AI can pull relevant past messages for context. This makes responses way more authentic.
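Steps 1 and 2 can be sketched roughly like this. A simplified Python sketch, not the project's actual code: the `message`/`handle` column names match recent macOS versions of chat.db, but Apple changes that schema between releases, and the real training-example format would be richer than simple prompt/completion pairs.

```python
import sqlite3

# Assumed schema: texts live in `message`, joined to a contact via `handle`.
# Verify these column names against your macOS version before relying on them.
QUERY = """
SELECT message.text, message.is_from_me
FROM message
JOIN handle ON message.handle_id = handle.ROWID
WHERE handle.id = ? AND message.text IS NOT NULL
ORDER BY message.date
"""

def extract_messages(db_path, contact_id):
    """Pull the raw back-and-forth with one contact from chat.db."""
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(QUERY, (contact_id,)).fetchall()
    return [{"text": text, "from_me": bool(from_me)} for text, from_me in rows]

def to_training_examples(messages):
    """Pair each of my messages with the contact's reply; the reply is the
    completion the LoRA adapter learns to imitate."""
    examples = []
    for prompt, reply in zip(messages, messages[1:]):
        if prompt["from_me"] and not reply["from_me"]:
            examples.append({"prompt": prompt["text"], "completion": reply["text"]})
    return examples
```

Real message threads are messier than strict prompt/reply alternation (double texts, reactions, attachments), so the actual prep step has to merge consecutive messages from the same sender before pairing.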
All of this happens locally. No cloud. No API calls. Just fast, private AI on your device.
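The retrieval side of step 5 boils down to nearest-neighbor search over message embeddings. The project stores those in DuckDB and uses its vector similarity search; what follows is a dependency-free Python stand-in for that lookup, assuming an embedding function exists elsewhere in the pipeline.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec, indexed, k=3):
    """indexed: list of (message_text, embedding) pairs.
    Returns the k past messages most similar to the query."""
    ranked = sorted(indexed, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, retrieved):
    """Stuff retrieved messages into the prompt as context for the model."""
    context = "\n".join(f"- {m}" for m in retrieved)
    return f"Relevant past messages:\n{context}\n\nReply to: {question}"
```

In the real system the brute-force `sorted` scan is replaced by an indexed similarity query inside DuckDB, which is what makes retrieval fast enough over a full message history.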
This Project is Not Being Released
As mentioned above, I don't think this should be released without proper consent mechanisms. The code is available for educational purposes and transparency, but this is not a product I'm actively distributing.
If you have a legitimate use case or want to discuss the technical implementation, feel free to reach out.