Build powerful micro-agents that observe your digital world, remember what matters, and react intelligently—all while keeping your data 100% private and secure.
Website | WebApp | YouTube | Tiktok | Twitter
🧠 Text & Visual Memory • 🎥 Smart Screen Recording • 💾 Intelligent Context

Notification channels: 📧 Email • 💬 Discord • 📱 Telegram • 📞 SMS • 💚 WhatsApp • 🖥️ System Alerts (native OS notifications and pop-ups) • 📺 Observer Overlay (custom on-screen messages)
Creating your own Observer AI agent consists of three things:
- SENSORS - input that your model will have
- MODELS - models served by Ollama or by Ob-Server
- TOOLS - functions for your model to use
- Navigate to the Agent Dashboard and click "Create New Agent"
- Fill in the "Configuration" tab with basic details (name, description, model, loop interval)
- Give your model a system prompt and Sensors! The currently available Sensors are:
- Screen OCR ($SCREEN_OCR) Captures screen content as text via OCR
- Screenshot ($SCREEN_64) Captures screen as an image for multimodal models
- Agent Memory ($MEMORY@agent_id) Accesses agents' stored information
- Agent Image Memory ($IMEMORY@agent_id) Accesses agents' stored images
- Clipboard ($CLIPBOARD) Provides the current clipboard contents
- Microphone* ($MICROPHONE) Captures microphone audio and provides a transcription
- Screen Audio* ($SCREEN_AUDIO) Transcribes the audio of a screen-shared tab.
- All audio* ($ALL_AUDIO) Mixes the microphone and screen audio and provides a complete transcription of both (used for meetings).
* Uses a Whisper model running via transformers.js
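Sensors are interpolated into the system prompt at run time. A minimal illustrative prompt for a screen-watching agent might look like this (the wording is our own, not a built-in template):

```
You are a monitoring agent. Report anything on screen related to "deployment".

Current screen text:
$SCREEN_OCR
```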
Agent Tools:
- getMemory(agentId)* – Retrieve stored memory
- setMemory(agentId, content)* – Replace stored memory
- appendMemory(agentId, content)* – Add to existing memory
- getImageMemory(agentId)* – Retrieve images stored in memory
- setImageMemory(agentId, images) – Set images in memory
- appendImageMemory(agentId, images) – Add images to memory
- startAgent(agentId)* – Starts an agent
- stopAgent(agentId)* – Stops an agent
- time() – Gets the current time
- sleep(ms) – Waits for that number of milliseconds
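As a sketch of how the memory tools compose, the snippet below appends a line only if it isn't already stored, so repeated loop iterations don't write duplicate entries. The `getMemory`/`appendMemory` stand-ins here are local stubs (assuming plain string memory) so the example runs anywhere; inside the sandbox those functions are provided for you:

```javascript
// Local stand-ins for the sandbox's memory tools (assumption: memory is a plain string).
// Inside an Observer agent you would call getMemory()/appendMemory() directly.
let store = "";
const getMemory = () => store;
const appendMemory = (content) => { store += (store ? "\n" : "") + content; };

// Append a line only if memory doesn't already contain it.
function appendIfNew(line) {
  if (getMemory().includes(line)) return false;
  appendMemory(line);
  return true;
}
```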
Notification Tools:
- sendEmail(email, message, images?) – Sends an email
- sendPushover(user_token, message, images?, title?) – Sends a Pushover notification
- sendDiscord(discord_webhook, message, images?) – Sends a Discord message to a server
- sendTelegram(chat_id, message, images?) – Sends a Telegram message with the Observer bot. Get the chat_id by messaging the bot @observer_notification_bot
- sendWhatsapp(phone_number, message) – Sends a WhatsApp message with the Observer bot. Send a message to +1 (555) 783-4727 first to use it
- notify(title, options) – Sends a browser notification. ⚠️ IMPORTANT: Some browsers block notifications
- sendSms(phone_number, message, images?) – Sends an SMS to a phone number, e.g. sendSms("+181429367", "hello"). ⚠️ IMPORTANT: Due to A2P policy, some SMS messages are blocked; not recommended for US/Canada
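A common pattern is to have the system prompt instruct the model to emit a `NOTIFY: <message>` line only when something is worth reporting, then parse that line in code before calling one of the tools above. This convention is our own, not built into Observer:

```javascript
// Pull a "NOTIFY: ..." line out of the model's response, if present.
// Returns null when the model decided nothing was worth reporting.
function extractNotification(response) {
  const match = response.match(/^NOTIFY:\s*(.+)$/m);
  return match ? match[1].trim() : null;
}

// Inside an agent you might then do (webhook URL is a placeholder):
// const msg = extractNotification(response);
// if (msg) sendDiscord("your_webhook_url", msg);
```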
Video Recording Tools:
- startClip() – Starts a recording of any video media and saves it to the Recordings tab
- stopClip() – Stops an active recording
- markClip(label) – Adds a label to any active recording that will be displayed in the Recordings tab
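Clip labels read best when they are short. One sketch is to derive the label from the model's response before passing it to markClip (the first-sentence heuristic and 40-character cap are arbitrary choices, not Observer requirements):

```javascript
// Derive a short clip label from the model's response:
// take the first sentence and cap it at 40 characters.
function clipLabel(response) {
  const firstSentence = response.split(/[.!?]/)[0].trim();
  return firstSentence.slice(0, 40);
}

// e.g. inside an agent: markClip(clipLabel(response));
```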
App Tools:
- ask(question, title="Confirmation") – Pops up a system confirmation dialog
- message(message, title="Agent Message") – Pops up a system message
- system_notify(body, title="Observer AI") – Sends a system notification
- overlay(body) – Pushes a message to the overlay
- click() – Triggers a mouse click at the current cursor position. ⚠️ IMPORTANT: Position the mouse before the agent runs
Note: any function marked with * takes an agentId argument. If you omit agentId, it defaults to the agent that's running the code.
The "Code" tab receives the following variables as context before running:
- prompt – The model's prompt
- response – The model's response
- agentId – The id of the agent running the code
- screen – The screen as base64, if captured
- camera – The camera as base64, if captured
- imemory – The agent's current image array
- images – All images in context
JavaScript agents run in the browser sandbox, making them ideal for passive monitoring and notifications:
// Remove Think tags for deepseek model
const cleanedResponse = response.replace(/<think>[\s\S]*?<\/think>/g, '').trim();
// Get the current time (named timestamp so it doesn't shadow the time() tool)
const timestamp = time();
// Update memory with a timestamp
appendMemory(`[${timestamp}] ${cleanedResponse}`);
// Send to Telegram if the model mentions a word
if (response.includes("word")) {
  sendTelegram("12345678", cleanedResponse); // Example chat_id; signature is sendTelegram(chat_id, message)
}

There are a few ways to get Observer up and running with local inference. I recommend the Observer App.
Option 1: Install the Desktop App and use any OpenAI-compatible endpoint (Ollama, llama.cpp, vLLM)
Download the Official App:
Download Ollama for the best compatibility. Observer can connect directly to any server that provides a v1/chat/completions endpoint.
If not using Ollama, set the Custom Model Server URL in the App to any OpenAI-compatible endpoint.
NOTE: The browser app sends requests to localhost:3838, which the Observer App proxies to your Custom Model Server URL; this is required because of CORS.
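To sanity-check the proxy wiring, you can hand-build a v1/chat/completions request body and POST it to localhost:3838. A minimal sketch (the model name `gemma3:4b` is a placeholder, not a guaranteed default):

```javascript
// Build a minimal OpenAI-style chat request body.
// POST it to http://localhost:3838/v1/chat/completions (the App's local proxy).
function buildChatRequest(model, userText) {
  return JSON.stringify({
    model, // placeholder; use a model your server actually serves
    messages: [{ role: "user", content: userText }],
  });
}

const body = buildChatRequest("gemma3:4b", "ping");
// e.g. fetch("http://localhost:3838/v1/chat/completions",
//            { method: "POST", headers: { "Content-Type": "application/json" }, body })
```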
For Docker setup instructions, see docker/DOCKER.md.
For Jupyter server setup instructions, see app/JUPYTER.md.
Save your agent, test it from the dashboard, and upload to community to share with others!
We welcome contributions from the community! Here's how you can help:
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'feat: add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Built with ❤️ by Roy Medina for the Observer AI Community. Special thanks to the Ollama team for being an awesome backbone to this project!
