Changelog

January 2024

January has been our most productive month to date.

We've successfully migrated our entire platform to a new, more efficient architecture. Our new Radars and Evaluations features are now publicly available to all users on Unlimited or self-hosted plans, and we've shipped many usability improvements to the app.

Platform Overhaul

We've revamped our entire source code and eliminated the need for Vercel and Supabase, leading to several improvements:

  • Enhanced scalability of the app
  • Access to all dashboard data via the API
  • 10x increase in our speed of rolling out new features
  • Fewer security concerns and a smaller error surface from third-party dependencies
  • Simplified self-hosting setup
  • Faster dashboard performance

This is what it looks like in our GitHub graph:

Radars

Radars have moved out of private access.

Radars are AI-powered alerts that monitor your runs for specific conditions, such as personal data, profanity, or negative sentiment.

Internally, we've deployed efficient, lightweight AI models that scan runs without relying on external API queries, perfect for self-hosted setups.

Evaluations

We've reimagined LLM evaluations from scratch, opting for a no-code approach.

You can design and execute evaluations directly from the dashboard without needing to be a Python expert or have prior evaluation experience.

This is made possible through intuitive blocks that assess various aspects like:

  • How closely your model's outputs match an ideal output
  • The presence of hallucination in responses
  • Cost and token usage
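
To give a feel for the first kind of check, here is a minimal sketch of scoring how closely an output matches an ideal answer. It uses plain string similarity from Python's standard library, which is an assumption made for illustration and not necessarily how the dashboard blocks work; the function names and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(output: str, ideal: str) -> float:
    """Return a 0.0-1.0 similarity ratio between a model output and an ideal answer."""
    # Normalize case and whitespace so trivial formatting differences are ignored.
    a = " ".join(output.lower().split())
    b = " ".join(ideal.lower().split())
    return SequenceMatcher(None, a, b).ratio()


def passes(output: str, ideal: str, threshold: float = 0.8) -> bool:
    """Hypothetical pass/fail rule: the output must be at least `threshold` similar."""
    return similarity(output, ideal) >= threshold
```

Character-level similarity is crude (it ignores meaning), so treat it as a baseline rather than a replacement for model-based checks.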

You can create evaluations with 20+ models, including open-source options like Mistral, directly from the dashboard.

Some evaluation scenarios still require code, for example testing your custom agents or integrating into a CI pipeline. For these, we will soon release the Evaluations SDK, which will let you run dashboard-created evals from your codebase so that both technical and non-technical team members can contribute.

We're continuously refining these features and are open to providing private access to the Evaluations SDK to those interested.

App Enhancements

We've also pushed a lot of improvements to the app, such as:

  • Merged "LLM Calls", "Chats" and "Traces" into a single "Logs" page, improving filters and search functionality across all data types.

  • New filtering engine with a lot of additional filters (with more on the way).

  • Add tool calls to templates and preview them in your runs.

  • Button to duplicate templates

  • Improved filter and dashboard speed.

  • Simpler navigation around the app

Alongside these updates, we've made numerous fixes and are focused on extending these advancements to more affordable plans as we further optimize and reduce costs.

We eagerly await your feedback :)

Also, if anyone in San Francisco wants to meet up with the founders, we're here for the month. Hit us up!

December 2023

📙 Prompt Templates in Alpha

Our new Prompt Templates feature is ready in alpha today.

Collaborate with non-technical team members, version your prompts, and decouple them from your source code.
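
As a rough illustration of what decoupling and versioning prompts means, here is a small sketch built only on Python's standard library. It is not the Lunary SDK: the `TEMPLATES` store, the `render` helper, and the template names are all hypothetical, and in practice the templates would live outside your codebase.

```python
from string import Template
from typing import Optional

# Hypothetical store: template name -> version number -> prompt text.
# In a real setup this would be fetched from outside the codebase.
TEMPLATES = {
    "support-reply": {
        1: Template("Answer the customer politely: $question"),
        2: Template("You are a support agent for $product. Answer politely: $question"),
    }
}


def render(name: str, variables: dict, version: Optional[int] = None) -> str:
    """Render a named template, using the latest version unless one is pinned."""
    versions = TEMPLATES[name]
    chosen = version if version is not None else max(versions)
    # substitute() raises KeyError if the template needs a variable that is missing.
    return versions[chosen].substitute(variables)
```

Because the application code only refers to a template by name, the prompt text can change (or be rolled back to an earlier version) without touching the code.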

It's available today to all users, regardless of plan, and the docs can be found here.

JS integration is ready and we'll release the Python integration later this week.

Any feedback is greatly appreciated. We'll keep iterating on this over the coming weeks based on what we hear from you.

🐍 Chat tracking in Python

We heard you and greatly simplified the API for tracking messages.

You can also now track users' conversations directly from your Python or JS backend, with a much simpler API.

Check out the new Chat tracking docs here.

📷 Support for OpenAI vision

We now support OpenAI's vision models and you can see the pictures used in your dashboard.

🤖 More models in Playground

Find Gemini Pro and Mixtral in the prompt playground.

🚄 Faster dashboard

We've turbocharged our dashboard and Postgres database for quicker data rendering. Search, filters and navigation should be much quicker.

It's still a work in progress, but you should already feel the difference with heavy data loads.

There are also many more bug fixes and UI improvements to the dashboard.

Enjoy the holidays!

December 2023

Big news from our corner! We're shifting gears and our startup's name is changing from LLMonitor to Lunary.ai. Here's the lowdown.

First off, let's be real - saying 'LLMonitor' wasn't the smoothest. We heard you trying to pronounce it (and saw some of those puzzled looks). So, we're making it easier. Lunary rolls off the tongue way better, doesn't it?

Also, a bit of a hiccup with Google. Our early SEO experiments, let's just say, were a tad too experimental and abrupt. Google wasn't thrilled, and we got a penalty on our main llmonitor.com domain that made us impossible to find on Google. Lesson learned.

But hey, every cloud has a silver lining. This change isn't just about a new name. It's a fresh start, a clearer identity. Lunary.ai reflects our work better - to build the best AI developer platform.

Stay tuned for what's next. We're just getting started.

November 2023

🔍 Prompt Playground in public beta (give us feedback!)

As you may have seen, we opened access to the Prompt Playground to all Pro and Unlimited users.

The prompt playground lets you edit prompts you receive.

You can also play with different models (such as Anthropic's, Google's or even Mistral's) to see how they could fit your app.

We're working on adding a History view, as well as even more open-source models, to the playground.

If you've started using it, we'd love to get your feedback.

🔈 Support for OpenAI DevDay updates

Our Python, JS & LangChain SDKs were updated to support the new tools API and our dashboard will now properly track cost for the newly released models.

We are working on support for the new Assistants API :)

🌑 Dark mode

Working late at night? The dashboard will now follow your system's settings.

💬 Automatic tokenizer for all models

Tokens will now be automatically tracked for all models. If you're using Claude or Google's models, token counting and cost calculation are even more precise.
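
For context, cost calculation boils down to multiplying token counts by a per-token price. The sketch below shows that arithmetic only; the model name and the prices are made-up placeholders, and this is not the dashboard's actual pricing table or code.

```python
from decimal import Decimal

# Hypothetical prices in USD per 1,000 tokens (placeholders, not real pricing).
PRICES = {
    "example-model": {"prompt": Decimal("0.0010"), "completion": Decimal("0.0020")},
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one call from its prompt and completion token counts."""
    price = PRICES[model]
    total = (
        Decimal(prompt_tokens) * price["prompt"]
        + Decimal(completion_tokens) * price["completion"]
    )
    # Prices are quoted per 1,000 tokens, so scale the total down.
    return total / Decimal(1000)
```

`Decimal` is used instead of floats so that small per-call costs add up without rounding drift. The accuracy of the result depends entirely on the token counts, which is why a precise tokenizer per model matters.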

🗃️ Public API & AI Evaluations in beta access

We just released the docs for the Data API which you can find here.

We'll soon add more endpoints to fetch aggregate statistics as well as data from your users.

AI Evaluation is a new feature that allows you to test, using AI, whether your prompts match criteria that you set.

For example, you can use evaluations to test that your prompt returns valid JSON, or that the response doesn't contain any bad words.

You can then view statistics and explore the queries that failed (or passed) your criteria.

You can also use evaluators to evaluate the sentiment of your responses, and use that as a basis for fine-tuning.
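
To make the JSON and bad-word examples concrete, here is a minimal rule-based sketch of those two criteria in plain Python. The real feature uses AI to judge responses; this deterministic version is only an illustration, and the function names and the word list are hypothetical.

```python
import json
import re


def is_valid_json(response: str) -> bool:
    """Return True if the whole response parses as JSON."""
    try:
        json.loads(response)
        return True
    except json.JSONDecodeError:
        return False


def contains_banned_word(response: str, banned: set) -> bool:
    """Return True if any whole word in the response is in the banned set."""
    words = re.findall(r"[a-z']+", response.lower())
    return any(word in banned for word in words)


def evaluate(response: str, banned: set) -> dict:
    """Run both checks and report which criteria the response meets."""
    return {
        "valid_json": is_valid_json(response),
        "clean": not contains_banned_word(response, banned),
    }
```

For instance, `evaluate('{"answer": 42}', {"darn"})` reports both criteria as met, while a plain-text reply containing "darn" fails both.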

🐛 Support for OpenAI streaming queries in Python & JS

We recently fixed the support for streaming queries (including cost calculation on the Python SDK). It was already working on the JavaScript SDK.

LangChain support of AWS Bedrock, Google Palm and Anthropic

We fixed support for a lot of Chat classes in LangChain.

🔗 HTTP endpoint for reporting events

If you're working in an environment where our SDKs are not available, you can now use our API endpoint to report events. You can find the docs here, but they are still a work in progress.

🐛 And a lot more...

We've also fixed Bedrock and Google support with LangChain, pushed a lot of UI updates to the dashboard and improved the performance of the search and event ingestion.

In a few days we'll release better filtering to make browsing data easier.

And that's all for this month!

Thanks for being a user 🫶

October 2023

First, we wanted to say thanks to everyone who is using Lunary ❤️

Your feedback has been invaluable and we've been hard at work improving the product and SDKs.

The most notable changes these past few weeks include:

  • 🔍 Search & filters in dashboard
  • 🔈 Feedback tracking in general availability
  • 💬 Chat replays and recordings in beta
  • 🗃️ Exports (on Pro plans)
  • 🐛 Lots of bug fixes across the board, especially in the SDKs, such as token counting for streaming queries (too many to list, but if you had an issue with one of our SDKs in the past, odds are it's fixed)
  • 📜 Revamped documentation: it's now much clearer how to set up user tracking, tags, and feedback tracking
  • 💰 A cheaper Pro plan at $25 / month (it's an experiment, not sure yet if we'll keep it)

I also want to reiterate an offer I made to some users: if you'd like help integrating Lunary into your app, we (Hugh & I, the founders) would love to help you 1-1 for free. Email me or book a calendar slot if you're interested.

👉 Read the new docs