Simple pricing, built to scale with your needs

Pricing should never get in your way. That's why HoneyHive is free forever for individual developers and researchers.

Developer

For developers just getting started

Free

Join waitlist

10,000 events per month

Full evaluation and observability suite

Playground and Prompt Registry

1 workspace member

Usage Limits
- Event Ingestion Volume: 10,000 per month (Developer); Custom (Enterprise)
- Data Retention: 30 days (Developer); Custom (Enterprise)
- Max Event Size: 256KB (both plans)
- Max Event Volume per Hour: 500MB (Developer); Custom (Enterprise)
- Max Requests per Minute: 1,000 (Developer); Custom (Enterprise)
Observability
- Trace Debugging
- OpenTelemetry Support
- Performance Monitoring
- Dynamic Dashboard
- Exploratory Data Analysis
- Dataset Curation
- Human Annotation
- Data Export
- Alerts
Evaluation
- Online Evaluation with sampling
- Offline Batch Evaluation
- Test Suites
- Evaluation Reports
- CI/CD Integration
- Pre-built Evaluators
- Custom Evaluators: Human Evaluators only (Developer); Code, LLM, and Human (Enterprise)
Prompt Studio
- Playground
- Prompt Versioning and History
- Functions and External Tools
- Deployment Environments
- Custom Models
Workspace
- Number of Users: 1 user (Developer); Unlimited (Enterprise)
- Number of Projects: Unlimited (both plans)
- Collaboration & Lineage
Security
- SSO (social)
- SAML
- Custom SSO
- Role-based Access Control: Coming soon
- Hosting: Cloud Hosted in US (both plans)
- VPC Deployment Add-On
- InfoSec Review
- DPA and BAA

Support
- Community Support
- Email Support
- Slack Connect Channel
- SLA
- CSM and Team Trainings

"It's critical to ensure quality and performance across our LLM agents. With HoneyHive, we've not only improved the capabilities of our agents but also seamlessly deployed them to thousands of users — all while enjoying peace of mind."

Divyansh Garg

Co-Founder & CEO, MultiOn

Frequently asked questions
What is an event?

An event is a single trace span, structured log, or metric-label combination sent to our API as OTLP or JSON. It captures any relevant data from your system, including all context fields generated by your application's instrumentation. Each event can be up to 256KB in size and can contain any number of values.
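For illustration, a JSON event might look like the sketch below. The field names are hypothetical, not HoneyHive's actual schema; the point is that an event is one structured record, bounded at 256KB:

```python
import json

# Hypothetical event payload -- field names are illustrative only,
# not HoneyHive's actual schema.
event = {
    "event_type": "model",            # e.g. a single LLM call
    "event_name": "generate_answer",
    "inputs": {"question": "What is OTLP?"},
    "outputs": {"answer": "The OpenTelemetry protocol."},
    "metadata": {"model": "gpt-4", "temperature": 0.2},
    "duration_ms": 512,
}

payload = json.dumps(event).encode("utf-8")

# Each event may be at most 256KB.
assert len(payload) <= 256 * 1024, "event exceeds the 256KB limit"
print(len(payload), "bytes")
```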

What is an evaluator?

Automated Evaluators: An automated evaluator is a function (code or LLM) that lets you unit test any event or combination of events to generate a measurable score. Common examples include Context Precision, ROUGE, Coherence, BERT Score, and more. We provide many common evaluators out-of-the-box and allow defining custom evaluators within the platform.

Human Evaluators: We strongly encourage a hybrid evaluation approach, i.e., combining automated techniques with human evaluation. This helps you account for metric bias and better align your evaluators with your domain experts' scoring rubric. To enable this, you can define custom scoring rubrics in HoneyHive for graders to use when evaluating traces.
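At its core, a custom code evaluator is just a function from an event to a score. The toy evaluator below (and the event shape it reads) is a hypothetical sketch, not a built-in:

```python
def keyword_coverage_evaluator(event: dict, keywords: list[str]) -> float:
    """Toy code evaluator: fraction of required keywords that appear
    in the event's output text. Returns a score in [0, 1]."""
    text = event.get("outputs", {}).get("answer", "").lower()
    if not keywords:
        return 1.0
    hits = sum(1 for kw in keywords if kw.lower() in text)
    return hits / len(keywords)

event = {"outputs": {"answer": "OTLP is the OpenTelemetry protocol."}}
score = keyword_coverage_evaluator(event, ["otlp", "opentelemetry"])
print(score)  # 1.0
```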

Do you support fine-tuning?

HoneyHive allows you to filter and curate datasets from your production logs. These datasets can be annotated by domain experts within the platform and exported programmatically for fine-tuning open-source models.

You can export datasets curated within HoneyHive using our SDK and use your preferred GPU cloud and optimization method (such as SFT, DPO, etc.) to fine-tune custom models. You can optionally also build active learning pipelines using our SDK to periodically export logs and run fine-tuning and validation jobs with your preferred fine-tuning providers. Contact us to learn more.
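A common export target for fine-tuning is JSONL in a chat format. The sketch below converts curated (input, output) pairs into that shape; the record layout is an assumption about your fine-tuning provider, and fetching the dataset itself would go through the HoneyHive SDK:

```python
import json

def to_finetune_jsonl(examples: list[dict]) -> str:
    """Convert curated (input, output) pairs into JSONL chat records,
    one record per line -- a typical SFT input format (assumed here)."""
    lines = []
    for ex in examples:
        record = {
            "messages": [
                {"role": "user", "content": ex["input"]},
                {"role": "assistant", "content": ex["output"]},
            ]
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)

examples = [
    {"input": "Summarize OTLP.",
     "output": "OTLP is the OpenTelemetry wire protocol."},
]
print(to_finetune_jsonl(examples))
```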

Is my data secure? 

All data is secure and encrypted in transit and at rest, managed by AWS. We conduct regular penetration tests, are currently undergoing a SOC 2 audit, and provide flexible hosting options (cloud-hosted or in your VPC) to meet your security and privacy needs.

How do I manage and version prompts?

By default, we do not proxy your requests via our servers. Instead, we store prompts as configurations, which can be deployed and used in your application logic using the GET /Configuration API endpoint.
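Because prompts are stored as configurations rather than proxied, your application fetches the configuration once and renders it locally at call time. The configuration shape below is a hypothetical sketch, not the actual API response:

```python
# Hypothetical payload, as a configuration endpoint might return it --
# the field names here are illustrative only.
config = {
    "model": "gpt-4",
    "template": "You are a support agent for {product}. Answer: {question}",
    "version": 3,
}

def render_prompt(config: dict, **variables: str) -> str:
    """Fill the stored prompt template with runtime variables."""
    return config["template"].format(**variables)

prompt = render_prompt(config, product="HoneyHive", question="What is an event?")
print(prompt)
```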

How do I log my data? 

You can log traces and batch evaluation runs using our tracers and API endpoints, or asynchronously via our batch ingestion API endpoint. We offer native SDKs in Python and TypeScript with OpenTelemetry support, and provide additional integrations with popular frameworks like LangChain and LlamaIndex.

How do I instrument my application?

We use the OpenTelemetry protocol (OTLP) to auto-instrument applications in Python and TypeScript.

If you're working in another language, you can export your OpenTelemetry traces to our OTLP endpoint or manually instrument your application using our APIs.

Ship reliable AI products that your users trust