Simple pricing, built to scale with your needs

Pricing should never get in your way, which is why HoneyHive is free forever for individual developers and researchers.

Developer

Free

Request access

10k events per month

365-day log retention

Up to 2 users

Full evaluation and observability suite

Enterprise

Custom

Contact us

Custom usage limits

SSO & SAML

VPC hosting add-on

Dedicated support and SLA

Data Ingestion Limits
- Event Ingestion Volume: 10k per month (Developer); 50k or 500k per month (Team); Custom (Enterprise)
- Log Retention: 365 days (Developer and Team); Custom (Enterprise)
- Max Event Size: 256KB (Developer and Team); Over 1MB (Enterprise)
- Max Requests per Minute: 1,000 (Developer and Team); 60,000 (Enterprise)
Observability (Developer / Team / Enterprise)
- Distributed Tracing
- OpenTelemetry Support
- Performance Monitoring
- Custom Charts
- Exploratory Data Analysis
- Dataset Curation
- Human Annotation
- Data Export
- Alerts
Evaluation (Developer / Team / Enterprise)
- Online Evaluation w/ sampling
- Offline Batch Evaluation
- Test Suites
- Evaluation Reports
- CI/CD Integration
- Pre-built Evaluators
- Custom Evaluators: Human evaluators only (Developer); Code, LLM, and Human (Team and Enterprise)
Prompt Studio (Developer / Team / Enterprise)
- Playground
- Prompt Versioning and History
- Functions and External Tools
- Prompt Deployments
- Custom Models in Playground
Workspace (Developer / Team / Enterprise)
- Number of Users: 2 users (Developer); Unlimited (Team and Enterprise)
- Number of Projects: Unlimited (all plans)
- Collaboration & Lineage
Security (Developer / Team / Enterprise)
- SSO (social)
- SAML
- Custom SSO
- Role-based Access Control: Coming soon
Hosting
- Cloud Hosted in US (all plans)
- VPC Self-Hosting Add-On: AWS, Azure, or GCP
- InfoSec Review
- DPA and BAA
Support (Developer / Team / Enterprise)
- Community Support
- Email Support
- Slack Connect Channel
- SLA
- CSM and Team Trainings

"It's critical to ensure quality and performance across our AI agents. With HoneyHive's state-of-the-art evaluation and monitoring tools, we've not only improved the capabilities of our agents but also seamlessly deployed them to thousands of users — all while enjoying peace of mind."

Divyansh Garg

Co-Founder & CEO, MultiOn

Frequently asked questions
What is an event?

An event refers to a single trace span, structured log, or metric label combination sent to our API as OTLP or JSON. It captures any relevant data from your system, including all context fields generated by your application's instrumentation. Each event can be up to 256KB in size and can contain any number of values.
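As an illustration, here is a minimal sketch of what an event might look like serialized as JSON before it is sent. The field names below are hypothetical examples chosen for readability, not HoneyHive's exact schema:

```python
import json

# A single illustrative event: one LLM-call span captured as JSON.
# All field names and values here are hypothetical examples.
event = {
    "event_type": "model",                 # e.g. a model call span
    "event_name": "openai.chat",
    "trace_id": "7f9a2b1c",                # ties spans in one trace together
    "duration_ms": 412,
    "inputs": {"prompt": "Summarize the quarterly report."},
    "outputs": {"text": "The report covers Q3 revenue and churn."},
    "metadata": {"model": "gpt-4o", "temperature": 0.2},
}

payload = json.dumps(event).encode("utf-8")
# Each event may be up to 256KB once serialized.
assert len(payload) <= 256 * 1024
```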

What is an evaluator?

Automated Evaluators: An automated evaluator is a function (code or LLM) that helps you unit-test any event or combination of events to produce a measurable score. Common examples include Context Precision, ROUGE, Coherence, BERT Score, and more. We provide many common evaluators out-of-the-box and let you define custom evaluators within the platform.
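As a sketch, a custom code evaluator can be as simple as a function that takes an event and returns a score. The event shape and function signature here are hypothetical, not HoneyHive's actual evaluator interface:

```python
# Hypothetical custom code evaluator: scores a single event by checking
# whether each expected fact appears in the model output.
def fact_recall(event: dict) -> float:
    """Return the fraction of expected facts found in the output text."""
    output = event["outputs"]["text"].lower()
    facts = event["ground_truth"]["facts"]
    hits = sum(1 for fact in facts if fact.lower() in output)
    return hits / len(facts) if facts else 0.0

score = fact_recall({
    "outputs": {"text": "Paris is the capital of France."},
    "ground_truth": {"facts": ["Paris", "France"]},
})
# Both facts appear in the output, so the score is 1.0.
```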

Human Evaluators: We strongly encourage a hybrid evaluation approach, i.e., combining automated techniques with human evaluation. This helps you account for metric bias and better align your evaluators with your domain experts' scoring rubrics. To enable this, you can define custom scoring rubrics in HoneyHive for graders to use when evaluating traces.

Do you support fine-tuning?

HoneyHive allows you to filter and curate datasets from your production logs. These datasets can be annotated by domain experts within the platform and exported programmatically for fine-tuning open-source models.

You can export datasets curated within HoneyHive using our SDK and use your preferred GPU cloud and optimization method (such as SFT, DPO, etc.) to fine-tune custom models. You can optionally also build active learning pipelines using our SDK to periodically export logs and run fine-tuning and validation jobs with your preferred fine-tuning providers. Contact us to learn more.
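To make the export step concrete, here is a minimal sketch of turning curated logs (already fetched via the SDK or API) into an SFT-style JSONL dataset. The event fields and the rating-based filter are hypothetical:

```python
import json

def to_sft_jsonl(events: list[dict]) -> str:
    """Convert curated log events into JSONL lines for supervised fine-tuning."""
    lines = []
    for e in events:
        # Keep only examples that annotators rated highly (hypothetical field).
        if e.get("rating", 0) >= 4:
            lines.append(json.dumps({
                "prompt": e["inputs"]["prompt"],
                "completion": e["outputs"]["text"],
            }))
    return "\n".join(lines)

events = [
    {"inputs": {"prompt": "Hi"}, "outputs": {"text": "Hello!"}, "rating": 5},
    {"inputs": {"prompt": "Eh"}, "outputs": {"text": "Hmm."}, "rating": 1},
]
dataset = to_sft_jsonl(events)  # one JSONL line: the highly-rated example
```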

Is my data secure? 

All data is encrypted in transit and at rest, managed by AWS. We conduct regular penetration tests, are currently undergoing a SOC 2 audit, and provide flexible hosting options (cloud-hosted or in your VPC) to meet your security and privacy needs.

How do I manage and version prompts?

By default, we do not proxy your requests via our servers. Instead, we store prompts as configurations, which can be deployed and used in your application logic using the GET /Configuration API endpoint.
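For illustration, once your application has fetched a configuration, it can render the stored prompt locally and pass the result to its own LLM client. The config fields below are hypothetical, not the exact response shape of the GET /Configuration endpoint:

```python
# Hypothetical shape of a prompt configuration fetched from the
# GET /Configuration endpoint; prompts are stored as configs rather
# than proxied through HoneyHive's servers.
config = {
    "name": "support-agent",
    "version": 3,
    "model": "gpt-4o",
    "template": "You are a support agent for {product}. Be concise.",
    "hyperparameters": {"temperature": 0.1},
}

def render(config: dict, **variables) -> dict:
    """Fill the template and return kwargs for your own LLM client call."""
    return {
        "model": config["model"],
        "messages": [{"role": "system",
                      "content": config["template"].format(**variables)}],
        **config["hyperparameters"],
    }

kwargs = render(config, product="HoneyHive")
```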

How do I log my data? 

You can log traces and batch evaluation runs using our tracers and API endpoints, or asynchronously via our batch ingestion API endpoint. We offer native SDKs in Python and TypeScript with OpenTelemetry support, and provide additional integrations with popular frameworks like LangChain and LlamaIndex.

How do I instrument my application?

We use OpenTelemetry (OTEL) to auto-instrument applications in Python and TypeScript.

If you use another language, you can send your OpenTelemetry traces to our OTEL endpoint or manually instrument your application using our APIs.
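For a rough sense of what that looks like from an arbitrary language, here is the general shape of an OTLP/JSON trace export built with only the standard library. The span names and attributes are made-up examples, and the exact endpoint URL and auth headers come from your HoneyHive project settings:

```python
import json
import secrets
import time

# Build one span in OTLP/JSON form: the HTTP body you would POST to an
# OTLP-compatible endpoint (typically at a /v1/traces path).
now = time.time_ns()
otlp_payload = {
    "resourceSpans": [{
        "resource": {"attributes": [
            {"key": "service.name", "value": {"stringValue": "my-app"}},
        ]},
        "scopeSpans": [{
            "scope": {"name": "my-app"},
            "spans": [{
                "traceId": secrets.token_hex(16),   # 32 hex chars
                "spanId": secrets.token_hex(8),     # 16 hex chars
                "name": "llm.chat",
                "kind": 1,                          # SPAN_KIND_INTERNAL
                "startTimeUnixNano": str(now),
                "endTimeUnixNano": str(now + 412_000_000),
                "attributes": [
                    {"key": "model", "value": {"stringValue": "gpt-4o"}},
                ],
            }],
        }],
    }],
}
body = json.dumps(otlp_payload).encode("utf-8")
# POST `body` with Content-Type: application/json to the OTEL endpoint.
```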

Ship reliable AI products that your users trust