What are the differences?
Between the Starling LM 7B Beta and Qwen1.5 14B Chat models, which follows instructions best?
Starling LM 7B Beta
Berkeley Nest

Qwen1.5 14B Chat
Alibaba Cloud
Overview

| | Starling LM 7B Beta | Qwen1.5 14B Chat |
|---|---|---|
| Provider: Organization responsible for this model. | Berkeley Nest | Alibaba Cloud |
| Input Context Window: The total number of tokens that the input context window can accommodate. | 3.1K | 33K |
| Maximum Output Tokens: The maximum number of tokens this model can produce in one operation. | 4.1K | Not specified. |
| Release Date: The initial release date of the model. | November 15, 2023 (24 months ago) | February 5, 2024 (21 months ago) |
| Knowledge Cutoff: The latest date for which the information provided is considered reliable and current. | 2024/3 |

Pricing

| | Starling LM 7B Beta | Qwen1.5 14B Chat |
|---|---|---|
| Input: Costs associated with the data input to the model. | Not specified. | Not specified. |
| Output: Costs associated with the tokens produced by the model. | Not specified. | Not specified. |

Benchmark

| | Starling LM 7B Beta | Qwen1.5 14B Chat |
|---|---|---|
| MMLU: Assesses LLMs' ability to acquire knowledge in zero-shot and few-shot scenarios. | 63.9 | |
| MMMU: Comprehensive benchmark covering multiple disciplines and modalities. | | |
| HellaSwag: A demanding benchmark for sentence completion tasks. | | |
| Arena Elo: Ranking metric for LMSYS Chatbot Arena. | 1119 | 1108 |

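
To give a rough sense of what the 11-point Arena Elo gap in the table means, the sketch below applies the textbook Elo expected-score formula with a scale of 400. This is an assumption for illustration only: the leaderboard's actual rating procedure may differ, and the function name `elo_expected_score` is hypothetical, not part of any library.

```python
def elo_expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the textbook Elo model (assumed here)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Arena Elo values taken from the benchmark table above.
starling_elo = 1119
qwen_elo = 1108

p_starling = elo_expected_score(starling_elo, qwen_elo)
print(f"Expected head-to-head score for Starling LM 7B Beta: {p_starling:.3f}")
```

Under this assumption the expected score is only slightly above 0.5 (about 0.516), so the two models are close to evenly matched in pairwise preference terms, and the Elo figures alone do not settle which one follows instructions best.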



