NVIDIA: Llama 3.1 Nemotron Ultra 253B v1
Llama-3.1-Nemotron-Ultra-253B-v1 is a large language model (LLM) optimized for advanced reasoning, human-interactive chat, retrieval-augmented generation (RAG), and tool-calling tasks. Derived from Meta’s Llama-3.1-405B-Instruct, it has been significantly customized using Neural Architecture Search (NAS), resulting in enhanced efficiency, reduced memory usage, and improved inference latency. The model supports a context length of up to 128K tokens and can operate efficiently on an 8x NVIDIA H100 node. Note: you must include `detailed thinking on` in the system prompt to enable reasoning. Please see [Usage Recommendations](https://huggingface.co/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1#quick-start-and-usage-recommendations) for more.
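As a minimal sketch, the snippet below shows reasoning mode being turned on by setting the system prompt to `detailed thinking on`; the endpoint, placeholder API key, and model ID are the same ones used in the API Usage section further down, and the user question is illustrative only.

```python
from openai import OpenAI

# Endpoint and placeholder key taken from the API Usage section below.
client = OpenAI(
    api_key="your-dream-api-key",
    base_url="https://api.invoicedream.co.kr/v1"
)

response = client.chat.completions.create(
    model="nvidia/llama-3.1-nemotron-ultra-253b-v1",
    messages=[
        # Per the model card, reasoning is toggled via the system prompt.
        {"role": "system", "content": "detailed thinking on"},
        {"role": "user", "content": "Which is larger, 9.9 or 9.11? Explain."}
    ],
)

print(response.choices[0].message.content)
```

Without that system prompt, reasoning is not enabled and the model responds as a regular chat assistant.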
Pricing: OpenRouter cost (per 1M tokens) and InvoiceLab monthly estimate (based on 0.5M input + 0.5M output tokens).
Model Information
Basic Information
| Field | Value |
|---|---|
| Model ID | nvidia/llama-3.1-nemotron-ultra-253b-v1 |
| Provider | NVIDIA |
| Context window | 131,072 tokens |
| Modality | text->text |
Supported Features
API Usage
Python (OpenAI SDK compatible)
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-dream-api-key",
    base_url="https://api.invoicedream.co.kr/v1"
)

response = client.chat.completions.create(
    model="nvidia/llama-3.1-nemotron-ultra-253b-v1",
    messages=[
        {"role": "user", "content": "안녕하세요"}
    ]
)

print(response.choices[0].message.content)
```

Node.js / TypeScript
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'your-dream-api-key',
  baseURL: 'https://api.invoicedream.co.kr/v1'
});

const response = await client.chat.completions.create({
  model: 'nvidia/llama-3.1-nemotron-ultra-253b-v1',
  messages: [{ role: 'user', content: '안녕하세요' }]
});

console.log(response.choices[0].message.content);
```

cURL
```bash
curl https://api.invoicedream.co.kr/v1/chat/completions \
  -H "Authorization: Bearer your-dream-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nvidia/llama-3.1-nemotron-ultra-253b-v1",
    "messages": [{"role": "user", "content": "안녕하세요"}]
  }'
```

💡 Tip: You can use the OpenAI SDK as-is. Just change the `base_url`!
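Since the model card lists tool calling among the supported tasks, the sketch below shows what a tool-calling request could look like through the same OpenAI-compatible endpoint. The `get_invoice_total` function and its schema are hypothetical, and whether tool definitions are forwarded to this model should be verified against the gateway's documentation.

```python
from openai import OpenAI
import json

client = OpenAI(
    api_key="your-dream-api-key",
    base_url="https://api.invoicedream.co.kr/v1"
)

# Hypothetical tool definition, for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "get_invoice_total",
        "description": "Return the total amount of an invoice by its ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "invoice_id": {"type": "string", "description": "Invoice identifier"}
            },
            "required": ["invoice_id"]
        }
    }
}]

response = client.chat.completions.create(
    model="nvidia/llama-3.1-nemotron-ultra-253b-v1",
    messages=[{"role": "user", "content": "What is the total of invoice INV-1024?"}],
    tools=tools,
)

# If the model chooses to call a tool, the arguments arrive as a JSON string.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```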