by Google
| Input price | Output price | Context length | JSON mode |
|---|---|---|---|
| $— /1M tok | $— /1M tok | 8K tokens | Yes |
Use this model through the LLMTest proxy: point your client's base URL at the proxy and set the model ID:
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-llmt_-key-here",
  baseURL: "https://llmtest.io/v1",
});

const response = await client.chat.completions.create({
  model: "google/gemma-3n-e2b-it:free",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Hello, how are you?" },
  ],
  temperature: 0.7, // 0-2, higher = more creative
  max_tokens: 1024, // max tokens to generate
  // stream: true, // enable streaming (SSE)
  // top_p: 0.9, // nucleus sampling
  // response_format: { type: "json_object" }, // guaranteed JSON output
});
```
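The commented `stream: true` option returns the response as server-sent events instead of a single payload. A minimal sketch of consuming the stream with the same client, assuming the proxy follows the OpenAI streaming chunk format:

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-llmt_-key-here",
  baseURL: "https://llmtest.io/v1",
});

// Request a streamed completion; the SDK exposes it as an async iterable.
const stream = await client.chat.completions.create({
  model: "google/gemma-3n-e2b-it:free",
  messages: [{ role: "user", content: "Write a haiku about proxies." }],
  stream: true,
});

for await (const chunk of stream) {
  // Each chunk carries a delta with the next slice of generated text.
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```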
| Parameter | Type | Description |
|---|---|---|
| model | string | Must be google/gemma-3n-e2b-it:free |
| messages | array | Array of message objects with role and content |
| temperature | number | Sampling temperature (0-2). Default varies by model. |
| max_tokens | integer | Max tokens to generate |
| top_p | number | Nucleus sampling (0-1) |
| stream | boolean | Stream response via SSE |
| stop | string \| array | Stop sequences |
| response_format | object | Set to {"type": "json_object"} for guaranteed JSON output |
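With `response_format` set to `{"type": "json_object"}`, the model is constrained to emit valid JSON. A minimal sketch, assuming the proxy mirrors OpenAI's JSON mode (the prompt should still ask for JSON explicitly):

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-llmt_-key-here",
  baseURL: "https://llmtest.io/v1",
});

const response = await client.chat.completions.create({
  model: "google/gemma-3n-e2b-it:free",
  messages: [
    // JSON mode still expects the prompt itself to request JSON.
    { role: "system", content: "Reply only with a JSON object." },
    { role: "user", content: 'List three primary colors as {"colors": [...]}.' },
  ],
  response_format: { type: "json_object" },
});

// The message content is a JSON string that can be parsed directly.
const data = JSON.parse(response.choices[0].message.content ?? "{}");
console.log(data);
```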