body.store
=true
can easily record the input and output content of the request body. It can be rewritten with parameters and forced to overwrite the store to record client communication content or implement cache functions. (Currently only supports the recording function of OpenAI’s Chat.Completions API)body.omg_extra_body.use_omg_cache
to true
to realize cache retrieval. If the stored request requires the use of cache, it can be returned directly without any deduction.npx omgvibe
in the command line to configure CodeX or Claude Code with one click, https://www.ohmygpt.com/cli-agent. The relevant scripts are open source, warehouse address: https://github.com/DogeNet-LLC/omgvibe (nickname: Nano Banana) is supported. This is the latest model released by Google to support image generation. It performs well on image generation tasks and can generate high-quality images. Can be called via /v1/chat/completions
:{
"model": "vertex-gemini-2.5-flash-image-preview",
"messages": [
{ "role": "user", "content": [{ "type": "text", "text": "Generate a picture of Hakimi" }] }
]
}
{
"model": "vertex-gemini-2.5-flash-image-preview",
"messages": [
{ "role": "user", "content": [{ "type": "text", "text": "Generate a picture of Hakimi" }] }
],
"omg_extra_body": { "gemini_b64img_upload_enabled": true }
}
/apis/overwrite-rules
)gpt-5
gpt-5-2025-08-07
gpt-5-mini
gpt-5-mini-2025-08-07
gpt-5-nano
gpt-5-nano-2025-08-07
gpt-5-chat-latest
ark-doubao-seed-1.6-250615
ark-doubao-seed-1.6-thinking-250615
ark-doubao-seed-1.6-flash-250615
gemini-2.5-pro
gemini-2.5-flash
gemini-2.5-flash-lite-preview-06-17
gemini-2.5-pro-preview-06-05
ark-deepseek-r1-250528
claude-sonnet-4-0
claude-sonnet-4-20250514
claude-opus-4-0
claude-opus-4-20250514
-thinking-enabled
suffix.doubao-1.5-thinking-pro-m-250428
doubao-1.5-thinking-vision-pro-250428
doubao-1.5-vision-pro-250328
doubao-1.5-vision-lite-250315
gemini-2.5-flash-preview-05-20
gemini-2.5-flash-preview-05-20-thinking-disabled
reasoning_effort
= none
gemini-2.5-pro-preview-05-06
gpt-image-1
gemini-2.5-flash-preview-04-17-thinking-disabled
request to disable thinking and output directly, suitable for fast translation scenariosgemini-2.5-flash-preview-04-17
max_tokens
parameter in the o-series are automatically rewritten to the max_completion_tokens
parametero3
o3-2025-04-16
o4-mini
o4-mini-2025-04-16
doubao-1.5-thinking-pro-250415
doubao-1.5-thinking-pro-vision-250415
claude-3-7-sonnet-20250219
+ -thinking-enabled
(for example, claude-3-7-sonnet-20250219-thinking-enabled
) to enable thinking compatibility support{
"model": "claude-3-7-sonnet-20250219-thinking-enabled",
"messages": [
{
"role": "user",
"content": "Let me write a 10-character Tang poem"
}
],
"stream": false
}
{
"id": "chatcmpl-f5YCgBV1YWsGe8T6XtHT5DB9CgUak",
"object": "chat.completion",
"created": 1744813755,
"model": "claude-3-7-sonnet-20250219",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "<think>I need to create a 10-character Tang poem. Tang poems usually have different formats, such as five-character, seven-character, etc., but here the requirement is a 10-character poem. A 10-character poem is not a common format for traditional Tang poetry, but I can create a poem containing 10 Chinese characters, paying attention to the tone and artistic conception.\n\nI can provide a single-line 10-character poem, or a multi-line poem with a total of 10 characters. A short poem of 10 words. \n\nPlan 1: A line of 10 words. For example: \n\"The spring breeze blows and the fragrance of flowers fills the mountains\n" (11 words, does not match)\n\"The bright moon hangs high and people are thinking alone in the quiet night\n\n\n" "(10 words, consistent)\n\"When the mountains are green and the water is singing and the birds are singing and the flowers are fragrant\" (10 words, consistent)\n\nOption 2: A multi-line poem with a total of 10 words\nFor example:\n\"The spring breeze is blowing,\n The garden is full of flowers. \"\n(8 words, not applicable)\n\n\"The lonely mountain is far away,\n the flowing water is long,\n the place where people return. \"\n(7 characters, not applicable)\n\n\"The bright moonlight,\n misses my hometown,\n I haven't returned yet. \"\n (9 words, not applicable)\n\nI chose to provide a 10-word single-line poem, paying attention to creating a beautiful artistic conception and a sense of rhythm that is in line with traditional Chinese poetry:\n\n\"The bright moon hangs high in the night and the quiet person thinks alone\"\nThis poem depicts a moonlit night, a scene of a person missing alone in the quiet night, with a touch of sadness and homesickness.\n\nOr\n\n\ "When the mountains are green, the water is singing, the birds are singing, and the flowers are fragrant."\nThis poem depicts a beautiful spring scene that is natural, harmonious, and full of vitality. \n\nI will think of some other 10-word poems:\n\n\"Autumn breeze. The courtyard is desolate and full of fallen leaves \"(10 words)\n\"The distant mountains are like black grass and the drizzle moistens the moss\"(10 words)\n\"The morning light dews the mountains and rivers into my dream\"(10 words)\n\"The small bridge and the flowing water of the people "In the mist and rain" (10 words)\n\nConsidering the style and artistic conception of Tang poetry, I chose:\n\n\"The mountains are high, the water is long, the clouds are light, and the wind is gentle\"\nThis poem depicts a quiet and beautiful natural scene, expressing the praise of the harmonious beauty of nature.\n\nI will consider the rhythm and rhythm again:\n\n\"Green mountains and flowing water, clouds are light and the moon is like a hook\"\nThis poem has both a description of the natural scenery. Writing, there is some artistic conception, and the last word "hook" is used as a rhyme to give people a sense of completion. \n\nAccording to the tradition of Tang poetry, I can usually name this 10-word poem "Night Thoughts" or "Landscape". \n\nIn the end, I decided to provide a 10-word Tang poem:\n\n"Landscape"\nGreen Mountains and Flowing Waters, Clouds and Light Moon Like Hooks</think>\n# Cross Tang Poetry\n\n## "Mountain Dwelling Scenery"\nGreen Mountains and Green Waters, Birds Chirping and Flowers Fragrant\n\n---\n\nNote: This is a short poem describing the beautiful scenery in the mountains, condensing the beauty of nature into the cross. ",
"refusal": null
},
"logprobs": null,
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 45,
"completion_tokens": 881,
"total_tokens": 926
}
}
gpt-4.1
gpt-4.1-2025-04-14
gpt-4.1-mini
gpt-4.1-mini-2025-04-14
gpt-4.1-nano
gpt-4.1-nano-2025-04-14
grok-3-beta
grok-3-mini-beta
grok-3-mini-fast-beta
grok-3-fast-beta
ark-deepseek-v3-250324
, Byte platform’s ultra-high concurrency, exclusive 30% off (ARK R1 50% off)-legacy-support
suffix after gemini model namex-gemini-legacy-support
true
command-a-03-2025
gpt-4o-search-preview
gpt-4o-search-preview-2025-03-11
gpt-4o-mini-search-preview
gpt-4o-mini-search-preview-2025-03-11
web_search_options
gpt-4.5-preview
gpt-4.5-preview-2025-02-27
gemini-2.0-flash-lite
gemini-2.0-flash-lite-001
claude-3-7-sonnet-latest
claude-3-7-sonnet-20250219
"thinking": {
"type": "enabled",
"budget_tokens": 1024
},
{
"model": "claude-3-7-sonnet-20250219",
"messages": [
{
"role": "user",
"content": "A brief analysis of the reasons why OpenAI is not open source"
}
],
"thinking": {
"type": "enabled",
"budget_tokens": 1024
},
"stream": false,
"max_tokens": 2048
}
anthropic-beta
output-128k-2025-02-19
ark-deepseek-v3-241226
)ark-deepseek-v3-241226
)c4ai-aya-expanse-32b
c4ai-aya-expanse-8b
omni-moderation-latest
omni-moderation-2024-09-26
ark-deepseek-v3-241226
)ark-deepseek-r1-250120
)o3-mini-rev
, which is charged based on the number of times and 120 coins
will be deducted for a successful call.deepseek-chat
has been increased following the official announcement:TA/deepseek-ai/DeepSeek-R1
Limited time discount price:TA/deepseek-ai/DeepSeek-V3
Limited time discount price:/v1/completions
API, which may be used by some softwareTA/deepseek-ai/DeepSeek-R1-Distill-Llama-70B
model support0.2 => 1.05
gemini-2.0-flash
gemini-2.0-flash-001
gemini-2.0-flash-lite-preview-02-05
gemini-2.0-flash-thinking-exp-01-21
gemini-2.0-pro-exp-02-05
ark-deepseek-r1-250120
ark-deepseek-v3-241226
TA/deepseek-ai/DeepSeek-V3
TA/deepseek-ai/DeepSeek-R1-Distill-Llama-70B-free
max_tokens
compatibility support (the o1 series does not support max_tokens
, so if this value is defined, assign it to max_completion_tokens
and delete this value to ensure the compatibility of other software)deepseek-reasoner
Beta function to customize the "prefix" continuation of the last assistant msgo3-mini
series of API model inference services, and added support for the following modelso3-mini
o3-mini-2025-01-31
4om-aup
:gpt-4o-mini-audio-preview
gpt-4o-mini-audio-preview-2024-12-17
o1-mini
has also been reduced, and the price has been reduced simultaneously.TA/deepseek-ai/DeepSeek-R1
ohmygpt/deepseek-r1-32b
ohmygpt/deepseek-r1-1.5b
deepseek-reasoner
modeltemperature
top_p
presence_penalty
frequency_penalty
that is not supported by this model will be removed when this model is requested.gpt-4o-web-rev
), which can be used by all users for free.gpt-4o-web-rev
reverse model. Pricing is still based on pay-per-use. One dollar can call 4o approximately 566 times, which is close to free. This service is maintained by the internal development team and will be more stable, but it is not recommended for commercial or official Prod purposes.o1
o1-2024-12-17
gemini-1.5-pro
model name/v1/completions
{
"model": "deepseek-chat",
"prompt": "This is a JSON: '",
"suffix": "', done."
}
``
#25-0106 Update:
- Follow up and update 01AI model name support, add support for `yi-vision-v2`
- Removed support for the deprecated 01AI model, but retained model rewriting functionality
#25-0105 Update:
- Reconstructed the billing module of the GPT series model in the Chat.Completions API, which now supports cache billing. The cache billing details can be viewed on the Pricing page of the Next site[1]
- The DeepSeek series model URL has been changed from the default URL to a Beta URL. Now users of DeepSeek can use some of DeepSeek's Beta features[2]
- Completely remove support for some old deprecated models ( `gpt-3.5-turbo-0301` , `gpt-3.5-turbo-0613` , `gpt-3.5-turbo-16k-0613` , `gpt-4-0314` , `gpt-4-32k` , `gpt-4-32k-0314` , `gpt-4-32k-0613` ) (To ensure the compatibility of old versions of the program, these old version model names will be automatically rewritten to new model names of the same level)
[1] Next station Pricing page https://next.ohmygpt.com/pricing
[2] DeepSeek Beta https://api-docs.deepseek.com/guides/chat_prefix_completion
#25-0103 Update:
- Optimize the API response speed: fix the performance degradation of the internal Redis service caused by the Socket.IO Stream Adapter, which in turn causes the API response speed to slow down.
#25-0102 Update:
- The DeepSeek series model billing function was called incorrectly, causing users who used the model to receive abnormal deductions. This bug was quickly fixed after receiving feedback from users.
- In order to compensate all users affected by this bug, all users' deepseek-related model calling fees from 24-1230 to 25-0102 will be waived and returned to their balances. They are being counted and will be processed as soon as possible.
#24-1231 Update:
- The rate of DeepSeek series models is changed from 1.1 to 1, that is, no additional fees will be charged
- Reconstruct the DeepSeek series model billing module to support CacheHitTokens billing
- Added DeepSeek series model pricing display to the Pricing page of Next station
Finally, I wish you all a happy New Year in 2025!
#24-1229 Update:
Thanks to group friend @Bryce for sponsoring, the following bean bag models are now available for free for a limited time. You can play as you like:
- `doubao-pro-128k-240628`
- `doubao-pro-32k-241215`
- `doubao-pro-4k-240515`
- `doubao-pro-4k-character-240728`
- `doubao-lite-4k-240328`
- `doubao-lite-4k-character-240828`
ps: According to visual inspection, the strongest one should be `doubao-pro-32k-241215`. If you are interested, you can come and try it out.


#24-1228 Update:
- Added large model of Doubao series that supports ByteDance
- Added moonshot v1 series large models that support The Dark Side of the Moon (provided by ByteDance’s Volcano Platform)
- The multiplier is 1 (that is, it is the original price directly, with a 0 markup. You don’t need to take this into account to earn rice. It’s just convenient for everyone)
For pricing details, please see the Next station Pricing page: https://next.ohmygpt.com/pricing

#24-1221 Update:
- Google AI series models are on sale at 20% off
- Improve the stability of Google AI series automatic retry function
#24-1219 Update:
- Added support for the latest 24-1217 version of the Realtime series models. The price of the new version of 4o Realtime has been reduced a lot, and the price of the mini is also cheaper. With the cache function of high hit rate, the price has reached an acceptable level:
`gpt-4o-realtime-preview-2024-12-17`
`gpt-4o-mini-realtime-preview`
`gpt-4o-mini-realtime-preview-2024-12-17`
- Supports the cache accounting function of Realtime series models.
- Updated the real-time call demo of the Next station. You can now experience the new version of the Realtime series model in the related demo of the Next station: https://next.ohmygpt.com/chat Mirror: https://next.aigptx.top/chat

#24-1218 Update:
- Added support for the latest `gpt-4o-audio-preview-2024-12-17` model, and the audio input and output price is much lower than the original 4oA1001.
- Regarding the latest official version of `o1`, since OAI is not fully open, I looked at a bunch of Tier5 accounts and found no one with o1 official version API permissions, so I did not access it today and am waiting for the subsequent opening of OAI.
- Chat.Completions API newly supports the `reasoning_effort` optional parameter, which can adjust the "effort level" when thinking about the o1 series model. According to the official documentation [1], this parameter is only meaningful for the o1 series.
- The "Consumption Records" section has added support for cache fields and audio input and output fields. These data can now be queried in the usage statistics/consumption records of the Next station.
- The new 4o-realtime1217 will be accessed tomorrow. It will take some time to add support for some of these fields today.


[1] OpenAI official documentation https://platform.openai.com/docs/api-reference/chat/create
#24-1205 Update:
- Optimized the balance display function of the Next station. The estimated legal currency value will now be displayed. The corresponding legal currency value estimate has also been added to the APIKey management table.
- API endpoint display has been added to the APIKey management page of the Next site.
- Added a CF anti-generation building tutorial document to achieve simple and safe API original data recording and model name rewriting functions: https://ohmygpt-docs.apifox.cn/doc-5628298
#24-1127 Update:
- Newly launched file storage business `file-v1`
This business plan will serve as a safe, high-speed, cost-effective and controllable file service to provide basic services for any business related to file processing, such as providing multi-modal file temporary storage/document upload and download functions for AI services. You can also use the storage service of this site for purposes such as drawing beds.
All users can use 1GB of file storage space for free, and any excess will be billed according to this pricing standard:
Pricing:
`$9.9/TB/Month`
In addition, there are no additional fees, and the request and export traffic fees are free.
Documentation: https://ohmygpt-docs.apifox.cn/doc-5575792
#24-1126 Update:
- Added support for Claude Count Message tokens API `/v1/messages/count_tokens`
Documentation: https://ohmygpt-docs.apifox.cn/api-237994522
The new file system was originally planned to go online today, but due to a lot of pitfalls caused by CDN caching issues, it will be postponed for a day or two.
#24-1123 Update:
- The official built-in PDF input function has been supported in Anthropic's native Messages interface. To use this function, you need to add the `pdfs-2024-09-25` keyword in the `anthropic-beta` request header to enable this Beta function:

#24-1121 Update:
- Added support for the latest `gpt-4o-2024-11-20` model, the price has not changed from the previous version 24-0806

#24-1116 Update:
- Improve the color distinction of model usage statistics chart
- Next station has added a real-time chat room technology demonstration demo. Welcome to test the experience (it is to release it for public testing to see how it works in practice):
Demo(main site): https://next.ohmygpt.com/tools
Demo (mirror): https://next.aigptx.top/tools
Feature:
1. Based on Socket.IO, relatively reliable two-way low-latency real-time communication between client and server is achieved.
2. There is no relational database behind this demo to record messages, and it relies solely on the framework emit to send messages.
3. You can create a room by yourself to initiate a group chat
4. Temporarily unlimited use of the built-in AI chat function (model Command R Plus (By Cohere)
#24-1111 Update:
- APIKey permission control revision:
In order to avoid more users’ doubts about the inability of the old Key to directly call the new model, some improvements have been made to the APIKey related modules of the website today:
#24-1110 Update:
- Added support for `glm-4v-plus` model, see [1] for pricing details
[1] https://next.ohmygpt.com/pricing
#24-1109 Update:
- Added Gemini series JSON output mode support, please see the documentation for relevant implementation details[1]
- Added `gemini-1.5-flash-8b` official version model support, please see [2] for pricing details
[1] https://ohmygpt-docs.apifox.cn/api-123253693
[2] https://next.ohmygpt.com/pricing
#24-1108 Update:
- Improve the message splicing logic of Gemini series models in the OpenAI compatible interface
- Improve the compatibility of the Claude series model OpenAI interface, while improving the documentation and showing relevant implementation details[1]
[1] https://ohmygpt-docs.apifox.cn/api-123253693
#24-1107 Update:
- Improved Tools call compatibility support for Claude series models in the OpenAI compatible interface
For details, see: https://ohmygpt-docs.apifox.cn/api-123253693
#24-1106 Update:
- Bot now supports PDF file chat
https://t.me/getfunbot
#24-1105 Update:
- Supports the latest Claude 3.5 Sonnet, adding the following model names:
- `claude-3-5-haiku`
- `claude-3-5-haiku-latest`
- `claude-3-5-haiku-20241022`
- Added new version of GLM series model support
- `glm-4-plus`
- `glm-4-airx`
- `glm-4-air`
- `glm-4-long`
- `glm-4-flashx`
- `glm-4-flash`
For price details, check the pricing page: https://next.ohmygpt.com/pricing
#24-1103 Update:
- Solve the Google/GitHub OAuth authentication jump problem of Next station
- Complete the new version of account registration/password reset logic (you can now use verification code to register/reset account password)

Attached:
Next main website: https://next.ohmygpt.com
Next site mirror: https://next.aigptx.top
At this point, the login, registration, recharge, API management and other functions of the new version of the website are basically completed. Currently, users who only use the API can start using the Next site. The next development direction of the website is:
- Vector retrieval service and knowledge base question and answer service
- Complete the open file system based on object storage (used for file upload related purposes, and can also be used as a small network disk
- Complete the Chat function of the new version of the website (including file upload, plug-in system, code interpreter, flow chart, Agent, etc.
- Complete the image generation function of the new website
- Improve TgBot and use the capabilities of the Tg platform to implement more interesting and practical functions at low cost
- Access more API service providers and more models
- Started doing some business based on Japanese companies
#24-1101 Update:
- Completed Next station usage details query function, now you can query or export usage details on Next station `/settings/usage` page


Attached:
Next main site: https://next.ohmygpt.com/settings/usage
Next site mirror: https://next.aigptx.top/settings/usage
#24-1031 Update:
- Completed Next station usage statistics function, now you can query the usage statistics chart on Next station `/settings/usage` page

Attached:
Next main site: https://next.ohmygpt.com/settings/usage
Next site mirror: https://next.aigptx.top/settings/usage
#24-1026 Update:
- Calling Claude series and Gemini series models through OpenAI compatible interfaces will now return usage-related data blocks
- Improved the pricing display page: Added detailed pricing display for some models such as Claude and Gemini, which can be viewed on the `/pricing` page of the Next station
Attached:
Example usage data block:
```json
{
"id": "chatcmpl-hMOdfnoaXtcpCZwRJISUAzPtsfZKs",
"object": "chat.completion.chunk",
"created": 1729944958,
"model": "claude-3-5-sonnet",
"choices": [],
"usage": {
"prompt_tokens": 11,
"completion_tokens": 8,
"total_tokens": 19,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0
}
}
}
/pricing
page of Next stationclaude-3-5-sonnet-20241022
grok-beta
#Grok Original Pricing Notes
## grok-beta
PromptTokens:
$5.00
CompletionTokens:
$15.00
input:
$5/1M tokens
1tk = (5 * 250000)/1000000 = 1.25 coins
output:
$15/1M tokens
1tk = (15 * 250000)/1000000 = 3.75 coins
gpt-4o-audio-preview
series of models. However, due to technical reasons, only non-streaming calls of this model are currently supported. Users who need real-time voice interaction please use the /v1/realtime
real-time WebSocket endpoint of this site.# gpt-4o-audio-preview series original pricing accounting notes
gpt-4o-audio-preview
gpt-4o-audio-preview-2024-10-01
Pricing analysis:
Text
$2.50 / 1M tokens
$10.00 / 1M tokens
Audio***
$100.00 / 1M tokens
$200.00 / 1M tokens
====TEXT====
input:
$2.5 / 1M tokens
1tk = (2.5 * 250000)/1000000 = 0.625 coins
output:
$10/1M tokens
1tk = (10 * 250000)/1000000 = 2.5 coins
====TEXT====
====Audio====
input:
$100 / 1M tokens
1tk = (100 * 250000)/1000000 = 25 coins
output:
$200 / 1M tokens
1tk = (200 * 250000)/1000000 = 50 coins
====Audio====
yi-lightning
yi-large
yi-medium
yi-vision
yi-medium-200k
yi-spark
yi-large-rag
yi-large-fc
yi-large-turbo
# yi model original pricing calculation notes
## yi-lightning
Pricing:
¥0.99 /1M tokens
1tk = (0.99 * 34000)/1000000 = 0.03366 coins
## yi-large
Pricing:
¥20 /1M tokens
1tk = (20 * 34000)/1000000 = 0.68 coins
## yi-medium
Pricing:
¥2.5 /1M tokens
1tk = (2.5 * 34000)/1000000 = 0.085 coins
## yi-vision
Pricing:
¥6 /1M tokens
1tk = (6 * 34000)/1000000 = 0.204 coins
## yi-medium-200k
Pricing:
¥12 /1M tokens
1tk = (12 * 34000)/1000000 = 0.408 coins
## yi-spark
Pricing:
¥1 /1M tokens
1tk = (1 * 34000)/1000000 = 0.034 coins
## yi-large-rag
Pricing:
¥25 /1M tokens
1tk = (25 * 34000)/1000000 = 0.85 coins
## yi-large-fc
Pricing:
¥20 /1M tokens
1tk = (20 * 34000)/1000000 = 0.68 coins
## yi-large-turbo
Pricing:
¥12 /1M tokens
1tk = (12 * 34000)/1000000 = 0.408 coins
Note: The calculation here is based on ¥1=34000 coins
https://c-z0-api-01.hash070.com/v1
# 4o-realtime original pricing accounting notes
- gpt-4o-realtime-preview-2024-10-01
====TEXT====
input:
$5/1M tokens
1tk = (5 * 250000)/1000000 = 1.25 coins
output:
$20 / 1M tokens
1tk = (20 * 250000)/1000000 = 5 coins
====TEXT====
====Audio====
input:
$100 / 1M tokens
1tk = (100 * 250000)/1000000 = 25 coins
output:
$200 / 1M tokens
1tk = (200 * 250000)/1000000 = 50 coins
====Audio====
# o1 series original pricing calculation
o1-preview | o1-preview-2024-09-12
$15.00 / 1M input tokens
$60.00 / 1M output tokens
input:
$15/1M tokens
1tk = (15 * 250000)/1000000 = 3.75 coins
output:
$60 / 1M tokens
1tk = (60 * 250000)/1000000 = 15 coins
----
o1-mini | o1-mini-2024-09-12
$3.00 / 1M input tokens
$12.00 / 1M output tokens
input:
$3/1M tokens
1tk = (3 * 250000)/1000000 = 0.75 coins
output:
$12/1M tokens
1tk = (12 * 250000)/1000000 = 3 coins
====
OpenAI o1-preview
o1-preview is our new reasoning model for complex tasks that require broad general knowledge. The model has 128K context and an October 2023 knowledge cutoff.
Learn about o1-preview
Model
Pricing
o1-preview
$15.00 / 1M input tokens
$60.00 / 1M output tokens
o1-preview-2024-09-12
$15.00 / 1M input tokens
$60.00 / 1M output tokens
OpenAI o1-mini
o1-mini is a fast, cost-efficient reasoning model tailored to coding, math, and science use cases. The model has 128K context and an October 2023 knowledge cutoff.
Learn about o1-mini
Model
Pricing
o1-mini
$3.00 / 1M input tokens
$12.00 / 1M output tokens
o1-mini-2024-09-12
$3.00 / 1M input tokens
$12.00 / 1M output tokens
gemini-1.5-pro-exp-0827
and gemini-1.5-flash-exp-0827
(prices correspond to 1.5 pro and 1.5 flash respectively)gemini-1.5-pro-exp-0801
# gemini 1.5 pro original pricing calculation notes
----
Input price (per 1M tokens)
$3.50 for <= 128K tokens
$7.00 for > 128K tokens
----
input: for <= 128K tokens:
$3.50/1M tokens
1tk = (3.50 * 250000)/1000000 = 0.875 coins
input: for > 128K tokens:
$7.00 /1M tokens
1tk = (7.00 * 250000)/1000000 = 1.75 coins
----
Output price (per 1M tokens)
$10.50 for <= 128K tokens
$21.00 for > 128K tokens
----
output: for <= 128K tokens
$10.50 /1M tokens
1tk = (10.50 * 250000)/1000000 = 2.625 coins
output: for > 128K tokens
$21.00 /1M tokens
1tk = (21.00 * 250000)/1000000 = 5.25 coins
=====================================
#gemini 1.5 flash original pricing calculation notes
----
Input price (per 1M tokens)
$0.075 for <= 128K tokens
$0.15 for > 128K tokens
----
input: for <= 128K tokens:
$0.075 /1M tokens
1tk = (0.075 * 250000)/1000000 = 0.01875 coins
input: for > 128K tokens:
$0.15 /1M tokens
1tk = (0.15 * 250000)/1000000 = 0.0375 coins
----
Output price (per 1M tokens)
$0.30 for <= 128K tokens
$0.60 for > 128K tokens
----
output:
$0.30 /1M tokens
1tk = (0.30 * 250000)/1000000 = 0.075 coins
output:
$0.60 /1M tokens
1tk = (0.60 * 250000)/1000000 = 0.15 coins
# Opus input and read cache:
cache_creation_input_tokens:
$18.75 / 1M input tokens
1tk = (18.75 * 250000)/1000000 = 4.6875 coins
cache_read_input_tokens:
$1.50 / 1M input tokens
1tk = (1.50 * 250000)/1000000 = 0.375 coins
# Claude 3.0 Haiku input and read cache:
cache_creation_input_tokens:
$0.30 / 1M input tokens
1tk = (0.30 * 250000)/1000000 = 0.075 coins
cache_read_input_tokens:
$0.03 / 1M input tokens
1tk = (0.03 * 250000)/1000000 = 0.0075 coins
# Claude 3.5 Sonnet input and read cache:
cache_creation_input_tokens:
$3.75 / 1M input tokens
1tk = (3.75 * 250000)/1000000 = 0.9375 coins
cache_read_input_tokens:
$0.30 / 1M input tokens
1tk = (0.30 * 250000)/1000000 = 0.075 coins
chatgpt-4o-latest
gpt-4o-2024-08-06
gpt-3.5-turbo-0301 Deprecated Rewritten to gpt-3.5-turbo
gpt-3.5-turbo-0613 Deprecated Rewritten to gpt-3.5-turbo
gpt-3.5-turbo-16k Deprecated Rewritten to gpt-3.5-turbo
gpt-3.5-turbo-16k-0613 Deprecated Rewritten to gpt-3.5-turbo
gpt-4-0314 Deprecated Rewritten to gpt-4
gpt-4-vision-preview | gpt-4-1106-vision-preview Deprecated Rewritten to gpt-4o
claude-3-5-sonnet
claude-3-5-sonnet-20240620
gemini-1.5-pro-001
gemini-1.5-flash-001
/v1/models
API changes{
"object": "list",
"data": [
{
"id": "gpt-3.5-turbo",
"object": "model",
"created": 1717133244679,
"owned_by": "OhMyGPT"
}...
]
}
Gemini 1.5 original expense accounting notes
#Gemini 1.5 Pro
Price(input)
$3.50 / 1 million tokens (for prompts up to 128K tokens)
1tk = (3.5 * 250000) / 1000000 = 0.875 coins
$7.00 / 1 million tokens (for prompts longer than 128K)
1tk = (7 * 250000) / 1000000 = 1.75 coins
Price (output)
$10.50 / 1 million tokens (for prompts up to 128K tokens)
1tk = (10.5 * 250000) / 1000000 = 2.625 coins
$21.00 / 1 million tokens (for prompts longer than 128K)
1tk = (21 * 250000) / 1000000 = 5.25 coins
#Gemini 1.5 Flash
Price(input)
$0.35 = 1 million tokens (for prompts up to 128K tokens)
1tk = (0.35 * 250000) / 1000000 = 0.0875 coins
$0.70 = 1 million tokens (for prompts longer than 128K)
1tk = (0.7 * 250000) / 1000000 = 0.175 coins
Price (output)
$1.05 / 1 million tokens (for prompts up to 128K tokens)
1tk = (1.05 * 250000) / 1000000 = 0.2625 coins
$2.10 / 1 million tokens (for prompts longer than 128K)
1tk = (2.1 * 250000) / 1000000 = 0.525 coins
#0514 GPT4o Original Pricing Accounting
gpt-4o | gpt-4o-2024-05-13 $5.00 / 1M tokens $15.00 / 1M tokens
-gpt-4o
- gpt-4o-2024-05-13
input:
$5 = 1M tokens
1tk = (5 * 250000)/1000000 = 1.25 coins
output:
$15 = 1M tokens
1tk = (15 * 250000)/1000000 = 3.75 coins
"stream_options":{
"include_usage": true
}
data: {"id":"chatcmpl-9OcFBHIB9jKqJ9El1a1AkxkN4SH2G","object":"chat.completion.chunk","created":1715653909,"model":"gpt-4-turbo-2024-04- 09","system_fingerprint":"fp_0737e0dfd9","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"length"}],"usage":null}
data: {"id":"chatcmpl-9OcFBHIB9jKqJ9El1a1AkxkN4SH2G","object":"chat.completion.chunk","created":1715653909,"model":"gpt-4-turbo-2024-0 4-09","system_fingerprint":"fp_0737e0dfd9","choices":[],"usage":{"prompt_tokens":778,"completion_tokens":10,"total_tokens":788}}
data: [DONE]
TA/Qwen/Qwen1.5-0.5B-Chat
TA/Qwen/Qwen1.5-1.8B-Chat
TA/Qwen/Qwen1.5-4B-Chat
TA/Qwen/Qwen1.5-7B-Chat
TA/codellama/CodeLlama-7b-Instruct-hf
TA/meta-llama/Llama-2-7b-chat-hf
TA/meta-llama/Llama-3-8b-chat-hf
TA/mistralai/Mistral-7B-Instruct-v0.1
TA/mistralai/Mistral-7B-Instruct-v0.2
TA/google/gemma-7b-it
TA/Qwen/Qwen1.5-14B-Chat
TA/WizardLM/WizardLM-13B-V1.2
TA/meta-llama/Llama-2-13b-chat-hf
TA/Qwen/Qwen1.5-32B-Chat
TA/zero-one-ai/Yi-34B-Chat
TA/deepseek-ai/deepseek-coder-33b-instruct
TA/codellama/CodeLlama-34b-Instruct-hf
TA/Qwen/Qwen1.5-72B-Chat
TA/deepseek-ai/deepseek-llm-67b-chat
TA/codellama/CodeLlama-70b-Instruct-hf
TA/meta-llama/Llama-2-70b-chat-hf
TA/meta-llama/Llama-3-70b-chat-hf
TA/Qwen/Qwen1.5-110B-Chat
TA/mistralai/Mixtral-8x7B-Instruct-v0.1
TA/mistralai/Mixtral-8x22B-Instruct-v0.1
deepseek-coder
gpt-3.5-turbo-web-rev
is reverse engineered using Bindeng chatgpt. The delay is acceptable. Dozens of tokens are deducted symbolically for each successful request. It is close to free. It is based on the number of times instead of tokens. It is very practical when there are many tokens.deepseek-chat
command-r
: (supposedly) a model that benchmarks OAI’s gpt-3.5-turbocommand-r-plus
: (It is said) the target is Anthropic's C3 Sonnetcommand-r-online
: Based on CR, the official built-in web-search function is addedcommand-r-plus-online
: Based on CRP, additional official built-in web-search functiontools
parameter. Now Claude can also use ToolCall. It seems to have built-in thinking chain reasoning, which is a little more awesome than OpenAI's TC:{
"model": "claude-3-opus-20240229",
"max_tokens": 1024,
"messages": [
{
"role": "user",
"content": "Where did Dog Cook's stock fall today?"
}
],
"stream": false,
"tools": [
{
"name": "get_stock_price",
"description": "Get the current stock price for a given ticker symbol.",
"input_schema": {
"type": "object",
"properties": {
"ticker": {
"type": "string",
"description": "The stock ticker symbol, e.g. AAPL for Apple Inc."
}
},
"required": [
"ticker"
]
}
}
]
}
{
"id": "msg_01HVfeUwYukk6pWKzkGWsn8h",
"type": "message",
"role": "assistant",
"content": [
{
"type": "text",
"text": "<thinking>\nThe user is asking about the current stock price for \"狗库克\" which likely refers to Tim Cook, the CEO of Apple Inc. To get the current stock price, I would need to use the get_stock_price function and provide the ticker symbol for Apple, which is AAPL.\n\nThe user provided enough context to infer the parameter value needed:\nticker: Based on the reference to \"狗库克\" (a nickname for Tim Cook), the ticker symbol for Apple (AAPL) can be inferred.\n\nSince the required parameter can be inferred from the context, I can proceed with the function call.\n</thinking>"
},
{
"type": "tool_use",
"id": "toolu_01Tmx5NLhQZ7hEUpnYUGSPqV",
"name": "get_stock_price",
"input": {
"ticker": "AAPL"
}
}
],
"model": "claude-3-opus-20240229",
"stop_reason": "tool_use",
"stop_sequence": null,
"usage": {
"input_tokens": 507,
"output_tokens": 199
}
}
0.5
OpenAI.Chat.Completions
兼容接口中使用。claude-3-opus-patch
claude-3-sonnet-patch
claude-3-haiku-patch
claude-3-haiku
| claude-3-haiku-20240307
claude-3-opus
C3 (大杯)claude-3-sonnet
C3 (中杯)dimensions
parameter support for embedding v3 models/gs 什么是“芝士雪豹”?
text-davinci-003
text-davinci-002
text-ada-001
text-babbage-001
text-curie-001
code-davinci-edit-001
text-davinci-edit-001
text-search-ada-doc-001
VIP 600 RPM => 1k RPM
PREMIUM 0.5w RPM => 1w RPM
gpt-3.5-turbo
, gpt-3.5-turbo-16k
, gpt-4
and gpt-4-32k
, among which gpt-4-32k
provides reliable model inference services unified by Microsoft cloud Azure.credit card/Apple/Google/WeChat Pay
and other methods, and obtain official invoices and receipts at the same time.https://cn2us02.opapi.win/v1/
https://cfcus02.opapi.win/v1/
https://cfwus02.opapi.win/v1/