#25-0922 Update Notice:

Front-end and back-end support for the request body content store is now live. Setting body.store=true records the input and output of a request. Combined with parameter overrides, store can be forced on to record client traffic or to implement caching. (Recording currently supports only OpenAI's Chat.Completions API.)

Cache retrieval is also live: set body.omg_extra_body.use_omg_cache to true, and a request that hits a stored response is answered directly from the cache with no charge.
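
A minimal sketch of a request body that uses both switches. The field names `store` and `omg_extra_body.use_omg_cache` come from the notice above; the helper name `build_stored_request` and the sample model and message are illustrative assumptions only.

```python
import json


def build_stored_request(model, messages, store=True, use_cache=False):
    """Build a Chat.Completions request body (hypothetical helper).

    `store` and `omg_extra_body.use_omg_cache` are the fields named in the
    notice; everything else here is an illustrative assumption.
    """
    body = {"model": model, "messages": messages, "store": store}
    if use_cache:
        # Ask the service to answer from a stored response when one exists.
        body["omg_extra_body"] = {"use_omg_cache": True}
    return body


body = build_stored_request(
    "gpt-5",
    [{"role": "user", "content": "Hello"}],
    use_cache=True,
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the JSON body of a normal /v1/chat/completions request.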

#25-0916 Update Notice:

Added CodeX support: the CodeX dedicated interface now supports gpt-5-codex, the latest model released by OpenAI. Its list price is tentatively the same as gpt-5, and it is still billed at 5% of the list price.

Added a one-click setup script for coding agents: run npx omgvibe on the command line to configure CodeX or Claude Code in one step. See https://www.ohmygpt.com/cli-agent. The script is open source; repository: https://github.com/DogeNet-LLC/omgvibe

Fixed a billing error in the CodeX dedicated interface: requests were undercharged (5% of the original price should have been deducted, but only about 1~2% was actually deducted). Affected period: from the launch of this site's CodeX service yesterday until 5 pm Tokyo time on 2025-09-16.

#25-0915 Update Notice:

CodeX support (billed at 5% of the original price)

Easy to use, and the results are no worse than Claude Code + Opus 4.1. Highly recommended.

Configuration document
https://ohmygpt-docs.apifox.cn/doc-7412544

Added support for the nano-banana model

vertex-gemini-2.5-flash-image-preview (nickname: Nano Banana) is now supported. This is the latest model released by Google with image generation support. It performs well on image generation tasks and produces high-quality images. It can be called via /v1/chat/completions:

{
  "model": "vertex-gemini-2.5-flash-image-preview",
  "messages": [
    { "role": "user", "content": [{ "type": "text", "text": "Generate a picture of Hakimi" }] }
  ]
}

By default this model returns images as Markdown with inline Base64, which has poor client compatibility and makes the response text very long. To improve compatibility and bandwidth efficiency, turn on the built-in "Convert and Upload" function:

{
  "model": "vertex-gemini-2.5-flash-image-preview",
  "messages": [
    { "role": "user", "content": [{ "type": "text", "text": "Generate a picture of Hakimi" }] }
  ],
  "omg_extra_body": { "gemini_b64img_upload_enabled": true }
}
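
When the "Convert and Upload" function is not enabled, a client has to pull the image out of the Markdown text itself. The sketch below shows one way to do that. It assumes the image arrives as a Markdown image whose target is a `data:image/...;base64,...` URI; the notice does not spell out the exact layout, so treat the pattern as an assumption. The helper name `extract_inline_images` is hypothetical.

```python
import base64
import re

# Assumed shape: ![alt](data:image/<type>;base64,<payload>)
DATA_URI_IMAGE = re.compile(
    r"!\[[^\]]*\]\(data:image/([a-zA-Z0-9.+-]+);base64,([A-Za-z0-9+/=]+)\)"
)


def extract_inline_images(markdown_text):
    """Return a list of (image_type, raw_bytes) found in the Markdown text."""
    images = []
    for match in DATA_URI_IMAGE.finditer(markdown_text):
        image_type, payload = match.groups()
        images.append((image_type, base64.b64decode(payload)))
    return images


# Placeholder bytes stand in for real image data.
sample_payload = base64.b64encode(b"not-a-real-image").decode("ascii")
sample = f"Here you go:\n\n![image](data:image/png;base64,{sample_payload})"
images = extract_inline_images(sample)
```

Each tuple's bytes can then be written to a file with the matching extension.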

Claude Code pricing adjustment

The Claude Code forwarding service has been adjusted from 10% of the official model price to 5% of the official model price.

Parameter override function (Public Beta)

Added an experimental "parameter override" capability, which can overwrite or delete selected API request body fields based on conditions for more fine-grained customization.

Entrance: API console → Parameter override (/apis/overwrite-rules)

Documentation: Specific documents will be released later.

Note: This feature is in beta, please use it with caution. Feedback is welcome if you encounter problems; if you don't know what you are doing, please do not use this feature.

Others

Some front-end Chat functions were recently refactored, and file upload and image upload may not work for now. Please bear with us while the refactoring is completed; a new announcement will follow.

If you need help, please contact us via email help@ohmygpt.com or "Feedback" in the upper right corner of the page.

Added WeChat Pay CNY top-up channel

A new CNY top-up channel that supports WeChat Pay has been added, and invoices can be issued for related orders.

#25-0808 Update:

Added support for the latest gpt-5 series models, open to all users:

gpt-5

gpt-5-2025-08-07

gpt-5-mini

gpt-5-mini-2025-08-07

gpt-5-nano

gpt-5-nano-2025-08-07

gpt-5-chat-latest

A limited-time free Claude service has been added to the website Chat page. Welcome to use, share and give feedback.

#25-0626 Update:

Added support for Doubao Seed 1.6 series models

ark-doubao-seed-1.6-250615

ark-doubao-seed-1.6-thinking-250615

ark-doubao-seed-1.6-flash-250615

Pricing:

#25-0618 Update:

Added support for the new version of Gemini 2.5 series models

gemini-2.5-pro

gemini-2.5-flash

gemini-2.5-flash-lite-preview-06-17

#25-0611 Update:

Lowered the price of the o3 models, following the official price cut

#25-0608 Update:

Added support for Gemini 2.5 Pro 0605 preview version

gemini-2.5-pro-preview-06-05

#25-0530 Update:

Added support for ARK DeepSeek R1 25 0528

ark-deepseek-r1-250528

(ARK DS R1/V3 series long-term 50% off)

#25-0523 Update:

Added support for the Anthropic Claude 4 series:

claude-sonnet-4-0

claude-sonnet-4-20250514

claude-opus-4-0

claude-opus-4-20250514

Note:
These models can directly enable the thinking function by adding the -thinking-enabled suffix.

Added support for several ARK Doubao models:

doubao-1.5-thinking-pro-m-250428

doubao-1.5-thinking-vision-pro-250428

doubao-1.5-vision-pro-250328

doubao-1.5-vision-lite-250315

#25-0521 Update:

Added support for Gemini 2.5 Flash Preview 0520:

gemini-2.5-flash-preview-05-20

Thinking can be disabled by adding a suffix:

gemini-2.5-flash-preview-05-20-thinking-disabled

You can also disable thinking by passing the parameter reasoning_effort = none

Reference documentation: https://ai.google.dev/gemini-api/docs/openai
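
The two ways of disabling thinking described above can be sketched as a small request-body helper. The `-thinking-disabled` suffix and the `reasoning_effort` value `none` come from this notice; the helper name `disable_thinking` is hypothetical.

```python
def disable_thinking(body, use_suffix=True):
    """Return a copy of the request body with thinking disabled.

    use_suffix=True appends the -thinking-disabled suffix to the model name;
    use_suffix=False sets reasoning_effort to "none" instead.
    """
    updated = dict(body)
    if use_suffix:
        if not updated["model"].endswith("-thinking-disabled"):
            updated["model"] += "-thinking-disabled"
    else:
        updated["reasoning_effort"] = "none"
    return updated


request = {
    "model": "gemini-2.5-flash-preview-05-20",
    "messages": [{"role": "user", "content": "Translate 'hello' into French"}],
}
by_suffix = disable_thinking(request)
by_param = disable_thinking(request, use_suffix=False)
```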

#25-0514 Update:

Limited-time promotion for the Stripe JP Japanese Yen payment channel: after binding a card, non-interactive payments earn an extra 5% of tokens, and interactive payments earn an extra 2% of tokens.

A preview version of an interesting Socket.IO endpoint has been launched: https://ohmygpt-docs.apifox.cn/doc-6666179

#25-0511 Update:

New Stripe JP Japanese Yen payment channel

**New features based on the Stripe JP payment channel:**

Automatic deduction after binding a card

Fast credit card top-up and other functions

Next site front-end refinements:

Added a real-time latency display in the lower right corner of the homepage (a small easter egg: take a look at what your latency is)

More features coming soon

Addendum (earlier update):

Added support for the latest Gemini 2.5 Pro 0506 version

gemini-2.5-pro-preview-05-06

#25-0424 Update:

Added support for the latest GPT Image 1 model released by OpenAI:

gpt-image-1

Note:
This model is not cheap: one generation usually costs about 1~2 CNY. In testing, the webmaster did not find its image generation especially strong ~~(at least it is very average at anime-style images)~~, so call it with caution.

#25-0419 Update:

Added separate billing for thinking output and normal output on Gemini 2.5 Flash Preview 04-17

Added a thinking-disabled variant of Gemini 2.5 Flash Preview 04-17: request gemini-2.5-flash-preview-04-17-thinking-disabled to turn thinking off and get the output directly, which suits fast translation scenarios

#25-0418 Update:

Added support for Gemini 2.5 Flash Preview 04-17

gemini-2.5-flash-preview-04-17

Note: The Gemini 2.5 Flash model prices thinking output differently from ordinary output. Because the compatible interface of the official Google AI API does not return thinking token usage, all output is temporarily billed at the thinking output price. Separate billing is expected tomorrow; until then, using this model is not recommended.

o-series compatibility optimization: for o-series models that no longer support the max_tokens parameter, requests are automatically rewritten to use the max_completion_tokens parameter instead

Automatic retry optimization: the automatic retry mechanism has been improved to avoid, as far as possible, the occasional 400, 404, and similar errors that users recently reported when accessing models.

#25-0417 Update:

Added support for the latest o3 and o4-mini series released by OpenAI:

o3

o3-2025-04-16

o4-mini

o4-mini-2025-04-16

Added support for ByteDance ARK's Doubao 1.5 Thinking Pro series (50% off for a limited time):

doubao-1.5-thinking-pro-250415

doubao-1.5-thinking-pro-vision-250415

#25-0416 Update:

Added thinking support for the Claude 3.7 Sonnet model in the Claude OpenAI-compatible interface:

In the OpenAI Chat.Completions API, append -thinking-enabled to the model name claude-3-7-sonnet-20250219 (that is, claude-3-7-sonnet-20250219-thinking-enabled) to enable thinking

Example:

{
    "model": "claude-3-7-sonnet-20250219-thinking-enabled",
    "messages": [
        {
            "role": "user",
            "content": "Let me write a 10-character Tang poem"
        }
    ],
    "stream": false
}

response:

{
    "id": "chatcmpl-f5YCgBV1YWsGe8T6XtHT5DB9CgUak",
    "object": "chat.completion",
    "created": 1744813755,
    "model": "claude-3-7-sonnet-20250219",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "<think>I need to create a 10-character Tang poem. Tang poems usually have different formats, such as five-character, seven-character, etc., but here the requirement is a 10-character poem. A 10-character poem is not a common format for traditional Tang poetry, but I can create a poem containing 10 Chinese characters, paying attention to the tone and artistic conception.\n\nI can provide a single-line 10-character poem, or a multi-line poem with a total of 10 characters. A short poem of 10 words.\n\nPlan 1: A line of 10 words. For example:\n\"The spring breeze blows and the fragrance of flowers fills the mountains\" (11 words, does not match)\n\"The bright moon hangs high and people are thinking alone in the quiet night\" (10 words, consistent)\n\"When the mountains are green and the water is singing and the birds are singing and the flowers are fragrant\" (10 words, consistent)\n\nOption 2: A multi-line poem with a total of 10 words\nFor example:\n\"The spring breeze is blowing,\nThe garden is full of flowers.\"\n(8 words, not applicable)\n\n\"The lonely mountain is far away,\nthe flowing water is long,\nthe place where people return.\"\n(7 characters, not applicable)\n\n\"The bright moonlight,\nmisses my hometown,\nI haven't returned yet.\"\n(9 words, not applicable)\n\nI chose to provide a 10-word single-line poem, paying attention to creating a beautiful artistic conception and a sense of rhythm that is in line with traditional Chinese poetry:\n\n\"The bright moon hangs high in the night and the quiet person thinks alone\"\nThis poem depicts a moonlit night, a scene of a person missing alone in the quiet night, with a touch of sadness and homesickness.\n\nOr\n\n\"When the mountains are green, the water is singing, the birds are singing, and the flowers are fragrant.\"\nThis poem depicts a beautiful spring scene that is natural, harmonious, and full of vitality.\n\nI will think of some other 10-word poems:\n\n\"Autumn breeze. The courtyard is desolate and full of fallen leaves\" (10 words)\n\"The distant mountains are like black grass and the drizzle moistens the moss\" (10 words)\n\"The morning light dews the mountains and rivers into my dream\" (10 words)\n\"The small bridge and the flowing water of the people in the mist and rain\" (10 words)\n\nConsidering the style and artistic conception of Tang poetry, I chose:\n\n\"The mountains are high, the water is long, the clouds are light, and the wind is gentle\"\nThis poem depicts a quiet and beautiful natural scene, expressing the praise of the harmonious beauty of nature.\n\nI will consider the rhythm and rhyme again:\n\n\"Green mountains and flowing water, clouds are light and the moon is like a hook\"\nThis poem has both a description of the natural scenery and some artistic conception, and the last word \"hook\" is used as a rhyme to give people a sense of completion.\n\nAccording to the tradition of Tang poetry, I can name this 10-word poem \"Night Thoughts\" or \"Landscape\".\n\nIn the end, I decided to provide a 10-word Tang poem:\n\n\"Landscape\"\nGreen Mountains and Flowing Waters, Clouds and Light Moon Like Hooks</think>\n# Ten-Character Tang Poem\n\n## \"Mountain Dwelling Scenery\"\nGreen Mountains and Green Waters, Birds Chirping and Flowers Fragrant\n\n---\n\nNote: This is a short poem describing the beautiful scenery in the mountains, condensing the beauty of nature into ten characters.",
                "refusal": null
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 45,
        "completion_tokens": 881,
        "total_tokens": 926
    }
}
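
As the sample response shows, the thinking text is returned inside the message content, wrapped in `<think>...</think>` ahead of the final answer. A client that wants to show only the answer could split the two roughly as follows. This is a sketch that assumes a single leading `<think>` block, as in the sample; the helper name `split_thinking` is hypothetical.

```python
import re

# A single leading <think>...</think> block, followed by optional whitespace.
THINK_BLOCK = re.compile(r"^<think>(.*?)</think>\s*", re.DOTALL)


def split_thinking(content):
    """Split assistant content into (thinking, answer).

    Returns (None, content) when there is no leading <think> block.
    """
    match = THINK_BLOCK.match(content)
    if match is None:
        return None, content
    return match.group(1), content[match.end():]


thinking, answer = split_thinking("<think>draft ideas</think>\n# Final poem")
```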

#25-0415 Update:

Added support for the latest GPT-4.1 series models:

gpt-4.1

gpt-4.1-2025-04-14

gpt-4.1-mini

gpt-4.1-mini-2025-04-14

gpt-4.1-nano

gpt-4.1-nano-2025-04-14

Added support for the latest GROK-3 series models:

grok-3-beta

grok-3-mini-beta

grok-3-mini-fast-beta

grok-3-fast-beta

Pricing data: https://next.ohmygpt.com/pricing

#25-0326 Update:

Added support for ByteDance's latest DS V3 250324 (ark-deepseek-v3-250324), served on the ByteDance platform with very high concurrency, at an exclusive 30% off (ARK R1 is 50% off)

The old Gemini compatibility solution is back:

If Google's official OAI compatibility solution gives you problems such as empty replies, you can temporarily fall back to the old Gemini compatibility solution in either of the following ways:

- Add the -legacy-support suffix to the Gemini model name
- Set an HTTP header:
  Key: x-gemini-legacy-support
  Value: true
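
Either switch can be applied on the client before the request is sent. The sketch below uses the suffix and header name given in this notice; the helper name `use_legacy_gemini` and the sample model name are illustrative assumptions.

```python
def use_legacy_gemini(model, headers=None, via_header=False):
    """Return (model, headers) with the legacy Gemini compatibility enabled.

    via_header=False appends the -legacy-support suffix to the model name;
    via_header=True leaves the model name alone and sets the HTTP header.
    """
    headers = dict(headers or {})
    if via_header:
        headers["x-gemini-legacy-support"] = "true"
    elif not model.endswith("-legacy-support"):
        model = model + "-legacy-support"
    return model, headers


suffix_model, suffix_headers = use_legacy_gemini("gemini-2.0-flash")
header_model, header_headers = use_legacy_gemini("gemini-2.0-flash", via_header=True)
```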

#25-0323 Update:

Gemini series models now use Google's official OAI-compatible solution, which fully supports function calling, image recognition, speech recognition, and other features

Added support for Cohere's latest Command A model: command-a-03-2025

The new Next site has launched and is under rapid development. Once the Chat function is complete, it will fully replace the main site. You are welcome to try it and send feedback: https://next.ohmygpt.com

#25-0313 Update:

Added support for the following models:

gpt-4o-search-preview

gpt-4o-search-preview-2025-03-11

gpt-4o-mini-search-preview

gpt-4o-mini-search-preview-2025-03-11

Added support for related parameters:

web_search_options

Related documents:

https://platform.openai.com/docs/api-reference/chat

https://platform.openai.com/docs/guides/tools-web-search

#25-0228 Update:

Added support for the latest GPT-4.5 preview version released by OpenAI. The new models are as follows:

gpt-4.5-preview

gpt-4.5-preview-2025-02-27

#25-0226 Update:

Added support for the latest official version of Gemini 2.0 Flash-Lite released by Google AI. The new models are as follows:

gemini-2.0-flash-lite

gemini-2.0-flash-lite-001

Pricing table:

#25-0225 Update:

Added support for the latest Claude 3.7 Sonnet released by Anthropic. The new models are as follows:

claude-3-7-sonnet-latest

claude-3-7-sonnet-20250219

The price is the same as Claude 3.5 Sonnet

The new thinking (reasoning) feature is currently supported only in the native Messages API. Add the following parameters to enable it:

    "thinking": {
        "type": "enabled",
        "budget_tokens": 1024
    },

Example:

{
    "model": "claude-3-7-sonnet-20250219",
    "messages": [
        {
            "role": "user",
            "content": "A brief analysis of the reasons why OpenAI is not open source"
        }
    ],
    "thinking": {
        "type": "enabled",
        "budget_tokens": 1024
    },
    "stream": false,
    "max_tokens": 2048
}

The extended output beta is now supported:

Add this header when calling Claude 3.7 through the native Messages API to raise the output limit to 128k; otherwise the default maximum is 64k

Header: anthropic-beta
Value: output-128k-2025-02-19
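
Putting the thinking parameters and the extended output header together, a Messages API request could be assembled as in the sketch below. The parameter and header values are the ones shown in this entry; the helper name `build_c37_request` is hypothetical. The check that `budget_tokens` stays below `max_tokens` reflects Anthropic's documented constraint for extended thinking, not anything specific to this site.

```python
def build_c37_request(prompt, budget_tokens=1024, max_tokens=2048,
                      extended_output=False):
    """Return (headers, body) for a native Messages API call with thinking."""
    if budget_tokens >= max_tokens:
        # The thinking budget has to fit inside the overall output limit.
        raise ValueError("budget_tokens must be smaller than max_tokens")
    headers = {"content-type": "application/json"}
    if extended_output:
        # Raises the output limit from the default 64k to 128k.
        headers["anthropic-beta"] = "output-128k-2025-02-19"
    body = {
        "model": "claude-3-7-sonnet-20250219",
        "messages": [{"role": "user", "content": prompt}],
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "stream": False,
        "max_tokens": max_tokens,
    }
    return headers, body


headers, body = build_c37_request("Summarize this notice", extended_output=True)
```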

#25-0224 Update:

ByteDance ARK's official DS discount has ended completely, but users of this site continue to enjoy special discounted prices. You are welcome to use or integrate them:

ARK DeepSeek R1 remains at 50% off ( ark-deepseek-r1-250120 )

ARK DeepSeek V3 is at an exclusive 30% off for users of this site ( ark-deepseek-v3-241226 )

The DeepSeek R1/V3 built into the Chat page of the old website is now provided by ARK (previously provided by TA/Azure)

Added Cohere Aya models (no official pricing was found, so they are free for now):

c4ai-aya-expanse-32b

c4ai-aya-expanse-8b

#25-0221 Update:

Added OpenAI Omni Moderation model support:

omni-moderation-latest

omni-moderation-2024-09-26

Users of this site now get an exclusive 20% discount on ARK DeepSeek V3 ( ark-deepseek-v3-241226 )

ARK DeepSeek R1 is back to 50% off and will change to 30% off on the 23rd; this is also an exclusive discount for users of this site ( ark-deepseek-r1-250120 )

50% of ARK DS R1 charges incurred from the 18th to the 21st will be refunded

#25-0220 Update:

The limited-time discount on TA models has ended, and the multiplier is restored from 0.1 to 1.1

#25-0218 Update:

ByteDance ARK's DeepSeek R1 and V3 APIs have returned to their original prices

- ark-deepseek-r1-250120
  Original discount price: input 2 CNY / million tokens | output 8 CNY / million tokens
  Current price: input 4 CNY / million tokens | output 16 CNY / million tokens

- ark-deepseek-v3-241226
  Original discount price: input 1 CNY / million tokens | output 4 CNY / million tokens
  Current price: input 2 CNY / million tokens | output 8 CNY / million tokens
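
To make the per-million-token prices concrete, here is a small worked calculation using the current prices listed above. It is only a sketch: it ignores any site multiplier or discount, and the helper name `estimate_cost_cny` is hypothetical.

```python
from decimal import Decimal

# CNY per million tokens, copied from the notice above: (input, output).
CURRENT_PRICES = {
    "ark-deepseek-r1-250120": (Decimal("4"), Decimal("16")),
    "ark-deepseek-v3-241226": (Decimal("2"), Decimal("8")),
}

MILLION = Decimal(1_000_000)


def estimate_cost_cny(model, prompt_tokens, completion_tokens):
    """Estimate the list-price cost of one call in CNY."""
    input_price, output_price = CURRENT_PRICES[model]
    total = prompt_tokens * input_price + completion_tokens * output_price
    return total / MILLION


# 1000 input tokens and 500 output tokens on R1:
# (1000 * 4 + 500 * 16) / 1,000,000 = 0.012 CNY
cost = estimate_cost_cny("ark-deepseek-r1-250120", 1000, 500)
```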

#25-0212 Update:

Added support for o3-mini-rev, which is billed per call: 120 coins are deducted for each successful call.

#25-0209 Update:

The pricing of deepseek-chat has been increased following the official announcement:

Original price: input 1 CNY, cache hit 0.1 CNY, output 2 CNY

Current price: input 2 CNY, cache hit 0.5 CNY, output 8 CNY

TA/deepseek-ai/DeepSeek-R1 limited-time discount price:

Same price for input and output: 5.15 CNY / million tokens

TA/deepseek-ai/DeepSeek-V3 limited-time discount price:

Same price for input and output: 0.92 CNY / million tokens

#25-0207 Update:

Added support for TA series models in the Completions API: models such as TA's DS R1 can now be used through the /v1/completions API, which some software relies on

Removed some TA models that have been delisted, such as Qwen1.5

Added TA/deepseek-ai/DeepSeek-R1-Distill-Llama-70B model support

#25-0206 Update:

Google AI multiplier change: 0.2 => 1.05

Added support for Gemini 2.0 Flash

gemini-2.0-flash

gemini-2.0-flash-001

Added support for Gemini's various free exp models (quotas for exp and preview models are extremely low, so they are for personal use only and are not recommended for production)

gemini-2.0-flash-lite-preview-02-05

gemini-2.0-flash-thinking-exp-01-21

gemini-2.0-pro-exp-02-05

#25-0205 Update:

Added support for the DS R1 and V3 model inference services provided by ByteDance ARK:

ark-deepseek-r1-250120

ark-deepseek-v3-241226

#25-0204 Update:

Added support for TA/deepseek-ai/DeepSeek-V3

Added support for free TA/deepseek-ai/DeepSeek-R1-Distill-Llama-70B-free

Added max_tokens compatibility for the o1 series (the o1 series does not support max_tokens, so if it is set, its value is assigned to max_completion_tokens and max_tokens is removed, keeping other software compatible)

Added compatibility support for the official deepseek-reasoner Beta feature that continues from a custom "prefix" in the last assistant message
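
The max_tokens compatibility rewrite for the o1 series described above can be sketched as follows. How the site recognises o1-series models and how it treats a request that already carries max_completion_tokens are not stated in the notice, so both choices below are assumptions; the helper name `rewrite_max_tokens` is hypothetical.

```python
def rewrite_max_tokens(body):
    """Move max_tokens to max_completion_tokens for o1-series requests."""
    updated = dict(body)
    # Assumption: o1-series models are recognised by their name prefix.
    if not str(updated.get("model", "")).startswith("o1"):
        return updated
    if "max_tokens" in updated:
        value = updated.pop("max_tokens")
        # Assumption: an explicit max_completion_tokens from the caller wins.
        updated.setdefault("max_completion_tokens", value)
    return updated


rewritten = rewrite_max_tokens({"model": "o1", "messages": [], "max_tokens": 256})
untouched = rewrite_max_tokens({"model": "gpt-4o", "messages": [], "max_tokens": 256})
```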

#25-0201 Update:

Added support for OpenAI's latest closed-source o3-mini series of API model inference services, and added support for the following models

o3-mini

o3-mini-2025-01-31

Added support for gpt-4o-mini audio preview:

gpt-4o-mini-audio-preview

gpt-4o-mini-audio-preview-2024-12-17

Compared with o1 and o1-mini, the price is drastically lower

Additional update, night of 25-0201:

As users pointed out, the official price of o1-mini has also been reduced, so the price here has been reduced to match.

Acknowledgments:
DeepSeek deep search

#25-0130 Update:

Added the open-source DeepSeek R1 model (full 671B version) hosted by TA, which can serve as an alternative while the official DS service has stability issues:

TA/deepseek-ai/DeepSeek-R1

Current input and output fees: $3.5/million Tokens

Acknowledgments:
DeepSeek deep search

#25-0129 Update:

Open-source DeepSeek R1 models hosted on this site are now available for free and can serve as an alternative while the official DS service has stability issues:

ohmygpt/deepseek-r1-32b

ohmygpt/deepseek-r1-1.5b

Acknowledgments:
DeepSeek deep search

In addition, today is the Chinese New Year. I wish everyone good health and all the best.

#25-0120 Update:

Added support for deepseek-reasoner model

Note: For compatibility, the parameters temperature, top_p, presence_penalty, and frequency_penalty, which this model does not support, are removed from requests to this model.
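
The parameter stripping described in the note can be pictured with a few lines of Python. This is a sketch of the behaviour as described, not the site's actual code; the helper name `strip_unsupported_params` is hypothetical.

```python
# Parameters the notice lists as unsupported by deepseek-reasoner.
UNSUPPORTED_BY_REASONER = (
    "temperature",
    "top_p",
    "presence_penalty",
    "frequency_penalty",
)


def strip_unsupported_params(body):
    """Return a copy of the request body without the unsupported parameters."""
    if body.get("model") != "deepseek-reasoner":
        return dict(body)
    return {k: v for k, v in body.items() if k not in UNSUPPORTED_BY_REASONER}


cleaned = strip_unsupported_params({
    "model": "deepseek-reasoner",
    "messages": [{"role": "user", "content": "Hi"}],
    "temperature": 0.7,
    "top_p": 0.9,
})
```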

#25-0118 Update:

The old chat page now supports ChatGPT 4o Free (gpt-4o-web-rev) by default, and all users can use it for free.

The API now supports the reverse-engineered gpt-4o-web-rev model. Pricing is still pay-per-use: one dollar buys roughly 566 calls to 4o, which is close to free. This service is maintained by the internal development team and should be more stable, but it is not recommended for commercial or production use.

#25-0114 Update:

Added stream compatibility for the official version of o1. Because o1 in the official API does not support streaming requests, many programs cannot use it. To widen compatibility, when o1 is requested in stream mode, this site calls the official o1 API synchronously, then converts the result into stream format and returns it.
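
The conversion step can be sketched as replaying a finished `chat.completion` object as server-sent `chat.completion.chunk` events. The chunk layout follows the public OpenAI streaming format; sending the whole message in a single delta is an assumption made for brevity, and the function name `completion_to_sse` is hypothetical.

```python
import json


def completion_to_sse(completion):
    """Yield SSE events that replay a finished chat.completion as a stream."""
    base = {
        "id": completion["id"],
        "object": "chat.completion.chunk",
        "created": completion["created"],
        "model": completion["model"],
    }
    for choice in completion["choices"]:
        message = choice["message"]
        # One delta carries the role and the whole content.
        delta_chunk = dict(base, choices=[{
            "index": choice["index"],
            "delta": {"role": message["role"], "content": message["content"]},
            "finish_reason": None,
        }])
        yield "data: " + json.dumps(delta_chunk) + "\n\n"
        # A closing chunk carries only the finish reason.
        stop_chunk = dict(base, choices=[{
            "index": choice["index"],
            "delta": {},
            "finish_reason": choice["finish_reason"],
        }])
        yield "data: " + json.dumps(stop_chunk) + "\n\n"
    # Terminator used by the OpenAI streaming format.
    yield "data: [DONE]\n\n"


sample = {
    "id": "chatcmpl-example",
    "created": 0,
    "model": "o1",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "hi"},
        "finish_reason": "stop",
    }],
}
events = list(completion_to_sse(sample))
```

A client that only understands streaming responses can consume these events as if the model had streamed them, although the full latency of the synchronous call is still paid up front.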

#25-0112 Update:

Added support for the o1 official version, and the new supported model names are as follows:

o1

o1-2024-12-17

The pricing table can be viewed on the Next site Pricing page: https://next.ohmygpt.com/pricing

#25-0108 Update:

Added support for calling DeepSeek series models through the Completions API, including DS's FIM completion

DeepSeek series cached tokens are now recorded in bill details

Added the gemini-1.5-pro model name

Attached:

FIM example:
POST: /v1/completions

{
    "model": "deepseek-chat",
    "prompt": "This is a JSON: '",
    "suffix": "', done."
}

#25-0106 Update:

- Updated 01AI model name support and added support for `yi-vision-v2`
- Removed support for deprecated 01AI models, but kept model name rewriting for them

#25-0105 Update:

- Refactored the billing module for GPT series models in the Chat.Completions API, which now supports cache billing. Cache billing details can be viewed on the Pricing page of the Next site[1]
- The DeepSeek series model URL has been changed from the default URL to the Beta URL, so DeepSeek users can now use some of DeepSeek's Beta features[2]
- Completely removed support for some old deprecated models ( `gpt-3.5-turbo-0301` , `gpt-3.5-turbo-0613` , `gpt-3.5-turbo-16k-0613` , `gpt-4-0314` , `gpt-4-32k` , `gpt-4-32k-0314` , `gpt-4-32k-0613` ). To keep old programs working, these old model names are automatically rewritten to new model names of the same tier.

[1] Next site Pricing page https://next.ohmygpt.com/pricing
[2] DeepSeek Beta https://api-docs.deepseek.com/guides/chat_prefix_completion

#25-0103 Update:

- Optimized API response speed: fixed a performance degradation in the internal Redis service caused by the Socket.IO Stream Adapter, which had been slowing down API responses.

#25-0102 Update:

- The DeepSeek series model billing function was called incorrectly, so users of these models were charged abnormal amounts. The bug was fixed quickly after users reported it.
- To compensate all affected users, all DeepSeek-related model call fees from 24-1230 to 25-0102 will be waived and returned to user balances. The amounts are being tallied and will be processed as soon as possible.

#24-1231 Update:

- The multiplier for DeepSeek series models is changed from 1.1 to 1, that is, no additional fee is charged
- Refactored the DeepSeek series model billing module to support CacheHitTokens billing
- Added DeepSeek series model pricing to the Pricing page of the Next site

Finally, I wish you all a happy New Year in 2025!

#24-1229 Update:

Thanks to the sponsorship of group member @Bryce, the following Doubao models are free for a limited time. Use them as much as you like:

- `doubao-pro-128k-240628`
- `doubao-pro-32k-241215`
- `doubao-pro-4k-240515`
- `doubao-pro-4k-character-240728`
- `doubao-lite-4k-240328`
- `doubao-lite-4k-character-240828`

ps: From a quick look, the strongest one should be `doubao-pro-32k-241215`. If you are interested, come and try it out.
 
![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/486969/image-preview)
![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/486968/image-preview)

#24-1228 Update:

- Added support for ByteDance's Doubao series large models
- Added support for Moonshot AI's moonshot v1 series large models (provided by ByteDance's Volcano platform)
- The multiplier is 1 (original price with zero markup; this is not meant to make money, just to make things convenient for everyone)

For pricing details, please see the Next site Pricing page: https://next.ohmygpt.com/pricing


![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/486929/image-preview)

#24-1221 Update:

- Google AI series models are on sale at 20% off
- Improve the stability of Google AI series automatic retry function

#24-1219 Update:

- Added support for the latest 24-1217 version of the Realtime series models. The new 4o Realtime is much cheaper, and the mini is cheaper still. With a high cache hit rate, the price has reached an acceptable level:

`gpt-4o-realtime-preview-2024-12-17`
`gpt-4o-mini-realtime-preview`
`gpt-4o-mini-realtime-preview-2024-12-17`

- Added cache billing for Realtime series models.
- Updated the real-time call demo on the Next site. You can now try the new Realtime series models in the demo: https://next.ohmygpt.com/chat Mirror: https://next.aigptx.top/chat


![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/485118/image-preview)


#24-1218 Update:

- Added support for the latest `gpt-4o-audio-preview-2024-12-17` model; its audio input and output prices are much lower than those of the original gpt-4o-audio-preview-2024-10-01.
- Regarding the official release of `o1`: OAI has not fully opened it. I checked a number of Tier 5 accounts and none had API access to the official o1, so it was not integrated today; waiting for OAI to open it up.
- The Chat.Completions API now supports the optional `reasoning_effort` parameter, which adjusts how much effort o1 series models spend on thinking. According to the official documentation [1], this parameter is only meaningful for the o1 series.
- "Consumption Records" now supports cache fields and audio input/output fields. These data can be queried in the usage statistics/consumption records of the Next site.
- The new 4o-realtime1217 will be integrated tomorrow; adding support for some of these fields took some time today.


![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/484617/image-preview)

![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/484618/image-preview)

[1] OpenAI official documentation https://platform.openai.com/docs/api-reference/chat/create

#24-1205 Update:

- Improved the balance display on the Next site. The estimated fiat currency value is now shown, and the same estimate has been added to the APIKey management table.
- The API endpoint is now shown on the APIKey management page of the Next site.
- Added a CF reverse proxy setup tutorial for simple and safe recording of raw API data and model name rewriting: https://ohmygpt-docs.apifox.cn/doc-5628298


#24-1127 Update:

- Newly launched file storage service `file-v1`

This service is intended to be a safe, fast, cost-effective, and controllable file service that underpins any file-related business, such as temporary storage of multi-modal files and document upload/download for AI services. You can also use this site's storage for purposes such as image hosting.

All users get 1GB of file storage for free; usage beyond that is billed at the following rate:

Pricing:

`$9.9/TB/Month`

There are no other fees: requests and egress traffic are free.
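
As a rough illustration of the pricing above, the sketch below estimates a monthly bill. The notice does not say how a terabyte is counted or how partial usage is prorated, so the 1 TB = 1024 GB conversion and the linear proration are assumptions; the helper name `monthly_storage_cost_usd` is hypothetical.

```python
from decimal import Decimal

FREE_GB = Decimal(1)                 # free allowance from the notice
PRICE_PER_TB_MONTH = Decimal("9.9")  # USD, from the notice
GB_PER_TB = Decimal(1024)            # assumption: binary terabytes


def monthly_storage_cost_usd(used_gb):
    """Estimate the monthly cost in USD for the given storage in GB."""
    billable_gb = max(Decimal(used_gb) - FREE_GB, Decimal(0))
    # Assumption: usage above the free tier is prorated linearly.
    return billable_gb * PRICE_PER_TB_MONTH / GB_PER_TB


# 1025 GB stored: 1 GB is free, the remaining 1024 GB (1 TB) costs 9.9 USD.
cost = monthly_storage_cost_usd(1025)
```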

Documentation: https://ohmygpt-docs.apifox.cn/doc-5575792

#24-1126 Update:

- Added support for Claude Count Message tokens API `/v1/messages/count_tokens`

Documentation: https://ohmygpt-docs.apifox.cn/api-237994522

The new file system was originally planned to go online today, but CDN caching issues caused a number of pitfalls, so the launch is postponed by a day or two.

#24-1123 Update:

- Anthropic's official built-in PDF input is now supported in the native Messages interface. To use it, add the `pdfs-2024-09-25` keyword to the `anthropic-beta` request header to enable this beta feature:

anthropic-beta: pdfs-2024-09-25, prompt-caching-2024-07-31


![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/478988/image-preview)

#24-1121 Update:

- Added support for the latest `gpt-4o-2024-11-20` model, the price has not changed from the previous version 24-0806

![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/478460/image-preview)

#24-1116 Update:

- Improved the color contrast of the model usage statistics chart
- The Next site has added a technical demo of a real-time chat room. You are welcome to try it (it is released for public testing to see how it works in practice):

Demo(main site): https://next.ohmygpt.com/tools
Demo (mirror): https://next.aigptx.top/tools

Features:
1. Built on Socket.IO for relatively reliable, low-latency, two-way real-time communication between client and server.
2. No relational database records messages behind this demo; it relies solely on the framework's emit to deliver messages.
3. You can create your own room to start a group chat
4. Temporarily unlimited use of the built-in AI chat function (model: Command R Plus by Cohere)


#24-1111 Update:

- APIKey permission control revision:

To reduce confusion about old keys being unable to call new models directly, the APIKey modules of the website were improved today:

All APIKeys now have a new "check permissions" flag

Keys created on the web page no longer check model permissions (APIKeys created on the web page set this flag to false by default)

For old keys, you can turn this switch off manually to bypass model permission checks

For security reasons, the "check permissions" flag of every existing APIKey in the system is set to true. That is, permission checks remain enabled for all old APIKeys, so there is no need to worry about old keys being affected by this change.



#24-1110 Update:

- Added support for `glm-4v-plus` model, see [1] for pricing details

[1] https://next.ohmygpt.com/pricing

#24-1109 Update:

- Added Gemini series JSON output mode support, please see the documentation for relevant implementation details[1]
- Added `gemini-1.5-flash-8b` official version model support, please see [2] for pricing details

[1] https://ohmygpt-docs.apifox.cn/api-123253693
[2] https://next.ohmygpt.com/pricing

#24-1108 Update:

- Improve the message splicing logic of Gemini series models in the OpenAI compatible interface
- Improve the compatibility of the Claude series model OpenAI interface, while improving the documentation and showing relevant implementation details[1]

[1] https://ohmygpt-docs.apifox.cn/api-123253693

#24-1107 Update:

- Improved Tools call compatibility support for Claude series models in the OpenAI compatible interface

For details, see: https://ohmygpt-docs.apifox.cn/api-123253693

#24-1106 Update:

- Bot now supports PDF file chat

https://t.me/getfunbot

#24-1105 Update:

- Supports the latest Claude 3.5 Sonnet, adding the following model names:
- `claude-3-5-haiku`
- `claude-3-5-haiku-latest`
- `claude-3-5-haiku-20241022`

- Added new version of GLM series model support
- `glm-4-plus`
- `glm-4-airx`
- `glm-4-air`
- `glm-4-long`
- `glm-4-flashx`
- `glm-4-flash`

For price details, check the pricing page: https://next.ohmygpt.com/pricing

#24-1103 Update:

- Solve the Google/GitHub OAuth authentication jump problem of Next station
- Complete the new version of account registration/password reset logic (you can now use verification code to register/reset account password)

![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/474969/image-preview)

Attached:
Next main website: https://next.ohmygpt.com
Next site mirror: https://next.aigptx.top


At this point, the login, registration, recharge, API management and other functions of the new version of the website are basically completed. Currently, users who only use the API can start using the Next site. The next development direction of the website is:

- Vector retrieval service and knowledge base question and answer service
- Complete the open file system based on object storage (used for file upload related purposes, and can also be used as a small network disk
- Complete the Chat function of the new version of the website (including file upload, plug-in system, code interpreter, flow chart, Agent, etc.
- Complete the image generation function of the new website
- Improve TgBot and use the capabilities of the Tg platform to implement more interesting and practical functions at low cost
- Access more API service providers and more models
- Started doing some business based on Japanese companies


#24-1101 Update:

- Completed Next station usage details query function, now you can query or export usage details on Next station `/settings/usage` page

![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/474721/image-preview)

![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/474722/image-preview)

Attached:
Next main site: https://next.ohmygpt.com/settings/usage
Next site mirror: https://next.aigptx.top/settings/usage

#24-1031 Update:

- Completed Next station usage statistics function, now you can query the usage statistics chart on Next station `/settings/usage` page

![image.png](https://api.apifox.com/api/v1/projects/3190129/resources/474549/image-preview)

Attached:
Next main site: https://next.ohmygpt.com/settings/usage
Next site mirror: https://next.aigptx.top/settings/usage


#24-1026 Update:

- Calling Claude series and Gemini series models through OpenAI compatible interfaces will now return usage-related data blocks
- Improved the pricing display page: Added detailed pricing display for some models such as Claude and Gemini, which can be viewed on the `/pricing` page of the Next station

Attached:
Example usage data block:
```json
{
    "id": "chatcmpl-hMOdfnoaXtcpCZwRJISUAzPtsfZKs",
    "object": "chat.completion.chunk",
    "created": 1729944958,
    "model": "claude-3-5-sonnet",
    "choices": [],
    "usage": {
        "prompt_tokens": 11,
        "completion_tokens": 8,
        "total_tokens": 19,
        "prompt_tokens_details": {
            "cached_tokens": 0
        },
        "completion_tokens_details": {
            "reasoning_tokens": 0
        }
    }
}

#24-1024 Update:

Improved pricing page: Added detailed pricing display of some models such as OpenAI and 01AI, which can be viewed on the /pricing page of Next station

Next main website https://next.ohmygpt.com/pricing

Next mirror https://next.aigptx.top/pricing

Flux API is no longer free, please see the pricing page for details

Happy 1024 Programmer’s Day (I realized today is 1024 when I wrote this update log)

#24-1023 Update:

Added support for Anthropic's latest claude-3-5-sonnet-20241022

#24-1022 Update:

Added new Grok series models that support Laomajia X-AI, magnification 1.1: grok-beta

Added Qwen2.5 7B/72B model that supports TA

The self-developed Telegram Bot [beta version] has been launched: @getfunbot, and is currently actively developing new features. Document link: https://ohmygpt-docs.apifox.cn/doc-5319532

Attached:

#Grok Original Pricing Notes

## grok-beta

PromptTokens:
$5.00

CompletionTokens:
$15.00

input:

$5/1M tokens

1tk = (5 * 250000)/1000000 = 1.25 coins

output:

$15/1M tokens

1tk = (15 * 250000)/1000000 = 3.75 coins

#24-1018 Update:

Added support for the latest gpt-4o-audio-preview series of models. However, due to technical reasons, only non-streaming calls of this model are currently supported. Users who need real-time voice interaction please use the /v1/realtime real-time WebSocket endpoint of this site.

The self-service invoicing system has been updated. Now users can apply for and initiate mainland business-to-business transfers and recharges by themselves (it also starts from a minimum of 1 yuan)

System documentation: https://ohmygpt-docs.apifox.cn/doc-5305167

Attached:

# gpt-4o-audio-preview series original pricing accounting notes

gpt-4o-audio-preview
gpt-4o-audio-preview-2024-10-01

Pricing analysis:

Text
$2.50 / 1M tokens
$10.00 / 1M tokens
Audio***
$100.00 / 1M tokens
$200.00 / 1M tokens

====TEXT====

input:

$2.5 / 1M tokens

1tk = (2.5 * 250000)/1000000 = 0.625 coins

output:

$10/1M tokens

1tk = (10 * 250000)/1000000 = 2.5 coins
====TEXT====

====Audio====

input:

$100 / 1M tokens

1tk = (100 * 250000)/1000000 = 25 coins

output:

$200 / 1M tokens

1tk = (200 * 250000)/1000000 = 50 coins
====Audio====

#24-1017 Update:

A newly launched self-service invoicing system. Now users can apply for and issue invoices for alipay/alipayEnterprise/CNB2B (mainland business-to-business transfers) orders by themselves, starting from a minimum of 1 yuan.

System address: https://ivs.dogenet.cn/

System documentation: https://ohmygpt-docs.apifox.cn/doc-5305167

#24-1016 Update:

Added support for 01 All Things Series large models:

yi-lightning

yi-large

yi-medium

yi-vision

yi-medium-200k

yi-spark

yi-large-rag

yi-large-fc

yi-large-turbo

Multiply: 1 (original price forwarded)

Attached:

# yi model original pricing calculation notes

## yi-lightning

Pricing:
￥0.99 /1M tokens

1tk = (0.99 * 34000)/1000000 = 0.03366 coins

## yi-large

Pricing:
￥20 /1M tokens

1tk = (20 * 34000)/1000000 = 0.68 coins

## yi-medium

Pricing:
￥2.5 /1M tokens

1tk = (2.5 * 34000)/1000000 = 0.085 coins


## yi-vision

Pricing:
￥6 /1M tokens

1tk = (6 * 34000)/1000000 = 0.204 coins

## yi-medium-200k

Pricing:
￥12 /1M tokens

1tk = (12 * 34000)/1000000 = 0.408 coins

## yi-spark

Pricing:
￥1 /1M tokens

1tk = (1 * 34000)/1000000 = 0.034 coins

## yi-large-rag

Pricing:
￥25 /1M tokens

1tk = (25 * 34000)/1000000 = 0.85 coins

## yi-large-fc

Pricing:
￥20 /1M tokens

1tk = (20 * 34000)/1000000 = 0.68 coins

## yi-large-turbo

Pricing:
￥12 /1M tokens

1tk = (12 * 34000)/1000000 = 0.408 coins

Note: The calculation here is based on ￥1=34000 coins

#24-1011 Update:

Added CDN mirrors optimized for the Asia-Pacific region:

Mirror station 1: https://www.aigptx.top

Mirror station 2: https://x.dogenet.win

New version of website mirror 1: https://next.aigptx.top

Pure API reverse image: https://c-z0-api-01.hash070.com/v1

(The measured access speed in the Asia-Pacific region is first-rate, highly recommended by users in some regions)

#24-1005 Update:

Added support for OpenAI's latest Realtime Endpoint and 4o-realtime preview model. You can now talk to AI in real time with low latency.

The web version has launched out-of-the-box chat function, you can now experience it on the new web page: https://next.ohmygpt.com/chat or https://next.aigptx.top/chat

Newly supported model names:

gpt-4o-realtime-preview-2024-10-01

Attached:

Documentation: https://ohmygpt-docs.apifox.cn/websocket-3563866

# 4o-realtime original pricing accounting notes

- gpt-4o-realtime-preview-2024-10-01

====TEXT====

input:

$5/1M tokens

1tk = (5 * 250000)/1000000 = 1.25 coins

output:

$20 / 1M tokens

1tk = (20 * 250000)/1000000 = 5 coins
====TEXT====

====Audio====

input:

$100 / 1M tokens

1tk = (100 * 250000)/1000000 = 25 coins

output:

$200 / 1M tokens

1tk = (200 * 250000)/1000000 = 50 coins
====Audio====

#24-0913 Update

Added support for OpenAI's latest o1 and o1 mini series models. The current concurrency quota is very low and is only suitable for personal early adopter use (API stream calls have been supported to be compatible with most applications (converting non-stream response results into stream results)). It is not suitable for production use on the product line, nor is it suitable for immersive translation.

Mainland business-to-business transfers and mainland electronic invoice issuance are now supported. If necessary, you can directly chat with the group leader via email or tg for processing.

Attached:

Newly supported model names:
-o1-preview

o1-preview-2024-09-12
-o1-mini

o1-mini-2024-09-12

# o1 series original pricing calculation
o1-preview | o1-preview-2024-09-12

$15.00 / 1M input tokens
$60.00 / 1M output tokens

input:

$15/1M tokens

1tk = (15 * 250000)/1000000 = 3.75 coins

output:

$60 / 1M tokens

1tk = (60 * 250000)/1000000 = 15 coins


----

o1-mini | o1-mini-2024-09-12

$3.00 / 1M input tokens
$12.00 / 1M output tokens


input:

$3/1M tokens

1tk = (3 * 250000)/1000000 = 0.75 coins

output:

$12/1M tokens

1tk = (12 * 250000)/1000000 = 3 coins

====

OpenAI o1-preview
o1-preview is our new reasoning model for complex tasks that require broad general knowledge. The model has 128K context and an October 2023 knowledge cutoff.

Learn about o1-preview
Model
Pricing
o1-preview
$15.00 / 1M input tokens
$60.00 / 1M output tokens
o1-preview-2024-09-12
$15.00 / 1M input tokens
$60.00 / 1M output tokens
OpenAI o1-mini
o1-mini is a fast, cost-efficient reasoning model tailored to coding, math, and science use cases. The model has 128K context and an October 2023 knowledge cutoff.

Learn about o1-mini
Model
Pricing
o1-mini
$3.00 / 1M input tokens
$12.00 / 1M output tokens
o1-mini-2024-09-12
$3.00 / 1M input tokens
$12.00 / 1M output tokens

#0828 Update:

Added support for the recently listed gemini-1.5-pro-exp-0827 and gemini-1.5-flash-exp-0827 (prices correspond to 1.5 pro and 1.5 flash respectively)

0825 Added support for Flux series image generation API documentation: https://ohmygpt-docs.apifox.cn/api-208347493

#24-0819 Update

Follow up and lower the gemini 1.5 flash cost calculation formula

Added support for gemini-1.5-pro-exp-0801

The Gemini API service provider migrated from the Google Cloud Vertex AI platform to the Google AI platform, and the concurrency of Gemini related models increased significantly.

Turn off Gemini API built-in auditing

Attached:

# gemini 1.5 pro original pricing calculation notes

----
Input price (per 1M tokens)

$3.50 for <= 128K tokens

$7.00 for > 128K tokens
----

input: for <= 128K tokens:

$3.50/1M tokens

1tk = (3.50 * 250000)/1000000 = 0.875 coins

input: for > 128K tokens:

$7.00 /1M tokens

1tk = (7.00 * 250000)/1000000 = 1.75 coins

----

Output price (per 1M tokens)

$10.50 for <= 128K tokens

$21.00 for > 128K tokens

----

output: for <= 128K tokens

$10.50 /1M tokens

1tk = (10.50 * 250000)/1000000 = 2.625 coins

output: for > 128K tokens

$21.00 /1M tokens

1tk = (21.00 * 250000)/1000000 = 5.25 coins

=====================================


#gemini 1.5 flash original pricing calculation notes

----
Input price (per 1M tokens)

$0.075 for <= 128K tokens

$0.15 for > 128K tokens
----

input: for <= 128K tokens:

$0.075 /1M tokens

1tk = (0.075 * 250000)/1000000 = 0.01875 coins

input: for > 128K tokens:

$0.15 /1M tokens

1tk = (0.15 * 250000)/1000000 = 0.0375 coins


----
Output price (per 1M tokens)

$0.30 for <= 128K tokens

$0.60 for > 128K tokens
----

output:

$0.30 /1M tokens

1tk = (0.30 * 250000)/1000000 = 0.075 coins


output:

$0.60 /1M tokens

1tk = (0.60 * 250000)/1000000 = 0.15 coins

#24-0815 Update

Added Prompt Cache support for some Claude models, which may save a lot of costs for continuous conversations with very long texts. This feature is only supported on the Messages native interface and will take effect with the corresponding request header. For details, please refer to: https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching

Restore Mj service, but the price is twice as high as before

# Opus input and read cache:

cache_creation_input_tokens:

$18.75 / 1M input tokens

1tk = (18.75 * 250000)/1000000 = 4.6875 coins

cache_read_input_tokens:

$1.50 / 1M input tokens

1tk = (1.50 * 250000)/1000000 = 0.375 coins

# Claude 3.0 Haiku input and read cache:

cache_creation_input_tokens:

$0.30 / 1M input tokens

1tk = (0.30 * 250000)/1000000 = 0.075 coins

cache_read_input_tokens:

$0.03 / 1M input tokens

1tk = (0.03 * 250000)/1000000 = 0.0075 coins

# Claude 3.5 Sonnet input and read cache:

cache_creation_input_tokens:

$3.75 / 1M input tokens

1tk = (3.75 * 250000)/1000000 = 0.9375 coins

cache_read_input_tokens:

$0.30 / 1M input tokens

1tk = (0.30 * 250000)/1000000 = 0.075 coins

#24-0813 Update

Support chatgpt-4o-latest

#24-0807 Update:

Added support for the latest GPT-4 Omni version 0806, which is more than half the price of the first generation 4Omni.

Added support model name:

gpt-4o-2024-08-06

Added the function of IP recording for some API calls.

The function of requesting IP records has been implemented in some common interfaces.

You can now directly view the corresponding requested IP in the bill query function.

For security reasons, the request IP from the developer's OAuth APP will not be recorded

Attached:

#24-0710 Update:

The back-end authentication module has been reconstructed, and the development of the new website has made phased progress. A new open OAuth API has been added. Now developers can connect their projects to this site to quickly connect various AI capabilities to your projects. At the same time, there is no need to pay attention to miscellaneous matters such as user management, API integration, and billing. Focus on the development and implementation of functions, and you will also have the opportunity to obtain rewards, profits, and shares. For details, please view the document: https://ohmygpt-docs.apifox.cn/doc-4718106

Redirection of legacy models:

This site no longer supports the forwarding of the following models, but it still supports calling the API through these model names to ensure the compatibility of older versions of the program to a certain extent.

gpt-3.5-turbo-0301 Deprecated Rewritten to gpt-3.5-turbo
gpt-3.5-turbo-0613 Deprecated Rewritten to gpt-3.5-turbo
gpt-3.5-turbo-16k Deprecated Rewritten to gpt-3.5-turbo
gpt-3.5-turbo-16k-0613 Deprecated Rewritten to gpt-3.5-turbo

gpt-4-0314 Deprecated Rewritten to gpt-4

gpt-4-vision-preview | gpt-4-1106-vision-preview Deprecated Rewritten to gpt-4o

#24-0621 Update:

It supports Anthropic's latest release of Claude 3.5 Sonnet. The price is exactly the same as Claude 3 Sonnet, but A said that its capabilities and speed are stronger than C3Opus. It also supports native API calls through Messages and OpenAI Chat.Completions compatible API calls.

claude-3-5-sonnet

claude-3-5-sonnet-20240620

(The webmaster has successfully graduated on 24-0614 and is now back to active status (concentrating on learning new cheese + writing code and maintaining/developing new projects + steadily advancing the immigration plan) Thank you all for your continued support)

#24-0531 Update:

Support Google’s latest Gemini 1.5, new model support:

gemini-1.5-pro-001

gemini-1.5-flash-001

Both support Vision input

Note:

gemini-pro-vision and gemini-pro-vision have pointed to gemini-1.5-pro-001

The actual measured concurrency is pitifully small, and it is still only suitable for early adopters. We will try to apply for a quota increase in the future.

/v1/models API changes

Now all model names supported by this site will be returned

{
    "object": "list",
    "data": [
        {
            "id": "gpt-3.5-turbo",
            "object": "model",
            "created": 1717133244679,
            "owned_by": "OhMyGPT"
        }...
    ]
}

Attached:

Gemini 1.5 original expense accounting notes

#Gemini 1.5 Pro

Price(input)

$3.50 / 1 million tokens (for prompts up to 128K tokens)

1tk = (3.5 * 250000) / 1000000 = 0.875 coins

$7.00 / 1 million tokens (for prompts longer than 128K)

1tk = (7 * 250000) / 1000000 = 1.75 coins

Price (output)

$10.50 / 1 million tokens (for prompts up to 128K tokens)

1tk = (10.5 * 250000) / 1000000 = 2.625 coins

$21.00 / 1 million tokens (for prompts longer than 128K)

1tk = (21 * 250000) / 1000000 = 5.25 coins


#Gemini 1.5 Flash

Price(input)

$0.35 = 1 million tokens (for prompts up to 128K tokens)

1tk = (0.35 * 250000) / 1000000 = 0.0875 coins

$0.70 = 1 million tokens (for prompts longer than 128K)

1tk = (0.7 * 250000) / 1000000 = 0.175 coins

Price (output)

$1.05 / 1 million tokens (for prompts up to 128K tokens)

1tk = (1.05 * 250000) / 1000000 = 0.2625 coins

$2.10 / 1 million tokens (for prompts longer than 128K)

1tk = (2.1 * 250000) / 1000000 = 0.525 coins

#24-0514 Update:

**1. Added support for the latest GPT-4 Omni, OpenAI’s most advanced multi-modal model. **

Added support model name:

-gpt-4o

gpt-4o-2024-05-13

This model is currently the same as gpt-4-turbo, which only supports mixed text and image reasoning + plain text output.

The price of input and output is 70% to 50% lower than that of gpt4t. Coupled with the optimization of GPT4o's word segmenter, actually with the same number of words, GPT4o's Tokens will be reduced by more than half, so the call cost is about the same order of magnitude as gpt3.5.

Reference: https://openai.com/index/hello-gpt-4o/

Attached:

#0514 GPT4o Original Pricing Accounting

gpt-4o | gpt-4o-2024-05-13 $5.00 / 1M tokens $15.00 / 1M tokens

-gpt-4o
- gpt-4o-2024-05-13

input:
$5 = 1M tokens

1tk = (5 * 250000)/1000000 = 1.25 coins

output:
$15 = 1M tokens

1tk = (15 * 250000)/1000000 = 3.75 coins

Remove GPT4 restrictions for ordinary users

Small changes to OpenAI streaming requests

The server has made a change to the Stream request of the OpenAI model. It should not be a destructive update. When stream=true, the following parameter will be forcibly added, requiring the OpenAI API to return the number of input and output Tokens for this request.

    "stream_options":{
        "include_usage": true
    }

In the future, stream requests on the server side will no longer need to parse and calculate tokens. Everything will be based on the number of tokens reported by the upstream API. This not only simplifies a lot of code, but also improves the accuracy of billing.

Note:

This change will cause the streaming request to have an extra chunk in the penultimate line:

data: {"id":"chatcmpl-9OcFBHIB9jKqJ9El1a1AkxkN4SH2G","object":"chat.completion.chunk","created":1715653909,"model":"gpt-4-turbo-2024-04- 09","system_fingerprint":"fp_0737e0dfd9","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"length"}],"usage":null}

data: {"id":"chatcmpl-9OcFBHIB9jKqJ9El1a1AkxkN4SH2G","object":"chat.completion.chunk","created":1715653909,"model":"gpt-4-turbo-2024-0 4-09","system_fingerprint":"fp_0737e0dfd9","choices":[],"usage":{"prompt_tokens":778,"completion_tokens":10,"total_tokens":788}}

data: [DONE]

The tokens chunk at the end is a common AI SSE stream output practice. Many AI services have this chunk, and OpenAI currently supports it natively.

Note: The choices in the penultimate line is an empty array. If your code uses an unsafe array reading method, an array out-of-bounds exception may occur when reading the penultimate line.

If you encounter this problem, we recommend improving your program's stream parsing code to make your program's stream parsing more robust.

#24-0510 Update:

Added support for the following open source models and calls through the OpenAI standard Chat.Completions API:

TA/Qwen/Qwen1.5-0.5B-Chat

TA/Qwen/Qwen1.5-1.8B-Chat

TA/Qwen/Qwen1.5-4B-Chat

TA/Qwen/Qwen1.5-7B-Chat

TA/codellama/CodeLlama-7b-Instruct-hf

TA/meta-llama/Llama-2-7b-chat-hf

TA/meta-llama/Llama-3-8b-chat-hf

TA/mistralai/Mistral-7B-Instruct-v0.1

TA/mistralai/Mistral-7B-Instruct-v0.2

TA/google/gemma-7b-it

TA/Qwen/Qwen1.5-14B-Chat

TA/WizardLM/WizardLM-13B-V1.2

TA/meta-llama/Llama-2-13b-chat-hf

TA/Qwen/Qwen1.5-32B-Chat

TA/zero-one-ai/Yi-34B-Chat

TA/deepseek-ai/deepseek-coder-33b-instruct

TA/codellama/CodeLlama-34b-Instruct-hf

TA/Qwen/Qwen1.5-72B-Chat

TA/deepseek-ai/deepseek-llm-67b-chat

TA/codellama/CodeLlama-70b-Instruct-hf

TA/meta-llama/Llama-2-70b-chat-hf

TA/meta-llama/Llama-3-70b-chat-hf

TA/Qwen/Qwen1.5-110B-Chat

TA/mistralai/Mixtral-8x7B-Instruct-v0.1

TA/mistralai/Mixtral-8x22B-Instruct-v0.1

Added Coder model that supports DeepSeek

deepseek-coder

#24-0509 Update:

24-0509 Update Notice

Since the low-priced OAI channel is basically unable to maintain the normal operation of the website, and it is also expected that the low-priced channel will gradually disappear, in order to ensure the quality of service, it was decided to officially switch to formal payment, and the OpenAI API multiplier was modified to 1.1. That is, the website’s deduction price for OpenAI is now the official price * 1.1. The surplus will be used to subsidize the payment gateway fee, website operating expenses and support the continued development of the website and the company.

This is a difficult decision, but after this adjustment, the API service will be more stable and reliable, and the website's business model will be more formal and healthy. Please understand and thank you for your support.

Since the website itself is operated through a US company, with a bunch of US company accounts under its name, as well as a bunch of debit and credit cards, it is very efficient and convenient in terms of fund transfer and bill payment.

OpenAI's AI services are indeed the industry benchmark, but with the recent development of the AI field, there are many other excellent options.

It is recommended to consider the following AI services as alternatives:

Anthropic's Claude3 series: the whole series is multi-modal, among which Haiku and Sonnet are quite cost-effective. The execution and understanding of many tasks are obviously better than 3.5. The current embedded search function of the website is also implemented by Haiku.

DeepSeek's latest DeepSeek-V2 model (deepseek-chat) has been quite popular recently. It (claims to be) the most powerful open source model and is comparable to GPT4Turbo. The price is also close to free, with one million Tokens only costing 1 yuan. The actual practical experience is not bad and it is worth a try.

CoHere's CommandR series is also a strong competitor to 3.5.

3.5 reverse: gpt-3.5-turbo-web-rev is reverse engineered using Bindeng chatgpt. The delay is acceptable. Dozens of tokens are deducted symbolically for each successful request. It is close to free. It is based on the number of times instead of tokens. It is very practical when there are many tokens.

#24-0508 Update:

Access the latest DeepSeekV2 from DeepSeek

Added model name:

deepseek-chat

#24-0409 Update:

Newly added CommandR model that is connected to the popular Cohere and can now be directly called through the OpenAI.Chat.Completions compatible API

Limited time 10% off special offer

New model:

command-r: (supposedly) a model that benchmarks OAI’s gpt-3.5-turbo

command-r-plus: (It is said) the target is Anthropic's C3 Sonnet

command-r-online: Based on CR, the official built-in web-search function is added

command-r-plus-online: Based on CRP, additional official built-in web-search function

PS:
(I heard) The Command R series is also relatively good at Role Play.

#24-0405 Update:

Claude Messages API supports the latest tools parameter. Now Claude can also use ToolCall. It seems to have built-in thinking chain reasoning, which is a little more awesome than OpenAI's TC:

Example input:

{
    "model": "claude-3-opus-20240229",
    "max_tokens": 1024,
    "messages": [
        {
            "role": "user",
            "content": "Where did Dog Cook's stock fall today?"
        }
    ],
    "stream": false,
    "tools": [
        {
            "name": "get_stock_price",
            "description": "Get the current stock price for a given ticker symbol.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "ticker": {
                        "type": "string",
                        "description": "The stock ticker symbol, e.g. AAPL for Apple Inc."
                    }
                },
                "required": [
                    "ticker"
                ]
            }
        }
    ]
}

Output:

{
    "id": "msg_01HVfeUwYukk6pWKzkGWsn8h",
    "type": "message",
    "role": "assistant",
    "content": [
        {
            "type": "text",
            "text": "<thinking>\nThe user is asking about the current stock price for \"狗库克\" which likely refers to Tim Cook, the CEO of Apple Inc. To get the current stock price, I would need to use the get_stock_price function and provide the ticker symbol for Apple, which is AAPL.\n\nThe user provided enough context to infer the parameter value needed:\nticker: Based on the reference to \"狗库克\" (a nickname for Tim Cook), the ticker symbol for Apple (AAPL) can be inferred.\n\nSince the required parameter can be inferred from the context, I can proceed with the function call.\n</thinking>"
        },
        {
            "type": "tool_use",
            "id": "toolu_01Tmx5NLhQZ7hEUpnYUGSPqV",
            "name": "get_stock_price",
            "input": {
                "ticker": "AAPL"
            }
        }
    ],
    "model": "claude-3-opus-20240229",
    "stop_reason": "tool_use",
    "stop_sequence": null,
    "usage": {
        "input_tokens": 507,
        "output_tokens": 199
    }
}

24-0401更新：

在 Chat.Completions API中新增兼容支持智谱AI的GLM系列模型(费率1.1)

新增模型名称:

glm-3-turbo

glm-4

glm-4v

(glm-4v没法直接在Lobe中通过OpenAI API直接使用，明天再看看

Anthropic模型限时折扣结束，恢复到1.1

24-0326更新：

价格倍率调整：OpenAI普通模型和GPT4的倍率统一调整为 0.5

24-0321更新：

修复部分前端文字错误

优化内嵌的LLM搜索功能

C3系列现在可在OpenAI.Chat.Completions API中使用内嵌的联网搜索功能（ Prompt以 /gs gs/ 即可触发

24-0318更新：

修复OpenAI.Chat.Completions兼容接口中的Claude模型stop参数没有正确映射的问题

优化兼容接口中对于Claude的适配：

role=system的msg会被提取出来，放到API的system参数中再发过去

如果一个assistant的msg中包含图像，为了使其符合claude的msg api的校验规则，会在前面加一个占位的msg，从而将这个msg顶到user角色中，使其能正常带上图像内容而不报错

如果msg的长度为偶数，则现在不会再强制补成奇数，（一般情况下）此时末尾会是assistant，这样会触发原生的直接补全功能

24-0315更新：

鉴权模块变更：如果您的APIKey打开了”管理权限“的开关，那么它将绕过模型权限的检查。（新的模型名称不用每次都去手动勾选了

针对C3的中文半角符号问题，上线了三个补丁版本，会修正C3输出的中文中包含的半角符号，不影响英文内容的符号（仅可在 OpenAI.Chat.Completions 兼容接口中使用。

claude-3-opus-patch

claude-3-sonnet-patch

claude-3-haiku-patch

添加OpenAI.Models.RetriveModels API接口支持

24-0314更新：

新增 Claude 3 Haiku （小杯）支持 claude-3-haiku | claude-3-haiku-20240307

特色：

同样支持图像/文字混合的多模态输入

价格和性能对标gpt-3.5-turbo

（Anthropic声称）Haiku的性能比3.5更加优秀（https://www.anthropic.com/news/claude-3-family

C3系列中最快最便宜的

24-0312更新：

上线Groq的Mixtral-8x7b 模型，免费（ groq-mixtral-8x7b-32768 ）

注意：非生产就绪，仅供个人体验

24-0310

单笔$10以上的订单将免除Stripe手续费

24-0309

补充Claude带版本号的模型名称（claude-2.0/2.1/3-opus-20240229...

优化设置界面中模型名称的展示方式

24-0308

移除Anthropic(Claude)系列模型的前置内容过滤器，现在您可以随意使用Claude进行角色扮演

Anthropic系列模型限时5折特惠

24-0306

解决新的Anthropic接口的兼容性问题，包括：

网页/兼容API中的Cladue系列模型，因各个角色的对话顺序限制导致的请求错误

LobeChat中使用C3模型+图像输入时导致的错误

修复更新Vision相关代码后导致GPT4V接口异常的问题

24-0305

新增Anthropic最新的Claude3系列模型，C3全系多模态，支持图像输入，其中C3大杯（Opus）声称比GPT4还要强。

claude-3-opus C3 （大杯）

claude-3-sonnet C3 （中杯）

C3小杯等待发布

价格参考：https://www.anthropic.com/api#pricing

现可通过OpenAI的Chat.Completions接口，以GPT4V的标准输入和输出，也可以通过Anthropic官方的Messages接口格式使用。

文档：https://ohmygpt-docs.apifox.cn/api-152507338

24-0303

微信收款渠道被关闭，请各位用户以后通过Stripe渠道进行充值。

24-0217

由于OpenAI官方已经将gpt-3.5-turbo滚动更新到0125，因此默认模型的计费也同步更新到0125了

24-0205

Google AI服务（包括text-bision/gemini-pro等模型）不再免费，倍率变更为1.1

24-0203

倍率变动

不再区分闲时和平时了，倍率变更为：

普通模型倍率： 0.3
GPT-4系列模型倍率： 0.4

24-0127

新增一个图像生成API和模型( nd-imagine-v3 )，适合生成二刺螈图像 (experimental)

API 文档：https://ohmygpt-docs.apifox.cn/api-145379071

24-0125

新增 gpt-4-0125-preview 和 gpt-4-turbo-preview ，其上下文窗口、参数和价格相对于1106均无变化（据说解决了懒惰的问题

新增第三代嵌入模型： text-embedding-3-small 和 text-embedding-3-large ，其中第三代嵌入模型对于第二代有如下变化

价格和ada v2不同，3代小号嵌入模型比ada v2便宜，大号比ada v2贵一点点

嵌入API现可选维度（dimesions）参数

总结：

总共新增如下模型名称

gpt-4-turbo-preview
gpt-4-0125-preview
text-embedding-3-small
text-embedding-3-large

受影响的API Endpoints:

Embeddings API: Add dimensions parameter support for embedding v3 models
Chat Completions API: 新增0125批次的GPT4 Turbo模型

参考：

https://platform.openai.com/docs/changelog/jan-25th-2024
https://openai.com/blog/new-embedding-models-and-api-updates

附：前几天的更新：

0122更新：

新增serper的谷歌搜索开放API，更便宜更好用，文档： https://ohmygpt-docs.apifox.cn/api-145085337

将LLM内嵌搜索功能的搜索服务商更改为serper，仍然可使用 /gs | gs/ 或者特定的模型名称触发

0123更新：

解决账单数据量过大时，按月统计图表可能会加载不出来的Bug，同时优化统计速度

24-0119

新增大模型内嵌联网功能：

目前已支持通过网页端和API来访问内嵌了联网功能的大模型

目前仅支持OpenAI的3.5/4/4Turbo系列的模型（32k和Azure除外）
目前仅支持基于谷歌官方搜索API的搜索功能
后面会接入更多模型以及更多可选的搜索服务提供方

使用方法：

在网页中直接使用：

当你需要使用谷歌联网搜索时，在prompt开头加上 /gs 或者 gs/ 就可以强制要求LLM在本次回答中，先使用谷歌搜索一下，然后再结合搜索到的信息进行回答。

示例：

/gs 什么是“芝士雪豹”？

通过API调用：

当你需要通过API在第三方客户端或者其它项目中使用时，您可以通过以下两种方式调用本站内置的LLM联网功能：

方式1：在密钥权限列表中勾选 chat-completions-with-gs 这样您就可以像网页中的那样，在需要谷歌搜索辅助时，在您输入的prompt最前面加上 /gs 或者 gs/ 即可触发LLM的谷歌搜索功能。（不勾选此项的话无法触发LLM的谷歌联网搜索功能）

方式2：上线了 xx-online-with-gs 系列模型，您可以通过直接调用该模型名称，来要求每次请求都要使用谷歌查询来辅助回答，无需特别的关键字触发。

新增权限名称：

chat-completions-with-gs LLM内嵌谷歌搜索开关，不勾选它的话，此Key将无法使用 /gs 或 gs/ 关键词触发谷歌联网搜索功能

gpt-3.5-turbo-online-with-gs 强制触发基于谷歌搜索的LLM联网模型名称，下面都是一样的

gpt-4-online-with-gs

gpt-4-1106-preview-online-with-gs

已知缺点：

在能力较弱的3.5使用联网功能时，可能会出现脚标和链接输出格式错误或混乱的问题，会尝试继续调整和优化相关的Prompt

新增Completions模型支持：

davinci-002

babbage-002

作为初代GPT3的替代品

24-0113

上线了谷歌搜索兼容API

（为后面的LLM联网功能做铺垫

文档：https://ohmygpt-docs.apifox.cn/api-141553709

24-0110

移除了24-01-04弃用的模型，受影响的API和模型如下：

Completions:

text-davinci-003
text-davinci-002
text-ada-001
text-babbage-001
text-curie-001

Edits:

code-davinci-edit-001
text-davinci-edit-001

Embeddings:

text-search-ada-doc-001

参考链接：
https://platform.openai.com/docs/deprecations

23-1214

增集成谷歌最新的Gemini Pro系列模型推理服务，同时Google AI模型推理服务继续向本站所有用户免费开放

新增Google AI系列模型：

gemini-pro

gemini-pro-vision

现可通过本站提供的Chat/Completions API直接调用，其中Gemini Pro Vision是一个多模态的大模型，支持基于文字和图片混合推理，参数用法和OpenAI的GPT4v一致，支持流和非流，模型能力大约是3.5的水平，多模态推理能力差强人意（指大体上算令人满意），可以当作小GPT4v来玩。

附：
相关文档说明：https://ohmygpt-docs.apifox.cn/api-123253693

23-1201

新增集成Google PaLM 2，限时免费中。

新增模型：

palm-2-chat-bison

palm-2-chat-bison-32k

现可在线网页体验或通过OpenAI的ChatCompletions API进行调用。

23-1125

计费相关：

新增“闲时折扣”：闲时普通模型倍率调整为0.1，GPT4倍率调整为0.3。此举旨在回馈新老客户，同时提高服务器资源利用率，鼓励在空闲时间利用模型进行批量处理数据。

注：当前”闲时“定义：时区UTC+8（上海标准时间），每天的 0:00 a.m. 到 9:00 a.m. ，还有周六和周日全天应用“闲时折扣”，此折扣可能会更改，建议查阅最新的文档，通过API来自动检查当前折扣倍率。

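Self-service PDF invoices and receipts are supported (at no extra charge): through the Stripe channel, use "Stripe Invoice" to create an order. Once the order is created, an invoice is sent to your email immediately, and a receipt is sent after payment completes.

Models:

Updated the Azure OpenAI deployment: gpt35 upgraded to version 1106, gpt4 upgraded to 1106p, and the API version upgraded to 2023-12-01-preview. Tools are supported as well, and the built-in filter has been removed so that it does not interfere with usage.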

23-1117

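231117 update notice:

Billing records:

Optimized query performance for the spending statistics charts.

Improved how the front-end statistics charts show spending: they now show the cost for each time period directly instead of the token count, which is more intuitive.

Added the associated key to the billing records of API calls.

Added bulk export of billing data, as JSON or CSV files.

Membership:

Lowered the VIP threshold: topping up ¥40 or more within six months now qualifies for VIP.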

23-1113

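Relaxed the rate-limit policy and reduced flow control on 400 errors. Member rate quotas have roughly doubled:

VIP 600 RPM => 1k RPM
PREMIUM 5k RPM => 10k RPM

(Triggering 400 is still discouraged, but the active flow control that follows a 400 has been greatly reduced: a single 400 no longer leads straight to 429.)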

23-1106

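The new models and APIs have been rolled out progressively since noon today.

New normal models supported:

gpt-3.5-turbo-1106

The new GPT35 model
Input price is about 1/3 lower than the old model

The old gpt-3.5-turbo still points to the older 35-0613. The old version had no price cut, so gpt-3.5-turbo keeps its price. (We may rewrite the default model to the new 1106; the plan is to ask the group for opinions and see whether the new model is stable before changing it.)

dall-e-3

OpenAI's latest image generation model. The results are very good, close to overtaking Mj, and it is very simple to call. Recommended.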

tts-1

tts-1-hd

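OpenAI's latest TTS (text-to-speech) models. The results are excellent, comparable to a real human voice, and a single voice natively supports multiple languages. In my opinion it outperforms Azure's neural speech engine. It should be useful for language learning, AI conversation, and AI read-aloud (audiobooks, web page reading) applications.

New GPT-4 Turbo models supported: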

gpt-4-1106-preview

gpt-4-vision-preview

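Advantages:

Very fast

Very large 128k context window (up to 124k input and up to 4k output)

Strong multimodal image input support

New Tools calling support, which can call multiple functions at once

Lower price

The new preview models have fairly strict request rate limits. For now it appears every OpenAI account gets the same 100 RPD, so this site can offer only about a few thousand to tens of thousands of RPD in total. Availability cannot be guaranteed, but you can already start integrating and adapting them to your own applications.

Also note that with gpt-4-vision-preview it is best to set max_tokens explicitly; otherwise output stops after only a few tokens.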
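The max_tokens advice can be illustrated with a request body like the one below. This is a sketch under assumptions: the helper name `build_vision_request` is hypothetical, the value 512 is an arbitrary choice, and the image URL is a placeholder. The message layout follows OpenAI's Chat Completions format for image input.

```python
import json


def build_vision_request(question: str, image_url: str, max_tokens: int = 512) -> dict:
    """Build a gpt-4-vision-preview request body with max_tokens set explicitly."""
    return {
        "model": "gpt-4-vision-preview",
        # Set explicitly so the reply is not cut off after a few tokens.
        "max_tokens": max_tokens,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


payload = build_vision_request(
    "What is in this image?",
    "https://example.com/cat.png",  # placeholder image URL
)
print(json.dumps(payload, indent=2))
```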

23-1104

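Added the Anthropic models (Claude2/ClaudeInstant1) to the OpenAI ChatCompletions interface

That is, claude-2 and claude-instant-1 can now be called through the /v1/chat/completions interface. (The OpenAI interface is more widely compatible than Anthropic's original interface: just set a custom model name and it works.)

The Anthropic model rate is also lowered from 1.2 to 1.1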
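A minimal sketch of such a call with Python's standard library is shown below. The base URL `https://api.example.com/v1` and the key `sk-your-key` are placeholders, not values from this changelog, and the Bearer-token header is assumed from the OpenAI-compatible interface. The request is only constructed here; nothing is sent.

```python
import json
import urllib.request

API_BASE = "https://api.example.com/v1"  # placeholder: replace with your endpoint


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build a POST request for the OpenAI-style /chat/completions endpoint."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        API_BASE + "/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,  # assumed OpenAI-style auth
        },
        method="POST",
    )


# Only the model name differs from a GPT call.
req = build_request("claude-2", "Hello", "sk-your-key")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```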

23-1025

Normal model multiplier lowered to 0.2

GPT4 multiplier lowered to 0.4

23-0917

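The Midjourney image-generation web front end is initially complete and online. Try it by clicking "AI Drawing" (AI画图) in the web menu. NORMAL mode is free for a limited time.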

Attached: Related Midjourney learning materials:
https://learningprompt.wiki/zh-Hans/docs/category/-fundamentals-1

23-0906

Midjourney API NORMAL mode is now free (mainly because nobody uses it, so it is left as is).

Claude2 has switched from reverse-engineered web access to a paid interface and is no longer free. Stability and usability are greatly improved.

Passed the Azure GPT4 approval in the company's name. The Azure endpoint now provides stable, reliable inference for the 0613 model versions (with func_call) at a multiplier of 1.1. The models are gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-4 and gpt-4-32k; gpt-4-32k is served entirely through Microsoft Azure.

Because gpt-4-32k is provided by Azure, this model is now production-ready, and enterprise/advanced users can safely use it in production.

Due to the depreciation of the CNY, and based on the latest exchange rate of $1 : ¥7.3, the token exchange ratio for the RMB top-up channel (WeChat QR-code payment) is adjusted from ¥1 = 35000 to ¥1 = 34000.
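The arithmetic behind this adjustment can be checked with a few lines of Python. This is only an illustration using the figures quoted in this entry; the function name `tokens_for_cny` is hypothetical, and the per-dollar figure is simply what the quoted ratio and exchange rate imply, not a published price.

```python
TOKENS_PER_CNY_OLD = 35_000  # ratio before this change
TOKENS_PER_CNY_NEW = 34_000  # ratio after this change
CNY_PER_USD = 7.3            # exchange rate quoted in this entry


def tokens_for_cny(amount_cny: int, rate: int = TOKENS_PER_CNY_NEW) -> int:
    """Tokens credited for a top-up of `amount_cny` yuan at the given ratio."""
    return amount_cny * rate


# Tokens that one US dollar's worth of CNY buys at the new ratio.
implied_tokens_per_usd = round(TOKENS_PER_CNY_NEW * CNY_PER_USD)

print(tokens_for_cny(100))
print(tokens_for_cny(100, TOKENS_PER_CNY_OLD) - tokens_for_cny(100))
print(implied_tokens_per_usd)
```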

#23-0901

The Midjourney open API service is officially launched and in production ⭐
Related introduction documents: https://ohmygpt-docs.apifox.cn/doc-2941619

23-0829

Integrated Stripe: you can now top up directly in US dollars by credit card, Apple Pay, Google Pay, WeChat Pay and other methods, and get official invoices and receipts at the same time.

23-0722

Added free Claude2 API

Interface document address: https://apifox.com/apidoc/shared-9aac3c46-2596-43c0-b69f-17e3256d0bf8/api-97461921

23-0720

Added CN2 GIA mirror API

CN2 GIA-E: https://cn2us02.opapi.win/v1/
CloudFlare CDN (stable): https://cfcus02.opapi.win/v1/
CloudFlare Worker (stable and reasonably fast): https://cfwus02.opapi.win/v1/

23-0719

Planned website maintenance: the service will be interrupted for 10 minutes while the site migrates to a high-performance server with DDoS protection.

23-0620

3.5% off model for a limited time

23-0614

Lowered the price of the gpt-3.5-turbo model in line with OpenAI

Added gpt-3.5-turbo-0613

Added gpt-3.5-turbo-16k

23-0517

Rebuilt the website front end and optimized the chat page and top-up page.

Added the aff (affiliate) invitation feature

Added key permission control for fine-grained control over which models each key can access.

23-0426

Launched the web page text-to-speech read-aloud feature
