gemini-to-openai-api-wrapper
To access the Gemini 2.0 Flash model from a command-line AI tool like aichat, which requires an OpenAI-compatible API, you can create a wrapper API that translates OpenAI-compatible requests into Gemini API requests. This wrapper can be hosted locally and used as a bridge between aichat and the Gemini API[4].
Here's how to do it:
Set up the API Environment: You will need Python and the Flask framework to create a local API server, plus the google-generativeai library from Google AI for Developers to interact with the Gemini API[3].
Install Flask and the Google AI Python library:
pip install Flask google-generativeai
API Key Setup: Get a Gemini API key from Google AI for Developers[3].
Code the Translation Layer: Create a Flask application that listens for OpenAI-compatible requests and translates them to Gemini API requests[4].
Here’s a basic example of a Flask application (app.py) that does this:
from flask import Flask, request, jsonify
import google.generativeai as genai
import time

app = Flask(__name__)

# Replace with your actual Gemini API key
GOOGLE_API_KEY = "YOUR_GEMINI_API_KEY"
genai.configure(api_key=GOOGLE_API_KEY)

# Function to translate an OpenAI-style request into a Gemini prompt
def translate_to_gemini(openai_request):
    messages = openai_request.get('messages', [])
    # Extract the most recent user message
    user_message = next((msg['content'] for msg in reversed(messages) if msg['role'] == 'user'), None)
    return user_message

# Function to call the Gemini API
def call_gemini_api(prompt):
    model = genai.GenerativeModel('gemini-2.0-flash')
    try:
        response = model.generate_content(prompt)
        return response.text
    except Exception as e:
        return str(e)

@app.route('/v1/chat/completions', methods=['POST'])
def chat_completions():
    data = request.get_json()
    # Translate the OpenAI-style request to a Gemini prompt
    gemini_prompt = translate_to_gemini(data)
    if gemini_prompt is None:
        return jsonify({"error": "No user message found"}), 400
    # Call the Gemini API
    gemini_response = call_gemini_api(gemini_prompt)
    # Format the response to be OpenAI-compatible, including the fields
    # OpenAI clients typically expect (id, object, created, model, finish_reason)
    response = {
        "id": "chatcmpl-local",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": data.get("model", "gemini-2.0-flash"),
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": gemini_response
                },
                "finish_reason": "stop"
            }
        ]
    }
    return jsonify(response)

if __name__ == '__main__':
    app.run(debug=True, port=5000)
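Note that translate_to_gemini only forwards the latest user message, so earlier turns in an aichat conversation are dropped. If you want to preserve multi-turn context, a rough sketch of a fuller translation is shown below; the function name translate_history_to_gemini is just illustrative, and system messages are skipped here (they could instead be supplied via the model's system_instruction option):
# Sketch: map the full OpenAI message history onto Gemini "contents".
# OpenAI's "assistant" role becomes Gemini's "model" role.
def translate_history_to_gemini(openai_request):
    contents = []
    for msg in openai_request.get('messages', []):
        if msg['role'] == 'system':
            continue  # handle separately (e.g. system_instruction) in a fuller version
        role = 'model' if msg['role'] == 'assistant' else 'user'
        contents.append({'role': role, 'parts': [msg['content']]})
    return contents
Since generate_content accepts a list of role/parts entries as well as a plain string, call_gemini_api can pass this list through unchanged.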
Run the application: Execute the Python script to start the local server[3].
python app.py
This starts a local server on port 5000.
Configure aichat: Point aichat to your local server by setting its base URL to http://localhost:5000/v1 [4]. You may need to configure the tool to send requests in the OpenAI format, which your Flask app is designed to handle; a sample configuration is sketched below.
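As a rough sketch, the relevant part of aichat's config.yaml might look like the following; the client name gemini-local is arbitrary, and the exact keys can differ between aichat versions, so treat this as an assumption and check aichat's own documentation:
clients:
  - type: openai-compatible
    name: gemini-local              # arbitrary label, used as the model prefix
    api_base: http://localhost:5000/v1
    api_key: unused                 # the wrapper ignores it, but a value may still be required
    models:
      - name: gemini-2.0-flash
With a configuration along these lines, the model would be selected in aichat as gemini-local:gemini-2.0-flash.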
Testing: Send a request to http://localhost:5000/v1/chat/completions with a payload that mimics the OpenAI format to check that the translation works correctly[4].
Example payload:
{
    "model": "gemini-2.0-flash",
    "messages": [
        {"role": "user", "content": "Explain how AI works"}
    ]
}
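To exercise the endpoint from Python, a minimal smoke test like the one below (assuming the requests package is installed) posts that payload and prints the assistant's reply:
import requests

payload = {
    "model": "gemini-2.0-flash",
    "messages": [{"role": "user", "content": "Explain how AI works"}]
}
# Post the OpenAI-style payload to the local wrapper and print the reply text.
resp = requests.post("http://localhost:5000/v1/chat/completions", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])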
Deployment Notes: Consider using environment variables for sensitive information like API keys. For production, use a more robust WSGI server like Gunicorn or uWSGI instead of the Flask development server[5].
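As a sketch of the environment-variable approach, the hard-coded key in app.py can be replaced with a lookup like the following (the variable name GEMINI_API_KEY is just a convention):
import os
import google.generativeai as genai

# Read the key from the environment, e.g. export GEMINI_API_KEY="..." before starting the server.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
The app can then be served with a WSGI server, for example: gunicorn --bind 0.0.0.0:5000 app:app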
Citations:
[1] https://docs.github.com/en/copilot/using-github-copilot/ai-models/using-gemini-flash-in-github-copilot
[2] https://developers.googleblog.com/en/gemini-is-now-accessible-from-the-openai-library/
[3] https://ai.google.dev/gemini-api/docs
[4] https://ai.google.dev/gemini-api/docs/openai
[5] https://cloud.google.com/vertex-ai/generative-ai/docs/gemini-v2
[6] https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/call-vertex-using-openai-library?hl=en
[7] https://cloud.google.com/vertex-ai/generative-ai/docs/thinking
[8] https://discuss.ai.google.dev/t/openai-compatibility-with-gemini-2-0-flash-issue-with-using-the-gemini-2-0-flash-001-model/65007