A Complete How-To Guide to Google Gemini

January 13, 2025

Reading time: 7 minutes

Reading level: 🟢 Easy

In 2023, Google announced Gemini, a multimodal large language model (LLM) capable of processing text, images, and audio with impressive performance. One of the most accessible ways to experience its capabilities is through the Gemini chatbot, previously known as Google Bard.

Apart from working with multimodal input, Gemini simplifies how we interact with information by unifying Google Search’s power with a conversational AI interface. So, instead of manually parsing countless web pages, you get concise, search-enhanced answers in a single chat.

In this article, we’ll explore Gemini’s unique capabilities, how to get started, and the standout features that make it an exciting tool for everyday use.

Note

While “Gemini” also refers to Google’s broader family of AI models, we’ll use the name here to talk specifically about Google's AI chatbot.

Let’s dig into it.

What is Google Gemini?

Gemini's versions: Gemini (free version) and Gemini Advanced (paid version)

Gemini is an artificial intelligence (AI) chatbot built on two Google models: Gemini 1.5 Flash in the free version and Gemini 1.5 Pro in Gemini Advanced, the paid version. This underlying technology is multimodal, meaning it can natively handle and combine text, images, audio, video, and code.

For example, you can upload a photo of a landmark and ask about its history, share a snippet of code for debugging, or dictate your queries using voice input.

You can upload an image and ask questions about it. In this example, we asked, "What's the breed of this dog?"

Gemini also supports over 40 languages, so it can function as an on-the-fly translator or language tutor.

Gemini’s standout capability is its integration with Google Search, enabling it to retrieve and summarize real-time information. Unlike traditional search engines, Gemini delivers answers in a conversational format, minimizing the need to sift through multiple web pages.

Exploring Gemini’s Versions

1. Gemini (Free Version)

  • Powered by Gemini 1.5 Flash.
  • Supports multimodal queries with text, images, and audio.
  • Integrated Google Search for up-to-date, contextual answers.

2. Gemini Advanced ($20/month)

Gemini Advanced is a $20/month tier that unlocks Gemini 1.5 Pro and additional benefits, including:

  • Enhanced context window: Processes up to 2 million tokens.
  • Experimental model access: Gemini-Exp-1206 for complex coding and advanced math tasks.
  • File uploads: Work directly with your documents and images.
  • Code execution: Run and edit Python code in-app.
  • Priority features: Early access to experimental tools like Deep Research for creating detailed reports.
  • Customizations via Gems: Tailor the chatbot to specific workflows or tones.
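To put the 2-million-token context window in perspective, here is a rough sketch that estimates whether a piece of text would fit. The four-characters-per-token ratio is an assumed rule of thumb for English text, not Gemini's actual tokenizer, and `estimate_tokens` and `fits_in_context` are hypothetical helper names.

```python
# Rough estimate only: real token counts depend on the model's tokenizer.
CONTEXT_WINDOW_TOKENS = 2_000_000  # figure quoted above for Gemini Advanced
CHARS_PER_TOKEN = 4  # assumption: common rule of thumb for English text


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(text: str, limit: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the estimated token count is within the limit."""
    return estimate_tokens(text) <= limit
```

Under this assumption, a 5,000-character document comes out at about 1,250 tokens, a tiny fraction of the window.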

What are Gems?

Gems are customizable "profiles" that refine Gemini’s behavior to suit your needs. They allow users to define:

  • Tone: Adjust responses to be more formal, casual, or playful.
  • Workflow: Create pre-built step-by-step instructions for repetitive tasks.

Gems are primarily available to Gemini Advanced users, but they’re rolling out to more users over time.
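One way to picture a Gem is as a small, reusable bundle of instructions. The sketch below is only a mental model: Gems are set up in Gemini's interface, and the `GemProfile` class and its fields are hypothetical names, not Gemini's internal format.

```python
from dataclasses import dataclass, field


@dataclass
class GemProfile:
    """Hypothetical stand-in for a Gem: a name, a tone, and workflow steps."""

    name: str
    tone: str = "casual"
    workflow: list[str] = field(default_factory=list)

    def to_instructions(self) -> str:
        """Render the profile as plain-text instructions."""
        lines = [f"You are '{self.name}'. Respond in a {self.tone} tone."]
        if self.workflow:
            lines.append("Follow these steps:")
            # Number the steps starting from 1.
            lines += [f"{i}. {step}" for i, step in enumerate(self.workflow, 1)]
        return "\n".join(lines)
```

For example, `GemProfile("Editor", tone="formal", workflow=["Read the draft", "List fixes"])` would render a formal-tone instruction followed by two numbered steps.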

Key Features of Gemini (Free Version)

Feature | Description
Multimodal abilities | Handles text, images, and audio inputs. Can also generate images based on prompts.
Real-time data integration | Displays sources and related links for quick fact-checking.
Feedback handling | Give Gemini a thumbs up/down, regenerate a response, or ask for modifications in style (shorter, more casual, more formal, etc.).
Response sharing | Export chats to Google Docs or Gmail, or create shareable links.
Built-in fact-check | Click the Google button beneath a Gemini response to run an automated “double-check” against live search results.

Getting Started with Google Gemini

Who can use the free version of Google Gemini?

Anyone 13+ (depending on your region) with a personal Google account can access Gemini. Just visit Gemini’s webpage or download the mobile app (availability varies by region).

1. Sign in with your Google Account

Go to gemini.google.com and log in.

2. Start a Chat

Type your query in the text box at the bottom. You can also speak your query using the microphone icon on the right, or upload images using the camera icon on the left side of the text field.

The camera icon on the left side of the text field lets you upload images. A microphone icon on the right lets you add voice prompts.

3. Manage Your Chats

In the left-hand sidebar, rename, pin, or delete your conversations.

Manage your chats in the left-hand sidebar.

Key Capabilities in Practice

1. Text Chat & Interacting with Responses

Type a prompt, such as “Explain gravity in simple terms,” and press Enter.

Prompt


Explain gravity in simple terms.

Gemini replies in seconds. You can:

  • Like/Dislike the response
  • Ask it to modify the answer’s tone or length
  • Share or export the conversation
  • Fact-check the response using the Google button

An example answer from Gemini and the ways you can interact with it.

2. Generating Images

Gemini's free version lets you generate images from a prompt in which you specify the subject, the style, and other details.

Prompt


Generate an image of a futuristic space elevator. Make it in cyberpunk style.

Example image generated by Google Gemini.

3. Transcribing Handwritten Notes

Snap a photo of handwritten notes and ask Gemini to transcribe them. You’ll get a neat digital version, saving hours of manual typing.

Prompt


Translate these notes into digital text for me.

4. Summarizing Long Text

Don’t want to read a 5,000-word article? Paste the text into Gemini, and ask for a summary. It’ll produce a concise overview.

Prompt


[Paste your text here]

Give me a 100-word summary of this text in [simple/technical/creative/etc.] terms.
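If you reuse this prompt often, the template can be filled in programmatically before pasting it into the chat. Here's a minimal sketch that mirrors the prompt above; `build_summary_prompt` and its parameter names are hypothetical, and the function only builds the text, it doesn't send anything to Gemini.

```python
# Same wording as the prompt above, with placeholders for the variable parts.
SUMMARY_TEMPLATE = (
    "{text}\n\n"
    "Give me a {words}-word summary of this text in {style} terms."
)


def build_summary_prompt(text: str, words: int = 100, style: str = "simple") -> str:
    """Fill the summary template with the source text, length, and style."""
    if words <= 0:
        raise ValueError("words must be a positive integer")
    return SUMMARY_TEMPLATE.format(text=text.strip(), words=words, style=style)
```

Calling `build_summary_prompt(article, words=50, style="technical")` would produce the same prompt with a 50-word target and a technical register.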

5. Generating Code

Gemini can write or refactor code in multiple languages. Provide a description or partial snippet, and it’ll suggest improvements or generate new code.

Prompt


Give me a simple HTML, JS, CSS, and Python code for a word counter app that uses Flask.
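For a sense of what the Python side of such an app boils down to, here's a minimal sketch of the counting logic only. It isn't Gemini's output: the Flask routing and the HTML, JS, and CSS front end are left out so the example runs with the standard library alone, and `count_words` is a hypothetical helper name.

```python
import re
from collections import Counter


def count_words(text: str) -> dict:
    """Count total, unique, and most frequent words in a piece of text."""
    # Treat runs of letters, digits, and apostrophes as words; ignore case.
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    return {
        "total": len(words),
        "unique": len(set(words)),
        "top": Counter(words).most_common(3),
    }
```

In a full app, a web route would pass the submitted text to a function like this and return the result to the page.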

Summary: Use Cases and Applications

The free version of Gemini covers a wide range of everyday tasks. Alongside the capabilities shown above, it can assist with:

Feature | Description | Use Cases
Homework assistance | Explains concepts, helps write essays, and summarizes learning content. | Solving math problems, writing book reports, and preparing study notes.
Research support | Provides citations, outlines, and ideas for projects or academic papers. | Drafting research proposals, building bibliographies, and outlining thesis content.
Creative writing | Creates poetry, stories, and marketing copy. | Writing short stories, brainstorming ad copy, or generating dialogue for creative projects.
Coding assistance | Debugs code, explains programming logic, and generates code in various languages. | Fixing bugs, learning a new programming language, and generating reusable code snippets.
Multimodal query handling | Describes uploaded images and analyzes data visualizations. | Explaining diagrams, interpreting charts, and generating insights from visual data.
Language translation | Handles multiple languages and supports language learning by simulating conversations. | Translating documents, practicing conversational skills, and learning new vocabulary.
Brainstorm & generate content ideas | Offers ideas for blogs, marketing campaigns, and creative projects. | Developing article angles, creating slogans, and outlining marketing strategies.
Write taglines & short copy | Crafts ads, email subject lines, and other concise forms of communication. | Creating catchy taglines, writing promotional emails, and drafting social media captions.
Compare research or data | Generates comparison charts and evaluates data side by side. | Analyzing differences in articles, summarizing pros and cons, and comparing research findings.
Travel & activity recommendations | Combines real-time data with insights about destinations or activities. | Planning vacations, exploring local attractions, and finding suitable accommodations.
Image recognition | Describes images and identifies their content. | Recognizing objects, analyzing artwork, and generating captions for images.

Limitations of Gemini (Free Version)

Even though Gemini’s free version is powerful, here are some current downsides:

  1. No direct web lookups: You can’t just say, “Find me the official website for X,” and have Gemini open or parse it in real-time.
  2. No full document uploads: You can share images, but not PDFs, Word docs, or long external files (beyond copy-paste).
  3. Occasional hallucinations: If the data doesn’t exist or is contradictory, it might provide misleading or incorrect answers.
  4. Writing style constraints: You can’t fully customize its “voice” or make it consistently witty, educational, or formal on demand.

Comparison with Competitors

Feature | Gemini | ChatGPT Free | Claude Free
Multimodal capability | ✅ (Text+Images) | |
Integration with Google apps | | |
Document upload | ❌ (Images only) | |

Takeaway: Gemini excels in multimodal tasks and integrates well with Google’s ecosystem, but it lacks some file-uploading abilities that are found on other platforms.

Conclusion

Whether you’re brainstorming content, summarizing a dense article, translating a foreign text, or debugging code, Gemini’s free version is a strong starting point. Keep in mind that it’s not perfect: it can make mistakes, offer incomplete references, or lack advanced file upload capabilities. Still, it continues to evolve, and Google has indicated plans to bring more advanced features to the free version over time. For now, Gemini offers a helpful way to simplify tasks and explore the potential of AI-assisted tools in everyday use.

Andres Caceres

Andres Caceres, a documentation writer at Learn Prompting, has a passion for AI, math, and education. Outside of work, he enjoys playing soccer and tennis, spending time with his three huskies, and tutoring. His enthusiasm for learning and sharing knowledge drives his dedication to making complex concepts more accessible through clear and concise documentation.

Valeriia Kuka

Valeriia Kuka, Head of Content at Learn Prompting, is passionate about making AI and ML accessible. Valeriia previously grew a 60K+ follower AI-focused social media account, earning reposts from Stanford NLP, Amazon Research, Hugging Face, and AI researchers. She has also worked with AI/ML newsletters and global communities with 100K+ members and authored clear and concise explainers and historical articles.


© 2025 Learn Prompting. All rights reserved.