Monday, February 3, 2025

Best AI Assistants (Chatbots)

Posted by Abdullah Zaheer on Monday, February 03, 2025

AI Tools
AI Assistants (Chatbots): 
ChatGPT

ChatGPT consistently ranks at the top of the LM Arena leaderboard, outperforming other models in key benchmarks. It's the world's most popular AI application, with 200 million users as of October 2024.

I’ve used ChatGPT extensively for brainstorming ideas, translation tasks, coding, AI script generation, data analysis and managing research-heavy tasks. The new 4o model is a significant leap forward—it’s incredibly fast, and feels way smarter than any of the previous versions of ChatGPT.

With ChatGPT's multimodal capabilities, I can paste in images, like a chart or graph, and ask questions about them, making it much easier to interpret visual data quickly. I fed it a PNG image of a chart and it analyzed the chart, gave me a table of the raw data (which it read from the chart image) and then redid the chart in my preferred colors. Pretty impressive.

ChatGPT can now retain context over time, personalizing responses based on previous conversations. For instance, I’ve used it to refine recurring project ideas without re-explaining every detail, saving hours of effort. You can review and manage what it remembers through OpenAI’s controls, to make sure it doesn't go all Skynet on you.

The integrated ChatGPT search option (more on this later) makes it even easier to find relevant information directly within conversations. It also cuts down on hallucinations through RAG (Retrieval-Augmented Generation), which grounds the AI's answer by retrieving information from external data sources.
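
To make the RAG idea a bit more concrete, here is a minimal Python sketch. It is not how ChatGPT search is actually implemented. The tiny `DOCUMENTS` list, the word-overlap scoring, and the function names `retrieve` and `build_prompt` are all assumptions made for illustration; a real system would search the web or a vector index and then send the resulting prompt to a model.

```python
import re

# Hypothetical in-memory "knowledge base" (an assumption for this sketch);
# a real system would query a search engine or a vector index instead.
DOCUMENTS = [
    "The Eiffel Tower is 330 metres tall and stands in Paris.",
    "Mount Everest is 8,849 metres above sea level.",
    "The Great Wall of China is over 21,000 kilometres long.",
]


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Return the top_k documents that share the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Put the retrieved passages ahead of the question to ground the answer."""
    sources = "\n".join(f"- {doc}" for doc in retrieve(question, documents, top_k))
    return (
        "Answer using only the sources below.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # The printed prompt is what would be sent to the language model.
    print(build_prompt("How tall is the Eiffel Tower?", DOCUMENTS))
```

The point is that the model is handed the relevant source text alongside the question, so it has less reason to make something up from memory.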

While it excels in creative and general-purpose tasks, I’d recommend exploring other tools like Claude (see below) for coding. It's not that ChatGPT is bad at coding tasks, it's just that Claude is great at them.

ChatGPT o1

o1 is a specialized advanced reasoning model built for complex problem-solving, coding, and math.

While I find 4o excels in creativity and versatility, o1 has proven incredibly useful for specific tasks like coding, troubleshooting technical issues, and even solving intricate math problems that other models struggled with. I’ve used it to generate shell scripts, work through spreadsheet problems, and even tackle cryptic crossword puzzles, where its precision and logical depth really shine.

However, it lacks the broader capabilities and tool integrations of 4o, so I see it more as a complementary option for specific needs rather than a full replacement for more creative or expansive tasks.

Operators

In January 2025, ChatGPT introduced "Operators," AI agents that can book hotels, order food, and shop online. Exclusive to Pro users ($200/month), they show exciting potential but are hit-or-miss in execution.

For instance, I asked the Operator to book a hotel in NYC. It started strong, navigating filters and searching TripAdvisor, but eventually got stuck in a loop. Ordering a pizza was similar—it customized the order but couldn’t complete checkout. Shopping worked better; it found a laptop under $1,000 on Amazon but required me to finish the purchase manually. Operators let you take control when they get stuck, but the laggy browser often makes doing it yourself easier.

Right now, Operators feel more like a proof-of-concept than a practical tool. While the idea of automating repetitive tasks is exciting, it needs improvements in speed and reliability. If you’re already a Pro user, it’s worth exploring, but not essential yet.

Pricing

OpenAI offer a free tier, which currently gives you limited access to GPT-4o and unlimited access to GPT-4o mini. The Plus plan gets you wider access and costs $20/month, which I think is pretty good value for money. They also offer a Pro plan for $200/month, which gives you priority access to their latest tools.

Claude

I’ve been using Claude (the 3.5 Sonnet model, to be specific) for coding tasks, and it’s quickly becoming my go-to for code reviews. What really makes Claude stand out is how precise it is: it seems to "get" the nuances of programming better than other tools I’ve tried. I’ve used it to spot subtle issues in my code and even brainstorm better ways to structure projects. Anthropic are training these models on more recent and specialized coding knowledge and it shows, especially when tackling modern frameworks or troubleshooting tricky bugs.

Another thing I love about Claude is how nice it is to talk to. It feels like it has more "soul" compared to ChatGPT: the tone is warmer, and conversations just flow better. Whether I’m bouncing around ideas or working through a complicated issue, it’s genuinely pleasant to interact with. I haven't quite reached Her levels of affection for Claude, but we're getting there.

That said, I have hit the response and rate limits a little faster than I’d like, which can be a hassle if I’m deep into a project. But for $20/month on the Pro plan, it’s still a great deal, especially if you’re looking for an AI assistant that’s smart, approachable, and particularly strong in coding tasks.

Gemini

Google’s Gemini fits seamlessly into the Google ecosystem. On Android, it feels like a natural extension of the system rather than a separate app, and if you’re already using Google Workspace, it’s incredibly convenient. Whether I was drafting emails, summarizing articles, or asking it random questions, it delivered quickly and smoothly.

I’ve found it useful in unexpected ways too. When reviewing legal documents, I’d do my initial read-through and then ask Gemini to double-check if I missed anything. Another time, I struggled with a confusing sizing chart while shopping for clothes. I snapped a picture of the label, described my usual size, and let Gemini handle the rest. The suggestion was spot-on, and I ended up with a perfect fit!

For creative projects, Gemini’s image capabilities really shine. I once uploaded an image I liked and asked it to describe it as a prompt for an AI image generator. The results were creative and inspiring, making it a fun tool for brainstorming new ideas.

When I was working on a project proposal, Gemini Advanced provided nuanced, tailored suggestions that felt like a genuine productivity boost. It even made copywriting easier—generating meaningful text for design mockups that felt polished, rather than using generic filler like "Lorem Ipsum."

However, it’s not perfect. One frustration I had was with its context retention. When revising a piece of writing, I had to re-explain instructions a few times because it would forget what we’d already discussed. Similarly, when I uploaded an Excel file, got a summary, and later updated the data, Gemini treated the updated file as a brand-new task instead of building on what we’d already done.

Another weak spot is its performance on technical tasks. While it’s great at formatting and debugging simple code, I found that it sometimes rewrote JavaScript as Python unnecessarily. For more specialized or dense content, like legal texts, its analysis lacked depth compared to what I was hoping for. Even its responses to some image-based queries were occasionally inaccurate, which was a letdown after seeing its creative potential elsewhere.

That said, Gemini’s strengths outweigh its flaws. Its tight integration with Google tools makes it practical for anyone already in the Google ecosystem, and its ability to handle both text and images makes it a versatile tool for creative projects. While it’s not the best choice for highly technical or niche tasks, it’s a solid, fast, and easy-to-use assistant for everyday needs—and for me, the advanced features have made it a tool I’ve come to rely on.

While the free Basic version (using the 1.5 Flash model) covers most casual needs, the $19.99/month Gemini Advanced adds the more powerful 1.5 Pro and Gemini-Exp-1206 models for complex tasks like coding, math, and deep research, including analyzing texts up to 1,500 pages.

DeepSeek

DeepSeek is also worth checking out. They let you use their V3 and new R1 models for free on their site, although you still have to pay for API access (it's very cheap though).

DeepSeek's search feels more engaging and "sticky" even after just a few queries. Its transparency—showing reasoning and openly acknowledging what it knows and what it might not—builds a significant level of user trust.

In January 2025, they launched their R1 model as a competitor to OpenAI's o1, quickly gaining attention in the AI community for being both cost-effective and open source. I've played around with both their R1 and V3 models.

I asked both ChatGPT-o1 and DeepSeek-R1 to analyze sections of a presentation I’m working on. R1 provided a more comprehensive analysis, addressing key aspects that o1 overlooked. I also had both brainstorm ideas, and once again, R1 delivered significantly better suggestions than o1.

For coding, I’ve been relying more on DeepSeek (V3) lately because of its straightforward approach: it gets straight to the point with its suggestions. Claude (3.5 Sonnet), by contrast, often takes a more detailed route, proposing multiple solutions and leaning toward the one that aligns best with solid software engineering practices. Both tools are excellent in their own ways, and I’ve started using them equally. DeepSeek is great for its affordability and efficiency, while Claude is invaluable for double-checking critical code and ensuring everything is on point. Together, they make a great team.

For writing, I'm less keen on the DeepSeek models. I find their output less natural-sounding and often boring and repetitive.
