“AI-powered Japanese learning” is now the default app store pitch. AI tutor. AI conversation partner. AI-personalized SRS. AI pronunciation grading. AI everything.
After subscribing to 7 of these in April 2026 and testing each for a full 30 days, my honest report: about 20% of the AI features genuinely improved my Japanese. The rest were marketing, or worse, actively harmful. Here’s the teardown, feature by feature. No apps named (naming names invites lawsuits), just categories.
What Actually Helped
1. OCR Scan for Instant Flashcard Creation
Point your camera at a menu, a manga panel, a sign. The app extracts the Japanese text, segments it, and lets you add it to your SRS deck in one tap. The AI here is not novel (ML Kit has done on-device OCR for five years), but the UX breakthrough is that it’s finally fast enough to be worth doing mid-meal.
Retention impact: measurably higher for words I captured from real life vs words I saw in a preset deck. Emotional context is a memory multiplier.
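The part of this flow that isn’t OCR at all is plain bookkeeping: dedupe the captured words against your existing deck and keep the capture context attached to each card, since that context is the memory multiplier. A toy sketch of that step; the `Deck` class and method names are my own invention, not any app’s data model:

```python
from dataclasses import dataclass, field

@dataclass
class Deck:
    # word -> the real-life context it was captured from
    cards: dict = field(default_factory=dict)

    def add_capture(self, words, context):
        """Add OCR-extracted words, skipping any already in the deck.

        Returns only the newly added words, so the UI can show a
        one-tap confirmation instead of a pile of duplicates.
        """
        new = [w for w in words if w not in self.cards]
        for w in new:
            # Store where you saw it: "from that ramen menu" beats
            # "card #4,812 in a preset deck" for recall.
            self.cards[w] = context
        return new
```

The design choice worth copying is returning only the new words: the mid-meal use case dies the moment the app makes you resolve duplicates by hand.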
2. Example Sentence Generation With Your Current Vocab
Give the model a list of words you’ve already learned. Ask it to produce natural sentences using only those words plus a new target word. Now the new word appears in a comprehensible context.
This is the one place where LLMs are genuinely better than static textbooks. Textbook example sentences are frozen. AI-generated ones can use your vocabulary.
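To make the mechanism concrete, here is a toy sketch of how such a vocabulary-constrained prompt might be assembled. The function name and prompt wording are hypothetical (no app exposes this), and the resulting string would go to whatever LLM the app runs:

```python
def build_sentence_prompt(known_words, target_word, n=3):
    """Build a constrained-generation prompt: the new target word
    in context, using only vocabulary the learner already knows."""
    vocab = "、".join(known_words)
    return (
        f"Write {n} natural Japanese sentences that each use the word "
        f"「{target_word}」. Use ONLY these known words plus basic "
        f"particles: {vocab}. After each sentence, give a short "
        f"English gloss."
    )

prompt = build_sentence_prompt(["食べる", "好き", "毎日"], "寿司")
```

The constraint list is the whole trick: without it, the model pads sentences with vocabulary above your level and the "comprehensible context" benefit evaporates.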
3. Grammar Pattern Explanations in Plain English
Ask the AI “why does this sentence use 〜のに instead of 〜けど?” and get a 150-word explanation with examples. That is faster than flipping through three grammar references. I still cross-checked against Bunpo and DoJG because the AI occasionally hallucinated, but roughly 80% accuracy was good enough for most questions.
What Was Marketing Theater
4. AI Conversation Partners
The pitch: unlimited practice with a patient Japanese tutor. The reality: GPT with a Japanese system prompt, no voice synthesis quality control, and responses that defaulted to textbook-perfect grammar no Japanese person actually speaks.
Every “AI tutor” I tested gave me いいですね when I made mistakes and never corrected errors unless I explicitly begged. A real tutor would’ve stopped me on my third sentence to fix my pitch accent. The AI let me practice bad Japanese for 30 minutes.
5. AI-Personalized SRS
Two apps advertised that their AI “adjusts your intervals based on your learning style.” I tested both with identical review patterns, and the intervals matched the vanilla SM-2 algorithm exactly. The “AI” was a branding layer on a nearly 40-year-old algorithm.
Genuine personalization would need months of behavioral data and a legit ML model. No app is doing this yet — they’re using ease-factor tweaks and calling it AI.
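For context, vanilla SM-2 (published by Piotr Wozniak for SuperMemo in 1987) fits in a dozen lines, which is why rebranding it as AI is so cheap. A minimal sketch of one review step; the function name and argument shape are mine, not any app’s API:

```python
def sm2_review(quality, repetitions, interval, ease_factor):
    """One review step of the SM-2 spaced-repetition algorithm.

    quality: self-grade 0-5. Returns (repetitions, interval_days,
    ease_factor) for the next review.
    """
    if quality < 3:
        # Failed recall: restart the repetition sequence.
        # Original SM-2 leaves the ease factor unchanged on failure.
        return 0, 1, ease_factor
    if repetitions == 0:
        interval = 1          # first successful review: 1 day
    elif repetitions == 1:
        interval = 6          # second: 6 days
    else:
        interval = round(interval * ease_factor)
    # Update ease factor from the grade; clamp at the SM-2 floor of 1.3.
    ease_factor += 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    ease_factor = max(1.3, ease_factor)
    return repetitions + 1, interval, ease_factor
```

If an app’s intervals for a fixed grade sequence reproduce this table exactly (1, 6, then multiply by the ease factor), there is no personalization model behind the curtain.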
6. AI Pronunciation Grading
Every app claimed to grade my pronunciation. Every app gave me 90%+ scores on sentences I deliberately mumbled. The grading models are too lenient to be useful; they’re tuned for retention (make users feel good) not accuracy (tell users the truth).
One app gave me a perfect score for saying “sushi” with an English accent. A real pronunciation trainer would catch the rounded English /u/ where Japanese has the unrounded [ɯ], and the drawn-out final vowel where Japanese keeps it short.
What Was Actively Harmful
7. AI-Generated Lesson Content
Two apps produced full lessons procedurally. Some lessons contained outright wrong information — a grammar explanation that contradicted itself, a vocab list with a pitch accent that didn’t exist. Users have no way to know which sentences are hallucinated.
Static human-reviewed content is strictly better than dynamic LLM content at the current hallucination rate. This was the feature I immediately turned off on every app that had it.
8. AI Motivation Nudges
“Hey Shawn, you haven’t studied today. Your friends in Tokyo are learning faster than you.” I don’t have friends in Tokyo. I don’t have friends named Shawn in the app. The AI made them up to trigger FOMO. This crossed a line for me — I uninstalled two apps over this.
The Scoring Rubric I Used
| Feature Type | 30-Day Retention Gain | Friction Added | Keep? |
|---|---|---|---|
| OCR scan to SRS | +18% | Low | Yes |
| Vocab-aware sentence gen | +12% | Low | Yes |
| Grammar explainer | +6% | Low | Yes |
| AI tutor chat | -2% | High | No |
| AI-personalized SRS | 0% | Low | No |
| AI pronunciation grading | -4% | Medium | No |
| AI lesson generation | -8% | High | No |
| AI motivation nudges | -15% (uninstall rate) | Very High | No |
The pattern is clear: AI helps when it augments your existing data, hurts when it generates fake data.
What to Look for in 2026
- Does it use AI on your real input? OCR scan + sentence gen with your vocab = yes. AI-generated lessons = no.
- Does it let you verify? If the grammar explainer lets you cross-reference examples, fine. If it hides sources, skip.
- Is the SRS algorithm transparent? SM-2 or FSRS disclosed = fine. “Proprietary AI” = almost certainly marketing.
- Does it respect your attention? Motivation nudges that make up facts about you are disqualifying.
My own keeper list, distilled from the 30 days: OCR scan to auto-create flashcards, separate SRS decks for kanji, vocab, hiragana, and katakana, a lock-screen widget, and a transparent SM-2 algorithm. No AI tutor, no fake motivation nudges, no hallucinated lessons. Just the tools that actually work.
The Bigger Point
AI in 2026 is great at augmenting static content with your personal context. It’s terrible at generating the static content itself, and it’s terrible at replacing human teachers for correction.
The apps that will win the next two years are the ones that use AI quietly on the edges (OCR, contextual examples, explanation on demand) while keeping the core content human-verified. The apps loudly shouting “AI tutor” are often the weakest products underneath.
Test any app for 30 days before paying. And measure your actual retention, not how flashy the onboarding felt. The data tells the truth.