Background
This work was developed in the CTL Seminar, A Deep Dive into Advising and AI, in Spring 2024. In this example set, one short introduction document about LaGuardia Community College has been translated into the top 23 languages spoken at LaGuardia. The following AI tools were used to generate examples:
- Google Translate
- Gemini
- ChatGPT 3.5
- ChatGPT 4
The goal of the project was to translate information about the college (such as the orientation materials) into languages that are commonly spoken by LaGuardia students and their family members. Only a sample document is shown on this page.
Below is the list of the top 24 languages spoken at LaGuardia Community College. The complete slide deck, developed in the CTL Seminar A Deep Dive into Advising and AI (Spring 2024), is available here.
| # | Language | Fall 2022 | # | Language | Fall 2022 |
|---|---|---|---|---|---|
| 1 | English | 60.5% | 13 | French | 0.4% |
| 2 | Spanish | 18.6% | 14 | Urdu | 0.4% |
| 3 | Bengali | 3.6% | 15 | Punjabi | 0.4% |
| 4 | Chinese | 3.5% | 16 | Portuguese | 0.3% |
| 5 | Nepali | 1.4% | 17 | Igbo | 0.3% |
| 6 | Haitian Creole | 1.2% | 18 | Hindi | 0.2% |
| 7 | Tibetan | 1.2% | 19 | Burmese | 0.2% |
| 8 | Tagalog | 1.1% | 20 | Pilipino | 0.2% |
| 9 | Arabic | 1.0% | 21 | Uzbek | 0.2% |
| 10 | Korean | 1.0% | 22 | Thai | 0.2% |
| 11 | Polish | 0.6% | 23 | Russian | 0.1% |
| 12 | Albanian | 0.4% | 24 | Japanese | 0.1% |
Overall, the quality of the Google Translate output was fairly limited. It contains many awkward expressions, lacks fluency, and, in some cases, includes outright mistranslations. Native speakers of these languages would readily notice that the text was machine translated.
Sample Prompt
    trans -t bn+eo+ha+haw+hi+id+ja+km+ko+lo+mg+my+ne+pa+sw+ta+te+th+tl+jg+ur+uz+vi+yue+zh-CN+zh-TW file://EnglishVersion.txt # using Google Translate

Sample output:

LaGuardia Community College E themeluar në vitin 1971 në Long Island City, Queens, LaGuardia Community College është një nga shtatë kolegjet komunitare të Universitetit të Qytetit të Nju Jorkut (CUNY). Tetë departamentet akademike të LaGuardia-s ofrojnë më shumë se 50 programe të diplomës dyvjeçare... [snip]
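The command above joins several target language codes with `+` and passes the source document as a `file://` path. As a minimal sketch of how such a command line could be assembled from a list of codes, the Python snippet below only builds and prints the command; the helper name `build_trans_command` and the shortened code list are illustrative assumptions, and running the result would require translate-shell (`trans`) to be installed.

```python
import shlex


def build_trans_command(codes, source_file):
    """Return the argument list for one translate-shell call covering all codes."""
    if not codes:
        raise ValueError("at least one target language code is required")
    # translate-shell accepts several target languages joined by '+'
    targets = "+".join(codes)
    return ["trans", "-t", targets, f"file://{source_file}"]


# Shortened, illustrative list of target codes (not the full list used above)
codes = ["bn", "hi", "ja", "ko", "ne"]
cmd = build_trans_command(codes, "EnglishVersion.txt")

# Print the command as it would be typed in a shell
print(shlex.join(cmd))
```

Keeping the codes in a list makes it easier to add or drop languages without editing a long command line by hand.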
The following outputs were generated with the prompt below using Gemini (May 2024). The text is noticeably more fluent than the Google Translate output. It is interesting that Gemini outperforms Google Translate, a product that Google has spent a great deal of time and resources to bring to its current level.
Sample Prompt
Using this text and other available resources/information, explain to [LANGUAGE] speakers who are not familiar with the American higher education why they should go to a community college. Use [LANGUAGE].

Sample output:

**Pourquoi les francophones devraient-ils aller dans un community college ?** Le **LaGuardia Community College** est un établissement d'enseignement supérieur qui offre une variété de programmes d'études, notamment dans les domaines de la santé, des sciences, des technologies, des affaires et des arts libéraux. [snip]
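The prompt uses `[LANGUAGE]` as a placeholder that is replaced once per target language. A minimal sketch of that substitution is shown below; the function name `fill_prompt` and the three-language list are assumptions made for illustration, and the snippet does not call any AI service.

```python
# Prompt text copied from the sample prompt, with [LANGUAGE] as the placeholder
PROMPT_TEMPLATE = (
    "Using this text and other available resources/information, explain to "
    "[LANGUAGE] speakers who are not familiar with the American higher education "
    "why they should go to a community college. Use [LANGUAGE]."
)


def fill_prompt(language, template=PROMPT_TEMPLATE):
    """Replace every [LANGUAGE] placeholder with the target language name."""
    return template.replace("[LANGUAGE]", language)


# Illustrative subset of the target languages
languages = ["French", "Albanian", "Nepali"]
prompts = {lang: fill_prompt(lang) for lang in languages}

print(prompts["French"])
```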
The outputs below were generated with the prompt below using ChatGPT 3.5. While the translation quality is comparable to Gemini's, a notable addition is audio generation through OpenAI's text-to-speech model (Whisper, by contrast, is OpenAI's speech-to-text model). The generated speech sounds natural, although it struggles with numbers and English loanwords. Generating an audio file costs a fraction of a cent.
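The cost figure can be sanity-checked with simple arithmetic when text-to-speech is billed per character. The sketch below assumes a hypothetical rate of $15 per one million characters and a 600-character document; both numbers are illustrative assumptions, not figures from this project, and current pricing should be checked before relying on them.

```python
def tts_cost_usd(text, usd_per_million_chars=15.0):
    """Estimate the text-to-speech cost for a text billed per character.

    The default rate is an assumed, illustrative price, not a quoted one.
    """
    return len(text) / 1_000_000 * usd_per_million_chars


# Stand-in for a short introduction document of about 600 characters
sample = "x" * 600
cost = tts_cost_usd(sample)

print(f"{cost * 100:.2f} cents")
```

Under these assumptions the estimate stays below one cent per audio file, which is consistent with the observation above.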
Sample Prompt
Using this text and other available resources/information, explain to [LANGUAGE] speakers who are not familiar with the American higher education why they should go to a community college. Use [LANGUAGE]. [+AUDIO GENERATION OPTION]

Sample output:

Title: Përkthim LaGuardia Community College, Shpjegim Përfitimesh Komunitar LaGuardia Community College I themeluar në vitin 1971 në Long Island City, Queens,... [snip]
- Albanian | Albanian (audio)
- Arabic | Arabic (audio)
- Bengali | Bengali (audio)
- Burmese | Burmese (audio)
- English | English (audio)
- French | French (audio)
- Haitian Creole | Haitian Creole (audio)
- Hindi | Hindi (audio)
- Igbo | Igbo (audio)
- Japanese | Japanese (audio)
- Korean | Korean (audio)
- Mandarin Chinese | Mandarin Chinese (audio)
- Nepali | Nepali (audio)
- Pilipino | Pilipino (audio)
- Polish | Polish (audio)
- Portuguese | Portuguese (audio)
- Punjabi | Punjabi (audio)
- Russian | Russian (audio)
- Spanish | Spanish (audio)
- Tagalog | Tagalog (audio)
- Thai | Thai (audio)
- Tibetan | Tibetan (audio)
- Urdu | Urdu (audio)
- Uzbek | Uzbek (audio)
ChatGPT-4o was introduced in May 2024. The outputs below were generated with the prompt below using ChatGPT-4o. Both the translation and the text-to-speech quality improved significantly. The text is no longer a direct translation of the original document and reads fluently and naturally in the target languages. In addition, the token-based cost was significantly reduced, allowing users to add more text (context) to enrich the output.
Sample Prompt
Using this text and other available resources/information, explain to [LANGUAGE] speakers who are not familiar with the American higher education why they should go to a community college. Use [LANGUAGE]. [+AUDIO GENERATION OPTION]

Sample output:

Për ata që nuk janë të njohur me sistemin arsimor të lartë në Amerikë, zgjedhja për të ndjekur studimet në një kolegj komuniteti si LaGuardia Community College mund të duket një opsion shumë tërheqës. Ja disa arsye pse një student nga Shqipëria apo rajoni duhet të konsiderojë këtë mundësi:... [snip]
- Albanian | Albanian
- Arabic | Arabic
- Bengali | Bengali
- Burmese | Burmese
- English | English
- French | French
- Haitian Creole | Haitian Creole
- Hindi | Hindi
- Igbo | Igbo
- Japanese | Japanese
- Korean | Korean
- Mandarin Chinese | Mandarin Chinese
- Nepali | Nepali
- Pilipino | Pilipino
- Polish | Polish
- Portuguese | Portuguese
- Punjabi | Punjabi
- Russian | Russian
- Spanish | Spanish
- Tagalog | Tagalog
- Thai | Thai
- Tibetan | Tibetan
- Urdu | Urdu
- Uzbek | Uzbek
GPT-5 was introduced in August 2025 and was claimed to have doctoral-level reasoning skills. The model relies on "reasoning," in which it iterates over intermediate questions and responses until it reaches a highly refined answer. The most advanced reasoning setting is expensive and slow: a single very difficult question can take about 10–20 minutes and cost about $10 through the API. The issue of hallucinations is largely mitigated by this reasoning model and its live web-search function. OpenAI and its major competitors (Gemini and Claude) have shifted their focus to image and video creation, and TTS (text-to-speech) has seen little progress since 2024. At OpenAI, the latest TTS model remains gpt-4o-mini-tts. A few newer AI companies (such as ElevenLabs) offer high-quality text-to-speech services, but they are usually very expensive.
Another major advancement in LLMs is their ability to code. Until GPT-4o, we needed to write code ourselves (in Python or a similar language) to run a large number of queries. With GPT-5, the LLM can generate the API code itself, so we no longer need to write a program at all. The prompts for this section show a two-step process to reflect this change. The prompt also asks for more than a simple translation so that the model can show its abilities beyond translation.
The quality of translation plateaued with GPT-4o, and there has been little room for improvement since then. Therefore, the prompt has been revised to request a few additional pieces of information in the response. The quality of the responses is impressive, and it is even a little scary to see what a large language model can accomplish now.
Sample Prompts
Please write a Python script to run an API command to translate a document into the following languages. "Albanian" "Arabic" "Bengali" "Burmese" "Traditional Chinese" "Simplified Chinese" "French" "Haitian Creole" "Hindi" "Igbo" "Japanese" "Korean" "Nepali" "Pilipino" "Polish" "Portuguese" "Punjabi" "Russian" "Spanish" "Tagalog" "Thai" "Tibetan" "Urdu" "Uzbek"

Sample output:

Below is a clean, self-contained Python script that does exactly what you described, following your usual CLI/script style:

    #!/usr/bin/env python3.12
    # -*- coding: utf-8 -*-
    from __future__ import annotations
    import argparse
    import subprocess
    import sys
    from pathlib import Path
    from datetime import date
    ... [snip]
    for lang in "Albanian" "Arabic" "Bengali" "Burmese" "Traditional Chinese" "Simplified Chinese" "French" "Haitian Creole" "Hindi" "Igbo" "Japanese" "Korean" "Nepali" "Pilipino" "Polish" "Portuguese" "Punjabi" "Russian" "Spanish" "Tagalog" "Thai" "Tibetan" "Urdu" "Uzbek"; do
      python_script.py --audio --websearch --reasoning "medium" -p "Translate the following document into $lang. Preserve the original style and produce a faithful, equivalent translation in $lang. After completing the translation, explain to $lang speakers who are not familiar with the American higher-education system why attending a community college can be beneficial. Tailor this explanation to the specific circumstances of the $lang-speaking community in New York City and keep it within a few paragraphs. Write in $lang. Generate only output and do not introduce any additional commentary before the output." --file english.txt
    done

Sample output:

LaGuardia Community College E themeluar në vitin 1971 në Long Island City, Queens, LaGuardia Community College është një nga shtatë kolegjet komunitare të Universitetit të Qytetit të Nju Jorkut (CUNY). ... [snip]
- Albanian | Albanian
- Arabic | Arabic
- Bengali | Bengali
- Burmese | Burmese
- English | English
- French | French
- Haitian Creole | Haitian Creole
- Hindi | Hindi
- Igbo | Igbo
- Japanese | Japanese
- Korean | Korean
- Mandarin Chinese | Mandarin Chinese
- Nepali | Nepali
- Pilipino | Pilipino
- Polish | Polish
- Portuguese | Portuguese
- Punjabi | Punjabi
- Russian | Russian
- Spanish | Spanish
- Tagalog | Tagalog
- Thai | Thai
- Tibetan | Tibetan
- Urdu | Urdu
- Uzbek | Uzbek
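The shell loop in the sample prompts runs the generated script once per language, substituting the language name into the prompt each time. The sketch below mirrors that loop in Python by building one argument list per language without executing anything. The flags (`--audio`, `--websearch`, `--reasoning`, `-p`, `--file`) and the file names are taken from the loop above; the helper name `build_jobs` is an assumption, and the behavior of `python_script.py` itself is not shown on this page.

```python
LANGUAGES = [
    "Albanian", "Arabic", "Bengali", "Burmese", "Traditional Chinese",
    "Simplified Chinese", "French", "Haitian Creole", "Hindi", "Igbo",
    "Japanese", "Korean", "Nepali", "Pilipino", "Polish", "Portuguese",
    "Punjabi", "Russian", "Spanish", "Tagalog", "Thai", "Tibetan",
    "Urdu", "Uzbek",
]

# Prompt text from the shell loop, with {lang} in place of $lang
PROMPT = (
    "Translate the following document into {lang}. Preserve the original style "
    "and produce a faithful, equivalent translation in {lang}. After completing "
    "the translation, explain to {lang} speakers who are not familiar with the "
    "American higher-education system why attending a community college can be "
    "beneficial. Tailor this explanation to the specific circumstances of the "
    "{lang}-speaking community in New York City and keep it within a few "
    "paragraphs. Write in {lang}. Generate only output and do not introduce any "
    "additional commentary before the output."
)


def build_jobs(languages, script="python_script.py", source="english.txt"):
    """Return one argument list per language, matching the shell loop's flags."""
    jobs = []
    for lang in languages:
        jobs.append([
            script, "--audio", "--websearch", "--reasoning", "medium",
            "-p", PROMPT.format(lang=lang), "--file", source,
        ])
    return jobs


jobs = build_jobs(LANGUAGES)
print(len(jobs), "jobs prepared; first target:", LANGUAGES[0])
```

Each argument list could then be passed to `subprocess.run` to launch the script, which avoids shell quoting problems with the long prompt text.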