Translate Text using Python
In this article we will discuss how to translate text with Google Translate using Python and the googletrans library.
Table of contents
- Introduction
- Basic usage
- Specifying source and destination languages
- Conclusion
Introduction
We have all used an online translator or looked up a word in a dictionary. There are many online translators available, and they are generally well maintained. The best known of them is Google Translate.
Most of us are already familiar with the Google Translate service, and some of us who use Google Chrome have its extension installed as well. The service is free to use on the web and translates text between more than 100 languages.
If you made it to this page, you are probably curious how to automate translation using Python.
To continue following this tutorial we will need the following Python library: googletrans.
If you don’t have it installed, please open “Command Prompt” (on Windows) or a terminal and install it using the following command:
pip install googletrans
Basic usage
To get started we will need to import the Translator class and create an instance of it:
from googletrans import Translator
translator = Translator()
Now let’s see how we can quickly get to use it.
By default the translator object detects the source language automatically. To put this into perspective, let’s give it a phrase in Russian and see what happens:
myword = 'Здравствуйте!'
print(translator.translate(myword))
And we get:
Translated(src=ru, dest=en, text=Hello!, pronunciation=None, extra_data="{'translat...")
So far we get a valid result, but the format may not be what we want. What’s important here is that the translator auto-detected the source language as Russian (src=ru). The destination language is English (dest=en) because that is the library’s default when no destination is specified.
The main part we want is the actual text of the translation, which is correctly translated as “Hello!” (text=Hello!). If we want to retrieve only the text of the translation, we can simply do:
print(translator.translate(myword).text)
To get:
Hello!
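The other fields shown in the printed result (src, dest, pronunciation) can be read as attributes in the same way as text. Below is a minimal sketch of a helper that builds a short summary from them. The helper name describe_translation and the stand-in sample object are assumptions made for illustration, so the snippet runs without a network connection; in practice you would pass the object returned by translator.translate(myword).

```python
from types import SimpleNamespace


def describe_translation(result):
    """Return a short 'src -> dest: text' summary of a translation result."""
    return f"{result.src} -> {result.dest}: {result.text}"


# Offline stand-in carrying the same attributes as the result printed above
# (a hypothetical object, not one produced by googletrans).
sample = SimpleNamespace(src="ru", dest="en", text="Hello!", pronunciation=None)

print(describe_translation(sample))  # ru -> en: Hello!
```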
A question you may have is how confident the translator is that the source language is Russian. Good question! The library provides a detect method that reports the detected language together with a confidence level. Let’s see how it works:
print(translator.detect(myword))
And we see:
Detected(lang=ru, confidence=0.98046875)
The translator is 98% confident that the source language is Russian (good enough for my case).
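If “good enough” needs to be decided by code, the lang and confidence fields can be compared against a threshold. The sketch below is one possible way to do it. The helper name is_confident and the default threshold of 0.9 are assumptions, and the sample object is a stand-in for the value returned by translator.detect(myword). Handling a missing confidence value is a precaution only, since I am not certain every library version reports a score.

```python
from types import SimpleNamespace


def is_confident(detection, expected_lang, min_confidence=0.9):
    """Return True when the detected language matches and the score is high enough."""
    # Precaution (assumption): treat a missing confidence score as "not confident".
    if detection.confidence is None:
        return False
    return detection.lang == expected_lang and detection.confidence >= min_confidence


# Offline stand-in with the values from the output shown above.
sample = SimpleNamespace(lang="ru", confidence=0.98046875)

print(is_confident(sample, "ru"))                       # True
print(is_confident(sample, "ru", min_confidence=0.99))  # False
```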
Language auto detection is a very useful option and it definitely speeds things up. But what if we wanted to translate Russian into Spanish? The next section will explain exactly how to do it.
Specifying source and destination languages
Let’s see what we can do if we want to specify both source and destination languages for our translation.
First, we need to make sure that the languages we want are supported. For most common languages the library will have everything required. But if you would like to check which languages are supported, you can execute the following code:
import googletrans

for key, value in googletrans.LANGUAGES.items():
    print(key, ':', value)
The list of all supported languages:
af : afrikaans
sq : albanian
am : amharic
ar : arabic
hy : armenian
az : azerbaijani
eu : basque
be : belarusian
bn : bengali
bs : bosnian
bg : bulgarian
ca : catalan
ceb : cebuano
ny : chichewa
zh-cn : chinese (simplified)
zh-tw : chinese (traditional)
co : corsican
hr : croatian
cs : czech
da : danish
nl : dutch
en : english
eo : esperanto
et : estonian
tl : filipino
fi : finnish
fr : french
fy : frisian
gl : galician
ka : georgian
de : german
el : greek
gu : gujarati
ht : haitian creole
ha : hausa
haw : hawaiian
iw : hebrew
he : hebrew
hi : hindi
hmn : hmong
hu : hungarian
is : icelandic
ig : igbo
id : indonesian
ga : irish
it : italian
ja : japanese
jw : javanese
kn : kannada
kk : kazakh
km : khmer
ko : korean
ku : kurdish (kurmanji)
ky : kyrgyz
lo : lao
la : latin
lv : latvian
lt : lithuanian
lb : luxembourgish
mk : macedonian
mg : malagasy
ms : malay
ml : malayalam
mt : maltese
mi : maori
mr : marathi
mn : mongolian
my : myanmar (burmese)
ne : nepali
no : norwegian
or : odia
ps : pashto
fa : persian
pl : polish
pt : portuguese
pa : punjabi
ro : romanian
ru : russian
sm : samoan
gd : scots gaelic
sr : serbian
st : sesotho
sn : shona
sd : sindhi
si : sinhala
sk : slovak
sl : slovenian
so : somali
es : spanish
su : sundanese
sw : swahili
sv : swedish
tg : tajik
ta : tamil
te : telugu
th : thai
tr : turkish
uk : ukrainian
ur : urdu
ug : uyghur
uz : uzbek
vi : vietnamese
cy : welsh
xh : xhosa
yi : yiddish
yo : yoruba
zu : zulu
From the above list we know for sure that Russian and Spanish are supported. The next step is to figure out how to use them.
The code is very similar to the one used in the previous section with auto detection. The only addition is that we now specify both the source and destination languages:
print(translator.translate(myword, src='ru', dest='es').text)
And we get:
¡Hola!
Note that when specifying languages, we don’t use the actual language names (Russian, Spanish) but rather their codes (ru, es). The complete list of codes is shown above, where we printed all of the supported languages as key-value pairs.
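Since the library expects codes rather than names, a small lookup can turn either form into a code before calling translate. The following is a minimal sketch: resolve_language is a hypothetical helper, and SAMPLE_LANGUAGES is a three-entry excerpt of the list printed above so the example runs on its own. In practice googletrans.LANGUAGES could be passed in instead.

```python
# Excerpt of the code -> name mapping printed above (illustrative subset).
SAMPLE_LANGUAGES = {"en": "english", "ru": "russian", "es": "spanish"}


def resolve_language(value, languages):
    """Accept a code ('es') or a name ('Spanish') and return the matching code."""
    key = value.strip().lower()
    if key in languages:
        return key
    # Fall back to searching by language name.
    for code, name in languages.items():
        if name == key:
            return code
    raise ValueError(f"Unsupported language: {value!r}")


print(resolve_language("Spanish", SAMPLE_LANGUAGES))  # es
print(resolve_language("ru", SAMPLE_LANGUAGES))       # ru
```

The returned codes can then be used as the src and dest arguments, and an unsupported name fails early with a ValueError instead of during the translation call.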
Conclusion
In this article we discussed how to translate text with Google Translate using Python and the googletrans library.
By working through this code, you should be able to scale it to translating full texts, lists, entries in dictionaries, and so on.
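As one possible starting point for dictionary entries, the sketch below translates every value of a dictionary and keeps the keys unchanged. It assumes a translator whose translate method is called synchronously and returns an object with a text attribute, as in the examples in this article. The names translate_values and FakeTranslator are hypothetical; FakeTranslator is only an offline stand-in so the snippet runs without network access.

```python
from types import SimpleNamespace


def translate_values(translator, entries, src, dest):
    """Return a new dict with every value translated; keys are kept as-is."""
    return {
        key: translator.translate(value, src=src, dest=dest).text
        for key, value in entries.items()
    }


class FakeTranslator:
    """Offline stand-in (hypothetical) that mimics translator.translate()."""

    _table = {"Здравствуйте!": "¡Hola!"}

    def translate(self, text, src="auto", dest="en"):
        # Unknown phrases are returned unchanged in this stand-in.
        return SimpleNamespace(src=src, dest=dest, text=self._table.get(text, text))


greetings = {"formal": "Здравствуйте!"}
print(translate_values(FakeTranslator(), greetings, src="ru", dest="es"))
# {'formal': '¡Hola!'}
```

To use it for real, pass the Translator() instance created earlier in place of FakeTranslator(). The same pattern works for a list by iterating over its items instead of the dictionary values.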
I also encourage you to check out my other posts on Python Programming.
Feel free to leave comments below if you have any questions or have suggestions for some edits.
The post Translate Text using Python appeared first on PyShark.