Translate Text using Python

[This article was first published on PyShark, and kindly contributed to python-bloggers].

In this article we will discuss how to translate text with Google Translate from Python, using the unofficial googletrans library.

Table of contents

  • Introduction
  • Basic usage
  • Specifying source and destination languages
  • Conclusion

Introduction

We have all used an online translator or looked up a word in a dictionary. Several well-maintained online translators are available, and the best known of them is Google Translate.

Most of us are already familiar with the Google Translate service, and some of us who use Google Chrome have its extension installed as well. The service is free to use and translates text between over 100 different languages.

If you made it to this page, you are probably curious how to automate translation using Python.

To continue following this tutorial we will need the following Python library: googletrans.

If you don’t have it installed, please open “Command Prompt” (on Windows) or a terminal and install it using the following command:

pip install googletrans
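Since googletrans is an unofficial client, its behaviour can differ between releases, so it helps to know which version pip installed. Below is a minimal sketch that uses only the standard library. The helper name installed_version is our own and is not part of googletrans:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed googletrans version, or None if it is not installed.
print(installed_version("googletrans"))
```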

Basic usage

To get started we will need to import the Translator class and create an instance of it:

from googletrans import Translator

translator = Translator()

Now let’s see how to use it.

By default the translator object detects the source language automatically. To put this into perspective, let’s give it a phrase in Russian and see what happens:

myword = 'Здравствуйте!'

print(translator.translate(myword))

And we get:

Translated(src=ru, dest=en, text=Hello!, pronunciation=None, extra_data="{'translat...")

Okay, so far we get a valid result, but the format may not be what we want. What’s important here is that the translator auto-detected that we are translating from Russian (src=ru). The destination language is English (dest=en) because that is the default value of the dest parameter when we don’t specify one.

And the main part we want is the actual text of the translation, which is correctly translated as “Hello!” (text=Hello!). Now, if we want to only retrieve the text of the translation, we can simply do:

print(translator.translate(myword).text)

To get:

Hello!

The question you may have is: how confident is the translator that the source language is Russian? Good question! The library provides a detect() method that reports the detected language together with a confidence level. Let’s see how it works:

print(translator.detect(myword))

And we see:

Detected(lang=ru, confidence=0.98046875)

The translator is 98% confident that the source language is Russian (good enough for my case).
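If "good enough" needs to be decided in code, the confidence value can be compared against a threshold. The sketch below assumes that detect() returns an object with lang and confidence attributes, as in the output above. The helper name detect_language and the 0.9 threshold are our own illustrative choices, not library settings:

```python
def detect_language(translator, text, min_confidence=0.9):
    """Return the detected language code, or None when confidence is too low.

    `translator` is any object with a googletrans-style detect() method.
    `min_confidence` is an illustrative threshold, not a library setting.
    """
    detected = translator.detect(text)
    confidence = detected.confidence
    # Treat a missing confidence value as "not confident enough".
    if confidence is None or confidence < min_confidence:
        return None
    return detected.lang
```

With the translator and myword from above, detect_language(translator, myword) would be called the same way as translator.detect(myword), but it returns only the language code, or None when the detection is too uncertain to rely on.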

Language auto detection is a very useful option and it definitely speeds things up. But what if we wanted to translate Russian into Spanish? The next section will explain exactly how to do it.


Specifying source and destination languages

Let’s see what we can do if we want to specify both source and destination languages for our translation.

First, we need to make sure that the languages we want are supported. The library covers the vast majority of commonly used languages, but if you would like to check which ones are supported, you can execute the following code:

import googletrans

for key, value in googletrans.LANGUAGES.items():
    print(key, ':', value)

The list of all supported languages:

af : afrikaans
sq : albanian
am : amharic
ar : arabic
hy : armenian
az : azerbaijani
eu : basque
be : belarusian
bn : bengali
bs : bosnian
bg : bulgarian
ca : catalan
ceb : cebuano
ny : chichewa
zh-cn : chinese (simplified)
zh-tw : chinese (traditional)
co : corsican
hr : croatian
cs : czech
da : danish
nl : dutch
en : english
eo : esperanto
et : estonian
tl : filipino
fi : finnish
fr : french
fy : frisian
gl : galician
ka : georgian
de : german
el : greek
gu : gujarati
ht : haitian creole
ha : hausa
haw : hawaiian
iw : hebrew
he : hebrew
hi : hindi
hmn : hmong
hu : hungarian
is : icelandic
ig : igbo
id : indonesian
ga : irish
it : italian
ja : japanese
jw : javanese
kn : kannada
kk : kazakh
km : khmer
ko : korean
ku : kurdish (kurmanji)
ky : kyrgyz
lo : lao
la : latin
lv : latvian
lt : lithuanian
lb : luxembourgish
mk : macedonian
mg : malagasy
ms : malay
ml : malayalam
mt : maltese
mi : maori
mr : marathi
mn : mongolian
my : myanmar (burmese)
ne : nepali
no : norwegian
or : odia
ps : pashto
fa : persian
pl : polish
pt : portuguese
pa : punjabi
ro : romanian
ru : russian
sm : samoan
gd : scots gaelic
sr : serbian
st : sesotho
sn : shona
sd : sindhi
si : sinhala
sk : slovak
sl : slovenian
so : somali
es : spanish
su : sundanese
sw : swahili
sv : swedish
tg : tajik
ta : tamil
te : telugu
th : thai
tr : turkish
uk : ukrainian
ur : urdu
ug : uyghur
uz : uzbek
vi : vietnamese
cy : welsh
xh : xhosa
yi : yiddish
yo : yoruba
zu : zulu

From the above list we know for sure that Russian and Spanish are supported. The next step is to figure out how to use them.

The code will be very similar to the one used in the previous section with auto detection, with one addition: now we will specify both source and destination languages:

print(translator.translate(myword, src='ru', dest='es').text)

And we get:

¡Hola!
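To catch a mistyped language code before a request is sent, the call above can be wrapped in a small check against the supported-languages mapping. This is only a sketch: translate_between is a hypothetical helper of our own, and it assumes that translate() returns the result directly, as in the examples in this article:

```python
def translate_between(translator, text, src, dest, languages):
    """Translate `text` from `src` to `dest` after validating both codes.

    `languages` is a mapping of language codes to names, such as
    googletrans.LANGUAGES. This helper is illustrative, not part of googletrans.
    """
    for code in (src, dest):
        if code not in languages:
            raise ValueError(f"Unsupported language code: {code!r}")
    return translator.translate(text, src=src, dest=dest).text
```

It would be called as translate_between(translator, myword, 'ru', 'es', googletrans.LANGUAGES), and it raises a ValueError for a code that is not in the mapping.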

Note that when specifying languages, we don’t use the actual language names (Russian, Spanish) but rather their codes (ru, es). The complete list of codes is shown above, where we printed all of the supported languages as key-value pairs.
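If we only know a language by name, the code can be looked up from the same mapping of codes to names. Here is a minimal sketch, where code_for is a hypothetical helper of our own and languages is expected to be a mapping shaped like googletrans.LANGUAGES (code as key, lowercase name as value):

```python
def code_for(name, languages):
    """Return the first language code whose name matches `name`, ignoring case.

    `languages` maps codes to lowercase names, like googletrans.LANGUAGES.
    Raises KeyError when no language with that name is found.
    """
    wanted = name.strip().lower()
    for code, language_name in languages.items():
        if language_name == wanted:
            return code
    raise KeyError(f"No language named {name!r}")
```

Some names appear under more than one code in the list above (Hebrew is listed as both iw and he). In that case this sketch returns whichever code comes first in the mapping.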


Conclusion

In this article we discussed how to translate text with Google Translate from Python, using the googletrans library.

By working through this code, you should be able to extend it to translate full texts, lists, entries in dictionaries, and so on.
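As a starting point for lists and dictionaries, here is a minimal sketch that calls translate() once per string. The helper names translate_many and translate_values are our own, and the code assumes that translate() returns the result directly, as in the examples in this article:

```python
def translate_many(translator, texts, src='auto', dest='en'):
    """Translate each string in `texts` and return the translations in order."""
    return [translator.translate(text, src=src, dest=dest).text for text in texts]


def translate_values(translator, mapping, src='auto', dest='en'):
    """Return a new dict with the same keys and translated string values."""
    return {
        key: translator.translate(value, src=src, dest=dest).text
        for key, value in mapping.items()
    }
```

For example, translate_many(translator, ['Здравствуйте!'], src='ru', dest='es') would return a list with a single translated string. Keep in mind that this makes one request per item, so long inputs will take a while.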

I also encourage you to check out my other posts on Python Programming.

Feel free to leave comments below if you have any questions or have suggestions for some edits.

The post Translate Text using Python appeared first on PyShark.
