Translate Text from Pictures using Azure

March 09, 2021

Computer Vision is an AI service, part of Azure Cognitive Services, that analyzes content in images and video.
Translator in Azure is an AI service, also part of Azure Cognitive Services, used for real-time text translation and language detection. It is fast and easy to implement, and it brings intelligence to your text processing projects.


Create

In this post we will build upon two older posts: the post about Computer Vision and the post about Azure Translator. Follow the Create steps of both of these posts to create your Azure resources.


Implement

Create a new C# console project in Visual Studio or open an existing one. Follow the previous posts to install the Microsoft.Azure.CognitiveServices.Vision.ComputerVision and Newtonsoft.Json NuGet packages.

Open the class in which you want to implement the image analyser and translator. For a new project you can use Program.cs.
Add the using statements you see below at the top of the file.

      using System;
      using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
      using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
      using System.Collections.Generic;
      using System.Threading.Tasks;
      using System.Threading;
      using System.Linq;
      using System.Net.Http;
      using System.Text;
      using Newtonsoft.Json;

Input your Subscription Keys, Endpoints and Location (the Location is needed only for the Translator resource). You can see where to find them in the previous posts. The READ_TEXT_URL_IMAGE string, the last line of the block below, should contain the URL of the image you wish to analyse.

// Add your Computer Vision subscription key and endpoint
private static readonly string ComputerVisionsubScriptionKey = "1e6cd418eKEY_HERE450704d3e63c";
private static readonly string ComputerVisionEndpoint = "https://compvisiondemobinarygrounds.cognitiveservices.azure.com/";
private static readonly string TranslatorSubscriptionKey = "5b50844fKEY_HERE8be8f6f8f40f7";
private static readonly string TranslatorEndpoint = "https://api.cognitive.microsofttranslator.com/";
private static readonly string TranslatorLocation = "eastus2";

// URL of the image to analyze
private const string READ_TEXT_URL_IMAGE = "";
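
Hardcoded keys are fine for a quick demo, but they can also be read from environment variables so that they stay out of source control. The following is only a minimal sketch: the Settings class, the GetRequired method and any variable names you pass to it (for example TRANSLATOR_KEY) are hypothetical and not part of the original posts.

```csharp
using System;

public static class Settings
{
    // Returns the value of an environment variable, or throws a clear
    // error when it is missing or empty. The variable names passed in
    // are hypothetical; pick whatever names suit your setup.
    public static string GetRequired(string variableName)
    {
        string value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
        {
            throw new InvalidOperationException(
                $"Environment variable '{variableName}' is not set.");
        }
        return value;
    }
}
```

With this in place, a field such as TranslatorSubscriptionKey could be initialised with Settings.GetRequired("TRANSLATOR_KEY") instead of a string literal.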

Replace your Main function with the following code. Your Main should be asynchronous because it needs to wait for all the asynchronous functions to return their results before exiting. Do not worry about the missing functions; we will create them next.
You can change the target language by changing the “&to=de” part of the route string on line 9. The Translator documentation lists the supported languages along with their codes.

static async Task Main(string[] args)
{
    // Create a client
    ComputerVisionClient client = Authenticate(ComputerVisionEndpoint, ComputerVisionsubScriptionKey);

    var analisedText = await ReadFileUrl(client, READ_TEXT_URL_IMAGE);

    // Output languages are defined as parameters, input language detected.
    string route = "/translate?api-version=3.0&to=de";
    string textToTranslate = analisedText;
    object[] body = new object[] { new { Text = textToTranslate } };
    var requestBody = JsonConvert.SerializeObject(body);

    using (var httpClient = new HttpClient())
    using (var request = new HttpRequestMessage())
    {
        // Build the request. TrimEnd avoids a double slash between endpoint and route.
        request.Method = HttpMethod.Post;
        request.RequestUri = new Uri(TranslatorEndpoint.TrimEnd('/') + route);
        request.Content = new StringContent(requestBody, Encoding.UTF8, "application/json");
        request.Headers.Add("Ocp-Apim-Subscription-Key", TranslatorSubscriptionKey);
        request.Headers.Add("Ocp-Apim-Subscription-Region", TranslatorLocation);

        // Send the request and get response.
        HttpResponseMessage response = await httpClient.SendAsync(request).ConfigureAwait(false);
        // Read response as a string.
        string resultJson = await response.Content.ReadAsStringAsync();

        // On failure the service returns an error object instead of the expected array.
        if (!response.IsSuccessStatusCode)
        {
            Console.WriteLine($"Translation request failed ({(int)response.StatusCode}): {resultJson}");
            return;
        }

        try
        {
            List<Rootobject> output = JsonConvert.DeserializeObject<List<Rootobject>>(resultJson);
            // The request contains a single text, so the first element holds its result.
            Rootobject result = output.First();
            Console.WriteLine($"Input Text: {textToTranslate}\nPredicted Language: {result.detectedLanguage.language}\nPredicted Score: {result.detectedLanguage.score}\n\n");
            foreach (Translation obj in result.translations)
                Console.WriteLine($"Translated Language: {obj.to}\nResult: {obj.text}\n\n");
        }
        catch (Exception e)
        {
            Console.WriteLine(e);
        }
    }
}
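
The Translator v3.0 API accepts the to parameter more than once, so a single request can return several translations, and the foreach loop in Main already prints every entry it receives. As a rough sketch, the route string could be built from a list of language codes; the TranslatorRoute class and BuildTranslateRoute method are hypothetical helpers, not part of the original code.

```csharp
using System;
using System.Linq;

public static class TranslatorRoute
{
    // Builds the translate route for one or more target languages by
    // repeating the "to" query parameter once per language code.
    public static string BuildTranslateRoute(params string[] targetLanguages)
    {
        if (targetLanguages == null || targetLanguages.Length == 0)
        {
            throw new ArgumentException(
                "At least one target language is required.", nameof(targetLanguages));
        }

        // Escape each code so it is safe to place in a query string.
        string toParameters = string.Concat(
            targetLanguages.Select(code => "&to=" + Uri.EscapeDataString(code)));
        return "/translate?api-version=3.0" + toParameters;
    }
}
```

For example, BuildTranslateRoute("de", "el") produces "/translate?api-version=3.0&to=de&to=el", which can be assigned to the route variable in Main.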

Create the Authenticate function below your Main.

public static ComputerVisionClient Authenticate(string endpoint, string key)
{
    ComputerVisionClient client =
        new ComputerVisionClient(new ApiKeyServiceClientCredentials(key))
        { Endpoint = endpoint };
    return client;
}

The following function extracts the text from the given picture. Place it under the Authenticate function.

public static async Task<string> ReadFileUrl(ComputerVisionClient client, string urlFile)
{
    Console.WriteLine("Extracted Text:");
    Console.WriteLine();

    // Read text from URL
    var textHeaders = await client.ReadAsync(urlFile, language: "en");
    // After the request, get the operation location (a URL that ends with the operation ID)
    string operationLocation = textHeaders.OperationLocation;
    // We only need the ID and not the full URL
    const int numberOfCharsInOperationId = 36;
    string operationId = operationLocation.Substring(operationLocation.Length - numberOfCharsInOperationId);

    // Poll until the read operation finishes, pausing between requests
    // so that we neither block the thread nor flood the service.
    ReadOperationResult results;
    do
    {
        await Task.Delay(1000);
        results = await client.GetReadResultAsync(Guid.Parse(operationId));
    }
    while ((results.Status == OperationStatusCodes.Running ||
        results.Status == OperationStatusCodes.NotStarted));
    // Display the found text.
    Console.WriteLine();
    var textUrlFileResults = results.AnalyzeResult.ReadResults;
    string output = "";
    foreach (ReadResult page in textUrlFileResults)
    {
        foreach (Line line in page.Lines)
        {
            Console.WriteLine(line.Text);
            output += " " + line.Text;
        }
    }
    Console.WriteLine();
    return output;
}
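
ReadFileUrl takes the last 36 characters of the Operation-Location value as the operation ID. An alternative that does not depend on a fixed length is to parse the URL and take its last path segment. This is a sketch only: the ReadOperation class and ExtractOperationId method are hypothetical, and it assumes the operation ID is the final path segment of the URL.

```csharp
using System;
using System.Linq;

public static class ReadOperation
{
    // Takes the last path segment of the Operation-Location URL and
    // parses it as a GUID. Assumes the URL ends with the operation ID.
    public static Guid ExtractOperationId(string operationLocation)
    {
        var uri = new Uri(operationLocation);
        string lastSegment = uri.Segments.Last().TrimEnd('/');
        return Guid.Parse(lastSegment);
    }
}
```

The returned Guid can be passed straight to client.GetReadResultAsync, which removes the need for the Guid.Parse call inside the polling loop.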

Add these classes to deserialize your JSON. You can place them in separate files, or in the same file under the class you are working on.

public class Rootobject
{
    public Detectedlanguage detectedLanguage { get; set; }
    public List<Translation> translations { get; set; }
}

public class Detectedlanguage
{
    public string language { get; set; }
    public float score { get; set; }
}

public class Translation
{
    public string text { get; set; }
    public string to { get; set; }
}

Now everything should be working as intended. Let’s test our new project!


Test

Place the URL of this picture as an input.


This is the result for German translation.


And that’s how you can translate text from a picture to any supported language you wish!

About Me

Hi, my name is Demetris Bakas and I am a software engineer who loves to write code and be creative. I always find new technologies intriguing and I like to work with other people and be part of a team. My goal is to develop software that people will find useful and that will aid them in their everyday lives.
For any questions, feel free to contact me on social media using the links below.