Cleaning up translation/localization .po files?

Hi all,

Hopefully this is a quick question: when running our localization workflow (which includes running GatherText), our output .po files seem to have stale entries in them. For example, if we fix a typo in a text field of a UI widget and run GatherText, the .po file contains both the old text + translation and the new text + translation. Obviously we no longer want the pre-fix text in there. This happens even after deleting all the .po files, etc. Is this a known issue with a known fix? Is there maybe some longer/slower version of GatherText we can run that looks for no-longer-referenced text like this and cleans it up somehow? Thanks as always for any help!

-Matt

If you use Blueprints, it seems that they remember all the text they ever had … I had a Blueprint B included in another Blueprint A, and when I edited the text in B, the gatherer found the old text in A … The only workaround I found was to rename the parent widget in A (which was a widget switcher).

Interesting. I’m glad to see someone else has at least seen this behavior. In the example we just checked, the text was (is) in a child widget of another widget, but there isn’t a widget switcher in this particular bit of UI. Still, the parent somehow keeping a reference to the child’s text might be relevant here. I guess we’ll see (thanks for the quick reply, though, it already gives me some clues of where to look next…)

Ah, re-reading the reply, I see that maybe it doesn’t matter what the parent is, but that renaming it may fix this. We’ll give that a try too.

There used to be a few issues with the package cache of texts prior to 4.11, one of which was that we were including transient data in the cache, which meant that a UMG asset would also gather the text from any UMG asset widgets nested within it, causing issues when the inner UMG asset was changed without re-saving the outer UMG asset.
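To make the failure mode concrete, here is a toy model of it, not engine code. All of the names (FToyPackage, GatherFromCache, MakeScenario) and the sample strings are made up for illustration. It assumes the gatherer simply trusts whatever text each package cached in its header at save time:

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a saved package. Not an engine type.
struct FToyPackage
{
    std::string Name;
    // Snapshot of gatherable text taken when the asset was last saved.
    std::vector<std::string> CachedHeaderText;
};

// Gather by trusting each package's cached header instead of loading the asset.
std::set<std::string> GatherFromCache(const std::vector<FToyPackage>& Packages)
{
    std::set<std::string> Result;
    for (const FToyPackage& Package : Packages)
    {
        Result.insert(Package.CachedHeaderText.begin(), Package.CachedHeaderText.end());
    }
    return Result;
}

// Inner widget had the typo "Strat Game" fixed to "Start Game" and was re-saved.
// bOuterResavedWithoutNestedText == false: the outer widget was last saved while
// the typo existed, and its cache also (wrongly) holds the nested widget's text.
// bOuterResavedWithoutNestedText == true: the outer widget was re-saved and its
// cache holds only its own text.
std::vector<FToyPackage> MakeScenario(bool bOuterResavedWithoutNestedText)
{
    FToyPackage Inner{"InnerWidget", {"Start Game"}};
    FToyPackage Outer{"OuterWidget", {"Main Menu"}};
    if (!bOuterResavedWithoutNestedText)
    {
        Outer.CachedHeaderText.push_back("Strat Game"); // stale copy of the inner text
    }
    return {Outer, Inner};
}
```

With MakeScenario(false), GatherFromCache returns both "Strat Game" and "Start Game", which matches the old + new entries showing up in the .po file. With MakeScenario(true), only the corrected text is gathered.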

It is possible to disable the caching mechanism; however, it requires a code change. If you open up GatherTextFromAssetsCommandlet.cpp and search for PackageFileSummary.GetFileVersionUE4() >= VER_UE4_SERIALIZE_TEXT_IN_PACKAGES, you’ll see the code that’s trying to use the cached package data. What you’ll want to do is comment out that if block so that we ignore the cache from the package and always load up the package and gather from it fresh. The end result should look something like this:

/*
 // Package has gatherable text data in its header, process immediately.
 if (PackageFileSummary.GetFileVersionUE4() >= VER_UE4_SERIALIZE_TEXT_IN_PACKAGES)
 {
     TArray<FGatherableTextData> GatherableTextDataArray;
 
     if (PackageFileSummary.GatherableTextDataOffset > 0)
     {
         FileReader->Seek(PackageFileSummary.GatherableTextDataOffset);
 
         for(int32 i = 0; i < PackageFileSummary.GatherableTextDataCount; ++i)
         {
             FGatherableTextData* GatherableTextData = new(GatherableTextDataArray) FGatherableTextData;
             (*FileReader) << *GatherableTextData;
         }
 
         ProcessGatherableTextDataArray(PackageFile, GatherableTextDataArray);
     }
 }
 // Package not resaved since gatherable text data was added to package headers, defer processing until after loading.
 else */if (PackageFileSummary.PackageFlags & PKG_RequiresLocalizationGather || PackageFileSummary.GetFileVersionUE4() < VER_UE4_PACKAGE_REQUIRES_LOCALIZATION_GATHER_FLAGGING)
 {
     PackageFileNamesToLoad.Add(PackageFile);
 }
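In case the placement of the closing */ looks odd: ending the comment right after the else means the remaining else-if becomes a plain if, so the rest of the chain still compiles unchanged. A stripped-down sketch of the same trick follows. The function and parameter names are hypothetical, and only the comment placement mirrors the snippet above:

```cpp
#include <cassert>
#include <string>

// Hypothetical toy function, not engine code.
std::string ChooseGatherPath(bool bHasCachedText, bool bNeedsGather)
{
    (void)bHasCachedText; // unused now that the cached branch is commented out

    /*
    // Fast path: trust the cached data.
    if (bHasCachedText)
    {
        return "cache";
    }
    // Otherwise fall through to loading.
    else */if (bNeedsGather)
    {
        return "load";
    }
    return "skip";
}
```

Here ChooseGatherPath(true, true) takes the "load" branch even though cached text is available, which is the behavior the change above is after.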

Hopefully this issue will be resolved once 4.11 ships (we no longer gather transient data); however, it will require a re-save of all of your UMG assets in order to update the cache in the package header.

Nice to see that this behavior will be fixed soon!