SaveStringToFile with ö character

I’m trying to read a list of files and save the list to a .txt file. Some of them have an ö in their name, which shows up as ?

I’ve slimmed my code just to check this issue:

FString newText = "AAAA\x00D6 BBBB";
char myword[] = "AAAA\x00D6 BBBB";
FFileHelper::SaveStringToFile(newText, *SaveFilePath, FFileHelper::EEncodingOptions::AutoDetect);

I’ve tried all of the EEncodingOptions and both char and FString. It always saves as “AAAA? BBBB” instead of “AAAAö BBBB”

UPDATE:

So I was able to save the character, but only by using char and Write, not with SaveStringToFile or FString.

If I use char it shows “AAAAÖ BBBB” correctly:

IPlatformFile& PlatformFile = FPlatformFileManager::Get().GetPlatformFile();
IFileHandle* FileHandle = PlatformFile.OpenWrite(*SaveFilePath);
char myword[] = "AAAA\x00D6 BBBB";
FileHandle->Write((const uint8*)myword, 10);
delete FileHandle;

If I try FString it shows “AAAA?” when viewing the file as .txt, although both files are still 10 bytes:

IPlatformFile& PlatformFile = FPlatformFileManager::Get().GetPlatformFile();
IFileHandle* FileHandle = PlatformFile.OpenWrite(*SaveFilePath);
FString newText = "AAAA\x00D6 BBBB";
FileHandle->Write((const uint8*)*newText, 10);
delete FileHandle;

It does seem FString doesn’t support the Ö although the documentation under character encoding says it is “Characters between 32 and 126 inclusive” which would include Ö.

Now I am having this issue in both reading file names, displaying them, and saving a list of files to a .txt file so I would much rather be able to use FString to manipulate these with Unreal functions.

Have you tried debugging to see where it fails?

bool FFileHelper::SaveStringToFile( const FString& String, const TCHAR* Filename,  EEncodingOptions EncodingOptions, IFileManager* FileManager /*= &IFileManager::Get()*/, uint32 WriteFlags )

No, do you know how I would go about doing that?

I guess you are using Visual Studio for your C++ project, right? So you can debug in there. Unreal comes with source code, so when you debug your code you can step into the Engine’s source.

Also, can you confirm that the saved file is a Unicode file? It should be 22 bytes (2 + 10*2).

It is 10 bytes in every case: FString or char, FileHandle->Write or FFileHelper::SaveStringToFile. I found I can get the character to display correctly with FileHandle->Write and char, although it’s not ideal (updated original post).

Also, if I just create a .txt file in Windows and paste in AAAAÖ BBBB, it saves as 10 bytes?

OK, this is what is happening: you are saving a single-byte (ANSI) file, not a Unicode one. That is why you see ‘?’. A UTF-16 file uses 2 bytes per character for text like this. If you save as UTF-8, Ö (U+00D6) should be written as the two bytes 0xC3 0x96. Do you have a local language set up in Windows? On mine, for example, all Russian letters are saved as single bytes in the system code page, not as Unicode.
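
To make the byte counts concrete, here is a rough sketch in plain standard C++ (not Unreal code) that encodes the same ten characters three ways. The function names are made up for this illustration, and the UTF-8 encoder deliberately ignores surrogate pairs, so treat it as a demonstration rather than a real converter.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One byte per character. Only valid for code points up to U+00FF (Latin-1);
// Ö has the same byte value, 0xD6, in Windows-1252.
std::vector<std::uint8_t> EncodeLatin1(const std::u16string& Text)
{
    std::vector<std::uint8_t> Out;
    for (char16_t C : Text)
    {
        assert(C <= 0xFF); // this sketch does not handle anything wider
        Out.push_back(static_cast<std::uint8_t>(C));
    }
    return Out;
}

// UTF-8 without a BOM. Handles code points up to U+FFFF, no surrogate pairs.
std::vector<std::uint8_t> EncodeUtf8(const std::u16string& Text)
{
    std::vector<std::uint8_t> Out;
    for (char16_t C : Text)
    {
        if (C < 0x80)
        {
            Out.push_back(static_cast<std::uint8_t>(C));
        }
        else if (C < 0x800)
        {
            Out.push_back(static_cast<std::uint8_t>(0xC0 | (C >> 6)));
            Out.push_back(static_cast<std::uint8_t>(0x80 | (C & 0x3F)));
        }
        else
        {
            Out.push_back(static_cast<std::uint8_t>(0xE0 | (C >> 12)));
            Out.push_back(static_cast<std::uint8_t>(0x80 | ((C >> 6) & 0x3F)));
            Out.push_back(static_cast<std::uint8_t>(0x80 | (C & 0x3F)));
        }
    }
    return Out;
}

// UTF-16 little-endian with a 2-byte BOM (FF FE), 2 bytes per code unit.
std::vector<std::uint8_t> EncodeUtf16LeWithBom(const std::u16string& Text)
{
    std::vector<std::uint8_t> Out = {0xFF, 0xFE};
    for (char16_t C : Text)
    {
        Out.push_back(static_cast<std::uint8_t>(C & 0xFF));
        Out.push_back(static_cast<std::uint8_t>(C >> 8));
    }
    return Out;
}
```

Feeding u"AAAA\u00D6 BBBB" through these gives 10 bytes for the single-byte form, 11 bytes for UTF-8 (Ö becomes C3 96) and 22 bytes for UTF-16 with BOM, which is where the 2 + 10*2 figure comes from.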

There is a good article about it: Character Encoding | Unreal Engine Documentation

Huh, I’m probably not using Unicode, but I am still a bit confused.

When I create a .txt file in windows I can use the Ö character and also that character falls under the ASCII range of between 32 and 126 (and is only 1 byte per character).

What I’m basically trying to do is let users select music on their computer, so I need to allow whatever Windows allows in file names. I tested with my own files and came across “Röyksopp”, which is where this whole issue started.

Thanks. I found that article too, which says “All strings in Unreal Engine 4 (UE4) are stored in memory in UTF-16”, and as in my edited post, the issue seems to be with FString and FileHandle->Write but not with char.

SaveStringToFile also uses FString.

Yes, you are definitely not saving as Unicode. Did you try EEncodingOptions::ForceUnicode?

bool FFileHelper::SaveStringToFile( const FString& String, const TCHAR* Filename,  EEncodingOptions EncodingOptions, IFileManager* FileManager /*= &IFileManager::Get()*/, uint32 WriteFlags )
{
	// max size of the string is a UCS2CHAR for each character and some UNICODE magic 
	auto Ar = TUniquePtr<FArchive>( FileManager->CreateFileWriter( Filename, WriteFlags ) );
	if( !Ar )
		return false;

	if( String.IsEmpty() )
		return true;

	const TCHAR* StrPtr = *String;

	bool SaveAsUnicode = EncodingOptions == EEncodingOptions::ForceUnicode || ( EncodingOptions == EEncodingOptions::AutoDetect && !FCString::IsPureAnsi(StrPtr) );
	if( EncodingOptions == EEncodingOptions::ForceUTF8 )
	{
		UTF8CHAR UTF8BOM[] = { 0xEF, 0xBB, 0xBF };
		Ar->Serialize( &UTF8BOM, ARRAY_COUNT(UTF8BOM) * sizeof(UTF8CHAR) );

		FTCHARToUTF8 UTF8String(StrPtr);
		Ar->Serialize( (UTF8CHAR*)UTF8String.Get(), UTF8String.Length() * sizeof(UTF8CHAR) );
	}
	else if ( EncodingOptions == EEncodingOptions::ForceUTF8WithoutBOM )
	{
		FTCHARToUTF8 UTF8String(StrPtr);
		Ar->Serialize((UTF8CHAR*)UTF8String.Get(), UTF8String.Length() * sizeof(UTF8CHAR));
	}
	else if (SaveAsUnicode)
	{
		UCS2CHAR BOM = UNICODE_BOM;
		Ar->Serialize( &BOM, sizeof(UCS2CHAR) );

		auto Src = StringCast<UCS2CHAR>(StrPtr, String.Len());
		Ar->Serialize( (UCS2CHAR*)Src.Get(), Src.Length() * sizeof(UCS2CHAR) );
	}
	else
	{
		auto Src = StringCast<ANSICHAR>(StrPtr, String.Len());
		Ar->Serialize( (ANSICHAR*)Src.Get(), Src.Length() * sizeof(ANSICHAR) );
	}

	return true;
}
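
To see why AutoDetect can still produce a 10-byte file, here is a small sketch of the branch selection above in plain standard C++. The enum and function names are hypothetical stand-ins, not engine types, and IsPureAnsiSketch assumes that "pure ANSI" means every code unit is 0x7F or lower.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for the engine's encoding options and code paths.
enum class ESketchEncoding { AutoDetect, ForceAnsi, ForceUnicode, ForceUTF8, ForceUTF8WithoutBOM };
enum class ESketchBranch { Utf8WithBom, Utf8NoBom, Utf16WithBom, Ansi };

// Assumption: a string is "pure ANSI" when no code unit is above 0x7F.
bool IsPureAnsiSketch(const std::u16string& Text)
{
    for (char16_t C : Text)
    {
        if (C > 0x7F)
        {
            return false;
        }
    }
    return true;
}

// Mirrors the if / else-if chain in the function quoted above.
ESketchBranch ChooseBranch(const std::u16string& Text, ESketchEncoding Option)
{
    if (Option == ESketchEncoding::ForceUTF8)
    {
        return ESketchBranch::Utf8WithBom;
    }
    if (Option == ESketchEncoding::ForceUTF8WithoutBOM)
    {
        return ESketchBranch::Utf8NoBom;
    }
    const bool bSaveAsUnicode =
        Option == ESketchEncoding::ForceUnicode ||
        (Option == ESketchEncoding::AutoDetect && !IsPureAnsiSketch(Text));
    return bSaveAsUnicode ? ESketchBranch::Utf16WithBom : ESketchBranch::Ansi;
}

// File size for the two fixed-width branches only (UTF-8 is not covered here).
std::size_t ExpectedFileSize(const std::u16string& Text, ESketchBranch Branch)
{
    assert(Branch == ESketchBranch::Ansi || Branch == ESketchBranch::Utf16WithBom);
    return Branch == ESketchBranch::Ansi ? Text.size() : 2 + Text.size() * 2;
}
```

If the string already contains a literal ? where the Ö should be, AutoDetect sees only 7-bit characters and takes the single-byte branch, so the file is 10 bytes. The same ten characters with a real U+00D6 go down the UTF-16 branch and come out at 22 bytes.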

Yes, I’ve tried all the EEncodingOptions values :(.

Although I don’t believe Ö requires Unicode.

It is part of the extended ASCII table (codes > 127). What lives there depends heavily on the code page. For example, I have the Russian alphabet there, so any Russian text will be saved as 1 byte per character. Anyway, I see where the problem is. I’ll try saving it myself like in your example. It will be good to know anyway.
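
As a quick illustration of the code page point, this sketch in plain standard C++ decodes a single byte under two Windows code pages. The names are invented for the example, and it only covers ASCII plus the 0xC0 to 0xFF range, where Windows-1252 matches Latin-1 and Windows-1251 holds the main Russian alphabet; everything else returns U+FFFD.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum for this example; only two code pages are sketched.
enum class ECodePage { Windows1252, Windows1251 };

// Returns the Unicode code point for one byte, or U+FFFD (replacement
// character) for bytes this sketch does not cover (0x80 to 0xBF).
char32_t DecodeByte(std::uint8_t Byte, ECodePage CodePage)
{
    if (Byte < 0x80)
    {
        return Byte; // 7-bit ASCII is the same in both code pages
    }
    if (Byte >= 0xC0)
    {
        if (CodePage == ECodePage::Windows1252)
        {
            return Byte; // 0xC0..0xFF map to U+00C0..U+00FF
        }
        return 0x0410 + (Byte - 0xC0); // 0xC0..0xFF map to U+0410..U+044F
    }
    return 0xFFFD;
}
```

The same byte 0xD6 decodes to U+00D6 (Ö) under Windows-1252 and to U+0426 (Ц) under Windows-1251, which is why a 1-byte file only shows the right letter on a machine using the matching code page.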

Thanks for all your help. I definitely don’t have a full grasp on all this yet myself.

OK, I got it to save as Unicode:
FString newText = "AAAA\x00D6 BBBB";
FFileHelper::SaveStringToFile(newText, TEXT("C:/test.txt"), FFileHelper::EEncodingOptions::ForceUnicode);

File size is 22 bytes.

BUT, looking through a debugger, “AAAA\x00D6 BBBB” gets converted in FString to “AAAA? BBBB”. So, the problem is not actually in SaveStringToFile, but what you are saving.

The correct way: FString newText = TEXT("AAAA\x00D6 BBBB");
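
For anyone landing here later, a rough sketch in plain standard C++ of why the narrow literal loses the Ö. WidenAsciiOnly is a hypothetical stand-in, not an engine function; it assumes the narrow-to-wide conversion only trusts 7-bit ASCII and substitutes ? for anything else, which matches what the debugger showed above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a narrow-to-wide conversion that keeps 7-bit
// ASCII as-is and replaces every other byte with '?'.
std::u16string WidenAsciiOnly(const char* Narrow)
{
    std::u16string Wide;
    for (; *Narrow != '\0'; ++Narrow)
    {
        // Go through unsigned char so bytes above 0x7F are not seen as negative.
        const unsigned char Byte = static_cast<unsigned char>(*Narrow);
        Wide.push_back(Byte < 0x80 ? static_cast<char16_t>(Byte) : u'?');
    }
    return Wide;
}
```

Under that assumption, WidenAsciiOnly("AAAA\xD6 BBBB") yields "AAAA? BBBB", while a literal that is wide from the start, such as u"AAAA\u00D6 BBBB" here or TEXT("AAAA\x00D6 BBBB") in Unreal, never passes through the lossy step and keeps U+00D6 at index 4.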

Oh hey, thanks!