Reducing Blazor WASM size by providing custom ICU data

 
 
  • Gérald Barré

International Components for Unicode (ICU) is a set of libraries that provide Unicode and internationalization support for software applications. Unicode is a character encoding standard that covers almost all of the world's written languages. ICU provides a wide range of functions for working with Unicode text, including string search and replace, character conversion, and text normalization. It also supports many scripts and locales, allowing applications to adapt to different language and cultural conventions.

One of the key benefits of using ICU is that it enables software applications to handle text consistently across different platforms and languages. This is especially important for applications that are used internationally, as it ensures that text is displayed and processed correctly regardless of the language or locale.

In a Blazor WebAssembly application, the ICU data files support the globalization features of the application. These include date and time formatting, number formatting, and string comparison. One downside is that the browser must download the ICU data files when the application is loaded. This increases the size of the application and the time it takes to load.

Many applications only need to support a small number of languages, so the ICU data files can be customized to include only the data for the required languages. This can reduce the size of the application by a few hundred kilobytes and shorten the time it takes to load.

#Build custom ICU data files

  1. Open the dotnet/icu repository

  2. Create a new Codespace for the repository

  3. Customize the files icu-filters/icudt_*.json. For instance, you can remove a few locales to reduce the size of the data files

  4. Open a terminal in the Codespace

  5. Build the data files using ./build.sh /p:TargetOS=Browser /p:TargetArchitecture=wasm /p:IcuTracing=true

  6. Download the data files from artifacts/bin/icu-browser-wasm/

  7. Copy the files to the root folder of the Blazor project
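Steps 5 to 7 can be sketched as a small shell helper. This is only a sketch under assumptions: `copy_icu_data` is a hypothetical function name, `../MyBlazorApp` is a placeholder project path, and a direct copy only works when the Blazor project sits on the same machine as the build output. From a Codespace, you would download the files first.

```shell
#!/usr/bin/env bash

# Copy every icudt*.dat file found under a build output folder into a project folder.
# Both arguments are assumptions: adjust them to your own layout.
copy_icu_data() {
  local src="$1" dest="$2"
  mkdir -p "$dest"
  # Search recursively, because the .dat files may sit in a subfolder.
  find "$src" -type f -name 'icudt*.dat' -exec cp {} "$dest"/ \;
}

# Step 5 (command from the list above), run from the root of the dotnet/icu checkout:
#   ./build.sh /p:TargetOS=Browser /p:TargetArchitecture=wasm /p:IcuTracing=true
# Steps 6-7, with a hypothetical project path:
#   copy_icu_data artifacts/bin/icu-browser-wasm ../MyBlazorApp
```

Files that do not match `icudt*.dat` are left alone, so the helper can be pointed at the whole output folder.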

.NET uses sharding to split the ICU data into multiple files. This allows the browser to download only the data files it needs, which reduces the download size (source code on GitHub). The following files are created:

  • icudt.dat contains the data for all locales
  • icudt_CJK.dat contains the data for all locales that use CJK characters
  • icudt_EFIGS.dat contains the data for English, French, Italian, German, and Spanish locales
  • icudt_no_CJK.dat contains the data for all locales that do not use CJK characters

#Use custom ICU data files

When building a Blazor application, the ICU data files are provided by the .NET SDK. To use the custom ICU data files instead, replace the SDK-provided files in the project file:

app.csproj (MSBuild project file)
<Project Sdk="Microsoft.NET.Sdk.BlazorWebAssembly">

  <!-- Swap the ICU data files from the runtime pack for the custom ones located next to this project file -->
  <Target Name="UseCustomICU" AfterTargets="ResolveRuntimePackAssets">
    <ItemGroup>
      <!-- Remove the default icudt*.dat files -->
      <ReferenceCopyLocalPaths Remove="@(ReferenceCopyLocalPaths)"
                               Condition="'%(ReferenceCopyLocalPaths.Extension)' == '.dat' AND $([System.String]::Copy('%(ReferenceCopyLocalPaths.FileName)').StartsWith('icudt'))" />

      <!-- Add the custom files built from dotnet/icu -->
      <ReferenceCopyLocalPaths Include="$(MSBuildThisFileDirectory)icudt.dat" DestinationSubPath="icudt.dat" />
      <ReferenceCopyLocalPaths Include="$(MSBuildThisFileDirectory)icudt_CJK.dat" DestinationSubPath="icudt_CJK.dat" />
      <ReferenceCopyLocalPaths Include="$(MSBuildThisFileDirectory)icudt_EFIGS.dat" DestinationSubPath="icudt_EFIGS.dat" />
      <ReferenceCopyLocalPaths Include="$(MSBuildThisFileDirectory)icudt_no_CJK.dat" DestinationSubPath="icudt_no_CJK.dat" />
    </ItemGroup>
  </Target>
</Project>

If you only want to use the main ICU data file, you can force the app to load only the icudt.dat file by adding the following property to the csproj file:

csproj (MSBuild project file)
  <PropertyGroup>
    <BlazorWebAssemblyLoadAllGlobalizationData>true</BlazorWebAssemblyLoadAllGlobalizationData>
  </PropertyGroup>

A custom data file that includes only en_US and fr_FR is 131 kB after Brotli compression, whereas the original file is 321 kB. So, it saves 190 kB!
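The arithmetic behind that figure can be checked with a small helper. This is a sketch: `size_savings` is a hypothetical function, and the commented measurement line assumes the `brotli` command-line tool is installed, with compression settings that may differ from what your server or publish step uses.

```shell
# Print the amount saved and the percentage saved (integer, rounded down).
size_savings() {
  local original="$1" custom="$2"
  local saved=$(( original - custom ))
  echo "$saved $(( saved * 100 / original ))%"
}

# Sizes in kB, taken from the figures above.
size_savings 321 131   # prints "190 59%"

# To measure your own file in bytes (assumes the brotli CLI is available):
#   brotli -c icudt.dat | wc -c
```

In other words, the custom file is a bit more than half the size smaller than the original.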
