Firebase or Swift not detecting umlauts

I ran into something weird in Firebase Database/Storage: I can’t tell whether it is Firebase or Swift that fails to handle umlauts (e.g. ä, ö, ü).

I did some simple things with Firebase, such as uploading images to Firebase Storage and then downloading them into a table view. Some of my .png files have umlauts in their names, for example Röda.png.

    The problem occurs when I download them. The only time my download URL is nil is when the file name contains one of the umlauts mentioned above.

    I tried some alternatives, such as HTML entities (&ouml; instead of ö), but that does not work. Can you suggest something? I can’t simply replace ö with o, ü with u, and so on.

    This is the code where url is nil when I try to write some values to Firebase:

    FIRStorage.storage().reference()
              .child("\(productImageref!).png")
              .downloadURLWithCompletion({ (url, error) in
        // url is nil here whenever the file name contains an umlaut,
        // so the force unwraps below crash
        FIRDatabase.database().reference()
                   .child("Snuses").child(productImageref!).child("productUrl")
                   .setValue(url!.absoluteString)

        let resource = Resource(downloadURL: url!, cacheKey: productImageref)
        // ...
    })
    

    2 Answers

    Hooray for Unicode!

    The short answer is that no, we’re actually not doing anything special here. Basically all we do under the hood is:

    // This is the list at https://cloud.google.com/storage/docs/json_api/ without the & because query parameters
    NSString *const kGCSObjectAllowedCharacterSet = 
        @"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~!$'()*+,;=:@";
    
    - (nullable NSString *)GCSEscapedString:(NSString *)string {
      NSCharacterSet *allowedCharacters =
          [NSCharacterSet characterSetWithCharactersInString:kGCSObjectAllowedCharacterSet];
    
      return [string stringByAddingPercentEncodingWithAllowedCharacters:allowedCharacters];
    }
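
For readers working in Swift, the escaping step above could be sketched roughly as follows. This is an illustrative port in current Swift syntax, not the SDK's actual source; the names `gcsObjectAllowedCharacters` and `gcsEscaped(_:)` are made up for this example. It shows that the two spellings of ö in a name like Röda.png are escaped to different object names.

```swift
import Foundation

// Same allowed set as the Objective-C constant above (hypothetical Swift name).
let gcsObjectAllowedCharacters = CharacterSet(charactersIn:
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~!$'()*+,;=:@")

// Illustrative port of GCSEscapedString: percent-encode everything outside the set.
func gcsEscaped(_ string: String) -> String? {
    return string.addingPercentEncoding(withAllowedCharacters: gcsObjectAllowedCharacters)
}

let precomposedName = "R\u{f6}da.png"    // ö as a single code point (U+00F6)
let decomposedName = "Ro\u{308}da.png"   // o followed by combining diaeresis (U+0308)

let escapedPrecomposed = gcsEscaped(precomposedName)
let escapedDecomposed = gcsEscaped(decomposedName)

print(escapedPrecomposed ?? "nil")  // R%C3%B6da.png
print(escapedDecomposed ?? "nil")   // Ro%CC%88da.png
```

Both names look identical on screen and compare equal in Swift, yet they end up as two different escaped strings.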
    

    What blows my mind is that:

    let str1 = "o\u{308}" // decomposed : latin small letter o + combining diaeresis
    let str2 = "\u{f6}"   // precomposed: latin small letter o with diaeresis
    
    print(str1, str2, str1 == str2) // ö ö true
    

    returns true. In Objective-C (which the Firebase Storage client is built in), it totally shouldn’t, as they’re two different sequences of code points (the length of str1 is 2 while the length of str2 is 1 in Obj-C, whereas in Swift the character count is 1 for both).

    Apple must be normalizing strings before comparison in Swift (probably a reasonable thing to do, since otherwise it leads to bugs like this where strings are “the same” but compare differently). Turns out, this is exactly what they do (see the “Extended Grapheme Clusters” section of their docs).
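
A small check makes the difference concrete. The sketch below (current Swift syntax, arbitrary variable names) looks at the two spellings through the different views of a string. Swift's `==` and character count work on grapheme clusters, whereas `NSString.length` counts UTF-16 code units, which is what the Objective-C side sees.

```swift
import Foundation

let decomposed = "o\u{308}"   // o + combining diaeresis
let precomposed = "\u{f6}"    // single code point ö

// Swift compares strings by canonical equivalence.
print(decomposed == precomposed)                                          // true

// One user-perceived character each.
print(decomposed.count, precomposed.count)                                // 1 1

// Different numbers of Unicode scalars and UTF-16 code units.
print(decomposed.unicodeScalars.count, precomposed.unicodeScalars.count)  // 2 1
print(NSString(string: decomposed).length, NSString(string: precomposed).length)  // 2 1

// The underlying UTF-8 bytes differ, and these bytes are what get percent-encoded.
print(Array(decomposed.utf8))   // [111, 204, 136]
print(Array(precomposed.utf8))  // [195, 182]
```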

    So, when you provide two different characters in Swift, they’re being propagated to Obj-C as different characters and thus are encoded differently. Not a bug, just one of the many differences between Swift’s String type and Obj-C’s NSString type. When in doubt, choose a canonical representation you expect and stick with it, but as a library developer, it’s very hard for us to choose that representation for you.

    Thus, when naming files that contain Unicode characters, make sure to pick one Unicode normalization form (NFC, NFD, NFKC, or NFKD) and always use it when creating references.

    let imageName = "smorgasbörd.jpg"
    let path = "images/\(imageName)"
    let decomposedPath = path.decomposedStringWithCanonicalMapping // Unicode Form D
    let ref = FIRStorage.storage().reference().child(decomposedPath)
    // use this ref and you'll always get the same objects
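
One way to apply this consistently is to route every file name through a single helper before building a reference. The following is a minimal sketch in current Swift syntax; `normalizedStoragePath(_:)` is a hypothetical helper, and the choice of Form D simply mirrors the snippet above. Form C (via `precomposedStringWithCanonicalMapping`) should serve just as well, provided the same form is used when uploading and when downloading.

```swift
import Foundation

// Hypothetical helper: bring every path into one canonical form (here NFD)
// before it is handed to the storage client.
func normalizedStoragePath(_ path: String) -> String {
    return path.decomposedStringWithCanonicalMapping
}

let typedName = "images/R\u{f6}da.png"      // precomposed ö
let uploadedName = "images/Ro\u{308}da.png" // decomposed ö

let normalizedTyped = normalizedStoragePath(typedName)
let normalizedUploaded = normalizedStoragePath(uploadedName)

// Before normalization the raw bytes differ...
print(Array(typedName.utf8) == Array(uploadedName.utf8))              // false
// ...afterwards both spellings yield exactly the same bytes.
print(Array(normalizedTyped.utf8) == Array(normalizedUploaded.utf8))  // true
```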
    

    After spending a fair bit of time researching your problem, I found that the difference boils down to how the character ö is encoded, and I traced it to Unicode normalization forms.

    The letter ö can be written in two ways, and Swift’s String considers them equal:

    let str1 = "o\u{308}" // decomposed : latin small letter o + combining diaeresis
    let str2 = "\u{f6}"   // precomposed: latin small letter o with diaeresis
    
    print(str1, str2, str1 == str2) // ö ö true
    

    But when you percent-encode them, they produce different results:

    print(str1.stringByAddingPercentEncodingWithAllowedCharacters(.URLPathAllowedCharacterSet())!)
    print(str2.stringByAddingPercentEncodingWithAllowedCharacters(.URLPathAllowedCharacterSet())!)
    
    // o%CC%88
    // %C3%B6
    

    My guess is that Google / Firebase uses the decomposed form while Apple prefers the precomposed form in its text input system. You can convert the file name to its decomposed form to match Firebase:

    let str3 = str2.decomposedStringWithCanonicalMapping
    print(str3.stringByAddingPercentEncodingWithAllowedCharacters(.URLPathAllowedCharacterSet())!)
    
    // o%CC%88
    

    None of this matters for characters in the ASCII range, which have only one representation. Unicode can be very confusing.

    References:

    • The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (highly recommended)
    • Strings in Swift 2
    • NSString and Unicode