Text cleaning works best when the encoding is known. This function attempts to convert text to UTF-8 and throws an informative error if the conversion is not possible.

Usage

validate_utf8(text)

Arguments

text

A character vector to clean.

Value

The text, converted to and marked as UTF-8. If the conversion is not possible, an error is thrown instead.

Examples

text <- "fa\xE7ile"
# Specify the encoding so the example is the same on all systems.
Encoding(text) <- "latin1"
validate_utf8(text)
#> [1] "façile"
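
To make the convert-or-fail behavior concrete, here is a minimal sketch of the same idea using only base R. The helper `to_utf8_or_stop()` is a hypothetical stand-in written for illustration. It is not the implementation of `validate_utf8()`, which may use a different approach and a different error message.

```r
# Hypothetical helper for illustration only; not the package's implementation.
to_utf8_or_stop <- function(text) {
  stopifnot(is.character(text))

  # enc2utf8() converts each element to UTF-8, using its declared encoding
  # (for example "latin1") or the native encoding if none is declared.
  converted <- enc2utf8(text)

  # validUTF8() checks, element by element, whether the bytes are valid UTF-8.
  bad <- !validUTF8(converted)
  if (any(bad)) {
    stop(
      "Unable to convert to UTF-8 at position(s): ",
      paste(which(bad), collapse = ", "),
      call. = FALSE
    )
  }
  converted
}

text <- "fa\xE7ile"
# Declare the encoding, as in the example above.
Encoding(text) <- "latin1"
result <- to_utf8_or_stop(text)
Encoding(result)
#> [1] "UTF-8"
```

Declaring the encoding before converting matters: without it, base R would interpret the byte `\xE7` in the native encoding, which differs between systems.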