The UTF8 class encapsulates a utf8 encoded array of characters and allows for easy encoding and decoding.
#include <hps.h>
Public Member Functions | |
UTF8 () | |
The default constructor creates an empty UTF8 string. | |
~UTF8 () | |
A destructor for a UTF8 string. | |
UTF8 (char const *in_string, char const *in_locale=0) | |
This constructor can be used to encode a string from any known locale to utf8. | |
UTF8 (wchar_t const *in_string) | |
This constructor can be used to encode a wide character string to utf8. | |
UTF8 (UTF8 const &in_that) | |
The copy constructor copies the source utf8 string. | |
UTF8 (UTF8 &&in_that) | |
The move constructor takes control of the underlying data from the source utf8 string. | |
UTF8 & | Assign (UTF8 &&in_utf8) |
Moves the source UTF8 object to this object. | |
UTF8 & | operator= (UTF8 &&in_utf8) |
The move assignment operator takes control of the underlying data from the source utf8 string. | |
size_t | ToWStr (wchar_t *out_wide_string) const |
Decode a utf8 encoded string into a wide character buffer. | |
size_t | ToWStr (WCharArray &out_wide_string) const |
Decode a utf8 encoded string into a wide character buffer. | |
bool | IsValid () const |
Indicates whether this utf8 string has been initialized. | |
bool | Empty () const |
Indicates whether this utf8 string is empty. | |
void | Clear () |
Reset all string data. | |
void | Reset () |
Resets this object to its initial, uninitialized state. | |
size_t | GetLength () const |
Retrieves the number of bytes in the utf8 encoded string up to but not including the null terminator. | |
size_t | GetWStrLength () const |
Retrieves the number of wide characters in the wchar_t string up to but not including the null terminator. | |
char const * | GetBytes () const |
Retrieves the raw, utf8 encoded character array. | |
operator char const * () const | |
Allows typecasting to const char * by retrieving the raw, utf8 encoded character array. | |
char | At (size_t in_index) const |
Retrieves the utf8 encoded character at the specified index. | |
UTF8 & | Assign (UTF8 const &in_utf8) |
Copies the source UTF8 object to this object. | |
UTF8 & | operator= (UTF8 const &in_utf8) |
Copies the source UTF8 object to this object. | |
UTF8 & | operator+= (UTF8 const &in_utf8) |
Appends a UTF8 object to the end of this object. | |
UTF8 & | operator+= (char const *in_utf8) |
Appends a utf8 encoded string to the end of this object. | |
UTF8 | operator+ (UTF8 const &in_utf8) const |
Creates a new UTF8 object by appending a UTF8 object to the end of this object. | |
UTF8 | operator+ (char const *in_utf8) const |
Creates a new UTF8 object by appending a utf8 encoded string to the end of this object. | |
bool | operator== (UTF8 const &in_utf8) const |
Checks whether a UTF8 object is equivalent to this object. | |
bool | operator!= (UTF8 const &in_utf8) const |
Checks whether a UTF8 object is not equivalent to this object. | |
bool | operator== (char const *in_utf8) const |
Checks whether a utf8-encoded character string is equivalent to this object. | |
bool | operator!= (char const *in_utf8) const |
Checks whether a utf8-encoded character string is not equivalent to this object. | |
size_t | GetHash () const |
Returns a hash code for the utf8 encoded characters. | |
Private Member Functions | |
size_t | internal_encode (wchar_t const *in_wide_string) |
size_t | internal_decode (wchar_t *out_wide_string) const |
Static Private Member Functions | |
static size_t | internal_decode (size_t in_length, const char *in_utf8_string, wchar_t *out_wide_string) |
Private Attributes | |
char * | _text |
size_t | _length |
size_t | _hash_key |
char | _buffer [_buffer_size] |
Static Private Attributes | |
static const size_t | _buffer_size = 64 - sizeof(const char *) - 2 * sizeof(size_t) |
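The _buffer_size expression appears to be chosen so that the pointer, the two size_t fields, and the inline buffer together occupy 64 bytes. The sketch below only checks that arithmetic. Utf8Layout is a hypothetical stand-in, and its member order and padding are assumptions, since the real layout of HPS::UTF8 is not described here.

```cpp
#include <cstddef>

// Hypothetical mirror of the private attributes listed above; this is not
// the real HPS::UTF8 definition, and the member order is an assumption.
struct Utf8Layout {
    static constexpr std::size_t buffer_size =
        64 - sizeof(const char *) - 2 * sizeof(std::size_t);

    char *      text;                 // pointer to the string data
    std::size_t length;               // byte length of the string
    std::size_t hash_key;             // hash value storage
    char        buffer[buffer_size];  // inline storage filling the remainder
};

// The pointer, the two size_t fields, and the buffer should add up to 64 bytes.
static_assert(sizeof(Utf8Layout) == 64, "layout sketch should occupy 64 bytes");
```

On a typical 64-bit target this gives a 40-byte buffer (64 - 8 - 16); on a typical 32-bit target it gives 52 bytes (64 - 4 - 8).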
Friends | |
class | HPSI::String |
bool | operator== (char const *in_left, UTF8 const &in_right) |
Checks whether a utf8-encoded character string is equivalent to a UTF8 object. | |
bool | operator!= (char const *in_left, UTF8 const &in_right) |
Checks whether a utf8-encoded character string is not equivalent to a UTF8 object. | |
bool | operator== (wchar_t const *in_left, UTF8 const &in_right) |
Checks whether a wide character string is equivalent to a UTF8 object. | |
bool | operator!= (wchar_t const *in_left, UTF8 const &in_right) |
Checks whether a wide character string is not equivalent to a UTF8 object. | |
UTF8 | operator+ (char const *in_left, UTF8 const &in_right) |
Creates a new UTF8 object by appending a UTF8 object to the end of a utf8-encoded character string. | |
UTF8 | operator+ (wchar_t const *in_left, UTF8 const &in_right) |
Creates a new UTF8 object by appending a UTF8 object to the end of a wide character string. | |
HPS::UTF8::UTF8 | ( | ) |
The default constructor creates an empty UTF8 string.
HPS::UTF8::~UTF8 | ( | ) |
A destructor for a UTF8 string.
HPS::UTF8::UTF8 | ( | char const * | in_string, |
char const * | in_locale = 0 |
||
) |
This constructor can be used to encode a string from any known locale to utf8.
Be careful not to re-encode a string that's already utf8 encoded.
in_string | The string to be encoded. |
in_locale | A string identifying the source locale of in_string. If none is specified, the default locale on the local machine will be used. If in_string is already utf8 encoded, specify the locale as "utf8" to prevent re-encoding. |
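The warning about re-encoding can be illustrated without HPS. The sketch below uses a hypothetical helper, latin1_to_utf8, that encodes Latin-1 bytes as utf8. It stands in for any locale-to-utf8 conversion and is not the conversion HPS performs. Applying it to bytes that are already utf8 treats each byte as a separate Latin-1 character and produces a corrupted, longer string.

```cpp
#include <string>

// Hypothetical helper, not part of HPS: encodes Latin-1 (ISO-8859-1) bytes
// as utf8. Every byte >= 0x80 becomes a two-byte sequence.
std::string latin1_to_utf8(const std::string &in) {
    std::string out;
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII bytes are identical in Latin-1 and utf8
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF: 110000xx 10xxxxxx
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

Encoding the Latin-1 bytes "caf\xE9" once yields "caf\xC3\xA9", the correct utf8 form. Encoding that result a second time yields "caf\xC3\x83\xC2\xA9", which displays as "cafÃ©". This is the kind of damage that specifying "utf8" as the locale is meant to prevent.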
HPS::UTF8::UTF8 | ( | wchar_t const * | in_string | ) |
This constructor can be used to encode a wide character string to utf8.
in_string | The string to be encoded. |
HPS::UTF8::UTF8 | ( | UTF8 const & | in_that | ) |
The copy constructor copies the source utf8 string.
in_that | The source to be copied. |
HPS::UTF8::UTF8 | ( | UTF8 && | in_that | ) |
The move constructor takes control of the underlying data from the source utf8 string.
in_that | The source of the move. |
UTF8& HPS::UTF8::Assign | ( | UTF8 && | in_utf8 | ) |
Moves the source UTF8 object to this object.
This method is functionally equivalent to the overloaded assignment operator.
in_utf8 | The source of the move. |
UTF8& HPS::UTF8::Assign | ( | UTF8 const & | in_utf8 | ) |
Copies the source UTF8 object to this object.
This method is functionally equivalent to the overloaded assignment operator.
in_utf8 | The source of the copy. |
char HPS::UTF8::At | ( | size_t | in_index | ) | const |
inline |
Retrieves the utf8 encoded character at the specified index.
This method may split up individual code points.
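Because the index counts bytes, an index can land in the middle of a multi-byte code point. The sketch below shows one way to detect that on a plain std::string. The helper names is_continuation_byte and code_point_start are hypothetical and are not part of HPS.

```cpp
#include <cstddef>
#include <string>

// True when the byte is a utf8 continuation byte (bit pattern 10xxxxxx),
// meaning it is not the first byte of a code point.
bool is_continuation_byte(char byte) {
    return (static_cast<unsigned char>(byte) & 0xC0) == 0x80;
}

// Hypothetical helper: moves a byte index back to the first byte of the
// code point that contains it. The index must be less than utf8.size().
std::size_t code_point_start(const std::string &utf8, std::size_t index) {
    while (index > 0 && is_continuation_byte(utf8[index])) {
        --index;
    }
    return index;
}
```

For the string "h\xC3\xA9llo" (the word "héllo"), byte index 2 holds the continuation byte 0xA9, so reading that byte alone yields half of the character "é". code_point_start maps index 2 back to index 1, where the code point begins.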
void HPS::UTF8::Clear | ( | ) |
Reset all string data.
bool HPS::UTF8::Empty | ( | ) | const |
inline |
Indicates whether this utf8 string is empty.
char const* HPS::UTF8::GetBytes | ( | ) | const |
inline |
Retrieves the raw, utf8 encoded character array.
size_t HPS::UTF8::GetHash | ( | ) | const |
Returns a hash code for the utf8 encoded characters.
size_t HPS::UTF8::GetLength | ( | ) | const |
inline |
Retrieves the number of bytes in the utf8 encoded string up to but not including the null terminator.
This will return 0 if the utf8 object is uninitialized.
size_t HPS::UTF8::GetWStrLength | ( | ) | const |
inline |
Retrieves the number of wide characters in the wchar_t string up to but not including the null terminator.
This will return 0 if the utf8 object is uninitialized.
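The byte length and the wide character length differ for any string containing non-ASCII characters. The sketch below counts both quantities for a plain std::string. The helper names are hypothetical, and how HPS counts wide characters on a given platform is an assumption. wchar_t is 16 bits on Windows and 32 bits on most Unix-like systems, so both a UTF-16 unit count and a code point count are shown.

```cpp
#include <cstddef>
#include <string>

// Counts code points in well-formed utf8 by skipping continuation bytes.
// This matches the wide character count where wchar_t is 32 bits.
std::size_t count_code_points(const std::string &utf8) {
    std::size_t count = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) {
            ++count;  // lead byte or ASCII byte starts a new code point
        }
    }
    return count;
}

// Counts UTF-16 code units for well-formed utf8. Code points above U+FFFF
// (four-byte sequences) need a surrogate pair, so they count twice.
std::size_t count_utf16_units(const std::string &utf8) {
    std::size_t units = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) == 0x80) {
            continue;  // continuation byte, already accounted for
        }
        units += (c >= 0xF0) ? 2 : 1;  // 0xF0 and above leads a four-byte sequence
    }
    return units;
}
```

For a string holding "a", "é", "€", and the emoji U+1F600, the byte length is 10 (1 + 2 + 3 + 4), the code point count is 4, and the UTF-16 unit count is 5.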
bool HPS::UTF8::IsValid | ( | ) | const |
inline |
Indicates whether this utf8 string has been initialized.
HPS::UTF8::operator char const * | ( | ) | const |
inline |
Allows typecasting to const char * by retrieving the raw, utf8 encoded character array.
bool HPS::UTF8::operator!= | ( | UTF8 const & | in_utf8 | ) | const |
inline |
Checks whether a UTF8 object is not equivalent to this object.
in_utf8 | The object to compare to this. |
bool HPS::UTF8::operator!= | ( | char const * | in_utf8 | ) | const |
inline |
Checks whether a utf8-encoded character string is not equivalent to this object.
in_utf8 | The utf8-encoded string to compare to this. |
UTF8 HPS::UTF8::operator+ | ( | char const * | in_utf8 | ) | const |
Creates a new UTF8 object by appending a utf8 encoded string to the end of this object.
in_utf8 | A string, assumed to be utf8 encoded, used as the tail end of the new string. |
UTF8& HPS::UTF8::operator+= | ( | char const * | in_utf8 | ) |
Appends a utf8 encoded string to the end of this object.
in_utf8 | A string, assumed to be utf8 encoded, used as the tail end of the new string. |
UTF8& HPS::UTF8::operator= | ( | UTF8 && | in_utf8 | ) |
The move assignment operator takes control of the underlying data from the source utf8 string.
in_utf8 | The source of the move. |
UTF8& HPS::UTF8::operator= | ( | UTF8 const & | in_utf8 | ) |
Copies the source UTF8 object to this object.
in_utf8 | The source of the copy. |
bool HPS::UTF8::operator== | ( | UTF8 const & | in_utf8 | ) | const |
Checks whether a UTF8 object is equivalent to this object.
in_utf8 | The object to compare to this. |
bool HPS::UTF8::operator== | ( | char const * | in_utf8 | ) | const |
Checks whether a utf8-encoded character string is equivalent to this object.
in_utf8 | The utf8-encoded string to compare to this. |
void HPS::UTF8::Reset | ( | ) |
inline |
Resets this object to its initial, uninitialized state.
size_t HPS::UTF8::ToWStr | ( | wchar_t * | out_wide_string | ) | const |
Decode a utf8 encoded string into a wide character buffer.
out_wide_string |
size_t HPS::UTF8::ToWStr | ( | WCharArray & | out_wide_string | ) | const |
Decode a utf8 encoded string into a wide character buffer.
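As an illustration of what decoding involves, the sketch below turns well-formed utf8 bytes into code points. decode_utf8 is a hypothetical helper, not the HPS implementation. It produces 32-bit code points rather than wchar_t so that the result does not depend on the platform's wchar_t width, and it does not diagnose malformed input. The buffer sizing rules for ToWStr are not stated here and are not modeled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not the HPS implementation: decodes well-formed utf8
// into UTF-32 code points. Malformed input is not diagnosed.
std::u32string decode_utf8(const std::string &utf8) {
    std::u32string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::size_t extra = 0;  // number of continuation bytes that follow
        char32_t cp = 0;        // code point being assembled
        if (lead < 0x80)      { extra = 0; cp = lead; }         // 0xxxxxxx
        else if (lead < 0xE0) { extra = 1; cp = lead & 0x1F; }  // 110xxxxx
        else if (lead < 0xF0) { extra = 2; cp = lead & 0x0F; }  // 1110xxxx
        else                  { extra = 3; cp = lead & 0x07; }  // 11110xxx
        ++i;
        // Each continuation byte contributes its low six bits
        for (std::size_t k = 0; k < extra && i < utf8.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i]) & 0x3F);
        }
        out.push_back(cp);
    }
    return out;
}
```

For the bytes of "a", "é", "€", and the emoji U+1F600, this yields the four code points U+0061, U+00E9, U+20AC, and U+1F600. A platform with a 16-bit wchar_t would additionally need to split U+1F600 into a surrogate pair.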
UTF8 operator+ | ( | char const * | in_left, |
UTF8 const & | in_right |
||
) |
friend |
Creates a new UTF8 object by appending a UTF8 object to the end of a utf8-encoded character string.
in_left | A string, assumed to be utf8 encoded, used as the head end of the new string. |
in_right | A UTF8 object used as the tail end of the new string. |
UTF8 operator+ | ( | wchar_t const * | in_left, |
UTF8 const & | in_right |
||
) |
friend |
Creates a new UTF8 object by appending a UTF8 object to the end of a wide character string.
in_left | A wide character string used as the head end of the new string. |
in_right | A UTF8 object used as the tail end of the new string. |