Text streams are I/O streams that convert text between different encodings. The framework provides six types of text streams: three basic class templates and three concrete stream types that work with Pt's Unicode character type, Pt::Char. Text streams are derived from the I/O streams provided by the C++ standard library. Each of the two sets consists of one stream type for input, one for output, and one for both. The basic class templates look like this:
    template <typename CharT, typename ByteT> class BasicTextIStream : public std::basic_istream<CharT>...
    template <typename CharT, typename ByteT> class BasicTextOStream : public std::basic_ostream<CharT>...
    template <typename CharT, typename ByteT> class BasicTextStream : public std::basic_iostream<CharT>...
Text streams may convert between character types of different sizes. The first template parameter is the internal character type, which is also the character type of the standard C++ stream base class, and the second template parameter is the external character type of the encoded data. An input text stream converts from the external to the internal type, and an output text stream converts in the opposite direction. The two types may be the same.
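The relationship between the two character types can be illustrated without the framework. The following is a minimal sketch in standard C++, not Pt code: the helper decodeUtf8 is hypothetical and stands in for what a UTF-8 codec does on input, with char as the external type and char32_t (assumed here as a stand-in for Pt::Char) as the internal type. It does not reject overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of Pt: converts UTF-8 bytes (external
// character type char) into code points (internal type char32_t).
std::u32string decodeUtf8(const std::string& bytes)
{
    std::u32string out;
    std::size_t i = 0;

    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0; // number of continuation bytes that follow
        char32_t cp = 0;

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= bytes.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        // Each continuation byte contributes its lower six bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        out.push_back(cp);
        i += extra + 1;
    }

    return out;
}
```

Two external bytes can become one internal character: decodeUtf8("h\xC3\xA9") yields the two code points U+0068 and U+00E9.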
The concrete text stream classes derive from instantiations of the basic templates that convert between byte streams and Unicode. The internal character type is therefore Pt::Char, and the external character type is char.
    class TextIStream : public BasicTextIStream<Char, char>...
    class TextOStream : public BasicTextOStream<Char, char>...
    class TextStream : public BasicTextStream<Char, char>...
A text stream always works on top of another stream that serves as its input or output. A text input stream reads the encoded input from a std::basic_istream, and a text output stream writes the encoded output to a std::basic_ostream. All text streams use a text codec, for example Pt::Utf8Codec, to perform the actual conversion. The following example shows how to read UTF-8 encoded text:
    std::istringstream iss("UTF-8 encoded text");
    Pt::String s;
    Pt::TextIStream tis(iss, new Pt::Utf8Codec());
    std::getline(tis, s);
A string stream serves as the input of the text stream, which uses a text codec to convert from UTF-8 to the raw Unicode character type. The std::getline function reads one line, here the entire input, into a Pt::String. All extraction operators can be used as well, for example to read numbers directly from the stream. The next example shows how to encode text to a UTF-8 byte sequence:
    std::ostringstream oss;
    Pt::String s = L"Hello World!";
    Pt::TextOStream tos(oss, new Pt::Utf8Codec());
    tos << s;
    tos.terminate();
The string stream serves as the output of the text stream, which uses the same type of codec as the input text stream before. This time, the codec converts from the raw Unicode character type to UTF-8. When all data has been written to the output text stream, terminate() must be called to finish the output byte sequence. This is especially important for encodings with shift states. The destructor of the text stream also terminates the output sequence. All insertion operators can be used with text output streams, for example to format numbers.
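The output direction can be sketched in the same framework-independent way. The helper encodeUtf8 below is hypothetical and only illustrates the kind of conversion a UTF-8 codec performs when writing; char32_t is again assumed as a stand-in for Pt::Char. UTF-8 has no shift states, so this sketch needs no termination step.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of Pt: converts code points (internal
// type char32_t) into a UTF-8 byte sequence (external type char).
std::string encodeUtf8(const std::u32string& text)
{
    std::string out;

    for (char32_t cp : text) {
        if (cp < 0x80) {
            // One byte: plain ASCII.
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            // Two bytes: 110xxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            if (cp >= 0xD800 && cp <= 0xDFFF)
                throw std::range_error("surrogate code point");
            // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp <= 0x10FFFF) {
            // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            throw std::range_error("code point out of Unicode range");
        }
    }

    return out;
}
```

A single internal character may expand to several external bytes: encodeUtf8(U"\u20AC") produces the three bytes E2 82 AC.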
The examples so far create the text codec on the heap with the new operator, and the stream manages the lifetime of the codec. This can be avoided by passing a value different from 0 to the codec's constructor, in which case the codec must exist at least as long as the stream that uses it.
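This ownership rule resembles the refs argument of standard C++ locale facets, where 0 hands ownership to the library and a non-zero value leaves it with the caller. Whether Pt's codecs are built on that mechanism is an assumption here. The sketch below therefore uses only the standard library; CountingCodecvt and the two functions are hypothetical names introduced to make destruction observable.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>

// Hypothetical facet, not a Pt codec: counts how often it is destroyed.
struct CountingCodecvt : std::codecvt<char, char, std::mbstate_t>
{
    static int destroyed;

    explicit CountingCodecvt(std::size_t refs = 0)
    : std::codecvt<char, char, std::mbstate_t>(refs)
    { }

    ~CountingCodecvt() override
    { ++destroyed; }
};

int CountingCodecvt::destroyed = 0;

// refs == 0: the locale owns the heap-allocated facet and deletes it
// when the last locale holding it goes away.
int destroyedAfterOwnedUse()
{
    CountingCodecvt::destroyed = 0;
    {
        std::locale loc(std::locale::classic(), new CountingCodecvt());
    }
    return CountingCodecvt::destroyed;
}

// refs == 1: the caller owns the facet. It may live on the stack, but it
// must outlive every locale that refers to it.
int destroyedAfterBorrowedUse()
{
    CountingCodecvt::destroyed = 0;
    CountingCodecvt codec(1);
    {
        std::locale loc(std::locale::classic(), &codec);
    }
    // The locale is gone, but the facet has not been destroyed yet.
    return CountingCodecvt::destroyed;
}
```

In the first function the library destroys the facet once; in the second, nothing is destroyed until the stack object itself goes out of scope.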
Codecs are normally stateless, which means that one codec can be used with multiple text streams.
The Base64 encoding scheme is not a character encoding in the classical sense, but it works very similarly to other encodings. The framework provides a text codec named Pt::Base64Codec to convert to and from Base64 encoded text. It can be used with the basic text stream templates, with char as both the internal and the external character type. The following example shows how text is converted to Base64:
    std::ostringstream oss;
    Pt::BasicTextOStream<char, char> b64(oss, new Pt::Base64Codec());
    b64 << "Hello World!";
    b64.terminate();
The string stream serves as the output for the Base64 encoded text. The Base64 codec is used with a basic text stream to convert the string "Hello World!". Here it is important to terminate the output sequence, because the Base64 format can require padding at the end.
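Why termination matters for Base64 can be seen in a small incremental encoder. This is a sketch under stated assumptions, not the implementation of Pt::Base64Codec: the class Base64Encoder and its members write, finish and output are hypothetical. Base64 maps groups of three bytes to four characters, so up to two bytes can remain pending until the end of the data is announced.

```cpp
#include <cstddef>
#include <string>

// Hypothetical incremental encoder, not the Pt::Base64Codec implementation.
class Base64Encoder
{
public:
    // Encodes every complete 3-byte group; keeps up to two bytes pending.
    void write(const std::string& data)
    {
        for (char ch : data) {
            pending_[count_++] = static_cast<unsigned char>(ch);
            if (count_ == 3) {
                emit(3);
                count_ = 0;
            }
        }
    }

    // Plays the role of terminate(): flushes pending bytes and pads with '='.
    void finish()
    {
        if (count_ > 0) {
            emit(count_);
            count_ = 0;
        }
    }

    const std::string& output() const
    { return out_; }

private:
    // n is the number of valid bytes in pending_ (1 to 3).
    void emit(std::size_t n)
    {
        static const char alphabet[] =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

        const unsigned b0 = pending_[0];
        const unsigned b1 = n > 1 ? pending_[1] : 0u;
        const unsigned b2 = n > 2 ? pending_[2] : 0u;
        const unsigned group = (b0 << 16) | (b1 << 8) | b2;

        out_.push_back(alphabet[(group >> 18) & 0x3F]);
        out_.push_back(alphabet[(group >> 12) & 0x3F]);
        out_.push_back(n > 1 ? alphabet[(group >> 6) & 0x3F] : '=');
        out_.push_back(n > 2 ? alphabet[group & 0x3F] : '=');
    }

    unsigned char pending_[3] = {0, 0, 0};
    std::size_t count_ = 0;
    std::string out_;
};
```

After write("Hello") the output is only "SGVs", because the last two bytes are still pending; finish() completes it to "SGVsbG8=". An input whose length is a multiple of three, such as the 12-byte "Hello World!", ends on a group boundary and needs no padding in this sketch.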