The QHebrewCodec class provides conversion to and from visually ordered Hebrew.

#include <qrtlcodec.h>

Inheritance diagram for QHebrewCodec

Public Member Functions
virtual int	mibEnum () const
const char *	name () const
const char *	mimeName () const
QCString	fromUnicode (const QString &uc, int &lenInOut) const
QString	toUnicode (const char *chars, int len) const
int	heuristicContentMatch (const char *chars, int len) const

Detailed Description

The QHebrewCodec class provides conversion to and from visually ordered Hebrew.

Hebrew, as a Semitic language, is written from right to left. Because older computer systems could not reorder a string so that the first letter appears on the right, many older documents were encoded in visual order, so that the first letter of a line is the rightmost one in the string.

In contrast to this, Unicode defines characters to be in logical order (the order you would read the string). This codec tries to convert visually ordered Hebrew (8859-8) to Unicode. This might not always work perfectly, because reversing the bidi (bi-directional) algorithm that transforms from logical to visual order is non-trivial.

Transformation from Unicode to visual Hebrew (8859-8) is done using the bidi algorithm in Qt, and will produce correct results, so long as the codec is given the text a whole paragraph at a time. Places where newlines are supposed to go can be indicated by a newline character ('\n'). Note that these newline characters change the reordering behaviour of the algorithm, since the bidi reordering only takes place within one line of text, whereas line breaks are determined in visual order.
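
To illustrate why newline characters limit the scope of the reordering, here is a minimal sketch. It is not Qt's bidi algorithm: reverseEachLine() is a hypothetical helper, uppercase ASCII stands in for Hebrew letters, and every line is assumed to consist of right-to-left text only.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper, not part of Qt: reverses the characters of each
// '\n'-separated line on its own, so the order of the lines is preserved.
std::string reverseEachLine(const std::string& text) {
    std::string out = text;
    std::string::size_type start = 0;
    while (start <= out.size()) {
        std::string::size_type end = out.find('\n', start);
        if (end == std::string::npos) end = out.size();
        assert(end >= start);  // find() never returns a position before start
        std::reverse(out.begin() + start, out.begin() + end);
        start = end + 1;       // skip the newline itself
    }
    return out;
}
```

With this sketch, the logical text "ABC\nDEF" becomes "CBA\nFED". Reversing the whole buffer at once would instead give "FED\nCBA", which swaps the two lines.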

Visually ordered Hebrew is still used quite often in some places, mainly in email communication (since most email programs still don't understand logically ordered Hebrew) and on web pages. The use on web pages is rapidly decreasing, due to the availability of browsers that correctly support logically ordered Hebrew.

This codec has the name "iso8859-8". If you don't want any bidi reordering to happen during conversion, use the "iso8859-8-i" codec, which assumes logical order for the 8-bit string.

Member Function Documentation

QCString QHebrewCodec::fromUnicode	(	const QString &	uc,
		int &	lenInOut
	)		const `[virtual]`

Transforms the logically ordered QString, uc, into a visually ordered string in the 8859-8 encoding. Qt's bidi algorithm is used to perform this task. Note that newline characters affect the reordering, since reordering is done on a line by line basis.

The algorithm is designed to work on whole paragraphs of text, so processing a line at a time may produce incorrect results. This approach is taken because the reordering of the contents of a particular line in a paragraph may depend on the previous line in the same paragraph.

Some encodings (for example Japanese or UTF-8) are multibyte, so one input character may be mapped to more than one output byte. The lenInOut argument specifies the number of QChars that should be converted and is set to the number of bytes returned.

Reimplemented from QTextCodec.

QCString QHebrewCodec::fromUnicode	(	const QString &	uc,
		int &	lenInOut
	)		const `[virtual]`

QTextCodec subclasses must reimplement either this function or makeEncoder(). It converts the first lenInOut characters of uc from Unicode to the encoding of the subclass. If lenInOut is negative or too large, the length of uc is used instead.

Converts lenInOut characters (not bytes) from uc, producing a QCString. lenInOut will be set to the length of the result (in bytes).
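
The in/out behaviour of lenInOut can be sketched as follows. This is an illustration only, not QHebrewCodec's code: encodeHebrew() is a hypothetical function, std::u16string stands in for QString, no bidi reordering is performed, and unmappable characters are replaced by '?' as an assumption. The mapping of U+05D0..U+05EA to 0xE0..0xFA follows the ISO 8859-8 code chart.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: converts the first lenInOut UTF-16 units of uc to
// ISO 8859-8 bytes and stores the byte length of the result in lenInOut.
std::string encodeHebrew(const std::u16string& uc, int& lenInOut) {
    const int total = static_cast<int>(uc.size());
    // A negative or too large count means "use the whole string".
    if (lenInOut < 0 || lenInOut > total) lenInOut = total;
    assert(lenInOut >= 0);

    std::string out;
    for (int i = 0; i < lenInOut; ++i) {
        const char16_t ch = uc[static_cast<std::u16string::size_type>(i)];
        if (ch < 0x80) {
            out.push_back(static_cast<char>(ch));                    // ASCII
        } else if (ch >= 0x05D0 && ch <= 0x05EA) {
            out.push_back(static_cast<char>(0xE0 + (ch - 0x05D0)));  // alef..tav
        } else {
            out.push_back('?');  // assumption: placeholder for unmappable input
        }
    }
    lenInOut = static_cast<int>(out.size());  // now a byte count
    return out;
}
```

Because ISO 8859-8 is a single-byte encoding, the byte count written back equals the number of characters converted. For a multibyte encoding the two values would differ.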

The default implementation makes an encoder with makeEncoder() and converts the input with that. Note that the default makeEncoder() implementation makes an encoder that simply calls this function, hence subclasses must reimplement one function or the other to avoid infinite recursion.

Reimplemented from QTextCodec.

int QHebrewCodec::heuristicContentMatch	(	const char *	chars,
		int	len
	)		const `[virtual]`

Implements QTextCodec.

int QHebrewCodec::heuristicContentMatch	(	const char *	chars,
		int	len
	)		const `[virtual]`

QTextCodec subclasses must reimplement this function. It examines the first len bytes of chars and returns a value indicating how likely it is that the string is a prefix of text encoded in the encoding of the subclass. A negative return value indicates that the text is detectably not in the encoding (e.g. it contains characters undefined in the encoding). A return value of 0 indicates that the text should be decoded with this codec rather than as ASCII, but there is no particular evidence. The value should range up to len. Thus, most decoders will return -1, 0, or -len.

The characters are not null terminated.
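
A scoring function of this kind might look like the sketch below. It is not the implementation used by QHebrewCodec: hebrewContentScore() is a hypothetical name, and the byte ranges are an assumption based on the ISO 8859-8 code chart (0xBF..0xDE unassigned, 0xE0..0xFA Hebrew letters).

```cpp
#include <cassert>

// Hypothetical sketch: returns -1 if the buffer contains a byte that is
// unassigned in ISO 8859-8, otherwise the number of Hebrew letters seen.
// The buffer is not null terminated, so only len bytes are examined.
int hebrewContentScore(const char* chars, int len) {
    assert(len >= 0 && (chars != nullptr || len == 0));
    int score = 0;
    for (int i = 0; i < len; ++i) {
        const unsigned char b = static_cast<unsigned char>(chars[i]);
        if (b >= 0xBF && b <= 0xDE) return -1;  // detectably not ISO 8859-8
        if (b >= 0xE0 && b <= 0xFA) ++score;    // Hebrew letter alef..tav
    }
    return score;  // 0 for plain ASCII, at most len
}
```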

See also: codecForContent().

Implements QTextCodec.

int QHebrewCodec::mibEnum ( ) const [virtual]

Implements QTextCodec.

virtual int QHebrewCodec::mibEnum ( ) const [virtual]

Subclasses of QTextCodec must reimplement this function. It returns the MIBenum (see the IANA character-sets encoding file for more information). It is important that each QTextCodec subclass returns the correct unique value for this function.

Implements QTextCodec.

const char* QHebrewCodec::mimeName ( ) const [virtual]

Returns the preferred mime name of the encoding as defined in the IANA character-sets encoding file.

Reimplemented from QTextCodec.

const char * QHebrewCodec::mimeName ( ) const [virtual]

Returns the codec's mime name.

Reimplemented from QTextCodec.

const char * QHebrewCodec::name ( ) const [virtual]

Implements QTextCodec.

const char* QHebrewCodec::name ( ) const [virtual]

QTextCodec subclasses must reimplement this function. It returns the name of the encoding supported by the subclass. When choosing a name for an encoding, consider these points: On X11, heuristicNameMatch( const char * hint ) is used to test whether the QTextCodec can convert between Unicode and the encoding of a font with encoding hint, such as "iso8859-1" for Latin-1 fonts or "koi8-r" for Russian KOI8 fonts. The default algorithm of heuristicNameMatch() uses name(). Some applications may use this function to present encodings to the end user.

Implements QTextCodec.

QString QHebrewCodec::toUnicode	(	const char *	chars,
		int	len
	)		const `[virtual]`

QTextCodec subclasses must reimplement this function or makeDecoder(). It converts the first len characters of chars to Unicode.

The default implementation makes a decoder with makeDecoder() and converts the input with that. Note that the default makeDecoder() implementation makes a decoder that simply calls this function, hence subclasses must reimplement one function or the other to avoid infinite recursion.

Reimplemented from QTextCodec.

QString QHebrewCodec::toUnicode	(	const char *	chars,
		int	len
	)		const `[virtual]`

Since Hebrew (and Arabic) are written from right to left, but iso8859-8 assumes visual ordering (as opposed to the logical ordering of Unicode), we must reverse the order of the input string (the first len characters of chars) to put it into logical order.

One problem is that the basic text direction is unknown. This function therefore uses some heuristics to guess it, and if it cannot guess the right one, it assumes that the basic text direction is right to left.

This behaviour can be overridden by putting a control character at the beginning of the text to indicate which basic text direction to use. If the basic text direction is left to right, the control character should be (uchar) 0xFE. For right to left it should be (uchar) 0xFF. Both characters are undefined in the iso 8859-8 charset.

Example: A visually ordered string "english WERBEH american" would be recognized as having a basic left to right direction. So the logically ordered QString would be "english HEBREW american".

By prepending a (uchar)0xFF at the start of the string, QHebrewCodec::toUnicode() would use a basic text direction of right to left, and the string would thus become "american HEBREW english".
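
The example above can be reproduced with a much simplified sketch. It is not Qt's implementation: visualToLogical() and its helpers are hypothetical, uppercase ASCII stands in for Hebrew letters as in the example strings, and the "first strong character" rule is only an assumed stand-in for the heuristics actually used by the codec.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

enum class Dir { LeftToRight, RightToLeft };

// Stand-in classification: uppercase ASCII plays the role of Hebrew letters,
// lowercase ASCII is strong left-to-right, everything else is neutral.
static bool isStrongRtl(char c) { return std::isupper(static_cast<unsigned char>(c)) != 0; }
static bool isStrongLtr(char c) { return std::islower(static_cast<unsigned char>(c)) != 0; }

// Assumed heuristic: the first strong character decides; default is RTL.
static Dir guessBaseDirection(const std::string& s) {
    for (char c : s) {
        if (isStrongLtr(c)) return Dir::LeftToRight;
        if (isStrongRtl(c)) return Dir::RightToLeft;
    }
    return Dir::RightToLeft;
}

// Reverses every run of characters whose direction opposes the base
// direction. Neutrals enclosed by such characters belong to the run.
static void reverseOpposingRuns(std::string& s, Dir base) {
    auto opposes = [base](char c) { return base == Dir::LeftToRight ? isStrongRtl(c) : isStrongLtr(c); };
    auto agrees  = [base](char c) { return base == Dir::LeftToRight ? isStrongLtr(c) : isStrongRtl(c); };
    std::string::size_type i = 0;
    while (i < s.size()) {
        if (!opposes(s[i])) { ++i; continue; }
        std::string::size_type last = i;
        for (std::string::size_type k = i; k < s.size() && !agrees(s[k]); ++k)
            if (opposes(s[k])) last = k;
        assert(last >= i && last < s.size());
        std::reverse(s.begin() + i, s.begin() + last + 1);
        i = last + 1;
    }
}

std::string visualToLogical(const std::string& visual) {
    std::string body = visual;
    Dir base;
    const unsigned char first = body.empty() ? 0 : static_cast<unsigned char>(body[0]);
    if (first == 0xFE || first == 0xFF) {
        // Explicit override: 0xFE means left to right, 0xFF right to left.
        base = (first == 0xFE) ? Dir::LeftToRight : Dir::RightToLeft;
        body.erase(0, 1);
    } else {
        base = guessBaseDirection(body);
    }
    // With a right-to-left base the whole line is mirrored first.
    if (base == Dir::RightToLeft) std::reverse(body.begin(), body.end());
    reverseOpposingRuns(body, base);
    return body;
}
```

In this sketch, visualToLogical("english WERBEH american") yields "english HEBREW american", and the same string prefixed with the byte 0xFF yields "american HEBREW english".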

Reimplemented from QTextCodec.

The documentation for this class was generated from the following files:

src/qt/include/qrtlcodec.h
src/qt/src/codecs/qrtlcodec.h
src/qt/src/codecs/qrtlcodec.cpp
