net.ninthtest.nio.charset
Class CharsetTranslator

java.lang.Object
  extended by net.ninthtest.nio.charset.CharsetTranslator

public class CharsetTranslator
extends Object

Instances of CharsetTranslator translate byte streams from one character encoding to another.

The CharsetDecoder and CharsetEncoder that are used to perform the translation use CodingErrorAction.REPORT for both malformed-input and unmappable-character actions. Use CharsetDecoder and CharsetEncoder directly if this behavior is not desirable.
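The effect of CodingErrorAction.REPORT can be seen with the standard java.nio.charset classes alone. The following is a minimal sketch, not CharsetTranslator's own code: the class name ReportActionDemo and its helper method are hypothetical, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not use CharsetTranslator itself.
class ReportActionDemo {
    /** Returns true if the bytes decode as UTF-8 without any coding error. */
    static boolean decodesCleanly(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // REPORT surfaces bad input as an exception instead of
            // silently ignoring or replacing it.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] valid = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        // 0xC3 starts a two-byte sequence, but 0x28 is not a continuation byte.
        byte[] invalid = {(byte) 0xC3, (byte) 0x28};
        System.out.println(decodesCleanly(valid));
        System.out.println(decodesCleanly(invalid));
    }
}
```

Callers who would rather substitute or skip bad input can configure their own decoder and encoder with CodingErrorAction.REPLACE or CodingErrorAction.IGNORE.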

The useXMLCharRefReplacement(boolean) feature can be used to enable replacement of unmappable characters with their XML character reference equivalents. The replacement occurs on encoding, as characters are written to the target output stream. This feature is useful when preparing text for display on the Web.
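As an illustration of the idea, the sketch below replaces every code point that a target charset cannot encode with a numeric character reference. It is an assumption-laden stand-in, not the library's implementation: the class XmlCharRefSketch and its method are hypothetical, it works on a String rather than a stream, and it emits decimal references (this page does not say whether CharsetTranslator writes decimal or hexadecimal references).

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper; illustrates the replacement idea only.
class XmlCharRefSketch {
    /** Replaces each code point the charset cannot encode with "&#N;" (decimal). */
    static String replaceUnmappable(String text, Charset target) {
        CharsetEncoder encoder = target.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String unit = new String(Character.toChars(codePoint));
            if (encoder.canEncode(unit)) {
                out.append(unit);
            } else {
                // Assumption: decimal form; hexadecimal ("&#xN;") is equally valid XML.
                out.append("&#").append(codePoint).append(';');
            }
            // Advance by one or two chars so surrogate pairs stay together.
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Charset ascii = Charset.forName("US-ASCII");
        System.out.println(replaceUnmappable("caf\u00e9", ascii)); // caf&#233;
    }
}
```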

Unlike CharsetDecoder and CharsetEncoder, CharsetTranslator does not support incremental translation using java.nio buffers. Each translation is performed as a single operation, although reads are buffered internally and the size of the internal character buffer can be set with setBufferSize(int).
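A single-operation translation with an internal character buffer might look roughly like the following. This is a sketch built only on standard java.io and java.nio.charset classes, under the assumption that the translator behaves like a decoding reader piped into an encoding writer; the class SingleOperationSketch, its method signature, and the buffer size of 8192 are all hypothetical and are not taken from CharsetTranslator.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical stand-in; not the library's implementation.
class SingleOperationSketch {
    static void translate(InputStream in, Charset source,
                          OutputStream out, Charset target,
                          int bufferSize) throws IOException {
        // Fresh decoder and encoder per call, both configured to REPORT errors.
        Reader reader = new InputStreamReader(in, source.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT));
        Writer writer = new OutputStreamWriter(out, target.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT));
        // The buffer size is a number of characters, not bytes.
        char[] buffer = new char[bufferSize];
        int count;
        while ((count = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, count);
        }
        writer.flush(); // streams are left open for the caller to close
    }

    public static void main(String[] args) throws IOException {
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9}; // "café" in ISO-8859-1
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        translate(new ByteArrayInputStream(latin1), Charset.forName("ISO-8859-1"),
                out, Charset.forName("UTF-8"), 8192);
        System.out.println(out.size()); // 5: é takes two bytes in UTF-8
    }
}
```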

CharsetTranslator instances always reset the internal decoder/encoder before translating. Therefore, it is safe to re-use the same instance for multiple translation operations.

CharsetTranslator implements equals(Object) and hashCode(). This allows instances to be cached in a lookup table, for example.

Version:
1.0
Author:
Matthew Zipay (mattz@ninthtest.info)

Field Summary
static int DEFAULT_BUFFER_SIZE
          The default buffer size used when reading from the source input stream.
 
Constructor Summary
CharsetTranslator(Charset sourceCharset, Charset targetCharset)
          Constructs a new CharsetTranslator that can translate using a decoder and encoder from the given source and target Charsets, respectively.
CharsetTranslator(String sourceCharsetName, String targetCharsetName)
          Constructs a new CharsetTranslator that can translate from the named source encoding to the named target encoding.
 
Method Summary
 boolean equals(Object obj)
          Tells whether or not this translator is equal to another object.
 int getBufferSize()
          Returns the size of the buffer used when reading from the source input stream.
 int hashCode()
          Returns a hash code value for this translator.
 boolean isUsingXMLCharRefReplacement()
          Tells whether or not this translator will replace unmappable characters with their XML character reference equivalents.
 void setBufferSize(int bufferSize)
          Sets the size of the buffer used when reading from the source input stream.
 Charset sourceCharset()
          Returns the source charset.
 Charset targetCharset()
          Returns the target charset.
 String toString()
          Returns a string of the form "source_charset_name -> target_charset_name".
 void translate(InputStream sourceStream, OutputStream targetStream)
          Translates a stream of bytes from one character encoding to another.
 CharsetTranslator useXMLCharRefReplacement(boolean useXMLCharRefReplacement)
          Tells this translator whether or not to use XML character reference replacements.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

DEFAULT_BUFFER_SIZE

public static final int DEFAULT_BUFFER_SIZE
The default buffer size used when reading from the source input stream.

The buffer size is expressed as the maximum number of characters that will be read from the source input stream at once.

See Also:
getBufferSize(), setBufferSize(int), Constant Field Values

Constructor Detail

CharsetTranslator

public CharsetTranslator(String sourceCharsetName,
                         String targetCharsetName)
Constructs a new CharsetTranslator that can translate from the named source encoding to the named target encoding.

Parameters:
sourceCharsetName - the character encoding used to decode bytes read from the source input stream
targetCharsetName - the character encoding used to encode characters written to the target output stream
Throws:
IllegalCharsetNameException - if the source or target charset name is illegal
IllegalArgumentException - if either sourceCharsetName or targetCharsetName is null
UnsupportedCharsetException - if the current JVM does not support the named source or target charset
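The three exceptions listed above match those thrown by the standard Charset.forName(String) method. Assuming the constructor resolves names the same way (which this page does not confirm), the cases can be told apart as in the sketch below. The class CharsetLookupDemo and the sample names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical demo of how Charset.forName classifies charset names.
class CharsetLookupDemo {
    static String classify(String name) {
        try {
            Charset.forName(name);
            return "supported";
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that no charset name may contain.
            return "illegal name";
        } catch (UnsupportedCharsetException e) {
            // The name is well formed, but this JVM has no such charset.
            return "unsupported";
        } catch (IllegalArgumentException e) {
            // Both exceptions above extend IllegalArgumentException, so this
            // branch is reached only for a null name.
            return "null name";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("UTF-8"));
        System.out.println(classify("not a charset!"));
        System.out.println(classify("x-no-such-charset"));
        System.out.println(classify(null));
    }
}
```

Because both charset exceptions are subclasses of IllegalArgumentException, a caller that does not need to distinguish the cases can catch IllegalArgumentException alone.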

CharsetTranslator

public CharsetTranslator(Charset sourceCharset,
                         Charset targetCharset)
Constructs a new CharsetTranslator that can translate using a decoder and encoder from the given source and target Charsets, respectively.

Parameters:
sourceCharset - the character encoding used to decode bytes read from the source input stream
targetCharset - the character encoding used to encode characters written to the target output stream
Throws:
IllegalArgumentException - if either sourceCharset or targetCharset is null

Method Detail

sourceCharset

public final Charset sourceCharset()
Returns the source charset.

Returns:
the charset that identifies the source input stream's assumed character encoding

targetCharset

public final Charset targetCharset()
Returns the target charset.

Returns:
the charset that identifies the desired character encoding for the target output stream

isUsingXMLCharRefReplacement

public boolean isUsingXMLCharRefReplacement()
Tells whether or not this translator will replace unmappable characters with their XML character reference equivalents.

Replacement occurs as characters are written to the target output stream.

Returns:
true if this translator will use XML character reference replacements

useXMLCharRefReplacement

public final CharsetTranslator useXMLCharRefReplacement(boolean useXMLCharRefReplacement)
Tells this translator whether or not to use XML character reference replacements.

Replacement occurs as characters are written to the target output stream.

Parameters:
useXMLCharRefReplacement - true if unmappable characters should be replaced with their XML character reference equivalents
Returns:
this translator

getBufferSize

public int getBufferSize()
Returns the size of the buffer used when reading from the source input stream.

Returns:
the maximum number of characters that will be read from the source at once

setBufferSize

public void setBufferSize(int bufferSize)
Sets the size of the buffer used when reading from the source input stream.

Parameters:
bufferSize - the maximum number of characters that will be read from the source at once
Throws:
IllegalArgumentException - if the buffer size is less than 1 (one)

translate

public void translate(InputStream sourceStream,
                      OutputStream targetStream)
               throws IOException
Translates a stream of bytes from one character encoding to another.

Parameters:
sourceStream - the stream of bytes to be translated
targetStream - the stream to which translated bytes are written
Throws:
IOException - if any reading/decoding/encoding/writing operation fails

hashCode

public int hashCode()

The hash code of a CharsetTranslator is based on the source charset, target charset, and whether or not XML character reference replacement is enabled.

Overrides:
hashCode in class Object
Returns:
a hash code value for this translator
See Also:
Object.hashCode()

equals

public boolean equals(Object obj)

Two CharsetTranslator instances are considered equal if, and only if, each instance is using the same source and target charset and XML character reference replacement is either enabled or disabled for both instances at the time of comparison.

Overrides:
equals in class Object
Parameters:
obj - the reference object with which to compare
Returns:
true if obj is a CharsetTranslator that is equal to this translator, as described above; false otherwise
See Also:
Object.equals(java.lang.Object)
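
The equality contract described above can be illustrated with a small stand-in class that compares the same three properties. TranslatorKeySketch is hypothetical and only mirrors the documented contract; it is not CharsetTranslator's source, and Objects.hash assumes Java 7 or later.

```java
import java.nio.charset.Charset;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in that mirrors the documented equals/hashCode contract.
class TranslatorKeySketch {
    private final Charset source;
    private final Charset target;
    private final boolean xmlCharRefs;

    TranslatorKeySketch(Charset source, Charset target, boolean xmlCharRefs) {
        this.source = source;
        this.target = target;
        this.xmlCharRefs = xmlCharRefs;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof TranslatorKeySketch)) {
            return false;
        }
        TranslatorKeySketch other = (TranslatorKeySketch) obj;
        return source.equals(other.source)
                && target.equals(other.target)
                && xmlCharRefs == other.xmlCharRefs;
    }

    @Override
    public int hashCode() {
        // Built from the same three properties that equals compares.
        return Objects.hash(source, target, xmlCharRefs);
    }

    public static void main(String[] args) {
        Charset latin1 = Charset.forName("ISO-8859-1");
        Charset utf8 = Charset.forName("UTF-8");
        Set<TranslatorKeySketch> seen = new HashSet<TranslatorKeySketch>();
        seen.add(new TranslatorKeySketch(latin1, utf8, false));
        System.out.println(seen.contains(new TranslatorKeySketch(latin1, utf8, false)));
        System.out.println(seen.contains(new TranslatorKeySketch(latin1, utf8, true)));
    }
}
```

Since the replacement setting takes part in both equals and hashCode, changing it on an object that is already stored in a hash-based collection would make later lookups unreliable; setting it before the object is stored avoids the problem.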

toString

public String toString()

Overrides:
toString in class Object
Returns:
a string indicating "source_charset_name -> target_charset_name"
See Also:
Object.toString()


Copyright © 2010 Matthew Zipay. All Rights Reserved.