File: active_support/core_ext/string/multibyte.rb

Overview

Module Structure

  module: <Toplevel Module>
  module: ActiveSupport#3
  module: CoreExtensions#4
  module: String#5
  module: Multibyte#7
has properties
method: mb_chars (1/2) #42
method: is_utf8? (1/2) #52
method: chars #57
method: mb_chars (2/E) #63
method: is_utf8? (2/E) #67

Code

   1  # encoding: utf-8
   2 
   3  module ActiveSupport #:nodoc:
   4    module CoreExtensions #:nodoc:
   5      module String #:nodoc:
   6        # Implements multibyte methods for easier access to multibyte characters in a String instance.
   7        module Multibyte
   8          unless '1.9'.respond_to?(:force_encoding)
   9            # == Multibyte proxy
  10            #
  11            # +mb_chars+ is a multibyte-safe proxy for string methods.
  12            #
  13            # In Ruby 1.8 and older it creates and returns an instance of the ActiveSupport::Multibyte::Chars class which
  14            # encapsulates the original string. Unicode-safe versions of all the String methods are defined on this proxy
  15            # class. If the proxy class doesn't respond to a certain method, it's forwarded to the encapsulated string.
  16            #
  17            #   name = 'Claus Müller'
  18            #   name.reverse  #=> "rell??M sualC"
  19            #   name.length   #=> 13
  20            #
  21            #   name.mb_chars.reverse.to_s   #=> "rellüM sualC"
  22            #   name.mb_chars.length         #=> 12
  23            #
  24            # In Ruby 1.9 and newer +mb_chars+ returns +self+ because String is (mostly) encoding aware. This means that
  25            # it becomes easy to run one version of your code on multiple Ruby versions.
  26            #
  27            # == Method chaining
  28            #
  29            # All the methods on the Chars proxy which normally return a string will return a Chars object. This allows
  30            # method chaining on the result of any of these methods.
  31            #
  32            #   name.mb_chars.reverse.length #=> 12
  33            #
  34            # == Interoperability and configuration
  35            #
  36            # The Chars object tries to be as interchangeable with String objects as possible: sorting and comparing between
  37            # String and Chars work as expected. The bang! methods change the internal string representation in the Chars
  38            # object. Interoperability problems can be resolved easily with a +to_s+ call.
  39            #
  40            # For more information about the methods defined on the Chars proxy see ActiveSupport::Multibyte::Chars. For
  41            # information about how to change the default Multibyte behaviour see ActiveSupport::Multibyte.
  42            def mb_chars
  43              if ActiveSupport::Multibyte.proxy_class.wants?(self)
  44                ActiveSupport::Multibyte.proxy_class.new(self)
  45              else
  46                self
  47              end
  48            end
  49            
  50            # Returns true if the string has UTF-8 semantics (a String used for purely byte resources is unlikely to have
  51            # them); returns false otherwise.
  52            def is_utf8?
  53              ActiveSupport::Multibyte::Chars.consumes?(self)
  54            end
  55 
  56            unless '1.8.7 and later'.respond_to?(:chars)
  57              def chars
  58                ActiveSupport::Deprecation.warn('String#chars has been deprecated in favor of String#mb_chars.', caller)
  59                mb_chars
  60              end
  61            end
  62          else
  63            def mb_chars #:nodoc:
  64              self
  65            end
  66            
  67            def is_utf8? #:nodoc:
  68              case encoding
  69              when Encoding::UTF_8
  70                valid_encoding?
  71              when Encoding::ASCII_8BIT, Encoding::US_ASCII
  72                dup.force_encoding(Encoding::UTF_8).valid_encoding?
  73              else
  74                false
  75              end
  76            end
  77          end
  78        end
  79      end
  80    end
  81  end
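
Example

The Ruby 1.9+ branch of is_utf8? in the listing above can be tried in isolation. What follows is a minimal sketch and not part of ActiveSupport: the helper name utf8_semantics? is hypothetical, and it takes the string as an argument instead of being mixed into String. It assumes a Ruby with String#encoding (1.9 or later); the usage lines also assume String#b (Ruby 2.0 or later) and a UTF-8 source file.

```ruby
# encoding: utf-8

# Hypothetical standalone helper that mirrors the Ruby 1.9+ branch of
# is_utf8? shown in the listing. It is not an ActiveSupport method.
def utf8_semantics?(string)
  case string.encoding
  when Encoding::UTF_8
    # Already tagged as UTF-8: only the byte sequence needs checking.
    string.valid_encoding?
  when Encoding::ASCII_8BIT, Encoding::US_ASCII
    # Binary or ASCII-tagged data may still hold valid UTF-8 bytes.
    # Work on a copy so the caller's string keeps its encoding.
    string.dup.force_encoding(Encoding::UTF_8).valid_encoding?
  else
    # Any other encoding (UTF-16, ISO-8859-1, ...) is reported as not UTF-8.
    false
  end
end

name = 'Claus Müller'

utf8_semantics?(name)        # => true
utf8_semantics?(name.b)      # => true  (binary-tagged, but valid UTF-8 bytes)
utf8_semantics?("\xFF".b)    # => false (0xFF never occurs in UTF-8)

# On an encoding-aware Ruby the byte count and character count differ,
# which is the gap mb_chars papers over on Ruby 1.8.
name.bytesize                # => 13
name.length                  # => 12
```

Duplicating the string before force_encoding matters because force_encoding changes the receiver's encoding in place; without the copy, a probe for UTF-8 validity would silently retag the caller's data.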