JDK Source Code (2): String

Author: Ethan_zyc | Published: 2019-03-22 23:56

The String class implements three interfaces: Serializable, Comparable, and CharSequence. Let's go through them one by one:

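Serializable: the serialization marker interface, which indicates that instances of the class can be serialized. Serialization is Java's general-purpose mechanism for saving and restoring object data. I'll cover it in detail once I understand it properly.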

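Comparable: as the name suggests, instances can be compared. The class implements compareTo, and the source shows that it simply compares the two strings char by char (by UTF-16 code unit value). If one string is a prefix of the other, the result is the difference in length.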

public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2);
    char v1[] = value;
    char v2[] = anotherString.value;

    int k = 0;
    while (k < lim) {
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {
            return c1 - c2;
        }
        k++;
    }
    return len1 - len2;
}
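To make the return values concrete, here is a small sketch of what compareTo yields for a few inputs. The class name CompareToDemo and the helper sign are made up for illustration; the printed values follow from the rule documented for String.compareTo (difference of the first differing chars, otherwise difference of lengths).

```java
class CompareToDemo {
    // Hypothetical helper: reduces a compareTo result to -1, 0 or 1.
    static int sign(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        // First differing chars are 'p' (112) and 'r' (114): 112 - 114
        System.out.println("apple".compareTo("apricot")); // -2
        // "abc" is a prefix of "abcd": the result is the length difference
        System.out.println("abc".compareTo("abcd"));      // -1
        // Uppercase letters sort before lowercase: 'Z' (90) - 'a' (97)
        System.out.println("Zebra".compareTo("apple"));   // -7
        // Equal content compares as 0
        System.out.println(sign("same", "same"));         // 0
    }
}
```

In practice only the sign of the result should be relied on when sorting; the exact magnitude is an implementation detail of the comparison rule.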

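CharSequence: String, StringBuilder, and StringBuffer all implement this interface. As the name suggests, an object of such a class is a readable sequence of chars, which is also evident from the three classes that implement it.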
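One practical consequence is that a method written against CharSequence accepts all three classes. The following is a minimal sketch; the class CharSequenceDemo and the method count are hypothetical names used only for this example.

```java
class CharSequenceDemo {
    // Counts how often target occurs, using only CharSequence methods.
    static int count(CharSequence seq, char target) {
        int n = 0;
        for (int i = 0; i < seq.length(); i++) {
            if (seq.charAt(i) == target) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(count("banana", 'a'));                    // 3
        System.out.println(count(new StringBuilder("banana"), 'n')); // 2
        System.out.println(count(new StringBuffer("banana"), 'b'));  // 1
    }
}
```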

Constructors

The most basic constructor

public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}

This reminds me of a question that comes up often in interviews:

String a = "aa";
String b = new String("aa");
System.out.println(a==b);

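Everyone knows the answer is false: a refers to the instance taken from the string constant pool, while b is a separate object created with new.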

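The constant pool holds data that is determined at compile time and stored in the compiled .class file. It contains constants relating to classes, methods, interfaces, and so on, including string literals. At run time, identical string literals resolve to one shared String instance.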

So as soon as new is involved, the references are no longer equal under ==.

String b = new String("aa");
String c = new String("aa");
System.out.println(c==b);

This is false as well.
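The comparisons above use ==, which compares references rather than content. As a small sketch of the difference, the example below also shows equals and intern(); the class name StringIdentityDemo and the helper sameReference are invented for illustration.

```java
class StringIdentityDemo {
    // Hypothetical helper: true only if both variables point to the same object.
    static boolean sameReference(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "aa";             // literal, taken from the string pool
        String b = new String("aa"); // always a fresh object

        System.out.println(sameReference(a, b));          // false
        System.out.println(a.equals(b));                  // true, same content
        // intern() returns the pooled instance with the same content
        System.out.println(sameReference(a, b.intern())); // true
    }
}
```

For comparing content, equals is the right tool; == is only meaningful when object identity itself is the question.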

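Back to the constructor above: it is rarely used, and unless there is a specific need, I think it is better to assign a literal directly and let the value come from the constant pool.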

Building a string from a char array

public String(char value[]) {
    this.value = Arrays.copyOf(value, value.length);
}

String b = new String(new char[]{'a', 'a'});

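This goes back to basics: as a data structure, a string is just a sequence of characters. Note that the constructor copies the array with Arrays.copyOf, so the string does not share it with the caller.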
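Because the constructor copies the array via Arrays.copyOf, later changes to the caller's array should not affect the string. A minimal sketch of that behavior (CharArrayCopyDemo and build are hypothetical names):

```java
class CharArrayCopyDemo {
    // Thin wrapper around the public char[] constructor.
    static String build(char[] source) {
        return new String(source);
    }

    public static void main(String[] args) {
        char[] chars = {'a', 'a'};
        String s = build(chars);

        chars[0] = 'z'; // mutate the caller's array afterwards

        System.out.println(s);            // aa  (the string kept its own copy)
        System.out.println(build(chars)); // za  (a new string sees the change)
    }
}
```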

Building a string from a char array (advanced version)

public String(char value[], int offset, int count) {
    if (offset < 0) {
        throw new StringIndexOutOfBoundsException(offset);
    }
    if (count <= 0) {
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        if (offset <= value.length) {
            this.value = "".value;
            return;
        }
    }
    // Note: offset or count might be near -1>>>1.
    if (offset > value.length - count) {
        throw new StringIndexOutOfBoundsException(offset + count);
    }
    this.value = Arrays.copyOfRange(value, offset, offset+count);
}

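The "advanced" version simply takes two more parameters: offset, the start position in the char array, and count, the number of chars to take from that position.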

char[] chars = {'a', 'b', 'c', 'd'};
String b = new String(chars, 1, 2);
System.out.println(b); // bc
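The bounds checks in this constructor can be exercised with a few edge cases. The sketch below uses a hypothetical helper tryBuild that returns null when the constructor throws, purely to keep the output readable.

```java
class SubArrayDemo {
    // Hypothetical helper: returns the string, or null if the range is rejected.
    static String tryBuild(char[] chars, int offset, int count) {
        try {
            return new String(chars, offset, count);
        } catch (StringIndexOutOfBoundsException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        char[] chars = {'a', 'b', 'c', 'd'};

        System.out.println(tryBuild(chars, 1, 2));           // bc
        // count == 0 with offset == length is allowed and yields ""
        System.out.println(tryBuild(chars, 4, 0).isEmpty()); // true
        // offset + count runs past the end of the array
        System.out.println(tryBuild(chars, 3, 2));           // null
        // negative offset is rejected up front
        System.out.println(tryBuild(chars, -1, 1));          // null
    }
}
```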

Building a string from an int array of code points

public String(int[] codePoints, int offset, int count)

int[] codePoints = {97, 98, 99, 100};
String b = new String(codePoints, 1, 2);
System.out.println(b); // bc

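One example makes it clear. In char ch = 'a', the value stored for 'a' is 97, so this constructor converts ints holding Unicode code points into the corresponding characters.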
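Where this constructor differs from the char[] one is with code points above U+FFFF, which do not fit in a single char. A small sketch, assuming the emoji code point U+1F600 as the test value (CodePointCtorDemo and fromCodePoints are illustrative names):

```java
class CodePointCtorDemo {
    // Builds a string from the given Unicode code points.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        System.out.println(fromCodePoints(97, 98, 99)); // abc

        // 0x61 is 'a'; 0x1F600 is a supplementary character
        String s = fromCodePoints(0x61, 0x1F600);
        // The supplementary character is stored as two chars (a surrogate pair)
        System.out.println(s.length());                      // 3
        System.out.println(s.codePointCount(0, s.length())); // 2
    }
}
```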

Building a string from a byte array

public String(byte bytes[], int offset, int length, Charset charset)

byte[] bytes = {97, 98, 99, 100};
String b = new String(bytes, 1, 2, StandardCharsets.UTF_8);
System.out.println(b); // bc

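Same idea again. There are a few more overloads similar to the ones above; they all work the same way, so I won't go through them.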
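With plain ASCII bytes the charset hardly matters, so here is a sketch where it does: the same two bytes decoded with two different charsets. The byte values are the UTF-8 encoding of U+00E9 (é); DecodeDemo and decode are names made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Decodes the whole array with the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, 0, bytes.length, charset);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 encoding of U+00E9

        String utf8 = decode(bytes, StandardCharsets.UTF_8);
        String latin1 = decode(bytes, StandardCharsets.ISO_8859_1);

        System.out.println(utf8.length());           // 1: decoded as one character
        System.out.println(utf8.equals("\u00e9"));   // true
        System.out.println(latin1.length());         // 2: each byte became one char
    }
}
```

Decoding with the wrong charset does not fail; it silently produces different text, which is why passing the charset explicitly is usually the safer habit.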

Building a string from a StringBuffer or StringBuilder

public String(StringBuffer buffer) {
    synchronized(buffer) {
        this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
    }
}

public String(StringBuilder builder) {
    this.value = Arrays.copyOf(builder.getValue(), builder.length());
}

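The difference between StringBuffer and StringBuilder is obvious here too: the StringBuffer constructor synchronizes on the buffer because StringBuffer is thread-safe, while the StringBuilder one does not.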
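Both constructors copy the builder's contents, so the resulting string is a snapshot that later appends do not touch. A brief sketch (BuilderSnapshotDemo and snapshot are hypothetical names):

```java
class BuilderSnapshotDemo {
    // Captures the builder's current content as an immutable String.
    static String snapshot(StringBuilder builder) {
        return new String(builder);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("ab");
        String before = snapshot(sb);

        sb.append("cd"); // keeps mutating the builder

        System.out.println(before);       // ab
        System.out.println(snapshot(sb)); // abcd

        // The StringBuffer overload behaves the same way
        StringBuffer buffer = new StringBuffer("xy");
        System.out.println(new String(buffer)); // xy
    }
}
```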

Package-private constructor

String(char[] value, boolean share) {
    // assert share : "unshared not supported";
    this.value = value;
}

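As the comment shows, only share == true is supported for now. A string built this way shares the char array directly instead of copying it, which improves performance. The constructor has no access modifier, so it is package-private rather than public: exposing it would break the immutability of strings, because a caller could change the string simply by modifying the char array.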

Common APIs

public int length()

Returns the length of the string.

public boolean isEmpty()

Checks whether the length of the string is 0.

public char charAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index];
}

Returns the char at the given index.

public int codePointAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return Character.codePointAtImpl(value, index, value.length);
}

String a = "abcd";
int i = a.codePointAt(1);
System.out.println(i); // 98, i.e. 'b'

Returns the Unicode code point of the character at the given index.

public int codePointBefore(int index) {
    int i = index - 1;
    if ((i < 0) || (i >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return Character.codePointBeforeImpl(value, index, 0);
}

String a = "abcd";
int i = a.codePointBefore(1);
System.out.println(i); // 97, i.e. 'a'

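The main difference in the source is the extra step int i = index - 1;. As the name suggests, the method returns the Unicode code point of the character just before the given index.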
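For plain ASCII text, codePointAt and codePointBefore behave just like charAt. The difference only shows up with surrogate pairs, as in this sketch; the test string (an 'a', the code point U+1F600, then a 'b') and the class name CodePointIndexDemo are chosen for illustration.

```java
class CodePointIndexDemo {
    // 'a', then U+1F600 stored as the surrogate pair D83D DE00, then 'b'
    static final String TEXT = "a\uD83D\uDE00b";

    public static void main(String[] args) {
        System.out.println(TEXT.length()); // 4 chars in total

        // Index 1 is the high surrogate: the whole pair is combined
        System.out.println(Integer.toHexString(TEXT.codePointAt(1)));     // 1f600
        // Index 2 is the low surrogate on its own
        System.out.println(Integer.toHexString(TEXT.codePointAt(2)));     // de00
        // Looking backwards from index 3 also finds the whole pair
        System.out.println(Integer.toHexString(TEXT.codePointBefore(3))); // 1f600
        // Before index 1 there is just 'a'
        System.out.println(TEXT.codePointBefore(1));                      // 97
    }
}
```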

public int codePointCount(int beginIndex, int endIndex) {
    if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
        throw new IndexOutOfBoundsException();
    }
    return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
}

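Unless the range contains supplementary characters, that is, characters with code points from U+10000 to U+10FFFF, this method returns the number of chars between beginIndex and endIndex. A supplementary character occupies two chars but counts as a single code point.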

For more on supplementary characters, see https://www.oschina.net/question/12_12216
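The gap between length() and codePointCount can be seen with a string that contains one supplementary character. This is a minimal sketch; CodePointCountDemo and codePoints are illustrative names, and U+1F600 is just a convenient supplementary code point.

```java
class CodePointCountDemo {
    // Number of Unicode code points in the whole string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String plain = "abcd";
        System.out.println(plain.length());    // 4
        System.out.println(codePoints(plain)); // 4

        String mixed = "a\uD83D\uDE00b"; // 'a', U+1F600, 'b'
        System.out.println(mixed.length());    // 4 chars
        System.out.println(codePoints(mixed)); // 3 code points

        // A range that cuts the pair in half: the lone surrogate counts as one
        System.out.println(mixed.codePointCount(0, 2)); // 2
    }
}
```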

public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) {
    if (srcBegin < 0) {
        throw new StringIndexOutOfBoundsException(srcBegin);
    }
    if (srcEnd > value.length) {
        throw new StringIndexOutOfBoundsException(srcEnd);
    }
    if (srcBegin > srcEnd) {
        throw new StringIndexOutOfBoundsException(srcEnd - srcBegin);
    }
    System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
}

char[] chars = {'a', 'b', 'c'};
String a = "123456789";
a.getChars(1,2, chars, 1);
System.out.println(chars); // a2c

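Copies the chars from index srcBegin (inclusive) to srcEnd (exclusive) into the destination char array; dstBegin is the offset in the destination array at which the copy starts.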
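Since srcEnd is exclusive, the number of chars copied is srcEnd - srcBegin. The sketch below copies three chars into the middle of a pre-filled buffer so the untouched positions stay visible; GetCharsDemo and copyInto are hypothetical names.

```java
import java.util.Arrays;

class GetCharsDemo {
    // Copies source[begin, end) into a '-'-filled buffer of the given size at dstBegin.
    static String copyInto(String source, int begin, int end, int size, int dstBegin) {
        char[] dst = new char[size];
        Arrays.fill(dst, '-');
        source.getChars(begin, end, dst, dstBegin);
        return new String(dst);
    }

    public static void main(String[] args) {
        // Copies "ell" (indices 1, 2, 3) to positions 1..3 of the buffer
        System.out.println(copyInto("Hello", 1, 4, 5, 1)); // -ell-
        // begin == end copies nothing
        System.out.println(copyInto("Hello", 2, 2, 3, 0)); // ---
    }
}
```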

getBytes()/getBytes(Charset charset)/getBytes(String charsetName)

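All of these encode the string into a byte array: with the default charset when no argument is given, otherwise with the specified charset.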

String a = "abcd";
byte[] bytes = a.getBytes();
System.out.println(Arrays.toString(bytes)); // [97, 98, 99, 100]
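For ASCII text most charsets give the same bytes, so the choice of charset is easier to see with a non-ASCII character. The sketch below encodes U+4E2D (the character 中) with two charsets; GetBytesDemo and utf8 are names invented for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class GetBytesDemo {
    // Encodes the string as UTF-8 regardless of the platform default.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "\u4e2d";

        // UTF-8 needs three bytes: E4 B8 AD (shown as signed bytes)
        System.out.println(Arrays.toString(utf8(s))); // [-28, -72, -83]
        // UTF-16BE needs two bytes: 4E 2D
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_16BE))); // [78, 45]

        // Decoding with the same charset restores the original string
        System.out.println(new String(utf8(s), StandardCharsets.UTF_8).equals(s)); // true
    }
}
```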

Today's summary

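I keep feeling that my way of reading source code isn't quite right: going through methods one at a time like this is too slow, and I need to find a more efficient approach. Today I got through roughly half of the String class. I did notice a lot of overloading in String, sometimes for functionality and sometimes to supply defaults, much like Spring Boot provides default configuration for us. I'll keep studying and mull it over some more. The parameter validation and exception handling are also well worth learning from.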
